Guidance on statistical methods for forecasting
Spendable CIC is a non-profit organisation that is building an app to help users manage their weekly budgets. Where bills are due at different frequencies (e.g. weekly, monthly, quarterly), and can have amounts that vary over time, it can be hard for someone to know how much they need to set aside in a given week. The app will identify funds that are free (i.e. “spendable”) by understanding a user’s expected income and outgoings in the future. These forecasts are established from the user’s historic bank transactions, and therefore require limited input from the user.
Spendable CIC wanted guidance on the statistical methods that would be suitable for making forecasts from historic bank transaction data. Such forecasts can be challenging because a user’s income might be sporadic, or split across different payment types and people. Similarly, bills can vary in amount, payment date, and payment method. As a result, it is difficult to identify which transactions in the data are connected to one another.
Kate Land volunteered for this project. She discussed the details with the Director of Spendable, and familiarised herself with the work that had been done by the organisation so far. This included work on accessing and organising banking data. In order to make the guidance as useful as possible, Kate committed to prototyping some approaches with real data. Unfortunately, it was not possible to access sample data that reflected exactly what would be available in the app, in terms of target users or data fields. Banking data is highly sensitive, and therefore this was a limitation the project needed to work within.
Using some of her own banking data, Kate developed a step-by-step methodology that identified transactions that were part of the same process. Forecasts could be made for each group of transactions, within some uncertainty. Kate felt it was important to use a methodology that was simple and transparent, so that the resulting forecasts were easy to explain and amenable to user input.
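As a rough illustration of what a simple, transparent approach of this kind might look like, the Python sketch below groups transactions by a normalised description and forecasts the next payment in each group from the median gap and median amount, with the spread of past amounts as a crude uncertainty measure. It is not Kate's actual methodology, which is not detailed here: the function names, the grouping rule, the choice of medians and the sample transactions are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median, pstdev


def group_transactions(transactions):
    """Group (date, description, amount) tuples by a normalised description.

    Assumption: transactions with the same description (ignoring case and
    surrounding whitespace) belong to the same process.
    """
    groups = defaultdict(list)
    for when, description, amount in transactions:
        key = description.strip().lower()
        groups[key].append((when, amount))
    # Sort each group chronologically so gaps between payments can be measured.
    return {key: sorted(items) for key, items in groups.items()}


def forecast_group(items):
    """Forecast the next payment in one group, or None if history is too short."""
    if len(items) < 2:
        return None  # a single payment gives no information about frequency
    dates = [when for when, _ in items]
    amounts = [amount for _, amount in items]
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "next_date": dates[-1] + timedelta(days=round(median(gaps))),
        "amount": median(amounts),
        "amount_spread": pstdev(amounts),  # crude uncertainty on the amount
    }


# Hypothetical sample data: a monthly bill and a one-off purchase.
transactions = [
    (date(2023, 1, 3), "ENERGY CO", -60.0),
    (date(2023, 2, 3), "Energy Co ", -64.0),
    (date(2023, 3, 3), "energy co", -62.0),
    (date(2023, 4, 3), "Energy Co", -62.0),
    (date(2023, 3, 10), "Corner Shop", -12.5),
]

groups = group_transactions(transactions)
forecast = forecast_group(groups["energy co"])
print(forecast)
```

Because each forecast comes from a small number of visible quantities (the typical gap and the typical amount), it is easy to explain to a user and easy to override if they know a bill is about to change.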
Kate presented the prototype algorithm she had developed on one set of example data, which classified past transactions into groups and made forecasts for each group. She explained the motivation for the general approach and provided guidance about the caveats and limitations of the work, along with ideas for further development. Kate and Spendable also discussed the issues around accessing sample data. It was understood that an algorithm would inevitably have to go live with limited testing, but that once the app had some users there would be scope for iterative improvements. This could be part of a future project between Spendable and RSS’s Statisticians in Society.
The impact and benefits
Nick Lee, director of Spendable CIC, wrote:
‘The algorithm developed through the project has been incorporated within the prototype budgeting service. It has improved the accuracy of our forecasting algorithms, improving the quality of the service provided to clients. The project has increased speed of development meaning that it can be offered to help people earlier. By focusing Spendable’s approach, it has saved the organisation staff time and money.’