Emotion and habit in savings

A first dive into Jupiter’s rich data on saving behaviour

Luke Jordan
Towards Data Science


What makes people save more?

The question is relevant for ordinary people, fintechs, asset managers, banks and policy makers. It’s the core problem Jupiter is trying to solve, and we’re going to share what we’re learning periodically. Some might think that risky, but we believe in openness, and in our ability to execute, so we’ll take the risk.

We’re still at a very early stage, so our sample isn’t huge, but it is very granular and by now contains hundreds of experiments. This will be the first in a couple of posts. A summary of its three key points:

  • Emotion: If you give people a direct and immediate payoff to saving, even if small and primarily emotional, you significantly increase the chance that they respond to a future prompt or offer to save, and you make it much more likely they become frequent savers.
  • Habit: The strongest predictor of whether someone will respond to a prompt to save is how often they have actively saved before, i.e., how much of a habit spontaneous (if prompted) saving has become. We are confident that isn’t just picking up preexisting characteristics.
  • What doesn’t work: Surprisingly, the size of the monetary reward has no effect, in contrast to the emotional and habit factors. The day of the month does matter, but significantly less than we expected. And, when executed well, inducing active saving habits is far easier than convincing people to set up automated, passive saving, and, by creating emotional commitment and building habits, it has significant positive spillovers later.

The data

Jupiter cares about user privacy. So our data lake contains nothing about the demographics of our users (only a separate, anti-fraud system does). It also contains nothing on users’ financial lives outside the Jupiter app.

What we do have is extremely rich, anonymized data on user behaviour within Jupiter: in the few months we’ve been live, already roughly 1,000 data points per user, for almost 1,000 active savers.

Within those data points are the results of more than 300 experiments in using messages and prompts to induce people to save more. Those experiments have varied along more than two dozen dimensions, including six different kinds of games, social cues, reward size, time to respond, wording, channel and so on. So our sample remains limited in size, but it comes from a system designed to capture exceptional granularity in behaviour, and speed and variation in running experiments.

The models

Jupiter is trying to make people save more now. Our methods are integrally linked to taking action. So we have built two machine learning models to predict:

  • Immediate save: If we offer you a given “boost” (a chance to play a game with prizes, a direct boost, a dose of loyalty points) right now, are you likely to make a save (anchored to a target balance or target save amount) within the next 24–48 hours?
  • Save frequency: If we provide you some combination of content, boosts, and other attention, are you likely to become a more frequent saver?

Right now the models we’re using are fairly standard regressors and boosted trees/forests (apologies for violating the startup hype code of “pretend you have crazy amazing AI models” — if you don’t know it by now, 99% of those are a notebook with maybe 5 lines, including “import sklearn”).

We train the models on our complete data, after a little cleaning, e.g., to remove internal users and to exclude the onboarding boosts and messages (since for this purpose we are interested in users’ later behaviour). The models achieve accuracy levels of ~0.7, putting them below what you’d see on simple tasks or tasks with huge datasets, but above, say, Facebook’s hate-speech tagging.
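For the curious, here is a minimal sketch of that kind of pipeline. The file and column names are hypothetical illustrations, not our actual schema; it just shows the clean-train-score loop described above.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical export from the data lake (illustrative schema).
df = pd.read_csv("user_boost_events.csv")

# Cleaning along the lines described: drop internal users and
# onboarding-time boosts, since we care about later behaviour.
df = df[~df["is_internal_user"] & ~df["is_onboarding_boost"]]

features = ["prior_save_count", "prior_boost_redemptions", "day_of_month",
            "reward_to_save_ratio", "days_since_last_save"]
X, y = df[features], df["saved_within_48h"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
model = GradientBoostingClassifier().fit(X_train, y_train)
print(f"accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```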

The results

After training the model we use “SHAP(ley) values” and decision-tree importance scores to understand the relative importance of different features. A “feature” means some characteristic of a user — how many times they have opened the Jupiter app, or when they opened their account, or when they last saved, and so on. The importance scores and SHAP values then tell us, “how much did this particular feature affect this prediction?”
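In code, that step looks roughly like this, continuing the hypothetical sketch above (`model`, `X_train` and `features` as defined there):

```python
import shap

# Attribute the fitted model's predictions feature by feature.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_train)

# shap_values[i, j] is how much feature j pushed user i's prediction
# up or down relative to the base rate.
shap.summary_plot(shap_values, X_train, feature_names=features)

# The tree's own importance scores give a global ranking analogous
# to the scores quoted below.
for name, score in sorted(zip(features, model.feature_importances_),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```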

On the immediate-save model, when trained as a gradient boosted tree:

  • How many times a user had saved before the boost was offered to them had a feature importance score of 5
  • How many times a user had claimed a boost in the past had a feature importance score of 4
  • The day of the month, our going-in hypothesis for most-likely-to-matter (think payday), was ranked only third, with a score of 2
  • No other feature comes close to these.

Some of this we did not expect. It turns out the ratio of monetary payoff to required save size has no feature importance at all. Neither does the raw size of the payout. We haven’t tested extreme values; it is quite possible that tiny payouts or extreme save-payout ratios would have an effect. But roughly half of our boosts already wash their own face at a 1% fee on AUM, i.e., the save-to-reward ratio is above 100, so the reward costs less than a 1% fee earned on the amount saved (most often achieved by using game tournaments and other “first N” type boosts).

So within a reasonable and cost-effective band of reward, how often users had previously had an emotional payoff, and how much saving was becoming a habit, mattered the most for whether they could be induced to save again.

That’s for point-in-time modelling. When we turn to the frequency model, i.e., predicting whether users will be valuable over time, we get similar results. For this, we classify users into five buckets based on saving frequency; the model then predicts each user’s bucket from the frequency of their other, non-saving actions (a sketch of this setup follows the list below). We find:

  • Overall, the three dominant predictors are: how often the user redeemed a boost, how often the user opened the Jupiter app, and whether they had won a tournament game. No other features are close.
  • When we looked deeper into predictors for the highest-frequency category, we found the same pattern repeated, though with one other feature: how often the user had engaged with in-app messaging.
  • We put a lot of effort a few months ago into convincing users to set up automatic, scheduled saves. We had fewer than 10 takers. By contrast, we have 90+ savers (~10%) in that highest-frequency category (3+ saves per month), and a further 120 in the next-highest (~2–3 saves per month).
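As promised above, a sketch of the bucketing-and-classification setup for the frequency model. Again, file and column names are hypothetical illustrations, not our actual schema.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-user monthly aggregates (illustrative schema).
users = pd.read_csv("user_monthly_summary.csv")

# Five frequency buckets, e.g. 0, up to 1, 1-2, 2-3, and 3+ saves/month.
users["freq_bucket"] = pd.cut(
    users["saves_per_month"],
    bins=[-0.01, 0.0, 1.0, 2.0, 3.0, float("inf")],
    labels=["none", "rare", "occasional", "regular", "frequent"])

# Predict the bucket purely from non-saving behaviour.
behaviour = ["boosts_redeemed", "app_opens", "tournament_wins",
             "messages_engaged"]
scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0),
    users[behaviour], users["freq_bucket"], cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```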

One question here might be whether we are just finding some innate, implicit feature that amounts to “highly predisposed to saving”. If we had a slimmer set of data and were using it in fewer ways, we would be less confident. But:

  • We can’t think of any way that a proclivity to save would somehow cause someone to win tournaments on tap-the-screen games
  • The SHAP values are strongly asymmetric on the important features, i.e., not having saved is nowhere near as negative for boost response as having saved is positive
  • It’s not clear why, in looking at frequency categories, raw app engagement numbers would matter (a “heavy saver anyway” would just open the app when they wanted to save, not all the time, or at most would be equally likely to do either)
  • If a preexisting proclivity to save were dominant, we’d expect to see users forming into a small number of cleanly separated clusters easily predicted by lifetime saves. We find the opposite (more on this to come).
  • If we add or remove preexisting saves and prior boost redemptions as features, accuracy changes by ~0.1: a big and meaningful change, but not so big as to suggest that the labels are being leaked (see the sketch just after this list).
  • Finally, the results tally with our qualitative experience: user reviews saying, “this made me love saving”, and, when we recently ran a short but deep set of user surveys, an emphasis in users’ wish-lists not on auto-saving features or raw returns, but on boost-related features.
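The ablation check mentioned in that list is simple to sketch, reusing the hypothetical `df` and schema from the first code snippet above:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

habit = ["prior_save_count", "prior_boost_redemptions"]
other = ["day_of_month", "reward_to_save_ratio", "days_since_last_save"]

# Retrain the immediate-save model with and without the habit
# features and compare held-out accuracy.
for label, cols in [("with habit features", habit + other),
                    ("without habit features", other)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        df[cols], df["saved_within_48h"], test_size=0.2, random_state=42)
    fitted = GradientBoostingClassifier().fit(X_tr, y_tr)
    print(label, round(accuracy_score(y_te, fitted.predict(X_te)), 2))
```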

To come

There’s more to explore in the future. We’re still just starting to understand the clustering of user behaviour.

When we run K-means, we find the optimal number of clusters is around 8–10, but we still need to understand what those clusters are (simple projections to low-dimensional representations aren’t immediately intuitive). We’ll also be interrogating small changes in behaviour over time, and any insights from connections among savers (via Jupiter’s “saving buddies” feature).
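For the record, that cluster-count search is a standard silhouette sweep; this sketch uses random placeholder data standing in for our real (standardized) user-by-feature behaviour matrix:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Placeholder data in place of the real behaviour matrix.
rng = np.random.default_rng(0)
behaviour_matrix = StandardScaler().fit_transform(rng.normal(size=(1000, 12)))

# Score candidate cluster counts; we pick the k with the best
# silhouette (on our real data, the optimum lands around 8-10).
for k in range(4, 13):
    labels = KMeans(n_clusters=k, n_init=10,
                    random_state=0).fit_predict(behaviour_matrix)
    print(k, round(silhouette_score(behaviour_matrix, labels), 3))
```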

In the meantime, I’ll round off by saying we were definitely surprised by how little the “rational” payout ratios mattered compared to prior emotional connections. We were almost as surprised by how much more readily people made unplanned, spontaneous saves, once they had an emotional connection and some ease with the act, than they took to setting up automated, passive investments.

The whole theory and practice of improving financial security has, for decades, been focused on getting people to be virtuous: to set up an automated deduction into savings, with a little emotional payoff once a year when they saw their balance. That hasn’t moved the needle on savings rates anywhere. Maybe it’s time to think differently?
