What We Can Learn from Zillow on Basing a Business Around Machine Learning

Ten lessons that you can apply to your own AI strategy

Published in Towards Data Science · 7 min read · Nov 7, 2021

Zillow recently announced that it is exiting the home-flipping business, and blamed the inability of its “instant buying” model to forecast prices properly. Zillow also laid off 25% of its data science team. It’s worth looking for lessons that we can glean from this episode.

Machine Learning for One-Way Doors

Jeff Bezos popularized the idea that some decisions are two-way doors: you can walk through the door, and if you don’t like what you see, you can always walk back. These decisions are easy to reverse, and because of this, they are ideal for experimentation. Other decisions are one-way doors: once you walk through, it is hard to go back.

Automated machine learning is ideal for two-way door decisions. Personalization, recommendations, chatbots, and the like are usually two-way doors.

For Zillow, however, starting a home-flipping program was a one-way door. They should have been looking for ways to make it a two-way door.

Here are ways that you can make a machine learning project a two-way door:

1. Human in the loop. Use the machine learning model as a decision aid, not as the decider. Zillow should have had a human, on-the-ground appraiser approve the final bids.
2. Start small. For example, start in just a small region, or put a cap on the amount of money that you will put into the program. Then, wait. In this case, Zillow would have needed to wait 6+ months to see how the initial investments panned out.
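
To make the two ideas above concrete, here is a minimal Python sketch of a gate that combines human sign-off with a capped pilot budget. The `PilotGate` class, its field names, and the dollar amounts are all hypothetical; nothing here reflects how Zillow actually structured its program.

```python
from dataclasses import dataclass


@dataclass
class PilotGate:
    """Hypothetical gate for a capped pilot with human sign-off."""
    capital_cap: float      # total pilot budget (assumed value)
    committed: float = 0.0  # money already committed to purchases

    def decide(self, model_bid: float, appraiser_approves: bool) -> str:
        # The model only proposes a bid; a human appraiser must approve it.
        if not appraiser_approves:
            return "rejected_by_appraiser"
        # Never let the pilot exceed its capital cap.
        if self.committed + model_bid > self.capital_cap:
            return "rejected_cap_reached"
        self.committed += model_bid
        return "bid_placed"


gate = PilotGate(capital_cap=1_000_000)
print(gate.decide(400_000, appraiser_approves=True))   # bid_placed
print(gate.decide(300_000, appraiser_approves=False))  # rejected_by_appraiser
print(gate.decide(700_000, appraiser_approves=True))   # rejected_cap_reached
```

The point of the design is that both checks sit outside the model: a bad prediction can waste an appraiser’s time, but it cannot commit money on its own.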

It appears that Zillow did do #2, hit their capacity for 2021, looked carefully at the portfolio, and exited the business. In that sense, this is not a fiasco. Just a failed experiment. One that impacts 25% of their data science team. It appears that the cost of the pilot was itself too high.

This makes me suspect that Zillow’s executive team didn’t understand the extent to which even their pilot was a one-way door. The structure of Zillow’s program, which decides to buy a house if the difference between the current price and the estimated price 6 months from now is large enough, requires the model to be right for six whole months.

There are 3 machine learning models involved in the program:

A price estimate model. What is the home worth today on the market? This is Zillow’s Zestimate, the model they have the most internal data on.
A repair estimate model. How much will it cost to fix up the home to sell?
A future price estimate model. How much will the house sell for in 6 months?
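
As a rough illustration of how the three estimates might feed a single buy decision, here is a minimal Python sketch. The function name, the 5% margin threshold, and the dollar figures are my own assumptions, and the sketch ignores holding and transaction costs.

```python
def should_buy(price_today: float, repair_cost: float,
               price_in_6_months: float, min_margin: float = 0.05) -> bool:
    """Buy only if the projected margin clears a threshold.

    All three inputs are model outputs, so each carries its own error.
    min_margin is an assumed threshold, not a figure from Zillow.
    """
    total_cost = price_today + repair_cost
    projected_margin = (price_in_6_months - total_cost) / total_cost
    return projected_margin >= min_margin


# With an optimistic 6-month forecast, the house looks like a buy.
print(should_buy(300_000, 15_000, 340_000))  # True
# If that forecast was 5% too high (340,000 * 0.95), it no longer is.
print(should_buy(300_000, 15_000, 323_000))  # False
```

Notice how thin the buffer is: a 5% error in just one of the three inputs is enough to flip the decision.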

Let’s take these models one by one, because there are useful lessons in each of them.

Price Estimate: Design Agile Systems

Let’s look at the first model in Zillow’s system. The Zestimate is what they believe is their key differentiation, and the reason they believed they could flip houses.

The price estimate system is not trying to predict future prices, just the price that the house would sell for today. Even so, it will experience drift.

Every feature in a machine learning model is subject to drift: the distribution of the data the model sees in production will not match the distribution of the training data. Even a model that has learned not to overfit has only learned to avoid overfitting in ways that would cause failure on the validation data. It is still dependent on the distribution of the training, validation, and test data. The future, however, will be different from all three of these.

3. You will need to do continuous evaluation to catch these changes quickly. Once you detect a change, you will need to continuously retrain your models. This is a key aspect of MLOps.
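
One simple way to do continuous evaluation is to compare the model’s error on recent sales against the error measured at training time. The sketch below is a minimal example under that assumption; the `DriftMonitor` class, the window size, and the 1.5x tolerance are hypothetical choices, not a description of any production system.

```python
from collections import deque
from statistics import median


class DriftMonitor:
    """Flags retraining when recent pricing error outgrows the baseline."""

    def __init__(self, baseline_error: float, window: int = 50,
                 tolerance: float = 1.5):
        self.baseline_error = baseline_error  # median error at training time
        self.tolerance = tolerance            # allowed multiple of baseline
        self.errors = deque(maxlen=window)    # most recent relative errors

    def record_sale(self, predicted: float, actual: float) -> None:
        # Absolute percentage error of the estimate against the sale price.
        self.errors.append(abs(predicted - actual) / actual)

    def needs_retraining(self) -> bool:
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough recent sales to judge
        return median(self.errors) > self.tolerance * self.baseline_error
```

A monitor like this only reacts after enough sales have closed, which is exactly the lag that makes the beginning of a trend hard to catch.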

Catching the beginning of a trend is hard, though. However good you are at MLOps, there will be a gap between a change in the distribution and your catching that change. This gap presents an opportunity, and users will find a way to game your system before you catch on to the trend.

4. Look for behavioral tricks by users. Do you offer a price discount to people who leave items in their cart overnight? Users will start to add items and go away, waiting for you to send them a discount coupon. Does Zillow make larger offers on houses near currently sold houses? Users will learn, and share, those tricks.

5. Look for unusual concentrations of activity. Sellers often have more information than the buyer. They may know that an airport is about to be built and will cause home values in a neighborhood to drop. Perhaps a new employer is about to come to town. Maybe interest rates are about to rise. Those in the know will rush to sell before the algorithm catches on.

This sort of model monitoring is very common in games and anti-abuse situations. I’d be surprised if Zillow wasn’t doing some of this. However, it’s possible that they did not prepare what action to take if they did find such activity.
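
As a sketch of what looking for unusual concentrations might involve, the Python function below flags a neighborhood when this week’s count of seller inquiries is far above its own history. The function name, the weekly granularity, and the z-score threshold of 3 are assumptions for illustration; a real system would likely need something more robust.

```python
from statistics import mean, stdev
from typing import Sequence


def unusual_activity(weekly_history: Sequence[int], this_week: int,
                     z_threshold: float = 3.0) -> bool:
    """Return True if this week's count is a spike relative to history.

    weekly_history needs at least two past weeks for stdev to be defined.
    """
    mu = mean(weekly_history)
    sigma = stdev(weekly_history)
    if sigma == 0:
        # History is perfectly flat, so any increase counts as unusual.
        return this_week > mu
    return (this_week - mu) / sigma > z_threshold


history = [4, 5, 6, 5, 4, 6, 5, 5]    # made-up inquiries per week
print(unusual_activity(history, 12))  # True
print(unusual_activity(history, 6))   # False
```

Detecting the spike is the easy half. The harder half is deciding in advance what the business does when the flag goes up.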

6. You will have to factor in the likely extent of unavoidable losses, and set up circuit breakers to avoid buying too many houses in a given neighborhood or time frame.
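
A circuit breaker of this kind can be very simple. Here is a minimal Python sketch that caps purchases per neighborhood per month; the class name, the monthly granularity, and the limit are hypothetical.

```python
from collections import defaultdict


class PurchaseCircuitBreaker:
    """Stops purchases once a neighborhood hits its monthly limit."""

    def __init__(self, max_per_neighborhood_per_month: int):
        self.limit = max_per_neighborhood_per_month
        # Maps (neighborhood, "YYYY-MM") to the number of purchases so far.
        self.counts = defaultdict(int)

    def allow(self, neighborhood: str, month: str) -> bool:
        key = (neighborhood, month)
        if self.counts[key] >= self.limit:
            return False  # breaker tripped: no more purchases here this month
        self.counts[key] += 1
        return True


breaker = PurchaseCircuitBreaker(max_per_neighborhood_per_month=2)
print(breaker.allow("Ballard", "2021-10"))  # True
print(breaker.allow("Ballard", "2021-10"))  # True
print(breaker.allow("Ballard", "2021-10"))  # False
```

The breaker does not need to know why activity is concentrated. It just bounds how much damage a blind spot in the model can do before a human looks at it.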

Repair Estimate: Adverse Selection

Now take the second model, which estimates how much it will cost to repair the house to get it up to market expectations. The risk here is that of adverse selection. Imagine that the pricing model pretty much nails the price of a home: it gets the price that the house will sell for correct to within 5%.

It is very likely that more people whose houses are overestimated (say, by +5%) will sell their houses to Zillow, and fewer people whose houses are underestimated (by -5%) will sell. This is an effect of the asymmetry of information: sellers know things about their homes that the model does not. As a result, Zillow will buy many more overvalued houses than undervalued ones. Either Zillow makes stingy offers that very few people take, or it overpays for houses and takes a loss.

Even worse, the downside risk is uncapped, whereas the upside is capped. If you buy a house for $100K, it is unlikely to sell for $150K. But there is a significant risk that the house has an undisclosed foundation problem and will sell for only $50K.
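
The adverse selection effect is easy to see in a toy simulation. In the Python sketch below, the model’s pricing error is unbiased and uniformly spread within ±5%, and I make the deliberately stark assumption that a seller knows the true value and accepts only offers at or above it. The function and its parameters are hypothetical.

```python
import random
from statistics import mean
from typing import Tuple


def simulate_adverse_selection(n_homes: int = 10_000,
                               max_error: float = 0.05,
                               seed: int = 0) -> Tuple[float, float]:
    """Return (mean error over all offers, mean error over accepted offers)."""
    rng = random.Random(seed)
    all_errors, accepted_errors = [], []
    for _ in range(n_homes):
        # Relative pricing error of the model: unbiased across all homes.
        error = rng.uniform(-max_error, max_error)
        all_errors.append(error)
        # Assumption: sellers accept only when the offer is not below value.
        if error >= 0:
            accepted_errors.append(error)
    return mean(all_errors), mean(accepted_errors)


overall, accepted = simulate_adverse_selection()
print(f"mean error, all offers:      {overall:+.2%}")
print(f"mean error, accepted offers: {accepted:+.2%}")
```

Across all offers the average error is close to zero, but across the offers that sellers actually accept it sits around +2.5%. The model can be unbiased and the portfolio still overpaid, because the sellers choose which of the model’s mistakes turn into purchases.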

7. ML is a poor choice if the data that feeds into it is poor. The asymmetry in information leads to the ML model not having the same information as the seller and future buyer. Because the quality of machine learning models is driven much more by the quality of the data than by the sophistication of the algorithm, it’s a losing game to play in a market where you have less information than other participants.

Pricing an Option: Similar problems in different industries

Finally, let’s take the third problem. This is the model that predicts home prices 6 months into the future.

Did Zillow hire experts in pricing futures properly, or was their data science team naive enough to think that quantitative trading was nothing more than time series forecasting?

8. Hire the right type of expert. Options pricing is a well-known discipline, but it’s not machine learning. Trading in options, which is what Zillow was doing, is yet another thing. It’s quantitative trading. Machine learning models expect the world to be stationary, but traders know that future events will make past patterns irrelevant. Not hiring the right sort of expert is very much an executive failure.

Holistic Strategy

Knowing that there are three models here, it is important to consider the risk of the entire pipeline in a holistic way.

9. Know the limitations of models. Data scientists are well aware of drift. Realtors are well aware of adverse selection. Quantitative traders are well aware of the risks of trading on the basis of a model that claims to capture most of the historical variance. For every type of machine learning model that you use, you need to know what the underlying limitations are. Identify the risks of each model.

10. Design the business to separate out the risks of different models. What could Zillow have done to tease apart the risks inherent in the three models? They could have used their price estimate model, made bids for houses, and sold a tranche of houses on the market immediately, transferring the risk to traders who know how to handle it.

Had they done this, Zillow would have been in the market for buying houses at a discount to fair market value, a business that is better suited for their price estimation model.

Note: It’s needless to say, but I’ll say it anyway. Living in Seattle, I have friends who work at Zillow. ML is part of my team’s portfolio in Google Cloud. (Public sources indicate that Zillow is a customer of AWS, but it is possible that they are also a customer of ours.) Nevertheless, everything here is my personal opinion and not reflective either of my Zillow friends or of inside information about their workloads.

Note 2: The other way that Zillow could have handled the holistic risk would have been to build an actuarial team, price the risk, and hold houses for the long term, renting them out in the meantime. However, rents in the US are significantly less than the cost of servicing a mortgage, and so most landlords are betting on price appreciation making up for the sunk cost of owning a home.