Do we need to understand “black box” Automated Valuation Models? Another perspective on Zillow’s algorithm problem.

Many credit Zillow’s success to its algorithms. As a real estate online marketplace, Zillow has been the data forerunner of the industry. Its algorithm-based tools, such as Zestimate, not only generate profits for the company, but also build a solid connection with the growing data science community: hundreds if not thousands of researchers, engineers, and students have been helping the company to improve its algorithms via lucrative competitions. In addition to the winning algorithms, worth $1M to the developer, the company has benefited from the talent base it connects with.

However, it was also an algorithm that caused the company’s recent debacle. The stock price plummeted when the company announced the closing of Zillow Offers last month, because the core algorithm failed to predict house prices accurately at a large scale. “We’ve determined the unpredictability in forecasting home prices far exceeds what we anticipated and continuing to scale Zillow Offers would result in too much earnings and balance-sheet volatility,” Zillow CEO Rich Barton said in a statement. As a result, the company announced a $304 million inventory write-down for Q3 2021.

What went wrong?

The failure of the house price prediction algorithm can be attributed to a myriad of possible causes, such as the market's reaction to macroeconomic policy (e.g. unprecedented monetary easing), unusual employment conditions, or the pandemic. After all, during COVID, everything seems fraught with uncertainty. However, as with the fall of LTCM (Long-Term Capital Management) in the late 90s, people’s concerns about AI algorithms stem mainly from (un)interpretability. In the LTCM case, people clearly knew (although a bit late) that the Russian crisis in 1998 broke some mathematical assumptions of the quantitative strategy. For an AI algorithm with a complex architecture and hundreds of hidden layers, it is disturbing that we know very little about what’s going on inside the black box, even though the algorithm can still do its job pretty well.

Does interpretability really matter?

We still cannot fully understand why Zillow's algorithm failed, just as we cannot fully understand why AlphaGo became a top player without being given any pre-defined strategy, or why many AI chatbots unexpectedly pick up the prejudices of human society without anyone designing them to. Thus, people can only blame (un)interpretability itself and try to interfere with or even ban the application of AI in many areas such as healthcare, justice, or social media.

However, this may be a pessimistic idea: compared to the human mind, unexplainable AI algorithms may be closer to how the world works. In this process, human knowledge probably acts as a distraction rather than a help when we are developing or using AI algorithms. We can understand this problem in terms of the basic way in which deep neural networks work. As the basic building block of a neural network, each "neuron" applies an activation function to its input values (e.g. ReLU, sigmoid, tanh, and their variants). Viewed abstractly, the activation function simply uses basic operations to achieve something like a "conditional statement": the neuron is "excited" or "silenced" depending on which conditions the input signal meets. In human neuronal cells, a similar process is accomplished by the selective binding of receptor proteins on the postsynaptic membrane to neurotransmitters. (Of course, the neuronal network of the brain is very different from the model network of deep learning.)
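To make the "conditional statement" reading concrete, here is a minimal sketch in plain Python. The weights, bias, and input values are made up for illustration, and the `neuron` helper is a hypothetical name, not taken from any particular library. It shows ReLU written literally as an if/else, next to sigmoid as a smooth version of the same gate.

```python
import math


def relu(x: float) -> float:
    # Literally a conditional: pass positive signals through, silence the rest.
    return x if x > 0.0 else 0.0


def sigmoid(x: float) -> float:
    # A smooth gate: close to 0 for very negative x, close to 1 for very positive x.
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus a bias, followed by the activation "gate".
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative (made-up) parameters for a two-input neuron.
weights = [0.5, -1.0]
bias = 0.1

excited = neuron([2.0, 0.5], weights, bias, relu)   # weighted sum is positive
silenced = neuron([0.5, 2.0], weights, bias, relu)  # weighted sum is negative

print("excited:", excited)
print("silenced:", silenced)
print("sigmoid gate at 0:", sigmoid(0.0))
```

The same inputs in a different order flip the neuron from "excited" to "silenced", which is all the conditional behavior amounts to at the level of a single unit; a network stacks many such gates.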


Figure 1 Biological neuron and its mathematical model

Why is this property of "conditional statements" interesting? We can imagine a neural network as a baby just starting to explore the world, needing to understand the rules of the external world through constant exploration and trial and error, and responding "instinctively" to complex situations (movement, communication, etc.). In this process, the infant is unable to understand generalized knowledge laws through language, but can only perform trial-and-error processes like "if (event A) happens, perform (behavior B)". In fact, the infant's motor and cognitive abilities alone far exceed those of current state-of-the-art AI models in most respects.

Of course, you could counter that "a baby could never learn by trial and error and become a top Go player, or an expert in any other field," which may be true for humans. But it is not inconceivable given the speed at which algorithms learn. In fact, rather than relying on highly generalized laws, human "experts" rely more on "tacit knowledge": shortcuts our brains have learned from empirical information, which at some point appear as inspiration, intuition, or proficiency.


Figure 2 The heuristic learning of AlphaGo Zero. Left: Simplified Go game tree on a 5 x 5 board; Right: hypothetical game tree of a standard Go game, ~250^150 possible move sequences, in which the algorithm finds a “path” heuristically.
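For a sense of scale, the ~250^150 figure in the caption can be worked out directly. The sketch below assumes the usual reading of that number, roughly 250 legal moves per turn over roughly 150 turns; the variable names are ours, and both values are rough approximations rather than exact properties of Go.

```python
import math

# Assumed reading of the caption's ~250^150 estimate.
BRANCHING_FACTOR = 250  # approximate number of legal moves per turn
GAME_LENGTH = 150       # approximate number of turns in a game

# Work in log space: log10(250^150) = 150 * log10(250).
exponent = GAME_LENGTH * math.log10(BRANCHING_FACTOR)

# Python integers are arbitrary precision, so the exact value can also be built.
num_digits = len(str(BRANCHING_FACTOR ** GAME_LENGTH))

print(f"250^150 is about 10^{exponent:.1f}, a number with {num_digits} digits")
```

A tree of that size cannot be enumerated, which is why the algorithm has to find a "path" heuristically rather than by exhaustive search.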

AI for real estate: we are still optimistic

We favor accepting approaches that lack explanations and using AI to deepen our understanding of reality, rather than limiting it with our obsession with interpretability. In fact, the real estate field is (at least in theory) well suited to a methodology like AI that can grasp complex interactions, fine-grained statistical correlations, and contingency.

The real estate field is full of generalizations, such as the clichéd “location, location, location.” But behind “location” are tens of thousands of factors that influence the real estate market, from macro-level demographics, infrastructure, energy supply, and climate change, to the most micro level of user experience, indoor environmental quality, and even what a person saw on social media a minute before, or the CO2 level in his/her brain. Constantly revealing these overlooked influences is an exciting process. Moreover, is it possible that AI, without the input of generalized knowledge, can uncover more hidden factors and their fine-grained interconnections?

AI algorithms themselves naturally have the potential to do so. The short-term limitation comes mainly from the training data on which AI depends. But we are mining new data in many interesting areas, such as environmental quality, individual behavior, energy and infrastructure, and so on. At the same time, more AI algorithms that make efficient use of data to extract information are being created. There is reason to be optimistic about this.

Despite a bit of a setback, Zillow's stock price is still twice what it was when the company started its iBuying program, which is probably another factor that keeps us optimistic.



Activation functions, and which is better on what task: Glorot, X., Bordes, A., and Bengio, Y. (2011b). Deep sparse rectifier neural networks. In JMLR W&CP: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011). 130, 297

How AlphaGo works: Pumperla, Max, and Kevin Ferguson. Deep learning and the game of Go. Vol. 231. Manning Publications Company, 2019, and https://nikcheerla.github.io/d...

Discussion on the black-box nature of deep learning models-answer by Moenova from Toronto University: https://www.zhihu.com/question...

What can we learn from deep learning https://mp.weixin.qq.com/s/vhG...

WSJ: What Went Wrong With Zillow? A Real-Estate Algorithm Derailed Its Big Bet

The company had staked its future growth on its digital home-flipping business, but getting the algorithm right proved difficult: https://www.wsj.com/articles/z...