by Malay Haldar, Liwei He and Moose Abdool

Airbnb connects millions of guests and hosts every day. Most of these connections are forged through search, and the search results are ordered by a ranking algorithm powered by a neural network. While this neural network is adept at selecting *individual listings* for guests, we recently enhanced it to better select the overall *collection of listings* that makes up a search result. In this post, we dive into this recent improvement, which increases listing diversity in search results.

The ranking neural network finds the best listings to surface for a given query by comparing two listings at a time and predicting which one is more likely to be booked. To produce this probability estimate, the neural network places different weights on listing attributes such as price, location, and reviews. These weights are refined by comparing booked vs. unbooked listings from search logs, with the goal of assigning higher probabilities to booked listings than to unbooked ones.
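To make the pairwise setup concrete, here is a minimal sketch of that training scheme, with a simple linear scorer standing in for the ranking neural network. All names and feature choices are illustrative, not Airbnb's actual model.

```python
# Pairwise training sketch: a linear scorer stands in for the network.
import numpy as np

rng = np.random.default_rng(0)

# Each listing is a toy feature vector: [price, distance, review_score]
booked = rng.normal(0.0, 1.0, size=(1000, 3))    # features of booked listings
unbooked = rng.normal(0.5, 1.0, size=(1000, 3))  # features of unbooked listings

w = np.zeros(3)  # weights the "network" places on each attribute

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.1
for _ in range(200):
    # P(booked listing wins the pairwise comparison)
    p = sigmoid(booked @ w - unbooked @ w)
    # Gradient of the pairwise logistic loss -log(p)
    grad = ((p - 1.0)[:, None] * (booked - unbooked)).mean(axis=0)
    w -= lr * grad

# After training, booked listings should outscore unbooked ones on average
accuracy = (booked @ w > unbooked @ w).mean()
print(accuracy)
```

The gradient step nudges the weights so that the booked listing of each pair receives the higher score, which is exactly the "assign higher probabilities to booked listings" objective stated above.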

What does the ranking neural network learn in the process? One concept the neural network picks up is that lower prices are preferred. This is illustrated in the figure below, which plots price increases on the x-axis against their effect on normalized model scores on the y-axis. Increasing the price causes model scores to drop, which makes intuitive sense since most bookings on Airbnb lean towards the budget range.

But price is not the only attribute for which the model learns such concepts. Other attributes, such as a listing's distance from the query location, number of reviews, number of bedrooms, and photo quality, show similar trends. A large part of the neural network's complexity lies in balancing all these factors, tuning them to the best possible trade-offs across all cities and all seasons.

By construction, the ranking neural network's estimate of booking probability for a listing is determined by how many guests have booked listings with similar combinations of price, location, reviews, etc. in the past. Higher booking probability essentially translates into what most guests have preferred in the past. For example, there is a strong correlation between high booking probability and low listing price. Booking probabilities are personalized based on location, number of guests, and trip length, among other factors. But in each context, the ranking algorithm elevates the listings that the majority of the guest population would have preferred. This logic is repeated at every position in the search result, so the entire search result is constructed to favor the preference of the majority of guests. We refer to this as the *Majority Ranking Principle*: the overwhelming tendency of the ranking algorithm to follow the majority at every position.
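Filling every position with the remaining majority favorite reduces to simply sorting by estimated booking probability. A toy sketch (listing IDs and probabilities are invented for illustration):

```python
# Majority Ranking Principle in miniature: every position goes to the
# remaining listing with the highest estimated booking probability,
# which is equivalent to a descending sort on that probability.
listings = [
    {"id": "A", "p_booking": 0.30},
    {"id": "B", "p_booking": 0.55},
    {"id": "C", "p_booking": 0.10},
]

ranked = sorted(listings, key=lambda l: l["p_booking"], reverse=True)
print([l["id"] for l in ranked])  # -> ['B', 'A', 'C']
```

Because each position is decided independently by the same majority-driven score, the top of the result can end up saturated with near-identical listings, which is the problem the rest of the post addresses.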

But majority preference is not the best way to represent the preferences of the entire guest population. Continuing with the theme of price, we look at the distribution of booked prices for a popular destination, Rome, focusing specifically on two-night trips for two guests. This lets us isolate price differences due to listing quality and eliminate most other variables. The figure below plots the distribution.

The x-axis corresponds to booking values in USD on a log scale. The left y-axis is the number of bookings at each price point on the x-axis. The orange histogram confirms that booking values follow a log-normal distribution. The red line plots the percentage of total bookings in Rome with a booking value less than or equal to the corresponding point on the x-axis, and the green line plots the percentage of Rome's total booking value covered by those bookings. Splitting the total booking value 50/50 divides the bookings into two unequal groups of roughly 80/20. In other words, 20% of bookings account for 50% of the booking value. For this 20% minority, cheaper is not necessarily better; their preference leans more towards quality. This echoes the *Pareto principle* and gives us a coarse view of the heterogeneity of preferences among guests.
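The 80/20 split is a direct consequence of the log-normal shape. A quick simulation shows this; the log-normal parameters below are made-up stand-ins, not fitted to Airbnb's Rome data.

```python
# Simulate booking values from a log-normal distribution (as the Rome
# plot suggests) and find what share of bookings covers half the value.
import numpy as np

rng = np.random.default_rng(1)
values = rng.lognormal(mean=5.0, sigma=0.8, size=100_000)  # illustrative params

values_desc = np.sort(values)[::-1]          # most expensive bookings first
cum_value = np.cumsum(values_desc) / values_desc.sum()
k = np.searchsorted(cum_value, 0.5) + 1      # bookings needed for 50% of value
share_of_bookings = k / len(values)
print(f"{share_of_bookings:.0%} of bookings cover 50% of booking value")
```

With these parameters the top ~20% of bookings cover about half the total value, matching the 80/20 pattern described above; the exact split depends on the sigma of the distribution.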

While the Pareto principle suggests the need to accommodate a broader range of preferences, the Majority Ranking Principle sums up what happens in practice. When it comes to ranking search results, the two principles are at odds.

Lack of listing diversity in search results can alternatively be seen as listings being too similar to one another. Reducing the similarity between listings can therefore remove listings that are redundant choices to begin with. For example, instead of dedicating every position in the search result to cheap listings, we can use some positions for quality listings. The challenge is how to quantify this similarity between listings, and how to balance it against the booking probabilities estimated by the ranking neural network.

To solve this problem, we build another neural network, a companion to the ranking neural network. The job of this similarity neural network is to estimate how similar a given listing is to the listings already placed in the search result.

To train the similarity neural network, we construct training data from logged search results. We discard all search results where the booked listing appears in the first position. For each remaining search result, we set aside the listing in the first position as a special listing, called the antecedent listing. From the listings in the second position onwards, we create pairs of booked and unbooked listings. This is summarized in the figure below.
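A minimal sketch of that data construction follows; the log schema and field names are assumptions for illustration, not Airbnb's actual pipeline.

```python
# Build (antecedent, booked, unbooked) training triples from logged
# search results, per the scheme described above.
def build_examples(search_logs):
    examples = []
    for result in search_logs:
        listings = result["listings"]           # listings in ranked order
        booked_idx = result["booked_position"]  # position of the booked listing
        if booked_idx == 0:
            continue                            # discard: booked listing was first
        antecedent = listings[0]                # the skipped top listing
        booked = listings[booked_idx]
        # Pair the booked listing with each unbooked listing below position 0
        for pos, listing in enumerate(listings[1:], start=1):
            if pos == booked_idx:
                continue
            examples.append({
                "antecedent": antecedent,
                "booked": booked,
                "unbooked": listing,
            })
    return examples

logs = [{"listings": ["L0", "L1", "L2", "L3"], "booked_position": 2}]
examples = build_examples(logs)
print(len(examples))  # -> 2 pairs, both anchored to antecedent "L0"
```

Each triple records that the guest passed over the antecedent listing yet booked something further down, which is the signal the similarity network learns from.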

We then train the network to assign a higher booking probability to the booked listing than to the unbooked listing, but with one modification: we subtract the output of the similarity neural network, which estimates the similarity between the given listing and the antecedent listing. The reasoning is that guests who skipped the antecedent listing and then booked a listing from the results below must have chosen something different from the antecedent. Otherwise, they would have booked the antecedent listing itself.
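The modified objective can be sketched as follows. The two networks are represented by toy stand-in functions here; the real models and their features are described in the technical paper.

```python
# Modified pairwise objective: each candidate's score is discounted by
# its similarity to the antecedent listing before the usual
# booked-vs-unbooked comparison. rank_net and sim_net are stand-ins.
import math

def pairwise_loss(rank_net, sim_net, antecedent, booked, unbooked):
    # Discounted score = booking logit - similarity to the antecedent
    s_booked = rank_net(booked) - sim_net(antecedent, booked)
    s_unbooked = rank_net(unbooked) - sim_net(antecedent, unbooked)
    # Logistic loss pushing the booked listing's discounted score higher
    p = 1.0 / (1.0 + math.exp(-(s_booked - s_unbooked)))
    return -math.log(p)

# Toy stand-ins: higher first feature = higher booking logit;
# similarity = negative absolute feature distance to the antecedent
rank_net = lambda x: x[0]
sim_net = lambda a, x: -abs(a[0] - x[0])

loss = pairwise_loss(rank_net, sim_net, (1.0,), (2.0,), (0.5,))
print(loss > 0.0)
```

Because the similarity term is subtracted inside the loss, gradients flow into the similarity network as well, so it learns to output high similarity exactly when a listing is a redundant stand-in for the antecedent.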

Once trained, we’re ready to use the similarity network to rank listings online. We start by filling the top position with the listing that has the highest booking probability. For each subsequent position, we select the listing with the highest booking probability among those remaining, after discounting its similarity to the listings already placed above it. The search result is built iteratively, with each position trying to differ from the positions above it. Listings that are too similar to those already placed are effectively demoted, as shown below.
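The iterative construction can be sketched as a greedy loop. The similarity function and the penalty weight below are toy stand-ins for the trained similarity network and its learned scale.

```python
# Greedy diverse re-ranking: each position gets the remaining listing
# whose booking probability, discounted by its maximum similarity to the
# listings already placed, is highest.
def diverse_rank(candidates, p_booking, similarity, penalty=1.0):
    placed, remaining = [], list(candidates)
    while remaining:
        best = max(
            remaining,
            key=lambda c: p_booking[c] - penalty * max(
                (similarity(c, prev) for prev in placed), default=0.0
            ),
        )
        placed.append(best)
        remaining.remove(best)
    return placed

# Toy example: A and B are near-duplicates; C is different but slightly
# less likely to be booked on its own.
p = {"A": 0.9, "B": 0.85, "C": 0.6}
sim = lambda x, y: 0.8 if {x, y} == {"A", "B"} else 0.1
print(diverse_rank(["A", "B", "C"], p, sim))  # -> ['A', 'C', 'B']
```

Note how B, despite having the second-highest booking probability, is pushed below C: its similarity discount against the already-placed A outweighs its raw score, which is exactly the demotion of redundant listings described above.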

Following this strategy led to one of the most impactful ranking changes in recent times. We observed a 0.29% increase in uncancelled bookings, along with a 0.8% increase in booking value. The booking value increase is far larger than the bookings increase because it is dominated by high-quality listings that correlate with higher value. Booking value gives us a reliable proxy for measuring quality improvements, although increasing booking value is not the goal in itself. We’ve also seen direct evidence of increased booking quality: a 0.4% increase in 5-star ratings, indicating higher guest satisfaction across the entire trip.

We’ve discussed reducing the similarity between listings to improve the overall usefulness of search results and accommodate diverse guest preferences. While the idea is intuitive, putting it into practice required a rigorous machine-learning foundation, which is described in our technical paper. Next, we’re taking a deeper look at the location diversity of results. We welcome all comments and suggestions regarding the paper and this blog post.

*Interested in working at Airbnb? Check out these open roles.*