Netflix takes great pride in offering a highly personalized experience for every user across a wide array of platforms. One could argue that more than 75 million versions of Netflix exist, one for each of its users. While there is some overlap between suggestions, the recommendations are tailored to each person’s viewing behaviour. So how does Netflix actually accomplish such personalization? The answer is the array of machine learning algorithms at play behind the scenes. As part of the data science department at Mediaan, I work with machine learning algorithms on a daily basis and would like to provide some insight into these algorithms.
Netflix got rid of the 5-star rating system?
Most recommendation systems work by having users rate products on a scale, which for Netflix used to be the 5-star rating system. The company decided to get rid of this system a while ago and move to a simpler one: a thumbs-up or thumbs-down, together with a percentage indicating how well each movie matches you. This change was made for a variety of reasons, including the fact that people tend to rate movies “for other people”, so their ratings become biased and do not reflect their actual preferences. To a certain extent, this also changes the algorithm, namely its input and its output. Instead of receiving a star rating from 1 to 5 as input, it now only receives a binary yes or no depending on whether someone likes a movie or not. As for the output, it now needs to give a percentage match between a person and a movie.
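To make the change in input and output concrete, here is a minimal sketch of how binary thumbs could be turned into a percentage match. It is not Netflix's actual method: the functions `agreement` and `match_percentage` are hypothetical, and the approach (weighting other users' thumbs by how often they agree with you) is just one simple neighbourhood-style technique chosen for illustration.

```python
def agreement(votes_a, votes_b):
    """Fraction of co-rated titles on which two users gave the same thumb."""
    shared = set(votes_a) & set(votes_b)
    if not shared:
        return 0.0
    same = sum(1 for title in shared if votes_a[title] == votes_b[title])
    return same / len(shared)


def match_percentage(user_votes, other_users, title):
    """Agreement-weighted share of thumbs-up for `title`, as a percentage.

    Votes are dicts mapping a title to True (thumbs-up) or False (thumbs-down).
    Returns None when no user with a non-zero agreement has rated the title.
    """
    weighted_likes = 0.0
    total_weight = 0.0
    for votes in other_users:
        if title not in votes:
            continue
        weight = agreement(user_votes, votes)
        weighted_likes += weight * (1 if votes[title] else 0)
        total_weight += weight
    if total_weight == 0:
        return None
    return round(100 * weighted_likes / total_weight)


# Made-up example data
me = {"Black Mirror": True, "The Crown": False}
others = [
    {"Black Mirror": True, "The Crown": False, "Stranger Things": True},
    {"Black Mirror": False, "The Crown": False, "Stranger Things": False},
    {"Black Mirror": True, "Stranger Things": True},
]
print(match_percentage(me, others, "Stranger Things"))
```

The input is purely binary, while the output is a percentage, which mirrors the shift described above. A production system would need far more data and a more robust model than this toy weighting.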
What else has changed?
This is not the only change that Netflix has implemented. Since the company launched in 130 additional countries two years ago, it has decided to use more data to select the best recommendations. The company now makes recommendations based on its global audience, so suggestions are no longer bound by what viewers in a specific region watch. This, in essence, is what made such a rapid expansion possible.
So how does it work?
Even with all these changes, the essence of how things work remains quite similar to before. The main idea is that users are clustered into communities based on their viewing history and their likes or dislikes. Each movie or TV show is manually tagged by Netflix employees (for example, Black Mirror would get tags such as sci-fi, dark, technology, etc.). This way “similar” movies can be identified and people with common interests can be put in the same communities, which makes recommendations based on people like you possible.
Another cool thing about Netflix is that it considers your viewing patterns as well. By this I mean that it registers the time and day of the week (e.g. are you chilling at home on a Saturday night or just back from work on a Thursday), the scrolling pattern (i.e. how much time you spend looking at each row), the pauses (i.e. the times when you paused a video and then continued watching), the videos you stopped halfway through (i.e. shows or movies you quit watching altogether, maybe because you did not enjoy them), etc. All this information is fed to the array of algorithms so that they can provide the best service possible.
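The idea of grouping users by the tags of what they watch can be sketched in a few lines. This is only an illustration under stated assumptions: the tags in `TAGS` are made up, the function names are hypothetical, and the greedy threshold grouping used here is a deliberately simple stand-in for whatever clustering Netflix really uses.

```python
from collections import Counter
from math import sqrt

# Hypothetical manual tags per title (illustrative only)
TAGS = {
    "Black Mirror": ["sci-fi", "dark", "technology"],
    "Stranger Things": ["sci-fi", "dark", "nostalgia"],
    "The Crown": ["drama", "history"],
    "Downton Abbey": ["drama", "history", "romance"],
}


def tag_profile(history):
    """Count how often each tag appears across the titles a user watched."""
    profile = Counter()
    for title in history:
        profile.update(TAGS.get(title, []))
    return profile


def cosine(a, b):
    """Cosine similarity between two tag-count profiles."""
    dot = sum(a[tag] * b[tag] for tag in a if tag in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_communities(histories, threshold=0.5):
    """Greedy grouping: a user joins the first community whose founding
    member has a similar enough tag profile, otherwise starts a new one."""
    communities = []  # list of (founder_profile, [user names])
    for user, history in histories.items():
        profile = tag_profile(history)
        for founder_profile, members in communities:
            if cosine(profile, founder_profile) >= threshold:
                members.append(user)
                break
        else:
            communities.append((profile, [user]))
    return [members for _, members in communities]


histories = {
    "ana": ["Black Mirror"],
    "ben": ["Stranger Things", "Black Mirror"],
    "cem": ["The Crown"],
    "dee": ["Downton Abbey", "The Crown"],
}
print(build_communities(histories))
```

With this toy data, the two sci-fi viewers end up in one community and the two period-drama viewers in another, so recommendations for "ana" could be drawn from what "ben" watched.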
The algorithms: the recommendations
The algorithms below make up the recommendation system Netflix uses. For more detailed information look into this paper.
- The personalized video ranker – This algorithm orders the entire catalogue of videos for each member profile in a personalized way. It is responsible for the “genre rows” (i.e. rows like Movies with a Strong Female Lead).
- Top N video ranker – This algorithm produces the recommendations for the “Top picks” row. It finds the best few personalized recommendations in the entire catalogue for each member.
- Trending Now – This algorithm detects the “short-term trends” for the “Trending Now” row. There are two types of trends: seasonal trends that repeat every year (such as Halloween, Christmas, etc.) and one-off events that cause a spike in interest in a certain category or movie (such as a hurricane driving interest in natural disaster documentaries).
- Continue Watching – This algorithm ranks the videos in the “Continue Watching” row based on how likely you are to resume watching a certain video.
- Video-Video similarity – This algorithm deals with the “Because you watched …” rows. It is a two-part process: the first part generates a list of similar videos for each video in the catalogue (which is un-personalized), and the second part produces a personalized ranking of the videos within the row.
- Page Generation: Row Selection and Ranking – This algorithm is responsible for generating the whole page (i.e. which rows to show where on the page, based on relevance and diversity).
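The two-part process behind the “Because you watched …” rows can be sketched as follows. This is a simplified illustration and not Netflix's implementation: the catalogue tags are made up, the function names are hypothetical, and Jaccard similarity over tags is an assumed stand-in for the real similarity measure.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two videos have in common (0.0 to 1.0)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_videos(catalogue, top_k=3):
    """Part 1 (un-personalized): for every video, list the top_k most
    similar other videos, ignoring videos with no tags in common."""
    result = {}
    for title, tags in catalogue.items():
        scored = [
            (jaccard(tags, other_tags), other)
            for other, other_tags in catalogue.items()
            if other != title
        ]
        scored = [pair for pair in scored if pair[0] > 0]
        # Highest similarity first; ties broken alphabetically
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[title] = [other for _, other in scored[:top_k]]
    return result


def because_you_watched(watched_title, similar, catalogue, liked_tags):
    """Part 2 (personalized): reorder the row by how many of the member's
    liked tags each candidate carries. The sort is stable, so ties keep
    their part-1 order."""
    row = similar.get(watched_title, [])
    return sorted(row, key=lambda title: -len(set(catalogue[title]) & liked_tags))


# Made-up catalogue tags
catalogue = {
    "Black Mirror": ["sci-fi", "dark", "technology"],
    "Stranger Things": ["sci-fi", "dark", "nostalgia"],
    "Altered Carbon": ["sci-fi", "technology", "action"],
    "The Crown": ["drama", "history"],
}
similar = similar_videos(catalogue)
print(similar["Black Mirror"])
print(because_you_watched("Black Mirror", similar, catalogue, {"dark", "nostalgia"}))
```

The first list is the same for every member, while the second depends on the member's liked tags, which is the split between the un-personalized and personalized parts.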
These are not the only algorithms responsible for the personalization. The information on the left of the page, such as the short summary of the video, awards, cast, the thumbnail or other metadata, is all selected by what they call “evidence selection algorithms”. Let’s consider the artwork as an example: the movie “Good Will Hunting” could have different thumbnails for people who are part of different communities. If you like comedy movies, it could feature Robin Williams in the thumbnail (as a well-known comedian), and if you are more into romantic films, the thumbnail could show Matt Damon and Minnie Driver.
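As a rough sketch of the artwork example, the selection could look like the snippet below. Everything here is assumed for illustration: the function `choose_artwork`, the file names and the per-genre affinity scores are hypothetical and do not describe how Netflix's evidence selection actually works.

```python
def choose_artwork(artworks, genre_affinity, default="default"):
    """Pick the artwork variant for the genre the member likes most.

    `artworks` maps a genre (or the default key) to an image name, and
    `genre_affinity` maps a genre to a preference score for this member.
    Falls back to the default artwork when no variant matches.
    """
    candidates = {
        genre: image
        for genre, image in artworks.items()
        if genre in genre_affinity
    }
    if not candidates:
        return artworks[default]
    best_genre = max(candidates, key=lambda genre: genre_affinity[genre])
    return candidates[best_genre]


# Hypothetical artwork variants for one movie
artworks = {
    "comedy": "artwork_comedy.jpg",
    "romance": "artwork_romance.jpg",
    "default": "artwork_poster.jpg",
}
print(choose_artwork(artworks, {"comedy": 0.8, "romance": 0.3}))
print(choose_artwork(artworks, {"horror": 0.9}))
```

A comedy fan would see the comedy variant, while a member whose tastes match none of the variants would get the default poster.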
Search is another one of the personalized features offered by Netflix. Even though it accounts for only 20% of the chosen videos, it requires its own set of algorithms. Since users can search for anything (videos, actors, genres), including things that might not be in the catalogue, search itself becomes similar to a recommendation problem. The search algorithms combine metadata, search data and play data to arrive at the results. You can search for a movie name, a director, an actor/actress, a genre, the video quality and even the language. However, it is possible that the Netflix catalogue in the country you are in does not contain the video you are looking for. In that case, Netflix will try to recommend similar videos and sort them by how much you would probably like each one.
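The fallback behaviour described above, where a search for an unavailable title turns into a recommendation, can be sketched like this. It is a toy illustration only: the `search` function, its parameters and the tag data are hypothetical, and real search would combine metadata, search data and play data rather than simple tag overlap.

```python
def search(query, regional_catalogue, global_tags, liked_tags, limit=3):
    """Return matching titles available in the region. If the title exists
    only in the global metadata, return similar available titles instead,
    ranked by the member's liked tags and then by similarity."""
    q = query.strip().lower()
    hits = [title for title in regional_catalogue if q in title.lower()]
    if hits:
        return hits

    # Title is known globally but not offered here: fall back to similar titles
    wanted = next(
        (tags for title, tags in global_tags.items() if q in title.lower()),
        None,
    )
    if wanted is None:
        return []
    wanted = set(wanted)

    scored = []
    for title in regional_catalogue:
        tags = set(global_tags.get(title, []))
        overlap = len(tags & wanted)
        if overlap:
            scored.append((len(tags & liked_tags), overlap, title))
    # Member preference first, then similarity to the searched title, then name
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [title for _, _, title in scored[:limit]]


# Made-up metadata and a regional catalogue that lacks "Black Mirror"
global_tags = {
    "Black Mirror": ["sci-fi", "dark", "technology"],
    "Stranger Things": ["sci-fi", "dark", "nostalgia"],
    "Altered Carbon": ["sci-fi", "technology", "action"],
    "The Crown": ["drama", "history"],
}
regional = ["Stranger Things", "Altered Carbon", "The Crown"]
print(search("black mirror", regional, global_tags, liked_tags={"action"}))
```

Here the member searched for a title that is not available locally, so the sketch returns the available titles that share tags with it, putting the one that matches the member's liked tags first.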
Nowadays, recommender systems are very widely used: in Netflix-like services, social media, online shopping and internet advertisements (which I am sure we all love). Almost everywhere you go on the internet, recommendations seem to pop up. Instagram keeps showing “Similar profiles you may like”, YouTube recommends channels and videos based on your earlier viewing experience, Amazon suggests things you could buy, etc. But what else can these recommender systems be used for? In essence, the function of a recommender system is to suggest things people may like based on data about other things they liked. So other applications of this method could include predicting and recommending services or products to new and existing customers in different industries, for example in insurance and telecom.