YouTube's machine learning (ML) algorithm

First published on April 1, 2022

 

6 minute read

John Patrick Hinek

Growth

TLDR

YouTube’s recommendation algorithm reduces the need for search by delivering a personalized ranking of relevant videos to each user’s feed.

  • Introduction

  • YouTube’s recommender system

  • Ethical algorithms

  • Conclusion

YouTube has come a long way from its beginnings as a simple video-sharing platform. With over 5 billion videos watched every day, YouTube has become a major source of information and news. Machine learning (ML) and artificial intelligence (AI) are deployed throughout YouTube’s platform. Despite their benefits for discovery and convenience, there are concerns that the algorithms increase bias, spread misinformation, and encourage radicalization.

YouTube, like other tech giants (Netflix, Amazon, Starbucks), uses a recommendation system to get relevant videos in front of users, connecting the billions of videos on YouTube with the right audience.

In the early days of YouTube (2008), the platform’s recommendation system ranked the most popular videos and displayed them on a trending page. Because most users found content through search, product developers knew that a more sophisticated algorithm was needed to advance YouTube.

Today’s recommendation system relies on several factors to get the right content in front of users. One of the first ML predictive algorithms applied to YouTube was collaborative filtering, which makes predictions for one user based on data collected from users with a similar watch history.

For example, if users A and B both watch videos about baking cookies, and user A also watches a video about magic tricks, YouTube’s algorithm may recommend magic trick videos to user B even if user B hasn’t watched any before.
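
The cookie and magic trick example can be sketched in a few lines of Python. This is a minimal illustration of user-based collaborative filtering, not YouTube’s actual implementation: the user names, topic labels, and the choice of cosine similarity are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical watch histories: 1 means the user watched a video on that topic.
watch_history = {
    "user_a": {"baking_cookies": 1, "magic_tricks": 1},
    "user_b": {"baking_cookies": 1},
    "user_c": {"car_repair": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse topic vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, histories):
    """Score topics the target has not watched by summing the similarity
    of every other user who has watched them, then sort best first."""
    scores = {}
    for other, topics in histories.items():
        if other == target:
            continue
        sim = cosine_similarity(histories[target], topics)
        if sim == 0:
            continue  # users with nothing in common contribute nothing
        for topic in topics:
            if topic not in histories[target]:
                scores[topic] = scores.get(topic, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("user_b", watch_history))  # ['magic_tricks']
```

User B shares the baking topic with user A, so user A’s other interest (magic tricks) is recommended. User C has no overlap with user B and therefore has no influence on the result.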

YouTube’s algorithm also recommends videos based on a user’s viewing history and, over time, typically becomes very accurate at predicting what that user wants to watch. YouTube does this by comparing the videos it previously recommended with the videos that were actually watched. Watching a video tells the algorithm to serve more of that type of content; ignoring one tells it to serve less. An ML ranking system then uses your watch history to decide how high each recommendation appears on your feed. You can read more about ranking here.

Returning to our cookie example, user A receives three recommendations for videos on baking cookies (sugar cookies, chocolate chip, snickerdoodles). User A watches the chocolate chip cookie video and ignores the other two. This tells the recommendation system to rank chocolate chip cookie videos higher, so more of those videos appear on user A’s feed.
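
The watch-or-ignore feedback in the cookie example could look something like the sketch below. The starting scores, the boost and penalty values, and the function names are hypothetical; a production ranking model would learn these signals from data rather than apply fixed increments.

```python
# Hypothetical starting scores: all three cookie topics rank equally.
scores = {"sugar_cookies": 1.0, "chocolate_chip": 1.0, "snickerdoodles": 1.0}

WATCH_BOOST = 0.5     # assumed reward for a watched recommendation
IGNORE_PENALTY = 0.1  # assumed penalty for an ignored recommendation


def update_scores(scores, recommended, watched):
    """Return new scores after one round of watch/ignore feedback."""
    updated = dict(scores)
    for topic in recommended:
        if topic in watched:
            updated[topic] += WATCH_BOOST
        else:
            updated[topic] -= IGNORE_PENALTY
    return updated


def rank(scores):
    """Order topics from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


recommended = ["sugar_cookies", "chocolate_chip", "snickerdoodles"]
scores = update_scores(scores, recommended, watched={"chocolate_chip"})
print(rank(scores))  # chocolate_chip is now ranked first
```

After a single round of feedback, the watched topic moves to the top of the ranking and the two ignored topics drop slightly.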

YouTube also claims to recommend the videos that offer the most value to viewers. Its website states that “not all watchtime is equal” and goes on to say, “we don’t want viewers regretting the videos they spend time watching and realized we needed to do more to measure how much value you get from your time on YouTube.”

YouTube’s first step toward recommending videos that users valued was to show a survey after a user finished a video. Surveys asked users to rate the video from 1 to 5 stars, and only videos rated 4 or 5 stars were deemed valuable. For users who didn’t fill out the survey, YouTube used machine learning to predict the results, based on collaborative filtering, watch history, and engagement.
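
As a rough sketch of the survey idea, the code below marks 4 and 5 star ratings as valuable and predicts a missing rating as a similarity-weighted average of other users’ ratings. The ratings, user names, and similarity measure are invented for illustration, and the sketch uses survey agreement only, whereas the article says YouTube’s model also draws on watch history and engagement.

```python
# Hypothetical survey responses: star ratings (1-5) per user and video.
ratings = {
    "user_a": {"video_1": 5, "video_2": 2},
    "user_b": {"video_1": 4},            # user_b never rated video_2
    "user_c": {"video_1": 1, "video_2": 5},
}


def is_valuable(stars):
    """Only ratings of 4 or 5 stars count as valuable."""
    return stars >= 4


def similarity(a, b):
    """Agreement between two users on videos both rated, from 0.0 to 1.0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[v] - b[v]) for v in shared) / len(shared)
    return 1.0 - mean_gap / 4.0  # 4 is the largest gap on a 1-5 scale


def predict_rating(user, video, ratings):
    """Similarity-weighted average of other users' ratings for the video.
    Returns None when no similar user has rated it."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, theirs in ratings.items():
        if other == user or video not in theirs:
            continue
        weight = similarity(ratings[user], theirs)
        weighted_sum += weight * theirs[video]
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None


predicted = predict_rating("user_b", "video_2", ratings)
print(predicted, is_valuable(predicted))  # 2.75 False
```

User B’s rating of video_1 is close to user A’s and far from user C’s, so user A’s low rating of video_2 dominates the prediction and the video is not counted as valuable for user B.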

As stated on its website, YouTube is aware of the danger that a recommendation system can spread misinformation. Using both humans and algorithms as moderators, YouTube claims to flag and stop recommending content that risks spreading misinformation, conspiracy theories, and harmful information.

Despite YouTube’s claims that it can flag harmful videos and keep them from being recommended, data suggests that the algorithm may need further training before it can completely stop this content from spreading through ML-based recommendations.

Recommendation algorithms are optimized towards relevance to a user. If the algorithm picks up on a trend towards greater watch time and engagement towards videos that are harmful, false, or misleading, it begins to show that content more often. Algorithmic bias is something that all social media companies using a similar AI/ML system need to be aware of.
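
The amplification effect described above can be illustrated with a toy simulation. Everything here is an assumption made for the example: the two content categories, their engagement rates, and the rule that each round’s feed share is proportional to the engagement earned in the previous round. It shows the shape of the feedback loop, not measured behavior of any real platform.

```python
def simulate_feedback_loop(engagement_rates, rounds):
    """Repeatedly reallocate feed share in proportion to engagement earned.

    engagement_rates maps a content category to an assumed, positive
    engagement rate. Returns the list of share dicts, one per round,
    starting with an equal split.
    """
    n = len(engagement_rates)
    share = {category: 1.0 / n for category in engagement_rates}
    history = [dict(share)]
    for _ in range(rounds):
        # Engagement earned is exposure times engagement rate.
        earned = {c: share[c] * engagement_rates[c] for c in share}
        total = sum(earned.values())
        # The next round's exposure follows the engagement just earned.
        share = {c: earned[c] / total for c in earned}
        history.append(dict(share))
    return history


# Hypothetical rates: divisive content is assumed to engage slightly more.
rates = {"measured": 0.10, "divisive": 0.15}
history = simulate_feedback_loop(rates, rounds=5)
for round_number, share in enumerate(history):
    print(round_number, round(share["divisive"], 3))
```

Even with a modest difference in engagement rates, the higher-engagement category takes a larger share of the feed every round, which is why monitoring matters when that category is harmful or misleading.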

Dr. Hany Farid, a UC Berkeley professor whose research has focused on misinformation, image analysis, and human perception, has discussed the role of algorithms in spreading misinformation on the internet. Farid said, “algorithmic amplification is the root cause of the unprecedented dissemination of hate speech, misinformation… and harmful content online. Platforms have learned that divisive content attracts the highest number of users.”

Creating ethical algorithms can often go against a company’s business interests, because misleading and false content can drive more engagement and time spent on the platform. This creates a vicious cycle: creators are incentivized to produce more of this content because it garners the most views and engagement. The more people are exposed to misleading and false information, the more credible it can appear. More information about ethical algorithms can be found here.

YouTube’s intended purpose in deploying a recommendation algorithm was likely to create a better experience for its users, allowing them to find the most relevant and entertaining content. The risk that algorithms spread false and misleading information shows the importance of algorithm monitoring and responsible AI.

In his research study, A Longitudinal Analysis Of YouTube’s Promotion Of Conspiracy Videos, Dr. Farid studied YouTube’s recommendation algorithm after the company claimed to have improved it to fight bias. The study took place over 15 months and examined thousands of YouTube channels and videos, manually identifying which recommendations on those videos were harmful.

The study found that after YouTube put its measures for more ethical recommendations in place, conspiratorial recommendations became 40% less common. The study concluded that “the overall reduction of conspiratorial recommendations is an encouraging trend. Nonetheless, this reduction does not make the problem of radicalization on YouTube obsolete nor fictional, as some have claimed.”

YouTube’s reforms to its recommendation algorithm, which reduce the amount of harmful content shown, are a meaningful step toward decreasing the harmful effects and narratives that these systems can produce. YouTube’s place as a primary source of information makes these measures essential to decreasing the spread of false and misleading content.

YouTube’s recommendation algorithm makes finding relevant videos in its extensive catalog much easier for users. Average watchtime on YouTube continues to rise, which suggests that its integration of ML and AI has been successful. YouTube’s recommendation system also shows the need for constant monitoring and training of these systems. To create your own ranking and recommendation models, get started here.

To reduce the chance of misinformation appearing on their YouTube feed, users can access controls that let YouTube know how much data they want to provide. Users can also erase their watch history, which prevents YouTube from generating recommendations based on it.

Start building for free

No need for a credit card to get started.
Trying out Mage to build ranking models won’t cost a cent.


© 2022 Mage Technologies, Inc.