How does TikTok use machine learning (ML)?

First published on February 16, 2022

Last updated on February 24, 2022

 

5 minute read

John Patrick Hinek

Growth

TLDR

TikTok uses machine learning (ML) algorithms to curate its For You feed. The feed is personalized to every user's unique interests and interactions with TikTok's content.

Outline 

  • Intro

  • Categorizing content 

  • Recommendation system 

  • Conclusion

Intro

TikTok is known among its users for a hyper-personalized, addictive algorithm. TikTok’s algorithm-based homepage, the “For You” feed, differs from other social media platforms in that it serves users content based on their specific input and engagement rather than the traditional likes, comments, and follows. Aided by machine learning (ML) algorithms, TikTok was the most downloaded app in 2021 (Forbes, 2021).

Categorizing Content

At nearly every stage of its content strategy, TikTok deploys ML algorithms to get quick, insightful data. The first step in TikTok’s recommendation strategy is analyzing each video using three sources of information: computer vision, natural language processing (NLP), and metadata.

Computer vision is a deep learning (a subset of machine learning) process that uses neural networks to interpret the contents of a photo or video. The computer vision model is trained on a dataset of millions of labeled images, which allows it to recognize new images based on specific traits and characteristics. This enables the algorithm to see and understand the content of the videos being created.

TikTok uses computer vision to analyze facial features, products, and other traits of people and objects to quickly understand a video’s content. It classifies the individual features of the video to improve categorization.

NLP is then used to transcribe and describe the audio content of a video. The audio is first extracted from the video and then analyzed, typically with classification or clustering models. Classification is a form of supervised ML: a model is trained on pre-labeled data and then used to label new data. Clustering is a form of unsupervised ML that finds patterns in the data and groups similar items together. Once the data is extracted and grouped, the system can determine who that content is most useful for.
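The difference between classification and clustering can be sketched in a few lines of Python. This is only an illustration, not TikTok's implementation: the two-number feature vectors, the `music` and `talk` labels, and the function names are all hypothetical, and a nearest-centroid classifier and a basic k-means loop stand in for whatever models are actually used.

```python
from math import dist

# Hypothetical 2-D feature vectors for audio clips (made-up values).
labeled = {
    "music": [(0.9, 0.1), (0.8, 0.2)],
    "talk": [(0.1, 0.9), (0.2, 0.8)],
}

def centroid(points):
    """Mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def classify(x, labeled):
    """Supervised: assign x to the label whose training centroid is nearest."""
    centroids = {label: centroid(pts) for label, pts in labeled.items()}
    return min(centroids, key=lambda label: dist(x, centroids[label]))

def kmeans(points, k, iterations=10):
    """Unsupervised: group points into k clusters without using any labels."""
    centers = list(points[:k])  # naive deterministic initialisation
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

# Classification needs labeled examples to predict a label for a new clip.
predicted_label = classify((0.85, 0.15), labeled)

# Clustering only sees raw points and discovers the groups itself.
unlabeled = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
clusters = kmeans(unlabeled, k=2)
```

The classifier returns a named label because it was trained on named examples, while the clustering step returns anonymous groups that still have to be interpreted afterwards.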

The final step in categorizing a TikTok video is the metadata that the user provides when posting: the caption, hashtags, and so on. This information is supplied primarily by the users themselves.
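Even user-supplied metadata usually needs light parsing before it can serve as input to a model. As a minimal sketch, assuming a caption is a plain string with `#`-prefixed hashtags, the hypothetical helper `extract_metadata` below separates the hashtags from the rest of the text. The function name and output format are illustrative only.

```python
import re

def extract_metadata(caption):
    """Split a caption into lowercase hashtags and the remaining plain text."""
    hashtags = [tag.lower() for tag in re.findall(r"#(\w+)", caption)]
    text = re.sub(r"#\w+", "", caption)
    text = " ".join(text.split())  # collapse leftover whitespace
    return {"text": text, "hashtags": hashtags}

meta = extract_metadata("Easy pasta night #Cooking #pasta")
```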

Recommendation system 

TikTok’s recommendation algorithm is showcased on the For You feed, TikTok’s flagship feature. According to TikTok, the For You feed is “a stream of videos curated to your interests, making it easy to find content and creators you love… powered by a recommendation system that delivers content to each user that is likely to be of interest to that particular user.” While the most popular videos appear across many feeds, every For You feed is unique and tailored to the individual user.

Video classification and categorization is just one form of data TikTok uses to predict a successful user interaction. TikTok gathers much of its data from user interactions on the app. Its short-form video format gives the company the opportunity to analyze the watch time and rewatch rate of each video. According to the New York Times, “it [TikTok] has chosen to optimize for two closely related metrics in the stream of videos it serves: ‘retention’ — that is, whether a user comes back — and ‘time spent.’ The app wants to keep you there as long as possible.”

When users open TikTok, they are presented with a few videos across varying topics. Depending on how the user interacts with each video (rewatches, likes, shares, or skips), a new stream of videos is curated. From this initial engagement, TikTok’s algorithm can apply content-based filtering to show the user more relevant videos. Content-based filtering looks for similarities between new videos and videos that the user has already engaged with, then serves new content that resembles what the user has previously interacted with.

Once enough data has been generated about a user, another layer of recommendation, collaborative filtering, is applied to the For You feed. Collaborative filtering, which is also used in applications like Netflix and Spotify, feeds users videos based on the behavior of similar users. As an overview of how this works: if user A engages with videos 1, 2, 3, 4, and 5, and user B engages with videos 2, 3, 4, 5, and 6, TikTok’s algorithm is likely to pick up on the similarity between the two users and serve video 1 to user B and video 6 to user A.
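The user A and user B example can be reproduced with a small user-based collaborative filter. The sketch below measures overlap with Jaccard similarity and suggests videos that sufficiently similar users engaged with. The similarity measure, the 0.5 threshold, the extra user C, and the function names are all assumptions chosen for illustration. Production systems typically use far richer models.

```python
# Engagement history per user: the set of video IDs each user engaged with.
engagements = {
    "A": {1, 2, 3, 4, 5},
    "B": {2, 3, 4, 5, 6},
    "C": {7, 8},  # hypothetical user with no overlap
}

def jaccard(a, b):
    """Share of videos two users have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_from_similar_users(user, engagements, min_similarity=0.5):
    """Suggest videos that similar users engaged with but `user` has not seen."""
    seen = engagements[user]
    suggestions = set()
    for other, videos in engagements.items():
        if other != user and jaccard(seen, videos) >= min_similarity:
            suggestions |= videos - seen
    return suggestions

for_a = recommend_from_similar_users("A", engagements)  # {6}
for_b = recommend_from_similar_users("B", engagements)  # {1}
```

Users A and B share four of six distinct videos, so each receives the one video only the other has engaged with, while user C's history has no influence on either.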

Users are continually served content selected by the content-based and collaborative filtering algorithms. Still, TikTok video recommendations don’t exist in a vacuum. The algorithm takes new trends and current events into account to feed its users new content. Users are also often served seemingly random content that doesn’t match their watch history or that of their closely related peers. The hope is that the user will engage with this content so that the cycle can repeat itself.
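One common way to mix unrelated content into a personalized stream is an epsilon-greedy rule: most slots go to the top personalized picks, and a small fraction go to a randomly chosen video. The source does not say which technique TikTok uses, so the sketch below is only an assumption-laden illustration. `build_feed`, `explore_rate`, and the video IDs are hypothetical.

```python
import random

def build_feed(personalized, exploratory, explore_rate=0.2, length=10, rng=None):
    """Fill a feed mostly from ranked personalized picks, with random detours."""
    rng = rng or random.Random()
    personalized = list(personalized)  # copy so the caller's list is untouched
    feed = []
    for _ in range(length):
        explore = rng.random() < explore_rate
        if exploratory and (explore or not personalized):
            # Exploration slot: pick any video from the wider pool.
            feed.append(rng.choice(exploratory))
        elif personalized:
            # Exploitation slot: take the next best personalized video.
            feed.append(personalized.pop(0))
    return feed

feed = build_feed(
    personalized=["p1", "p2", "p3", "p4", "p5"],
    exploratory=["x1", "x2", "x3"],
    explore_rate=0.2,
    length=5,
    rng=random.Random(42),  # seeded only to make the example repeatable
)
```

Any engagement with an exploratory video would then feed back into the user's history, which is what lets the filtering steps pick up new interests.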

Conclusion

TikTok’s powerful and addictive algorithms have helped it reach 1 billion users and, according to Forbes, made it the world’s most popular web domain in 2021. Its strategic use of ML has no doubt played a role in that success. TikTok’s use of ML demonstrates the power that good data and a great algorithm can have in connecting users to the content they want to consume.

Start building for free

No need for a credit card to get started.
Trying out Mage to build ranking models won’t cost a cent.
