How does Spotify use Machine Learning (ML)?

First published on February 11, 2022

Last updated on April 22, 2022

 

John Patrick Hinek

Growth

TLDR

By using various modes of machine learning, Spotify revolutionized the way their subscribers discover and consume music. 

Outline: 

  • Intro

  • Collaborative filtering

  • Natural language processing

  • Reinforcement learning

  • Conclusion

Intro

Spotify houses a growing library of over 82 million songs. While music enthusiasts may see this as an opportunity for endless music discovery, casual listeners would likely be overwhelmed by the selection and stick to the catalogs they are familiar with. After seeing that subscribers who used Spotify for music discovery enjoyed the platform more, Spotify began deploying machine learning (ML) algorithms to recommend new titles to all their users.

Collaborative filtering

Collaborative filtering is a type of recommendation algorithm that predicts one user's preferences from data collected across many users. Across Spotify’s 406 million monthly active users, clusters of people with similar listening habits emerge, and those similarities can be used to make recommendations.

As of October 2021, Spotify had 82 million songs and 4 billion playlists. Spotify sees the billions of playlists that subscribers have created as a source of useful data for optimizing the user experience. Chief R&D Officer Gustav Söderström said on Lex Fridman’s AI podcast that Spotify uses playlists, or playlisting, as a programming language.

Spotify subscribers who playlisted more reported having a much better experience than those who didn’t. After acquiring the music discovery company Tunigo, Spotify began using human intelligence and statistical methods to create group playlists and manually adjust them for maximum performance. With the success of these group playlists, Spotify saw another opportunity: to use ML to move from group to individual playlist personalization.

Along with increasing user engagement, the growing number of personal playlists gave Spotify a huge dataset to work with. By grouping and labeling songs, users were creating playlists that carried semantic meaning. Spotify was able to use these playlists to recommend titles to subscribers with similar music taste.

Spotify’s algorithm looked for subscribers whose playlists and listening histories overlapped (for example, users A and B both adding "All Too Well" by Taylor Swift to a playlist). It would then serve songs from one user's history to the other to aid retention and new music discovery (user A adding a Bon Iver song to their playlist would increase the chances of user B being served a Bon Iver song).
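The matching described above can be sketched in a few lines of Python. This is a minimal illustration and not Spotify's actual algorithm: the function name `recommend`, the use of Jaccard similarity between playlists, and the sample data are all assumptions chosen for clarity.

```python
from collections import Counter


def recommend(playlists, user, top_n=3):
    """Suggest tracks for `user` taken from users with overlapping playlists.

    `playlists` maps a user id to the set of tracks that user has playlisted.
    Users are compared with Jaccard similarity (an assumed measure).
    """
    own = playlists[user]
    scores = Counter()
    for other, tracks in playlists.items():
        if other == user:
            continue
        overlap = len(own & tracks)
        if overlap == 0:
            continue  # no shared tracks, so no evidence of shared taste
        similarity = overlap / len(own | tracks)
        # Every track the other user has that this user lacks gets a vote,
        # weighted by how similar the two users are.
        for track in tracks - own:
            scores[track] += similarity
    return [track for track, _ in scores.most_common(top_n)]


# Hypothetical data mirroring the example in the text.
playlists = {
    "user_a": {"All Too Well", "Holocene"},  # "Holocene" is a Bon Iver song
    "user_b": {"All Too Well"},
    "user_c": {"Bohemian Rhapsody"},
}

print(recommend(playlists, "user_b"))
```

Here user B shares "All Too Well" with user A, so user A's Bon Iver track becomes a candidate for user B, while user C's playlist has no overlap and contributes nothing.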

Natural language processing

Natural language processing (NLP) is a branch of ML that gives computers the ability to understand text and speech. To categorize their music, Spotify applies NLP to text scraped from the web about a particular song.

Spotify’s NLP then categorizes songs based on the language used to describe them. Keywords are picked out and assigned a weight, which can measure how strongly a song exhibits a particular emotion. This helps Spotify’s algorithms identify which songs and artists belong in playlists together, and those groupings can then be fed into the recommendation system.
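One common way to assign weights to keywords is TF-IDF, which scores a term higher when it appears often in one description but rarely in the others. The article does not say which weighting scheme Spotify uses, so the sketch below is only an assumed stand-in; the function name `keyword_weights` and the sample descriptions are hypothetical.

```python
import math
import re
from collections import Counter


def keyword_weights(descriptions):
    """Return {song: {term: tf-idf weight}} for a dict of song descriptions."""
    tokenized = {
        song: re.findall(r"[a-z']+", text.lower())
        for song, text in descriptions.items()
    }
    n_docs = len(tokenized)

    # Document frequency: in how many descriptions does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    weights = {}
    for song, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[song] = {
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical web text about two songs.
descriptions = {
    "song_x": "a melancholy, melancholy ballad about heartbreak",
    "song_y": "an upbeat dance anthem about summer",
}

weights = keyword_weights(descriptions)
top_term = max(weights["song_x"], key=weights["song_x"].get)
print(top_term)
```

A word such as "about" appears in both descriptions and receives zero weight, while "melancholy" stands out as the strongest descriptor of the first song.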

Reinforcement learning

Reinforcement learning (RL) is a type of ML in which a system learns through interactive trial and error, adjusting its behavior based on the feedback it receives. Spotify uses RL to bring relevant and meaningful songs and artists to their subscribers' home pages.

New content is first served to subscribers using collaborative filtering or NLP. The subscriber then engages with the song at varying levels (listening once, playing it on repeat, listening to more songs by the artist) or disengages by skipping it. In either case, the user is sending information back to the algorithm about how successful its prediction was.
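The feedback loop above can be illustrated with an epsilon-greedy bandit, one of the simplest trial-and-error learners. It is a deliberate simplification of full RL and is not Spotify's system: the class name, the `REWARDS` mapping from listener actions to numbers, and the track names are all assumptions made for this sketch.

```python
import random

# Hypothetical mapping from listener behavior to a numeric reward.
REWARDS = {"skip": 0.0, "listen_once": 0.5, "repeat": 1.0}


class EpsilonGreedyRecommender:
    """Serve tracks, then learn from how the listener reacts."""

    def __init__(self, tracks, epsilon=0.1, seed=None):
        self.epsilon = epsilon            # share of picks spent exploring
        self.rng = random.Random(seed)
        self.plays = {track: 0 for track in tracks}
        self.value = {track: 0.0 for track in tracks}  # mean reward so far

    def pick(self):
        # Explore a random track occasionally; otherwise exploit the best one.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.plays))
        return max(self.value, key=self.value.get)

    def update(self, track, reward):
        # Incrementally update the running mean reward for this track.
        self.plays[track] += 1
        self.value[track] += (reward - self.value[track]) / self.plays[track]


recommender = EpsilonGreedyRecommender(["track_1", "track_2"], epsilon=0.1, seed=0)
recommender.update("track_1", REWARDS["skip"])
recommender.update("track_2", REWARDS["repeat"])
print(recommender.value)
```

Each skip or repeat listen nudges the estimated value of a track, so later picks lean toward what the listener has responded to while still leaving some room for exploration.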

RL is used to maximize a long-term reward. In Spotify’s case, this is user engagement and satisfaction. When explaining how RL is used to reach business goals, Spotify’s Vice President of Personalization Oskar Stål said that “rather than handing users the empty calories of a content diet that will only satisfy them in the moment, RL aims to push them to a more sustainable, diverse, and fulfilling content diet.”
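In RL, "long-term reward" is usually formalized as a discounted return: the sum of future rewards, with each step further into the future weighted by another factor of gamma. The sketch below shows that calculation only; the reward sequences and the discount factor of 0.9 are made-up values for illustration, not Spotify data.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward k steps ahead is scaled by gamma**k."""
    total = 0.0
    # Work backwards so each step folds in the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-session engagement rewards for two recommendation styles.
short_term = [1.0, 0.2, 0.0, 0.0]  # a strong first session, then the listener drifts away
long_term = [0.6, 0.6, 0.6, 0.6]   # steadier engagement across sessions

print(discounted_return(short_term))
print(discounted_return(long_term))
```

Under these assumed numbers, the steadier sequence earns the higher return even though its first session scores lower, which is the trade-off the quote describes.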

RL pushes users toward music discovery that departs from their typical tastes and listening history. Widening the range of music subscribers listen to increases how much of Spotify's catalog is consumed, benefiting both artists and the platform.

Conclusion 

Spotify’s ability to leverage machine learning has made it one of the most popular platforms for music discovery and consumption. The growing number of curated playlists and active subscribers suggests that Spotify will continue to be a leader in music consumption for years to come.
