You're building a Netflix clone :material-netflix:. You have a dataset of movie reviews, where each review is a `(user_id, movie_id, rating)` triplet.

Here, `movie_id`s are integers in the range `[0, Nmovies)`, `user_id`s are integers in the range `[0, Nusers)`, and `rating`s are integers in the range `[1, 5]`.

- Build a compressed sparse matrix where element `(i, j)` gives user `i`'s rating of movie `j`.
- Normalize the movie vectors (column vectors) so that each of them has unit length.
- Calculate the Euclidean distance between normalized movie 2 and normalized movie 4.
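
The three steps above can be sketched as follows. This is a minimal illustration, not the reference solution: it assumes SciPy and NumPy are available, and the toy review data and all variable names (`user_ids`, `movie_ids`, `ratings`, `reviews`, `normalized`) are hypothetical stand-ins for the real dataset.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy data (not the real dataset): 4 users, 5 movies.
n_users, n_movies = 4, 5
user_ids = np.array([0, 0, 1, 2, 2, 3])
movie_ids = np.array([2, 4, 2, 1, 4, 0])
ratings = np.array([5, 3, 4, 2, 1, 5])

# 1. Build the review matrix: rows are users, columns are movies.
#    CSC (compressed sparse column) suits column-wise work.
#    Note: duplicate (user, movie) pairs would be summed by this constructor.
reviews = sparse.csc_matrix(
    (ratings.astype(float), (user_ids, movie_ids)),
    shape=(n_users, n_movies),
)

# 2. Scale each movie (column) vector to unit length.
col_norms = np.sqrt(np.asarray(reviews.multiply(reviews).sum(axis=0)).ravel())
# Movies with no reviews have norm 0; leave those columns as all zeros.
scale = np.divide(1.0, col_norms, out=np.zeros_like(col_norms), where=col_norms > 0)
normalized = (reviews @ sparse.diags(scale)).tocsc()

# 3. Euclidean distance between normalized movie 2 and normalized movie 4.
diff = normalized[:, 2] - normalized[:, 4]
distance = float(np.sqrt(diff.multiply(diff).sum()))
print(round(distance, 2))
```

Multiplying by a diagonal matrix of reciprocal norms keeps everything sparse; converting to a dense array first would also work for small data but defeats the purpose of the compressed format.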
For example, if our Netflix clone had three users and two movies with a review matrix like this
The normalized movie vectors would be
The Euclidean distance between these two normalized movie vectors is 1.41.
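
As a sanity check on results like this one, note that for any two unit-length vectors $a$ and $b$ the squared distance depends only on their dot product:

$$
\lVert a - b \rVert^2 = \lVert a \rVert^2 + \lVert b \rVert^2 - 2\, a \cdot b = 2 - 2\, a \cdot b
$$

Because ratings are non-negative, $0 \le a \cdot b \le 1$, so the distance between two normalized movie vectors always lies in $[0, \sqrt{2}]$. The value 1.41 is consistent with $\sqrt{2} \approx 1.414$, which is what you would get when $a \cdot b = 0$, that is, when no user has reviewed both movies.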