📚 Building a Book Recommendation System with Streamlit & Collaborative Filtering

- September 11, 2025

Recommendation systems power many of the apps we use every day, from Netflix suggesting movies to Amazon recommending products.

As a book lover and data science enthusiast, I wanted to create a system that helps readers find new books based on what they like.
In this post, I’ll walk you through how I built a Book Recommendation System using Python, Streamlit, and Collaborative Filtering.

📌 Project Overview

The goal was to create a simple yet effective web app that:

* Shows the Top 50 most popular books, ranked by average rating among books with enough ratings to be meaningful.

* Recommends similar books to a user-selected title using collaborative filtering.

* Provides a clean, interactive user interface built with Streamlit.

📁 Dataset

I used the Book Recommendation Dataset from Kaggle, which includes:

* Books.csv – Book details like title, author, and image URL
* Users.csv – User demographic data
* Ratings.csv – Book ratings given by users

This dataset provided a solid foundation for both popularity-based and collaborative filtering approaches.

🧠 Approach

1️⃣ Popularity-Based Recommendation

The first step was to identify the top books based on ratings.

* Counted the number of ratings for each book
* Calculated the average rating
* Kept only books with at least 250 ratings, so a handful of high scores can't push an obscure title to the top
* Sorted them in descending order of average rating

This resulted in a list of Top 50 books that users can explore directly in the app.
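
The steps above can be sketched in pandas roughly as follows. This is a minimal illustration rather than the notebook's exact code: the `Book-Rating` column name, the tiny in-memory sample, and the `top_popular` helper are assumptions made for the example, and the threshold is lowered so the toy data passes the filter.

```python
import pandas as pd

# Tiny stand-in for the ratings merged with book titles.
# The "Book-Rating" column name is an assumption for this sketch.
ratings = pd.DataFrame({
    "User-ID": [1, 2, 3, 1, 2, 3, 1, 2],
    "Book-Title": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "Book-Rating": [8, 9, 10, 5, 6, 7, 10, 10],
})

def top_popular(ratings, min_ratings=250, top_n=50):
    """Hypothetical helper: rank books by average rating, ignoring rarely rated ones."""
    stats = (
        ratings.groupby("Book-Title")["Book-Rating"]
        .agg(num_ratings="count", avg_rating="mean")  # count and mean per book
        .reset_index()
    )
    popular = stats[stats["num_ratings"] >= min_ratings]  # drop thinly rated books
    return popular.sort_values("avg_rating", ascending=False).head(top_n)

# Lower threshold here only because the sample is tiny.
top = top_popular(ratings, min_ratings=3, top_n=2)
print(top)
```

In this sample, book "C" has the highest average but only two ratings, so the minimum-count filter removes it before ranking.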

2️⃣ Collaborative Filtering

To recommend books similar to a chosen title, I used item-based Collaborative Filtering:

* Created a pivot table of `Book-Title` × `User-ID` with ratings as values
* Filled missing ratings with 0
* Calculated cosine similarity between books
* For a given book, retrieved the top 5 most similar titles

This method works because two books score as similar when the same users rated them in similar ways, so the suggestions reflect what readers of the selected book also rated.
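
Here is a small sketch of that pipeline, assuming a `Book-Rating` column and a hypothetical `recommend` helper; the sample ratings are made up for illustration and the real notebook may differ in its details.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Made-up ratings; "Book-Rating" is an assumed column name.
ratings = pd.DataFrame({
    "User-ID":     [1, 1, 2, 2, 3, 3, 4],
    "Book-Title":  ["A", "B", "A", "B", "C", "D", "C"],
    "Book-Rating": [9, 8, 7, 9, 10, 8, 6],
})

# Rows are books, columns are users; unrated cells become 0.
pt = ratings.pivot_table(
    index="Book-Title", columns="User-ID", values="Book-Rating"
).fillna(0)

# similarity[i, j] is the cosine similarity between book i and book j.
similarity = cosine_similarity(pt)

def recommend(title, top_n=5):
    """Hypothetical helper: return up to top_n titles most similar to `title`."""
    idx = pt.index.get_loc(title)
    order = np.argsort(similarity[idx])[::-1]  # most similar first
    return [pt.index[i] for i in order if i != idx][:top_n]  # skip the book itself

print(recommend("A", top_n=1))  # ['B']
```

Books "A" and "B" were rated by the same two users, so they come out as each other's nearest neighbour, while "C" and "D" share no raters with them and score 0.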

🧑‍💻 Implementation

The project is split into three main files:

* book-recommendation-system.ipynb – Data preprocessing, model building, and `.pkl` file generation
* app.py – Streamlit application that loads precomputed data and displays recommendations
* kaggle_handler.py – Helper script to download datasets directly from Kaggle
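
The hand-off between the notebook and the app could look something like this. It is only a sketch: the helper names, the `popular` artifact name, and the directory layout are hypothetical, not the project's actual file names.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd

def save_artifacts(out_dir, artifacts):
    """Hypothetical notebook-side helper: write each object to <out_dir>/<name>.pkl."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, obj in artifacts.items():
        with open(out_dir / f"{name}.pkl", "wb") as f:
            pickle.dump(obj, f)

def load_artifact(out_dir, name):
    """Hypothetical app-side helper: read one precomputed object back."""
    with open(Path(out_dir) / f"{name}.pkl", "rb") as f:
        return pickle.load(f)

# Round trip through a temporary directory to show both sides.
popular = pd.DataFrame({"Book-Title": ["A", "B"], "avg_rating": [9.0, 6.0]})
with tempfile.TemporaryDirectory() as tmp:
    save_artifacts(tmp, {"popular": popular})
    restored = load_artifact(tmp, "popular")

print(restored.equals(popular))
```

Precomputing in the notebook keeps the app fast, since it only reads files at startup instead of recomputing the similarity matrix. Only unpickle files you created yourself, because loading a pickle can execute arbitrary code.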

Once the data is processed, running the Streamlit app provides users with two main options:

1. View Top 50 Books
2. Select a book and get recommendations for similar titles

🖼️ User Interface

The UI is intentionally simple:

* Top 50 View – Displays book title, author, rating, and cover image in a clean grid format
* Recommendation View – Lets users pick a book from a dropdown and instantly see similar suggestions with cover images

This keeps the experience interactive and user-friendly.

🧰 Tech Stack

* Python 3
* Pandas & NumPy – Data manipulation
* scikit-learn – Cosine similarity
* Streamlit – Web UI
* Pickle – Saving and loading preprocessed data

🧪 Try It Yourself

You can check out the full code and run the app locally:
GitHub Repository

📌 Key Learnings

Working on this project helped me:

* Practice data cleaning and preprocessing with real-world datasets
* Understand collaborative filtering and similarity-based recommendations
* Learn how to deploy machine learning models with Streamlit
* Improve my ability to present data in a clean, user-focused interface

🧩 Final Thoughts

Building this project was a great way to combine data science, machine learning, and web development. The result is a practical app that book lovers can use to discover their next favorite read.

If you have suggestions for new features (like hybrid recommendation or user-based filtering), I’d love to hear them!
