📚 Building a Book Recommendation System with Streamlit & Collaborative Filtering
Recommendation systems power many of the apps we use every day, from Netflix suggesting movies to Amazon recommending products.
As a book lover and data science enthusiast, I wanted to create a system that helps readers find new books based on what they like.
In this post, I’ll walk you through how I built a Book Recommendation System using Python, Streamlit, and Collaborative Filtering.
 
📌 Project Overview
The goal was to create a simple yet effective web app that:
* Shows the Top 50 most popular books, ranked by average rating among books with enough ratings to be reliable.
* Recommends similar books to a user-selected title using collaborative filtering. 
* Provides a clean, interactive user interface built with Streamlit.
 
📁 Dataset
I used the Book Recommendation Dataset from Kaggle, which includes:
* Books.csv – Book details like title, author, and image URL
* Users.csv – User demographic data
* Ratings.csv – Book ratings given by users
This dataset provided a solid foundation for both popularity-based and collaborative filtering approaches.
🧠 Approach
1️⃣ Popularity-Based Recommendation
The first step was to identify the top books based on ratings.
* Counted the number of ratings for each book
* Calculated the average rating
* Filtered books with at least 250 votes
* Sorted them in descending order of average rating
This resulted in a list of Top 50 books that users can explore directly in the app.
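The four steps above can be sketched in pandas as follows. This is a minimal illustration, not the notebook's actual code: it assumes the ratings have already been joined with book titles, the `Book-Rating` column name is an assumption, the function name `top_popular` is hypothetical, and the toy data uses a threshold of 2 votes instead of 250 so the result is easy to check by hand.

```python
import pandas as pd


def top_popular(ratings: pd.DataFrame, min_votes: int = 250, top_n: int = 50) -> pd.DataFrame:
    """Rank books by average rating, keeping only books with enough votes.

    Assumes `ratings` has one row per (user, book) rating with the columns
    "Book-Title" and "Book-Rating" (the latter name is an assumption).
    """
    # Count the ratings and compute the average rating for each title
    stats = (
        ratings.groupby("Book-Title")["Book-Rating"]
        .agg(num_ratings="count", avg_rating="mean")
        .reset_index()
    )
    # Drop rarely rated books, then sort by average rating (highest first)
    return (
        stats[stats["num_ratings"] >= min_votes]
        .sort_values("avg_rating", ascending=False)
        .head(top_n)
        .reset_index(drop=True)
    )


# Toy data: "C" has the highest average but only one vote
toy_ratings = pd.DataFrame(
    {
        "Book-Title": ["A", "A", "A", "B", "B", "C"],
        "Book-Rating": [8, 9, 10, 6, 8, 10],
    }
)
popular = top_popular(toy_ratings, min_votes=2, top_n=50)
print(popular)
```

In the toy data, book "C" is filtered out despite its perfect average, which is the point of the vote threshold: a single enthusiastic rating should not outrank a consistently well-rated book.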
2️⃣ Collaborative Filtering
To personalize recommendations, I used Collaborative Filtering:
* Created a pivot table of `Book-Title` × `User-ID` with ratings as values
* Filled missing ratings with 0
* Calculated cosine similarity between books
* For a given book, retrieved the top 5 most similar titles
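Because similarity is computed between books rather than between users, this is item-based collaborative filtering: two books count as similar when the same users gave them similar ratings, so the app suggests books that readers of the selected title tended to rate the same way.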
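Here is a minimal sketch of that pipeline on toy data, assuming a ratings table with `Book-Title`, `User-ID`, and a `Book-Rating` column (the last name is an assumption). The helper names `build_similarity` and `recommend` are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity


def build_similarity(ratings: pd.DataFrame):
    """Build the book x user matrix and the book-to-book cosine similarity."""
    # Rows are books, columns are users; books a user never rated become 0
    pivot = ratings.pivot_table(
        index="Book-Title", columns="User-ID", values="Book-Rating"
    ).fillna(0)
    # similarity[i, j] compares the rating vectors of book i and book j
    similarity = cosine_similarity(pivot)
    return pivot, similarity


def recommend(title: str, pivot: pd.DataFrame, similarity: np.ndarray, top_n: int = 5):
    """Return the `top_n` titles most similar to `title`, excluding itself."""
    idx = pivot.index.get_loc(title)
    # Sort book positions from most to least similar
    order = np.argsort(similarity[idx])[::-1]
    return [pivot.index[i] for i in order if i != idx][:top_n]


# Toy data: users 1 and 2 rated both "A" and "B"; only user 3 rated "C"
toy_ratings = pd.DataFrame(
    {
        "User-ID": [1, 2, 1, 2, 3],
        "Book-Title": ["A", "A", "B", "B", "C"],
        "Book-Rating": [10, 8, 9, 7, 10],
    }
)
pivot, similarity = build_similarity(toy_ratings)
print(recommend("A", pivot, similarity, top_n=1))
```

One caveat with this approach: filling missing ratings with 0 treats "not rated" the same as the lowest possible score. That is a common simplification, but it is worth keeping in mind when reading the similarity scores.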
🧑‍💻 Implementation
The project is split into three main files:
* book-recommendation-system.ipynb – Data preprocessing, model building, and `.pkl` file generation
* app.py – Streamlit application that loads precomputed data and displays recommendations
* kaggle_handler.py – Helper script to download datasets directly from Kaggle
Once the data is processed, running the Streamlit app provides users with two main options:
1. View Top 50 Books
2. Select a book and get personalized recommendations
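The hand-off between the notebook and the app can be sketched with the standard `pickle` module. This is an illustration under assumptions: the file name `popular.pkl` and the helper names are hypothetical, and the example writes to a temporary directory so it can be run anywhere.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd


def save_artifact(obj, path):
    """Serialize a precomputed object (e.g. a DataFrame) to a .pkl file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_artifact(path):
    """Load a previously saved object; the app would call this at startup."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the table the notebook would compute
popular = pd.DataFrame({"Book-Title": ["A", "B"], "avg_rating": [9.0, 7.0]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "popular.pkl"  # hypothetical file name
    save_artifact(popular, path)      # notebook side
    restored = load_artifact(path)    # app side

print(restored.equals(popular))
```

Precomputing and pickling keeps the app fast, since the similarity matrix does not have to be rebuilt on every page load. Pickle files should only be loaded from sources you trust, because unpickling can execute arbitrary code.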
🖼️ User Interface
The UI is intentionally simple:
* Top 50 View – Displays book title, author, rating, and cover image in a clean grid format
* Recommendation View – Lets users pick a book from a dropdown and instantly see similar suggestions with cover images
This keeps the experience interactive and user-friendly.
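A grid like the Top 50 view can be sketched by splitting the book list into rows and rendering each row with `st.columns`. This is a rough sketch rather than the app's actual layout code: the helper names, the dictionary keys (`title`, `author`, `rating`, `image_url`), and the choice of five columns are all assumptions.

```python
def chunk(items, size):
    """Split a list into consecutive rows of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def render_grid(books, n_cols=5):
    """Render books as a grid of cards; each book is a dict (keys are assumed)."""
    # Imported here so `chunk` can be used and tested without Streamlit installed
    import streamlit as st

    for row in chunk(books, n_cols):
        # One Streamlit column per slot in the row
        cols = st.columns(n_cols)
        for col, book in zip(cols, row):
            with col:
                st.image(book["image_url"])
                st.text(book["title"])
                st.caption(f'{book["author"]} · rating {book["rating"]:.2f}')


# Seven items in rows of three leave a shorter final row
rows = chunk([1, 2, 3, 4, 5, 6, 7], 3)
print(rows)
```

Inside a Streamlit script, calling `render_grid(books)` would draw the cards; for the recommendation view, a dropdown such as `st.selectbox` could supply the selected title.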
🧰 Tech Stack
* Python 3
* Pandas & NumPy – Data manipulation
* scikit-learn – Cosine similarity
* Streamlit – Web UI
* Pickle – Saving and loading preprocessed data
🧪 Try It Yourself
You can check out the full code and run the app locally:
GitHub Repository
📌 Key Learnings
Working on this project helped me:
* Practice data cleaning and preprocessing with real-world datasets
* Understand collaborative filtering and similarity-based recommendations
* Learn how to deploy machine learning models with Streamlit
* Improve my ability to present data in a clean, user-focused interface
🧩 Final Thoughts
Building this project was a great way to combine data science, machine learning, and web development. The result is a practical app that book lovers can use to discover their next favorite read.
If you have suggestions for new features (like hybrid recommendation or user-based filtering), I’d love to hear them!

 
 