End-to-End Emotion Detection: Data Processing, Modeling & Real-Time Deployment

🎭 Building a Robust Real-Time Emotion Detection System Using Ensemble Learning

🔗 GitHub Repository: https://github.com/KToppo/Emotion-Detection-ML

Human emotion recognition has emerged as a powerful tool in modern AI applications—ranging from digital well-being solutions to marketing analytics and interactive systems. In this project, I built a Real-Time Emotion Detection System that uses a camera feed or an image URL to classify a person’s facial expression into one of several emotion categories.

The complete project — including code, models, pipelines, and demo — is available on GitHub:
👉 https://github.com/KToppo/Emotion-Detection-ML

This blog documents the entire journey — from data preprocessing to final deployment — and highlights the experiments, improvements, and insights gained along the way.

📂 Project Structure

Here is the complete directory structure:

├── models/
│   ├── labels_1.pkl
│   ├── labels_2.pkl
│   ├── labels_3.pkl
│   ├── M1SMOTE_boost.png
│   ├── M1SMOTE_Clf.png
│   ├── M2ENN_boost.png
│   ├── M2ENN_Clf.png
│   ├── M3SMOTE_clf-NE.png
│   ├── model_1.pkl
│   ├── model_2.pkl
│   ├── model_3.pkl
│   ├── model-boost_1.pkl
│   ├── model-boost_2.pkl
│   ├── pipline_1.pkl
│   ├── pipline_2.pkl
│   ├── pipline_3.pkl
├── Model_Building.ipynb
├── Model_Testing.ipynb
├── image-to-vector.py
├── kaggle_handler.py
├── web-app.py
├── haarcascade_frontalface_default.xml

Purpose of Each File

| File / Folder | Description |
| --- | --- |
| models/ | Stores all trained models, label encoders, pipelines, and model performance images. |
| image-to-vector.py | Converts raw images into grayscale 48×48 face vectors and builds data.csv. |
| kaggle_handler.py | Downloads datasets using kagglehub and organizes them into folders. |
| Model_Building.ipynb | Main notebook containing preprocessing, sampling, model training, and saving models. |
| Model_Testing.ipynb | Tests predictions from all models and evaluates ensemble performance. |
| web-app.py | Streamlit application for webcam-based and URL-based emotion prediction. |
| haarcascade_frontalface_default.xml | Pretrained OpenCV Haar cascade for face detection. |

🧱 Step 1: Dataset Preparation

Face Extraction & Vectorization

The file image-to-vector.py handles:

✔ Image loading
✔ Face detection using Haar Cascade
✔ Cropping & resizing to 48×48
✔ Flattening into 2304-pixel vectors
✔ Writing batches into data.csv

Even large datasets are handled efficiently using batch processing.
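The flatten-and-append step can be sketched as below. This is a minimal illustration, not the repo's actual script: it assumes the Haar-cascade detection and cropping have already produced 48×48 grayscale arrays, and the function names are hypothetical.

```python
import csv
import numpy as np

def faces_to_rows(faces, labels):
    """Flatten 48x48 grayscale crops into 2304-pixel rows, label appended last."""
    rows = []
    for face, label in zip(faces, labels):
        assert face.shape == (48, 48), "expected a cropped 48x48 face"
        rows.append(list(face.flatten()) + [label])
    return rows

def write_batch(rows, out_path="data.csv"):
    """Append one batch of rows so large datasets never sit fully in memory."""
    with open(out_path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
```

Processing the dataset in batches like this keeps memory usage flat regardless of how many images are converted.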


⚙️ Step 2: Model Building & Experiments

All experiments were executed inside Model_Building.ipynb.

📌 Preprocessing Flow

  1. Remove duplicates

  2. Split features (X) and labels (y)

  3. Scale images using MinMaxScaler

  4. Reduce dimensionality with PCA (400 components)

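Steps 3 and 4 of the flow correspond to the saved `pipline_*.pkl` objects. A minimal sketch of such a pipeline, using stand-in random data in place of the real face vectors:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

# Scale pixel values into [0, 1], then project the 2304 pixels onto 400 components.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("pca", PCA(n_components=400)),
])

# Stand-in for the de-duplicated 2304-pixel face vectors from data.csv.
X = np.random.rand(500, 2304)
X_reduced = pipeline.fit_transform(X)
print(X_reduced.shape)  # (500, 400)
```

Keeping scaling and PCA inside one `Pipeline` object means the exact same transform is replayed at prediction time, which matters later when the pipelines are pickled alongside the models.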

🧪 Step 3: Sampling Techniques & Their Impact

A major challenge was imbalanced emotion data. To handle this, several experiments were conducted.

🔹 Experiment 1: SMOTE + class_weight='balanced'

Models trained:

  • XGBoost

  • Stacking Classifier

Output classification reports saved as:

  • M1SMOTE_boost.png

  • M1SMOTE_Clf.png

Observation:
SMOTE increased minority-class recall, but the synthetic samples also introduced noise that hurt precision.


🔹 Experiment 2: SMOTEENN + class_weight='balanced'

SMOTEENN combines SMOTE oversampling with data cleaning via Edited Nearest Neighbours (ENN).

Reports stored as:

  • M2ENN_boost.png

  • M2ENN_Clf.png

Observation:
Better than pure SMOTE, but class weights occasionally over-penalized majority classes.


🔹 Experiment 3: SMOTEENN + No Class Weight

This configuration was tested only for StackingClassifier.

Report saved as:

  • M3SMOTE_clf-NE.png

Observation:
This model showed the best balance between precision and recall across all emotions.


🏆 Step 4: Final Decision — Ensemble Voting

After analyzing all classification results, instead of selecting a single “best” model, I combined five models:

✔ model_1 (Stacking)

✔ model_2 (Stacking)

✔ model_3 (Stacking - Best version)

✔ model_boost_1 (XGBoost)

✔ model_boost_2 (XGBoost)

Each model includes its own pipeline (scaling + PCA) and label encoder.
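These artifacts are the `.pkl` files in the `models/` folder. A minimal sketch of how such artifacts can be saved and reloaded with `pickle` (the helper names are hypothetical; the repo may use joblib or inline calls instead):

```python
import pickle

def save_artifact(obj, path):
    """Serialise a fitted model, pipeline, or label encoder to a .pkl file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)

def load_artifact(path):
    """Load a previously pickled artifact back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)

# e.g. save_artifact(pipeline, "models/pipline_1.pkl")
```

Pickling the pipeline and the label encoder next to each model guarantees that prediction-time preprocessing and label decoding exactly match what was used during training.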

I implemented a majority voting mechanism:

import pandas as pd

def combinepredic(img, models=models):
    """Majority-vote prediction across all five models.
    `models` maps each fitted model to its (pipeline, label_encoder) pair;
    `img` is a single (1, 2304) face vector."""
    pred = []
    for model, pipline in models.items():
        X = pipline[0].transform(img)                           # apply scaling + PCA
        emotion = model.predict(X)                              # encoded class id
        pred.append(pipline[1].inverse_transform(emotion)[0])   # decode to label
    return pd.Series(pred).mode()[0]                            # most frequent label wins

📌 Result: Improved f1-score & recall across nearly all emotion classes.
This final classification summary is stored in:

  • final_model.png


📸 Step 5: Real-Time Emotion Detection Web App 

The full working application, including Streamlit code, can be explored here:
🔗 GitHub: https://github.com/KToppo/Emotion-Detection-ML 

The web-app.py uses Streamlit + WebRTC for:

1. Webcam Emotion Detection

  • Captures frames

  • Detects face

  • Runs prediction using ensemble voting

  • Updates every 2 seconds

  • Displays emotion overlay on video

2. Image URL Emotion Detection

  • Load image from URL

  • Detect face → preprocess → predict

  • Output final predicted emotion
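The webcam mode's "update every 2 seconds" behaviour can be sketched as a small time-based throttle around the ensemble call. This is an illustrative class with hypothetical names, not the actual `web-app.py` code, which wires the equivalent logic into the streamlit-webrtc frame callback:

```python
import time

class PredictionThrottle:
    """Run the expensive ensemble prediction at most once every `interval`
    seconds, reusing the last label for the frames in between."""

    def __init__(self, interval=2.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock               # injectable clock, handy for testing
        self.last_time = -float("inf")   # force a prediction on the first frame
        self.last_label = None

    def maybe_predict(self, frame, predict):
        now = self.clock()
        if now - self.last_time >= self.interval:
            self.last_label = predict(frame)   # e.g. the ensemble voting function
            self.last_time = now
        return self.last_label
```

Throttling this way keeps the video overlay responsive: every frame is drawn, but the five-model vote only runs a couple of times per second at most.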


🧠 Key Learnings & Improvements

1️⃣ Importance of Data Balancing Techniques

SMOTE introduced synthetic noise, while SMOTEENN cleaned incorrectly generated samples.
Final learning: SMOTEENN (no class weights) gave the most stable performance.


2️⃣ PCA dim-reduction is essential

Without PCA, models suffered from:

  • High training time

  • Overfitting

  • Poor generalization

Reducing to 400 PCA components preserved >95% variance.
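The retained-variance figure can be read off a fitted PCA's `explained_variance_ratio_`. A sketch with stand-in random data follows; note that random noise will not reach 95%, whereas the correlated pixel data in this project did:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(500, 2304)   # stand-in for the scaled face vectors
pca = PCA(n_components=400).fit(X)

cumulative = np.cumsum(pca.explained_variance_ratio_)
print(f"variance kept by 400 components: {cumulative[-1]:.2%}")
```

Plotting `cumulative` against the component index is also a quick way to pick the smallest component count that clears a chosen variance threshold.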


3️⃣ Ensemble > Individual Models

No single model performed best on all metrics.
Using a voting ensemble of 5 independent models gave the most reliable outcome.


4️⃣ Designing for Deployment

To ensure smooth deployment:

  • Pipelines were saved with models

  • Label encoders saved separately

  • Streamlit caching improved performance

  • WebRTC allowed real-time video inference


🚀 How to Run This Project

1. Install dependencies

pip install -r requirements.txt

2. Ensure your models are in /models folder

3. Run the Streamlit app

streamlit run web-app.py

4. Use either:

  • Webcam Mode

  • Image URL Mode


📌 Final Thoughts

This project helped me understand:

✔ How sampling techniques influence model fairness
✔ How PCA & pipelines help maintain reproducibility
✔ How ensemble learning significantly boosts robustness
✔ How to integrate ML models into a real-time application
✔ Complete ML lifecycle — from dataset creation to deployment

If you're looking to build accurate emotion recognition systems, the most effective approach combines clean preprocessing, imbalance handling, and multiple models rather than relying on a single classifier.
