End-to-End Emotion Detection: Data Processing, Modeling & Real-Time Deployment
🎭 Building a Robust Real-Time Emotion Detection System Using Ensemble Learning
Human emotion recognition has emerged as a powerful tool in modern AI applications—ranging from digital well-being solutions to marketing analytics and interactive systems. In this project, I built a Real-Time Emotion Detection System that uses a camera feed or an image URL to classify a person’s facial expression into one of several emotion categories.
The complete project — including code, models, pipelines, and demo — is available on GitHub:
👉 https://github.com/KToppo/Emotion-Detection-ML
📂 Project Structure
Here is the complete directory structure:
├── models/
│ ├── labels_1.pkl
│ ├── labels_2.pkl
│ ├── labels_3.pkl
│ ├── M1SMOTE_boost.png
│ ├── M1SMOTE_Clf.png
│ ├── M2ENN_boost.png
│ ├── M2ENN_Clf.png
│ ├── M3SMOTE_clf-NE.png
│ ├── model_1.pkl
│ ├── model_2.pkl
│ ├── model_3.pkl
│ ├── model-boost_1.pkl
│ ├── model-boost_2.pkl
│ ├── pipline_1.pkl
│ ├── pipline_2.pkl
│ ├── pipline_3.pkl
├── Model_Building.ipynb
├── Model_Testing.ipynb
├── image-to-vector.py
├── kaggle_handler.py
├── web-app.py
├── haarcascade_frontalface_default.xml
Purpose of Each File
| File / Folder | Description |
|---|---|
| models/ | Stores all trained models, label encoders, pipelines, and model performance images. |
| image-to-vector.py | Converts raw images into grayscale 48×48 face vectors and builds data.csv. |
| kaggle_handler.py | Downloads datasets using kagglehub and organizes them into folders. |
| Model_Building.ipynb | Main notebook containing preprocessing, sampling, model training, and saving models. |
| Model_Testing.ipynb | Used to test predictions from all models and evaluate ensemble performance. |
| web-app.py | Streamlit application for webcam-based and URL-based emotion prediction. |
| haarcascade_frontalface_default.xml | Pretrained OpenCV cascade model for face detection. |
🧱 Step 1: Dataset Preparation
Face Extraction & Vectorization
The file image-to-vector.py handles:
✔ Image loading
✔ Face detection using Haar Cascade
✔ Cropping & resizing to 48×48
✔ Flattening into 2304-pixel vectors
✔ Writing batches into data.csv
Even large datasets are handled efficiently using batch processing.
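For a sense of the core routine, here is a hedged sketch (the function name and detector parameters below are illustrative, not the repo's exact code):

```python
import cv2

# Pretrained Haar cascade shipped with the repo
cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

def image_to_vector(path):
    """Return a flattened 48x48 grayscale face vector, or None if no face is found."""
    img = cv2.imread(path)
    if img is None:
        return None
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                           # take the first detected face
    face = cv2.resize(gray[y:y+h, x:x+w], (48, 48))
    return face.flatten()                           # 48 * 48 = 2304 pixel values
```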
⚙️ Step 2: Model Building & Experiments
All experiments were executed inside Model_Building.ipynb.
📌 Preprocessing Flow
Remove duplicates
Split features (X) and labels (y)
Scale images using MinMaxScaler
Reduce dimensionality with PCA (400 components)
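A minimal sketch of this flow with scikit-learn (the label column name `emotion` is an assumption; the repo's data.csv may name it differently):

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

df = pd.read_csv("data.csv").drop_duplicates()      # remove duplicate rows
X, y = df.drop(columns=["emotion"]), df["emotion"]  # label column name assumed

preprocess = Pipeline([
    ("scale", MinMaxScaler()),       # map raw pixel values into [0, 1]
    ("pca", PCA(n_components=400)),  # 2304 -> 400 dimensions
])
X_reduced = preprocess.fit_transform(X)
```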
🧪 Step 3: Sampling Techniques & Their Impact
A major challenge was imbalanced emotion data. To handle this, several experiments were conducted.
🔹 Experiment 1: SMOTE + class_weight='balanced'
Models trained:
XGBoost
Stacking Classifier
Output classification reports saved as:
M1SMOTE_boost.png
M1SMOTE_Clf.png
Observation:
SMOTE increased minority-class recall but also introduced synthetic noise, which hurt precision.
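For reference, a hedged sketch of this setup with imbalanced-learn (the base estimators and hyperparameters are placeholders, not the repo's exact choices):

```python
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

# Oversample minority emotions with synthetic, neighbor-interpolated samples
X_res, y_res = SMOTE(random_state=42).fit_resample(X_reduced, y)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(class_weight="balanced"))],
    final_estimator=LogisticRegression(class_weight="balanced", max_iter=1000),
)
stack.fit(X_res, y_res)
```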
🔹 Experiment 2: SMOTEENN + class_weight='balanced'
SMOTEENN combines oversampling + cleaning using ENN.
Reports stored as:
M2ENN_boost.png
M2ENN_Clf.png
Observation:
Better than pure SMOTE, but class weights occasionally over-penalized majority classes.
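The resampling swap itself is a one-liner with imbalanced-learn:

```python
from imblearn.combine import SMOTEENN

# SMOTE oversampling followed by Edited Nearest Neighbours cleaning,
# which drops samples that disagree with their local neighborhood
X_res, y_res = SMOTEENN(random_state=42).fit_resample(X_reduced, y)
```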
🔹 Experiment 3: SMOTEENN + No Class Weight
This configuration was tested only with the StackingClassifier.
Report saved as:
M3SMOTE_clf-NE.png
Observation:
This model showed the best balance between precision and recall across all emotions.
🏆 Step 4: Final Decision — Ensemble Voting
After analyzing all classification results, instead of selecting a single “best” model, I combined five models:
✔ model_1 (Stacking)
✔ model_2 (Stacking)
✔ model_3 (Stacking - Best version)
✔ model_boost_1 (XGBoost)
✔ model_boost_2 (XGBoost)
Each model includes its own pipeline (scaling + PCA) and label encoder.
I implemented a majority voting mechanism:
```python
import pandas as pd

def combinepredic(img, models=models):
    """Majority vote across the five models; img is a (1, 2304) pixel row."""
    pred = []
    for model, pipline in models.items():
        X = pipline[0].transform(img)              # apply scaling + PCA pipeline
        emotion = model.predict(X)                 # encoded class index
        pred.append(pipline[1].inverse_transform(emotion)[0])  # decode label
    return pd.Series(pred).mode()[0]               # most frequent prediction wins
```
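For context, the `models` dict this function expects could be assembled from the files in models/ along these lines (which pipeline and encoder each boosted model shares is my assumption, not taken from the repo):

```python
import pickle

def load(name):
    with open(f"models/{name}.pkl", "rb") as f:
        return pickle.load(f)

def load_models():
    # Each entry maps a model to its (pipeline, label_encoder) pair
    return {
        load("model_1"): (load("pipline_1"), load("labels_1")),
        load("model_2"): (load("pipline_2"), load("labels_2")),
        load("model_3"): (load("pipline_3"), load("labels_3")),
        # pipeline/encoder pairing for the boosted models assumed here
        load("model-boost_1"): (load("pipline_1"), load("labels_1")),
        load("model-boost_2"): (load("pipline_2"), load("labels_2")),
    }

models = load_models()
```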
📌 Result: Improved F1-score and recall across nearly all emotion classes.
This final classification summary is stored in:
final_model.png
📸 Step 5: Real-Time Emotion Detection Web App
The full working application, including Streamlit code, can be explored here:
🔗 GitHub: https://github.com/KToppo/Emotion-Detection-ML
The web-app.py uses Streamlit + WebRTC for:
1. Webcam Emotion Detection
Captures frames
Detects face
Runs prediction using ensemble voting
Updates every 2 seconds
Displays emotion overlay on video
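A minimal sketch of the webcam path, assuming a recent streamlit-webrtc and PyAV; `cascade` and `combinepredic` come from the snippets above, and the repo's 2-second update throttling is omitted here:

```python
import av
import cv2
from streamlit_webrtc import webrtc_streamer

def video_frame_callback(frame: av.VideoFrame) -> av.VideoFrame:
    img = frame.to_ndarray(format="bgr24")          # current webcam frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        face = cv2.resize(gray[y:y+h, x:x+w], (48, 48)).flatten().reshape(1, -1)
        emotion = combinepredic(face)               # ensemble majority vote
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(img, str(emotion), (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    return av.VideoFrame.from_ndarray(img, format="bgr24")

webrtc_streamer(key="emotion", video_frame_callback=video_frame_callback)
```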
2. Image URL Emotion Detection
Load image from URL
Detect face → preprocess → predict
Output final predicted emotion
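And a sketch of the URL path (`predict_from_url` is illustrative; requests plus the earlier `cascade` and `combinepredic` are assumed):

```python
import cv2
import numpy as np
import requests

def predict_from_url(url):
    data = np.frombuffer(requests.get(url, timeout=10).content, dtype=np.uint8)
    img = cv2.imdecode(data, cv2.IMREAD_GRAYSCALE)  # decode bytes to grayscale
    faces = cascade.detectMultiScale(img, 1.1, 5)
    if len(faces) == 0:
        return "No face detected"
    x, y, w, h = faces[0]
    face = cv2.resize(img[y:y+h, x:x+w], (48, 48)).flatten().reshape(1, -1)
    return combinepredic(face)
```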
🧠 Key Learnings & Improvements
1️⃣ Importance of Data Balancing Techniques
SMOTE introduced synthetic noise, while SMOTEENN cleaned incorrectly generated samples.
Final learning: SMOTEENN (no class weights) gave the most stable performance.
2️⃣ PCA dimensionality reduction is essential
Without PCA, the models suffered from:
High training time
Overfitting
Poor generalization
Reducing to 400 PCA components preserved >95% variance.
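This is straightforward to check on a fitted pipeline like the `preprocess` object sketched in Step 2:

```python
# Sum of per-component explained variance ratios for the 400 kept components
kept = preprocess.named_steps["pca"].explained_variance_ratio_.sum()
print(f"Variance retained: {kept:.2%}")
```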
3️⃣ Ensemble > Individual Models
No single model performed best on all metrics.
Using a voting ensemble of 5 independent models gave the most reliable outcome.
4️⃣ Designing for Deployment
To ensure smooth deployment:
Pipelines were saved with models
Label encoders saved separately
Streamlit caching improved performance
WebRTC allowed real-time video inference
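On the caching point: wrapping the pickle loading in `st.cache_resource` makes Streamlit load the models once per process instead of on every rerun (reusing the hypothetical `load_models` from Step 4):

```python
import streamlit as st

@st.cache_resource  # executed once per process; reruns reuse the cached result
def load_ensemble():
    return load_models()  # the pickle-loading helper sketched in Step 4

models = load_ensemble()
```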
🚀 How to Run This Project
1. Install dependencies
pip install -r requirements.txt
2. Ensure the trained models are in the models/ folder
3. Run the Streamlit app
streamlit run web-app.py
4. Use either:
Webcam Mode
Image URL Mode
📌 Final Thoughts
This project helped me understand:
✔ How sampling techniques influence model fairness
✔ How PCA & pipelines help maintain reproducibility
✔ How ensemble learning significantly boosts robustness
✔ How to integrate ML models into a real-time application
✔ Complete ML lifecycle — from dataset creation to deployment
If you're looking to build accurate emotion recognition systems, combining clean preprocessing, class-imbalance handling, and multiple models is more effective than choosing a single classifier.
