Hate Speech Detection in E2E Communication

A hybrid Flutter application integrating client-side BERT for real-time hate speech detection with RSA/AES encryption, achieving 84% accuracy on Roman Urdu datasets.

NLP, BERT, Flutter, TFLite

Problem Statement

Traditional content moderation relies on centralized data processing: the service must read message content, which often comes at the cost of user privacy. The challenge is to detect hate speech without compromising the confidentiality that end-to-end encryption provides.

Implementation Details

Model & Dataset

The core of the system is MobileBERT, which was fine-tuned on the Roman Urdu Hate Speech Dataset from Kaggle to detect offensive content in Roman Urdu script.

Optimization & Security Pipeline

To let the model run efficiently and securely on mobile devices, we applied a two-step pipeline of quantization followed by encryption:

  • Quantization (INT8): The fine-tuned model was quantized to INT8 to reduce its size.
    • Fine-tuned model (before quantization): 95.2 MB
    • After quantization: 25.6 MB
  • Encryption: To protect intellectual property and prevent reverse engineering, the model file was encrypted before deployment.
    • Encrypted Payload: 34.1 MB
    • Tokenizer Overhead: ~1 MB (922 KB)
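
The idea behind INT8 quantization can be illustrated with a minimal, standard-library-only sketch. This is not the project's actual conversion code (the source does not show it, and a real MobileBERT model would normally go through a converter toolchain). It only demonstrates the affine mapping from floats to 8-bit integers. The function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple

QMIN, QMAX = -128, 127  # range of a signed 8-bit integer


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Affine (asymmetric) quantization of floats to signed 8-bit integers.

    Returns the quantized values plus the scale and zero point needed
    to map them back to approximate floats.
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # Every value is zero; any scale works, so pick 1.0.
        return [0 for _ in values], 1.0, 0

    scale = (hi - lo) / (QMAX - QMIN)
    zero_point = int(round(QMIN - lo / scale))
    zero_point = max(QMIN, min(QMAX, zero_point))

    quantized = [
        max(QMIN, min(QMAX, int(round(v / scale)) + zero_point))
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize_int8(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map 8-bit integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most about half of `scale` per value. For the sizes reported above, 95.2 MB to 25.6 MB is roughly a 3.7x reduction (95.2 / 25.6 ≈ 3.72). That is a little under the 4x upper bound that would apply if the original weights were 32-bit floats, which is an assumption since the source does not state the original precision.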

System Architecture

The solution is designed around a three-tier architecture to ensure secure, real-time communication:

  • Client-Side Intelligence: The optimized MobileBERT (TFLite) runs entirely on-device, classifying content locally before any data leaves the phone.
  • Security Layer: Implements RSA for secure key exchange and AES for message confidentiality.
  • Backend Infrastructure: Utilizes Firebase Authentication and Realtime Database for managing encrypted payloads.
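
The ordering implied by these tiers (classify locally, then encrypt, then hand off to the backend) can be sketched as follows. This is an illustrative control-flow sketch, not the app's actual code: the app itself is written in Flutter, and the names `prepare_message`, `OutgoingMessage`, and `HATE_THRESHOLD` are hypothetical. The classifier and the encryption routine are passed in as callables, so no real model or cipher is implied. The source does not say what happens to a flagged message, so this sketch assumes it is simply not sent.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical decision threshold; the source does not state one.
HATE_THRESHOLD = 0.5


@dataclass
class OutgoingMessage:
    """Encrypted payload that is safe to hand to the backend."""
    ciphertext: bytes


def prepare_message(
    text: str,
    classify: Callable[[str], float],
    encrypt: Callable[[bytes], bytes],
    threshold: float = HATE_THRESHOLD,
) -> Optional[OutgoingMessage]:
    """Classify on-device first, then encrypt only messages that pass.

    `classify` stands in for the on-device model and returns a hate speech
    score in [0, 1]. `encrypt` stands in for the AES step, whose key would
    be exchanged via RSA. Both are injected so this sketch stays
    independent of any specific model or crypto library.
    """
    score = classify(text)  # runs locally on the plaintext
    if score >= threshold:
        # Assumption: a flagged message is not sent at all.
        return None
    # Only ciphertext is produced for the backend to store and relay.
    return OutgoingMessage(ciphertext=encrypt(text.encode("utf-8")))
```

Because classification happens before encryption on the sender's device, the backend only ever receives ciphertext, which is what lets moderation coexist with end-to-end encryption in this design.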

Tech Stack

MobileBERT, TensorFlow Lite, Flutter, RSA & AES Encryption, Firebase Realtime Database, Python