Skip to content

athul0rameshan9/email-classification-api

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

title Email Classifier
emoji 📚
colorFrom gray
colorTo purple
sdk docker
pinned false
short_description Classifier for email using BERT

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Email Classifier

This project is an email classification application built using a BERT-based model. It classifies emails into predefined categories, enabling efficient email management and automation.


Approach

The email classifier leverages a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned for text classification tasks. BERT is a state-of-the-art natural language processing model that understands the context of words in a sentence, making it highly effective for tasks like email classification.

Key Features of the Approach:

  1. Pre-trained BERT Model: Utilizes a pre-trained BERT model from Hugging Face for transfer learning.
  2. Fine-tuning: The model is fine-tuned on a labeled dataset of emails to adapt it to the specific classification task.
  3. Tokenization: Emails are tokenized using the BERT tokenizer to convert text into input IDs and attention masks.
  4. Multi-class Classification: The model predicts the probability of each email belonging to one of the predefined categories.

Methodology

  1. Data Preprocessing:

    • Emails are cleaned to remove unnecessary characters, HTML tags, and stopwords.
    • Tokenization is performed using the BERT tokenizer to prepare the input for the model.
  2. Model Training:

    • A pre-trained BERT model is fine-tuned on the labeled dataset.
    • The training process optimizes the model's weights using a cross-entropy loss function.
  3. Evaluation:

    • The model is evaluated on a validation dataset using metrics like accuracy, precision, recall, and F1-score.
    • Hyperparameters such as learning rate and batch size are tuned for optimal performance.
  4. Deployment:

    • The trained model is integrated into a FastAPI application for serving predictions.
    • Docker is used to containerize the application for easy deployment.

Implementation Details

Project Structure

  • app.py: The main FastAPI application that serves the email classification API.
  • bert-email-classifier/: Contains the BERT model configuration, tokenizer, and vocabulary files.
  • Dockerfile: Configuration for building the Docker image.
  • requirements.txt: List of Python dependencies required for the project.
  • data/: Directory for storing training and validation datasets (not included in the repository).

Running the Application

Prerequisites

  • Docker installed on your system
  • Python 3.9 or later (if running locally without Docker)
  • pip for managing Python dependencies

Using Docker

  1. Build the Docker image:
    docker build -t email-classifier .

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors