Portfolio.

Poli NLP Classifier


About the project:


Team project (4 members). I led NLP modeling and the inference API, and shipped a full-stack demo. I handled part of the data cleaning; trained and fine-tuned an LSTM; built the React UI (logo, input flow, animations); implemented the Flask inference APIs (model loading and tokenization); and deployed the backend on Heroku and the frontend on Vercel, with cost-efficient external model storage on Google Drive.

<

Features

>
Open-ended Input

Users can submit any political social media post; predictions are not limited to the training dataset.

Bias Classification

Classifies posts as Neutral vs. Partisan using a fine-tuned BERT model.

Message Type Classification

Predicts one of: Attack, Constituency, Information, Media, Mobilization, Personal, Policy, Support.
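
As a rough illustration of how a classifier's raw scores could be turned into one of these eight labels, the sketch below applies a softmax and picks the highest-probability class. The label order and the `decode_message_type` helper are assumptions for illustration, not the project's actual code; a real model defines its own index-to-label mapping.

```python
import math

# Assumed label order; the real model's index-to-label mapping may differ.
MESSAGE_TYPES = [
    "Attack", "Constituency", "Information", "Media",
    "Mobilization", "Personal", "Policy", "Support",
]


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_message_type(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(MESSAGE_TYPES):
        raise ValueError("expected one score per message type")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return MESSAGE_TYPES[best], probs[best]
```

Returning the probability alongside the label lets a UI show how confident the prediction is rather than only the winning class.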

Robust Preprocessing

Removes URLs, emojis, HTML tags, and special characters via regex & NLP; includes tokenization and TF-IDF.
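
A minimal sketch of the regex cleaning step described above, using only the Python standard library. The patterns and the `clean_post` and `tokenize` names are illustrative assumptions; the project's actual rules (and its TF-IDF step) are not shown here.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
# Anything that is not a lowercase letter, digit, whitespace, or apostrophe.
# Applied after lowercasing, this also drops emojis and punctuation.
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")
SPACE_RE = re.compile(r"\s+")


def clean_post(text):
    """Strip HTML, URLs, emojis, and special characters from a post."""
    text = html.unescape(text)        # "&amp;" -> "&"
    text = TAG_RE.sub(" ", text)      # remove HTML tags
    text = URL_RE.sub(" ", text)      # remove links
    text = text.lower()
    text = NON_TEXT_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def tokenize(text):
    """Whitespace tokenization of the cleaned text."""
    return clean_post(text).split()
```

For example, `clean_post("<p>Vote NOW!</p> https://example.com &amp; join us")` yields `"vote now join us"`.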

Modeling Options

Trained and fine-tuned LSTM, GPT-2, and BERT; the production demo uses BERT for best performance.

Generalization

Trained on the 2015 CrowdFlower Political Social Media Posts dataset and generalizes to unseen inputs.

Full-stack Demo

React frontend integrated with Flask APIs for real-time inference and results visualization.
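
The request/response contract between the frontend and the inference API might look roughly like the sketch below. It is framework-agnostic on purpose: in a Flask view the payload would typically come from `request.get_json()`, and the returned `(body, status)` pair could be returned from the view directly. The field names (`text`, `bias`, `message_type`) and the `handle_predict` and `fake_classify` helpers are hypothetical.

```python
def handle_predict(payload, classify):
    """Validate a JSON-like payload and return (response_body, http_status)."""
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return {"error": "Field 'text' must be a non-empty string."}, 400
    bias, message_type = classify(text.strip())
    return {"bias": bias, "message_type": message_type}, 200


def fake_classify(text):
    # Stand-in for the real model call; always returns fixed labels.
    return "Neutral", "Information"
```

Passing the classifier in as an argument keeps validation testable without loading a model, which is convenient when the model weights live in external storage.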

</

Features

>
<

What I Built

>
  • Preprocessed textual data by removing extraneous characters, tokenizing, and generating sequences for model training
  • Developed and trained a bidirectional Long Short-Term Memory (Bi-LSTM) model with TensorFlow's Keras API, selecting on F1-score and saving the best model weights
  • Built a full-stack web application with React.js (frontend) and Flask (backend) to classify political media posts
  • Designed a responsive and interactive user interface using JavaScript, CSS, and Bootstrap, allowing users to input political posts and view classification results dynamically
  • Implemented API integration to send user input to the backend for processing and retrieve real-time classification results
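
The "generating sequences" step in the list above can be sketched in plain Python: build a word-to-id vocabulary, then map each tokenized post to a fixed-length list of ids. The reserved ids, the `build_vocab` and `encode` names, and the choice to pad at the end are assumptions for illustration (Keras's `pad_sequences`, for comparison, pads at the front by default).

```python
from collections import Counter

PAD_ID = 0  # reserved for padding
UNK_ID = 1  # reserved for out-of-vocabulary words


def build_vocab(token_lists, max_words=10000):
    """Map the most frequent words to integer ids starting at 2."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    return {
        word: idx + 2
        for idx, (word, _) in enumerate(counts.most_common(max_words))
    }


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))
```

Fixed-length id sequences are what an embedding layer followed by an LSTM expects, and reserving an unknown-word id is what lets the model accept posts containing words it never saw in training.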
</

What I Built

>
<

Tech Stack

/>
Google Drive API

The Google Drive API enables developers to create applications that interact with Google Drive's cloud storage. This API allows for programmatic access to manage files and folders within a user's Google Drive.

Figma

Figma is a cloud-based design and prototyping tool for creating user interfaces for digital products like websites and apps, emphasizing real-time collaboration for teams. Key features include design, prototyping, and design system management.

Python

Python is a high-level, versatile programming language known for its simplicity and readability, widely used in data science, AI, web development, and beyond.

Scikit-learn

Scikit-learn is a popular open-source Python library that provides simple and efficient tools for machine learning, including classification, regression, clustering, and model evaluation.

Google Colab

Google Colab is a free cloud-based platform that lets you write and run Python code in Jupyter notebooks, with built-in support for machine learning libraries and free GPU/TPU access.

Hugging Face

Hugging Face Transformers acts as the model-definition framework for state-of-the-art machine learning models in text, computer vision, audio, video, and multimodal settings, for both inference and training.

spaCy

spaCy is an open-source library for advanced Natural Language Processing (NLP) in Python.

React

React is a JavaScript library for building user interfaces with reusable components.

Flask

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications.

HTML

HTML (HyperText Markup Language) defines the structure of web pages.

CSS

CSS (Cascading Style Sheets) controls the presentation of HTML documents.

Vercel

Vercel is a cloud platform that provides the tools and infrastructure for developers to build, deploy, and scale modern web applications, focusing on speed, developer experience, and global distribution.

Heroku

Heroku is a cloud Platform as a Service (PaaS) that enables developers to build, run, and manage modern applications in the cloud without needing to manage the underlying infrastructure.

T