My Projects

Here you can find some of my Machine Learning and Deep Learning projects. The scope of these projects encompasses various aspects of machine learning such as predictive modeling, computer vision, and natural language processing. Each project is a complete end-to-end solution, incorporating modeling, deployment, and MLOps.

MLOps Project: Drug Discovery - Binary Classification

This ML project predicts the permeability of compounds in the PAMPA assay from their SMILES strings, framed as a binary classification task on the ADMET_PAMPA_NCATS dataset. Eighteen algorithms, including traditional ML models and GNN-based frameworks such as PyG and DeepPurpose, were evaluated using standard metrics. An accurate model was developed to guide drug discovery and development, and an end-to-end ML architecture was built around it, covering model training, testing, and operationalization, along with endpoint monitoring and the supporting infrastructure. Technologies used include Python, shell scripting, AWS (including an S3 bucket), Prometheus, Grafana, FastAPI, and Jenkins for CI/CD.
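The project description does not name the "standard metrics" it used. As an illustration only, the sketch below assumes they include accuracy, precision, recall, and F1, and computes them for 0/1 permeability labels with the standard library. The names `BinaryMetrics` and `binary_metrics` are hypothetical and are not taken from the project code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels.

    Assumption: 1 marks the positive class (for example, a permeable compound).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)
```

Calling `binary_metrics(y_true, y_pred)` on the held-out predictions of each candidate model would give one comparable record per algorithm, which is one plausible way to rank the eighteen candidates.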

MLOps Project: Red Wine Quality Classification + AWS SageMaker Integration

This project aims to classify the quality of red wine using physicochemical and sensory variables. Each wine in the dataset has a quality score between 0 and 10, which is used to label it as poor, normal, or excellent. An end-to-end MLOps pipeline will be deployed using AWS S3 for storage, MLflow for experiment tracking, and Terraform for infrastructure deployment. The pipeline will include data preprocessing, feature engineering, model training, and deployment.
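As a rough illustration of turning a 0 to 10 quality score into the poor, normal, and excellent classes, here is a minimal sketch. The cut-offs (`poor_max=4`, `excellent_min=7`) and the function name `quality_label` are assumptions made for the example; the project may use different thresholds.

```python
def quality_label(score, poor_max=4, excellent_min=7):
    """Map a 0-10 wine quality score to a class label.

    The thresholds are hypothetical defaults, not values taken from the project.
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= poor_max:
        return "poor"
    if score >= excellent_min:
        return "excellent"
    return "normal"
```

Keeping the thresholds as parameters makes it easy to log them alongside each experiment run, so that a change in class boundaries is visible when comparing models.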

CNN Project: 101 foods classification using DVC

The project is an end-to-end multiclass classification problem that categorizes images of food into 101 classes using a VGG-16 architecture (CNN). It is implemented as a TensorFlow pipeline with DVC (Data Version Control). The code follows standard Object-Oriented Programming (OOP) principles, ensuring consistency and maintainability.

DVC NLP Project: An end-to-end NLP binary sentiment classification + CI/CD/CT

This project is an LSTM-based text classification system that uses the IMDB dataset of 50,000 movie reviews. The dataset is suited to binary sentiment classification and contains substantially more data than previous benchmark datasets, with 25,000 reviews provided for training and 25,000 for testing. The project's primary goal is to classify each review as positive or negative using a deep learning model. The model is served through a Flask endpoint running under Gunicorn, and the system has been developed using a microservice architecture, making it an end-to-end project.
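An LSTM expects fixed-length sequences of integer token ids rather than raw text. The sketch below shows one common way to prepare reviews for such a model using only the standard library. The helper names (`tokenize`, `build_vocab`, `encode`), the reserved ids, and the default sizes are assumptions for illustration; the project itself may rely on TensorFlow or Keras utilities instead.

```python
import re
from collections import Counter

PAD_ID = 0  # reserved id used to pad short reviews
UNK_ID = 1  # reserved id for words missing from the vocabulary


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts, max_size=10000):
    """Map the most frequent tokens to ids starting at 2."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    most_common = counts.most_common(max_size - 2)
    return {tok: idx for idx, (tok, _) in enumerate(most_common, start=2)}


def encode(text, vocab, max_len=200):
    """Convert a review to a fixed-length list of token ids."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))
```

The vocabulary should be built from the training reviews only, so that words seen only in the test split are encoded as `UNK_ID`.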

CNN Project: Cats vs Dogs classification using DVC

The project is an end-to-end binary classification problem using Convolutional Neural Networks (CNN) to recognize images of cats and dogs. It is built as a TensorFlow Pipeline and employs DVC for efficient version control. The project is implemented using standard Object-Oriented Programming (OOP) principles to ensure code reusability and maintainability. Overall, the project aims to provide an effective solution to the problem of classifying images of cats and dogs using state-of-the-art deep learning techniques.

Binary Classification Project: ADME (PAMPA_NCATS) dataset + ML + GNNs

This tutorial illustrates how to predict the outcome of the PAMPA permeability assay from the SMILES string of a compound, using supervised graph neural networks, classical tree models, and transformer models. Eighteen algorithms, including traditional ML models and GNN-based frameworks such as PyG and DeepPurpose, were evaluated using standard metrics.

DVC NLP Project: Binary Classification using Microservices Architecture for StackOverflow Tag Prediction with DVC Integration

The project is a natural language processing (NLP) binary classification problem: predicting tags for a given StackOverflow question. For example, we want a classifier that can recognize a post about the R language and tag it R. The project uses DVC for data versioning, and it is built on a microservices architecture, making it an end-to-end project. The dataset can be downloaded from this link.
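Framing tag prediction as a binary problem means reducing each question's tag list to a single 0/1 target for the tag of interest. A minimal sketch follows, assuming each post is a dictionary with a `"tags"` list; that schema and the function names are hypothetical and may not match the actual dataset layout.

```python
def has_tag(tags, target="r"):
    """Return 1 if the target tag is present (case-insensitive), else 0."""
    normalized = {tag.strip().lower() for tag in tags}
    return int(target.lower() in normalized)


def make_binary_labels(posts, target="r"):
    """Build the 0/1 target vector for one tag across all posts."""
    return [has_tag(post["tags"], target) for post in posts]
```

For example, `make_binary_labels(posts, target="r")` yields the labels for an "is this post about R?" classifier, and the same function with a different `target` would support a separate classifier per tag.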

MLflow NLP Project: Binary Classification using Microservices Architecture for StackOverflow Tag Prediction

The project is a natural language processing (NLP) binary classification problem: predicting tags for a given StackOverflow question. For example, we want a classifier that can recognize a post about the R language and tag it R. The project uses MLflow for experiment tracking, and it is built on a microservices architecture, making it an end-to-end project. The dataset can be downloaded from this link.

Regression Project: Boston house prices prediction

This project aims to develop a machine learning model for predicting Boston house prices and deploy it as a web application. The model is implemented using standard Object-Oriented Programming (OOP) principles to ensure code reusability and maintainability. The application is built using Flask and Docker, with GitHub and GitHub Actions used for version control, testing, and continuous integration. The application is deployed on Heroku, providing easy access to the model for end-users. Overall, this is an end-to-end project that covers developing, testing, and deploying a simple machine learning model as a web application.