Projects
Brain Tumor MRI Classification
Developed a deep learning pipeline for multi-class brain MRI classification using ANN models and VGG16 transfer learning. Compared baseline neural networks with fine-tuned CNN architectures, applied image preprocessing and augmentation, and evaluated models using macro-F1 score and confusion matrices.
Tools: Python, PyTorch, Torchvision, NumPy, Pandas, Matplotlib
Methods: Transfer Learning, CNNs, ANN Modeling, Fine-Tuning, Data Augmentation, Confusion Matrix Evaluation
This project applies deep learning and transfer learning techniques to classify brain MRI scans into multiple tumor and non-tumor categories using a medical imaging dataset.
The analysis includes image preprocessing, augmentation, neural network modeling, transfer learning with VGG16, and performance evaluation using macro-F1 score, confusion matrices, classification reports, and validation metrics for multi-class medical image classification.
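To illustrate the evaluation step, macro-F1 can be derived directly from a confusion matrix by computing F1 per class and averaging with equal weight, so rare classes count as much as common ones. The following is a minimal sketch in plain Python, not the project's actual evaluation code; the helper names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def macro_f1(cm):
    """Unweighted mean of per-class F1 scores."""
    n = len(cm)
    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n


# Made-up labels for three classes (illustrative only, not project data)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
score = macro_f1(cm)
```

Because every class contributes equally to the average, a model that ignores a minority class is penalized even when overall accuracy looks high.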
Credit Card Fraud Detection with Machine Learning
Developed machine learning models to identify fraudulent credit card transactions using supervised classification techniques. Applied preprocessing, class imbalance handling, feature analysis, and statistical evaluation methods to compare model performance using precision, recall, F1-score, and confusion matrices.
Tools: Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
Methods: EDA, Feature Engineering, Supervised Classification, Class Imbalance Handling, Precision-Recall Evaluation
This project applies machine learning techniques to detect fraudulent credit card transactions using a highly imbalanced financial dataset.
The analysis includes data preprocessing, exploratory data analysis, feature scaling, supervised classification modeling, and performance evaluation using industry-relevant fraud detection metrics such as precision, recall, F1-score, ROC-AUC, and Area Under the Precision-Recall Curve (AUCPR).
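The metrics named above can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the project's code: the function names, labels, and scores are hypothetical, and the average-precision helper is one common step-wise estimate of the area under the precision-recall curve that does not handle tied scores.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (fraud = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_precision(y_true, scores):
    """Mean of precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Made-up labels and model scores (illustrative only)
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
y_pred = [1 if s >= 0.65 else 0 for s in scores]  # hypothetical threshold
metrics = precision_recall_f1(y_true, y_pred)
ap = average_precision(y_true, scores)
```

On highly imbalanced data, precision-recall measures tend to be more informative than accuracy, since a classifier that never flags fraud can still be almost always "correct".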
Airline Delay Performance Analysis
Analyzed airline operational performance using exploratory data analysis, feature engineering, and statistical hypothesis testing to identify patterns in flight delays, cancellations, and operational trends across multiple airline datasets.
Tools: Python, Pandas, NumPy, Matplotlib, Seaborn, SciPy
Methods: EDA, T-Test, ANOVA, Pearson Correlation, Feature Engineering
This project applies statistical analysis and data science techniques to identify patterns in airline operational performance using a large commercial flight delay dataset.
The analysis includes data preprocessing, exploratory data analysis, feature engineering, statistical hypothesis testing, and business performance evaluation using Python-based analytical workflows to examine delays, cancellations, and operational trends.
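As a small illustration of the hypothesis-testing step, the sketch below computes a two-sample t statistic for mean delay between two carriers. It uses Welch's unequal-variance variant, which is an assumption here since the project lists only "T-Test"; the function name and delay values are hypothetical. For a p-value, a call such as `scipy.stats.ttest_ind(a, b, equal_var=False)` would likely be used in practice.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 denominator)
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Made-up delay minutes for two hypothetical carriers
carrier_a = [10, 12, 14]
carrier_b = [20, 22, 24]
t_stat, dof = welch_t(carrier_a, carrier_b)
```

A large absolute t statistic relative to the degrees of freedom suggests the difference in mean delay is unlikely to be due to sampling noise alone.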
Energy Intelligence RAG Model
Developed a Retrieval-Augmented Generation (RAG) pipeline for energy intelligence analysis using large language models, semantic retrieval, and vector-based document search. Compared baseline language generation with tuned retrieval architectures, incorporating prompt engineering, chunk evaluation, and analysis of grounded response generation.
Tools: Python, LangChain, OpenAI API, ChromaDB, Pandas, NumPy
Methods: RAG Pipelines, Semantic Retrieval, Prompt Engineering, Vector Search, Retrieval Tuning, Grounding Evaluation
This project applies Retrieval-Augmented Generation (RAG) techniques to analyze energy intelligence reports using large language models and vector-based semantic retrieval.
The analysis includes document preprocessing, chunking, embedding generation, prompt engineering, retrieval tuning, and performance evaluation using grounded response analysis, retrieval validation, comparative scoring, and hallucination reduction techniques for domain-specific question answering.
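The retrieval step at the core of such a pipeline can be sketched as ranking document chunks by cosine similarity to the query embedding and keeping the top k. This is a conceptual plain-Python sketch, not the project's implementation, which presumably delegates this to ChromaDB; the tiny two-dimensional vectors stand in for real embeddings, and the function names and chunk identifiers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Return ids of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Toy 2-D "embeddings" (illustrative only; real embeddings have many dimensions)
chunks = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.0, 1.0],
    "chunk_c": [1.0, 1.0],
}
retrieved = top_k([1.0, 0.1], chunks, k=2)
```

The retrieved chunks would then be placed into the prompt as context, which is what allows a response to be checked against its sources when evaluating grounding.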