COVID-19 (SARS-COV-2) PANDEMIC ANALYSIS AND MODELING [PYTHON]
Algorithm & Techniques: Data Analysis, Data Science
Language: Python
Technology: Keras
Tools: Anaconda
Date: January - June 2020
- Analyzed COVID-19 spread across geographic levels (region/country, state/province, and county) at daily granularity and developed visualizations for all countries with 9+ derived features.
- Worked toward a prediction model for pandemic spread, starting with countries and then extending to states and counties.
CANCER DETECTION FROM MICROSCOPIC TISSUE IMAGES WITH DEEP LEARNING
Algorithm & Techniques: Image Classification, Deep Learning, Convolutional Neural Network (CNN), Transfer Learning, Medical Imaging.
Language: Python
Technology: Keras
Tools: Anaconda
Date: November 2018 - April 2019
Description:
- Detected cancer in microscopic (histopathologic) tissue images with Google’s “NASNetLarge” model, attaining a testing accuracy (F1 score) of 93.72% and a loss of 0.30 on a 277K-image (6.5GB+) cancer dataset.
- Trained the full model from scratch and experimented with adding multiple custom layers to the final output.
Link to GitHub Repository (Code)
PNEUMONIA DETECTION FROM CHEST X-RAY IMAGES WITH DEEP LEARNING
Algorithm & Techniques: Image Classification, Deep Learning, Convolutional Neural Network (CNN), Transfer Learning, Medical Imaging.
Language: Python
Technology: Keras
Tools: Anaconda
Date: September - December 2018
Description:
- Detected pneumonia from around 6K chest X-ray images (1.15GB) by training a custom deep convolutional neural network (CNN) fully from scratch, and separately by fine-tuning the pretrained “InceptionV3” model.
- The custom deep CNN attained a testing accuracy (F1 score) of 89.53%, recall of 95.48%, and precision of 88.37%; InceptionV3 attained a testing accuracy (F1 score) of 83.44% and a loss of 0.42.
- For fine-tuning InceptionV3, froze the first few layers and trained the last two inception layers.
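For reference, precision, recall, and F1 score are related through the counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the project's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only (not the project's results).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

High recall matters in a screening setting because a missed positive case is usually costlier than a false alarm.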
Link to GitHub Repository (Code)
MALARIA PARASITE DETECTION IN THIN BLOOD SMEAR IMAGES WITH PRETRAINED CONVOLUTIONAL NEURAL NETWORK (NASNETMOBILE)
Algorithm & Techniques: Image Classification, Deep Learning, Convolutional Neural Network (CNN), Transfer Learning, Medical Imaging.
Language: Python
Technology: Keras
Tools: Anaconda
Date: February - May 2019
Description:
- Detected malaria parasites in thin blood smear images, collected through malaria screening research by the National Institutes of Health (NIH), using a convolutional neural network, specifically by completely retraining the pretrained model NASNetMobile.
- Before feeding data into the model, preprocessed and augmented the image dataset of 27,558 images (337MB) by adding random flips, rotations, and shears.
- After loading the pretrained NASNetMobile model, applied global max pooling, global average pooling, and a flatten layer to the model's output and concatenated the results. Added dropout and batch normalization layers for regularization, followed by the final output layer: a dense layer with softmax activation. Compiled with the Adam optimizer (learning rate 0.0001), the accuracy metric, and categorical cross-entropy loss.
- Trained for 10 iterations and attained a training accuracy of 96.47% with loss (categorical cross-entropy) 0.1026, and a validation accuracy of 95.46% with loss 0.1385.
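The softmax output layer and categorical cross-entropy loss mentioned above can be illustrated with a small, library-free sketch. This is not the project's Keras code; the helper names and the example logits are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for a one-hot target: -sum(t * log(p))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical logits for a two-class problem (e.g. parasitized vs. uninfected).
probs = softmax([2.0, 0.5])
loss = categorical_crossentropy([1.0, 0.0], probs)
print(probs, loss)
```

A lower loss means the model assigns more probability to the correct class, which is why loss and accuracy are reported together.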
Link to Kaggle Notebook (Code and Visualization)
ARTIST IDENTIFICATION FROM ARTWORKS WITH DEEP LEARNING
Algorithm & Techniques: Image Classification, Deep Learning, Convolutional Neural Network (CNN), Transfer Learning.
Language: Python
Technology: Keras, PyTorch
Tools: Anaconda, Kaggle
Date: March - September 2019
Description:
- Identified artists from their artworks using a convolutional neural network, specifically by completely retraining the pretrained model "InceptionResNetV3".
- Before feeding data into the model, preprocessed and augmented the image dataset of 8,446 images (2GB) from 50 different artists by adding random horizontal flips, rotations, and width and height shifts.
- After loading the pretrained model "InceptionResNetV3", added a 2D global average pooling layer and a dense layer with 512 units, followed by batch normalization and dropout layers for regularization, with activation applied only to the dense layer. Finally, added the output layer, a dense layer with softmax activation, and compiled with the Adam optimizer (learning rate 0.0001), the accuracy metric, and categorical cross-entropy loss.
- Trained for 15 iterations and attained a training accuracy of 98.36% with loss (categorical cross-entropy) 0.0820, and a validation accuracy of 78.75% with loss 0.9093.
Link to Artwork Dataset (Kaggle)
Link to GitHub Repository (Code)
Link to Kaggle Notebook (Code)
MOVIE REVENUE & RATING PREDICTION FROM IMDB MOVIE DATA
Algorithm & Techniques: Machine Learning, Supervised Learning, Regression Analysis
Language: Python
Technology: scikit-learn
Tools: Anaconda
Date: October - December 2016
Description:
- Developed a regression model for predicting revenue and ratings from 5,000 movies and attained a regression error (Mean Squared Error) of 0.0005 on a scale of 1 for revenue after 5-fold cross-validation.
- Conducted preprocessing and feature extraction (28 numerical, textual, and categorical features).
- Performed data analysis, visualization, feature extraction, cleaning (missing values, anomalies), and preprocessing (rescaling, normalization, and feature transformation via one-hot encoding), then trained with cross-validation.
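As a rough illustration of how a k-fold cross-validated Mean Squared Error is computed, the sketch below splits the data into contiguous folds and scores a mean-predictor baseline. The baseline, the function names, and the sample targets are assumptions for illustration; the project's actual models were built with scikit-learn.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_mse(y, k=5):
    """Average held-out MSE of a predict-the-training-mean baseline."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        preds = [train_mean] * len(test_idx)
        scores.append(mean_squared_error([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Hypothetical targets already rescaled to the range [0, 1].
targets = [0.10, 0.12, 0.08, 0.11, 0.09, 0.13, 0.10, 0.07, 0.12, 0.08]
print(cross_validated_mse(targets, k=5))
```

Because the target is rescaled to a scale of 1 before training, the reported error is only meaningful relative to that scale.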
Link to GitHub Repository (Code)
WEB RETRIEVAL & SEARCH ENGINE IMPLEMENTATION FOR UNIVERSITY WEB DOMAIN
Algorithm & Techniques: Search Engine, Search Relevance, Information Retrieval, Vector Space Model, Cosine Similarity
Language: Python
Technology: Django
Tools: Anaconda
Date: August - December 2017
Description:
- Developed an end-to-end web retrieval engine based on the vector space model for the University of Memphis and evaluated its performance on 10,000 web pages and documents (text, pdf, docx, and pptx) from the university domain.
- Used a TF-IDF vector space model and the cosine similarity function for web page matching and ranking.
- Developed the following modules: web crawler (with memory), text preprocessor (preprocess, tokenize, and stem raw HTML/docs), page indexer, page relevance ranker, and performance evaluator (F1, precision, recall).
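The TF-IDF and cosine similarity ranking described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's implementation: the whitespace tokenizer (no stemming), the smoothed IDF formula, the function names, and the sample pages are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as {term: weight}; unknown terms are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    """Return (idf, doc_vectors) for a list of raw document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    # Smoothed IDF (an assumed variant) keeps weights positive for common terms.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, [tfidf_vector(tokens, idf) for tokens in tokenized]


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Return (score, doc_index) pairs sorted from most to least relevant."""
    idf, vectors = build_index(docs)
    q = tfidf_vector(tokenize(query), idf)
    return sorted(((cosine(q, vec), i) for i, vec in enumerate(vectors)), reverse=True)


# Hypothetical page texts standing in for crawled university pages.
pages = [
    "graduate admissions deadlines and requirements",
    "campus parking permits and maps",
    "computer science graduate courses",
]
for score, idx in rank("graduate admissions", pages):
    print(f"{score:.3f}  {pages[idx]}")
```

Pages sharing no terms with the query score zero, so they fall to the bottom of the ranking.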
Link to GitHub Repository (Code)
MOVIE RECOMMENDATION ENGINE USING USER BASED COLLABORATIVE FILTERING
Algorithm & Techniques: Recommendation Engine, Recommendation Systems, Collaborative Filtering
Language: C++, Python
Technology: NA
Tools: Sublime Text
Date: February - April 2017
Description:
- Developed a user-based movie recommender system by implementing user-user collaborative filtering with runtime and space complexity optimizations, with separate implementations in C++ and Python.
- Used the Netflix movie dataset with 100K user records.
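A minimal sketch of user-user collaborative filtering follows: it predicts a user's rating for a movie as the similarity-weighted average of the ratings given by the most similar users. The choice of cosine similarity over co-rated movies, the neighbourhood size, the function names, and the tiny ratings table are assumptions for illustration, not details from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the movies both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, movie, k=2):
    """Similarity-weighted average of the k most similar users' ratings."""
    neighbours = [
        (cosine_similarity(ratings[user], ratings[other]), ratings[other][movie])
        for other in ratings
        if other != user and movie in ratings[other]
    ]
    top = sorted(neighbours, reverse=True)[:k]
    weight = sum(sim for sim, _ in top)
    if weight == 0:
        return None  # no usable neighbour has rated this movie
    return sum(sim * rating for sim, rating in top) / weight


# Hypothetical ratings: {user: {movie: rating on a 1-5 scale}}.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 5, "B": 4, "C": 5},
    "u3": {"A": 1, "B": 2, "C": 1},
}
print(predict_rating(ratings, "u1", "C"))
```

Storing ratings as per-user dictionaries keeps memory proportional to the number of ratings actually given, which matters for sparse data.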
Link to GitHub Repository (Code)
RESTAURANT RECOMMENDATION SYSTEM USING RELATIONAL DATABASE
Algorithm & Techniques: Recommendation System, Relational Database
Language: Python
Technology: MySQL, Django
Tools: Anaconda
Date: October - December 2017
Description:
- Implemented a restaurant recommendation system based on user (e.g., location, cuisine preference) and restaurant (location, cuisine, ratings, reviews) information.
- Included features to derive review effectiveness and user trustworthiness from available data.
Link to GitHub Repository (Code)
TOXIC COMMENT IDENTIFICATION / CLASSIFICATION
Algorithm & Techniques: Machine Learning, Supervised Learning, Classification, Natural Language Processing, Text Classification, Text Analysis
Language: Python
Technology: scikit-learn, NLTK
Tools: Anaconda
Date: August - September 2018
Description:
- Classified around 130,000 text comments (34MB) into the categories "Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity Hate", "Any of the Above", and "None of the Above".
- Used features from the AAAI 2018 paper "Anatomy of Online Hate: Developing a Taxonomy and Machine Learning Models for Identifying and Classifying Hate in Online News Media" by "Salminen, Almerekhi".
- Built machine learning pipelines covering file reading, train/test dataset creation, preprocessing, feature extraction, and grid-search training and evaluation across multiple models.
- Generated visualizations and an aggregated report on the performance of the various models.
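The grid search approach mentioned above amounts to scoring every combination of hyperparameter values and keeping the best one. The sketch below shows the idea without any library; `toy_score` is a hypothetical stand-in for "train a model and return its validation score", and the parameter names are invented for illustration.

```python
from itertools import product


def grid_search(train_and_score, param_grid):
    """Score every parameter combination; return (best_params, best_score, results)."""
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((train_and_score(**params), params))
    best_score, best_params = max(results, key=lambda item: item[0])
    return best_params, best_score, results


def toy_score(c, ngram):
    """Hypothetical scoring function that peaks at c=1.0, ngram=2."""
    return 1.0 - abs(c - 1.0) * 0.1 - abs(ngram - 2) * 0.05


best_params, best_score, results = grid_search(
    toy_score, {"c": [0.1, 1.0, 10.0], "ngram": [1, 2, 3]}
)
print(best_params, best_score)
```

Keeping the full `results` list is what makes an aggregated comparison report across configurations possible.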
Link to GitHub Repository (Code)
REGRESSION MODELING FOR HOUSING PRICE PREDICTION
Algorithm & Techniques: Machine Learning, Supervised Learning, Regression
Language: Python
Technology: scikit-learn, NLTK
Tools: Anaconda
Date: August - September 2018
Description:
- Built a regression model for predicting housing prices using 79 numerical and categorical features, with a regression error (Mean Squared Error) of 0.000685 on a scale of 1.
- Built machine learning (regression) pipelines covering preprocessing (normalization, label encoding of categorical features), feature extraction, and grid-search training and evaluation across multiple regression models, with visualizations and an aggregated performance report.
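The two preprocessing steps named above, normalization and label encoding, can be sketched as follows. Min-max scaling is assumed here as the normalization method, and the helper names and sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def label_encode(categories):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories], mapping


# Hypothetical feature columns for illustration.
print(min_max_scale([100, 150, 200]))
print(label_encode(["brick", "wood", "brick"]))
```

Scaling the target to [0, 1] is what gives an error figure "on a scale of 1" its meaning.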
Link to GitHub Repository (Code)
RENT-A-BIKE WEB APPLICATION
Algorithm & Techniques: Software Development, Web Application Development, Agile Development, Share App, Ride Share App, Model-View-Controller (MVC)
Language: Ruby on Rails, JavaScript, HTML, CSS
Technology: MVC, Bootstrap, Git
Tools: Sublime Text, Virtual Box, GitHub
Date: January - April 2017
Description:
- Developed, as a team member, a web application for renting, sharing, selling, and buying bikes, with chat and map features, aimed at university students.
- Used MVC architecture and CRUD operations on the Rails platform.
IMAGE RECOGNITION USING DEEP CONVOLUTIONAL NEURAL NETWORK
Algorithm & Techniques: Image Classification, Deep Learning, Convolutional Neural Network (CNN), Transfer Learning.
Language: Python
Technology: Keras, TensorFlow
Tools: Anaconda
Date: September - December 2018
Description:
- Developed image classification tools using a deep convolutional neural network built from scratch with Keras and, separately, the pretrained “InceptionV3” model fine-tuned with new class labels.
- Trained on multiple datasets: Flower dataset (testing accuracy 85.68%, 5 species, 4.5K images, 228MB), 10 Monkey Species dataset (validation accuracy 97.06%, 553MB), and Dog Breed dataset (testing accuracy 76.41%, 120 classes, 10.2K images, 344MB).
Sample prediction on 5 species of flowers
Prediction on 5 species of flowers for 64 images
Prediction on 120 species of Dog images
SERVER-CLIENT CHAT APPLICATION
Algorithm & Techniques: Client-Server Architecture, Network Programming (TCP/IP)
Language: Java, Android
Technology: TCP/IP
Tools: Android Studio
Date: August 2015
Description:
- Developed a TCP/IP-based chat server and client application in which multiple clients can chat simultaneously.
- Built client applications for both Android and desktop platforms.