RESEARCH
Machine Learning, Deep Learning, Natural Language Processing, Computer Vision, Distributed Big Data Analytics, Information Retrieval, Recommendation Engine
GENERATIVE OPEN DOMAIN CHATBOT APPLICATION WITH DEEP LEARNING
Algorithm & Techniques: Machine Learning, Deep Learning, Recurrent Neural Network (RNN), LSTM, Bidirectional LSTM, Sequence to Sequence (Seq2Seq), Beam Search, Neural Attention Mechanism
Language: Python
Technology: TensorFlow, PyQt
Tools: Anaconda, Linux
Date: January - May 2018
Description:
- Developed a generative, open-domain conversational agent (human vs. AI) using the state-of-the-art Sequence-to-Sequence (Seq2Seq) architecture, attaining a validation perplexity of 46.82 and a BLEU score of 10.6.
- Trained an encoder-decoder Seq2Seq model entirely from scratch and further optimized the Recurrent Neural Network based model with Bidirectional LSTM cells, a Neural Attention Mechanism, and Beam Search.
- Used the preprocessed Cornell Movie Subtitle Corpus as training data, PyQt for the chat interface (GUI), and an untrained Google Neural Machine Translation (NMT) model for the Seq2Seq module.
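As a rough illustration of the Beam Search decoding step mentioned above, here is a minimal sketch in plain Python. It is not the project's code: `toy_model`, its probability table, and the `<eos>` token are hypothetical stand-ins for a trained decoder, and the sketch omits length normalization, which practical decoders often add.

```python
import math


def toy_model(prefix):
    """Hypothetical stand-in for a decoder: maps a prefix to next-token log-probs."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    # Any prefix not in the table ends the sequence with probability 1.
    probs = table.get(prefix, {"<eos>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


def beam_search(step_log_probs, beam_width, max_len, eos="<eos>"):
    """Return the (sequence, log-probability) pair with the highest score found."""
    beams = [((), 0.0)]  # unfinished hypotheses: (token tuple, summed log-prob)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for prefix, score in beams:
            for token, log_p in step_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + log_p))
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # include hypotheses cut off by max_len
    return max(finished, key=lambda c: c[1])


best_seq, best_score = beam_search(toy_model, beam_width=2, max_len=5)
greedy_seq, greedy_score = beam_search(toy_model, beam_width=1, max_len=5)
```

With a beam width of 1 the search is greedy and commits to the locally likeliest first token, while a wider beam can recover a sequence whose overall probability is higher.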
Link to GitHub Repository (Code)
Link to Demo
DISTRIBUTED MACHINE LEARNING FOR BIOMARKERS DETECTION FROM WEARABLE SENSOR BIG DATA
Algorithm & Techniques: Machine Learning, Distributed Machine Learning, Classification, Supervised Learning, Mobile Health, Big Data Analytics
Language: Python
Technology: Apache Spark, scikit-learn, Git, GitHub
Tools: IntelliJ IDEA, Linux
Date: January - April 2017
Description:
- Developed a Machine Learning (ML) module for training ML models on multiple clusters with Apache Spark.
- Developed a Grid and Random Grid Search CV module to optimize training time and parameter search.
- Detected biomarkers (psychological stress) from large-scale streaming data (accelerometer, ECG, respiration rate) collected by multimodal wearable sensors, reaching an F1 score of 87% initially with a radial-kernel SVM.
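For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a binary F1 score is computed and why it can differ from plain accuracy. The labels are made-up illustration data, not results from this project, and `f1_score_binary` is a hypothetical helper name.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = stressed, 0 = not stressed (illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

f1 = f1_score_binary(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On imbalanced data such as this toy example, accuracy is inflated by the many correctly predicted negatives, which is one reason F1 is a common choice for detection tasks.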
Link to GitHub Repository (Code)
SURVEY ON MACHINE LEARNING BASED PHYSICAL ACTIVITY RECOGNITION METHODS FROM SENSOR DATA
Date: December 2016 - February 2017
- Conducted research on machine learning based algorithms for physical activity recognition (e.g., walking, running, eating, and drinking) from multimodal wearable sensor data.
Walking, Running, Jogging
Detecting Eating, Drinking
DISTRIBUTED BIG DATA APPLICATION FOR LARGE SCALE US STOCK MARKET DATA ANALYSIS
Algorithm & Techniques: Financial Analysis, Stock Market Analysis, Anomaly Detection, Distributed Big Data Analytics, Big Data Analytics, Big Data, Data Analytics, FinTech
Language: Java, Python
Technology: Apache Spark, Maven, Git
Tools: IntelliJ IDEA, Linux
Date: May - July 2017
Description:
- Developed a framework for processing and analyzing 7 years of historical US stock market data (50 TB) at nanosecond granularity from 13 US exchanges on multiple clusters with Apache Spark.
- Added support for extracting information from binary files based on field specifications, covering multiple years and file formats.
- Conducted multi-market analysis (for market dominance detection) and anomaly detection (for the Flash Crash day).
- Proposed using unsupervised learning (clustering) on large-scale unlabeled stock market data for anomaly detection and general market analysis.
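The field-spec-driven extraction from binary files described above might look roughly like the sketch below. Everything here is assumed for illustration: the spec names `v1` and `v2`, the field names, and the record layouts are hypothetical and do not reflect the actual exchange file formats or the project's Spark code.

```python
import struct

# Hypothetical field specs: one entry per file format, each a list of
# (field name, struct format code) pairs describing a fixed-width record.
FIELD_SPECS = {
    "v1": [("timestamp_ns", "Q"), ("price_cents", "I"), ("size", "I")],
    "v2": [("timestamp_ns", "Q"), ("exchange", "c"), ("price_cents", "I"), ("size", "I")],
}


def parse_records(data, spec_name):
    """Decode a bytes buffer of back-to-back records into a list of dicts."""
    fields = FIELD_SPECS[spec_name]
    # "<" selects little-endian byte order with no padding between fields.
    record = struct.Struct("<" + "".join(code for _, code in fields))
    names = [name for name, _ in fields]
    return [dict(zip(names, values)) for values in record.iter_unpack(data)]


# Build two sample records in the hypothetical "v1" layout and decode them.
raw_v1 = struct.pack("<QII", 1, 10050, 200) + struct.pack("<QII", 2, 10075, 100)
rows_v1 = parse_records(raw_v1, "v1")

# The same parser handles a different layout by switching the spec.
raw_v2 = struct.pack("<QcII", 3, b"N", 9990, 50)
rows_v2 = parse_records(raw_v2, "v2")
```

Keeping the layout in a data table rather than in the parsing code means a new year or file format only needs a new spec entry.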
Link to GitHub Repository (Code)
ECONOMIC MODEL DEVELOPMENT FOR COVID-19 PANDEMIC WITH MACHINE LEARNING
Algorithm & Techniques: Machine Learning, Data Analytics, Economics
Language: Python
Technology:
Tools: Anaconda
Date: July - September 2020
Description:
- Conducted analysis toward an economic model of the COVID-19 pandemic, applying machine learning algorithms to 20 years of country-level economic data.
Would you like to learn more about my research projects?