Crowdsourcing Analysis and Visualization

Best Project Award
Crowdsourcing Analysis and Visualization (CSAV) explores a relatively unexplored dataset (the Kickstarter website) and presents a visually appealing, interactive dashboard that lets users view important metrics at the geographical and categorical level. Temporal data is also used to draw trends and insights into how projects were launched day by day. Since a project has two main final statuses, successful or failed, the data allowed us to implement logistic regression to predict whether a project will succeed. Logistic regression was chosen for its simplicity and efficiency in predicting balanced binary classes. Tech used: Python, D3.js, Flask, JavaScript, HTML, CSS

ManyLabs Data Completion and Interpolation

Engineered a model pipeline that cleans a psychology study dataset and then iteratively imputes its columns with missing values using machine learning techniques. The dataset comes from psychology studies in the Many Labs series, an attempt to test the replicability or generalizability of psychological effects. To support the pipeline and impute the missing columns, the following algorithms were implemented from scratch (to understand the math behind the models):

  • Regression Algorithms: Mean Imputation (Baseline), Linear Regression, Ridge Regression, K-Nearest Neighbors, Decision Trees, Artificial Neural Networks
  • Classification Algorithms: Mode Imputation (Baseline), Logistic Regression, Softmax Regression, K-Nearest Neighbors, Decision Trees, AdaBoost
  • Generative Algorithms: K-Means Clustering, Gaussian Mixture Models, Variational Autoencoders (not from scratch)
Tech used: Python
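
To illustrate the idea behind the baselines and the regression-based imputers, here is a minimal sketch, not the pipeline's actual code. It compares mean imputation with imputation from a one-predictor least-squares fit. The function names and the toy columns are assumptions made up for this example.

```python
from statistics import mean


def mean_impute(column):
    """Baseline: replace each None with the mean of the observed values."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


def fit_simple_linear(xs, ys):
    """Closed-form least squares for y = a + b * x with one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def regression_impute(predictor, target):
    """Fit on rows where the target is observed, then predict the gaps."""
    rows = [(x, y) for x, y in zip(predictor, target) if y is not None]
    a, b = fit_simple_linear([x for x, _ in rows], [y for _, y in rows])
    return [a + b * x if y is None else y for x, y in zip(predictor, target)]


# Hypothetical toy data: the target is exactly twice the predictor.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [2.0, 4.0, 6.0, None, 10.0]

baseline = mean_impute(target)
imputed = regression_impute(predictor, target)
print("mean imputation:      ", baseline[3])
print("regression imputation:", imputed[3])
```

On data with a clear linear relationship, the regression imputer recovers the missing value far better than the mean baseline, which is the reason for comparing every model against a baseline.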

CS520: Artificial Intelligence Coursework

All the coursework was designed and conceptualized from scratch, using only NumPy.

  • Mazerunner - AI Search Algorithms
  • Minesweeper - Constraint Satisfaction Problems
  • Search and Destroy - Bayesian Networks
  • Image Colorizer - Neural Networks

Tech used: Python

Music Genre Belief Recognition

This was the major project for the course CS543: Massive Data Storage and Retrieval. The aim of the project was to predict the genre of a song over time. It was conceptualized and modelled with the help of spectrograms, convolutional neural networks and recurrent neural networks (later replaced by time-distributed layers to improve accuracy). Finally, the model was deployed and visualized on the web with a TensorflowJS front end. Tech used: Python, TensorflowJS, Keras, JavaScript, HTML, CSS

Live Transcription of Sign Language using Convolutional Neural Networks

This project was my Bachelor's thesis and my introduction to the world of convolutional neural networks. The aim of the project was to deploy a webcam-based user interface that could predict sign language in real time. The core of the model is a CNN architecture, trained on a dataset of roughly 170,000 images, achieving an accuracy of 97%. The GitHub repo is currently unstructured and needs to be updated. Tech used: Python, TensorflowJS, Keras, OpenCV

Scratching Statistics

A collection of projects and assignments I completed as part of CS581: Probability and Statistics, all from scratch. The attached GitHub repo consists of an EDA on hurricane data, which used Monte Carlo simulation to establish whether the data followed a Poisson distribution; Method of Moments and Maximum Likelihood estimation for discrete and continuous distributions; goodness of fit through the Kolmogorov-Smirnov test; and sampling methods like the bootstrap and the jackknife. These were followed by the major project, which involved finding over- and under-expressed genes in the NCI60 dataset using Student's t-test and false discovery rates. Tech used: R Programming
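
As a rough illustration of the two resampling methods named above, here is a minimal sketch in Python rather than R, and not the code from the repo. It estimates the standard error of a sample mean with the jackknife and the bootstrap. The sample values, the seed, and the function names are assumptions chosen for this example.

```python
import math
import random
import statistics


def jackknife_se(data, stat=statistics.mean):
    """Jackknife standard error: recompute the statistic leaving one point out."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    center = statistics.mean(leave_one_out)
    return math.sqrt((n - 1) / n * sum((t - center) ** 2 for t in leave_one_out))


def bootstrap_se(data, stat=statistics.mean, n_resamples=2000, seed=0):
    """Bootstrap standard error: resample with replacement and take the
    standard deviation of the statistic across resamples."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [stat(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(replicates)


# Hypothetical sample, for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
jk_se = jackknife_se(data)
boot_se = bootstrap_se(data)
print("jackknife SE of the mean:", jk_se)
print("bootstrap SE of the mean:", boot_se)
```

For the sample mean, the jackknife estimate coincides with the textbook formula s / sqrt(n), which makes it a handy sanity check before applying either method to statistics that have no closed-form standard error.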

Test and Control Methodology - Retail Stores

During my time as a data analyst at The Smart Cube, I worked on providing a test and control methodology for a major retail client. The main purpose of the project was to give a clear picture of how sales in the client's stores would be affected if the client made changes in a store or a group of stores (position/space/range/new/old sections). The approach relied on statistical tests such as slope tests, co-integration tests and correlation tests to group stores that behaved similarly in terms of sales, demographics, etc. Tech used: SAS Programming, Microstrategy, SQL, Excel

Hierarchical Mixed Effects Modeling - Retail Stores

Another project during my time as a data analyst at The Smart Cube was building a Hierarchical Mixed Effects model for a major retail client. It allowed the client to gauge the predicted impact on product sales in its stores based on demographics, weather, holidays, and store changes such as a change in the position/range/space of items or the renewal and discontinuation of items, among other variables. This type of model was chosen because of the multi-level nature of the data. Tech used: R Programming, Excel

Gravitational Search Algorithm in Recommendation Systems

A project of many firsts: first published paper, first international conference, first research experience. The motivation behind the paper was to fuse computational intelligence techniques with collaborative filtering methods. It explored a relatively new nature-inspired meta-heuristic algorithm named Gravitational Search Algorithm (GSA) for the purpose of recommending jokes to users. Later, I presented the paper at the 8th International Conference on Swarm Intelligence, Fukuoka, Japan. Tech used: Python, LaTeX

Tackling TSP with Ant Colony Optimization and Visualization

Final project for CS512: Data Structures and Algorithms. The traveling salesman problem was tackled with the help of a nature-inspired technique called Ant Colony Optimization. The project models the probabilistic behavior of ants that lay and follow pheromone trails while foraging for food. A simple user interface was also created to let the user play with the number of ants and the number of steps the program should run for. The final output was a visualization of the order in which the cities should be visited so as to cover the least distance possible. Tech used: Python
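
The core loop of Ant Colony Optimization can be sketched as follows. This is a minimal illustration under assumed settings, not the project's code: the parameter names (alpha, beta, evaporation), their default values, and the four-city example are all hypothetical. Like the project's interface, it exposes the number of ants and the number of steps.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour, returning to the starting city."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def ant_colony_tsp(points, n_ants=10, n_steps=50, alpha=1.0, beta=2.0,
                   evaporation=0.5, seed=0):
    """Return (best_tour, best_length) found by a basic ant system.

    Assumes all points are distinct, so no distance between cities is zero.
    """
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, float("inf")

    for _ in range(n_steps):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                current = tour[-1]
                candidates = sorted(unvisited)
                # Edge attractiveness: pheromone^alpha * (1 / distance)^beta.
                weights = [pheromone[current][j] ** alpha
                           * (1.0 / dist[current][j]) ** beta
                           for j in candidates]
                nxt = rng.choices(candidates, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            length = tour_length(tour, dist)
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length

        # Evaporate old pheromone, then deposit more on shorter tours.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - evaporation
        for tour, length in tours:
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length

    return best_tour, best_len


# Hypothetical example: four cities on the corners of a unit square.
cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best_tour, best_len = ant_colony_tsp(cities)
print(best_tour, best_len)
```

Shorter tours deposit more pheromone per edge, so later ants are increasingly likely to reuse good edges, while evaporation keeps early, poor choices from dominating.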

Recommendation Systems

Some of the basic and intermediate-level work I did on recommendation systems during my undergraduate days, before starting research on my paper. It includes a song recommendation system based on popularity and item similarity, and a movie recommendation system using collaborative filtering with Particle Swarm Optimization (which later served as a comparison algorithm for the paper). Tech used: Python

Machine Learning Algorithms from scratch

I worked on this project to understand how various machine learning algorithms work and the math behind them. It also serves as a refresher for anyone who would like to go through basic implementations of algorithms such as Linear/Logistic Regression, KNNs, ANNs, etc. Tech used: Python
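
As a taste of what "from scratch" means here, the following is a minimal sketch of logistic regression trained with batch gradient descent, using only the standard library. It is an illustration rather than the repository's implementation. The learning rate, epoch count, and toy data are assumptions chosen for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.3, epochs=3000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # The gradient of the log-loss w.r.t. the logit is (prediction - label).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


# Hypothetical toy data: one feature, classes separated around x = 2.25.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
print("weights:", w, "bias:", b)
print("predictions:", predict(X, w, b))
```

Writing the gradient step by hand makes it clear that the model is a linear score passed through a sigmoid, and that training only nudges the weights in proportion to each prediction error.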

Phone

Address

New Brunswick, NJ 08901
United States of America