Machine Learning Projects

ECTSum: Bullet Point Summarization of Long Earnings Call Transcripts

In association with Goldman Sachs

Created ECTSum, a new dataset that pairs Earnings Call Transcripts (ECTs) of publicly traded companies with short expert-written summaries derived from the corresponding Reuters articles. ECTs are long, unstructured documents with no prescribed length limit or format. Benchmarked the dataset using state-of-the-art summarization models such as BigBird, SummaRuNNer and Longformer Encoder Decoder. Proposed a FinBERT-T5 based paraphraser model with a 13.3% ROUGE-2 gain and 8.5% less factual hallucination. Published as a Long Paper at EMNLP 2022.
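For readers unfamiliar with the metric, here is a minimal sketch of how a ROUGE-2 score can be computed from clipped bigram overlap. It assumes plain whitespace tokenization and lowercasing; standard ROUGE toolkits add stemming and other preprocessing, so this illustrates the idea and is not the evaluation script used in the paper. The example sentences are invented.

```python
from collections import Counter


def bigrams(tokens):
    """Return a multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rouge_2(candidate, reference):
    """Return (precision, recall, f1) based on clipped bigram overlap."""
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example sentences, for illustration only.
reference = "revenue rose 10 percent in the third quarter"
candidate = "revenue rose 10 percent in the quarter"
p, r, f = rouge_2(candidate, reference)
```

Here five of the candidate's six bigrams also appear among the reference's seven bigrams, so precision is 5/6 and recall is 5/7.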

Python PyTorch CUDA

Project Link: ECTSum: Bullet Point Summarization of Long ECTs

Paper Link: Long Paper in EMNLP 2022 (Main Conference)

Video Game Level Generation using DCGAN

Advisor: Prof. Adway Mitra, Centre of Excellence in Artificial Intelligence, IIT Kharagpur

Generating levels for video games with machine learning models instead of human designers is becoming increasingly common. In this paper, we explore an alternative GAN architecture for creating playable game levels, with a focus on Super Mario games. We also compare latent space search techniques for optimising the GAN's inputs within the latent vector space.

Python Java Bash

Project Link: Video Game Level Generation using DCGAN

Paper Link: VGL-GAN Paper

Neural File Search Engine

Advisor: Prof. Palash Dey, Department of Computer Science & Engineering, IIT Kharagpur

Designed and developed CoeuSearch, an NLP-based intelligent local-file search engine that finds relevant documents in a directory by considering the semantics of each file's name as well as its content. Devised a three-fold search strategy using SBERT-based dual encoders and a KeyBERT topic extraction model. Employed cache optimization techniques to reduce response time by 70%.

Python PyTorch NLTK Django

Project Link: Neural File Search Engine

Multilingual News Article Similarity

Advisor: Prof. Pawan Goyal, Department of Computer Science & Engineering, IIT Kharagpur

Leveraged the knowledge of pre-trained language models (mBERT and XLM) to predict the overall similarity between a given pair of articles. We proposed a model based on Sentence Transformer that computes contextualized embeddings and compares them with cosine similarity. Our approach in the multilingual setting ranked 19th on the official SemEval 2022 Task 8 leaderboard, with a Pearson correlation score of 0.721.
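The scoring and evaluation steps can be sketched in a few lines of plain Python. This is an illustrative sketch only: the three-dimensional vectors stand in for real sentence embeddings, and the gold scores are made-up numbers, not SemEval data.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy "embeddings" for three article pairs (assumed values, not model output).
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # identical direction
    ([1.0, 1.0, 0.0], [1.0, 0.0, 0.0]),  # partially aligned
    ([0.0, 1.0, 0.0], [1.0, 0.0, 0.0]),  # orthogonal
]
gold = [1.0, 0.6, 0.1]  # hypothetical human similarity ratings

predicted = [cosine_similarity(u, v) for u, v in pairs]
score = pearson(predicted, gold)
```

In practice the vectors would come from a multilingual sentence encoder, and the Pearson correlation would be computed over the full evaluation set.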

Python PyTorch CUDA

Project Link: Multilingual News Article Similarity

Entailment as Few Shot Learner For ACOS Quad Extraction Task

Advisor: Prof. Pawan Goyal, Department of Computer Science & Engineering, IIT Kharagpur

In this work, we highlight the limitations of generative models through extensive data analysis and present two novel approaches to address them. The first reformulates category classification as an entailment task, while the second uses a paraphrase modeling paradigm to cast the ACOS task as a paraphrase generation process. Acknowledging the scarcity of specialized datasets across domains, we compare both the in-domain and cross-domain performance of the considered methods on the ACOS task and report new state-of-the-art results.
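To make the entailment reformulation concrete, the sketch below turns category classification into premise/hypothesis pairs and picks the best-scoring category. The hypothesis template, the category labels, and the word-overlap scorer are all assumptions for illustration; in a real system the scorer would be a trained entailment (NLI) model, and the project's actual templates may differ.

```python
def build_entailment_pairs(review, categories, template="The review is about {}."):
    """Pair the review (premise) with one hypothesis per candidate category."""
    pairs = []
    for category in categories:
        readable = category.replace("#", " ").lower()
        pairs.append((review, template.format(readable), category))
    return pairs


def toy_entailment_score(premise, hypothesis):
    """Placeholder for an NLI model: counts hypothesis words found in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().rstrip(".").split()
    return sum(word in premise_words for word in hypothesis_words)


def pick_category(review, categories, score_fn=toy_entailment_score):
    """Return the category whose hypothesis scores highest against the review."""
    pairs = build_entailment_pairs(review, categories)
    _, _, best = max(pairs, key=lambda p: score_fn(p[0], p[1]))
    return best


# Hypothetical review and category labels.
review = "The battery life of this laptop is great"
categories = ["BATTERY#GENERAL", "DISPLAY#QUALITY", "KEYBOARD#USABILITY"]
predicted_category = pick_category(review, categories)
```

Because each category becomes a natural-language hypothesis, a new category can be added by writing a new hypothesis, which is what makes this framing attractive in few-shot settings.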

Python PyTorch Paddle-NLP CUDA

Project Link: Entailment as Few Shot Learner For ACOS Quad Extraction Task

Investigating Generative Approaches For ACOS Quad Extraction Task

Advisor: Prof. Pawan Goyal, Department of Computer Science & Engineering, IIT Kharagpur

Developed three generative methods for the Aspect Category Opinion Sentiment (ACOS) task. Two of them respect the order of the generated triplets/quads by using autoregressive decoders, while the third leverages a novel set-based bipartite matching loss to train a non-autoregressive parallel decoder. Acknowledging the scarcity of specialized datasets across domains, compared both the in-domain and cross-domain performance of the considered methods on the ASTE task, thereby drawing notable inferences. Employed all proposed architectures for the ACOS task and reported new state-of-the-art results on the corresponding benchmark dataset.
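The idea behind a set-based bipartite matching loss is that a parallel decoder emits its quads in no particular order, so each prediction must first be matched one-to-one with a gold quad before a loss is computed. The sketch below illustrates only that matching step, under loud assumptions: it uses a simple field-mismatch count as the cost and brute-force search over permutations, whereas a trainable loss would use a differentiable cost and typically the Hungarian algorithm. The example quads are invented.

```python
from itertools import permutations


def pair_cost(pred, gold):
    """Number of fields on which a predicted tuple disagrees with a gold tuple."""
    return sum(p != g for p, g in zip(pred, gold))


def best_matching(preds, golds):
    """Return (total_cost, assignment) minimising cost over one-to-one matchings.

    assignment[i] is the index of the gold tuple matched to preds[i].
    Brute force: assumes len(preds) == len(golds) and small sets.
    """
    best_cost, best_perm = None, None
    for perm in permutations(range(len(golds))):
        cost = sum(pair_cost(preds[i], golds[j]) for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Invented (aspect, category, opinion, sentiment) quads.
golds = [
    ("battery", "BATTERY", "great", "positive"),
    ("screen", "DISPLAY", "dim", "negative"),
]
preds = [
    ("screen", "DISPLAY", "dim", "negative"),    # correct, but emitted first
    ("battery", "BATTERY", "good", "positive"),  # one wrong field (opinion)
]
cost, assignment = best_matching(preds, golds)
```

The matching pairs the first prediction with the second gold quad and vice versa, so the model is penalised only for the single wrong field and not for the output order.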

Python PyTorch Fast-AI CUDA

Project Link: Investigating Generative Approaches For ACOS Quad Extraction Tasks

Multitasking Framework for Emotional Analysis

Advisor: Prof. Pawan Goyal, Department of Computer Science & Engineering, IIT Kharagpur

This project is an implementation of the research paper "All-in-One: Emotion, Sentiment and Intensity Prediction using a Multi-task Ensemble Framework", which proposes a multi-task ensemble framework that jointly learns multiple related problems. The ensemble model leverages the learned representations of three deep learning models (CNN, LSTM and GRU) together with a hand-crafted feature representation to make its predictions. Achieved a 5.2% increase in accuracy on the emotion classification task and a 0.33 increase in Pearson correlation score on the intensity task.

Python Keras TensorFlow

Project Link: Multitasking Framework for Emotional Analysis

Stock Price Movement Prediction using Sentiment Analysis

Advisor: Prof. Adway Mitra, Department of Computer Science & Engineering, IIT Kharagpur

Worked on establishing a statistical correlation between social media sentiment and the stock price movement of companies. Performed sentiment analysis using BERT on companies' official tweets to generate a social media sentiment score. Used this score as an additional signal in an LSTM network built on features such as Open, Close, Low, High, Volume and Adj Close prices to predict stock prices.
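As a rough illustration of how the sentiment score can sit alongside the price features, the sketch below builds sliding-window inputs for a sequence model. The column order, the window length, the choice of next-day Close as the target, and all numbers are assumptions for this example; they are not taken from the project code.

```python
def make_windows(prices, sentiment, window, close_index=3):
    """Build (inputs, targets) for next-day close prediction.

    prices: one row per day, [open, high, low, close, adj_close, volume].
    sentiment: one score per day, appended to that day's row as an extra feature.
    Each input is `window` consecutive days; its target is the close of the next day.
    """
    if len(prices) != len(sentiment):
        raise ValueError("prices and sentiment must cover the same days")
    features = [row + [score] for row, score in zip(prices, sentiment)]
    inputs, targets = [], []
    for start in range(len(features) - window):
        inputs.append(features[start:start + window])
        targets.append(prices[start + window][close_index])
    return inputs, targets


# Made-up data for five trading days.
prices = [
    [10.0, 10.8, 9.9, 10.5, 10.5, 1200.0],
    [10.5, 11.2, 10.4, 11.0, 11.0, 1500.0],
    [11.0, 11.1, 10.2, 10.4, 10.4, 1700.0],
    [10.4, 10.9, 10.3, 10.8, 10.8, 1100.0],
    [10.8, 11.5, 10.7, 11.4, 11.4, 1900.0],
]
sentiment = [0.2, 0.6, -0.4, 0.1, 0.7]  # hypothetical daily sentiment scores

inputs, targets = make_windows(prices, sentiment, window=3)
```

With five days and a window of three, this yields two training samples, each of shape 3 days by 7 features, which is the kind of input an LSTM layer expects after scaling.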

Python Keras Time Series Sequence Models

Project Link: Stock Price Movement Prediction using Sentiment Analysis

Aurix, Smart-Electroacoustic-Transducers

Self Project

Imagine a world where you can immerse yourself in music while staying connected to the world around you. Enjoy your favorite tunes with your earphones or headphones, while still being alerted when someone is trying to reach you. For those times when you're on the move, this technology acts as your second set of ears, keeping you safe and alert. By alerting individuals using headphones while driving, we aim to significantly reduce the risk of accidents on the road.

Python Speech Recognition Natural Language Processing