This repository collects Natural Language Processing (NLP) projects covering sentiment analysis, word embeddings, machine translation, and human-versus-machine text classification.
Description: Implemented a sentiment analysis model to classify text reviews as positive or negative using the Bag of Words technique.
Technologies Used:
- Python
- NLTK
- scikit-learn
Key Outcomes/Achievements:
- Achieved an F1 score of 0.86 on an Amazon reviews dataset.
- Demonstrated understanding of text preprocessing techniques (e.g., tokenization, stemming/lemmatization) and feature extraction.
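The Bag of Words representation and the F1 metric mentioned above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the regex tokenizer is a simple stand-in for an NLTK tokenizer, and the example sentences are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes.

    A simple stand-in for an NLTK tokenizer (assumption, not the project's code).
    """
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok, n in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example reviews, for illustration only.
vocab = build_vocabulary(["Great product, great price", "Terrible product"])
print(bag_of_words("Great great product", vocab))  # [2, 0, 1, 0]
print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))        # 0.5
```

In practice, scikit-learn's `CountVectorizer` and `sklearn.metrics.f1_score` cover the same ground with more options.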
Description: Implemented and compared CBOW (Continuous Bag of Words) and Word2Vec models to generate word embeddings and capture semantic relationships.
Technologies Used:
- Python
- Gensim
- PyTorch
Key Outcomes/Achievements:
- Gained practical experience with word embedding techniques.
- Visualized semantic relationships using dimensionality reduction techniques (t-SNE).
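As a rough sketch of the idea behind CBOW, the model predicts a target word from the words around it, and the learned vectors are usually compared with cosine similarity. The helper names and the `window` parameter below are hypothetical and chosen for illustration; they are not taken from this project, Gensim, or PyTorch.

```python
import math


def cbow_pairs(tokens, window=2):
    """Build (context, target) training pairs for CBOW.

    For each position, the context is up to `window` tokens on each side.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip single-token inputs with no context
            pairs.append((context, target))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


print(cbow_pairs(["the", "cat", "sat"], window=1))
# [(['cat'], 'the'), (['the', 'sat'], 'cat'), (['cat'], 'sat')]
```

A CBOW network would average the context embeddings and train them to predict the target; words that appear in similar contexts then end up with a high cosine similarity.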
Description: Developed a sequence-to-sequence machine translation model using the Transformer architecture.
Technologies Used:
- Python
- PyTorch
- Hugging Face Transformers
Key Outcomes/Achievements:
- Demonstrated understanding of attention mechanisms and sequence-to-sequence modeling.
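The attention mechanism at the core of the Transformer can be illustrated with scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The sketch below uses plain Python lists for readability; the function names are hypothetical, and a real model would use PyTorch tensors with multiple heads and masking.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention without masking.

    queries: list of vectors of size d_k
    keys:    list of vectors of size d_k
    values:  list of vectors (one per key)
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# A zero query scores every key equally, so the output is the mean of the values.
print(scaled_dot_product_attention([[0, 0]], [[1, 0], [0, 1]], [[2, 0], [0, 4]]))
# [[1.0, 2.0]]
```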
Description: Developed and compared multiple approaches (Bag of Words, Multinomial Naive Bayes, BERT, and DistilBERT) to classify text as human-written or machine-generated.
Technologies Used:
- Python
- scikit-learn
- Hugging Face Transformers
- NLTK
Key Outcomes/Achievements:
- Compared the performance of traditional machine learning models with pre-trained language models, demonstrating the performance gains of transformers.
- Achieved 97% accuracy with BERT/DistilBERT.
- Analyzed the strengths and weaknesses of each model.
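For a sense of what the traditional baseline involves, here is a minimal Multinomial Naive Bayes classifier over token counts with Laplace smoothing. It is a sketch under assumptions: the class name, the `alpha` parameter, and the toy training data are all hypothetical, and the project itself lists scikit-learn, whose `MultinomialNB` would normally be used instead.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Token-count Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, documents, labels):
        """documents: list of token lists; labels: one label per document."""
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        """Return the label with the highest log posterior."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_tokens = sum(self.token_counts[label].values())
            for tok in tokens:
                if tok not in self.vocabulary:
                    continue  # ignore tokens never seen in training
                count = self.token_counts[label][tok]
                score += math.log(
                    (count + self.alpha)
                    / (total_tokens + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data for illustration only.
model = MultinomialNaiveBayes().fit(
    [["lol", "idk", "honestly"], ["furthermore", "moreover", "notably"]],
    ["human", "machine"],
)
print(model.predict(["idk", "lol"]))  # human
print(model.predict(["moreover"]))    # machine
```

A model like this sees only word frequencies, whereas BERT and DistilBERT also use word order and context.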