In this part, you will implement the Word2Vec model and train your own word vectors with stochastic gradient descent (SGD). NumPy operations can be used to make your code both shorter and faster. Your implementation must satisfy the following requirements:
- Use the negative sampling loss
- Implement the skip-gram model from scratch
- Train on real data
- Show the resulting embeddings
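The skip-gram update with negative sampling can be sketched in a few lines of NumPy. This is only an illustrative one-pair SGD step, not the required implementation; the names (`sgd_step`, `W_in`, `W_out`) and the toy vocabulary size are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
V, d = 10, 8                               # toy vocabulary size, embedding dim
W_in = 0.1 * rng.standard_normal((V, d))   # "input" (center-word) vectors
W_out = 0.1 * rng.standard_normal((V, d))  # "output" (context-word) vectors

def sgd_step(center, context, negatives, lr=0.1):
    """One negative-sampling SGD update; returns the loss before the update."""
    v = W_in[center]
    u_pos = W_out[context]
    s_pos = sigmoid(u_pos @ v)          # sigma(u_o . v_c) for the true pair
    U_neg = W_out[negatives]
    s_neg = sigmoid(-U_neg @ v)         # sigma(-u_k . v_c) for sampled negatives
    loss = -np.log(s_pos) - np.log(s_neg).sum()
    # gradient of the loss w.r.t. the center vector
    grad_v = (s_pos - 1.0) * u_pos + ((1.0 - s_neg)[:, None] * U_neg).sum(0)
    W_out[context] -= lr * (s_pos - 1.0) * v
    W_out[negatives] -= lr * (1.0 - s_neg)[:, None] * v
    W_in[center] -= lr * grad_v
    return loss

# repeated updates on the same (center, context) pair should reduce the loss
losses = [sgd_step(center=1, context=2, negatives=np.array([3, 4, 5]))
          for _ in range(50)]
```

In a full implementation, the pairs come from sliding a context window over the corpus and the negatives are drawn from the (smoothed) unigram distribution.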
a) Implement the model below
b) Use a small set of real machine translation data
c) Test with some (very similar) sentences
a) Implement a small transformer with a one-layer encoder and a one-layer decoder, with self-attention as described in the "Attention is All You Need" paper
b) A sub-layer can be constructed from code available in any package
c) Show and explain the results (inputs and outputs) of each sub-layer during training and testing
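The core self-attention sub-layer can be sketched in NumPy as scaled dot-product attention for a single head; the shapes and names here (`self_attention`, `d_model`, `d_k`) are illustrative assumptions, not part of the assignment:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, single head."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # (seq, seq) attention weights
    return weights @ V, weights

rng = np.random.default_rng(0)
seq, d_model, d_k = 4, 8, 8
X = rng.standard_normal((seq, d_model))         # toy token representations
out, weights = self_attention(X,
                              rng.standard_normal((d_model, d_k)),
                              rng.standard_normal((d_model, d_k)),
                              rng.standard_normal((d_model, d_k)))
```

Each row of `weights` sums to 1, which is the kind of per-sub-layer input/output you are asked to show and explain; the full model adds multi-head projection, residual connections, layer normalization, and the feed-forward sub-layer.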
a) Implement according to https://aclanthology.org/N16-1030/, using a dataset from https://nlpforthai.com/tasks/ner/ (a very small subset is sufficient for demonstration)
b) You may assemble the model from code available in any package
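The inference step of the CRF layer in the referenced paper (Lample et al., 2016) is Viterbi decoding over emission and transition scores. A hedged NumPy sketch with toy scores follows; in the assignment the emissions come from the BiLSTM, and the tag set and numbers here are purely illustrative:

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, K) per-token tag scores; transitions: (K, K) tag-to-tag scores.
    Returns the highest-scoring tag index sequence."""
    T, K = emissions.shape
    score = emissions[0].copy()            # best score ending in each tag
    back = np.zeros((T, K), dtype=int)     # backpointers
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):          # follow backpointers from the end
        best.append(int(back[t, best[-1]]))
    return best[::-1]

# toy example: 3 tags over 3 tokens
tags = ["O", "B-PER", "I-PER"]
emissions = np.array([[0.1, 2.0, 0.0],
                      [0.2, 0.1, 1.5],
                      [2.0, 0.1, 0.0]])
transitions = np.array([[0.5, 0.0, -5.0],   # O -> I-PER heavily penalized
                        [0.0, -1.0, 1.0],
                        [0.0, 0.0, 0.5]])
path = [tags[i] for i in viterbi(emissions, transitions)]
```

The transition penalty is what lets the CRF rule out invalid tag sequences (such as I-PER directly after O) even when the per-token scores favor them.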
a) You can use GPT code from any package
b) Train on MIDI files downloaded from the web (using only the main, catchy part of each song is acceptable)
c) Generate new tunes and play them
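Before a GPT can train on music, note events must become a token sequence. Below is a hedged sketch of one possible encoding; real MIDI parsing would use a package (e.g. mido or pretty_midi), and the `NOTE_*`/`DUR_*` vocabulary scheme, pitch range, and duration grid are all illustrative assumptions:

```python
DURATIONS = [1, 2, 4, 8]          # in 16th notes; an assumed quantization grid
PITCH_RANGE = range(48, 84)       # MIDI pitches C3..B5, an assumed melody range

def build_vocab():
    """Map special tokens, note tokens, and duration tokens to integer ids."""
    tokens = ["<pad>", "<start>", "<end>"]
    tokens += [f"NOTE_{p}" for p in PITCH_RANGE]
    tokens += [f"DUR_{d}" for d in DURATIONS]
    return {t: i for i, t in enumerate(tokens)}

def encode(events, vocab):
    """events: list of (pitch, duration) pairs -> list of token ids."""
    ids = [vocab["<start>"]]
    for pitch, dur in events:
        ids.append(vocab[f"NOTE_{pitch}"])
        ids.append(vocab[f"DUR_{dur}"])
    ids.append(vocab["<end>"])
    return ids

vocab = build_vocab()
# a hypothetical three-note hook: C4, E4, G4 as quarter notes
ids = encode([(60, 4), (64, 4), (67, 4)], vocab)
```

The GPT then trains on next-token prediction over such sequences, and generation is the inverse mapping from sampled token ids back to note events for MIDI playback.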
- 6420422002 Tanakorn Withurat - ธนากร วิธุรัติ
- 6420422008 Wannapa Sripen - วรรณนภา ศรีเพ็ญ
- 6420422011 Juntanee Pattanasukkul - จันทร์ทนีย์ พัฒนสุขกุล
- 6420422017 Witsarut Wongsim - วิศรุต วงศ์ซิ้ม
- 6420422021 Suchawalee Jeeratanyasakul - สุชาวลี จีระธัญญาสกุล