Grapheme-to-phoneme (G2P) conversion predicts a word's phonetic pronunciation from its written form, a core component of text-to-speech and speech recognition systems. This project implements an LSTM encoder–decoder model, trained on CMUdict, that converts grapheme sequences to phoneme sequences.
The bidirectional-LSTM encoder–decoder model uses Luong attention, scheduled sampling during training, and greedy decoding at inference. It achieves 77% sequence-level accuracy (23% WER) on the CMUdict test set. Beam search did not improve on greedy decoding when evaluated on the test set.
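As a rough illustration of the attention mechanism named above, here is a minimal NumPy sketch of Luong-style dot-product attention for a single decoder step. This is an assumption about the scoring variant (Luong "dot" score); the project's actual implementation, shapes, and function names may differ.

```python
import numpy as np

def luong_dot_attention(decoder_state, encoder_states):
    """Single-step Luong dot attention (illustrative, not the repo's code).

    decoder_state:  (hidden,)          current decoder hidden state
    encoder_states: (src_len, hidden)  encoder outputs, one per source grapheme
    """
    # Luong "dot" alignment score: score(h_dec, h_enc) = h_dec . h_enc
    scores = encoder_states @ decoder_state          # (src_len,)
    # Softmax over source positions (numerically stabilized)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder states
    context = weights @ encoder_states               # (hidden,)
    return context, weights

# Toy check: the encoder state most aligned with the decoder state
# should receive the most attention mass.
enc = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [5.0, 0.0]])
dec = np.array([5.0, 0.0])
ctx, w = luong_dot_attention(dec, enc)
print(np.argmax(w))  # index of the most-attended source position
```

At inference, greedy decoding simply feeds the argmax phoneme of each step back in as the next input; scheduled sampling does the same during training with some probability, instead of always teacher-forcing the gold phoneme.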
**Loss vs Sequence Accuracy:** Training loss plotted against sequence-level accuracy across training epochs.

**Loss vs Teacher Forcing:** Training-loss evolution under varying teacher-forcing probabilities during scheduled sampling.

