This is a common library for determining the sentiment of text. It was originally trained on the IMDB dataset as practice in building various models, but I wanted to expand it. Right now it primarily uses a Multinomial Naive Bayes model on top of Term Frequency-Inverse Document Frequency (TF-IDF) weights.
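To make the idea of Multinomial Naive Bayes over TF-IDF weights concrete, here is a minimal from-scratch sketch using only the Python standard library. It is not this library's code: the function names, the whitespace tokenizer, the smoothed IDF formula, and the four toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive lowercase/whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    # Term count times IDF; terms unseen during fitting are ignored.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train_nb(docs, labels, idf, alpha=1.0):
    # Sum TF-IDF weights per class, then apply additive (Laplace) smoothing.
    weights = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weights[label][term] += w
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(weights[label].values()) + alpha * len(idf)
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((weights[label].get(t, 0.0) + alpha) / total) for t in idf
        }
        model[label] = (log_prior, log_like)
    return model


def predict(doc, idf, model):
    # Pick the class with the highest log prior plus weighted log likelihood.
    vec = tfidf(doc, idf)
    scores = {
        label: log_prior + sum(w * log_like[t] for t, w in vec.items())
        for label, (log_prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)


# Toy data, made up for this example (not taken from the IMDB dataset).
docs = [
    "a great and moving film",
    "wonderful acting great story",
    "a boring and terrible film",
    "terrible plot awful acting",
]
labels = ["positive", "positive", "negative", "negative"]

idf = fit_idf(docs)
model = train_nb(docs, labels, idf)
print(predict("great story", idf, model))
print(predict("terrible boring plot", idf, model))
```

On this toy data the first call prints `positive` and the second prints `negative`. A real model would need proper tokenization and far more training text.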
In the future, model selection could be more dynamic. The Jupyter Notebook that trains the models creates four of them: two Linear Regression models and two Multinomial Naive Bayes models. Within each pair, one model uses Term Frequency-Inverse Document Frequency (TF-IDF) weights while the other uses Bag of Words vectors.
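One possible shape for dynamic selection is sketched below: the four models are treated as combinations of a classifier and a feature type, and a small helper validates the requested pair. The identifiers (`CLASSIFIERS`, `FEATURES`, `available_models`, `select_model`) and the string names are hypothetical and do not come from the notebook.

```python
from itertools import product

# Hypothetical identifiers; the names the notebook actually uses may differ.
CLASSIFIERS = ("linear_regression", "multinomial_nb")
FEATURES = ("tfidf", "bag_of_words")


def available_models():
    # Every (classifier, features) pair: 2 x 2 = 4 models.
    return list(product(CLASSIFIERS, FEATURES))


def select_model(classifier="multinomial_nb", features="tfidf"):
    # Defaults mirror the combination the library currently relies on.
    choice = (classifier, features)
    if choice not in available_models():
        raise ValueError(
            f"unknown model {choice!r}; expected one of {available_models()}"
        )
    return choice


print(available_models())
print(select_model())
print(select_model("linear_regression", "bag_of_words"))
```

The returned pair could then be mapped to whichever saved model the notebook produced for that combination. How the models are stored and loaded is left open here.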
I might build an API that interacts with this common library. I will also eventually add a Dockerfile to automate all of this setup.
- Run `uv sync` in the main directory.
- Run the Jupyter Notebook fully. That will create all the models that you need.
- Run `uv run main.py` to run the code.
- Test!