Description
The Reddit API doesn't always tell us the correct language, as I figured out in #267. The language feature was partially removed (it can no longer be set in the UI, only through the API).
While the legacy language setting was removed from the UI, another language setting was added, which isn't exposed through the API.
Apart from that, the language identifiers are not ISO 639-1 as expected. They are pretty much a mess. Mexican Spanish is "es-mx", for example (which isn't a huge issue).
As a result, we can't identify the correct language of most non-English subreddits.
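To illustrate the identifier mismatch only (it does not solve the missing-language problem), here is a minimal sketch of reducing a tag like "es-mx" to a two-letter code. The function name `normalize_language_tag` is hypothetical, and the assumption is that the API returns tags shaped like "es-mx" or "en"; anything else is treated as unknown.

```python
from typing import Optional


def normalize_language_tag(tag: str) -> Optional[str]:
    """Reduce a tag such as "es-mx" to its primary subtag ("es").

    Hypothetical helper: returns None when the result does not look like
    a two-letter ISO 639-1 code, so the caller can fall back to a default.
    """
    if not tag:
        return None
    # Accept both "es-mx" and "es_MX" style separators.
    primary = tag.strip().lower().replace("_", "-").split("-", 1)[0]
    # ISO 639-1 codes are exactly two ASCII letters. This checks the shape
    # only, not whether the code is actually assigned.
    if len(primary) == 2 and primary.isascii() and primary.isalpha():
        return primary
    return None
```

Returning `None` instead of guessing keeps the decision about a fallback language with the caller.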
The current config has some architectural limitations that make it impossible to add default subreddit languages without those subreddits also being parsed (also, too many subreddits here would make the Reddit API URL too long at some point).
SUBREDDITS=mathmemes:en+mathmemescirclejerk:en+unexpectedfactorial:en+factorialchain:en+doublefactorialchain:en+theydidthemath:en:shorten+theydidthemonstermath:en+uselessfactorial:en+redundantfactorial:en+anticipatedfactorial:en+expectedfactorial:en+unexpectedTermial:en:termial+Notorite:tr
I see only one viable option here right now:
- a dedicated `subreddit_config`, similar to the `channel_config` for Discord
  - `SUBREDDITS` in the `.env` should be added to the API URL and parsed
  - subreddits in `subreddit_config` only store configurations
  - this would make it possible to configure default languages for subreddits by hand
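The split proposed above could look roughly like the following sketch. Everything here is an assumption for illustration: the `SubredditConfig` shape, the function names, the `/r/a+b` path form, the rule that an entry from `SUBREDDITS` takes precedence over one in `subreddit_config`, and the treatment of the third field (`shorten`, `termial`) as an opaque option.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubredditConfig:
    """Per-subreddit settings (hypothetical shape, not the project's actual type)."""
    language: str = "en"
    options: List[str] = field(default_factory=list)


def parse_subreddits(value: str) -> Dict[str, SubredditConfig]:
    """Parse a SUBREDDITS value such as 'mathmemes:en+theydidthemath:en:shorten'."""
    parsed: Dict[str, SubredditConfig] = {}
    for entry in value.split("+"):
        if not entry:
            continue
        name, *rest = entry.split(":")
        language = rest[0] if rest else "en"
        # Anything after the language is kept as an opaque option string.
        parsed[name] = SubredditConfig(language=language, options=rest[1:])
    return parsed


def listing_path(polled: Dict[str, SubredditConfig]) -> str:
    """Only subreddits from SUBREDDITS end up in the request path."""
    return "/r/" + "+".join(polled)


def resolve_config(
    name: str,
    polled: Dict[str, SubredditConfig],
    stored: Dict[str, SubredditConfig],
) -> SubredditConfig:
    """Look up a subreddit case-insensitively: SUBREDDITS first, then subreddit_config."""
    key = name.lower()
    for source in (polled, stored):
        for known, config in source.items():
            if known.lower() == key:
                return config
    # Unknown subreddit: fall back to the default configuration.
    return SubredditConfig()


polled = parse_subreddits("mathmemes:en+theydidthemath:en:shorten+Notorite:tr")
# Hypothetical hand-written entry; it is never added to the request path.
stored = {"some_german_subreddit": SubredditConfig(language="de")}
german_default = resolve_config("some_german_subreddit", polled, stored).language
```

With this split, the request path grows only with `SUBREDDITS`, while `subreddit_config` can hold any number of hand-set default languages.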
Another option would be to use an LLM to figure out the language of a subreddit and store it as the default language. This is quite overkill, though (albeit it might be a fun idea for a dedicated API).