Description
Languages without delimiters: Japanese and Chinese (Simplified and Traditional), and possibly other East Asian languages, do not put a delimiter between number words, e.g. 九千九百九十九 (9999 in Japanese). Their structure is very similar to English, but the lack of a delimiter makes parsing harder.
German and Dutch also omit the delimiter, up to a certain number.
One approach I have in mind for the delimiter problem is to read the input character by character and, as soon as the characters read so far match one of the known number words, insert a space. After this pre-processing step we can follow the same logic as for delimited languages. This increases the complexity to O(string_length ^ 2), which I believe shouldn't be a major issue. (We can apply this function only to the languages without delimiters.)
Concrete example
five thousand nine hundred and thirteen - English (5913)
fünftausendneunhundertdreizehn - German (5913)
nine hundred and thirteen - English (913)
negenhonderddertien - Dutch (913)
To handle this, we first check f, fü, fün and finally hit fünf = 5 (similarly, we get negen = 9 for Dutch), insert a space, and then start again from the next character.
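
A minimal sketch of that pre-processing step, to make the idea concrete. The function name `insert_delimiters` and the tiny `GERMAN` and `DUTCH` vocabularies are hypothetical and only cover the two examples above; a real implementation would use the library's full word lists.

```python
def insert_delimiters(text, vocabulary):
    """Split `text` into known number words using first-match prefix scanning."""
    words = []
    start = 0
    while start < len(text):
        # Grow the candidate one character at a time: f, fü, fün, fünf ...
        for end in range(start + 1, len(text) + 1):
            candidate = text[start:end]
            if candidate in vocabulary:
                words.append(candidate)
                start = end  # start again from the next character
                break
        else:
            # No prefix starting at `start` is a known number word.
            raise ValueError(f"no number word found at position {start}: {text[start:]!r}")
    return " ".join(words)


# Hypothetical vocabularies, just large enough for the examples in this issue.
GERMAN = {"fünf", "tausend", "neun", "hundert", "dreizehn"}
DUTCH = {"negen", "honderd", "dertien"}

print(insert_delimiters("fünftausendneunhundertdreizehn", GERMAN))
# fünf tausend neun hundert dreizehn
print(insert_delimiters("negenhonderddertien", DUTCH))
# negen honderd dertien
```

One caveat with stopping at the first match: if the vocabulary also contains drei and zehn, dreizehn is split into "drei zehn" instead of being kept as one word, so trying the longest match first (or backtracking) may be needed for the full word lists.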