Project Goals

Short-term goals

#	Uzbek	English	Tag
1	От	Noun	NOUN
2	Феъл	Verb	VERB
3	Сифат	Adjective	ADJ
4	Равиш	Adverb	ADV
5	Олмош	Pronoun	PRON
6	Сон	Numeral	NUM
8	Боғловчи	Conjunction	CONJ
9	Юклама	Particle	PART
10	Ундов сўз	Interjection	INTJ
11	Тақлид сўз	Onomatopoeic words	X
12	Модал сўз	Modal words	AUX
13	Кўмакчи	Postposition	ADP

This tag set may not be fully compatible with the Universal Dependencies (UD) POS tag set.
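One way to check the compatibility concern is to compare the tags in the table above against the 17 universal POS tags of UD v2. The sketch below is only an illustration: the names PROJECT_TAGS, UD_V2_UPOS and incompatible_tags are made up for this example and are not part of the project.

```python
# Tags copied from the table above (English name -> project tag).
PROJECT_TAGS = {
    "Noun": "NOUN",
    "Verb": "VERB",
    "Adjective": "ADJ",
    "Adverb": "ADV",
    "Pronoun": "PRON",
    "Numeral": "NUM",
    "Conjunction": "CONJ",
    "Particle": "PART",
    "Interjection": "INTJ",
    "Onomatopoeic words": "X",
    "Modal words": "AUX",
    "Postposition": "ADP",
}

# The 17 universal POS tags defined by UD v2.
UD_V2_UPOS = frozenset({
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
})


def incompatible_tags(tags=PROJECT_TAGS, ud=UD_V2_UPOS):
    """Return the project tags that are not valid UD v2 UPOS values."""
    return sorted(set(tags.values()) - ud)


print(incompatible_tags())  # ['CONJ']
```

Under UD v2, conjunctions are split into CCONJ (coordinating) and SCONJ (subordinating), so CONJ is the only tag in the table that would need remapping. Whether X and AUX are the best fit for onomatopoeic and modal words is a separate annotation decision that this check does not cover.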

Perform statistical analysis of word usage and generate a word usage table from PDF books and news feeds.
Implement stemming rules, backed by a database table for fast reference, and a table of entity names.
Implement a basic natural language analysis tool that provides functionality such as tokenization, parsing, stemming, POS tagging, and named entity recognition.
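The word usage statistics described above could start from something as simple as the following sketch. It assumes the text has already been extracted from the PDF books and news feeds into plain strings. The function names, the tokenization regex and the sample sentences are illustrative assumptions, not project code or project data.

```python
import re
from collections import Counter

# One or more letters (Latin or Cyrillic), optionally joined by an
# apostrophe-like character or a hyphen. This pattern is an assumption.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['ʻ’-][^\W\d_]+)*")


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation and digits."""
    return [m.group(0).lower() for m in TOKEN_RE.finditer(text)]


def word_frequencies(texts):
    """Count how often each word form occurs across an iterable of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Made-up sample input standing in for extracted book or news text.
sample = ["Китоб яхши. Китоб катта.", "Бу китоб янги."]
for word, n in word_frequencies(sample).most_common(3):
    print(word, n)
```

The resulting Counter maps each word form to its count and could be written to a database table as is. Counting stems instead of surface forms would require running the stemming rules first.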

Expected deliverable: the tool tahlih takes input text in Uzbek and produces parsed (tokenized, stemmed, POS-tagged) output.
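To make the shape of that output concrete, here is a minimal sketch of a tokenize, stem and tag pipeline. It is not the tahlih implementation. The suffix list, the lexicon, the Token structure and the function names are all hypothetical, and the naive suffix stripping will fail on many real words.

```python
import re
from dataclasses import dataclass

# Words, numbers, or single punctuation marks.
TOKEN_RE = re.compile(r"[^\W\d_]+|\d+|[^\w\s]")

# Hypothetical, very incomplete suffix list: plural and a few case endings.
SUFFIXES = ("нинг", "лар", "дан", "ни", "га", "да")

# Hypothetical stem -> tag lexicon; a real one would live in the database.
LEXICON = {"китоб": "NOUN", "янги": "ADJ"}


@dataclass
class Token:
    form: str  # surface form as it appears in the text
    stem: str  # form after suffix stripping
    tag: str   # POS tag, or "_" when the stem is not in the lexicon


def stem(word, min_stem=3):
    """Strip known suffixes repeatedly, keeping at least min_stem characters."""
    word = word.lower()
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
                word = word[: -len(suffix)]
                stripped = True
                break
    return word


def analyze(text):
    """Return one Token per word, number, or punctuation mark in text."""
    tokens = []
    for form in TOKEN_RE.findall(text):
        s = stem(form)
        tokens.append(Token(form, s, LEXICON.get(s, "_")))
    return tokens


for token in analyze("Янги китобларни"):
    print(token.form, token.stem, token.tag)
```

In this sketch "китобларни" loses the accusative and plural endings and is looked up as "китоб". Unknown stems get "_" rather than a guessed tag, so gaps in the lexicon stay visible.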

Manually (or semi-automatically) build an Uzbek treebank in CoNLL-U format that can be contributed to the Universal Dependencies (UD) framework.
Feed the Uzbek treebank to SyntaxNet to train models, analyze the results, and improve them.
Implement initial applications on top of the Uzbek NLU project: a Telegram Q&A bot, a Twitter bot, and a news summarizer.
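For the treebank goal, CoNLL-U stores one token per line in ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), with comment lines starting with "#" and a blank line after each sentence. The sketch below writes one sentence in that layout. The to_conllu helper is hypothetical, and the sample annotation is illustrative and has not been reviewed by an annotator.

```python
def to_conllu(sent_id, text, tokens):
    """Serialize one sentence to CoNLL-U.

    tokens is a list of (form, lemma, upos, head, deprel) tuples; the
    columns XPOS, FEATS, DEPS and MISC are left unspecified as "_".
    """
    lines = [f"# sent_id = {sent_id}", f"# text = {text}"]
    for i, (form, lemma, upos, head, deprel) in enumerate(tokens, start=1):
        cols = [str(i), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"]
        lines.append("\t".join(cols))
    # A blank line terminates every sentence.
    return "\n".join(lines) + "\n\n"


# Illustrative annotation of a made-up sentence ("The book is new.").
sentence = [
    ("Китоб", "китоб", "NOUN", 2, "nsubj"),
    ("янги", "янги", "ADJ", 0, "root"),
    (".", ".", "PUNCT", 2, "punct"),
]
print(to_conllu("example-1", "Китоб янги.", sentence), end="")
```

Output in this layout can be checked with the validation tooling that the UD project provides before a treebank is submitted.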