Project Vision
Our project will develop a system that finds candidate answers to a question from a large volume of data. Given a question such as “What does insulin regulate” or “What moves neurons”, the system should read PubMed data, process it, and return either the answer the user wants or a list of candidate answers.
This task consists of several separate stages. First, we retrieve information from our corpus; in this project, we read text from PubMed. In this phase we preprocess the corpus, converting raw natural-language text into machine-readable information. Second, we parse the text to extract useful information such as named entities and verbs, and to link pronouns to the named entities they refer to. Third, we use the information extracted from PubMed to rank the candidate answers for each question, selecting the top five or top ten candidates as a raw result. Finally, we try to improve the system's performance, carry out error testing, and finish the paper.
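As a rough illustration of the ranking stage described above, here is a minimal sketch in Python. It is not our actual method: the scoring rule (counting distinct question terms that appear in a sentence), the stopword list, and the names `tokenize` and `rank_candidates` are all assumptions made for the example, and the three sentences are made-up samples rather than real PubMed records.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"what", "does", "do", "the", "a", "an", "of", "is", "are", "in", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_candidates(question, sentences, top_k=5):
    """Rank sentences by how many distinct question terms they contain.

    This term-overlap score is an assumed stand-in for the real ranking
    model, which would also use named entities, verbs and coreference links.
    """
    query_terms = set(tokenize(question)) - STOPWORDS
    scored = []
    for sentence in sentences:
        overlap = query_terms & set(tokenize(sentence))
        if overlap:  # drop sentences that share no term with the question
            scored.append((len(overlap), sentence))
    # Highest score first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


# Made-up sample sentences standing in for preprocessed PubMed text.
corpus = [
    "Insulin regulates glucose uptake in muscle cells.",
    "Kinesin moves cargo along microtubules in neurons.",
    "Glucagon raises blood glucose levels.",
]

for answer in rank_candidates("What moves neurons", corpus, top_k=5):
    print(answer)
```

Note that plain token matching treats "regulate" and "regulates" as different words, which is one reason the parsing stage (for example, lemmatization and entity linking) matters before ranking.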
(In our project, we will adopt a command-line interface at first. The user can either type a question on the command line or write it to a text file and pass the file's location to the system; both ways will work. We will print the answers directly on the command line.)
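The two input modes described above could be handled along the following lines. This is only a sketch using Python's standard `argparse` module; the flag names `--question` and `--file`, the one-question-per-line file layout, and the helper names are assumptions for illustration, and the answer lookup itself is left as a placeholder.

```python
import argparse


def build_parser():
    """Build a parser that accepts either a question or a file path."""
    parser = argparse.ArgumentParser(description="Ask a question against the corpus.")
    # Exactly one of the two input modes must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-q", "--question", help="question text typed on the command line")
    group.add_argument("-f", "--file", help="path to a text file with one question per line")
    return parser


def load_questions(args):
    """Return the list of questions from whichever input mode was used."""
    if args.question is not None:
        return [args.question.strip()]
    with open(args.file, encoding="utf-8") as handle:
        # Skip blank lines so an empty trailing line is not treated as a question.
        return [line.strip() for line in handle if line.strip()]


def main(argv=None):
    """Parse arguments and print each question; argv=None reads sys.argv."""
    args = build_parser().parse_args(argv)
    for question in load_questions(args):
        # Placeholder: the real system would print the ranked answers here.
        print("Question:", question)


# Demonstration with an explicit argument list instead of sys.argv.
main(["--question", "What does insulin regulate"])
```

When run as a script, calling `main()` with no arguments makes `argparse` read the real command line, so the same code would serve both the typed-question and the file-based mode.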
We plan to develop a web application as a more user-friendly front end that accepts questions and shows the results, while the dataset is processed in the back end. If needed, we can also write the answers to a file.