mgladkova edited this page Nov 10, 2015 · 3 revisions

MathWebSearch is a complete system for crawling, indexing and searching mathematical data. Its components are implemented in POSIX-compliant C/C++ with a few third-party libraries.

The main structure of the system is presented below:

[Figure: Structure of the new system (MWS-0.4)]

The crawler system (crawler) indexes MathML-rich websites and produces MWS Harvests, based on the Content-enabled m:math nodes it finds. The MWS Harvests are fed into the core, which parses them and updates two indexes:

  • a fast substitution-based tree for the mathematical structure

  • a BTree database for the additional information (like URIs+XPaths).
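
The split between the two indexes can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the MWS implementation: the names `ToyIndex`, `Location` and `FormulaId` are hypothetical, a plain `std::map` keyed on a serialized term stands in for the substitution-based tree (so it supports exact matches only, with no variable substitution), and a second `std::map` stands in for the BTree database.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; none of these names come from MWS.
struct Location {
    std::string uri;    // document the formula was harvested from
    std::string xpath;  // position of the m:math node inside that document
};

using FormulaId = std::uint32_t;

class ToyIndex {
public:
    // Record one harvested formula: the structural index receives the term,
    // the metadata store receives the URI+XPath under the same id.
    FormulaId add(const std::string& term, const Location& where) {
        assert(!term.empty());  // an empty term cannot be indexed
        FormulaId id = next_id_++;
        structure_[term].push_back(id);
        metadata_[id] = where;
        return id;
    }

    // Exact-match lookup: resolve the term to ids in the structural index,
    // then resolve each id to its location in the metadata store.
    std::vector<Location> find(const std::string& term) const {
        std::vector<Location> hits;
        auto it = structure_.find(term);
        if (it == structure_.end()) return hits;
        for (FormulaId id : it->second) hits.push_back(metadata_.at(id));
        return hits;
    }

private:
    FormulaId next_id_ = 0;
    // Stand-in for the substitution-based tree (assumption: exact match only).
    std::map<std::string, std::vector<FormulaId>> structure_;
    // Stand-in for the BTree database holding the additional information.
    std::map<FormulaId, Location> metadata_;
};
```

A query such as `index.find("plus(x,y)")` first consults the structural index and only then touches the metadata store, which mirrors the two-step lookup suggested by the description above. A real substitution tree would additionally match terms containing query variables.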
