AISysRev

Project Status: Minimum viable product with core functionality working, but many features are missing and bugs remain. You can also check out the command-line alternative, AISysRevCmdLine.

This web application offers AI-based support for systematic literature reviews. Currently, only one step is supported: title–abstract screening. Although the application runs in a web browser, all data is stored locally on your machine. LLMs are accessed through OpenRouter, and data for screening can be imported from Scopus. The application allows you to:

  • Import a CSV file with paper titles and abstracts. You can also use our Demo CSV file
  • Specify include/exclude criteria for paper screening
  • Evaluate papers against the criteria using multiple LLMs
  • Receive LLM evaluations as binary decisions (include/exclude), ordinal ratings (1–7), or inclusion probabilities (0–1)
  • Perform manual evaluation of titles and abstracts alongside LLM evaluations
  • Export evaluation results to CSV for further analysis in Microsoft Excel, Google Sheets, R, Python, etc. (see the sketch after this list)
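
As an example of the last step, the exported CSV can be loaded into Python for further analysis. The sketch below is illustrative only: the file name (results.csv) and the column names (Title, inclusion_probability) are assumptions, not the tool's documented export schema.

# Minimal sketch: rank exported papers by LLM inclusion probability with pandas.
# "results.csv", "Title" and "inclusion_probability" are assumed names for
# illustration; check the headers of your actual export.
import pandas as pd

results = pd.read_csv("results.csv")
ranked = results.sort_values("inclusion_probability", ascending=False)
print(ranked[["Title", "inclusion_probability"]].head(10))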

The application is based on our research papers on this topic. Please consider citing [1, 2] if you use the application.


Screenshots:

  • Main view showing LLM screening tasks.
  • Manual evaluation view, with LLM evaluations (binary, ordinal, probability) alongside manual review.
  • Manual evaluation list view, with papers sorted by inclusion probability according to all executed LLMs.

Getting started

Data

The tool has been tested with CSV data exported from Scopus. Web of Science exports can be supported by editing the column headers to match those used by Scopus. The minimum required fields are: Document title, DOI, Abstract, Authors, and Source title.
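
As a rough sketch of such an edit, the snippet below renames Web of Science-style headers to the Scopus-style names listed above. The Web of Science column names on the left (Article Title, Author Full Names, Source Title) are assumptions for illustration; verify them against your actual export.

# Minimal sketch: map Web of Science-style headers to the Scopus-style headers
# AISysRev expects (Document title, DOI, Abstract, Authors, Source title).
# The left-hand names are assumed WoS headers; adjust to match your export.
import pandas as pd

df = pd.read_csv("wos_export.csv")
df = df.rename(columns={
    "Article Title": "Document title",
    "Author Full Names": "Authors",
    "Source Title": "Source title",
    # "DOI" and "Abstract" often match already; rename them here if they differ.
})
df.to_csv("scopus_style.csv", index=False)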


LLM access and screening speed

The application is integrated with OpenRouter, which supports multiple LLMs, ranging from very affordable models to top-tier ones such as OpenAI's ChatGPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and Mistral. To use the models, you need to provide an OpenRouter key. You can set spending limits for each key directly on the OpenRouter website. New users also receive $5 in free credits when creating an account.
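
For reference, OpenRouter exposes an OpenAI-compatible chat completions endpoint. The snippet below is only an illustrative sketch of a direct call with such a key; it is not AISysRev's internal code, and the model name and prompt are placeholders.

# Illustrative sketch of a direct OpenRouter call (not AISysRev's internal code).
# The model name and prompt are placeholders; the key is created at openrouter.ai.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Include or exclude this abstract? ..."}],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])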

Note: Paper screening currently takes about 4.5 s per paper, so screening 1,000 papers takes roughly 75 minutes. We are working on parallelizing this, after which it should drop to about 0.2 s per paper.

System and software requirements

See https://docs.docker.com/desktop/ for Docker installation instructions. Docker Desktop includes Docker Compose, Docker Buildx, Docker Engine and the Docker CLI.

If your Docker installation does not include the Buildx plugin, see https://github.com/docker/buildx.

  1. Run docker info to verify that Docker is installed.
    • Docker 26.0.0 has been tested as working. On macOS with Colima, Docker 28.5.1 is confirmed to work.
  2. Run docker buildx version to verify that Docker Buildx is installed. On macOS, Buildx plugin version 0.29.1 is confirmed to work.
  3. Run docker compose version to verify that Docker Compose is installed. On macOS, Compose plugin version 2.40.3 is confirmed to work.
    • Version 2.33.1 has been tested as working; newer versions should also work.
    • Note: Older versions of Compose use docker-compose as the compose command. We don't provide support for legacy Compose versions.

Running the application

First, clone the repository to your local computer.

git clone https://github.com/EvoTestOps/AISysRev.git

Then move to the project directory:

cd AISysRev

macOS, Linux and Windows (WSL)

Start the application in production mode:

make start-prod

If you want to develop the app, run:

make start-dev

Startup may take a while, as the required Docker images and services are downloaded, application dependencies are installed, and the application is built.

After startup, open the application:

If you ran make start-prod, navigate to https://localhost. The Caddy server's root CA is untrusted by default, so you can bypass the browser warning.

If you used make start-dev, navigate to http://localhost:3001

Windows (non-WSL)

If you do not have Windows Subsystem for Linux (WSL), start the application with

./start-prod.bat

Technology

Front-end

TypeScript, React, Tailwind CSS, Vite, Wouter, Zod, Redux

Back-end

Python, FastAPI, PostgreSQL, SQLAlchemy, Alembic

Development requirements

Running in development mode

macOS, Linux and Windows (WSL)

make start-dev

Windows (non-WSL)

./start-dev.bat

Getting started with development

Open up the client: http://localhost:3000

/api is proxied to the backend container, e.g. http://localhost:3000/api/v1/health will be proxied to http://localhost:8080/api/v1/health.
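
As a quick sanity check, both URLs should return the same payload. A minimal sketch, assuming the requests package is installed:

# Minimal sketch: verify that the dev proxy forwards /api to the backend.
import requests

via_proxy = requests.get("http://localhost:3000/api/v1/health", timeout=5)
direct = requests.get("http://localhost:8080/api/v1/health", timeout=5)
print(via_proxy.status_code, via_proxy.text)  # should match the direct call below
print(direct.status_code, direct.text)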

API docs: http://localhost:3000/documentation

Server: http://localhost:8080

Adminer GUI: http://localhost:8081/?pgsql=postgres&username=your_username&db=your_database_dev&ns= (password: your_password; replace the placeholder credentials with your local values)

Mock data

Mock data is located in the data/mock folder.

Tests

Run npm test in client/ for end-to-end tests.

Run make backend-test in the project root (./backend-test.bat on Windows non-WSL) for backend tests.

Run make backend-test-html in the project root (./backend-test-html.bat on Windows non-WSL) for backend tests with an HTML coverage report.

Makefile Commands

The project includes a Makefile for common development and database operations:

Development

  • make start-dev: Start dev containers with live reloading and build on startup (default setup)
  • make start-test: Start test containers and rebuild images (isolated test environment)
  • make start-prod: Start production container and rebuild images

Note: Run all commands from the project root.
Containers are isolated by environment using the Docker Compose -p flag.

Database Migrations (Alembic)

  • make m-create m="Message": Create a new migration with an autogenerated diff (replace Message)
  • make m-up: Apply all pending migrations (upgrade to latest)
  • make m-hist: Show the full migration history with details
  • make m-current: Display the current migration version in the database

Supported LLMs

Currently, we support models provided via OpenRouter.

License

MIT

References

[1] Huotala, A., Kuutila, M., Ralph, P., & Mäntylä, M. (2024). The promise and challenges of using LLMs to accelerate the screening process of systematic reviews. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, 262–271. https://doi.org/10.1145/3661167.3661172

[2] Huotala, A., Kuutila, M., & Mäntylä, M. (2025). SESR-Eval: Dataset for evaluating LLMs in the title-abstract screening of systematic reviews. Proceedings of the 19th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–12. https://arxiv.org/abs/2507.19027
