37 changes: 22 additions & 15 deletions README.md
@@ -2,30 +2,34 @@

Web application for visualizing and analyzing log files.

![Project page](./docs/images/project_page.png)

## Feature Overview

Visual Log Analyzer provides similar log analysis capabilities as those found in [LogDelta](https://github.com/EvoTestOps/LogDelta).

Analysis types:

- High-level visualizations
- Log distance analysis
- Anomaly detection

These features can be applied at different levels: directory, file, or line, depending on the analysis type.

Analyses run as background tasks, which makes it possible to run multiple analyses in parallel. However, performance may vary depending on dataset size and available memory.

## Run locally

For a more detailed guide, see the [Usage guide](./docs/usage_guide.md).

By default, the program expects the datasets to be located within the `log_data/` directory. To change the location of the log data, update the `LOG_DATA_DIRECTORY` environment variable in the `.env.sample` file. The analysis results are stored in `analysis_results/` as parquet files.
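As a sketch, the relevant `.env` entry could look like the following. The variable name comes from this README, while the path shown is only an assumed example:

```shell
# Assumed example value; point this at wherever your log datasets live.
LOG_DATA_DIRECTORY=/data/my_logs
```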

It is a good idea to create the results directory yourself beforehand to avoid permission issues.

To start the application, clone this repository and run: `docker compose up`

**Note:** Before starting, either:

- Rename `.env.sample` to `.env`
- or run compose with `docker compose --env-file .env.sample up`

@@ -56,6 +60,7 @@ log_data/
```

You can also structure your datasets into subdirectories for easier management, and then specify the base path when creating a project:

```
log_data/
├── hadoop
@@ -66,17 +71,19 @@
├── correct_1
└── correct_n
```

Use the project base path `./log_data/LO2` when working with the LO2 dataset.
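As a sketch, a layout like the one above can be created with the commands below (only the directory names visible in the tree are used; the elided entries are omitted):

```shell
# Create the example layout under log_data/ (names taken from the tree above).
mkdir -p log_data/hadoop
mkdir -p log_data/LO2/correct_1 log_data/LO2/correct_n
```

With the project base path set to `./log_data/LO2`, `correct_1` and `correct_n` then appear as directory options.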

## Usage Tips & Troubleshooting

- **Isolate a specific trace:** Double-clicking a legend item will isolate that trace, hiding all others. This is especially useful in line-level anomaly detection, where plots can be cluttered.

- **Resize plots:** Plots can be resized by dragging the bottom-right corner of the plot.

- **Manual filename entry:** Consider enabling “Manual filename entry” in the settings to avoid dropdown generation delays when working with datasets that contain many files; the trade-off is that filenames must be entered manually.

- **Display moving averages:** To visualize moving averages in line-level plots, make sure to enable them in the settings.

- **“No comparison runs found” error:** Check the “Match filenames” setting. If it is enabled intentionally, ensure that the log data directory structure is consistent.

- **Timestamps:** If the timestamps are incorrect, try modifying the PostgreSQL time zone setting in the `.env` file.
Binary file added docs/images/ano_line_lvl_form.png
Binary file added docs/images/ano_line_moving_avg.png
Binary file added docs/images/ano_line_results.png
Binary file added docs/images/example_visual_log_analyzer.png
Binary file added docs/images/high_lvl_viz_complete.png
Binary file added docs/images/high_lvl_viz_form.png
Binary file added docs/images/high_lvl_viz_results.png
Binary file added docs/images/multi_plot.png
Binary file added docs/images/project_creation.png
File renamed without changes
Binary file added docs/images/project_view.png
90 changes: 90 additions & 0 deletions docs/usage_guide.md
@@ -0,0 +1,90 @@
# Usage guide

An overview of installation and the core features. Additional helpful information can be found in the README.

## Installation

Before installing and starting the application, ensure Docker and Docker Compose are installed; both can be installed by following the official Docker documentation. If you are a student at the University of Helsinki using a fuksi laptop, there may be additional steps required to install Docker correctly (Docker may use the root partition, and if it fills up, the computer can become unusable). Information about installing Docker on Cubbli Linux is available [here](https://version.helsinki.fi/cubbli/cubbli-help/-/wikis/Docker).

To run the application, execute the following commands:

```
git clone https://github.com/EvoTestOps/VisualLogAnalyzer.git
cd VisualLogAnalyzer
mkdir analysis_results # Optional but recommended to avoid permission issues
docker compose --env-file .env.sample up
```

To omit the `--env-file` flag, rename `.env.sample` to `.env`, for example with `mv .env.sample .env`.

Building the application for the first time may take 1-2 minutes. Once the application is running, navigate to <http://localhost:5000/dash/> to access the homepage.

## Running analyses

The repository includes an example light-oauth-2 dataset. It contains a `Labeled` directory which has known cases (either correct or some type of error), and a `Hidden_Group_1` which contains unknown cases. Reviewing the dataset structure is recommended.

To use the dataset:

1. Navigate to the homepage.
2. Click "Create a new project", give it a name, and set the base path to `./log_data/LO2`.

Setting the base path gives access to the directories inside the LO2 directory.

![Project creation](/docs/images/project_creation.png)

After the project has been created, click it in the list, which should take you to the project view. The project view has four main components:

- A list of analyses run in the project
- "Create a new analysis" options. When an option is clicked, e.g. "Anomaly Detection", it lets you choose the level at which to run the analysis
- A settings box with relevant settings (hover for explanations)
- A "Recent analyses" table for monitoring the status of analyses and inspecting logs and error messages

![Project view](/docs/images/project_view.png)

Let's start by running a simple example analysis:

1. Click "High Level Visualisations" and then "Directory Level".
2. Select "Labeled" from the "Directory" dropdown and click "Analyze".

![High lvl viz form](/docs/images/high_lvl_viz_form.png)

This redirects you back to the project page, where the results will appear once the analysis is complete.

![High lvl viz complete](/docs/images/high_lvl_viz_complete.png)

Clicking the result will show a plot and analysis details. Hover over data points for more information.

![High lvl viz results](/docs/images/high_lvl_viz_results.png)

Next, let's try running anomaly detection:

1. Go back to the project page and select "Anomaly Detection" and then "Line Level".
2. Select `Labeled` from both "Train data directory" and "Test data directory".

- Unlike high-level visualisations and log distance analysis, you need to input the train data directory and the test data directory separately. This means either having two directories, one containing only train data and the other only test data, or filtering the data with the "Directories to include in train data" and "Directories to include in test data" options.

3. From "Directories to include in train data", select only the correct cases.
4. For "Directories to include in test data", select the remaining directories that were not selected for the train data.
5. Select a "Regex mask type" (e.g. "Myllari" or "Myllari Extended").

- In general, it is a good idea to use a mask unless there is a specific reason not to.

6. Click "Analyze".

![Ano line lvl form](/docs/images/ano_line_lvl_form.png)

The results are displayed in the list, as in the previous analysis. After clicking a result, you can select the plot (file) to display from the dropdown. This generates a plot and a table containing the line-level analysis results of that specific file. Clicking a data point on the plot takes you to the corresponding line in the table.

![Ano line lvl results](/docs/images/ano_line_results.png)

To view moving averages in the plots, navigate back to the project page, select either "Moving averages only" or "Show all" from the settings, and click "Apply". The moving averages are then displayed when you return to the results.

![Ano line moving averages](/docs/images/ano_line_moving_avg.png)

In line-level anomaly detection, you can create a multi-plot image to visualize multiple plots side by side. On a line-level anomaly detection results page, click "Create multi-plot image" and select the files and columns to include in the form. In general, moving averages are the most useful columns to include.

![Multi-plot](/docs/images/multi_plot.png)

## Adding your own datasets

By default, the application looks for datasets in the `log_data` directory. To add your own dataset, copy or move it into that directory (you might need to refresh the page to see it). To use directories in the root of `log_data`, leave the project's base path empty or use the default value. You can organize your datasets the same way the example dataset is organized and specify the path in the base path input. The base path can also be used to navigate deeper into the directory structure than the directory input options normally allow, which can be useful with some log data layouts.
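As a sketch, adding your own dataset could look like the following, where `my_dataset`, `run_1`, and `app.log` are hypothetical names chosen for illustration:

```shell
# Hypothetical dataset: one run directory containing a log file.
mkdir -p my_dataset/run_1
printf 'example log line\n' > my_dataset/run_1/app.log

# Copy it into the default log data directory.
mkdir -p log_data
cp -r my_dataset log_data/
```

With this layout, leaving the project base path at its default exposes `my_dataset` as a directory option.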