Contributing to JPlag #8

@SimDing

Clustering

All clustering-related classes are contained within the de.jplag.clustering(.*) packages.

The central idea behind the structure of the clustering code is ease of use: to use clustering, calling code should only ever interact with the ClusteringOptions, ClusteringFactory, and ClusteringResult classes.

New clustering algorithms and preprocessors can be implemented using the GenericClusteringAlgorithm and ClusteringPreprocessor interfaces, which operate on similarity matrices only. ClusteringAdapter handles the conversion between de.jplag classes and matrices. PreprocessedClusteringAlgorithm wraps another ClusteringAlgorithm so that a preprocessor runs before it.
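The idea of matrix-only algorithms with a preprocessor wrapped around them can be sketched as follows. This is a minimal, self-contained illustration and not the actual JPlag code: MatrixClusteringAlgorithm, MatrixPreprocessor, threshold, connectedComponents, and preprocessed are hypothetical stand-ins, and the real GenericClusteringAlgorithm and ClusteringPreprocessor interfaces may have different signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class ClusteringSketch {

    /** Hypothetical stand-in: clusters the row/column indices of a similarity matrix. */
    interface MatrixClusteringAlgorithm {
        List<List<Integer>> cluster(double[][] similarity);
    }

    /** Hypothetical stand-in: transforms a similarity matrix before clustering. */
    interface MatrixPreprocessor {
        double[][] preprocess(double[][] similarity);
    }

    /** Example preprocessor: sets every similarity below the threshold to zero. */
    static MatrixPreprocessor threshold(double minimum) {
        return matrix -> {
            double[][] result = new double[matrix.length][];
            for (int i = 0; i < matrix.length; i++) {
                result[i] = new double[matrix[i].length];
                for (int j = 0; j < matrix[i].length; j++) {
                    result[i][j] = matrix[i][j] >= minimum ? matrix[i][j] : 0.0;
                }
            }
            return result;
        };
    }

    /** Example algorithm: groups indices that are connected by a non-zero similarity. */
    static List<List<Integer>> connectedComponents(double[][] matrix) {
        boolean[] seen = new boolean[matrix.length];
        List<List<Integer>> clusters = new ArrayList<>();
        for (int start = 0; start < matrix.length; start++) {
            if (seen[start]) {
                continue;
            }
            // The cluster list doubles as the breadth-first search queue.
            List<Integer> cluster = new ArrayList<>();
            cluster.add(start);
            seen[start] = true;
            for (int k = 0; k < cluster.size(); k++) {
                int i = cluster.get(k);
                for (int j = 0; j < matrix.length; j++) {
                    if (!seen[j] && matrix[i][j] > 0.0) {
                        seen[j] = true;
                        cluster.add(j);
                    }
                }
            }
            clusters.add(cluster);
        }
        return clusters;
    }

    /** Wraps an algorithm so that the preprocessor runs on the matrix first. */
    static MatrixClusteringAlgorithm preprocessed(MatrixPreprocessor preprocessor,
                                                  MatrixClusteringAlgorithm algorithm) {
        return matrix -> algorithm.cluster(preprocessor.preprocess(matrix));
    }

    public static void main(String[] args) {
        double[][] similarity = {
            {1.0, 0.9, 0.1, 0.0},
            {0.9, 1.0, 0.2, 0.1},
            {0.1, 0.2, 1.0, 0.8},
            {0.0, 0.1, 0.8, 1.0},
        };
        MatrixClusteringAlgorithm algorithm =
                preprocessed(threshold(0.5), ClusteringSketch::connectedComponents);
        System.out.println(algorithm.cluster(similarity)); // prints [[0, 1], [2, 3]]
    }
}
```

Because both pieces see only a double[][] matrix, neither needs to know about submissions or comparisons; in JPlag that translation is the job of ClusteringAdapter.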

Remarks on Spectral Clustering

Integration Tests

There are integration tests for the Spectral Clustering that verify that, at least for two known sets of similarities, the groups known to be colluders are found. However, these datasets are considered sensitive and are not available to the public, so the tests can only be run by maintainers with access.

To run these tests, the contents of the PseudonymizedReports repository must be added to the folder jplag/src/test/resources/de/jplag/PseudonymizedReports.
