This repository was archived by the owner on Aug 9, 2024. It is now read-only.

Content replacement features -- use removed content to inform measures of added content in diffs #152

@eranroz


revscoring, AbuseFilter, and other tools make it easy to catch vandalism that matches a "bad regex" or a bad-word list. However, the existing tools cannot identify word replacements.
E.g. "Barack Obama is president" => "Barack Obama is terrorist". While "terrorist" is not a bad word, replacing some other word with "terrorist" is most probably bad.

An "alignment" between words in the previous and the new revisions is not always obvious, but when one exists, the tool can use it.
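
To make the alignment idea concrete, here is a minimal sketch that pairs removed words with the words that replaced them. It uses `difflib.SequenceMatcher` from the Python standard library, not revscoring's own diff machinery, and the helper names `replaced_tokens` and `replacements_into` as well as the `watchlist` argument are hypothetical, chosen only for illustration. Whitespace tokenization is also a simplifying assumption.

```python
from difflib import SequenceMatcher


def replaced_tokens(old_text, new_text):
    """Return (removed_tokens, added_tokens) pairs for each aligned replacement.

    Hypothetical helper: tokens are split on whitespace, and alignment comes
    from difflib's opcodes rather than from any revscoring datasource.
    """
    old_tokens = old_text.split()
    new_tokens = new_text.split()
    matcher = SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" opcode means a span of old tokens was swapped for a
        # span of new tokens at the same position in the text.
        if tag == "replace":
            pairs.append((old_tokens[i1:i2], new_tokens[j1:j2]))
    return pairs


def replacements_into(old_text, new_text, watchlist):
    """Count replacements whose added tokens include a watchlisted word.

    `watchlist` is an assumed set of lowercase words that are not "bad" on
    their own but are suspicious when they replace something else.
    """
    count = 0
    for _removed, added in replaced_tokens(old_text, new_text):
        if any(token.lower() in watchlist for token in added):
            count += 1
    return count


if __name__ == "__main__":
    old = "Barack Obama is president"
    new = "Barack Obama is terrorist"
    print(replaced_tokens(old, new))
    print(replacements_into(old, new, {"terrorist"}))
```

A count like the one from `replacements_into` could serve as one candidate feature. Pure insertions and pure deletions produce no "replace" opcode, so they do not contribute to it.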
