Incorrect BLEU score calculation #3

@MorinoseiMorizo

Hi,

I've just found that the BLEU score calculated during training is incorrect.
The current implementation of the evaluateBLEU function in train.cc
simply compares hypothesis and target word IDs, which include unk,
and does not handle these tokens specially.

For example, if I set the source and target vocabulary sizes to 4,
I get a very high BLEU score, because almost all of the words become unk and an unk in the hypothesis matches an unk in the target.
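
To illustrate the effect, here is a minimal sketch of clipped n-gram matching over word IDs. It is not the code from train.cc: `kUnkId`, `NgramStats`, and `CountNgramMatches` are hypothetical names, and the unk ID of 0 is an assumption. One possible way to avoid the inflation, shown here with the `ignore_unk` flag, is to never count a hypothesis n-gram containing unk as a match.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Assumed ID of the unknown-word token; the real value depends on the vocabulary.
constexpr int kUnkId = 0;

struct NgramStats {
  std::size_t matched;  // clipped n-gram matches against the reference
  std::size_t total;    // number of n-grams in the hypothesis
};

// Counts clipped n-gram matches between hypothesis and reference word IDs.
// With ignore_unk set, a hypothesis n-gram containing kUnkId still counts
// toward total but is never counted as a match.
NgramStats CountNgramMatches(const std::vector<int>& hyp,
                             const std::vector<int>& ref,
                             std::size_t n, bool ignore_unk) {
  NgramStats stats{0, 0};
  if (n == 0 || hyp.size() < n) return stats;

  // Collect how often each n-gram occurs in the reference.
  std::map<std::vector<int>, std::size_t> ref_counts;
  for (std::size_t i = 0; i + n <= ref.size(); ++i) {
    ++ref_counts[std::vector<int>(ref.begin() + i, ref.begin() + i + n)];
  }

  for (std::size_t i = 0; i + n <= hyp.size(); ++i) {
    std::vector<int> gram(hyp.begin() + i, hyp.begin() + i + n);
    ++stats.total;
    if (ignore_unk &&
        std::find(gram.begin(), gram.end(), kUnkId) != gram.end()) {
      continue;  // do not reward unk-to-unk matches
    }
    auto it = ref_counts.find(gram);
    if (it != ref_counts.end() && it->second > 0) {
      --it->second;  // clip: each reference n-gram can be matched only once
      ++stats.matched;
    }
  }
  return stats;
}
```

With `hyp = {0, 0, 0, 1}` and `ref = {0, 0, 0, 2}`, where the three unk tokens may stand for entirely different surface words, plain ID comparison gives 3 of 4 unigram matches, while the unk-aware variant gives 0 of 4.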
