Incorrect BLEU score calculation

Hi, 

I've just found that the BLEU score calculated during training is incorrect.
The current implementation of the evaluateBLEU function in train.cc
simply compares hypothesis and target word IDs, which include unk,
and does not handle these tokens specially, so an unk in the hypothesis counts as a match for an unk in the target.

For example, if I set the source and target vocabulary sizes to 4,
I get a really high BLEU score because almost all of the words are unk.
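
One possible fix, sketched below, is to skip any n-gram that contains unk when counting matches. This is only an illustration under my own assumptions: `clippedMatches` and `unkId` are hypothetical names and are not taken from train.cc, and the n-gram total used as the precision denominator would still need to count every hypothesis n-gram so that unk words are penalized rather than ignored.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not from train.cc): counts clipped n-gram matches
// between a hypothesis and a reference, both given as word IDs.
// Any n-gram containing unkId is never counted as a match.
std::size_t clippedMatches(const std::vector<int>& hyp,
                           const std::vector<int>& ref,
                           std::size_t n, int unkId) {
  if (n == 0 || hyp.size() < n || ref.size() < n) return 0;

  // True if the n-gram starting at position i of s contains unk.
  auto hasUnk = [&](const std::vector<int>& s, std::size_t i) {
    for (std::size_t k = 0; k < n; ++k)
      if (s[i + k] == unkId) return true;
    return false;
  };

  // Count reference n-grams, leaving out those that contain unk.
  std::map<std::vector<int>, std::size_t> refCounts;
  for (std::size_t i = 0; i + n <= ref.size(); ++i) {
    if (hasUnk(ref, i)) continue;
    ++refCounts[std::vector<int>(ref.begin() + i, ref.begin() + i + n)];
  }

  // Each reference n-gram can be matched at most as often as it occurs.
  std::size_t matches = 0;
  for (std::size_t i = 0; i + n <= hyp.size(); ++i) {
    if (hasUnk(hyp, i)) continue;
    auto it = refCounts.find(
        std::vector<int>(hyp.begin() + i, hyp.begin() + i + n));
    if (it != refCounts.end() && it->second > 0) {
      --it->second;
      ++matches;
    }
  }
  return matches;
}
```

With this kind of counting, a hypothesis and target that are both all unk get zero matches instead of a perfect score.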
