Hi,
I've just found that the BLEU score calculated during training is incorrect.
The current implementation of the evaluateBLEU function in train.cc
just compares the hypothesis and target word IDs, which include unk,
and does not treat these tokens specially.
For example, if I set the source and target vocabulary sizes to 4,
I get a really high BLEU score, because almost all of the words become unk
and an unk in the hypothesis matches an unk in the target.
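
To illustrate the effect, here is a minimal sketch of clipped n-gram matching over word IDs. It is not the actual code in train.cc: the names `kUnkId`, `NgramStats`, and `countNgramMatches`, as well as the unk ID of 0, are assumptions made for this example. When `ignore_unk` is set, hypothesis n-grams that contain unk are never counted as matches, which is one possible way to handle these tokens.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical ID of the unknown-word token; the real value depends on the vocabulary.
constexpr int kUnkId = 0;

// Number of matched hypothesis n-grams and total hypothesis n-grams.
struct NgramStats {
  int matched;
  int total;
};

// Counts clipped n-gram matches between hypothesis and reference ID sequences.
// If ignore_unk is true, hypothesis n-grams containing kUnkId never match,
// but they still count toward the total (the precision denominator).
NgramStats countNgramMatches(const std::vector<int>& hyp,
                             const std::vector<int>& ref,
                             std::size_t n,
                             bool ignore_unk) {
  NgramStats stats{0, 0};
  if (n == 0 || hyp.size() < n) return stats;

  // Count how often each n-gram occurs in the reference.
  std::map<std::vector<int>, int> ref_counts;
  for (std::size_t i = 0; i + n <= ref.size(); ++i) {
    ++ref_counts[std::vector<int>(ref.begin() + i, ref.begin() + i + n)];
  }

  for (std::size_t i = 0; i + n <= hyp.size(); ++i) {
    const std::vector<int> gram(hyp.begin() + i, hyp.begin() + i + n);
    ++stats.total;
    if (ignore_unk &&
        std::find(gram.begin(), gram.end(), kUnkId) != gram.end()) {
      continue;  // an n-gram with unk is not evidence of a correct translation
    }
    auto it = ref_counts.find(gram);
    if (it != ref_counts.end() && it->second > 0) {
      --it->second;  // clip: each reference n-gram can be matched only once
      ++stats.matched;
    }
  }
  return stats;
}
```

With a hypothesis of `{0, 0, 3, 0}` and a target of `{0, 0, 0, 3}`, plain ID comparison matches 4 of 4 unigrams even though three of them are unk. With `ignore_unk` set, only 1 of 4 unigrams matches.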