Description
Train 2B models (0.5T/1T tokens) with different proportions of Multisynth data:
Option 1: Aim for a near-uniform mix of languages other than English; include translated data, up to 50% of each language's total, only when the amount of native data is insufficient
Option 2: Aim for a near-uniform mix of languages other than English; include 50% translated data even when sufficient native data is available
Option 3: Aim for a near-uniform mix of languages other than English; include as much translated data as necessary to get close to uniform
Option 4: Aim for a balance between a "natural" and a uniform mix of languages other than English (Jörg Tiedemann, when should we use translated data in this option?)
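
To make the differences between Options 1 to 3 concrete, here is a minimal sketch of how the per-language token budgets could be computed. It is one possible reading of the options, not an agreed implementation. The function names, the example languages, and the token counts are hypothetical. It assumes translated data is available in unlimited quantity, that native data is not repeated, that "50%" means the translated share of each language's final total, and that Option 2 keeps a fixed 50/50 split even when native data is scarce. Option 4 is left out because its use of translated data is still an open question.

```python
def language_budget(native_available, target, option):
    """Return (native, translated) token counts for a single language.

    native_available: native tokens that exist for the language
    target: the language's share under a uniform mix
    option: 1, 2 or 3, following the options listed in this issue
    """
    if option == 1:
        # Use native data first; top up with translated data, but keep the
        # translated share at or below 50% of the language total (assumption).
        native = min(native_available, target)
        translated = min(target - native, native)
    elif option == 2:
        # Always a 50/50 split, even when native data alone would suffice
        # (assumption: the split stays 50/50 when native data is scarce).
        native = min(native_available, target / 2)
        translated = native
    elif option == 3:
        # Fill the whole gap to the uniform target with translated data.
        native = min(native_available, target)
        translated = target - native
    else:
        raise ValueError("only options 1-3 are sketched here")
    return native, translated


def mix(native_by_lang, non_english_budget, option):
    """Split the non-English budget uniformly and apply one option per language."""
    target = non_english_budget / len(native_by_lang)
    return {
        lang: language_budget(available, target, option)
        for lang, available in native_by_lang.items()
    }


# Hypothetical example: token counts in billions, two non-English languages.
native = {"fi": 120.0, "is": 10.0}
for option in (1, 2, 3):
    print(option, mix(native, 200.0, option))
```

Under these assumptions only Option 3 reaches the uniform target for the low-resource language; Options 1 and 2 stay below it because the translated share is capped.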