Limitations of the translation API

Hi there,

I'm not entirely certain of where to post this but given that I picked up this repo from the Chromium source I'm gonna guess this is the best place to land it.

The translation API is already an amazing thing to have available in the browser, and the current implementation is a joy to use. However, there are improvements that I would really like to see.

# Current issues

For example, if like me you are developing an extension which allows for translating the user's messages, especially in a WhatsApp/IM context, you might find that some notions are hard to take into account:

## The speaker's gender

For example in Spanish, "I am tall" would translate to "Soy alto" or "Soy alta" depending on whether you're a man or a woman.

This is obviously not the only gender-related issue, but it's a big one if you are working on IM (thus mostly first-person sentences).

## Relative references

Particularly in the context of a conversation, you might be referencing the previous message, which you are not passing to the translation API. This can affect the agreement of adjectives and pronouns, the disambiguation of meanings, etc.

For example in French you could have:

- "Qui s'occupe de la baguette ?" ("who takes care of the baguette?")
- "Je la prends" ("I'll take it")

Again, a gender-related issue. The sentence would be different if it were not about a baguette. For example "pain" is masculine so you would say "Je **le** prends".

In this case, if you are only translating the second sentence and cannot contextualize the translation with the first one, then you don't know which gender to use.

## Regionalisms

The language is different depending on the region. Obviously Brits will get upset whenever they see "color" instead of "colour" but it can get a lot more dramatic with other languages. For example French from France differs markedly from Canadian French on almost any word that appeared after the 17th century. Even regions within France do not agree on some words (have a look at the "chocolatine" debate). Examples are countless.

My point is that different regions and different people are going to pick different words or idioms and this is absolutely not bound by country or language.

## Source language

The `Translator` API requires a `sourceLanguage`, which is in my opinion not bringing any value.

For example, in my current company, our _standard_ meeting will happen in two if not three languages. Take a transcript of that, you will have French, Spanish and English in there. What is the source language of that?

On the other hand, when you're going to be translating, you will probably want the smoothest user experience and thus not ask for the source language. Most of the time you are probably going to use the `LanguageDetector`.

And finally, the goal of translation is not to gatekeep incorrect spelling or grammar. We want to understand the source material, whatever its language, and generate an output in the target language.

So essentially today you need to start two APIs (language detection and translation), download the associated models, all that to determine a single source language while the source material is very probably mixed.
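
To make the two-step flow concrete, here is a minimal sketch of what I mean. The `DetectorLike` and `TranslatorLike` interfaces and both function names are my own, declared locally so the snippet type-checks outside a browser; they only mirror the parts of `LanguageDetector` and `Translator` that I rely on, and the 0.5 confidence threshold is an arbitrary assumption.

```typescript
// Local structural types (assumptions) mirroring the parts of the
// built-in APIs used here, so the sketch compiles outside a browser.
interface DetectionResult {
  detectedLanguage: string;
  confidence: number;
}
interface DetectorLike {
  detect(text: string): Promise<DetectionResult[]>;
}
interface TranslatorLike {
  translate(text: string): Promise<string>;
}
type TranslatorFactory = (opts: {
  sourceLanguage: string;
  targetLanguage: string;
}) => Promise<TranslatorLike>;

// Keeps only the single most likely language and drops the rest.
// For a mixed-language transcript, this is where information is lost.
function pickSourceLanguage(
  results: DetectionResult[],
  minConfidence = 0.5, // arbitrary threshold, chosen for illustration
): string | null {
  const best = [...results].sort((a, b) => b.confidence - a.confidence)[0];
  return best && best.confidence >= minConfidence ? best.detectedLanguage : null;
}

// Step 1: detect. Step 2: create a translator for that one language pair.
async function detectThenTranslate(
  text: string,
  targetLanguage: string,
  detector: DetectorLike,
  createTranslator: TranslatorFactory,
): Promise<string> {
  const source = pickSourceLanguage(await detector.detect(text));
  if (source === null) {
    throw new Error("Could not determine a source language");
  }
  if (source === targetLanguage) {
    return text; // nothing to translate
  }
  const translator = await createTranslator({ sourceLanguage: source, targetLanguage });
  return translator.translate(text);
}
```

In a browser that ships these APIs, I would expect the call to look roughly like `detectThenTranslate(text, "es", await LanguageDetector.create(), (o) => Translator.create(o))`. Note that a transcript mixing French, Spanish and English still ends up with exactly one `sourceLanguage`.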

# Proposed improvements

As an avid LLM user, I have been using LLMs to do translation for quite a while now, and my workflow essentially is:

- Give a bit of context (previous message, lexical field, etc)
- A bit about myself
- Target language
- What I want to translate

For example:

```
I'm a man, please translate this message for my doctor in Spanish from Spain: Could you renew my receta? Thanks!
```

This workflow works well for me, because:

- It does not ask me for the source language
- I can add whatever context is relevant to the conversation, using my judgement

At the end of the day this suggests that a more powerful API could be possible. For example, the `create` options could be modified:

```typescript
/**
 * Core options required for creating a translator
 */
export interface TranslatorCreateCoreOptions {
    /** Optional source language hint in BCP-47 format (e.g., "en", "es", "zh") */
    sourceLanguage?: string;
    /** Target language in BCP-47 format (e.g., "en", "es", "zh") */
    targetLanguage: string;
    /** Semi-formal information about the dialect/region to translate for */
    targetDialect: string; // "es", "south-west of France", etc
}
```

As you can see:

- `sourceLanguage` becomes optional, and it's a hint not a hard requirement (helps with disambiguation, but that's it)
- `targetDialect` gives more information about which variants of the language the developer wants. Semi-formal, because you might want to specify regionalisms, tone, use of specific slang, etc

And add to the `translate` options:

```typescript
/**
 * Options for translation operations
 */
export interface TranslatorTranslateOptions {
    /** AbortSignal to cancel the translation */
    signal?: AbortSignal;
    /** Plain-text general context of the conversation (speaker's gender, lexical field, etc) */
    context?: string;
    /** Previous parts of the text/conversation being translated */
    previousText?: string;
}
```

Essentially, we're adding two things in there:

- `context` is an add-on to the system prompt, to give any info deemed valuable by the developer. Given all the possible nuances, it seems futile to me to model it any further; we can probably let the developer give us plain-text details of what they know about the expected translation
- `previousText` would be the previous parts of the conversation; essentially we'd ask the LLM to complete this text by translating the input text
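
To illustrate how these fields could fit together, here is a rough sketch that folds the proposed options into a single plain-text prompt. Everything in it is hypothetical: `buildTranslationPrompt` is a name I made up, the prompt wording is arbitrary, and nothing here says how an actual implementation would or should consume these options. `signal` is left out because it does not influence the prompt.

```typescript
// Hypothetical shapes, copied from the interfaces proposed above.
interface ProposedCreateOptions {
  sourceLanguage?: string;
  targetLanguage: string;
  targetDialect: string;
}
interface ProposedTranslateOptions {
  context?: string;
  previousText?: string;
}

// Assembles a prompt from whatever the developer chose to provide.
// Optional fields are simply skipped when absent.
function buildTranslationPrompt(
  create: ProposedCreateOptions,
  translate: ProposedTranslateOptions,
  input: string,
): string {
  const lines: string[] = [
    `Translate the text below into ${create.targetLanguage} (dialect: ${create.targetDialect}).`,
  ];
  if (create.sourceLanguage) {
    // A hint only: the text may still mix several languages.
    lines.push(`The source is probably ${create.sourceLanguage}, but may mix languages.`);
  }
  if (translate.context) {
    lines.push(`Context: ${translate.context}`);
  }
  if (translate.previousText) {
    lines.push(
      "Earlier in the conversation (for reference only, do not translate):",
      translate.previousText,
    );
  }
  lines.push("Text to translate:", input);
  return lines.join("\n");
}

// The baguette example from above: the previous message carries the gender.
const prompt = buildTranslationPrompt(
  { targetLanguage: "es", targetDialect: "Spain" },
  { previousText: "Qui s'occupe de la baguette ?" },
  "Je la prends",
);
console.log(prompt);
```

The point of the sketch is only that `context` and `previousText` can stay free-form: the developer passes what they know, and no extra modelling of gender, register or region is needed on the API side.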

# Conclusion

After using these APIs for my app, these are the things that I feel are needed for my use case. I'm sure that I'm all kinds of wrong, whether in the form, the place, or the constraints I have in mind. But let me know if this can go any further, and if so I will happily contribute where required.
