Context-aware translations of reviews, comments and everything else


pool in a pool

Websites often use automatic translations for reviews and comments. When these translations are accurate, they can be extremely valuable. Translated reviews, for example, help attract a wider customer base, especially when there is no shared language between new users and the existing reviewers.

Unfortunately, many automatic translations end up looking like this:

Somewhere between barely understandable and completely incomprehensible. Part of the problem is that large, established companies like Google cannot easily switch translation methods. Any modern, state-of-the-art large language model (LLM) can produce a better translation. Here is DeepSeek:

A while ago, the alarm went off due to insufficient charge, and a store employee came over. However, they were constantly on their headset mic, angrily saying, “I can’t leave this spot right now!” I apologized several times, but they continued to handle the situation in complete silence.

Context

There’s more to improving translations than just using a better model. Often, the translation environment is set up in a way that makes accurate translation nearly impossible.

Above review isn’t for a swimming pool. It’s for a bar where you can play pocket billiards. While English shares a word for both types of pool, most languages don’t. Without any additional context, Google’s model is forced to guess and ends up choosing “swimming bath”:

Unless context is provided, LLMs will make the same mistake. However, once we inform the LLMs that the place being reviewed is called “Roxy Ball Room Manchester Deansgate”, they all correctly identify pool as something played on a billiard table:

Toller Service – gutes Ziel für Billard.

Context isn’t just important to translate words with multiple meanings like “pool”. It helps improve translations across the board. Reviewers and commenters assume their audience knows what they’re reviewing or responding to–so the translation model should too.

Cheap and quick context-aware translations

Batch Translation is a tool we built where you can upload a set of comments or reviews, and receive their translations. It is about ten times as cheap as Google’s Translation API. You can view some example translations here.

The formats we support are CSV, JSONL and JSON. We explain how to export a compatible file from your database in our guide.

In our interface, you choose one field in the file to translate. The other fields are used as context. Besides per-row context, you can also provide a global context that applies to everything, such as “a comment on a car blog”. The service returns the input file plus a new field, translated_text, which can be copied back to your database or spreadsheet.

All in all, translating a database or spreadsheet column should take 5 minutes of actual work, plus a short wait while your file gets processed.

To display the translations on your site, you can try an LLM prompt like this (alter the details and fill in the placeholders):

I’m looking to expand my website to [new language]-speaking users. However, most existing user reviews are in another language. Because of this, I added machine translations to the database. This is the database model:

[database model]

If the user’s browser’s primary Accept-Language is [new language], not [old language], AND the review has a translation available, show the review as:

<TRANSLATED REVIEW>

— ORIGINAL TEXT —

<ORIGINAL REVIEW>

In all other cases, just show the original review. This is our code–let me know if you need more files as context:

[your code]

This allows you to test whether translations help retain certain audiences.

A note on prompt injection and mistakes

LLMs are vulnerable to prompt injection, so be careful where you put user input when using Batch Translation.

All translations occur in isolated contexts. This means that a prompt injection in one row won’t affect others. Malicious users can only tamper with their own translation. And if you monitor your site’s reviews for abuse, you’ll usually catch injection attempts—they tend to make no sense in context.

LLMs make mistakes, but no more than older machine translation systems:

Long-form

This article focused on short texts, but Batch Translation also works for long-form content—blog posts, book excerpts, static websites, even templates. LLMs have limited context windows, but translating a few pages at a time is usually fine. If you’re working with 4+ page texts, try translating one first before batching the rest. If texts are self-contained, you often won’t need global or per-row context at all.

Go submit a translation job!