June 19, 2026 · 7 min read · Industry

Localization for travel content: more than translating strings.

Hagia Sophia in English. Ayasofya in Turkish. Святая София in Russian. Αγία Σοφία in Greek. These aren't translations of each other — they are different authoritative names for the same place. A travel app that runs a translation API over place strings will get every one of them wrong.

The category mistake

Translation services treat input as natural language. "The big mosque in Istanbul" translates fine. "Hagia Sophia" gets word-for-word handling and produces nonsense in most target languages — Google Translate has historically returned literal "Holy Wisdom" in some pairs, which is etymologically true and pragmatically useless. A Turkish reader doesn't search for "Kutsal Bilgelik." They search for "Ayasofya."

Place names are proper nouns with authoritative local-language labels. They are not strings to be translated; they are entries to be looked up. The lookup table exists, is open, and is large: Wikidata.

Wikidata as the localization substrate

Every notable place on Wikidata has a Q-number and a labels block. Hagia Sophia is Q12506. Its labels block contains the authoritative name in roughly 200 languages, curated by speakers of those languages, with edit history. That is not the same as a machine translation. It is the recorded local-language name.

Tokyo is Q1490. The Japanese label is 東京. The Korean label is 도쿄. The Hebrew label is טוקיו. None of these is produced by translating the Latin string "Tokyo"; they are the names native speakers use. Generic translators sometimes get them right by accident, but for less famous places they often get them wrong.

Examples that break translators consistently: the Acropolis of Athens (Greek requires the article construction), the Forbidden City (Mandarin uses 故宫, which means "Old Palace," not "Forbidden City"), Reykjavík (Icelandic writes it with an accented í; many translators strip the diacritic), and Buenos Aires (the name is already Spanish and should pass through unchanged, yet translators that Spanish-ify English place names will sometimes rewrite names that should be left alone).

The shape that works

The /v1/cities and /v1/places responses include a Wikidata identifier where one exists. Given a Q-number, your client makes a single SPARQL or REST call to Wikidata to fetch the labels block, caches it, and renders the label for the user's language. Wikidata serves this data for free, with rate limits generous enough for any client that caches.

GET /v1/places?city=istanbul&type=monument
{
  "name": "Hagia Sophia",
  "wikidata_id": "Q12506",
  "lat": 41.0086,
  "lon": 28.9802
}

# Then, client-side:
GET https://www.wikidata.org/wiki/Special:EntityData/Q12506.json
# labels live under entities.Q12506.labels.<lang>.value
# returns labels.tr = "Ayasofya"
# returns labels.ru = "Святая София"
# returns labels.ja = "アヤソフィア"
# returns labels.ar = "آيا صوفيا"
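
The client-side step can be sketched in Python. This is a minimal sketch, assuming the payload shape that Special:EntityData currently returns (labels nested under entities, then the Q-number, then labels, then the language code, then value). The function name label_from_entity_data and the trimmed sample payload are illustrative and not part of either API.

```python
from typing import Optional


def label_from_entity_data(payload: dict, qid: str, lang: str) -> Optional[str]:
    """Return the label for `lang` from a Special:EntityData JSON payload, or None."""
    entity = payload.get("entities", {}).get(qid, {})
    label = entity.get("labels", {}).get(lang)
    return label["value"] if label else None


# Trimmed, hand-written sample in the shape of Q12506.json (not a live response).
sample = {
    "entities": {
        "Q12506": {
            "labels": {
                "tr": {"language": "tr", "value": "Ayasofya"},
                "en": {"language": "en", "value": "Hagia Sophia"},
            }
        }
    }
}

print(label_from_entity_data(sample, "Q12506", "tr"))  # Ayasofya
print(label_from_entity_data(sample, "Q12506", "xx"))  # None
```

Returning None for a missing language, rather than raising, leaves the fallback decision to the caller.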

Cache aggressively. The labels change rarely. A weekly refresh against Wikidata's last-modified timestamp is more than enough.
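
One way to get the weekly refresh is a small time-to-live cache. The sketch below is an assumption-laden illustration: LabelCache, the fetch callback, and the injectable clock are hypothetical names, and it expires entries on a plain seven-day timer rather than comparing Wikidata's modified timestamp.

```python
import time
from typing import Callable, Dict, Tuple

WEEK_SECONDS = 7 * 24 * 60 * 60


class LabelCache:
    """In-memory cache of labels blocks keyed by Q-number, refetched after a TTL."""

    def __init__(self, fetch: Callable[[str], Dict[str, str]],
                 ttl: float = WEEK_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch    # hypothetical: returns {lang: label} for a Q-number
        self._ttl = ttl
        self._clock = clock    # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, Dict[str, str]]] = {}

    def labels(self, qid: str) -> Dict[str, str]:
        now = self._clock()
        hit = self._store.get(qid)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # still fresh: no network call
        fresh = self._fetch(qid)       # missing or stale: refetch and restamp
        self._store[qid] = (now, fresh)
        return fresh


# Usage with a fake fetcher and a fake clock (no network involved).
calls = []


def fake_fetch(qid: str) -> Dict[str, str]:
    calls.append(qid)
    return {"tr": "Ayasofya"}


now = [0.0]
cache = LabelCache(fake_fetch, clock=lambda: now[0])
cache.labels("Q12506")
cache.labels("Q12506")          # served from cache
now[0] += WEEK_SECONDS + 1      # a bit more than a week passes
cache.labels("Q12506")          # stale, so fetched again
print(len(calls))  # 2
```

A production cache would likely persist to disk or a shared store; the in-memory dictionary here only shows the expiry logic.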

What translation services are still good for

Descriptions, reviews, opening-hours notes, transit instructions, body copy — all genuine natural language, all suitable for machine translation. The split is: proper nouns from the labels table, everything else from the translator. A travel app that gets this split right reads as competently localized in 30 languages with two days of work. A travel app that doesn't reads as a clumsy auto-translation in every language other than its source.
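
The split can be sketched as a masking step around the translator: swap each known place name for a placeholder, translate the rest, then substitute the looked-up label. This is a sketch under assumptions. localize_sentence and the placeholder format are hypothetical, the translate callback stands in for whatever service you use, and it assumes that service passes placeholder tokens through unchanged, which you would need to verify.

```python
from typing import Callable, Dict


def localize_sentence(text: str, places: Dict[str, str],
                      translate: Callable[[str], str]) -> str:
    """Translate `text`, but use looked-up labels for place names.

    `places` maps each source-language name to its target-language label.
    """
    tokens: Dict[str, str] = {}
    # Mask longer names first so a long name is not split by a shorter one.
    for i, name in enumerate(sorted(places, key=len, reverse=True)):
        token = f"__PLACE_{i}__"
        if name in text:
            text = text.replace(name, token)
            tokens[token] = places[name]
    translated = translate(text)
    # Restore each placeholder with the authoritative label.
    for token, label in tokens.items():
        translated = translated.replace(token, label)
    return translated


# Stand-in translator that only tags the string; a real one would translate it.
def fake_translate(s: str) -> str:
    return f"[tr] {s}"


result = localize_sentence(
    "Hagia Sophia opens at 9:00.",
    {"Hagia Sophia": "Ayasofya"},
    fake_translate,
)
print(result)  # [tr] Ayasofya opens at 9:00.
```

The sketch ignores grammar around the substituted name, such as case endings in inflected languages, which a real pipeline would have to handle.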

Edge cases the labels table handles

Cities with multiple official names: Brussels is Bruxelles in French and Brussel in Dutch, and the labels block carries both, one per language code. Cities whose everyday name differs from the English one: Mexico City is Ciudad de México in Spanish (the official endonym) and simply México to most Mexicans informally, while "Mexico City" is itself an English construction. Wikidata has the right answer per language. Cities with historically or politically contested names: Istanbul / Constantinople / Konstantinopolis. Wikidata keeps one primary label per language and records the other names as aliases.

Where the gaps are

Smaller places, neighbourhoods, individual restaurants, and recently renamed venues are sparser on Wikidata. For those, the API response falls back to the original name and lets your translator handle the surrounding sentence. We don't fabricate a localized label when the authoritative source doesn't have one: the user is better served seeing the canonical name than seeing a guessed transliteration.
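
A client can mirror the same rule when choosing what to render. The sketch below is illustrative only: display_name is a hypothetical helper, and the fallback order (full locale tag, then base language, then the canonical name from the API) is an assumption about what a reasonable client might do, not documented behaviour.

```python
from typing import Dict, Optional


def display_name(canonical: str, labels: Optional[Dict[str, str]], locale: str) -> str:
    """Pick a label for `locale`; fall back to the base language, then `canonical`."""
    if labels:
        # Wikidata language codes are lowercase, e.g. "nl" or "pt-br".
        tag = locale.replace("_", "-").lower()
        for code in (tag, tag.split("-")[0]):
            if code in labels:
                return labels[code]
    # No authoritative label: show the canonical name rather than guess one.
    return canonical


brussels = {"fr": "Bruxelles", "nl": "Brussel"}
print(display_name("Brussels", brussels, "nl-BE"))  # Brussel
print(display_name("Brussels", brussels, "fr"))     # Bruxelles
print(display_name("Brussels", brussels, "tr"))     # Brussels
print(display_name("Brussels", None, "tr"))         # Brussels
```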