May 26, 2026 · 6 min read · Data
Hyderabad is two cities. Your travel AI should know which one.
Hyderabad, Telangana, India: 10 million people, the Charminar, biryani, Golconda Fort. Hyderabad, Sindh, Pakistan: 1.7 million people, the Pacco Qillo, Mughal-era tombs, distinct cuisine. Generic LLMs without grounding pick one based on surrounding tokens. They are often wrong.
What goes wrong
"Tell me about the Charminar in Hyderabad." Asked plainly, most frontier models answer correctly: Telangana. But change the surrounding context. Mention biryani: still correct. Mention "Sindh region cuisine": the model wavers. Mention "Indus river": a frontier-model travel chatbot we tested confidently described the Charminar as being on the Indus. The Charminar is on the Musi.
The model is doing pattern completion across two adjacent high-frequency clusters. Both cities are real. Both have prominent Indo-Islamic monuments. Both have distinctive culinary traditions. The statistical signature is similar enough that token bias decides the answer.
The disambiguation key
The fix is structural: never let the system reason about "Hyderabad" without an attached country code, and ideally an attached subdivision code. The API takes both as request parameters:
GET https://api.travelminds.ai/v1/cities?name=Hyderabad&iso_alpha2=IN&state_code=TG
Authorization: Bearer YOUR_KEY
{
  "cities": [{
    "name": "Hyderabad",
    "iso_alpha2": "IN",
    "state_code": "TG",
    "lat": 17.3850, "lon": 78.4867,
    "narrative": "Capital of Telangana...",
    "master_score": 88
  }]
}
GET https://api.travelminds.ai/v1/cities?name=Hyderabad&iso_alpha2=PK&state_code=SD
{
  "cities": [{
    "name": "Hyderabad",
    "iso_alpha2": "PK",
    "state_code": "SD",
    "lat": 25.3960, "lon": 68.3578,
    "narrative": "Second-largest city of Sindh...",
    "master_score": 71
  }]
}
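On the client side, the simplest way to enforce this is to make it impossible to build a request without a country code. The following is a minimal Python sketch, not an official client: the endpoint and parameter names are taken from the requests above, while `build_city_url` is a hypothetical helper. It only constructs the URL; sending the request with the Authorization header is left to whatever HTTP library you already use.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint as it appears in the example requests above.
BASE_URL = "https://api.travelminds.ai/v1/cities"


def build_city_url(name: str, iso_alpha2: str, state_code: Optional[str] = None) -> str:
    """Build a city lookup URL that always carries a country code.

    Hypothetical helper: it refuses to produce a bare-name query.
    """
    if len(iso_alpha2) != 2 or not iso_alpha2.isalpha():
        raise ValueError(f"expected a two-letter country code, got {iso_alpha2!r}")
    params = {"name": name, "iso_alpha2": iso_alpha2.upper()}
    if state_code:
        params["state_code"] = state_code.upper()
    return f"{BASE_URL}?{urlencode(params)}"


india_url = build_city_url("Hyderabad", "IN", "TG")
pakistan_url = build_city_url("Hyderabad", "PK", "SD")
```

Making `iso_alpha2` a required argument moves the failure from "wrong city at answer time" to "error at call time", which is much easier to catch in testing.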
The ISO 3166-1 alpha-2 country code plus the subdivision code has been sufficient to disambiguate every same-named pair we have encountered. It is also the right primitive for downstream systems: visa rules key off iso_alpha2, transit schedules key off iso_alpha2 plus state_code, and currency keys off iso_alpha2.
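To make the "right primitive" point concrete, here is a small Python sketch of a city key that downstream tables can hang off. `CityKey`, `CURRENCY_BY_COUNTRY`, and `currency_for` are illustrative names, not part of any API; the currency codes are the standard ISO 4217 ones for India and Pakistan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CityKey:
    """Hashable identity for a city: never the name alone."""
    name: str
    iso_alpha2: str
    state_code: str


# Illustrative downstream table keyed by country code only.
CURRENCY_BY_COUNTRY = {"IN": "INR", "PK": "PKR"}


def currency_for(city: CityKey) -> str:
    """Look up currency from the country part of the key."""
    return CURRENCY_BY_COUNTRY[city.iso_alpha2]


hyderabad_in = CityKey("Hyderabad", "IN", "TG")
hyderabad_pk = CityKey("Hyderabad", "PK", "SD")

# Same name, different keys, different downstream answers.
assert hyderabad_in != hyderabad_pk
assert currency_for(hyderabad_in) != currency_for(hyderabad_pk)
```

Because the dataclass is frozen, instances are hashable and can serve directly as dictionary keys or set members.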
Other classic same-name pairs
- Lahore: the Pakistani megacity in Punjab (PK), and Lahore-area villages on the Indian side that appear in historical references. Pre-1947 sources use the name liberally on both sides of what is now the border.
- Cambridge: the original in England, plus Cambridge, Massachusetts (Harvard, MIT); Cambridge, Ontario; Cambridge, New Zealand; Cambridge, Maryland; and a dozen more in the United States alone.
- Mathura: the major pilgrimage city in Uttar Pradesh associated with Krishna; and Mathura villages in Punjab and elsewhere with no religious significance.
- Kashi / Varanasi / Banaras: same place, three names. This is the inverse problem, where one entity is referenced by several aliases. iso_alpha2 plus state_code plus a canonical id solves both directions.
- Vijayawada / Bezawada: same place, modern and historic name.
- Goa: the Indian state versus Goa, a town in Iran. Less common, real in geographic gazetteers.
- Tripoli: the Libyan capital and the Lebanese city. Both ancient, both Mediterranean, both real.
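Both directions of the problem (one name, many places; one place, many names) can be handled by a single index from labels to canonical ids. The sketch below is a toy in-memory version in Python. The ids such as `city-001` and the record layout are invented for illustration, and the subdivision codes for Uttar Pradesh and Andhra Pradesh are assumed to follow the same two-letter style as the TG and SD examples above.

```python
from collections import defaultdict

# Hypothetical records: (canonical_id, name, iso_alpha2, state_code, aliases)
RECORDS = [
    ("city-001", "Hyderabad", "IN", "TG", []),
    ("city-002", "Hyderabad", "PK", "SD", []),
    ("city-003", "Varanasi", "IN", "UP", ["Kashi", "Banaras"]),
    ("city-004", "Vijayawada", "IN", "AP", ["Bezawada"]),
]


def build_index(records):
    """Map every case-folded name or alias to the canonical ids it may refer to."""
    index = defaultdict(list)
    for canonical_id, name, _country, _state, aliases in records:
        for label in [name, *aliases]:
            index[label.casefold()].append(canonical_id)
    return index


def lookup(index, text):
    """Return all canonical ids a free-text label could mean (possibly none)."""
    return index.get(text.strip().casefold(), [])


INDEX = build_index(RECORDS)
```

A lookup that returns one id is an alias or an unambiguous name; a lookup that returns several is a same-name collision that needs the country and subdivision codes to settle.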
The product implication
If your AI travel product accepts free-text city names from users and passes them straight to a model, you have a disambiguation bug that surfaces unpredictably. The fix is to resolve every city name to a (name, iso_alpha2, state_code) triple before the model sees it, and to surface a disambiguation prompt to the user when the name matches more than one triple.
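The resolve-then-ask step might look like the following Python sketch. Everything here is an assumption for illustration: `CityTriple`, `COUNTRY_NAMES`, and `disambiguation_prompt` are hypothetical names, and the candidate list would in practice come from your city data source rather than being hard-coded.

```python
from typing import List, NamedTuple, Optional


class CityTriple(NamedTuple):
    name: str
    iso_alpha2: str
    state_code: str


# Illustrative subset; a real product would carry a full country-name table.
COUNTRY_NAMES = {"IN": "India", "PK": "Pakistan"}


def disambiguation_prompt(candidates: List[CityTriple]) -> Optional[str]:
    """Return a question for the user if the name matched more than one triple.

    Returns None when there are zero or one candidates, i.e. nothing to ask.
    """
    if len(candidates) <= 1:
        return None
    countries = [c.iso_alpha2 for c in candidates]
    # If two candidates share a country, the state code is needed to tell them apart.
    need_state = len(set(countries)) < len(countries)
    options = []
    for c in candidates:
        country = COUNTRY_NAMES.get(c.iso_alpha2, c.iso_alpha2)
        if need_state:
            options.append(f"{c.name}, {c.state_code}, {country}")
        else:
            options.append(f"{c.name}, {country}")
    return "Did you mean " + " or ".join(options) + "?"


matches = [CityTriple("Hyderabad", "IN", "TG"), CityTriple("Hyderabad", "PK", "SD")]
question = disambiguation_prompt(matches)
```

Only once the user has picked one triple (or the lookup returned exactly one) should the city reach the model, and it should arrive with its codes attached.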
"Did you mean Hyderabad, India or Hyderabad, Pakistan?" is a one-line UI fix that prevents an entire class of confidently wrong answers downstream. The data layer just needs to make that question askable.