Vietnamese developer’s Cabin AI offers near real‑time interpretation at events with pre-learning

By Luu Quy   November 10, 2025 | 01:14 am PT
Built by Vietnamese developer Tran Vu Anh, Cabin AI delivers translations in three to five seconds by “pre‑learning” an event’s context.

At an international innovation forum in Ho Chi Minh City in late October, organizers asked hundreds of attendees to scan a QR code at the door. A web page popped up with a simple prompt: pick your language, then choose audio or on‑screen subtitles. No headsets. No interpreter booths.

People simply followed along on their phones. That was a live showcase for Cabin AI, a system developed by Anh and his team to streamline interpreting at conferences, workshops and meetings.

Solving the ‘translate while they’re talking’ problem

Cabin AI targets a stubborn challenge: translating while a speaker is still talking, not after each sentence. "We wanted a tool that reacts like a human, listening, understanding and translating at the same time, but powered by artificial intelligence," Anh says.

Users of Cabin AI follow the translated speech directly on their phones. Video by VnExpress/Luu Quy

After launching the document‑translation platform DocTranslate.io, the team spent more than a year extending its work to speech and video. Real-time translation hinges on speed and reliable speech recognition.

Older systems often wait for sentence breaks, creating awkward delays, and they frequently stumble on names, dates, numbers, foreign terms, varied accents, and mixed languages.

Cabin AI tackles this with specialized speech‑recognition and translation models fine‑tuned using curated data. Its signature feature, Anh says, is the ability to learn an event's context in advance from slide decks, agendas and planned topics.

With that background, the system handles domain‑specific terminology more precisely while keeping latency low. In ideal conditions, the translation appears three to five seconds after the speaker starts talking.

It has performed well with accented speech and code‑switching and currently supports more than 32 languages, including Vietnamese, English, Chinese, Japanese, Korean, Thai, French, German, Spanish, Italian, Russian, and Hindi.

Through Cabin AI’s web interface, users can select the language they want to use. Photo by VnExpress/Luu Quy

Trial by fire at major forums

In October, Cabin AI was tested at large events, including the Open Innovation Forum and a Quantum Technology workshop. It served as the official interpreter across full programs and dozens of unscripted panels.

"Subtitles appeared almost simultaneously with the speech. It felt like the speaker was using my native language!" one attendee said.

The pitch is not to replace professionals but to broaden coverage and simplify logistics. Organizers can deploy Cabin AI for VND 500,000–1,000,000 (about US$19–38) per hour, depending on event size, number of languages, and technical support. Instead of maintaining headset networks or staffing multiple interpreters for parallel sessions, organizers let audiences use their own devices.

Afterward, the system can export transcripts or summarized minutes to save time. Beyond conference halls, the platform has also been optimized for online meetings and direct conversations. "The solution helps multinational teams communicate effectively without language barriers," Anh says.

The founder

Anh was named one of 10 young technology leaders at the 2022 CTO Summit organized by VnExpress. His earlier DocTranslate project made the Top 5 at Techfest, run by the Ministry of Science and Technology, and was selected for the Google for Startups Accelerator.

 
 