referenceFree

browser-translation-demo

This is a small browser demo for one-way live translation from tab audio. The

Browser Realtime Translation Demo

This is a small browser demo for one-way live translation from tab audio. The server creates a short-lived OpenAI Realtime Translation client secret, and the browser uses WebRTC to send captured tab audio and play translated speech with captions.

Setup

Create a local .env file in this demo folder using .env.example as the list of required variables.

Required:

OPENAI_API_KEY=your-openai-api-key

Optional:

OPENAI_TRANSLATION_MODEL=gpt-realtime-translate
OPENAI_INPUT_TRANSCRIPTION_MODEL=gpt-realtime-whisper
PORT=5173
HOST=127.0.0.1

Run

From the cookbook repo root:

cd examples/voice_solutions/realtime_translation_guide/browser-translation-demo
npm install
npm run dev

Open the local URL printed by the server. Choose a browser tab with audio, pick the language you want to hear, and start translation.

Audio mix

The app includes an Audio mix slider for balancing translated speech with the original tab audio. By default, it plays 85% translated audio and 15% original audio, matching the LiveKit demo.

When the selected source is a browser tab, the demo still requests suppressLocalAudioPlayback in the getDisplayMedia() audio constraints. If the browser honors that setting, the slider controls both the translated audio and the original audio playback from this app. If the browser does not support local playback suppression, the source tab may continue playing outside the slider, so lower or mute the source tab if you hear too much original audio.

Validation

From the cookbook repo root:

cd examples/voice_solutions/realtime_translation_guide/browser-translation-demo
npm test

To run a live API smoke test:

npm run smoke

Notes

  • The browser uses getDisplayMedia() so the user explicitly chooses the source tab.
  • WebRTC handles browser audio transport, so the browser does not need to resample tab audio or manually send PCM chunks.