
👾 ElatoAI: Running OpenAI Realtime API Speech on ESP32 on Arduino with Deno Edge Functions
This guide shows how to build an AI voice agent device with real-time AI speech, powered by the OpenAI Realtime API, ESP32, Secure WebSockets, and Deno Edge Functions, for uninterrupted global conversations lasting more than 10 minutes.
An active version of this README is available at ElatoAI.
<div align="center"> <a href="https://www.youtube.com/watch?v=o1eIAwVll5I" target="_blank"> <img src="https://raw.githubusercontent.com/akdeb/ElatoAI/refs/heads/main/assets/thumbnail.png" alt="Elato AI Demo Video" width="100%" style="border-radius:10px" /> </a> </div>

⚡️ DIY Hardware Design
The reference implementation uses an ESP32-S3 microcontroller with minimal additional components:
<img src="https://raw.githubusercontent.com/openai/openai-cookbook/refs/heads/main/examples/voice_solutions/arduino_ai_speech_assets/pcb-design.png" alt="Hardware Setup" width="100%">

Required Components:
- ESP32-S3 development board
- I2S microphone (e.g., INMP441)
- I2S amplifier and speaker (e.g., MAX98357A)
- Push button to start/stop the conversation
- RGB LED for visual feedback
- Optional: touch sensor for alternative control
Hardware options: A fully assembled PCB and device are available in the ElatoAI store.
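
To make the roles of the push button and RGB LED concrete, here is a minimal sketch of how a button press could toggle the conversation and how each state could map to an LED color. The state names, the color choices, and the functions `onButtonPress` and `ledColorFor` are all hypothetical illustrations, not taken from the ElatoAI firmware, and the code deliberately avoids any GPIO calls so it stays independent of a particular pin layout.

```cpp
#include <cstdint>

// Hypothetical device states (illustrative names, not from the ElatoAI firmware).
enum class DeviceState : std::uint8_t { Idle, Listening, Speaking };

// Simple 8-bit-per-channel color for the RGB LED.
struct Rgb {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

constexpr bool operator==(Rgb lhs, Rgb rhs) {
    return lhs.r == rhs.r && lhs.g == rhs.g && lhs.b == rhs.b;
}

// Assumed color scheme: off when idle, blue while listening, green while speaking.
constexpr Rgb ledColorFor(DeviceState state) {
    switch (state) {
        case DeviceState::Listening: return {0, 0, 255};
        case DeviceState::Speaking:  return {0, 255, 0};
        case DeviceState::Idle:      break;
    }
    return {0, 0, 0};
}

// A button press starts a conversation from Idle and stops it from any other state.
constexpr DeviceState onButtonPress(DeviceState state) {
    return state == DeviceState::Idle ? DeviceState::Listening : DeviceState::Idle;
}
```

In a real Arduino sketch, the result of `ledColorFor` would be written to whatever LED driver the board uses, and `onButtonPress` would be called from a debounced button handler. Both of those pieces depend on the actual wiring and are left out here.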
📱 App Design
Control your ESP32 AI device from your phone with your own webapp.
<img src="https://raw.githubusercontent.com/openai/openai-cookbook/refs/heads/main/examples/voice_solutions/arduino_ai_speech_assets/mockups.png" alt="App Screenshots" width="100%">

| Select from a list of AI characters | Talk to your AI with real-time responses | Create personalized AI characters |
|---|---|---|
✨ Quick Start Tutorial
<a href="https://www.youtube.com