Maxidonkey

POC: Delphi VCL + WebView2 component for OpenAI Realtime (WebRTC, voice & text)


Hello everyone,

 

Following up on my previous post about EdgeAudio, and after a suggestion from Kas Ob. about using WebRTC with WebView2, I’ve published a new project on GitHub: Edge-OpenAI-Realtime.

 

This project provides a VCL component that implements WebRTC through WebView2 and leverages OpenAI’s Realtime VAD.  
It supports the full set of Realtime APIs (as of September 2025), including functions and remote MCP tools.

 

To accompany the code, I wrote a white paper (included in the README) that details the architecture and runtime sequence.  

 

A demo ZIP archive is also available in the samples folder, so you can quickly try out the component once installed.

 

I’d be glad to hear your feedback or answer any questions!

 

 

Sample1.png

Edited by Maxidonkey

Hi, 

 

I am sorry, I can't test the samples, as Edge is not available for older Delphi versions, and I am not planning on installing CE since I don't see a real reason (at least for myself, for now) to do it. Anyway...

 

The code is neat and looks very clean and, most importantly, the bridge is a piece of art. Nice!

 

I (and many others here, I think) would like to hear your findings and experience on:

1) Using audio with WebRTC and its audio processing: what is the CPU load, and does VAD on Delphi with Edge perform as advertised? I am very familiar with Jitsi (https://en.wikipedia.org/wiki/Jitsi); I have been a user for years and have suggested it to many people, many of whom run their own servers. Its performance has always astonished me: even on an old Android device, within a browser, it was fast and responsive. So what is your experience with it, and how does it compare to your EdgeAudio lib?

 

2) What is the real problem with enabling AEC and AGC ("struggle" may not be the right word, but whatever stopped you)? Jitsi performed better than native Skype on the same old mobile device with the same connection. On desktop, if you are in the middle of a conversation with audio playing and you suddenly move the microphone much closer to, or farther from, the speakers, a loopback happens (some distortion and maybe echo) for a second or a fraction of a second, and then it corrects itself. As far as I recall, VAD, AEC and AGC are all enabled by default, so why only VAD?
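
To make the question concrete: in a regular WebRTC page, AEC, AGC and noise suppression are requested through the standard getUserMedia audio constraints (echoCancellation, autoGainControl, noiseSuppression). Below is a minimal TypeScript sketch of that idea. It assumes nothing about how Edge-OpenAI-Realtime actually configures capture; the helper name buildAudioConstraints, its option type, and the choice to default every flag to true are hypothetical and only for illustration.

```typescript
// Hypothetical helper, NOT part of Edge-OpenAI-Realtime.
// It only builds the constraints object; the three property names are the
// standard MediaTrackConstraints keys understood by Chromium-based browsers.
interface AudioProcessingOptions {
  echoCancellation?: boolean; // AEC
  autoGainControl?: boolean;  // AGC
  noiseSuppression?: boolean; // NS
}

interface AudioOnlyConstraints {
  audio: {
    echoCancellation: boolean;
    autoGainControl: boolean;
    noiseSuppression: boolean;
  };
  video: false;
}

function buildAudioConstraints(
  opts: AudioProcessingOptions = {}
): AudioOnlyConstraints {
  return {
    audio: {
      // Assumption for this sketch: anything not specified is switched on.
      echoCancellation: opts.echoCancellation ?? true,
      autoGainControl: opts.autoGainControl ?? true,
      noiseSuppression: opts.noiseSuppression ?? true,
    },
    video: false,
  };
}

// Inside a page hosted by WebView2, the result would be passed to the
// standard capture API, for example:
//   const stream = await navigator.mediaDevices.getUserMedia(buildAudioConstraints());
//   const applied = stream.getAudioTracks()[0].getSettings();
// getSettings() reports what the browser actually applied, which may differ
// from what was requested.
```

Comparing the requested flags with what getSettings() reports would be one way to check whether AEC and AGC are really active in the embedded browser.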

 

Thank you for this lib and for sharing!
