Audio to Text Components

David Champion · December 1, 2022

Is there a VCL library available to convert human voice audio into text?

It has to work without internet access, preferably for English.

programmerdelphi2k · December 1, 2022

https://blogs.embarcadero.com/this-google-api-easily-adds-powerful-speech-recognition-to-apps/

https://github.com/halilhanbadem/delphi-google-speech-to-text

Now, if you just want to have text read aloud, you can use Microsoft's "ISpeechVoice" interface by importing the SAPI type library in your IDE, nothing more:

Component -> Import Component... -> Import a Type Library -> Microsoft Speech Object Library (SAPI)

Edited December 1, 2022 by programmerdelphi2k

David Champion · December 1, 2022

@programmerdelphi2k This doesn't achieve what I'm trying to do, which is to transcribe speech without any cloud services.

The client's environment prohibits the use of the internet; there is only a local network.

programmerdelphi2k · December 1, 2022

OK! If I come across one, I'll let you know!

David Champion · December 1, 2022

I have found the Windows.Media.SpeechRecognition namespace in Microsoft WinRT.

That may be a way forward without the Speech Recognition needing to connect to Azure.

PeteG · May 2, 2024

Hi David,

It's been a long time, but did you ever get anywhere with this? I've found https://github.com/ggerganov/whisper.cpp, which is a C++ library; it could probably be made to do something useful, though with a fair bit of work.

Pete

David Champion · May 2, 2024

Thanks for the recommendation. The feature that I was suggesting as part of an ongoing project was not thought to be worthwhile.

So, no, it was canned.

Edited May 2, 2024 by David Champion

Rollo62 · May 2, 2024

You can have a look here, from Grijjy; it's quite old, but it worked well for me under iOS and Android, so I assume Windows is OK too.

David Champion · May 2, 2024

@Rollo62 It was more the other way round; limited Speech Recognition.

The idea is to log to text, at various intervals, what people are saying, so that positions in the audio log can be sparsely described.

Also, the application cannot connect to the internet.

Edited May 2, 2024 by David Champion

Rollo62 · May 2, 2024

Oh yes, of course.
I skimmed the title too quickly; normally terms like TTS, TextToSpeech, SAPI, and SpeechToText point me in the right direction.

Maybe this will be helpful:
https://learn.microsoft.com/de-de/windows/apps/develop/speech
and an older article with Rx1.4.2 sources by Brian Long:
http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm
