lucarnet 0 Posted yesterday at 10:37 AM

Hello,

In a cross-platform home automation project (Windows 11/Android 12/Debian 12), I have run into a new problem when I run this program under Delphi 11.2/FMX/Win64. It calls the "vosk_model_new" function from the "libvosk.dll" library (a speech recognition library):

1) On Windows 11/Delphi 11.2 FMX/Win64: when the program is run from the IDE, in debug or release mode, the function takes approximately 2.5 minutes to execute!
2) On Windows 11/Delphi 11.2 FMX/Win64: when the program is run from the command line in a Windows console, the execution time is acceptable (approximately 1 to 2 seconds).
3) On Delphi 11.2 FMX/Android 12: the execution time outside the IDE is acceptable (approximately 1 to 2 seconds). (Impossible to test in the IDE in DEBUG mode for Android.)
- On RASPI/Debian 12/Lazarus 4.0, there are no problems with the equivalent libvosk.so file.

After numerous tests:
- Function declared as stdcall or cdecl
- Library bound statically or dynamically
- Options/project/remote debugging unchecked

the problem is the same, and I don't see any solution.

Should I be looking at the IDE, or at the Vosk DLL? Does anyone have any leads, or ideally a solution?

Thanks in advance, and have a nice day.

The IDE event log lists multiple thread start/exit events (>100) during the execution of the function in the IDE.
Log excerpt:

Header unit of the DLL "Vosk_api.pas":

    unit VoskModelNew;

    interface

    uses
      SysUtils, Classes, vosk_api;

    type
      TVoskModel = class(TComponent)
      private
        { Private declarations }
        FModel: PVoskModel;
        FModelPath: String;
      protected
        { Protected declarations }
      public
        { Public declarations }
        //constructor Create(AOwner: TComponent); overload; override;
        constructor Create(AOwner: TComponent; AModelPath: String); reintroduce;
        destructor Destroy; override;
        function FindWord(AWord: String): Integer;
        property Model: PVoskModel read FModel;
      published
        { Published declarations }
        property ModelPath: String read FModelPath;
      end;

    implementation

    constructor TVoskModel.Create(AOwner: TComponent; AModelPath: String);
    var
      pmodelpath: PAnsiChar;
    begin
      inherited Create(AOwner);
      FModelPath := AModelPath;
      // One extra byte: StrPLCopy appends a terminating #0 after the copied characters.
      pmodelpath := AnsiStrAlloc(Length(AModelPath) + 1);
      StrPLCopy(pmodelpath, AModelPath, Length(AModelPath));
      try
        FModel := vosk_model_new(pmodelpath); //FModelPath));
      finally
        StrDispose(pmodelpath);
      end;
    end;
    ...

DLL function call unit:

    unit vosk_api;
    { This unit is automatically generated by Chet: https://github.com/neslib/Chet }

    {$MINENUMSIZE 4}

    interface

    const
      {$IF Defined(WIN32)}
      LIB_VOSK = 'libvosk.dll';
      _PU = '';
      {$ELSEIF Defined(WIN64)}
      LIB_VOSK = 'libvosk.dll';
      _PU = '';
      {$ELSE}
        {$MESSAGE Error 'Unsupported platform'}
      {$IFEND}

    type
      PVoskModel = Pointer;
      PPVoskModel = ^PVoskModel;
      PVoskSpkModel = Pointer;
      PPVoskSpkModel = ^PVoskSpkModel;
      PVoskRecognizer = Pointer;
      PPVoskRecognizer = ^PVoskRecognizer;
      PVoskBatchModel = Pointer;
      PPVoskBatchModel = ^PVoskBatchModel;
      PVoskBatchRecognizer = Pointer;
      PPVoskBatchRecognizer = ^PVoskBatchRecognizer;

    function vosk_model_new(const model_path: PAnsiChar): PVoskModel; cdecl;
      external LIB_VOSK name _PU + 'vosk_model_new';
    ...

DLL function call unit:

logevents20250613.txt
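The manual AnsiStrAlloc/StrPLCopy/StrDispose buffer in the constructor above could probably be avoided by letting the compiler manage the narrow string. The following is only a minimal sketch: ToAnsiPath is a hypothetical helper, the model folder name is just an example value, and it assumes that libvosk accepts the path in the system ANSI code page, which is not confirmed anywhere in this thread.

```pascal
program ModelPathDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: converts a Delphi string to an AnsiString.
// Assumption: the library expects the path in the system ANSI code page.
function ToAnsiPath(const APath: string): AnsiString;
begin
  Result := AnsiString(APath);
end;

var
  AnsiPath: AnsiString;
begin
  AnsiPath := ToAnsiPath('vosk-model-small-en-us');
  // A managed AnsiString is always followed by a #0 terminator, so
  // PAnsiChar(AnsiPath) can be handed to a C function for as long as
  // AnsiPath stays alive, for example:
  //   FModel := vosk_model_new(PAnsiChar(AnsiPath));
  Writeln('Path length in bytes: ', Length(AnsiPath));
end.
```

Keeping the AnsiString in a local variable (or a field) for the duration of the call is the important part; no explicit allocation or disposal is needed.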
Der schöne Günther 336 Posted yesterday at 10:59 AM

This is the same question as last time, right?
Problem time execution Mode DEBUG or RUN Library Vosk - Delphi IDE and APIs - Delphi-PRAXiS [en]

There is no functional problem, it's just that the performance is not acceptable when being debugged, right? Have you tried using a regular profiler?
Kas Ob. 138 Posted yesterday at 11:08 AM

28 minutes ago, lucarnet said:
Is it more the IDE that I should be looking for, or the Vosk DLL? Does anyone have any leads, ideally a solution?

I don't know what Vosk is and will not search the internet, but one thing I can see is very wrong in your attached log file: the number of threads started and exited is huge and wrong, so

1) Either there is some setting you are missing that lets such a library use a thread pool properly, or you are calling the wrong model.
2) Or you are loading and unloading the library, or part of it, many times!

Check these, because I don't believe a normal and tested library should use so many threads in such a manner; that makes no sense at all.
Kas Ob. 138 Posted yesterday at 11:24 AM

I just remembered seeing some code where the programmer tried to decode captured audio into an image. The audio was PCM, at the standard minimum of 8000 samples per second, and he was trying to spawn a thread for each sample to perform a Fourier transformation on each and every sample, so his idea was to spawn at least 8000 threads per second. I think you are making a mistake close or similar to this. So in case these threads are yours, rethink your approach and find a working demo, or check the documentation, on how to feed the data the right way.
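For illustration, the alternative to one thread per sample is to read the PCM data sequentially on a single thread and hand it over in fixed-size chunks. The sketch below is not taken from the Vosk documentation: FeedInChunks, TChunkHandler and CountChunk are hypothetical names, the 3200-byte chunk size is only an illustrative value, and the handler merely counts bytes where a real program would presumably forward each buffer to vosk_recognizer_accept_waveform.

```pascal
program FeedChunks;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical callback type. A real handler would pass Data/Len on to
  // the recognizer; here it only receives the buffer.
  TChunkHandler = procedure(Data: Pointer; Len: Integer);

var
  TotalBytes: Int64 = 0;

// Stand-in handler that only counts the bytes it is given.
procedure CountChunk(Data: Pointer; Len: Integer);
begin
  Inc(TotalBytes, Len);
end;

// Reads AStream sequentially on the calling thread and passes each block
// to AHandler. Returns the number of chunks delivered.
function FeedInChunks(AStream: TStream; AChunkSize: Integer;
  AHandler: TChunkHandler): Integer;
var
  Buf: array of Byte;
  N: Integer;
begin
  if AChunkSize <= 0 then
    raise EArgumentException.Create('AChunkSize must be positive');
  Result := 0;
  SetLength(Buf, AChunkSize);
  repeat
    N := AStream.Read(Buf[0], AChunkSize);
    if N > 0 then
    begin
      AHandler(@Buf[0], N);
      Inc(Result);
    end;
  until N < AChunkSize; // a short read means the end of the stream
end;

var
  MS: TMemoryStream;
  Chunks: Integer;
begin
  MS := TMemoryStream.Create;
  try
    MS.Size := 8000; // 8000 bytes of silence stand in for PCM data
    FillChar(MS.Memory^, 8000, 0);
    MS.Position := 0;
    Chunks := FeedInChunks(MS, 3200, CountChunk);
    Writeln(Chunks, ' chunks, ', TotalBytes, ' bytes');
  finally
    MS.Free;
  end;
end.
```

With this shape, the whole recognition loop runs on one thread (typically a single worker thread, so the UI stays responsive), and the debugger has no stream of thread start/exit events to process.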
lucarnet 0 Posted 23 hours ago

Quote
This is the same question as last time, right? Problem time execution Mode DEBUG or RUN Library Vosk - Delphi IDE and APIs - Delphi-PRAXiS [en] There is no functional problem, it's just that the performance is not acceptable when being debugged, right? Have you tried using a regular profiler?

- Yes indeed, I preferred to move my topic to a more appropriate folder.
- The problem only occurs when running the code in the Delphi IDE, both in debug and release mode. Surprising, isn't it?
- I've never used a regular profiler, but I'll look into it.

Thanks for your response.
lucarnet 0 Posted 23 hours ago

5 hours ago, Kas Ob. said:
I just remembered seeing some code where the programmer tried to decode captured audio into an image. The audio was PCM, at the standard minimum of 8000 samples per second, and he was trying to spawn a thread for each sample to perform a Fourier transformation on each and every sample, so his idea was to spawn at least 8000 threads per second. I think you are making a mistake close or similar to this. So in case these threads are yours, rethink your approach and find a working demo, or check the documentation, on how to feed the data the right way.

Your comment is very interesting; we're indeed dealing with the same type of processing. However, why does the timing issue only appear when running the code in the Delphi IDE?

I'm using this same library for the speech recognition code on a Debian 12 server with Lazarus 4.0. Under Lazarus, in debug or release mode, I don't have the problem; testing these same Vosk library functions from Python and C also works fine.

I have, however, reported the issue to the Vosk developer concerned.

Perhaps you have a link to another forum where someone has encountered this same problem?

Thanks in any case.
Lajos Juhász 320 Posted 20 hours ago

It could be that the library slows down when it detects the debugger. You can try using Lazarus on Windows to see if it behaves the same.
Kas Ob. 138 Posted 4 hours ago

13 hours ago, lucarnet said:
However, why does the timing issue only appear when running the code in the Delphi IDE?

I spent more than 4 hours on this, not because it was new to me (I have already witnessed this many times), but just because I wanted to refresh my frustration with this pile of sh** called the Delphi 64-bit debugger.

Here I want to make a few things clear:

1) The Delphi debugger, 32-bit and 64-bit, does have parts that can at the very least be called a masterpiece, so it is not useless as a whole. But either the smart people who worked on it left and the debugger got stuck with different people who have little knowledge of how the existing code should work, or those who wrote that beautiful debugger loop have no idea what to do as the next step.
2) I have XE8, so it might be different in modern IDEs. Embarcadero has spent the last decade on a slightly rephrased "let's make Delphi great again", and from what I read they are focusing on LSP and making it great again. Good luck with that! Considering they are just adding processes and memory allocation, that is exactly the kind of thing it is impossible to squeeze performance out of; on the contrary, it will get worse with every step in that direction.

Now back to the subject, and most likely I will rant again about the debugger.

@lucarnet you said it takes 2.5 minutes under debugging while it takes 1-2 seconds without a debugger. My CPU is a Sandy Bridge i5-2500 from about a decade and a half ago, and the times are strangely close: yes, it takes between 2.5 and 5 minutes, and I am talking about just loading the model, not even loading the recognizer.
Out of intrigue I looked at this VOSK library and, following their documentation, downloaded the binaries with Python pip. So far so good. I browsed the demos and liked this one https://github.com/alphacep/vosk-api/blob/master/c/test_vosk.c so I translated the needed headers, and nothing could be easier:

    const
      VOSK_LIBNAME = 'libvosk.dll';

    type
      PVoskModel = Pointer;
      PVoskRecognizer = Pointer;

    // cdecl matches the C API on Win32; Win64 has a single calling convention.
    function vosk_model_new(const model_path: PAnsiChar): PVoskModel; cdecl;
      external VOSK_LIBNAME name 'vosk_model_new';
    procedure vosk_model_free(model: PVoskModel); cdecl;
      external VOSK_LIBNAME name 'vosk_model_free';
    // The C header declares sample_rate as float, hence Single rather than Double.
    function vosk_recognizer_new(model: PVoskModel; sample_rate: Single): PVoskRecognizer; cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_new';
    procedure vosk_recognizer_free(rec: PVoskRecognizer); cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_free';
    function vosk_recognizer_accept_waveform(rec: PVoskRecognizer; const data: Pointer; len: Integer): Integer; cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_accept_waveform';
    function vosk_recognizer_result(rec: PVoskRecognizer): PAnsiChar; cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_result';
    function vosk_recognizer_partial_result(rec: PVoskRecognizer): PAnsiChar; cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_partial_result';
    function vosk_recognizer_final_result(rec: PVoskRecognizer): PAnsiChar; cdecl;
      external VOSK_LIBNAME name 'vosk_recognizer_final_result';

Great, now I need a model and a wave file, so from https://alphacephei.com/vosk/models I grabbed the first and smallest model (40 MB), and from https://github.com/alphacep/vosk-api/tree/master/python/example I grabbed the wave file (256 KB). So far so good. I built the same demo and, lo and behold:

    // Assumes modelPath is declared as AnsiString, so the PAnsiChar cast is valid.
    modelPath := 'vosk-model-small-en-us';
    model := vosk_model_new(PAnsiChar(modelPath));
    if model = nil then
    begin
      Log('Error loading model');
      Exit;
    end;

Under the debugger this model loading takes minutes, a little less than 3 minutes, while without the debugger it literally takes 400 ms (milliseconds), that is 0.4 second!

So I am confirming your problem with the debugger; I will write more about it further down.

Now to your other problem: loading the model and the recognizer doesn't create even a single background thread. As vosk_recognizer_accept_waveform kept failing and hogging the CPU, I couldn't make it work, but it didn't create threads either! So the threads in your log are coming from your own misuse of something somewhere, maybe in the handling of the wave file or the microphone. Fix these, as these threads will only make the debugger even slower.

Also, the demo is somewhat wrong in assuming and feeding 3200 bytes, and it doesn't read the header the right way to check the sample rate, the channels and everything else. It should still work, I guess, yet it didn't work on my device.

Recommendation: the Delphi debugger is useless, and I am talking specifically about 64-bit. It is wrong and doing wrong on so many levels:

1) The demo and this library allocate 4-5 chunks of memory of 128 MB each, will add more chunks for the recognizer, and the usage will keep increasing. There is no memory leak, though; a huge amount of memory is simply needed, and that is expected.
2) The debugger uses, for an unknown reason (well, maybe there were valid reasons 20 years ago), two processes to perform the debugging, but I can't justify having a sentinel process calling DebugBreak. WHY? Who knows.
3) Now comes the fun part: the debugger must be 64-bit while the IDE is 32-bit, so an IPC is due. Great, so let's pick one and never look back. They chose TCP. TCP (WTF), even on loopback it is TCP packets. No shared memory?! No, no shared memory!
4) Now the debugger performs ReadProcessMemory like every debugger out there, but wait, we must do it differently. See, since the dawn of computers half a century ago there have been pages, they are 4 KB, and on Windows an address is valid if its page is valid, so let's perform the reading in 512-byte pieces!
5) 512 bytes is somehow still better than 1 byte, so let's perform reads of 1 byte as well, we must not give up. And in case we wrote a breakpoint using WriteProcessMemory on a single byte, which is fine and justified, let's do that for all the breakpoints to make sure, and even when the API reports success (success means it did write the memory), we must be sure and read them all back again!!
6) OK, we have the memory read in small chunks and need to ship it to the IDE and the IDE debugger, so let's send it in pieces of 1, 8, 12 and 16 bytes. Genius! So, hundreds of thousands of sends over TCP in the smallest way possible. Notice that TCP requires ACKs, so switching to UDP would improve the throughput, but we are against that, and still no shared memory!
7) How to handle sockets the Turbo Pascal way over a dial-up modem: every IOCTLSocket call (the successful ones) should be followed by WSAGetLastError for no freaking reason, and let's mix Select with WSAAsyncSelect to create an abomination, just to make sure, while the IDE and its UI block for every packet, so the IDE freezes to show it is busy. Great ideas and great design.
..

So many things to be angry about, and I can't be unbiased here, but the IPC between the IDE and the real debugging process is wrong and outdated. It can debug a 10-line project, not a real-life semi-heavy application.

Now I am angry 😪

@lucarnet there is no way to debug an application with such heavy memory usage using a very, yes very, outdated debugger, the Delphi debugger. You should fix the thread creation as I mentioned above, as it has nothing to do with the VOSK library, and skip debugging in Delphi; use logging instead. I am guessing here, but those threads are probably coming from a different library that you are using to handle the microphone or the wave file reading.

And good luck!

PS: looking at the assembly generated for 64-bit, it is a sight to see NOPs sitting there between the instructions, serving literally nothing, and one of them was even ducking up the alignment in the freaking loop. Wow, just wow.
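The "use logging instead" advice could look roughly like the sketch below, which times a call and appends the result to a text file so the measurement can be taken outside the debugger. All names (AppendLog, TimedCall, TSimpleProc, FakeModelLoad, timing.log) are hypothetical, and FakeModelLoad only sleeps as a stand-in for the real vosk_model_new call, so the program runs without the DLL. The resolution of Now is coarse (several milliseconds on Windows), which is enough to tell seconds from minutes.

```pascal
program TimedLoad;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

type
  TSimpleProc = procedure;

// Appends one timestamped line to a plain text log file.
procedure AppendLog(const AFileName, AMsg: string);
var
  F: TextFile;
begin
  AssignFile(F, AFileName);
  if FileExists(AFileName) then
    Append(F)
  else
    Rewrite(F);
  try
    Writeln(F, FormatDateTime('hh:nn:ss.zzz', Now) + ' ' + AMsg);
  finally
    CloseFile(F);
  end;
end;

// Runs AProc, logs how long it took, and returns the elapsed milliseconds.
function TimedCall(const ALabel, ALogFile: string; AProc: TSimpleProc): Int64;
var
  T0: TDateTime;
begin
  T0 := Now;
  AProc();
  Result := MilliSecondsBetween(Now, T0);
  AppendLog(ALogFile, Format('%s took %d ms', [ALabel, Result]));
end;

// Stand-in for the real work, for example a procedure that calls
// vosk_model_new and stores the returned pointer.
procedure FakeModelLoad;
begin
  Sleep(50);
end;

begin
  Writeln(TimedCall('model load', 'timing.log', FakeModelLoad), ' ms');
end.
```

Running the same executable once from the IDE and once from a console then gives two comparable lines in timing.log.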
DelphiUdIT 244 Posted 4 hours ago (edited)

Maybe this is a stupid suggestion: what about trying ATTACH instead of direct debugging?
Kas Ob. 138 Posted 4 hours ago

13 minutes ago, DelphiUdIT said:
Maybe this is a stupid suggestion: what about trying ATTACH instead of direct debugging?

Somewhere between the same and worse.