Renate Schaaf, posted 21 hours ago

I've just uploaded an update to my project https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation

What it does:
Contains a VCL class which encodes a series of bitmaps and video clips together with an audio file to video. The result is an .mp4 file with H264 or H265 compression together with AAC audio. It uses Windows Media Foundation, which usually ships with Windows. Hardware encoding is supported if your graphics card can do it.

Requires:
- Headers for Media Foundation from FactoryXCode: https://github.com/FactoryXCode/MfPack
- Windows 10 or higher
- Encoder (MF-Transform) for H264/H265, which usually comes with the graphics driver
- Delphi XE7 or higher, if I haven't messed it up again; I've only got the CE and Delphi 2006

(Win32 and Win64 should be working, but Win64 recently crashes for me with "The session was disconnected".)

The demo project shows some uses:
- Record a series of canvas drawings to video
- Make a slideshow from image files (.bmp, .jpg, .png, .gif) with music (.wav, .mp3, .wmv, ...) and 2 kinds of transitions
- Insert a video clip into a slideshow (anything that Windows can decode should work)
- Transcode a video file including the first audio stream

Improvements:
I think I now better understand how to feed frames to the encoder. With the right settings it makes stutter-free videos with good audio-video synchronization. It's now usable for me in my "big" project, and I no longer need to rely on ffmpeg DLLs. More info in changes.txt.

Just try it if you're interested, I'd be glad.

Renate
Kas Ob., posted 7 hours ago

Hi @Renate Schaaf,

A friendly reminder of something I pointed out in the past: the audio duration and handling has a hidden problem. It might not be visible (OK, audible) now, but it makes the library not future-proof, and any change in the way the codec API works will cause either errors or desynced audio/video. So here are my thoughts on TBitmapEncoderWMF.WriteAudio
https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1523-L1618

1) Ditch "goto Done;" and use try..finally. It is safer, there is no need for goto here, and the loop is not complex; it is just an exit.

2) https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1685 doesn't check, fail or warn about audio failure.

3) After reading samples with pSourceReader.ReadSample
https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1556-L1562
you should check the returned size in bytes: is it aligned with the requested audio format? This is what brought up the last discussion, if my memory doesn't fail me. The audio duration should be aligned. In other words,

    AudioBlock := nChannels * wBitsPerSample / 8

gives the exact number of bytes that can't be divided any further, so any audio data passed on should be a multiple of AudioBlock. Since these values are almost always integers,

    AudioBlock := (nChannels * wBitsPerSample) div 8;

should do.

Now, for the extra check of AudioDuration, you can compute it like this:

    AudioDuration := (BufferSize / AudioBlock) * (10000000 / nSamplesPerSec);

I am sure you know a lot about the difference between audio and video, but you may not have experienced or witnessed what happens when a codec starts to
1) fail with errors,
2) desync the audio and video because it drops data that is smaller than a block,
3) corrupt the quality with sound artefacts because it pads the samples internally on its own.

Each of these is a bug that could be in any codec; they all evolve and change as their implementations keep being optimized and worked on. I remember this very clearly, it was a pain in the back with ASF and WMV: sometimes it works, and the next day it doesn't on the same Windows. The root cause was the block alignment. Even if the other codec handling the audio decoding made the mistake and returned the wrong size, you should hold on to the leftover and feed it later. For example, with 2 channels and 16-bit samples the block size is 4 bytes; with 6 channels and 24 bits it is 18 bytes. You can test different audio files, such as 5.1 and 7.1 (6 channels and 8 channels), using the samples from https://www.jensign.com/bdp95/7dot1voiced/index.html

Hope that helps.

PS: this part

    // fAudioDuration can be false!
    // if fAudioTime >= fAudioDuration then
    // fAudioDone := true;
    if fAudioDone then
      hr := pSinkWriter.NotifyEndOfSegment(fSinkStreamIndexAudio);
    // The following should not be necessary in Delphi,
    // since interfaces are automatically released,
    // but it fixes a memory leak when reading .mkv-files.
    SafeRelease(pAudioSample);

is disturbing:
1) The commented-out "if fAudioTime >= fAudioDuration then" is right and should be used, but as for "fAudioDuration can be false!", I would love to hear how this happens.
2) "but it fixes a memory leak when reading .mkv-files." brings us back to (1) from above: using try..finally is best and will prevent the memory leak. But such a case for .mkv files is strange and should be investigated more deeply, as it could be a serious problem and might cause a huge leak in the loop itself, depleting OS memory, especially on 64-bit.
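The block-alignment rule from the post above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not code from the project: the names AudioBlockAlign, AudioDurationHns and SplitAligned are hypothetical, the audio is assumed to be uncompressed PCM, and durations are assumed to be in the 100-nanosecond units that Media Foundation uses for timestamps.

```pascal
program AudioAlignSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Media Foundation timestamps count 100-nanosecond units.
  HNS_PER_SECOND = 10000000;

// Size in bytes of one audio block: one sample for every channel.
// (Hypothetical helper, assumes wBitsPerSample is a multiple of 8.)
function AudioBlockAlign(nChannels, wBitsPerSample: Integer): Integer;
begin
  Result := (nChannels * wBitsPerSample) div 8;
end;

// Duration of BufferSize bytes of PCM audio in 100-ns units.
// Only whole blocks are counted; the multiplication is done before the
// final division so that less precision is lost to integer truncation.
function AudioDurationHns(BufferSize, BlockAlign, nSamplesPerSec: Integer): Int64;
begin
  Result := (Int64(BufferSize) div BlockAlign) * HNS_PER_SECOND div nSamplesPerSec;
end;

// Splits a buffer size into the block-aligned part that can be passed on
// now and the leftover bytes to hold back and prepend to the next buffer.
procedure SplitAligned(BufferSize, BlockAlign: Integer;
  out Aligned, Leftover: Integer);
begin
  Leftover := BufferSize mod BlockAlign;
  Aligned := BufferSize - Leftover;
end;

var
  Block, Aligned, Leftover: Integer;
begin
  WriteLn('Stereo, 16 bit: ', AudioBlockAlign(2, 16), ' bytes per block');
  Block := AudioBlockAlign(6, 24); // 5.1 audio, 24 bit
  WriteLn('5.1, 24 bit: ', Block, ' bytes per block');

  // A decoder hands back 4100 bytes, which is not a multiple of the block size.
  SplitAligned(4100, Block, Aligned, Leftover);
  WriteLn('Pass on ', Aligned, ' bytes, keep ', Leftover, ' bytes for later');
  WriteLn('Duration of the aligned part: ',
    AudioDurationHns(Aligned, Block, 48000), ' x 100 ns');
end.
```

In a real WriteAudio loop the leftover bytes would have to be copied to the front of the next buffer before it is handed to the sink writer; that bookkeeping is left out here.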
Renate Schaaf, posted 6 hours ago

Hi Kas,

Good to see you again, and sorry for the long time of inactivity on my part.

Thank you for the detailed input, which I need to digest first. Since you already invested so much thought, wouldn't you like to be a contributor? When I incorporate the changes you mention, I wouldn't even know how to list you as contributor.

The issues you mention definitely need to be looked into. For the audio part I was just glad it worked, and haven't put much thought into it lately. The wrong audio duration was returned by some .vobs, which aren't really supported in the first place. The missing SafeRelease(pAudioSample) has caused memory leaks for me in a totally different context too, when I tried to write some code which simply plays an audio file through the default device.

Renate
Anders Melander, posted 5 hours ago

1 hour ago, Kas Ob. said:
> 1) Ditch "goto Done;" and use try..finally it is safer and here there is no need for goto and loop is not complex, it is just exit.

There's not even a need for try..finally since there are no resources to protect. Get rid of hr, assign the results to Result directly and just exit.

Also, instead of:

    raise Exception.Create('Fail in call nr. ' + IntToStr(Count) + ' of ' + ProcName + ' with result $' + IntToHex(hr));

I would use:

    raise Exception.CreateFmt('Fail in call no. %d of %s with result %x', [Count, ProcName, hr]);

for readability.
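The two suggestions (try..finally from Kas, and assigning to Result and exiting from Anders) could look roughly like the sketch below. Everything in it is hypothetical and only stands in for the real loop: TFakeReader, ReadChunk, ProcessAll and the HR_OK/HR_FAIL constants are made up for illustration, and the finally block is only needed if there really is something to release.

```pascal
program ExitInsteadOfGoto;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  HR_OK = HResult(0);
  HR_FAIL = HResult($80004005); // same value as E_FAIL

type
  // Hypothetical stand-in for a source reader; not part of the project.
  TFakeReader = class
  private
    fCalls: Integer;
  public
    function ReadChunk: HResult; // succeeds twice, then fails
  end;

function TFakeReader.ReadChunk: HResult;
begin
  Inc(fCalls);
  if fCalls < 3 then
    Result := HR_OK
  else
    Result := HR_FAIL;
end;

// Reads up to MaxChunks chunks. Returns the first failing HRESULT,
// or HR_OK when every chunk was read. No goto and no local hr needed.
function ProcessAll(Reader: TFakeReader; MaxChunks: Integer;
  var Cleanups: Integer): HResult;
var
  i: Integer;
begin
  Result := HR_OK;
  try
    for i := 1 to MaxChunks do
    begin
      Result := Reader.ReadChunk;
      if Result < 0 then
        Exit; // negative HRESULT means failure; finally still runs
    end;
  finally
    Inc(Cleanups); // stands in for releasing a sample or buffer
  end;
end;

var
  Reader: TFakeReader;
  Cleanups: Integer;
begin
  Cleanups := 0;
  Reader := TFakeReader.Create;
  try
    if ProcessAll(Reader, 5, Cleanups) < 0 then
      WriteLn('Stopped early, cleanup ran ', Cleanups, ' time(s)');
  finally
    Reader.Free;
  end;
end.
```

If nothing has to be released, the try..finally inside ProcessAll can simply be dropped, which is the variant Anders describes.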
Renate Schaaf, posted 4 hours ago

Hi Anders,

Thanks for that. I hate the format strings, because I can never remember the codes for the placeholders. I had already thought before that I should get used to them, though. Now I also see that I forgot to use IntToHex(hr,8) 🙂
Anders Melander, posted 2 hours ago

1 hour ago, Renate Schaaf said:
> I hate the format-strings, because I can never remember the code for the place-holders.

If only there was some magic key you could press in the editor to display help about various relevant topics... 🙂

I only ever use %s, %d, %n and %x, and I use those a lot, so that helps, but I sometimes need to consult that magic key when it comes to the precision or index specifiers.
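For reference, the placeholders mentioned above, including a precision and an index specifier, can be tried out with a small console program like this one. It is only a sketch: the call number, the procedure name and the HRESULT value are made-up sample data, and on newer Delphi versions without unit scope names configured the uses clause would need System.SysUtils instead of SysUtils.

```pascal
program FormatSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Count: Integer;
  ProcName: string;
  hr: HResult;
begin
  // Made-up sample values for illustration.
  Count := 3;
  ProcName := 'WriteAudio';
  hr := HResult($80004005);

  // %d = integer, %s = string, %x = hexadecimal.
  WriteLn(Format('Fail in call no. %d of %s with result $%x',
    [Count, ProcName, hr]));

  // Precision specifier: %.8x pads the hex value with leading zeros to at
  // least 8 digits, which is what IntToHex(hr, 8) does.
  WriteLn(Format('Small value, padded: $%.8x', [5]));

  // Index specifier: %1:s picks argument 1, %0:d picks argument 0,
  // so the arguments can appear in any order in the text.
  WriteLn(Format('%1:s failed in call no. %0:d', [Count, ProcName]));

  // Exception.CreateFmt takes the same format string and argument list.
  try
    raise Exception.CreateFmt('Fail in call no. %d of %s with result $%.8x',
      [Count, ProcName, hr]);
  except
    on E: Exception do
      WriteLn(E.Message);
  end;
end.
```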
Kas Ob., posted 2 hours ago

4 hours ago, Renate Schaaf said:
> Hi Kas, Good to see you again, and sorry for the long time of inactivity on my part. Thank you for the detailed input, which I need to digest first. Since you already invested so much thought, wouldn't you like to be a contributor? When I incorporate the changes you mention, I wouldn't even know how to list you as contributor. The issues you mention definitely need to be looked into. For the audio-part I was just glad it worked, and haven't put much thought into it lately. The wrong audio-duration was returned by some .vobs, which aren't really supported in the first place. The missing SafeRelease(pAudioSample) has caused memory leaks for me in a totally different context too, when I tried to write some code which simply plays an audio file through the default-device. Renate

I didn't contribute anything. Thank you very much for offering, but don't worry about this.

One more thing about the whole duration topic. To explain it, I want to go back many decades, to when the standard for the highest audio quality settled on 44100 Hz as CD quality. This is a strange number at first, but when you know how they came up with it, things get clearer. Read this: https://en.wikipedia.org/wiki/44,100_Hz#Recording_on_video_equipment

Things were very different back then: storing or buffering the audio was very expensive and complicated with the simple circuits available. So, to interleave the audio and the video, they needed a number that divides evenly for both the 50 Hz and the 60 Hz video systems with 3 samples per line, so that the samples could be interleaved with the line data of the video.

Fast forward a few decades, and we no longer have only the PAL and NTSC systems; we have many combinations of frame rates and sizes. Still, the most used standard frame rates are 23.976 and 29.97 (alongside the less used 24, 25, 30, 50, 60...). Strange? That question has to do with the old systems, and the internet has plenty of resources answering it. Yet web broadcasting and multiplexed streams changed things a lot, so we can't depend on these rates alone, and you will most likely not see 44100 as often as back then.

Anyway, in the past they changed the audio duration based on the video to be compliant. In modern times we have better sample rates, at least 48k, and technology that removes the need to output directly and lets us buffer ahead instead. The whole thing still needs syncing, though, and to make sure audio and video stay synced they should follow a rule: the audio should be aligned to samples per second, and the video should accommodate this, unlike what happened in the past.

So even the video duration should be a multiple of the audio sample duration. Consider this if you need your encoding 100% synced, or as good as it can be; in other words, correct the video duration too. Because the audio sample rate is a high number, you have more flexibility in the frame rate. The difference could be between 40 and 40.00002, and if your video is at 40.00002 I have never seen a player show it as 40.00002, always as 40. Yet this difference will prevent desyncing.

Hope that was clear, and thank you again for the work on this and for the offer.
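One possible reading of the last suggestion, making each video frame last a whole number of audio samples, is sketched below. This is an interpretation and not something the thread or the project spells out: SamplesPerFrame and AlignedFps are hypothetical names, and the frame rate is assumed to be given as a fraction FpsNum/FpsDen (for example 30000/1001 for 29.97).

```pascal
program FrameAlignSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Number of audio samples closest to the length of one video frame
// at FpsNum/FpsDen frames per second, rounded to the nearest sample.
function SamplesPerFrame(SampleRate, FpsNum, FpsDen: Integer): Int64;
begin
  Result := (Int64(SampleRate) * FpsDen + FpsNum div 2) div FpsNum;
end;

// Effective frame rate if every frame lasts exactly SamplesInFrame samples.
function AlignedFps(SampleRate: Integer; SamplesInFrame: Int64): Double;
begin
  Result := SampleRate / SamplesInFrame;
end;

procedure Show(SampleRate, FpsNum, FpsDen: Integer);
var
  N: Int64;
begin
  N := SamplesPerFrame(SampleRate, FpsNum, FpsDen);
  WriteLn(SampleRate, ' Hz, requested ', (FpsNum / FpsDen):0:5, ' fps: ',
    N, ' samples per frame, aligned rate ',
    AlignedFps(SampleRate, N):0:5, ' fps');
end;

begin
  Show(48000, 25, 1);        // divides evenly, nothing to correct
  Show(44100, 30, 1);        // divides evenly as well
  Show(48000, 30000, 1001);  // 29.97 fps does not divide evenly
end.
```

How large the correction turns out depends on the combination of sample rate and frame rate, so it is worth printing the aligned rate for the settings actually used before relying on it.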