Renate Schaaf

Bitmaps to Video for Media Foundation


I've just uploaded an update to my project 

 

https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation

 

What it does:
Contains a VCL class which encodes a series of bitmaps and video clips together with an audio file to video.

The result is an .mp4 file with H264 or H265 compression and AAC audio.

It uses Windows Media Foundation, which is usually included in Windows. Hardware encoding is supported if your graphics card can do it.

 

Requires:
Headers for Media Foundation from FactoryXCode: https://github.com/FactoryXCode/MfPack
Windows 10 or higher
An encoder (MF transform) for H264/H265; these usually come with the graphics driver
Delphi XE7 or higher, if I haven't messed it up again; I've only got the CE and Delphi 2006
(Win32 and Win64 should be working, but Win64 recently crashes for me with "The session was disconnected".)

 

The demo project shows some uses:
    Record a series of canvas drawings to video
    Make a slideshow from image files (.bmp, .jpg, .png, .gif) with music (.wav, .mp3, .wmv, ...) and two kinds of transitions
    Insert a video clip into a slideshow (anything that Windows can decode should work)
    Transcode a video file, including the first audio stream.

 

Improvements:
I think I now understand better how to feed frames to the encoder. With the right settings it makes stutter-free videos with good audio-video synchronization. It's now usable in my "big" project, and I no longer need to rely on the ffmpeg DLLs.

 

More info in changes.txt.

 

Just try it if you're interested; I'd be glad.

Renate

 



Hi @Renate Schaaf ,

 

A friendly reminder about something I pointed out in the past: the audio duration handling has a hidden problem. It might not be visible (OK, audible) now, but it makes the library not future-proof, and any change in the way the codec API works will cause either errors or desynced audio/video.

 

So here are my thoughts on this part,

about TBitmapEncoderWMF.WriteAudio: https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1523-L1618

1) Ditch "goto Done;" and use try..finally; it is safer, there is no need for goto here, and the loop is not complex, it is just an exit.

2) https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1685 doesn't check, fail, or warn about an audio failure.

3) After reading samples with pSourceReader.ReadSample (https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Source/uBitmaps2VideoWMF.pas#L1556-L1562) you should check the returned byte size: is it aligned with the requested audio format?

This is what brought up the last discussion, if my memory doesn't fail me: the audio duration should be aligned. In other words,

AudioBlock := nChannels * wBitsPerSample / 8 gives the exact amount in bytes that can't be divided any further, so any audio data passed around should be a multiple of AudioBlock. Since we almost always deal with integers,

AudioBlock := (nChannels * wBitsPerSample) div 8; should do.

Now, to do the extra check for AudioDuration, you can have it like this (in 100-ns units, Media Foundation's timebase):

AudioDuration := (BufferSize / AudioBlock) * (10000000 / nSamplesPerSec);
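The alignment rule and the duration check can be sketched like this (Python for illustration; the function names are mine, the parameter names mirror the WAVEFORMATEX fields):

```python
def audio_block_align(n_channels, bits_per_sample):
    # one block = one sample for every channel, in whole bytes
    return (n_channels * bits_per_sample) // 8

def audio_duration_100ns(buffer_size, n_channels, bits_per_sample, samples_per_sec):
    # duration of a buffer in 100-ns units, Media Foundation's timebase;
    # a buffer that is not block-aligned is rejected outright
    block = audio_block_align(n_channels, bits_per_sample)
    if buffer_size % block != 0:
        raise ValueError("audio buffer is not block-aligned")
    return (buffer_size // block) * 10_000_000 // samples_per_sec
```

For example, 2 channels at 16 bits give 4-byte blocks and 6 channels at 24 bits give 18-byte blocks, matching the sizes quoted further down in the thread; one second of 16-bit stereo at 48000 Hz (192000 bytes) comes out as exactly 10,000,000 units.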

 

The difference between audio and video I am sure you know a lot about, but maybe you haven't experienced or witnessed what happens when a codec starts to:

1) fail with errors,

2) desync the audio and video due to dropping the less-than-one-block remainder,

3) corrupt the quality with sound artefacts due to internal padding of the samples on its own.

Each of these is a bug that could appear in any codec; they all evolve and change as their implementations keep being optimized and worked on.

 

I remember this very clearly; it was a pain in the back with ASF and WMV: sometimes it works, and the next day it doesn't on the same Windows. The root cause was the block alignment. Even if the other codec handling the audio decoding made the mistake and returned a wrong size, you should hold on to the leftover and feed it later. Example: for 2 channels with 16-bit samples the block size is 4 bytes; for 6 channels and 24 bits it is 18 bytes. You can test different audio files, like 5.1 and 7.1 (6 channels and 8 channels), using samples from https://www.jensign.com/bdp95/7dot1voiced/index.html

 

Hope that helps.

 

PS: this part

    // fAudioDuration can be false!
    // if fAudioTime >= fAudioDuration then
    // fAudioDone := true;
    if fAudioDone then
      hr := pSinkWriter.NotifyEndOfSegment(fSinkStreamIndexAudio);
    // The following should not be necessary in Delphi,
    // since interfaces are automatically released,
    // but it fixes a memory leak when reading .mkv-files.
    SafeRelease(pAudioSample);

is disturbing:

1) The commented-out "if fAudioTime >= fAudioDuration then" is right and should be used, but "fAudioDuration can be false!"? I would love to hear how that happens.

2) "but it fixes a memory leak when reading .mkv-files." brings us back to (1) above: using try..finally is best and will prevent the memory leak. But such a case for .mkv files is strange and should be investigated more deeply, as it could be a serious problem and might cause a huge leak in the loop itself, depleting OS memory, especially for 64-bit.


Hi Kas,

Good to see you again, and sorry for the long time of inactivity on my part. Thank you for the detailed input, which I need to digest first. Since you have already invested so much thought, wouldn't you like to be a contributor? When I incorporate the changes you mention, I wouldn't even know how else to list you as a contributor. The issues you mention definitely need to be looked into. For the audio part I was just glad it worked, and haven't put much thought into it lately. The wrong audio duration was returned by some .vobs, which aren't really supported in the first place. The missing SafeRelease(pAudioSample) has caused memory leaks for me in a totally different context too, when I tried to write some code which simply plays an audio file through the default device.

 

Renate

1 hour ago, Kas Ob. said:

1) Ditch "goto Done;" and use try..finally; it is safer, there is no need for goto here, and the loop is not complex, it is just an exit.

There's not even a need for try..finally, since there are no resources to protect. Get rid of hr, assign the results to Result directly, and just exit.

 

Also, instead of:

raise Exception.Create('Fail in call nr. ' + IntToStr(Count) + ' of ' +
  ProcName + ' with result $' + IntToHex(hr));

I would use:

raise Exception.CreateFmt('Fail in call no. %d of %s with result %x', [Count, ProcName, hr]);

for readability.

Share this post


Link to post

Hi Anders,

Thanks for that. I hate format strings because I can never remember the codes for the placeholders. I had already thought before that I should get used to them, though. Now I also see that I forgot to use IntToHex(hr, 8) 🙂

1 hour ago, Renate Schaaf said:

I hate the format-strings, because I can never remember the code for the place-holders.

If only there were some magic key you could press in the editor to display help about various relevant topics... 🙂

I only ever use %s, %d, %n and %x, and I use those a lot, so that helps, but I sometimes need to consult that magic key when it comes to the precision or index specifiers.

4 hours ago, Renate Schaaf said:

Hi Kas,

Good to see you again, and sorry for the long time of inactivity on my part. Thank you for the detailed input, which I need to digest first. Since you already invested so much thought, wouldn't you like to be a contributor? When I incorporate the changes you mention, I wouldn't even know how to list you as contributor. The issues you mention definitely need to be looked into. For the audio-part I was just glad it worked, and haven't put much thought into it lately. The wrong audio-duration was returned by some .vobs, which aren't really supported in the first place. The missing SafeRelease(pAudioSample) has caused memory leaks for me in a totally different context too, when I tried to write some code which simply plays an audio file through the default-device.

 

Renate

I didn't contribute anything, and thank you very much for the offer, but don't worry about this.

 

One more thing about the whole duration issue. To explain it, I want to go back many decades, to when the standard for the highest audio quality chose 44100 Hz as CD quality. This is a strange number at first; when you know how they came up with it, things get clearer. Read this: https://en.wikipedia.org/wiki/44,100_Hz#Recording_on_video_equipment

So, to interleave the audio and video (things were very different back then, and storing or buffering the audio was expensive and complicated with the simple circuits available), they needed a number that could divide evenly and support both 50 fps and 60 fps with 3 samples per line, so they could interleave the samples with the line data of the video.

Fast forward decades, and we no longer have only the PAL and NTSC systems; we have many combinations of frame rates and sizes, but the most used standard frame rates are still 23.976 and 29.97 (among the less used 24, 25, 30, 50, 60, ...). Strange? That question has to do with the old systems, and the internet has many resources answering it. Still, web broadcasting and multiplexed streams have changed things a lot, so we can't depend on these alone, and you rarely see 44100 used the way it was back then.

 

Anyway, in the past they changed the audio duration based on the video to be compliant. In modern times, with better sample rates (at least 48k) and technology that buffers ahead instead of outputting directly, the whole thing still needs syncing. To make sure the audio and video stay synced, they should follow a rule: the audio should be aligned to whole samples per second, and the video should accommodate this, unlike what happened in the past.

 

So even the video duration should be a multiple of the audio sample duration. Consider this if you need your encoding 100% synced, or as close as it can get; in other words, correct the video duration too. Since the audio sample rate is a high number, you have flexibility in the frame rate: the difference could be between 40 and 40.00002, and if your frame duration is 40.00002, I have never seen a player show it as anything but 40. This small difference will prevent desyncing.
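The rule above can be sketched in a few lines (Python for convenience; the function name is mine): snap each video frame duration to a whole number of audio samples.

```python
def snap_to_sample_grid(frame_duration_s, sample_rate):
    # express the frame duration as a whole number of audio samples,
    # then convert back: the result is a multiple of the sample duration
    samples = round(frame_duration_s * sample_rate)
    return samples / sample_rate
```

At 48000 Hz a nominal 25 fps frame (0.04 s) is exactly 1920 samples, so it snaps to itself, and a slightly-off duration such as 0.0400002 s snaps back to 0.04: exactly the kind of invisible correction described above.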

 

Hope that was clear, and thank you again for the work on this and for the offer.

5 hours ago, Kas Ob. said:

So even the video duration should be a multiple of the audio sample duration. Consider this if you need your encoding 100% synced, or as close as it can get; in other words, correct the video duration too. Since the audio sample rate is a high number, you have flexibility in the frame rate: the difference could be between 40 and 40.00002, and if your frame duration is 40.00002, I have never seen a player show it as anything but 40. This small difference will prevent desyncing.

Hope that was clear, and thank you again for the work on this and for the offer.

If I understand this right, I should match the video timestamp and duration to the closest block-align boundary of the audio? If the difference in frame rate is really that negligible, that should be doable, if I can get the math right :). Talk about not contributing: you just forced a new way of seeing things down my throat, not a small achievement.

9 hours ago, Renate Schaaf said:

Talk about not contributing, you just forced a new way of seeing things down my throat, not a small achievement.

Oh dear, the last thing I want is to waste your time or complicate things.

 

9 hours ago, Renate Schaaf said:

If the difference in frame rate is really that negligible, that should be doable,

OK, here is a little misunderstanding: the difference is always so small that it might be negligible, but it might not be. To get to this, I want you to spend a few minutes on the page below. But before you read it, a caveat:

This is about ffmpeg's playing style and how a player tries to fix the syncing or enforce the best result. The tutorial is nice, but its relevance to this case is limited; it is a nice read, yet it might confuse the reader, so I will point to the needed paragraphs:

http://dranger.com/ffmpeg/tutorial05.html

 

Quote

First is the issue of knowing when the next PTS will be. Now, you might think that we can just add the video rate to the current PTS — and you'd be mostly right. However, some kinds of video call for frames to be repeated. This means that we're supposed to repeat the current frame a certain number of times. This could cause the program to display the next frame too soon. So we need to account for that.

The second issue is that as the program stands now, the video and the audio chugging away happily, not bothering to sync at all. We wouldn't have to worry about that if everything worked perfectly. But your computer isn't perfect, and a lot of video files aren't, either. So we have three choices: sync the audio to the video, sync the video to the audio, or sync both to an external clock (like your computer). For now, we're going to sync the video to the audio.

That is from the player's point of view. All players have limited ways to fix it at runtime, so the better the timing from the encoder, the better the player will play. By "player" I mean the decoder and any player out there, advanced or stupid. In other words, if the encoder does the right thing, then the decoder will do the right thing and give the right result (synced audio and video).

 

The main issue, as I pointed out with PAL/NTSC earlier, is the difference between audio and video. Just a reminder: video can have an arbitrary duration, can be delayed or sped up by a fraction of a second, and human eyes most likely won't notice. Audio, on the other hand, we can't stop playing at intervals: skipping 1/1000 of a second every 1/20 of a second will generate acoustic artefacts and will be audible. And if we oversupply samples, the player will happily play them and cause desynced audio and video. With that in mind, let's see how and what to adjust.

 

9 hours ago, Renate Schaaf said:

If I understand this right, I should match the video timestamp and duration to the closest block-align boundary of the audio?

Right, that is it.

 

I think you have already gone down the wrong path and are overengineering the whole thing. This is not a rewrite; in fact (I think) it will be less than 100 lines of changes here and there, small ones, including altering some existing lines.

 

I went again to try to compile the demo on my XE8, and there are small bugs:

1) LogicalCompare doesn't compile, as StrCmpLogicalW is not found.

2) IntToHex must have a second parameter.

3) Removed TArray<string>. This one is very strange: it causes an access violation, as the array is not initialized, and the AV appears in the caller because it corrupts the stack. Or maybe I am living under a rock and the modern Delphi compiler does initialize it.

procedure TDirectoryTree.GetAllFiles(
  const aStringList: TStringlist;
  const aFileMask:   string);
..
{$IF COMPILERVERSION < 30.0}
  i: integer;
  //Strings: TArray<string>;
  ClassicStrings: TStringDynArray;
{$ENDIF}
..
    // Copy all fields to the new array
    for i := Low(ClassicStrings) to High(ClassicStrings) do
      aStringList.Add(ClassicStrings[i]);   // <--
      //Strings[i] := ClassicStrings[i];    // <--

    //aStringList.AddStrings(Strings);      // <--

 

Now it does compile. The steps I did:

1) Changed nothing in the settings, just selected SlideShow..

2) I have one icon in that path and it is selected by default

3) Checked "Display dialog to add audio..."

4) Checked "Adjust presentation time to audio time"

5) Clicked "Make slideshow"

6) It asked for an audio file, so I selected "Nums_5dot1_24_48000.wav" from the earlier post, the file with 6 channels (5.1)

7) A message popped up with "Calculated image time: 7049 ms"; this is strange, as the audio file is reported as 9 seconds by my MPC-HC player

Clicking Yes reports:

Slideshow time: 9049 ms (00:00:09 [h:m:s])
Output video duration: 9066 ms (00:00:09 [h:m:s])
Audio duration: 9049 ms
File size: 0.23 MB

and the result is this video file: Example_H264.mp4 Example_H264.zip (I had to compress it to prevent the forum from re-encoding or messing with it)

 

It looks nice, but there is no way to check for problems without deeper debugging!

So I downloaded ffmpeg-git-full.7z from https://www.gyan.dev/ffmpeg/builds/

 

Then I ran this command to see the frames:

Quote

ffprobe -show_frames Example_H264.mp4 > AVframes.txt

Here is the output file: AVframes.txt

 

The frames are interleaved as they should be, some video frames followed by audio frames; the pattern is correct and very similar to any other video file, BUT...

looking at the last audio frame and comparing it with the last video frame shows this:

Quote

[FRAME]
media_type=video
stream_index=0
key_frame=0
pts=271000
pts_time=9.033333
pkt_dts=271000
pkt_dts_time=9.033333
best_effort_timestamp=271000
best_effort_timestamp_time=9.033333
duration=1000
duration_time=0.033333
pkt_pos=233240
pkt_size=238
width=1280
height=720
crop_top=0
crop_bottom=0
crop_left=0
crop_right=0
pix_fmt=yuv420p
sample_aspect_ratio=1:1
pict_type=P
interlaced_frame=0
top_field_first=0
lossless=0
repeat_pict=0
color_range=unknown
color_space=unknown
color_primaries=unknown
color_transfer=unknown
chroma_location=left
[/FRAME]
[FRAME]
media_type=audio
stream_index=1
key_frame=1
pts=434176
pts_time=9.045333
pkt_dts=434176
pkt_dts_time=9.045333
best_effort_timestamp=434176
best_effort_timestamp_time=9.045333
duration=1024
duration_time=0.021333
pkt_pos=235043
pkt_size=313
sample_fmt=fltp
nb_samples=1024
channels=2
channel_layout=stereo
[/FRAME]

This is clearly desyncing. Small or big is an argument, but let's not forget that the whole video is 9 seconds and it has already drifted by 12 ms. With the same settings and input but an hour-long video, the drift will be (with fast math) around 4.8 seconds!
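The fast math can be sanity-checked (Python for convenience; the 12 ms is the gap between the last video and audio pts_time values in the ffprobe dump above, and the extrapolation assumes the drift accumulates linearly):

```python
last_video_pts = 9.033333  # pts_time of the last video frame, from ffprobe
last_audio_pts = 9.045333  # pts_time of the last audio frame, from ffprobe

drift_s = last_audio_pts - last_video_pts   # ~0.012 s after 9 s of video
drift_per_hour_s = drift_s / 9.0 * 3600     # linear extrapolation to 1 hour
```

That gives roughly 4.8 seconds of drift per hour, as stated.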

 

I will submit this reply before any blackout that would make me cry; out of laziness and frustration I might ditch the whole subject, as I received an SMS informing me of a planned blackout.


FWIW & FYI:

 

54 minutes ago, Kas Ob. said:

IntToHex must have a second parameter

Newer versions of Delphi have IntToHex overloads for NativeInt to match the size of a pointer. I think they were introduced in Delphi 10.

 

52 minutes ago, Kas Ob. said:

Removed TArray<string>. This one is very strange: it causes an access violation, as the array is not initialized, and the AV appears in the caller because it corrupts the stack. Or maybe I am living under a rock and the modern Delphi compiler does initialize it.

TArray<T>, being a dynamic array, is a managed type, and string is too, so TArray<string> should be initialized automatically. Probably a compiler bug.


Now I have shown that there is a problem with syncing, confirming my theoretical doubts from looking at how the audio frame duration versus the video frame duration was calculated.

And to reiterate how important this is: if you take a long video, a fully correct and synced one, split/extract the audio and video streams into files, and then use the same/similar code to generate an mkv again, the result will not be the same as the original; they will be desynced. (There is a chance the result comes out synced, but that would be pure luck based on the parameters.)

 

Now to the math reason why this happens in my generated demo video:

Video: MPEG4 Video (H264) 1280x720 30fps 59kbps [V: h264 high L3.1, yuv420p, 1280x720, 59 kb/s]
Audio: AAC 48000Hz stereo 148kbps [A: aac lc, 48000 Hz, stereo, 148 kb/s]

1) The audio format of the input is irrelevant here, as it has clearly been re-encoded from 6 channels to 2 channels.

The audio is 48000 Hz, so let's get the duration of one audio sample:

  AudioSampleDuration := 1.0 / 48000;   // 20.833 microseconds (1 microsecond = 1/1000000 of a second)
  VideoFps := 30.0;
  VideoFrameDuration := 1.0 / VideoFps; // 0.033333 s

Now we need to decide how many audio samples are needed to accommodate the VideoFrameDuration. We will interleave them after all, so we can put in video frames and compensate with a wider interleave; in other words, we don't need to add audio frames after every single video frame.

  AudioSamplesPerVideoF := Round(VideoFrameDuration / AudioSampleDuration); // ~1600 samples per video frame
  AudioFrameDuration := AudioSamplesPerVideoF * AudioSampleDuration;        // 0.033333 s
  AdjustedVideoFPS := 1.0 / AudioFrameDuration;                             // ~30.0003 fps
  // so the frame rate ratio (MF_MT_FRAME_RATE) should be
  fpsNum := 300003;
  fpsDen := 10000;

See: if we use 0.03333 instead of 0.033333, the result will be 30.003 fps. It just happens that for these exact parameters the difference is so small.

Anyway, this shows the small desync, and the way audio frames are requested and written causes the little 12 ms drift, as we have 30*9 = 270 frames. The math above doesn't explain the whole 12 ms, but it (I hope) clears up the math I talked about earlier.
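For what it's worth, the sensitivity to truncated decimals can be checked directly (a small check in Python; with exact arithmetic 48000 Hz happens to divide evenly by 30 fps, so the ~30.0003 figure comes from the truncated 0.033333 value, as noted above):

```python
# exact arithmetic: 48000 Hz / 30 fps is a whole number of samples per frame
samples_per_frame = round((1 / 30) / (1 / 48000))  # 1600 samples, exact
fps_exact = 48000 / samples_per_frame              # exactly 30.0

# the same computation with truncated decimal durations
fps_from_6_digits = 1 / 0.033333                   # ~30.0003
fps_from_5_digits = 1 / 0.03333                    # ~30.003
```

So for other frame rates (23.976, 29.97, ...) the ratio does not divide evenly, and that is where the alignment work really matters.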

 

Now, to solve and prevent this we have two ways, but first we must keep in mind two essential things:

1) Video frames and their durations are very forgiving, so increasing one by 0.01 ms or 0.001 ms is OK. As we saw, ffmpeg runs its own algorithm to correct things, as long as the increase is momentary and not accumulated on every frame.

2) Audio is not forgiving. If we are short (under-buffer) by even one sample, it will be heard: with audio you can't just stop playing, or the speaker stops for a fraction of a second and corrupts the waveform; in short skips it might be hissing, buzzing, or all sorts of acoustic artefacts. And if we over-buffer, the audio will be desynced; the player can compensate only so much, then it gives up. Under-buffering will also desync, just without the artefacts.

 

The solution is either:

1) Adjust the video FPS as above to something like 30.0003, but even this is not enough.

2) Feed audio frames with dynamic length/size; this might also allow us to not touch the video FPS.

But HOW?

The answer is to refactor your loops. We calculated the real, synced video frame duration, and we have the audio frame duration, but we must account for the audio sample duration on top.

In other words, this can be solved within the audio writing loop: if we are under the exact duration, we add a sample or more, but we keep a record of t, the time by which we overflow. It is better to always overflow by at least one sample; then on the next pass of the writing loop we use that record and write less by t, and then rinse and repeat.

In other words, the audio frames will not and should not have a fixed sample count; it will be more, followed by fewer, then more, then fewer...
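The rinse-and-repeat idea can be sketched as a carry loop (Python for brevity; the names are mine, not from the library). Each video frame gets however many audio samples bring the running audio time back level with the running video time:

```python
def audio_chunk_sizes(fps_num, fps_den, sample_rate, n_frames):
    # samples to write per video frame so that the running audio time
    # tracks the running video time; the remainder is carried forward
    sizes = []
    written = 0  # total samples written so far
    for frame in range(1, n_frames + 1):
        # exact video time after this frame is frame * fps_den / fps_num seconds
        target = round(frame * fps_den * sample_rate / fps_num)
        sizes.append(target - written)
        written = target
    return sizes
```

At 29.97 fps (30000/1001) and 48000 Hz each frame spans 1601.6 samples, so the chunk sizes alternate 1602, 1601, 1602, ... and the total never drifts; at an even 30 fps they are a constant 1600.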

 

And that is it, this should ensure the output video is synced.

 

Hope that is clear enough.

 

Again, don't overthink it, because I think you went in that direction. Just close your eyes and imagine a TV signal, PAL/NTSC, and the interleaving. Back then there was one format; now we have different numbers, so we compensate by creating a dancing equilibrium.

And again, the correction should be on the video side. But I think this very suggestion of dancing or twitching audio frame sizes should be logically and factually correct, unless some codec doesn't accept arbitrary audio frames and will only accept a constant frame duration; in that case the video frame duration must be corrected, as it is the forgiving one. This is codec hell.

From my understanding of the current Media Foundation, it is forgiving about the audio frame duration.

 

PS:

1) The speed of video seeking in a player is highly affected by this overflow and underflow, so the more accurate and closer to correct we are, the faster any player will be.

2) There is nothing you can add to these loops that will affect the speed of encoding or playing. More math lines, even long ones, have zero impact on the whole process; decoding video and audio takes hundreds of millions of CPU cycles, so there is no sense in cutting corners in the feeding algorithm, no matter how slow it is or how much math and variable storing is involved.

PS2: Just don't feed or accept audio buffers (in bytes) in any direction (in or out) if they are not aligned to the bytes per block; it will ruin everything. I can't emphasize enough how important this is, so more checks are needed, and the best you can do is refuse to proceed.

2 minutes ago, Anders Melander said:

TArray<T>, being a dynamic array, is a managed type, and string is too, so TArray<string> should be initialized automatically. Probably a compiler bug.

Even without SetLength?

Does the following work?!

procedure Test;
var
  Strings: TArray<string>;
begin
  Strings[5] := 'Hi';
end;

Because it is used there similarly to this.

Just now, Kas Ob. said:

Does the following work ?!


procedure Test;
var
  Strings: TArray<string>;
begin
  Strings[5] := 'Hi';
end;

Because it is used there similar to this.

🙂

No, of course that doesn't work.

A dynamic array is initialized to an empty array, so there needs to be a SetLength first.

1 minute ago, Anders Melander said:

No, of course that doesn't work

A dynamic array is initialized to an empty array so there needs to be a SetLength first.

Well, I am a bad writer. I said strange because this is exactly how it is used: https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Utilities/uDirectoryTree.pas#L313-L314

And how it didn't raise an AV in Renate's debugger, that is the strange thing.

7 minutes ago, Kas Ob. said:

Well, I am a bad writer. I said strange because this is exactly how it is used: https://github.com/rmesch/Bitmaps2Video-for-Media-Foundation/blob/main/Utilities/uDirectoryTree.pas#L313-L314

And how it didn't raise an AV in Renate's debugger, that is the strange thing.

Likely because Renate is using a newer version of Delphi. As far as I can tell, the faulty code was contributed by someone else.


Thanks, everybody. Now I have a lot to think about, a great chance to expand my horizon at the age of 74 :). I'll fix the code. But then I need a bit of time to think. The info is great.

1 minute ago, Kas Ob. said:

And how it didn't raise an AV in Renate's debugger, that is the strange thing.

Because my poor debugger didn't run the code, because I didn't tell it to do so. I pasted that compatibility code in without checking, and probably missed another piece. A mistake I won't make again. So I need to disable LogicalCompare for more compiler versions, or write a header for StrCmpLogicalW.

1 hour ago, Kas Ob. said:

A message popped up with "Calculated image time: 7049 ms"; this is strange, as the audio file is reported as 9 seconds by my MPC-HC player

presentation time = image time + effect time (2000).

On 6/25/2025 at 10:47 AM, Kas Ob. said:

I don't think that's quite true: if it fails, the rest of WriteOneFrame isn't executed, and in line 1713 an exception is raised with error code hr. I could translate it into an EAudioFormatException, though, at the spot you indicate.

 

On 6/25/2025 at 10:47 AM, Kas Ob. said:

The commented "if fAudioTime >= fAudioDuration then" is right and should be used

It was meant as an extra safety check, since the code already checks for EndOfStream, and that hasn't failed so far. But I've put it back in.

 

 

1 minute ago, Renate Schaaf said:

I don't think that's quite true,

Yes, my mistake, I saw it as being inside the loop. Stupid hasty reading.

 

2 minutes ago, Renate Schaaf said:
On 6/25/2025 at 11:47 AM, Kas Ob. said:

The commented "if fAudioTime >= fAudioDuration then" is right and should be used

It was meant as an extra safety check, since the code already checks for EndOfStream, and that hasn't failed so far. But I've put it back in

If you are going to adjust the audio frame duration, then it will be used, but you must save/remember the difference for the next frame to subtract/reduce; in its current usage it is redundant.

Switching to dynamic audio frames will be way better and more accurate, and will prevent desyncing.

10 hours ago, Kas Ob. said:

And to reiterate how important this is: if you take a long video, a fully correct and synced one, split/extract the audio and video streams into files, and then use the same/similar code to generate an mkv again, the result will not be the same as the original; they will be desynced. (There is a chance the result comes out synced, but that would be pure luck based on the parameters.)

With a little change you can perform that test from within the demo, I think. Just put a little change into TBitmapEncoderWMF.AddVideo:

 

procedure TBitmapEncoderWMF.AddVideo(
  const
  VideoFile: string;
  TransitionTime: integer = 0;
  crop:           boolean = false;
  stretch:        boolean = false);
var
  VT: TVideoTransformer;
  bm: TBitmap;
  TimeStamp, Duration, VideoStart: int64;
begin
  if not fInitialized then
    exit;
  VT := TVideoTransformer.Create(
    VideoFile,
    fVideoHeight,
    fFrameRate);
  try
    bm := TBitmap.Create;
    try
      if not VT.NextValidSampleToBitmap(bm, TimeStamp, Duration) then
        exit;
      if TransitionTime > 0 then
        CrossFadeTo(
          bm,
          TransitionTime,
          crop,
          stretch);
      VideoStart := fWriteStart;
      // fill gap at beginning of video stream
      if TimeStamp > 0 then
        AddStillImage(
          bm,
          Trunc(TimeStamp / 10000),
          crop,
          stretch);
      while (not VT.EndOfFile) and fInitialized do
      begin
        BitmapToRGBA(
          bm,
          fBmRGBA,
          crop,
          stretch);
        bmRGBAToSampleBuffer(fBmRGBA);
        // !!!!! Change is here for extra hard sync-check:
        // WriteOneFrame(
        // VideoStart + TimeStamp,
        // Duration);

        // Write the decoded video stream in exactly the same way as AddFrame would.
        // I.e. with the same timestamps, not taking any timestamps from the
        // video-input
        WriteOneFrame(
          fWriteStart,
          fSampleDuration);
        if not VT.NextValidSampleToBitmap(bm, TimeStamp, Duration) then
          Break;
      end;
      // FrameCount*FrameTime > Video-end? (shouldn't differ by much)
      // if fWriteStart > VideoStart + TimeStamp + Duration then
      // Freeze((fWriteStart - VideoStart - TimeStamp - Duration) div 10000);
    finally
      bm.Free;
    end;
  finally
    VT.Free;
  end;
end;

Then transcode a movie on the demo tab "Use TBitmapEncoderWMF as a transcoder". It uses the procedure TranscodeVideoFile, treating the video and audio streams of an input video as totally independent inputs. AddVideo decodes the video stream into a stream of bitmaps, and the input video is used again as the audio file. I encoded 40 minutes of "Fellowship of the Ring" this way and did not see any desyncing. You'll probably say that's no proof, and you'd be right, but it might be an indication that the problem isn't as severe.

Or the video player is just very good at making something usable out of the input.

On 6/25/2025 at 1:02 PM, Anders Melander said:

I would use:


raise Exception.CreateFmt('Fail in call no. %d of %s with result %x', [Count, ProcName, hr]);

for readability.

Hi, Anders,

CreateFmt internally uses

 

constructor Exception.CreateFmt(const Msg: string;
  const Args: array of const);
begin
  FMessage := Format(Msg, Args);
end;

and the help says that this version of Format isn't thread-safe, since it uses the locale for the decimal separator. Now, I'm not using decimal separators here, and I guess once the exception is raised in a thread, thread-safety doesn't really matter anymore?

 

Another thing: Is %x.8 doing the same as IntToHex(hr,8)?

 

Renate

4 minutes ago, Renate Schaaf said:

help says that this version of Format isn't threadsafe

"Not thread-safe" in this case doesn't mean crash and burn. It just means that if one thread modifies the global FormatSettings then it will affect all other threads also using it.

Hardly a problem - even if you did output floating point values in the exception message.

