Renate Schaaf

September 14, 2022

On 8/13/2022 at 3:56 PM, Tom F said:

Is there a way to show the "old style" (with a tree) when using styles??

You can add the following line e.g. in the OnCreate of the main form:

TStyleManager.SystemHooks := TStyleManager.SystemHooks - [shDialogs];

VCL.Themes needs to be in the uses list. The drawback is, that the style is also removed from all message dialogs. But I find it easier to make your own message boxes than your own file dialogs.

May 20, 2022

1 hour ago, Fr0sT.Brutal said:

Do you mean these issues exist inside the DLL itself?

No. The functions have just not been used not quite correctly in the example procedure that encodes a file. I should have made that more clear in my post.

May 20, 2022

29 minutes ago, David Heffernan said:

Did you submit a PR for these changes?

No. I'd like to find out if this really works in all cases, first. Then I'll have to teach myself how to do PRs :).

May 19, 2022

I spent some time to have Delphi interface correctly with the Lame-encoder-DLL, so I thought it a good idea to share the result, since I also could not find any good Delphi-code for this on the net.

The Lame-source comes with a rudimentary Delphi-header-file, but this has several issues, which I have tried to fix:

The file references unnecessary stuff, preventing compilation, easy to fix.
The encoding starts at the beginning of the wave-file, thereby encoding the header. This gives a noise at the beginning and can switch the stereo-channels. Fix: Offset the source into the data-section of the wave-file. Since this offset can vary, I have used the utility functions by Kambiz R. Khojasteh (http://www.delphiarea.com) to retrieve the necessary info using WinApi.MMSystem.
Lame suggests writing a VBR-Header to the file, even though it's CBR, I have changed the routine accordingly. This way devices can e.g. figure out the duration of the mp3-audio more easily.
Instead of file-handles I'm using TFileStream, that seems to speed up encoding considerably.

Usage: EncodeWavToMP3(WaveFile, MP3File, Bitrate)

WaveFile needs to be 16-bit Stereo, but that could be adjusted.

Bitrate is a constant bitrate, for example 128. Support for VBR could be added.

If you use it and find something wrong, I'd like to know 🙂

Here is the unit:

unit MP3ExportLame;

interface

Uses System.SysUtils, WinApi.Windows, System.Classes;

type
  // type definitions

  PHBE_STREAM = ^THBE_STREAM;
  THBE_STREAM = LongWord;
  BE_ERR = LongWord;

const
  // encoding formats

  BE_CONFIG_MP3 = 0;
  BE_CONFIG_LAME = 256;

  // error codes

  BE_ERR_SUCCESSFUL: LongWord = 0;
  BE_ERR_INVALID_FORMAT: LongWord = 1;
  BE_ERR_INVALID_FORMAT_PARAMETERS: LongWord = 2;
  BE_ERR_NO_MORE_HANDLES: LongWord = 3;
  BE_ERR_INVALID_HANDLE: LongWord = 4;

   // format specific variables

  BE_MP3_MODE_STEREO = 0;
  BE_MP3_MODE_DUALCHANNEL = 2;
  BE_MP3_MODE_MONO = 3;

  // other constants

  BE_MAX_HOMEPAGE = 256;

type

  TMP3 = packed record
    dwSampleRate: LongWord;
    byMode: Byte;
    wBitRate: Word;
    bPrivate: LongWord;
    bCRC: LongWord;
    bCopyright: LongWord;
    bOriginal: LongWord;
  end;

  TLHV1 = packed record
    // STRUCTURE INFORMATION
    dwStructVersion: DWORD;
    dwStructSize: DWORD;

    // BASIC ENCODER SETTINGS
    dwSampleRate: DWORD; // ALLOWED SAMPLERATE VALUES DEPENDS ON dwMPEGVersion
    dwReSampleRate: DWORD; // DOWNSAMPLERATE, 0=ENCODER DECIDES
    nMode: Integer;
    // BE_MP3_MODE_STEREO, BE_MP3_MODE_DUALCHANNEL, BE_MP3_MODE_MONO
    dwBitrate: DWORD; // CBR bitrate, VBR min bitrate
    dwMaxBitrate: DWORD; // CBR ignored, VBR Max bitrate
    nQuality: Integer; // Quality setting (NORMAL,HIGH,LOW,VOICE)
    dwMpegVersion: DWORD; // MPEG-1 OR MPEG-2
    dwPsyModel: DWORD; // FUTURE USE, SET TO 0
    dwEmphasis: DWORD; // FUTURE USE, SET TO 0

    // BIT STREAM SETTINGS
    bPrivate: LONGBOOL; // Set Private Bit (TRUE/FALSE)
    bCRC: LONGBOOL; // Insert CRC (TRUE/FALSE)
    bCopyright: LONGBOOL; // Set Copyright Bit (TRUE/FALSE)
    bOriginal: LONGBOOL; // Set Original Bit (TRUE/FALSE_

    // VBR STUFF
    bWriteVBRHeader: LONGBOOL; // WRITE XING VBR HEADER (TRUE/FALSE)
    bEnableVBR: LONGBOOL; // USE VBR ENCODING (TRUE/FALSE)
    nVBRQuality: Integer; // VBR QUALITY 0..9

    btReserved: array [0 .. 255] of Byte; // FUTURE USE, SET TO 0
  end;

  TAAC = packed record
    dwSampleRate: LongWord;
    byMode: Byte;
    wBitRate: Word;
    byEncodingMethod: Byte;
  end;

  TFormat = packed record
    case Byte of
      1:
        (mp3: TMP3);
      2:
        (lhv1: TLHV1);
      3:
        (aac: TAAC);
  end;

  TBE_Config = packed record
    dwConfig: LongWord;
    format: TFormat;
  end;

  PBE_Config = ^TBE_Config;

  TBE_Version = record
    byDLLMajorVersion: Byte;
    byDLLMinorVersion: Byte;

    byMajorVersion: Byte;
    byMinorVersion: Byte;

    byDay: Byte;
    byMonth: Byte;
    wYear: Word;

    zHomePage: Array [0 .. BE_MAX_HOMEPAGE + 1] of Char;
  end;

  PBE_Version = ^TBE_Version;


//Headers for Lame_enc.dll (ver. 3.100)

Function beInitStream(var pbeConfig: TBE_Config; var dwSample: LongWord;
  var dwBufferSize: LongWord; var phbeStream: THBE_STREAM): BE_ERR; cdecl;
  external 'Lame_enc.dll';
Function beEncodeChunk(hbeStream: THBE_STREAM; nSamples: LongWord; var pSample;
  var pOutput; var pdwOutput: LongWord): BE_ERR; cdecl; external 'Lame_enc.dll';
Function beDeinitStream(hbeStream: THBE_STREAM; var pOutput;
  var pdwOutput: LongWord): BE_ERR; cdecl; external 'Lame_enc.dll';
Function beCloseStream(hbeStream: THBE_STREAM): BE_ERR; cdecl;
  external 'Lame_enc.dll';
Procedure beVersion(var pbeVersion: TBE_Version); cdecl;
  external 'Lame_enc.dll';
// Added header for beWriteVBRHeader
Procedure beWriteVBRHeader(MP3FileName: pAnsiChar); cdecl;
  external 'Lame_enc.dll';

Procedure EncodeWavToMP3(WaveFile, MP3File: string; BitRate: Integer);
// BitRate 128 192 256 etc.

implementation

uses WinApi.MMSystem;

{ ---------------------------------------- }

{ The following functions retrieve the necessary info from the input-wave-file. }
{ Source: }
{ WaveUtils - Utility functions and data types }
{ by Kambiz R. Khojasteh }
{ }
{ kambiz@delphiarea.com }
{ http://www.delphiarea.com }

function mmioStreamProc(lpmmIOInfo: PMMIOInfo; uMsg, lParam1, lParam2: DWORD)
  : LRESULT; stdcall;
var
  Stream: TStream;
begin
  if Assigned(lpmmIOInfo) and (lpmmIOInfo^.adwInfo[0] <> 0) then
  begin
    Stream := TStream(lpmmIOInfo^.adwInfo[0]);
    case uMsg of
      MMIOM_OPEN:
        begin
          if TObject(lpmmIOInfo^.adwInfo[0]) is TStream then
          begin
            Stream.Seek(0, SEEK_SET);
            lpmmIOInfo^.lDiskOffset := 0;
            Result := MMSYSERR_NOERROR;
          end
          else
            Result := -1;
        end;
      MMIOM_CLOSE:
        Result := MMSYSERR_NOERROR;
      MMIOM_SEEK:
        try
          if lParam2 = SEEK_CUR then
            Stream.Seek(lpmmIOInfo^.lDiskOffset, SEEK_SET);
          Result := Stream.Seek(lParam1, lParam2);
          lpmmIOInfo^.lDiskOffset := Result;
        except
          Result := -1;
        end;
      MMIOM_READ:
        try
          Stream.Seek(lpmmIOInfo^.lDiskOffset, SEEK_SET);
          Result := Stream.Read(Pointer(lParam1)^, lParam2);
          lpmmIOInfo^.lDiskOffset := Stream.Seek(0, SEEK_CUR);
        except
          Result := -1;
        end;
      MMIOM_WRITE, MMIOM_WRITEFLUSH:
        try
          Stream.Seek(lpmmIOInfo^.lDiskOffset, SEEK_SET);
          Result := Stream.Write(Pointer(lParam1)^, lParam2);
          lpmmIOInfo^.lDiskOffset := Stream.Seek(0, SEEK_CUR);
        except
          Result := -1;
        end
    else
      Result := MMSYSERR_NOERROR;
    end;
  end
  else
    Result := -1;
end;

function OpenStreamWaveAudio(Stream: TStream): HMMIO;
var
  mmIOInfo: TMMIOINFO;
begin
  FillChar(mmIOInfo, SizeOf(mmIOInfo), 0);
  mmIOInfo.pIOProc := @mmioStreamProc;
  mmIOInfo.adwInfo[0] := DWORD(Stream);
  Result := mmioOpen(nil, @mmIOInfo, MMIO_READWRITE);
end;

function GetWaveAudioInfo(mmIO: HMMIO; var pWaveFormat: PWaveFormatEx;
  var DataSize, DataOffset: DWORD): Boolean;

  function GetWaveFormat(const ckRIFF: TMMCKInfo): Boolean;
  var
    ckFormat: TMMCKInfo;
  begin
    Result := False;
    ckFormat.ckid := mmioStringToFOURCC('fmt', 0);
    if (mmioDescend(mmIO, @ckFormat, @ckRIFF, MMIO_FINDCHUNK)
      = MMSYSERR_NOERROR) and (ckFormat.cksize >= SizeOf(TWaveFormat)) then
    begin
      if ckFormat.cksize < SizeOf(TWaveFormatEx) then
      begin
        GetMem(pWaveFormat, SizeOf(TWaveFormatEx));
        FillChar(pWaveFormat^, SizeOf(TWaveFormatEx), 0);
      end
      else
        GetMem(pWaveFormat, ckFormat.cksize);
      Result := (mmioRead(mmIO, pAnsiChar(pWaveFormat), ckFormat.cksize)
        = Integer(ckFormat.cksize));
    end;
  end;

  function GetWaveData(const ckRIFF: TMMCKInfo): Boolean;
  var
    ckData: TMMCKInfo;
  begin
    Result := False;
    ckData.ckid := mmioStringToFOURCC('data', 0);
    if (mmioDescend(mmIO, @ckData, @ckRIFF, MMIO_FINDCHUNK) = MMSYSERR_NOERROR)
    then
    begin
      DataSize := ckData.cksize;
      DataOffset := ckData.dwDataOffset;
      Result := True;
    end;
  end;

var
  ckRIFF: TMMCKInfo;
  OrgPos: Integer;
begin
  Result := False;
  OrgPos := mmioSeek(mmIO, 0, SEEK_CUR);
  try
    mmioSeek(mmIO, 0, SEEK_SET);
    ckRIFF.fccType := mmioStringToFOURCC('WAVE', 0);
    if (mmioDescend(mmIO, @ckRIFF, nil, MMIO_FINDRIFF) = MMSYSERR_NOERROR) then
    begin
      pWaveFormat := nil;
      if GetWaveFormat(ckRIFF) and GetWaveData(ckRIFF) then
        Result := True
      else if Assigned(pWaveFormat) then
        ReallocMem(pWaveFormat, 0);
    end
  finally
    mmioSeek(mmIO, OrgPos, SEEK_SET);
  end;
end;

function GetStreamWaveAudioInfo(Stream: TStream; var pWaveFormat: PWaveFormatEx;
  var DataSize, DataOffset: DWORD): Boolean;
var
  mmIO: HMMIO;
begin
  Result := False;
  if Stream.Size <> 0 then
  begin
    mmIO := OpenStreamWaveAudio(Stream);
    if mmIO <> 0 then
      try
        Result := GetWaveAudioInfo(mmIO, pWaveFormat, DataSize, DataOffset);
      finally
        mmioClose(mmIO, MMIO_FHOPEN);
      end;
  end;
end;

Procedure EncodeWavToMP3(WaveFile, MP3File: string; BitRate: Integer);
var
  beConfig: TBE_Config;
  dwSamples, dwSamplesMP3: LongWord;
  hbeStream: THBE_STREAM;
  error: BE_ERR;
  pBuffer: PSmallInt;
  pMP3Buffer: PByte;

  done: LongWord;
  dwWrite: LongWord;
  ToRead: LongWord;
  ToWrite: LongWord;

  // changed from THandle to TFileStream
  fs, ft: TFileStream;
  TotalSize: DWORD;

  // variables to hold the wave info necessary for encoding
  pWaveFormat: PWaveFormatEx;
  DataOffset, DataSize, InputSampleRate: DWORD;

begin
  beConfig.dwConfig := BE_CONFIG_LAME;
  fs := TFileStream.Create(WaveFile, fmOpenRead or fmShareDenyWrite);
  ft := TFileStream.Create(MP3File, fmCreate or fmShareDenyWrite);
  try
    TotalSize := fs.Size;

    // obtain info from source wave file
    try
      if not GetStreamWaveAudioInfo(fs, pWaveFormat, DataSize, DataOffset) then
        raise Exception.Create
          ('Unable to obtain necessary info from wave file.');
      if (pWaveFormat.nChannels <> 2) or (pWaveFormat.wBitsPerSample <> 16) then
        raise Exception.Create('Wave format must be 16bit Stereo.');
      InputSampleRate := pWaveFormat.nSamplesPerSec;
    finally
      FreeMem(pWaveFormat);
    end;

    // Structure information
    beConfig.format.lhv1.dwStructVersion := 1;
    beConfig.format.lhv1.dwStructSize := SizeOf(beConfig);
    // Basic encoder setting
    beConfig.format.lhv1.dwSampleRate := InputSampleRate;
    beConfig.format.lhv1.dwReSampleRate := InputSampleRate;
    beConfig.format.lhv1.nMode := BE_MP3_MODE_STEREO;
    beConfig.format.lhv1.dwBitrate := BitRate;
    beConfig.format.lhv1.dwMaxBitrate := BitRate;
    beConfig.format.lhv1.nQuality := 4;
    beConfig.format.lhv1.dwMpegVersion := 1;
    // MPEG1
    beConfig.format.lhv1.dwPsyModel := 0;
    beConfig.format.lhv1.dwEmphasis := 0;
    // Bit Stream Settings
    beConfig.format.lhv1.bPrivate := False;
    beConfig.format.lhv1.bCRC := True;
    beConfig.format.lhv1.bCopyright := True;
    beConfig.format.lhv1.bOriginal := True;
    // VBR Stuff
    // Have it write a VBRHeader, as recommended by Lame, even though it's CBR
    beConfig.format.lhv1.bWriteVBRHeader := True;
    beConfig.format.lhv1.bEnableVBR := False;
    beConfig.format.lhv1.nVBRQuality := 0;

    error := beInitStream(beConfig, dwSamples, dwSamplesMP3, hbeStream);
    if error = BE_ERR_SUCCESSFUL then
    begin
      pBuffer := AllocMem(dwSamples * 2);
      pMP3Buffer := AllocMem(dwSamplesMP3);
      try
        // Position the source file stream at the beginning of the PCM-data:
        done := DataOffset;
        fs.Seek(DataOffset, soFromBeginning);
        While (done < TotalSize) do
        begin
          if (done + dwSamples * 2 < TotalSize) then
            ToRead := dwSamples * 2
          else
          begin
            ToRead := TotalSize - done;
            FillChar(pBuffer^, dwSamples * 2, 0);
          end;

          fs.Read(pBuffer^, ToRead);

          error := beEncodeChunk(hbeStream, ToRead div 2, pBuffer^,
            pMP3Buffer^, ToWrite);

          if error <> BE_ERR_SUCCESSFUL then
          begin
            beCloseStream(hbeStream);
            raise Exception.Create('Encoding Error');
          end;

          ft.Write(pMP3Buffer^, ToWrite);

          done := done + ToRead;

        end;

        error := beDeinitStream(hbeStream, pMP3Buffer^, dwWrite);

        if error <> BE_ERR_SUCCESSFUL then
        begin
          beCloseStream(hbeStream);
          raise Exception.Create('Close Error');
        end;

        if dwWrite <> 0 then
        begin
          ft.Write(pMP3Buffer^, dwWrite);
        end;

        error := beCloseStream(hbeStream);
        if error <> BE_ERR_SUCCESSFUL then
        begin
          raise Exception.Create('Close Error');
        end;
      finally
        FreeMem(pBuffer);
        FreeMem(pMP3Buffer);
      end;
    end
    else
    begin
      Raise Exception.Create('InitStream failure');
    end;
  finally
    fs.free;
    ft.free;
  end;
  beWriteVBRHeader(pAnsiChar(AnsiString(MP3File)));
end;

end.

May 1, 2022

1 hour ago, Remy Lebeau said:

I generally prefer to use a local accessor class instead:

What's the advantage? When do you use a class helper and when do you use a local accessor class, to me they both seem to serve the same purpose.

April 18, 2022

I might have introduced a bug in GR32_Resamplers, as it is, the left bound of the source rectangle is ignored. The fix is simple:

Line 1778 needs to be

SourceColor := @Src.Bits[ClusterY[0].Pos * Src.Width+SrcRect.Left];  //+SrcRect.Left was missing!

and line 1806:

        SourceColor := @Src.Bits[ClusterY[Y].Pos * Src.Width+SrcRect.Left];//+SrcRect.Left was missing!

Hope, you read this, Anders. If I don't hear from you, I'll create an issue on GitHub.

Edit: I definitely intoduced it by changing the order of the loops, I checked against an old version. Instead of

+SrcRect.Left

one should probably use

+MapXLoPos

Renate

April 11, 2022

55 minutes ago, Anders Melander said:

Please verify that the comments I've added in the source are correct

Correct and very clear.

I like the introduction of the MappingTablePrecicion.. constants.

April 11, 2022

5 hours ago, Anders Melander said:

The [10:22] must mean that you have multiplied two [21:11] values at some point but I can't really spot where that happens.

One weight in x-direction times another one in y-direction. Thanks for the explanation, now I understand the notation.

5 hours ago, Anders Melander said:

I don't understand what this refers to

That my posted "correction" can be safely ignored :).

5 hours ago, Anders Melander said:

Ah, now I get it. It was your comments that confused me. For example:

Sometimes I'm making it too complicated, sorry.

April 10, 2022

2 hours ago, Anders Melander said:

You say that you're "computing the integral for the convolution with the filter using the midpoint-rule" so I would have expected to see the averaging of two filter values in order to find the midpoint but I can't really match that with the above. Can you explain what's going on, please?

No, that would be the trapezoidal rule, and that is just as bad as using the antiderivatives.

Midpoint rule:

integral from x1 to x2 f(x) dx is appoximately f(0.5*(x1+x2))*(x2-x1).

The multiplications by oldscale transform this from the scale of the source (pixelwidth 1) to the scale of the destination ("pixelwidth" NewWidth/OldWidth).

If you want to know why using the integral is a good way of thinking about the algorithm, there's a little article in the doc folder. I couldn't explain it any better here.

2 hours ago, Anders Melander said:

Also, am I correct in assuming that you're doing calculation in [21:11] fixed precision and storing the weights in [10:22] instead of the old [24:8] and [16:16]?

I'm not sure what you're asking, could you do some explaining back? For the pre-mult I'm using precision $100, for the others $800. Is that what you are asking?

BTW, the check x2>x1 isn't necessary, Filter(x3) would be zero if not true.

April 10, 2022

14 hours ago, Anders Melander said:

One thing that almost gave me brain cancer was the abysmal ASM generated by this code (not your fault):

I was hoping for you to untwiddle this 🙂

Meanwhile I found the reason for the box-kernel not being up to par, it's here:

function TBoxKernel.GetWidth: TFloat;
begin
  Result := 1; //must be 0.5!
end;

I also spotted a mistake in my code. It could be that the interval [x0-0.5,x0+0.5] is completely outside of the support of the filter. In this case a false non-zero weight would be generated. So a check of x2>x1 needs to be added:

      ...
      for J := Left to Right do
      begin
        x0 := J - Center; // previous weight: Filter(x0*Oldscale)*Oldscale
        x1 := max(x0 - 0.5, -FilterWidth);
        x2 := min(x0 + 0.5, FilterWidth);
        // intersect symmetric interval of length 1 about x0 with support of scaled filter
        if (x2 > x1) then
        begin
          x3 := 0.5 * (x2 + x1); // new center
          Weight := Round(Prec * Filter(x3 * OldScale) * OldScale * (x2 - x1));
          // intersection with support entered into the weight
          if Weight <> 0 then
          begin
            Inc(Count, Weight);
            K := Length(Result[I]);
            SetLength(Result[I], K + 1);
            Result[I][K].Pos := Constrain(J, SrcLo, SrcHi - 1);
            Result[I][K].Weight := Weight;
          end;
        end;
      end;

Also, at the analogous place for the case scale>1. The code for the 2 cases could be unified, but it's better to understand as it is.

April 9, 2022

I've managed to translate the alternative computation of weights into Graphics32. It was actually quite easy :). The idea is, to compute the intergral for the convolution with the filter via the midpoint-rule. Before I've used the exact antiderivatives, leading to constant underestimations of peaks and valleys in the bitmap function, and thus to a loss of detail. Now pixels not lying totally within the support of the filter get their weight reduced, leading to less artefacts, but the peaks are better estimated, so contrast and detail is better preserved (the math is for readability):

//Precision of weights,
//Totals Cb,Cg,Cr,Ca in Resample need to be unscaled by Prec * Prec
const Prec = $800;

function BuildMappingTableNew(DstLo, DstHi: Integer; ClipLo, ClipHi: Integer;
  SrcLo, SrcHi: Integer; Kernel: TCustomKernel): TMappingTable;
var
 ...
begin
  ...
  else if Scale < 1 then
  begin
    OldScale := Scale;
    Scale := 1 / Scale;
    FilterWidth := FilterWidth * Scale;
    for I := 0 to ClipW - 1 do
    begin
      if FullEdge then
        Center := SrcLo - 0.5 + (I - DstLo + ClipLo + 0.5) * Scale
      else
        Center := SrcLo + (I - DstLo + ClipLo) * Scale;
      Left := Floor(Center - FilterWidth);
      Right := Ceil(Center + FilterWidth);
      Count := -Prec;
      for J := Left to Right do
      begin
                   
        //changed part           
        x0 := J - Center; // old weight: Filter(x0*Oldscale)*Oldscale
        x1 := max(x0 - 0.5, -FilterWidth);
        x2 := min(x0 + 0.5, FilterWidth); // intersect symmetric interval of length 1 about x0 with support of scaled filter
        x3 := 0.5 * (x2 + x1); // new center
        Weight := Round(Prec * Filter(x3 * OldScale) * OldScale * (x2 - x1));  // intersection with support entered into the weight
                   
        if Weight <> 0 then
        begin
          Inc(Count, Weight);
          K := Length(Result[I]);
          SetLength(Result[I], K + 1);
          Result[I][K].Pos := Constrain(J, SrcLo, SrcHi - 1);
          Result[I][K].Weight := Weight;
        end;
      end;
...

At first the results were getting too dark and contrast was increased. By increasing the accuracy of the weights and using my own way of rounding the averaged result into bytes, this seems no longer the case:

If RangeCheck then
        begin
          C.B := min((max(Cb, 0) + $1FFFFF) shr 22, 255);  //unscale and round
          C.G := min((max(Cg, 0) + $1FFFFF) shr 22, 255);
          C.R := min((max(Cr, 0) + $1FFFFF) shr 22, 255);
          C.A := min((max(Ca, 0) + $1FFFFF) shr 22, 255);
        end
        else
        begin
          C.B := (Cb + $1FFFFF) shr 22;
          C.G := (Cg + $1FFFFF) shr 22;
          C.R := (Cr + $1FFFFF) shr 22;
          C.A := (Ca + $1FFFFF) shr 22;
        end;

        // Combine it with the background
        case CombineOp of
          dmOpaque:
            DstLine[I] := C.ARGB;
...

The changed file uScalingProcsGR32.pas is attached.

If you are interested in a test, here is a short video, zooms and pans have been done with the new Lanczos. The second picture is one of the most notorious in my collection.

uScalingProcsGR32.zip

April 8, 2022

4 hours ago, Anders Melander said:

With optimization enabled, the performance of Resample() now matches, and in some cases even exceed, your algorithm

Confirmed. Nice.

4 hours ago, Anders Melander said:

The performance of Resample() with the 64-bit compiler is horrible.

I've always been disappointed in the performance of 64-bit code. No idea, why mine is faster.

Meanwhile I have come up with an alternative way to compute the weights, which seems to decrease artefacts while keeping the brilliance. So far I could not translate it into Graphics32, the filters there all live on different sized intervals, wheras mine all live on [-1,1], and at this time of the day my aging brain can't deal with the math. Maybe tomorrow.

April 7, 2022

I'm too stupid to create a pull-request on GitHub, if I read the help for it, I don't understand the first thing about it.

Somehow I can post changes to my own repository using GitHub-Desktop, but I don't really understand what it's doing :).

So here is the changed GR32_Resamplers.pas.

7 hours ago, Anders Melander said:

The quality could probably be improved to match WIC but the question is if it's worth the trouble or if the current quality is "good enough"...?

I personally think the quality is good enough, on Ultra-High DPI I can hardly see the difference between a low-quality and high-quality filter for "normal" images, glyphs are another story.

For me the filters kick in when animating pictures by zooms and pans, the artifacts then really show up.

Theoretically the quality could be improved by a higher precision of the weights, which currently run from -256 to 256. Up to $800 should be possible, which I have done for the none-premultiply-modes. But again, I have a hard time seeing a difference.

Also, again theoretically, the algorithm using antiderivates of filters should yield better results (except there isn't any in closed form for the Lanczos). But though I can see less artifacts, they decrease details, as you have seen. I've probably made some mistake in the implementation. It could be the same kind of mistake you can make by computing numerical derivatives, Small divided by Small.

Time to hit the sack.

Renate

GR32_Resamplers.zip

April 7, 2022

I have been able to make the GR32-resampling as fast as mine unthreaded, by making some simple changes to the procedure GR32_Resamplers.Resample (in Implementation-part):

changing the order of X- and Y- loop in the filling of the horizontal buffer, avoiding jumps in the bitmap-memory,

using pointers to walk along the arrays,

turning on compiler-optimization for the procedure (biggest improvement)

If you want to see for yourself, in the attachment are 3 changed .pas-files that need to overwrite the corresponding ones in the Algorithms-folder under Bitmap Scaling.

Renate

Bitmap Scaling-Diff.zip

April 6, 2022

Here is a new version with lots of changes (thanks, Anders):

I have changed the algorithm for the filters except box from continous- to discrete-space convolution, because the result didn't seem to look as good as expected, maybe rounding errors, further investigation needed. But I just noticed, that for my videos I need to go back to continuous space, because that gives much smoother zooms and pans.
Anyway, now the algorithm itself is more or less identical to Graphics32, so you have a completely fair comparison. I also followed Anders' idea of including a stopwatch in the signature of the test procedures that can be turned on only if it matters.

More changes:

Alpha:
I've put some alpha-shenanigans into the loaded bitmaps, so you can see how the alpha-channel is handled.

There now is a TAlphaCombineMode=(amIndependent, amPreMultiply, amIgnore) doing the following:

amIndependent:
The behaviour as before, all channels are sampled independently of one another.
This is the behavior of GR32 for drawmode dmOpaque.

amPreMultiply:
RGB is mulitplied by alpha prior to resampling, after that nonzero alpha is "unmultiplied".
I had to sacrifice some precision for the weights and the alpha-multiplication in order to stay within integer-range.
This is the behavior of WICImage with Source.AlphaFormat=afDefined (it seems to have made the same sacrifices).
GR32 with drawmode dmBlend does pre-multiply, but doesn't unmultiply, or, rather it "post-multiplies".

amIgnore:
RGB is resampled, alpha is set to 255. Faster for images not needing an alpha-channel.
This is the behavior of WICImage with Source.AlphaFormat=afIgnored.

To prevent apples from being compared to oranges I have included a notice when a certain algorithm does not fully support the chosen mode.

To avoid code repetition I had to introduce a number of procedural variables, which slow things down a little bit, but it's still nice and fast.

Threads:
The <= 16 threads hardly consume any CPU-time while waiting, for the time being I want to keep them.
There is a TTask-version included in the source, it has worked so far, but TTask behaves somewhat erratically timing-wise, and I don't understand what it's doing.

I'll try to write a TKernelResample-descendent (unthreaded) for Graphics32, maybe I can speed it up, let's see.

Renate

Bitmap Scaling-New.zip

April 3, 2022

Now I got the pre-multiplication in place, but before I do an update, I have a question for Anders:

The pre-multiplication completely erases any RGB-Info stored at transparent pixels. So those can't be unmultiplied. I can see that this is desirable if the image contains a mask and needs to be rescaled.

The RGB-part of the scaled image looks something like this, and btw. the result for WICImage looks the same.

What puzzles me though is the fact, that Graphics32 still keeps the transparent parts of the RGB in place, so it must have some magic way to compute 0/0, which I would like to learn ...

Further looking at the source code, I cannot see any pre-multiplication done:

          C := Src.Bits[X + ClusterY[Y].Pos * Src.Width];
          ClustYW := ClusterY[Y].Weight;
          Inc(Ca, Integer(C shr 24) * ClustYW);
          Inc(Cr, Integer(C and $00FF0000) shr 16 * ClustYW);
          Inc(Cg, Integer(C and $0000FF00) shr 8 * ClustYW);
          Inc(Cb, Integer(C and $000000FF) * ClustYW);

The Bytes of the channels are just lumped into one cardinal, the bytes are extracted and multiplied by the weights, or am I just dense, or has this already been done to Src.Bits?

Have a nice Sunday,

Renate

April 1, 2022

50 minutes ago, Anders Melander said:

The premultiply/unpremultiply really messes with the quality since you're operating on 8-bit values

My cache arrays are floats scaled to integer, so I can do the multiply while filling the array, and the unmultiply on the integers. Let's see.

52 minutes ago, Anders Melander said:

The memory usage can be lessened by processing one channel at a time.

That's a great idea.

53 minutes ago, Anders Melander said:

Of course it's wanted. Improvements are always welcome

I'll see, whether I can plug it in, but the source of Graphics32 is a bit intimidating. But then, being retired, I have all day to understand it.

April 1, 2022

4 hours ago, Anders Melander said:

As a Graphics32 contributor I would have preferred though that you'd spent the (probably considerable) effort improving Graphics32 🙂

I can still do that it it's wanted and still makes sense (see below).

4 hours ago, Anders Melander said:

You cannot process each RGBA channel independently. Instead you need to operate on premultiplied RGBA values. For example you wouldn't want an RGBA value of $00FFFFFF to have any influence on neighboring pixels. So: 1) Premultiply, 2) Resample, 3) Unpremultiply.

Oh, I need to fix that ASA . And include the better GR32-routine.

Thanks for your feedback,

Renate

March 31, 2022

I have made my parallel resampling of bitmaps now as fast as I can get it. Now I would find it interesting to know, how the algorithm performs on other systems, and it would be super to get suggestions for improvement.

The procedures can be found in the unit uScale under Algorithms in the attached zip.

I have tested against Windows StretchBlt-Half_Tone, WICImage and Graphics32. On my PC (AMD Ryzen 7 1700, 8-core) I see a substantial improvement in speed.

The threads are based on TThread rather than TTask or TParallel, because I had failures using the latter two, whereas the oldfashioned threads haven't failed me ever in gazillions of runs.

If you want to compile and run the comparison project, you need to include the source folder of Graphics32 (latest version) in your search path. For convenience it is included in the zip under Algorithms. I couldn't find a way to divvy out only the units needed for resampling.

The test against Graphics32 might be slightly unfair, because of bitmaps being assigned to TBitmap32 before the processing.

Right now, the procedure itself cannot be run in concurrent threads, because the array of TThreads is a global variable, I need to change the design (ideas welcome).

There might be still room for improvement by minimizing the cache misses, but my head can't handle it at the moment.

Hope you find it useful.

Renate

Bitmap Scaling.zip

March 22, 2022

57 minutes ago, Stefan Glienke said:

setting freed memory to a certain pattern

Thanks, I didn't know that.

Otherwise my question (starting with why) has, as far as I am concerned, been answered by Attila, and I wouldn't want to use anymore bandwidth for it.

March 22, 2022

1 minute ago, David Heffernan said:

No I don't think so. Follow the actual rules.

I'm doing that, unless I'm too dim to notice. I just wanted to understand why it doesn't blow up in my face .. oh well.

March 22, 2022

30 minutes ago, Attila Kovacs said:

After the "Free" the reference is a "dangling pointer".

That's why I don't use it :).

33 minutes ago, Attila Kovacs said:

Try to use explicit Free's and you don't have to keep the implementation details in your head.

Good advice.

34 minutes ago, Attila Kovacs said:

Or keep it there and follow the rules.

So you are saying it is Ok (other than maintainance headaches) to keep it there on condition that one follows the rules? I'm saying this, because I remember reactions like "Shriek" when somebody posted code like this.

March 22, 2022

type
  TDoSomething = class
    Danger: string;
    constructor create;
    procedure DoSomething;
  end;

implementation
  ...
constructor TDoSomething.create;
begin
  Danger := 'Danger!';
end;

procedure TDoSomething.DoSomething;
begin
  ShowMessage('Doing something');
  Sleep(1000);
  ShowMessage('Done doing something');
  Free; //<-----
end;

procedure TForm2.Button1Click(Sender: TObject);
begin
  var
    DoSomething: TDoSomething := TDoSomething.create;
  DoSomething.DoSomething;
  // ShowMessage(DoSomething.Danger);
end;

initialization
ReportMemoryLeaksOnShutDown := true;
end.

I'm asking because I just noticed that I was doing something similar in my code , only it was a bit more hidden. It has worked without complaint for quite a while, so I was wondering why. My guess is, it's because the 'Free' is the very last thing I ask the class to do. In the destructor the class is still alive, after that the memory is freed, and as long as I don't do anything like the commented ShowMessage, that should be safe or no?

I am cleaning this up, of course, but I didn't want to miss the chance of learning something, I don't understand what's going on in the Delphi-source at this point all that well.

Renate

March 3, 2022

Hi, Anders,

Thanks a lot for yet another great tool. Totally straightforward, fast and hasslefree to use.

I have 3 little issues with it, which I can work around, but maybe you'd like to know (newest compiled version 1.3.8055.21506):

1. On my 4K-monitor scaled 200% the ribbon text is barely readable. Minimizing the ribbon displays the text correctly when folded out.

2. Translating a rather long string using the text-editor inserts line breaks at the end of every displayed line.

Translation manager then rightly complains about mismatching line breaks.

I tried to find a setting which turns this off, but couldn't.

Using notepad for the translation and pasting the translated string into the in-place-editor cuts off the string.

3. The font of the text-editor is rather large on my display, adding to the problem in 2.

Thanks again,

Renate

April 20, 2021

Confirmed: Works with uProf now. Great Job.

I get the same info as with VTune: hotspot timings, processor use, stack graph.

Sign In

Renate Schaaf

Content Count

Joined

Last visited

Days Won

Content Type

Profiles

Forums

Calendar

Posts posted by Renate Schaaf

TSaveDialog is not resizable with Styles

Using lame_enc.dll to encode to MP3

Using lame_enc.dll to encode to MP3

Using lame_enc.dll to encode to MP3

How to change the icon of the active tab in TTabControl?

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Parallel Resampling of (VCL-) Bitmaps

Why isn't this dangerous, or is it?

Why isn't this dangerous, or is it?

Why isn't this dangerous, or is it?

Why isn't this dangerous, or is it?

ANN: Better Translation Manager released

MAP2PDB - Profiling with VTune

Browse

Activity