Renate Schaaf 64 Posted April 19, 2023 (edited) A new version is available at https://github.com/rmesch/Repository-R.Schaaf. The parallel resampler can now be used in concurrent threads. You can define more than one thread pool for the resampling and tell the parallel procedure which pool to use. A new demo "ThreadsInThreads" has been added. Initially I had the problem that this demo would sometimes hang waiting for an event, but that has hopefully been fixed. I'd much appreciate a feedback if it still occurs. Thanks, Renate Edited April 19, 2023 by Renate Schaaf 2 1 Share this post Link to post
Tommi Prami 130 Posted April 20, 2023 Nice work. Couple of points that are not that big of a deal, just think about them. Split the unit into maybe three units. For low level utils and types For TTask versions "Thread pool" unit, so if not used, threads are not initialized to consume memory (as pointed out in comments CPU and human time is small, but some might like to save the memory) Change repository name to something more describing. Like ThreadedResampler or something like that. Not to critisize, but was just thinking would those changes make sense to you, not 100% sure are they good idea or not. 1 Share this post Link to post
Renate Schaaf 64 Posted April 20, 2023 Hi Tommi, Your suggestions make a lot of sense. I had already thought it would be better not to initialize the thread pool in initialization, but moving the thread pool stuff into a different unit is a much better idea. Renate Share this post Link to post
Renate Schaaf 64 Posted April 24, 2023 On 4/20/2023 at 5:15 AM, Tommi Prami said: threads are not initialized to consume memory I've implemented this in the new version, but not via a new unit, too much of a can of worms for me right now. Changes to https://github.com/rmesch/Repository-R.Schaaf are now as follows: Threads will no longer be automatically created in Initialization. You can call InitDefaultResamplingThreads to initialize them before you need them, or they will be initialized in the first call of a parallel procedure, which causes a delay. FinalizeDefaultResamplingThreads will free them. If you forget, uScale frees them in Finalization. Source bitmaps with alphaformat=afDefined are now handled correctly, and the target will have the same alphaformat. The resampler works best though with alphaformat=afIgnored. You should define the alphaformat only before display to avoid imprecisions. The unit uTools has routines to transfer a TPngImage or a TWICImage to TBitmap without setting alphaformat=afDefined, which is what TBitmap. Assign alway does in VCL.Graphics (and which is a waste of time for .jpg). A TWICImage is very convenient for fast decoding of .jpg, .png or .tif. Alas TWICImage.Create is not threadsafe, so it needs to be created in the main thread. There is a new TAlphaCombineMode amTransparentColor which preserves the TransparentColor and the regions of transparency upon resampling. The ThreadsInThreads demo now makes thumbnails in 2 concurrent threads, that should be a good crashtest for thread-safety. Cleaned up the code in uScale, moving everything into implementation that is not needed in interface, and added more comments. Also moved 4 almost identical routines into one, sacrificing a bit of performance. I know I should have picked a better name for the repository, but if I change it now, what happens to my three little stars? Renate 1 Share this post Link to post
angusj 126 Posted April 24, 2023 (edited) 55 minutes ago, Renate Schaaf said: what happens to my three little stars? 4 😁 Edit: Perhaps it just means re-star-ting? Edit2: It's very easy to rename a repository (via Settings) and you won't lose any stars! Edited April 24, 2023 by angusj 1 1 Share this post Link to post
dwrbudr 8 Posted April 24, 2023 What makes you think TWICImage.Create is not thread-safe in recent Delphi versions? Share this post Link to post
Renate Schaaf 64 Posted April 24, 2023 (edited) It certainly isn't in 10.4, I tried it. Don't know about 11.x, since I don't have it. I'm claiming to support 10.x and higher. See here: https://www.delphitools.info/2019/07/18/workaround-for-twicimage-imagefactory-bug-in-delphi-10-x/ Edit: I played around with this some more, and it seems like you just have to create any old TWICImage in the main thread before you send off a thread. Then the thread can safely use TWICImage.Create, because the factory has been initialized already. If the factory is initialized in the thread, it bombs. Edited April 24, 2023 by Renate Schaaf Share this post Link to post
Anders Melander 1783 Posted April 24, 2023 10 hours ago, Renate Schaaf said: The resampler works best though with alphaformat=afIgnored I wish they would just deprecate AlphaFormat. It's an abomination that was supposed to make it easy to draw a 32-bit TBitmap with alpha. I guess it did that but what it also did was make it almost impossible to do anything beyond the single use case they could think of. 1 Share this post Link to post
Renate Schaaf 64 Posted April 25, 2023 22 hours ago, angusj said: It's very easy to rename a repository Wow, it was indeed very easy, and even the previous link still works! Thanks! 1 Share this post Link to post
Renate Schaaf 64 Posted April 25, 2023 I just managed to get the routine for the pre-multiplication much faster. I naively assumed that MulDiv is faster than multiplication and subsequent div. So I used this (PremultPrecision is a constant of value 4): if ps.rgbReserved > 0 then begin alpha := Weight * ps.rgbReserved; inc(Cache.b, MulDiv(ps.rgbBlue, alpha, PreMultPrecision)); inc(Cache.g, MulDiv(ps.rgbGreen, alpha, PreMultPrecision)); inc(Cache.r, MulDiv(ps.rgbRed, alpha, PreMultPrecision)); inc(Cache.a, alpha); end; But this is in fact faster (with compiler-optimization turned on): if ps.rgbReserved > 0 then begin alpha := Weight * ps.rgbReserved; inc(Cache.b, ps.rgbBlue*alpha div PreMultPrecision); inc(Cache.g, ps.rgbGreen*alpha div PreMultPrecision); inc(Cache.r, ps.rgbRed*alpha div PreMultPrecision); inc(Cache.a, alpha); end; And now I remember that Anders even explained that behavior to me before: If the div is by a constant that is a power of 2, the optimizer is smart enough to turn div into a shift. Guess the optimizer isn't smart enough to do this in MulDiv. Share this post Link to post
angusj 126 Posted April 25, 2023 (edited) Or you could make the bitshifting explicit : inc(Cache.b, (ps.rgbBlue*alpha) shr 2); And given that you're multiplying two bytes you could optimise this further by using a lookup table: MulTable: array [Byte,Byte] of integer; //initialized in initialization section ... inc(Cache.b, MulTable[ps.rgbBlue, alpha] shr 2); Edit: Oops, sorry, it appears that alpha is an integer, not a byte. Edited April 25, 2023 by angusj Share this post Link to post
Renate Schaaf 64 Posted April 25, 2023 (edited) 3 hours ago, angusj said: Oops, sorry, it appears that alpha is an integer, not a byte. It could still work, since the weights only run from -$100 to $100. So a 3-dimensional table would work. Let's try. Edit: No, such a table would just be too big. Edit2: shr 2 doesn't work, because the result of the multiplication could be negative. Thanks for the idea! Edited April 25, 2023 by Renate Schaaf 1 Share this post Link to post
Anders Melander 1783 Posted April 26, 2023 12 hours ago, Renate Schaaf said: shr 2 doesn't work, because the result of the multiplication could be negative. Yes, div 2^n is compiled to an arithmetic shift (sar instruction) while shr n is compiled to a logical shift (shr instruction). 14 hours ago, angusj said: And given that you're multiplying two bytes you could optimise this further by using a lookup table Even if that would have been possible it wouldn't magically have become faster. The calculation of the index into the table is a (implicit) multiplication too so you would just be trading one integer multiplication for another - and adding a memory access. Share this post Link to post
angusj 126 Posted April 26, 2023 2 hours ago, Anders Melander said: Even if that would have been possible it wouldn't magically have become faster. Yep, point taken. Share this post Link to post
Renate Schaaf 64 Posted April 26, 2023 On 4/24/2023 at 8:19 PM, dwrbudr said: What makes you think TWICImage.Create is not thread-safe in recent Delphi versions? The TWICImage-factory-bug in threads is still present in 11.3 (I just got the new community-edition 🙂) Share this post Link to post
dwrbudr 8 Posted April 27, 2023 17 hours ago, Renate Schaaf said: The TWICImage-factory-bug in threads is still present in 11.3 (I just got the new community-edition 🙂) The creation of the FImagingFactory is protected by a critical section. Is there anything else that shall be protected by a critical section, e.g. some other class variable or something else. Share this post Link to post
Renate Schaaf 64 Posted April 27, 2023 1 hour ago, dwrbudr said: Is there anything else that shall be protected by a critical section, e.g. some other class variable or something else. Is has to do with COM, ActiveX, don't ask me to explain, I know next to nothing about these things. But I browsed QC a bit, in which this or similar bugs have been reported multiple times (almost all closed), but I got the idea that a CoInitialize call could be missing. Look at this simple test: uses System.Threading, WinApi.ActiveX; procedure TForm2.Button1Click(Sender: TObject); var task: iTask; begin task := TTask.Run( procedure var WIC: TWICImage; begin //CoInitializeEx(nil,COINIT_MULTITHREADED); WIC := TWICImage.Create; try finally WIC.Free; end; end); end; As it is, this gives an error in VCL.Graphics in the line marked: constructor TWICImage.Create; var LResult: HResult; begin inherited; FInterpolationMode := wipmNone; EnterCriticalSection(WicImageLock); try if FImagingFactory = nil then begin LResult := CoCreateInstance(CLSID_WICImagingFactory, nil, CLSCTX_INPROC_SERVER or //<--------------------------- CLSCTX_LOCAL_SERVER, IUnknown, FImagingFactory); if Failed(LResult) then raise EInvalidGraphicOperation.CreateFmt(SWinRTInstanceError + ' (%X)', ['CLSID_WICImagingFactory', LResult]); end else FImagingFactory._AddRef; finally LeaveCriticalSection(WicImageLock); end; Uncommenting the line with CoInitializeEx in the test, seems to fix the error in the test procedure. Renate Share this post Link to post
Anders Melander 1783 Posted April 27, 2023 18 hours ago, Renate Schaaf said: The TWICImage-factory-bug in threads is still present in 11.3 I don't think it is, but... 9 minutes ago, Renate Schaaf said: Uncommenting the line with CoInitializeEx in the test, seems to fix the error in the test procedure. ...since the TWICImage constructor calls CoCreateInstance you need to have COM initialized before that happens. As you've observed CoInitialize/CoInitializeEx does that. Remember though to check the result of CoInitialize(Ex); It will return an error code (it's not an error as such) if COM has already been initialized (on the calling thread). You should only call CoUninitialize if the call to CoInitialize(Ex) succeeded. If you search here I'm guessing the how and why has been explained many times. Share this post Link to post
Renate Schaaf 64 Posted April 27, 2023 11 minutes ago, Anders Melander said: I don't think it is, but... Well, the test clearly shows it, or doesn't it? 13 minutes ago, Anders Melander said: You should only call CoUninitialize if the call to CoInitialize(Ex) succeeded. Because I could step on the toes of some other code, I guess. Thanks, I'll keep it in mind. Anyway, for the time being I feel safer with just creating a TWICImage in initialization. Share this post Link to post
Anders Melander 1783 Posted April 27, 2023 2 minutes ago, Renate Schaaf said: Well, the test clearly shows it, or doesn't it? No. It shows that there's some problem. It doesn't show that the bug is still there. Btw, I guess it's this one: RSP-27825 (status: Closed, resolution: Fixed, fixed version: 10.4.1). 5 minutes ago, Renate Schaaf said: Because I could step on the toes of some other code, I guess. Or they could step on yours. 5 minutes ago, Renate Schaaf said: I feel safer with just creating a TWICImage in initialization 🙂 1 Share this post Link to post