The question is whether they can do something about it at all.
For start, Skia in Delphi operates on several layers of indirection (including some poor design choices) which also involve memory intensive operations on each paint (including reference counting). When you add FireMonkey on top that is additional layer. Now, all that does make Skia slower than it could be. But that problem exists on all platforms. ARM and especially Apple Silicon have better performance when it comes to reference counting, but that is probably still not enough to explain the difference.
Some of the difference can also be explained with juggling memory between CPU and GPU on Windows as those other platforms use unified memory for CPU/GPU. However, my son Rene (who spends his days fiddling with GPU rendering stuff) said that from the way CPU and GPU behaves when running Skia benchmark test, that all this still might not be enough to explain observed difference, and that tiled based rendering on those other platforms could play a significant role.
Another question is how Skia painting actually works underneath all those layers and whether there is some additional batching of operations or not.
@Hans♫ Do the Macs you tested it on have Apple Silicon? If they do then they also support tile based rendering.
As far a solutions are concerned, it is always possible to achieve some speedups by reusing an caching some objects that are unnecessarily allocated on the fly, those could be both in FMX code itself, or in how FMX controls are used and how many are them on the screen. I used to create custom controls that would cache and reuse some FMX graphic objects when painting instead of creating complex layouts with many individual FMX controls.