In the end, I had to abandon PUCU as the author never bothered to respond to my bug reports.
Instead, I tried to adapt the JEDI implementation: I removed all its dependencies and fixed the worst bugs and performance issues. Finally, I ran the code against my unit tests. The result was a big disappointment: while it didn't crash like PUCU did, it failed even more of the test cases. The test suite uses the 19,000 NFC (compose) and NFD (decompose) normalization test cases published by the Unicode Consortium in NormalizationTest.txt.
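In case anyone wants to run the same conformance check, here's a minimal sketch of how that test file is consumed. Everything in it (TNormalizer, ParseColumn, RunTests) is made up for illustration and is not the API of any of the libraries mentioned; the functions under test are simply passed in as references. The column layout and the invariants in the asserts come straight from the header of NormalizationTest.txt.

program NormalizationConformance;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.IOUtils,
  System.Character;

type
  // The functions under test are passed in as references, so the sketch
  // doesn't depend on any particular library's API.
  TNormalizer = reference to function(const S: string): string;

// Convert one column of space-separated hex code points,
// e.g. "0044 0307", to a UTF-16 string.
function ParseColumn(const Column: string): string;
var
  CodePoint: string;
begin
  Result := '';
  for CodePoint in Column.Split([' ']) do
    Result := Result + Char.ConvertFromUtf32(StrToInt('$' + CodePoint));
end;

procedure RunTests(const FileName: string; NFC, NFD: TNormalizer);
var
  Line: string;
  Columns: TArray<string>;
  c1, c2, c3, c4, c5: string;
begin
  for Line in TFile.ReadAllLines(FileName) do
  begin
    // Skip blank lines, comments and "@Part" section markers.
    if (Line = '') or Line.StartsWith('#') or Line.StartsWith('@') then
      Continue;

    Columns := Line.Split([';']);
    c1 := ParseColumn(Columns[0]);
    c2 := ParseColumn(Columns[1]);
    c3 := ParseColumn(Columns[2]);
    c4 := ParseColumn(Columns[3]);
    c5 := ParseColumn(Columns[4]);

    // Invariants as documented in the header of NormalizationTest.txt:
    // c2 = NFC(c1) = NFC(c2) = NFC(c3) and c4 = NFC(c4) = NFC(c5)
    Assert((c2 = NFC(c1)) and (c2 = NFC(c2)) and (c2 = NFC(c3)));
    Assert((c4 = NFC(c4)) and (c4 = NFC(c5)));
    // c3 = NFD(c1) = NFD(c2) = NFD(c3) and c5 = NFD(c4) = NFD(c5)
    Assert((c3 = NFD(c1)) and (c3 = NFD(c2)) and (c3 = NFD(c3)));
    Assert((c5 = NFD(c4)) and (c5 = NFD(c5)));
  end;
end;

begin
  // Wire in the actual functions under test here, e.g.:
  // RunTests('NormalizationTest.txt', MyNFC, MyNFD);
end.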
So it was back to square one. Comparing the algorithms used by the JEDI library against those of other Unicode libraries (I've looked at over a hundred) didn't reveal the cause: they all used the same algorithms. Blogs and articles describing the algorithm also matched what was being done. I was beginning to suspect that Unicode's own test cases were wrong, but then I finally got around to reading the actual Unicode specification, where the rules and algorithms are defined, and guess what: apart from Unicode's own reference implementation and a few others, they're all doing it wrong.
I have since implemented the normalization functions from scratch, based on the Unicode v15 specification, and all tests now pass.
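To give a flavour of what the spec actually requires, here's a minimal sketch of one of its steps, the Canonical Ordering Algorithm from section 3.11 of the core specification: after full decomposition, runs of combining marks are put into non-decreasing canonical combining class order with a stable exchange sort, and starters (class 0) act as barriers. CombiningClass is a hypothetical placeholder for a lookup against the Unicode Character Database; this is a sketch, not the code in the repository.

type
  TCodePoints = array of UCS4Char;

// Placeholder: replace with a real Canonical_Combining_Class lookup
// backed by the Unicode Character Database.
function CombiningClass(CodePoint: UCS4Char): Byte;
begin
  Result := 0;
end;

// Canonical Ordering Algorithm: reorder combining marks into
// non-decreasing combining class order. Starters (class 0) never move
// and act as barriers, and the sort is stable, so marks with equal
// classes keep their relative order.
procedure CanonicalOrder(var CodePoints: TCodePoints);
var
  i: Integer;
  Swapped: Boolean;
  Temp: UCS4Char;
begin
  repeat
    Swapped := False;
    for i := 1 to High(CodePoints) do
      // Only a non-starter may move, and only past a character with a
      // strictly higher combining class.
      if (CombiningClass(CodePoints[i]) <> 0) and
         (CombiningClass(CodePoints[i - 1]) > CombiningClass(CodePoints[i])) then
      begin
        Temp := CodePoints[i - 1];
        CodePoints[i - 1] := CodePoints[i];
        CodePoints[i] := Temp;
        Swapped := True;
      end;
  until not Swapped;
end;

The composition step has similar fine print (for example, the definition of when a combining mark is "blocked" from the last starter), which is easy to get wrong if you work from secondhand descriptions of the algorithm instead of the spec itself.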
The functions can be found here, in case anyone needs them: https://gitlab.com/anders.bo.melander/pascaltype2/-/blob/master/Source/PascalType.Unicode.pas#L258
Note that while the functions implement both canonical (NFC/NFD) and compatibility (NFKC/NFKD) normalization, only the canonical variants have been tested, as they are the only ones I need.