Thanks. There is definitely structure in the data: long ranges of code points are assigned the same value. But the Unicode standard contains no special documentation that goes beyond what you can deduce directly from the mapping available in Delphi. I actually started with the standard and then looked for efficient encodings. The standard only vaguely suggests using a data structure such as a trie.
The Unicode documentation itself only lists every character and gives it a matching category.
The Delphi implementation apparently uses some kind of hash map. But I haven't yet figured out the "inverse function" needed to create the table.
Edit: I have looked into writing my own hashing function, assuming the original key is divided into three parts (one 13-bit key and two 4-bit keys), as the original RTL code does. After a while I could reproduce the values, even though it seems the RTL wastes part of the value range. I will update this post once I have found the final solution.
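For anyone following along, here is a minimal sketch of how such a three-part split could be turned into tables and read back. It is written in Python rather than Delphi and is not the RTL code. The bit order (high 13 bits, then 4, then 4), the names `build_tables`, `lookup` and `python_category`, and the use of Python's `unicodedata` module as the source of categories are all my assumptions for illustration.

```python
import unicodedata

MAX_CODE_POINT = 0x10FFFF  # last valid Unicode code point


def build_tables(category_of):
    """Compress a code point -> category function into three lookup tables.

    Assumed layout: the high 13 bits select a secondary block, the next
    4 bits select a leaf block, and the low 4 bits select the value.
    """
    primary = []        # one entry per high index, pointing into `secondary`
    secondary = []      # unique blocks of 16 indexes into `leaves`
    leaves = []         # unique blocks of 16 category values
    secondary_ids = {}  # block -> position in `secondary`
    leaf_ids = {}       # block -> position in `leaves`

    for high in range((MAX_CODE_POINT >> 8) + 1):
        block = []
        for mid in range(16):
            base = (high << 8) | (mid << 4)
            leaf = tuple(category_of(base + low) for low in range(16))
            # Store each distinct run of 16 values only once.
            if leaf not in leaf_ids:
                leaf_ids[leaf] = len(leaves)
                leaves.append(leaf)
            block.append(leaf_ids[leaf])
        block = tuple(block)
        # Identical secondary blocks are shared in the same way.
        if block not in secondary_ids:
            secondary_ids[block] = len(secondary)
            secondary.append(block)
        primary.append(secondary_ids[block])
    return primary, secondary, leaves


def lookup(tables, code_point):
    """Read a category back by splitting the key into 13 + 4 + 4 bits."""
    primary, secondary, leaves = tables
    high = code_point >> 8
    mid = (code_point >> 4) & 0xF
    low = code_point & 0xF
    return leaves[secondary[primary[high]][mid]][low]


def python_category(code_point):
    # Stand-in data source; the real values would come from the Unicode data files.
    return unicodedata.category(chr(code_point))


tables = build_tables(python_category)
```

Calling `lookup(tables, ord("A"))` should then return the same value as `python_category(ord("A"))`. Note that 13 + 4 + 4 = 21 bits can address keys up to 0x1FFFFF, while Unicode ends at 0x10FFFF, so only 0x1100 of the 8192 possible primary entries are needed in this sketch. The sharing of identical blocks is what exploits the ranges with equal values.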