Guest Posted January 15, 2022 The CJK discussion is pertinent even if you will not index (support FTS for) these languages in your implementation ("you" being someone like me). In my system users "just upload [anything 🙂]", so there is a LOT of noise. To cut down on that noise, such things need to be understood. I have other "noise" problems (format related rather than language related), and everything done to filter it out will "hang together" if you do not have every chunk of text tagged with its language. So CJK knowledge and support routines should not be down-prioritised, IMHO. Just my $.05.
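To make the idea of tagging every chunk of text concrete, here is a minimal sketch in Python. It is not taken from Rubicon or from the poster's system: the names `CJK_RANGES`, `is_cjk`, and `tag_chunk`, the 0.3 threshold, and the short list of Unicode blocks are all assumptions chosen for illustration. It only separates CJK-dominant chunks from everything else; real language identification would need much more than this.

```python
# Hypothetical helper: a coarse "is this chunk mostly CJK?" tagger.
# The ranges below cover only the most common CJK blocks (an assumption,
# not an exhaustive list).
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)


def is_cjk(ch):
    """Return True if the character falls in one of the listed blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def tag_chunk(text, threshold=0.3):
    """Tag a chunk as 'cjk', 'non-cjk', or 'unknown' (no letters at all).

    The threshold is an arbitrary illustrative value: the share of
    letters that must be CJK for the chunk to be tagged 'cjk'.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "unknown"
    cjk_share = sum(is_cjk(ch) for ch in letters) / len(letters)
    return "cjk" if cjk_share >= threshold else "non-cjk"
```

A chunk tagged `cjk` could then be routed away from a whitespace-delimited indexer, and a chunk tagged `unknown` (digits and punctuation only) is a candidate for the format-related noise filters.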
Ann Lynnworth Posted January 19, 2022 From my colleague in Singapore, I learned about this dictionary for Chinese: https://www.pleco.com/about/ That would be one way to identify undelimited words. I guess Pleco would have to be licensed not only by us but also by any developer who needed that functionality. I have done another round of improvements on the Rubicon installer. I am still very interested in hearing from anyone who wants to do optimized full text search with FireDAC on non-CJK database content and is willing to give us new-user feedback. The comments thus far have been extremely helpful. Release notes for Rubicon v4.070 are here: https://www.href.com/rbrelnotes To get the discount, just fill out the tiny survey and we will get back to you. You will get a real Rubicon Pro license with full source and free upgrades until the next major release. Best wishes to all for the "year 2022" ride around the sun.