Skullcode 0 Posted July 1, 2020

I have created this function to allow only Arabic and English letters and digits in a string:

    var
      Regexs : TRegEx;
    begin
      if Regexs.IsMatch(astr, '[ء-ي-A-Z-a-z-0-9 ]+') then
      begin
        Result := True;
      end else
      begin
        Result := False;
      end;

If I pass a string such as abcdefgÄ, the result is True, even though I did not specify Ä in the regex pattern. How can I make the match return True only when the whole string fits the given pattern?
Mahdi Safsafi 225 Posted July 1, 2020

Quote:
> if i type the string as example abcdefgÄ the result returned True and i did not specfiy Ä in the regex pattern

IsMatch returns True as soon as any part of the string matches, and here the substring "abcdefg" does. You need to anchor the pattern at the beginning ("^") and at the end of the string (EOS, "$"):

    if Regexs.IsMatch(astr, '^[ء-ي-A-Z-a-z-0-9 ]+$') then // match from the beginning to the end of astr
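The difference between the two patterns can be sketched as a small console program. This is only an illustration, assuming a Delphi version that ships System.RegularExpressions; IsAllowed is a hypothetical helper name, the Arabic range ء-ي is written with \x{...} escapes (U+0621 to U+064A) so the source file encoding does not matter, and the extra hyphens between the ranges are dropped because, as far as I can tell, PCRE treats a hyphen that follows a range as a literal "-".

```delphi
program AnchorDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

const
  // Same character class in both patterns; only the anchors differ.
  Unanchored = '[\x{0621}-\x{064A}A-Za-z0-9 ]+';
  Anchored   = '^[\x{0621}-\x{064A}A-Za-z0-9 ]+$';

// Hypothetical helper: True only when the whole string fits the class.
function IsAllowed(const S: string): Boolean;
begin
  Result := TRegEx.IsMatch(S, Anchored);
end;

begin
  // Unanchored: succeeds because the substring "abcdefg" matches.
  Writeln(TRegEx.IsMatch('abcdefg' + #$00C4, Unanchored));
  // Anchored: fails because U+00C4 (Ä) is not in the class.
  Writeln(IsAllowed('abcdefg' + #$00C4));
  // Anchored: succeeds, every character is a letter, digit or space.
  Writeln(IsAllowed('abc 123'));
end.
```

If a literal hyphen should be accepted, it can be added once at the end of the class instead of between the ranges.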
Mahdi Safsafi 225 Posted July 1, 2020

Out of curiosity, why are you mixing Arabic with Latin letters? Does such an ArabicLatin word make sense to you?
Skullcode 0 Posted July 1, 2020

30 minutes ago, Mahdi Safsafi said:
> For curiosity, why are you mixing arabic with latin letters. Does such ArabicLatin word make sense for you?

I am trying to validate input that will be typed by a client. I want to prevent special characters and allow only Arabic and English letters.
Skullcode 0 Posted July 1, 2020 (edited)

Currently I use this function. Is it fine, or does it need more correction?

    Function Checkstr(const astr: string):Boolean;
    var
      Regexs : TRegEx;
      i : integer;
      svalue : string;
      Allowed : string;
    begin
      svalue := Trim(astr);
      for i := 1 to Length(svalue) do
      begin
        if Regexs.IsMatch(svalue[i], '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$') then
        begin
          Allowed := 'YES';
        end else
        begin
          Allowed := 'NO';
          Break;
        end;
      end;
      if Allowed = 'YES' then
      begin
        Result := True;
      end else
      begin
        Result := False;
      end;
    end;

Edited July 1, 2020 by Skullcode
aehimself 396 Posted July 1, 2020

18 minutes ago, Skullcode said:
> current i use this function is it fine or more correction needed ?
>
>     Function Checkstr(const astr: string):Boolean;
>     var
>       Regexs : TRegEx;
>       i : integer;
>       svalue : string;
>       Allowed : string;
>     begin
>       svalue := Trim(astr);
>       for i := 1 to Length(svalue) do
>       begin
>         if Regexs.IsMatch(svalue[i], '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$') then
>         begin
>           Allowed := 'YES';
>         end else
>         begin
>           Allowed := 'NO';
>           Break;
>         end;
>       end;
>       if Allowed = 'YES' then
>       begin
>         Result := True;
>       end else
>       begin
>         Result := False;
>       end;
>     end;

Ummm.... Why use a string as a flag? Why not get rid of Allowed and set Result directly? Why go through the string character by character when the regex can validate the whole string at once?

If I'm not mistaken, you'll achieve the same with this one line:

    Result := Regexs.IsMatch(svalue.Trim, '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$');
Anders Melander 1782 Posted July 1, 2020

Why on earth are you using RegEx at all? Just use one of the many existing functions or bake your own:

    function ValidateChars(const Value, ValidChars: string): boolean;
    begin
      for var v in Value do
      begin
        var Valid: boolean := False;
        for var c in ValidChars do
          if (c = v) then
          begin
            Valid := True;
            break;
          end;
        if (not Valid) then
          Exit(False);
      end;
      Result := True;
    end;

Also consider using character classes. See TCharHelper in the System.Character unit.
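A hand-written check along these lines might look like the sketch below. It is only an illustration: IsAllowedChar and ValidateText are hypothetical names, the Arabic range U+0621 to U+064A is the ء-ي range from the original question, and TCharHelper is used only for whitespace. Note that the script-agnostic helpers would not be enough on their own, since IsLetter is also True for Ä.

```delphi
program CharClassDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Character;

// Hypothetical helper: True for ASCII letters and digits, for the Arabic
// range U+0621..U+064A, and for anything TCharHelper treats as whitespace.
function IsAllowedChar(C: Char): Boolean;
begin
  case C of
    'A'..'Z', 'a'..'z', '0'..'9', #$0621..#$064A:
      Result := True;
  else
    Result := C.IsWhiteSpace;
  end;
end;

// Hypothetical helper: True when Value is non-empty and every character
// passes IsAllowedChar.
function ValidateText(const Value: string): Boolean;
var
  C: Char;
begin
  Result := Value <> '';
  for C in Value do
    if not IsAllowedChar(C) then
      Exit(False);
end;

begin
  Writeln(ValidateText('abc 123'));                       // only allowed characters
  Writeln(ValidateText(#$0645#$0631#$062D#$0628#$0627));  // Arabic letters only
  Writeln(ValidateText('abcdefg' + #$00C4));              // contains U+00C4 (Ä)
end.
```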
Guest Posted July 1, 2020

Don't validate per char, as that becomes a maze to follow in Unicode, and don't use RegEx, as it brings a different problem: the number settings in the OS might be Arabic (those numbers are in fact Hindu, mistakenly called Arabic) for all locales, and they will cause havoc.

I do it differently. I also wrote a filter to strip out any non-English and non-Arabic chars:

    TEncoding.GetEncoding(1256).GetString(TEncoding.GetEncoding(1256).GetBytes(aText))

Chars from any other language will become '?', and many special chars will become '?' as well. Special chars were not a problem in my case; the request was to prevent bad (offending) words written in other languages that the administrators don't understand.

I don't use TEncoding but the Windows API directly, but I think you get the idea.
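The round trip through code page 1256 can be sketched as a small helper. This is an illustration rather than the poster's actual code: FilterTo1256 is a hypothetical name, and the sketch assumes that TEncoding.GetEncoding hands back an instance the caller has to free, which is why the one-liner is split up. What an unmappable character turns into depends on the platform's conversion routine, so the example only checks text that code page 1256 can represent.

```delphi
program CodePageFilterDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: converts Text to code page 1256 bytes and back, so
// characters the code page cannot represent are replaced by the converter.
function FilterTo1256(const Text: string): string;
var
  Enc: TEncoding;
begin
  // Assumption: the returned encoding instance is owned by the caller.
  Enc := TEncoding.GetEncoding(1256);
  try
    Result := Enc.GetString(Enc.GetBytes(Text));
  finally
    Enc.Free;
  end;
end;

begin
  // ASCII text is representable in code page 1256 and should be unchanged.
  Writeln(FilterTo1256('abc 123') = 'abc 123');
end.
```

Comparing the filtered string with the input is one way to turn the filter into a validation check, with the caveat that code page 1256 also contains some non-Arabic, non-English characters.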
Mahdi Safsafi 225 Posted July 1, 2020

Quote:
> Why on earth are you using RegEx at all?

There are many reasons:
- This is a good place where using RegEx makes sense.
- RegEx interacts perfectly with the Unicode database (UDB). E.g. '^[\p{Arabic}\p{P}A-Za-z0-9\s]+$'. How would someone represent \p{P} or \p{Arabic} as smoothly in hand-written code?
- Often (but not always), RegEx can outperform a hand-written function, especially if you compile the pattern. I didn't test it, but I believe the RegEx solution performs better than your hand-written solution.
- RegEx is much faster to type and to read.
- Extending a RegEx pattern is much simpler than extending a function.
- ...

Quote:
> RegEX as it will be different problem as numbers settings in the OS might be Arabic (those numbers in fact Hindu mistakenly called Arabic) for all Locale and they will cause havoc.

AFAIK, RegEx does not rely on the OS. It has its own DB.
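The pattern from this post can be tried out with a short console program. This is a sketch under assumptions: it relies on the PCRE build bundled with Delphi supporting Unicode script properties such as \p{Arabic}, as the post implies, and it uses the roCompiled option to create the expression once and reuse it. No timing is done here, so it says nothing about the performance claim.

```delphi
program ScriptClassDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

var
  Validator: TRegEx;
begin
  // Pattern taken from the post; the expression is created once and reused
  // for every IsMatch call below.
  Validator := TRegEx.Create('^[\p{Arabic}\p{P}A-Za-z0-9\s]+$', [roCompiled]);

  // ASCII letters, digits, punctuation and a space.
  Writeln(Validator.IsMatch('abc, 123'));
  // Arabic letters followed by an English word.
  Writeln(Validator.IsMatch(#$0645#$0631#$062D#$0628#$0627 + ' abc'));
  // U+00C4 (Ä) is neither Arabic, punctuation, ASCII nor whitespace.
  Writeln(Validator.IsMatch('abcdefg' + #$00C4));
end.
```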
Guest Posted July 1, 2020

@Mahdi Safsafi Thank you, I didn't know that. My approach was one line to filter out every other Unicode char. It will be interesting to see how RegEx handles numbers (Hindu and Arabic); I will investigate later.
Mahdi Safsafi 225 Posted July 1, 2020

29 minutes ago, Kas Ob. said:
> and will be interesting to see how RegEX handle numbers (Hindu and Arabic), later will investigate.

\d handles digits from any language.
A.M. Hoornweg 144 Posted July 6, 2020

On 7/1/2020 at 9:12 AM, Mahdi Safsafi said:
> For curiosity, why are you mixing arabic with latin letters. Does such ArabicLatin word make sense for you?

In my experience, when reporting is done on oil wells in the Middle East, the Arabic text is often interspersed with English technical terms. And for a developer it is quite a challenge to get mixed LTR-RTL text input right.
Mahdi Safsafi 225 Posted July 6, 2020

1 hour ago, A.M. Hoornweg said:
> In my experience, when reporting is done on oil wells in the Middle East, the Arabic text is often interspersed with English technical terms. And for a developer it is quite a challenge to get mixed LTR-RTL text input right.

Thanks man! An "ArabicWord EnglishWord" sentence like that makes sense to me, but a single word that mixes Arabic and English letters, such as "ArabicLettersEnglishLetters", is much harder to make sense of.