Regex Validate string

Skullcode · July 1, 2020

I have created this function to allow only Arabic and English letters and digits in a string:

var
  Regexs : TRegEx;
begin
  if Regexs.IsMatch(astr, '[ء-ي-A-Z-a-z-0-9 ]+') then
  begin
    Result := True;
  end else
  begin
    Result := False;
  end;

If I pass a string such as abcdefgÄ, the result is True, even though I did not specify Ä in the regex pattern.

How can I make the regex return True only when the whole string fits the given pattern?

Mahdi Safsafi · July 1, 2020

Quote

If I pass a string such as abcdefgÄ, the result is True, even though I did not specify Ä in the regex pattern.

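Without anchors, IsMatch returns True as soon as any part of the string matches. You need to anchor the pattern at the beginning "^" and at the end of the string (EOS) "$":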

if Regexs.IsMatch(astr, '^[ء-ي-A-Z-a-z-0-9 ]+$') then // match from the beginning to the end of astr
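A minimal sketch of the difference the anchors make. It assumes a console program and writes the Arabic range as code points (\x{0621}-\x{064A}, the same range as ء-ي) so that the source file encoding does not matter; the constant names are made up for illustration.

```pascal
program AnchorDemo;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

const
  // Same character class as in the question, minus the stray hyphens.
  Unanchored = '[\x{0621}-\x{064A}A-Za-z0-9 ]+';
  Anchored   = '^[\x{0621}-\x{064A}A-Za-z0-9 ]+$';

var
  Sample: string;
begin
  Sample := 'abcdefg' + #$00C4; // 'abcdefg' followed by U+00C4 (A with diaeresis)

  // TRUE: the leading 'abcdefg' alone is enough for an unanchored match.
  Writeln(TRegEx.IsMatch(Sample, Unanchored));

  // FALSE: the whole input must fit the class, and U+00C4 does not.
  Writeln(TRegEx.IsMatch(Sample, Anchored));
end.
```

Note that in PCRE "$" also matches just before a trailing line break, so trimming the input first (as later posts do) is still worthwhile.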

Mahdi Safsafi · July 1, 2020

Out of curiosity, why are you mixing Arabic with Latin letters? Does such an ArabicLatin word make sense to you?

Skullcode · July 1, 2020

30 minutes ago, Mahdi Safsafi said:

Out of curiosity, why are you mixing Arabic with Latin letters? Does such an ArabicLatin word make sense to you?

I am trying to validate input typed by a client. I want to block special characters and allow only Arabic and English letters.

Skullcode · July 1, 2020

Currently I use this function. Is it fine, or does it need more correction?


function Checkstr(const astr: string): Boolean;
var
  Regexs : TRegEx;
  i : integer;
  svalue : string;
  Allowed : string;
begin
  svalue := Trim(astr);
  for i := 1 to Length(svalue) do
  begin
    if Regexs.IsMatch(svalue[i], '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$') then
    begin
      Allowed := 'YES';
    end else
    begin
      Allowed := 'NO';
      Break;
    end;
  end;

  if Allowed = 'YES' then
  begin
    Result := True;
  end else
  begin
    Result := False;
  end;
end;

Edited July 1, 2020 by Skullcode

aehimself · July 1, 2020

18 minutes ago, Skullcode said:

Currently I use this function. Is it fine, or does it need more correction?



function Checkstr(const astr: string): Boolean;
var
  Regexs : TRegEx;
  i : integer;
  svalue : string;
  Allowed : string;
begin
  svalue := Trim(astr);
  for i := 1 to Length(svalue) do
  begin
    if Regexs.IsMatch(svalue[i], '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$') then
    begin
      Allowed := 'YES';
    end else
    begin
      Allowed := 'NO';
      Break;
    end;
  end;

  if Allowed = 'YES' then
  begin
    Result := True;
  end else
  begin
    Result := False;
  end;
end;

Ummm... why use a string as the flag?

Why not get rid of Allowed and set Result directly? Why go through the string character by character when RegEx can validate the whole string at once?

If I'm not mistaken, you'll achieve the same with this one line:

Result := Regexs.IsMatch(astr.Trim, '^[ء-يA-Za-z0-9$&+=?@#~<>.^*()%!\s]+$');

Anders Melander · July 1, 2020

Why on earth are you using RegEx at all?

Just use one of the many existing functions or bake your own:

function ValidateChars(const Value, ValidChars: string): boolean;
begin
  for var v in Value do
  begin
    var Valid: boolean := False;
    for var c in ValidChars do
      if (c = v) then
      begin
        Valid := True;
        break;
      end;
    if (not Valid) then
      Exit(False);
  end;
  Result := True;
end;

Also consider using character classes. See TCharHelper in the System.Character unit.
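As a rough sketch of that suggestion, the per-character check can combine an ASCII set, TCharHelper.IsWhiteSpace and a code-point range. The names IsAllowedChar and IsValidInput are hypothetical, and the range U+0621..U+064A is assumed to be equivalent to the ء-ي range used in the patterns above.

```pascal
program CharClassDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Character;

// Hypothetical helper: True for ASCII letters and digits, whitespace,
// and the Arabic letter range U+0621..U+064A.
function IsAllowedChar(C: Char): Boolean;
begin
  Result := CharInSet(C, ['A'..'Z', 'a'..'z', '0'..'9'])
    or C.IsWhiteSpace                      // TCharHelper, System.Character
    or ((C >= #$0621) and (C <= #$064A));
end;

// Hypothetical validator: rejects empty input and any disallowed character.
function IsValidInput(const Value: string): Boolean;
var
  C: Char;
begin
  Result := Value <> '';
  for C in Value do
    if not IsAllowedChar(C) then
      Exit(False);
end;

begin
  Writeln(IsValidInput('abc 123'));            // TRUE
  Writeln(IsValidInput('abcdefg' + #$00C4));   // FALSE: U+00C4 is not allowed
end.
```

C.IsLetter is deliberately not used here, because it would also accept letters such as Ä.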

Kas Ob. · July 1, 2020

Don't validate character by character, as that becomes a maze to follow with Unicode. And don't use RegEx, as that brings a different problem: the OS number settings might be Arabic for any locale (those digits are in fact Hindu, mistakenly called Arabic), and they will cause havoc.

I do it differently. I wrote a filter that removes any non-English and non-Arabic characters:

TEncoding.GetEncoding(1256).GetString(TEncoding.GetEncoding(1256).GetBytes(aText))

Characters from any other language become '?', and so do many special characters. Special characters were not a problem in my case; the requirement was to block bad (offensive) words written in other languages that the administrators don't understand.

I don't use TEncoding but the Windows API directly, but I think you get the idea.
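The same round trip can be turned into a yes/no check by comparing the result with the original text. This is only a sketch under a few assumptions: the function name SurvivesCodePage1256 is made up, code page 1256 must be available on the system, and the encoding object returned by TEncoding.GetEncoding is freed by the caller (the one-liner above creates two such objects and never frees them).

```pascal
program CodePageDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: True when Text is unchanged after being converted
// to Windows-1256 and back, i.e. every character exists in that code page.
function SurvivesCodePage1256(const Text: string): Boolean;
var
  Enc: TEncoding;
begin
  Enc := TEncoding.GetEncoding(1256);
  try
    Result := Enc.GetString(Enc.GetBytes(Text)) = Text;
  finally
    Enc.Free; // non-standard encodings are owned by the caller
  end;
end;

begin
  Writeln(SurvivesCodePage1256('abc 123'));
  Writeln(SurvivesCodePage1256('abcdefg' + #$00C4));
end.
```

Keep in mind that Windows-1256 contains more than Arabic and English letters (punctuation and some accented Latin letters, for example), so this is a looser filter than the character classes discussed earlier.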

Mahdi Safsafi · July 1, 2020

Quote

Why on earth are you using RegEx at all?

There are many reasons:

- This is a good place where using RegEx makes sense.

- RegEx works well with the Unicode database. E.g.: '^[\p{Arabic}\p{P}A-Za-z0-9\s]+$'. How can someone represent \p{P} or \p{Arabic} smoothly in hand-written code?

- In many cases (but not always), RegEx can outperform a hand-written function, especially if you compile the patterns. I didn't test, but I believe the RegEx solution performs better than your hand-written solution.

- RegEx is much faster to type and to read.

- Extending a RegEx pattern is much simpler than extending a function.
- ...
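To illustrate the points about Unicode properties and compiled patterns, here is a small sketch that builds the pattern once with roCompiled and reuses it. It assumes the PCRE build bundled with your Delphi version supports \p{...} properties, as the post states; the variable name Validator is made up.

```pascal
program PropertyDemo;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

var
  Validator: TRegEx;
begin
  // Create the regex once and reuse it for every input.
  Validator := TRegEx.Create('^[\p{Arabic}\p{P}A-Za-z0-9\s]+$', [roCompiled]);

  Writeln(Validator.IsMatch('well 42'));
  Writeln(Validator.IsMatch('abcdefg' + #$00C4));
end.
```

Extending the allowed set then means editing the character class only, for example adding another \p{...} property.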

Quote

RegEx brings a different problem: the OS number settings might be Arabic for any locale (those digits are in fact Hindu, mistakenly called Arabic), and they will cause havoc.

AFAIK, RegEx does not rely on the OS. It has its own database.

Kas Ob. · July 1, 2020

@Mahdi Safsafi Thank you, I didn't know that.

It was one line to filter out every other Unicode character, and it will be interesting to see how RegEx handles numbers (Hindu and Arabic). I will investigate later.

Mahdi Safsafi · July 1, 2020

29 minutes ago, Kas Ob. said:

and it will be interesting to see how RegEx handles numbers (Hindu and Arabic). I will investigate later.

\d handles digits from any language.

A.M. Hoornweg · July 6, 2020

On 7/1/2020 at 9:12 AM, Mahdi Safsafi said:

Out of curiosity, why are you mixing Arabic with Latin letters? Does such an ArabicLatin word make sense to you?

In my experience, when reporting is done on oil wells in the Middle East, the Arabic text is often interspersed with English technical terms. And for a developer it is quite a challenge to get mixed LTR-RTL text input right.

Mahdi Safsafi · July 6, 2020

1 hour ago, A.M. Hoornweg said:

In my experience, when reporting is done on oil wells in the Middle East, the Arabic text is often interspersed with English technical terms. And for a developer it is quite a challenge to get mixed LTR-RTL text input right.

Thanks man!

An "ArabicWord EnglishWord" sentence makes sense to me, but a single word that mixes Arabic and English letters, such as "ArabicLettersEnglishLetters", is much harder to make sense of.
