pyscripter

Using PCRE options that are not exposed.


PCRE, the regular expression engine used in Delphi, has a large number of compile-time options, only a few of which are exposed in the high-level (System.RegularExpressions) or the low-level (System.RegularExpressionsCore) Delphi interface.  For example, a useful PCRE option that is not exposed is PCRE_UCP, which controls the meaning of \w, \d, etc.  When this option is set, \w matches any Unicode letter or digit as well as the _ character.  If it is not set (as in Delphi's default usage), \w only matches ASCII letters, digits and _.  Class helpers can come to the rescue again.

 

uses
  System.RegularExpressionsAPI,
  System.RegularExpressionsCore,
  System.RegularExpressions;

type
  { TPerlRegExHelper }
  TPerlRegExHelper = class helper for TPerlRegEx
    procedure SetAdditionalPCREOptions(PCREOptions : Integer);
  end;

procedure TPerlRegExHelper.SetAdditionalPCREOptions(PCREOptions: Integer);
begin
  with Self do FPCREOptions := FPCREOptions or PCREOptions;
end;


type
  { TRegExHelper }
  TRegExHelper = record helper for TRegEx
  public
    procedure Study;
    procedure SetAdditionalPCREOptions(PCREOptions : Integer);
  end;

procedure TRegExHelper.Study;
begin
  with Self do FRegEx.Study;
end;

procedure TRegExHelper.SetAdditionalPCREOptions(PCREOptions: Integer);
begin
  with Self do FRegEx.SetAdditionalPCREOptions(PCREOptions);
end;

Example usage: 

 

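var
  RE : TRegEx;
  Match : TMatch;

begin
  RE := TRegEx.Create('\w+');
  // Add the option before the first match, so it is in place when the pattern is compiled
  RE.SetAdditionalPCREOptions(PCRE_UCP);  // No match without this
  Match := RE.Match('汉堡包/漢堡包');
  if Match.Success then
    ShowMessage(Match.Groups[0].Value);
end;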
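
The same helper can also be used with the low-level TPerlRegEx class directly. What follows is only a minimal console sketch, not tested against every Delphi version: the program name UcpDemo is made up for illustration, it repeats the class helper from above so that it is self-contained, and it assumes that the `with Self do` access to the private FPCREOptions field still compiles in your Delphi version and that the pattern is compiled lazily on the first Match call.

```delphi
program UcpDemo;  // hypothetical program name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressionsAPI,   // declares PCRE_UCP
  System.RegularExpressionsCore;  // declares TPerlRegEx

type
  { Same helper as in the post, repeated to keep the sketch self-contained }
  TPerlRegExHelper = class helper for TPerlRegEx
    procedure SetAdditionalPCREOptions(PCREOptions: Integer);
  end;

procedure TPerlRegExHelper.SetAdditionalPCREOptions(PCREOptions: Integer);
begin
  // OR the extra flags into the private option set used when compiling
  with Self do FPCREOptions := FPCREOptions or PCREOptions;
end;

var
  RE: TPerlRegEx;
begin
  RE := TPerlRegEx.Create;
  try
    RE.RegEx := '\w+';
    // Assumption: this has to happen before the pattern is compiled,
    // and after any assignment to RE.Options
    RE.SetAdditionalPCREOptions(PCRE_UCP);
    RE.Subject := '汉堡包/漢堡包';
    if RE.Match then
      Writeln('Matched ', Length(RE.MatchedText), ' characters')
    else
      Writeln('No match');
  finally
    RE.Free;
  end;
end.
```

The sketch reports the match length instead of the matched text because a Windows console may not display CJK characters correctly.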

 
