Mike Torrettinni

Possible custom Format types?


I'm trying to generalize my method to remove formatting tokens from the string, and just want to be sure that I handle all cases.

So the question is: besides d, e, f, g, m, n, p, s, u and x, are there other possible type characters in the Format (SysUtils) function?

 

I have this method and would like to make sure it covers all options:

 

// Remove simple format tokens (%s, %d, ...) from a string.
// Used to turn format-ready strings into display-ready strings, e.g.
// 'Customer name %s contains invalid characters.' -> 'Customer name contains invalid characters.'
function RemoveFormatSettingsFromString(const aString: string): string;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
var
  vReplacement: string;
begin
  Result := aString;
  for vReplacement in cReplacements do
    Result := StringReplace(Result, vReplacement, '', [rfReplaceAll]);
end;

 

Guest

Not an answer to your question, but a suggestion:

Change the function to return a Boolean that tells you whether such invalid characters were detected, and return the sanitized string in a var parameter. That way you can show and log the original invalid string, yet still silently discard it and use the fixed one.
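A minimal sketch of that suggestion, assuming the simple replacement list from the first post. The name TryRemoveFormatTokens and its parameter names are hypothetical, not from the thread:

```pascal
program TryRemoveDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical variant: returns True if at least one token was removed and
// hands the cleaned string back through the var parameter.
function TryRemoveFormatTokens(const aString: string; var aCleaned: string): Boolean;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
var
  vReplacement: string;
begin
  aCleaned := aString;
  for vReplacement in cReplacements do
    aCleaned := StringReplace(aCleaned, vReplacement, '', [rfReplaceAll]);
  // Anything removed means the input contained at least one token
  Result := aCleaned <> aString;
end;

var
  vOriginal, vCleaned: string;
begin
  vOriginal := 'Customer name %s contains invalid characters.';
  if TryRemoveFormatTokens(vOriginal, vCleaned) then
    WriteLn('Original: ', vOriginal);  // log the original only when it changed
  WriteLn('Cleaned:  ', vCleaned);
end.
```

The caller keeps the original string for logging and uses the cleaned one for display.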

Guest
 
Quote

 


"%" [index ":"] ["-"] [width] ["." prec] type

A format specifier begins with a % character. After the % come the following elements, in this order: 

An optional argument zero-offset index specifier (that is, the first item has index 0), [index ":"] 

An optional left justification indicator, ["-"] 

An optional width specifier, [width] 

An optional precision specifier, ["." prec] 

The conversion type character, type

 

 

 

So the case where every format specifier is a % followed by just one character is a special case.

 

 

What about matching for %[anything w/o space][type][one space] where type is one of those type characters?

57 minutes ago, Mike Torrettinni said:

I'm trying to generalize my method to remove formatting tokens from the string, and just want to be sure that I handle all cases.

So the question is: besides d, e, f, g, m, n, p, s, u and x, are there other possible type characters in the Format (SysUtils) function?

 

I have this method and would like to make sure it covers all options:

 


function RemoveFormatSettingsFromString(const aString: string): string;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');

That is just a simple start. There might be index, width and precision information, as in 'Total amount: %1.2f'. You will need much more parsing, perhaps some RegEx filtering.

 

Edited by Achim Kalwa

Guest

IMHO

Quote

"%" [index ":"] ["-"] [width] ["." prec] type

is the "Key".

 

This line specifies ALL the combinations you can expect. The specifier ALWAYS starts with % and ends with a "type" character, so you should be able to skip testing for a space after the type character.
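As a rough sketch of that idea (not code from this thread), the grammar line can be walked element by element. The names SpecifierEnd and StripFormatSpecifiers are hypothetical. The sketch assumes 1-based strings, accepts '*' wherever a number is allowed, treats '%%' as an escaped percent sign, and accepts type characters in either case:

```pascal
program StripSpecifiersDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  cTypeChars: TSysCharSet = ['d', 'e', 'f', 'g', 'm', 'n', 'p', 's', 'u', 'x',
                             'D', 'E', 'F', 'G', 'M', 'N', 'P', 'S', 'U', 'X'];

// Hypothetical helper: S[Start] must be '%'. Returns the index just past the
// specifier that starts there, or 0 if no valid specifier starts there.
function SpecifierEnd(const S: string; Start: Integer): Integer;
var
  i, Save: Integer;

  // Skips either a single '*' or a run of decimal digits
  procedure SkipNumber;
  begin
    if (i <= Length(S)) and (S[i] = '*') then
      Inc(i)
    else
      while (i <= Length(S)) and CharInSet(S[i], ['0'..'9']) do
        Inc(i);
  end;

begin
  Result := 0;
  i := Start + 1;
  // [index ":"] : only consume the number if a colon follows it
  Save := i;
  SkipNumber;
  if (i <= Length(S)) and (S[i] = ':') then
    Inc(i)
  else
    i := Save;
  // ["-"]
  if (i <= Length(S)) and (S[i] = '-') then
    Inc(i);
  // [width]
  SkipNumber;
  // ["." prec]
  if (i <= Length(S)) and (S[i] = '.') then
  begin
    Inc(i);
    SkipNumber;
  end;
  // type
  if (i <= Length(S)) and CharInSet(S[i], cTypeChars) then
    Result := i + 1;
end;

// Hypothetical: copies S, dropping every valid format specifier
function StripFormatSpecifiers(const S: string): string;
var
  i, Next: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(S) do
  begin
    if S[i] <> '%' then
      Next := 0
    else if (i < Length(S)) and (S[i + 1] = '%') then
    begin
      Result := Result + '%';  // '%%' is a literal percent sign
      Inc(i, 2);
      Continue;
    end
    else
      Next := SpecifierEnd(S, i);

    if Next > 0 then
      i := Next                // skip the whole specifier
    else
    begin
      Result := Result + S[i]; // ordinary character or a lone '%'
      Inc(i);
    end;
  end;
end;

begin
  WriteLn('[', StripFormatSpecifiers('Total amount: %1.2f'), ']');  // [Total amount: ]
  WriteLn('[', StripFormatSpecifiers('%0:-8s|%*d|100%%'), ']');     // [||100%]
end.
```

No test for a trailing space is needed here: the type character alone marks the end of the specifier.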

Guest

I don't know, but I seem to remember an escape character in the mix. I can't find it right now on my phone. In other words: how would you print a literal "%"?
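For reference, Format treats a doubled percent sign as the escape, so a stripping routine has to leave '%%' alone (or reduce it to '%'). A tiny illustration:

```pascal
program PercentEscapeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

begin
  // '%%' in the format string produces one literal '%' in the output
  WriteLn(Format('Progress: %d%%', [50]));  // Progress: 50%
end.
```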


You can find a function to strip out format specifiers (among other things), including the index, width and precision stuff, here:

https://bitbucket.org/anders_melander/better-translation-manager/src/a9e47ac90e7f80b67176cdb61b72aa34f4a8f165/Source/amLocalization.Normalization.pas#lines-306

 

The code as-is replaces %... with space to make the result readable.

 

This is the relevant code:

Result := Value;

// Find first format specifier
n := PosEx('%', Result, 1);

while (n > 0) and (n < Length(Result)) do
begin
  Inc(n);

  if (Result[n] = '%') then
  begin
    // Escaped % - ignore
    Delete(Result, n, 1);
  end else
  if (IsAnsi(Result[n])) and (AnsiChar(Result[n]) in ['0'..'9', '-', '.', 'd', 'u', 'e', 'f', 'g', 'n', 'm', 'p', 's', 'x']) then
  begin
    Result[n-1] := ' '; // Replace %... with space

    // Remove chars until end of format specifier
    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (Result[n] = ':') then
      Delete(Result, n, 1);

    if (Result[n] = '-') then
      Delete(Result, n, 1);

    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (Result[n] = '.') then
      Delete(Result, n, 1);

    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (IsAnsi(Result[n])) and (AnsiChar(Result[n]) in ['d', 'u', 'e', 'f', 'g', 'n', 'm', 'p', 's', 'x']) then
      Delete(Result, n, 1)
    else
    begin
      // Not a format string - undo
      Result := Value;
      break;
    end;
  end else
  begin
    // Not a format string - undo
    Result := Value;
    break;
  end;

  // Find next format specifier
  n := PosEx('%', Result, n);
end;

 

26 minutes ago, Anders Melander said:

The code as-is replaces %... with space to make the result readable.

Thanks!

Did you never need to simply delete the %... instead of replacing it with a space? I usually have spaces around the %..., so in this case you end up with a triple space, right?

'Customer name %s is wrong' -> 'Customer name   is wrong'.  No?

1 minute ago, Mike Torrettinni said:

Did you never need to simply delete the %... instead of replacing it with a space? I usually have spaces around the %..., so in this case you end up with a triple space, right?

'Customer name %s is wrong' -> 'Customer name   is wrong'.  No?

It depends on what I use the string for afterwards.

For example if I need to compare with another string I just trim consecutive spaces down to a single. If I need to parse out the individual words I leave the spaces there since the parser will skip over them anyway.

The function is for use in a translation tool. AFAIR it can also remove shortcut accelerators, () [] {} <> pairs and punctuation .:;? etc.
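Collapsing runs of spaces before a comparison could look roughly like this. It is only a sketch, CollapseSpaces is a hypothetical name, and it is not the routine from the translation tool:

```pascal
program CollapseSpacesDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: reduces every run of consecutive spaces to one space
function CollapseSpaces(const S: string): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
    // Copy a space only if the previous character was not a space
    if (S[i] <> ' ') or (i = 1) or (S[i - 1] <> ' ') then
      Result := Result + S[i];
end;

begin
  WriteLn(CollapseSpaces('Customer name   is wrong'));  // Customer name is wrong
end.
```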

17 minutes ago, Anders Melander said:

It depends on what I use the string for afterwards.

For example if I need to compare with another string I just trim consecutive spaces down to a single. If I need to parse out the individual words I leave the spaces there since the parser will skip over them anyway.

The function is for use in a translation tool. AFAIR it can also remove shortcut accelerators, () [] {} <> pairs and punctuation .:;? etc.

OK, makes sense. It looks like a very versatile function!


@Mike Torrettinni I don't know if this is going to help. A while ago I wrote a regular expression that matches the pattern "%" [index ":"] ["-"] [width] ["." prec] type used by Format:

%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]

I used the regex with Perl, but I believe it is also compatible with PCRE-like engines.
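In Delphi the expression could be tried with TRegEx from System.RegularExpressions, for example as below. This is a sketch only; the constant name cFormatSpecifier is made up. Note that the pattern as posted does not know about the '%%' escape, so it would also match the '%d' inside '%%d':

```pascal
program RegexStripDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

const
  // The expression from the post above, unchanged
  cFormatSpecifier = '%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]';

begin
  // Replace every match with an empty string
  WriteLn('[', TRegEx.Replace('Total amount: %1.2f for %0:s', cFormatSpecifier, ''), ']');
  // [Total amount:  for ]
end.
```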

37 minutes ago, Mahdi Safsafi said:

@Mike Torrettinni I don't know if this is going to help. A while ago I wrote a regular expression that matches the pattern "%" [index ":"] ["-"] [width] ["." prec] type used by Format:


%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]

I used the regex with Perl, but I believe it is also compatible with PCRE-like engines.

Thanks, but I rarely use RegEx and only with very simple expressions. 

1 hour ago, Mike Torrettinni said:

Thanks, but I rarely use RegEx and only with very simple expressions.

Wise decision.

They are fun to write but a nightmare to maintain.

2 hours ago, Mahdi Safsafi said:

%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]

Hmm. Looking at that RegEx I just realized that I forgot to handle the asterisk parameter specifier in my own code.

 

To fix it, replace all three occurrences of:

while (Result[n].IsDigit) do
  Delete(Result, n, 1);

with:

if (Result[n] = '*') then
  Delete(Result, n, 1)
else
  while (Result[n].IsDigit) do
    Delete(Result, n, 1);
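For context, the asterisk tells Format to take the index, width or precision from the argument list instead of from the format string. A small illustration follows. As far as I can tell from the code as posted, the first character test after the '%' (the set starting with '0'..'9') would probably also need '*' added, otherwise a specifier such as '%*d' lands in the "Not a format string" branch; treat that as an untested observation.

```pascal
program AsteriskWidthDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

begin
  // '*' consumes the next argument as the width: same as '%5d' with 42
  WriteLn('[', Format('%*d', [5, 42]), ']');   // [   42]
  // Combined with '-' for left justification
  WriteLn('[', Format('%-*d', [5, 42]), ']');  // [42   ]
end.
```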

 
