Possible custom Format types?

Mike Torrettinni · June 26, 2020

I'm trying to generalize my method to remove formatting tokens from the string, and just want to be sure that I handle all cases.

So, the question is: besides d, e, f, g, m, n, p, s, u and x, are there other possible type characters in the Format (SysUtils) function?

I have this method and would like to make sure it covers all options:

// Remove any %s, %d, ... tokens from a string.
// Used to turn format-ready strings into display-ready ones.
// 'Customer name %s contains invalid characters.' -> 'Customer name contains invalid characters.'
function RemoveFormatSettingsFromString(const aString: string): string;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
var
  vReplacement: string;
begin
  Result := aString;
  for vReplacement in cReplacements do
    Result := StringReplace(Result, vReplacement, '', [rfReplaceAll]);
end;

June 26, 2020

Not an answer to your question, but a suggestion

Change the function to return a Boolean that tells whether such invalid characters were detected, and return the sanitized string in a var parameter. That way you can show and log the original invalid string, yet still silently discard it and use the fixed one.
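
A minimal sketch of that suggestion, assuming the replacement list from the first post is kept as-is. The name TryRemoveFormatSettings and the aClean parameter are made up here for illustration and do not come from the thread:

```pascal
program SanitizeDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical variant of the original function: returns True when at least
// one token was removed and hands back the cleaned text in aClean.
function TryRemoveFormatSettings(const aString: string; var aClean: string): Boolean;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
var
  vReplacement: string;
begin
  aClean := aString;
  for vReplacement in cReplacements do
    aClean := StringReplace(aClean, vReplacement, '', [rfReplaceAll]);
  // The caller can log aString and display aClean.
  Result := aClean <> aString;
end;

var
  vClean: string;
begin
  if TryRemoveFormatSettings('Customer name %s contains invalid characters.', vClean) then
    Writeln('Sanitized: ', vClean);
end.
```

The caller keeps the original string for logging and only uses vClean for display.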

June 26, 2020

Quote
"%" [index ":"] ["-"] [width] ["." prec] type
A format specifier begins with a % character. After the % come the following elements, in this order:

An optional argument zero-offset index specifier (that is, the first item has index 0), [index ":"]

An optional left justification indicator, ["-"]

An optional width specifier, [width]

An optional precision specifier, ["." prec]

The conversion type character, type

So the cases where a formatter is just a % followed by a single character are special cases.

What about matching for %[anything w/o space][type][one space] where type is one of those type characters?

Achim Kalwa · June 26, 2020

57 minutes ago, Mike Torrettinni said:
I'm trying to generalize my method to remove formatting tokens from the string, and just want to be sure that I handle all cases.

So, the question is: besides d, e, f, g, m, n, p, s, u and x, are there other possible type characters in the Format (SysUtils) function?

I have this method and would like to make sure it covers all options:
function RemoveFormatSettingsFromString(const aString: string): string;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
That is just a simple start. There might be index, width and precision information, like 'Total amount: %1.2f'. You will need much more parsing. Perhaps some RegEx filtering.
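
A small sketch of the gap, running the function from the first post unchanged against a few inputs. The test strings other than 'Total amount: %1.2f' are made up for illustration:

```pascal
program FixedListGapDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Copied from the first post.
function RemoveFormatSettingsFromString(const aString: string): string;
const
  cReplacements: array [1..10] of string =
    ('%s ', '%d ', '%e ', '%f ', '%g ', '%m ', '%n ', '%p ', '%u ', '%x ');
var
  vReplacement: string;
begin
  Result := aString;
  for vReplacement in cReplacements do
    Result := StringReplace(Result, vReplacement, '', [rfReplaceAll]);
end;

begin
  // "%s " followed by a space is in the list, so it is removed.
  Writeln(RemoveFormatSettingsFromString('Customer name %s is wrong'));
  // Width and precision: "%1.2f" is not in the list, so it stays.
  Writeln(RemoveFormatSettingsFromString('Total amount: %1.2f'));
  // No space after the type character: "%s." stays as well.
  Writeln(RemoveFormatSettingsFromString('Invalid name: %s.'));
end.
```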

Edited June 26, 2020 by Achim Kalwa

Mike Torrettinni · June 26, 2020

4 minutes ago, Dany Marmur said:

So the cases where a formatter is just a % followed by a single character are special cases.

Wow, a lot more options. I see it now. I was reading this page and didn't scroll down: http://www.delphibasics.co.uk/RTL.asp?Name=format

June 26, 2020

IMHO

Quote


"%" [index ":"] ["-"] [width] ["." prec] type

is the "Key".

This line specifies ALL possible combinations you can expect. Obviously the formatter ALWAYS starts with % and ends with a "type" char. You should be able to skip testing for a space after the type char.
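
A rough sketch of a scanner that walks that line element by element, with no test for a trailing space. StripFormatSpecifiers is a hypothetical name, not code from the thread. The sketch is deliberately permissive (it does not check that an index is really followed by ':'), keeps '%%' as a literal '%', and accepts upper-case type characters on the assumption that Format treats them like lower-case ones:

```pascal
program StripSpecifiersDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: removes "%" [index ":"] ["-"] [width] ["." prec] type.
function StripFormatSpecifiers(const S: string): string;
const
  cTypes: TSysCharSet = ['d', 'e', 'f', 'g', 'm', 'n', 'p', 's', 'u', 'x',
                         'D', 'E', 'F', 'G', 'M', 'N', 'P', 'S', 'U', 'X'];
var
  i, j, Len: Integer;

  // Advances j past one expected character, if present.
  procedure SkipChar(C: Char);
  begin
    if (j <= Len) and (S[j] = C) then
      Inc(j);
  end;

  // Advances j past a run of digits or a single '*'.
  procedure SkipNumber;
  begin
    if (j <= Len) and (S[j] = '*') then
      Inc(j)
    else
      while (j <= Len) and CharInSet(S[j], ['0'..'9']) do
        Inc(j);
  end;

begin
  Result := '';
  Len := Length(S);
  i := 1;
  while i <= Len do
  begin
    // "%%" is an escaped percent sign: keep one "%".
    if (S[i] = '%') and (i < Len) and (S[i + 1] = '%') then
    begin
      Result := Result + '%';
      Inc(i, 2);
      Continue;
    end;
    if S[i] = '%' then
    begin
      j := i + 1;
      SkipNumber;      // index, or width when no ':' follows
      SkipChar(':');
      SkipChar('-');
      SkipNumber;      // width
      SkipChar('.');
      SkipNumber;      // precision
      if (j <= Len) and CharInSet(S[j], cTypes) then
      begin
        i := j + 1;    // drop the whole specifier
        Continue;
      end;
    end;
    // Ordinary character, or a '%' that does not start a specifier.
    Result := Result + S[i];
    Inc(i);
  end;
end;

begin
  Writeln(StripFormatSpecifiers('Total amount: %1.2f'));
  Writeln(StripFormatSpecifiers('100%% of %s'));
  Writeln(StripFormatSpecifiers('50% off'));
end.
```

A lone '%' that is not followed by a valid specifier is copied through unchanged, so ordinary text such as '50% off' survives.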

June 26, 2020

I dunno, but I seem to remember an escape character in the mix. I cannot find it now on the phone. That is, how would you print a "%"?

Anders Melander · June 26, 2020

You can find a function to strip out format specifiers (among other things), including the index, width and precision stuff, here:

https://bitbucket.org/anders_melander/better-translation-manager/src/a9e47ac90e7f80b67176cdb61b72aa34f4a8f165/Source/amLocalization.Normalization.pas#lines-306

The code as-is replaces %... with space to make the result readable.

This is the relevant code:

Result := Value;

// Find first format specifier
n := PosEx('%', Result, 1);

while (n > 0) and (n < Length(Result)) do
begin
  Inc(n);

  if (Result[n] = '%') then
  begin
    // Escaped % - ignore
    Delete(Result, n, 1);
  end else
  if (IsAnsi(Result[n])) and (AnsiChar(Result[n]) in ['0'..'9', '-', '.', 'd', 'u', 'e', 'f', 'g', 'n', 'm', 'p', 's', 'x']) then
  begin
    Result[n-1] := ' '; // Replace %... with space

    // Remove chars until end of format specifier
    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (Result[n] = ':') then
      Delete(Result, n, 1);

    if (Result[n] = '-') then
      Delete(Result, n, 1);

    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (Result[n] = '.') then
      Delete(Result, n, 1);

    while (Result[n].IsDigit) do
      Delete(Result, n, 1);

    if (IsAnsi(Result[n])) and (AnsiChar(Result[n]) in ['d', 'u', 'e', 'f', 'g', 'n', 'm', 'p', 's', 'x']) then
      Delete(Result, n, 1)
    else
    begin
      // Not a format string - undo
      Result := Value;
      break;
    end;
  end else
  begin
    // Not a format string - undo
    Result := Value;
    break;
  end;

  // Find next format specifier
  n := PosEx('%', Result, n);
end;

Anders Melander · June 26, 2020

14 minutes ago, Dany Marmur said:

how would you print a "%"

'%%'

Mike Torrettinni · June 26, 2020

26 minutes ago, Anders Melander said:

The code as-is replaces %... with space to make the result readable.

Thanks!

Did you never need to delete the %... outright instead of replacing it with a space? I usually have spaces around the %..., so in this case you end up with a triple space, right?

'Customer name %s is wrong' -> 'Customer name   is wrong'. No?

Anders Melander · June 26, 2020

1 minute ago, Mike Torrettinni said:

Did you never need to delete the %... outright instead of replacing it with a space? I usually have spaces around the %..., so in this case you end up with a triple space, right?

'Customer name %s is wrong' -> 'Customer name   is wrong'. No?

It depends on what I use the string for afterwards.

For example if I need to compare with another string I just trim consecutive spaces down to a single. If I need to parse out the individual words I leave the spaces there since the parser will skip over them anyway.
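
Trimming consecutive spaces down to a single one could look roughly like this. CollapseSpaces is a hypothetical helper written for this illustration, not the routine from the translation tool:

```pascal
program CollapseSpacesDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: replaces every run of spaces with one space.
function CollapseSpaces(const S: string): string;
var
  C: Char;
  vPrevSpace: Boolean;
begin
  Result := '';
  vPrevSpace := False;
  for C in S do
  begin
    // Skip a space that directly follows another space.
    if (C = ' ') and vPrevSpace then
      Continue;
    Result := Result + C;
    vPrevSpace := (C = ' ');
  end;
end;

begin
  // Three spaces, as left behind when '%s' is replaced with a space.
  Writeln(CollapseSpaces('Customer name   is wrong'));
end.
```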

The function is for use in a translation tool. AFAIR it can also remove shortcut accelerators, () [] {} <> pairs and punctuation .:;? etc.

Mike Torrettinni · June 26, 2020

17 minutes ago, Anders Melander said:

It depends on what I use the string for afterwards.

For example if I need to compare with another string I just trim consecutive spaces down to a single. If I need to parse out the individual words I leave the spaces there since the parser will skip over them anyway.

The function is for use in a translation tool. AFAIR it can also remove shortcut accelerators, () [] {} <> pairs and punctuation .:;? etc.

OK, makes sense. It looks like a very versatile function!

Mahdi Safsafi · June 26, 2020

@Mike Torrettinni I don't know if this is going to help. A while ago I wrote a regular expression to match the following pattern for Format: "%" [index ":"] ["-"] [width] ["." prec] type

%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]

The regex was used with Perl, but I believe it is still compatible with PCRE-like engines.
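
For Delphi, a minimal sketch of applying that expression with TRegEx from System.RegularExpressions (which is PCRE-based). The constant name and the sample string are made up for illustration:

```pascal
program RegExStripDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.RegularExpressions;

const
  // The expression from the post above, unchanged.
  cSpecifier = '%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]';

begin
  // Every match is replaced with an empty string.
  Writeln(TRegEx.Replace('Total: %8.2f for %0:s', cSpecifier, ''));
end.
```

Note that the spaces around each specifier are left in place, so the result may contain double spaces.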

Mike Torrettinni · June 26, 2020

37 minutes ago, Mahdi Safsafi said:
@Mike Torrettinni I don't know if this is going to help. A while ago I wrote a regular expression to match the following pattern for Format: "%" [index ":"] ["-"] [width] ["." prec] type
%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]
The regex was used with Perl, but I believe it is still compatible with PCRE-like engines.

Thanks, but I rarely use RegEx and only with very simple expressions.

Anders Melander · June 26, 2020

1 hour ago, Mike Torrettinni said:

Thanks, but I rarely use RegEx and only with very simple expressions.

Wise decision.

They are fun to write but a nightmare to maintain.

Anders Melander · June 26, 2020

2 hours ago, Mahdi Safsafi said:

%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]

Hmm. Looking at that RegEx I just realized that I forgot to handle the asterisk parameter specifier in my own code.

To fix it, replace all three occurrences of:

while (Result[n].IsDigit) do
  Delete(Result, n, 1);

with:

if (Result[n] = '*') then
  Delete(Result, n, 1)
else
  while (Result[n].IsDigit) do
    Delete(Result, n, 1);

Note that '*' also has to be added to the character set in the first "in [...]" check ('0'..'9', '-', '.', ...); otherwise a specifier such as '%*d' is rejected as "not a format string" before these loops are reached.

Fr0sT.Brutal · July 24, 2020

Don't forget '%%d' (meant to produce '%d' after formatting) and '%s%s' cases
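
One way to cover both cases with a regular expression is to let '%%' match first and emit a single '%' for it. Simply deleting matches of the expression posted earlier would turn '%%d' into '%', because the second '%' pairs up with the 'd'. The sketch below is an assumption-laden illustration: StripSpecifiersKeepEscapes is a hypothetical name, and the pattern is the earlier one with '%%|' prepended:

```pascal
program EscapedPercentDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.RegularExpressions;

// Hypothetical helper: drops format specifiers, turns "%%" into "%".
function StripSpecifiersKeepEscapes(const S: string): string;
const
  cPattern = '%%|%((\d+|\*)\:)?[\-]?(\d+|\*)?(\.(\d+|\*))?[duefgnmpsx]';
var
  vMatch: TMatch;
  vNext: Integer;
begin
  Result := '';
  vNext := 1;
  for vMatch in TRegEx.Matches(S, cPattern) do
  begin
    // Copy the text between the previous match and this one.
    Result := Result + Copy(S, vNext, vMatch.Index - vNext);
    // An escaped percent sign is kept as a single '%'.
    if vMatch.Value = '%%' then
      Result := Result + '%';
    vNext := vMatch.Index + vMatch.Length;
  end;
  // Copy whatever follows the last match.
  Result := Result + Copy(S, vNext, MaxInt);
end;

begin
  Writeln(StripSpecifiersKeepEscapes('%%d'));
  Writeln(StripSpecifiersKeepEscapes('Name: %s%s.'));
end.
```

Because no trailing space is required, back-to-back specifiers such as '%s%s' are removed as two separate matches.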
