dummzeuch

Grep search and DFM files


I looked into two problems with GExperts Grep and DFM files. While it is possible to fix both, the fixes have side effects:

 

Grep Search does not find strings in DFM files that are split in multiple lines (#49)

Searching for text that gets split across multiple lines can be solved by first joining these lines:

  object l_Test: TLabel
    Left = 8
    Top = 8
    Width = 434
    Height = 13
    Caption = 
      'Dies ist die Ueberschrift, und sie ist sehr lang, damit es einen ' +
      'Umbruch in der DFM-Datei gibt'
  end

So instead of searching for "einen Umbruch" across two lines and not finding it, we join these lines into one, and then we will find it:

'Dies ist die '#220'berschrift, und sie ist sehr lang, damit es einen Umbruch in der DFM-Datei gibt'

This works fine but breaks the preview in the result window. But that could probably be fixed too.
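A minimal sketch of the joining step, written in Python purely for illustration (GExperts itself is written in Delphi, and this is not its code). The hypothetical helper `join_wrapped_literals` assumes that each wrapped part ends with `' +` and that the next part starts with a quote, as in the example above. Literals that wrap next to a `#<number>` code are left untouched.

```python
def join_wrapped_literals(lines):
    """Merge DFM string literals that were wrapped with a trailing "' +".

    Assumption: a wrapped part ends with "' +" and the following part
    begins with "'". Anything else is copied through unchanged.
    """
    out = []
    for line in lines:
        text = line.rstrip()
        part = text.lstrip()
        if out and out[-1].endswith("' +") and part.startswith("'"):
            # Drop "' +" from the previous part and the opening quote here.
            out[-1] = out[-1][:-3] + part[1:]
        else:
            out.append(text)
    return out


dfm = [
    "    Caption = ",
    "      'Dies ist die Ueberschrift, und sie ist sehr lang, damit es einen ' +",
    "      'Umbruch in der DFM-Datei gibt'",
    "  end",
]
joined = join_wrapped_literals(dfm)
found = any("einen Umbruch" in line for line in joined)
```

Because several source lines collapse into one, the line numbers of the joined text no longer match the file, which is one plausible reason the result preview breaks.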

But what about this:

  object m_memo: TMemo
    Left = 8
    Top = 104
    Width = 281
    Height = 193
    Lines.Strings = (
      'first line'
      'second line'
      'third line'
      'and a '
      'veeeeeeeeeeeeerrrrrrrrrrrrrrrrrrrrrrrrrryyyyyyyyyyyyy'
      'yyyyyy long line at the end that should get wrapped.')
  end

There are three short lines and then a very long fourth line that gets wrapped into 3 separate lines in the DFM file. I can see no way to detect this. I wonder how the Delphi streaming mechanism handles this. (Actually it doesn't. When I load that form, I get a memo with 6 lines. At least in Delphi 2007. And that's a bug.)

EDIT: Something apparently went wrong in my first test. I cannot reproduce the problem any more. Now, if I add long lines, the DFM file looks like this:

  object m_memo: TMemo
    Left = 8
    Top = 104
    Width = 281
    Height = 193
    Lines.Strings = (
      'first line'
      'second line'
      'third line'

        'and a ' +
        'veeeeeeeeeeeeerrrrrrrrrrrrrrrrrrrrrrrrrryyyyyyyyyyyyy' +
        'yyyyyy long line at the end that should get wrapped.')
  end

And that can easily get parsed.
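As a rough illustration of why the `+` makes this parseable, here is a Python sketch (not GExperts code). The hypothetical function `memo_lines` takes the lines between the parentheses of `Lines.Strings` and assumes plain quoted literals without `#<number>` codes or doubled quotes.

```python
def memo_lines(dfm_lines):
    """Rebuild the logical memo lines from the body of Lines.Strings = ( ... ).

    Assumption: every part is a plain quoted literal; a trailing " +"
    means the next part belongs to the same memo line.
    """
    items = []
    current = ""
    for raw in dfm_lines:
        text = raw.strip()
        if not text:
            continue  # blank line written before a wrapped item
        if text.endswith(")"):
            text = text[:-1]  # closing parenthesis of the list
        wrapped = text.endswith(" +")
        if wrapped:
            text = text[:-2].rstrip()
        current += text[1:-1]  # strip the surrounding quotes
        if not wrapped:
            items.append(current)
            current = ""
    return items


body = [
    "      'first line'",
    "      'second line'",
    "      'third line'",
    "",
    "        'and a ' +",
    "        'very long line ' +",
    "        'that got wrapped.')",
]
parsed = memo_lines(body)
```

A part that ends without `+` closes the current memo line, so the three wrapped parts come back as a single fourth item.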

 

Grep search fails to find some words with Umlaut in fmx files (#112) (which also applies to DFM files)

  object l_short: TLabel
    Left = 8
    Top = 80
    Width = 83
    Height = 13
    Caption = 'Kurze '#220'berschrift'
  end

Any non-ASCII character apparently gets converted to its #<number> representation. Again, a fix would be to parse these strings and convert them back:

'Kurze Überschrift'

Which then would be found when searching for "Überschrift".
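The conversion could look roughly like the following Python sketch (an illustration only, not the GExperts implementation). The hypothetical `decode_dfm_string` assumes decimal `#<number>` codes and treats each number as a Unicode code point. That matches `#220` for "Ü", but an ANSI-era DFM file may need a code page lookup for other values.

```python
import re

# Either a quoted literal (with '' as an escaped quote) or a #<number> code.
_TOKEN = re.compile(r"'((?:[^']|'')*)'|#(\d+)")


def decode_dfm_string(value):
    """Turn a DFM string value into the text it represents.

    Assumption: each #<number> is a decimal Unicode code point.
    """
    parts = []
    for quoted, code in _TOKEN.findall(value):
        if code:
            parts.append(chr(int(code)))
        else:
            parts.append(quoted.replace("''", "'"))
    return "".join(parts)


caption = decode_dfm_string("'Kurze '#220'berschrift'")
hit = "Überschrift" in caption
```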

But then, how do we search for

'first line'#13#10'second line' ?

The proposed fix would convert the #13#10 numerical representation to a carriage return and a line feed character, respectively. Searching for "#13#10" would then no longer find these. Searching for "\r\n" with Regular Expression turned on also doesn't find it, but that might still be a bug in my code.
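The trade-off can be shown with a small Python comparison (an illustration with Python's `re` module, which says nothing about the regex engine GExperts uses). The variable `decoded` is written out by hand as the text the proposed fix would produce.

```python
import re

raw = "'first line'#13#10'second line'"   # text as stored in the DFM file
decoded = "first line\r\nsecond line"     # same value after converting #13#10

literal_in_raw = "#13#10" in raw                            # verbatim search
literal_in_decoded = "#13#10" in decoded                    # after the fix
regex_in_raw = re.search(r"\r\n", raw) is not None
regex_in_decoded = re.search(r"\r\n", decoded) is not None
```

Here the literal search only matches the raw text and the `\r\n` pattern only matches the decoded text, so each form of the file serves a different kind of query.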

 

So, after spending a few hours on these "bug fixes" I am not sure whether I want to commit them. They might break more than they fix.

 

Any ideas on this?

Edited by dummzeuch

I would not try to interpret strings encoded in a special way (such as # followed by a number, or string lists whose lines begin and end with ' and use special escaping for '). You would expect the text search to work on the verbatim text file and to consider only things that apply to every text file, such as the text encoding (UTF-8, Windows-1252), but otherwise do nothing smart that interprets syntax.

 

If such a mode were introduced, I think it should be an explicit option. Then people will also be less surprised by behavioral changes, which would occur only if that option is checked.

Edited by mael


Well, that sounds OK for English users or people who program in English. For other languages, sooner or later you may run into a curious situation. I am the one who reported #112. After a decade of using GExperts I only encountered the issue last month. Normally I search for identifiers of some sort, but this time I actually searched for a text I saw on my GUI, which is in German. So the priority to solve this need not be high. I would even say it's an IDE issue: pas files may be UTF-8, so why not dfm/fmx files? Instead, some poor chap was forced to write a conversion method for non-ANSI characters...

 

On fixing this in GExperts...I have no clue how. But it would complete the package, so to speak.


And once you are done patching these into your search routines, one day you will face a component with its own TWriter and TReader that breaks all these regular rules and won't work, or in the worst case, it will even screw up your concat logic and crash badly.


There are now two experimental options for searching forms that address these issues at least for standard components.

The output still needs some polishing.


That appears to be sufficient for English users or programmers. For other languages, you may encounter a peculiar circumstance sooner or later. I am the one who reported #112. After a decade of using GExperts, I only discovered the problem last month. Normally, I look for identifiers, but this time I looked for a sentence I saw on my GUI, which is in German. As a result, resolving this issue should not be a top priority. I'd even go so far as to argue it's an IDE issue: if pas files can be UTF8, why can't dfm/fmx files? Instead, some unfortunate soul was obliged to create a mechanism for converting non-ANSI characters...
