giomach

D5 data file misread in XE program

Posted (edited)

I took an old program (Delphi 5) and tried to recompile and run it under XE.  There was a data file, created under D5, which I tried to read in the recompiled XE program, but it is misread: even the number of records comes out wrong (the correct number is 951).  Here is a minimal example of the program source (the form is empty).  It works perfectly when compiled in D5 under WinXP and run on the same data file. I'm also attaching the data file below.

unit MorphAll;

interface

uses
  Forms, SysUtils, Dialogs;

type
  TForm1 = class(TForm)
    procedure FormActivate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

type
   valid_range = -1..5;
   string32 = string [32];
   supp_rec = record
                 flag: integer;
                 case nform: integer of
                   1: (single: array [1..1] of string32);
                   2: (double: array [1..2] of string32);
                   3: (treble: array [1..3] of string32)
                 end;
   normal = array [1..20] of integer;
   lexrec = record
                root_part: string32;
                case speech_part: valid_range of
                  -1: (suppletive_type: supp_rec);
                   0: (nsupp_type: integer);
           1,2,3,4,5: (normal_type: normal)
                  end;
   lexfile = file of lexrec;

var morphlex: lexfile;
    nooflexrecs: integer;

procedure TForm1.FormActivate(Sender: TObject);
begin
assignfile (morphlex, 'MorphIr.dat');
reset (morphlex);
nooflexrecs := System.FileSize (morphlex);
ShowMessage ('Data file contains '+inttostr (nooflexrecs)+ ' records');
end;

end.

I suspect the trouble is the type declaration string[32].  When the data file was created (D5), this would have been an old-style string, and moreover the encoding would have been Windows 1252.  How can I make XE interpret these strings correctly (if that is what is wrong)?

MorphIr.dat

Edited by giomach
Correct no of records is 951, sorry.

9 minutes ago, giomach said:

I suspect the trouble is the type declaration string[32].  When the data file was created (D5), this would have been an old-style string, and moreover the encoding would have been Windows 1252.  How can I make XE interpret these strings correctly (if that is what is wrong)?

 

That should not be the issue: a short string should be laid out the same way as in D5 (I have never tested this, but unlike string it did not become Unicode).


I was just testing this ChatGPT IDE implementation and thought I'd give it a shot.

 

//To ensure that XE interprets the strings correctly,
//you can try explicitly setting the encoding when reading
//the strings from the data file.
//Here's an example of how you could modify your code to
//read the strings as Windows 1252 encoded:
procedure TForm1.FormActivate(Sender: TObject);
var
  morphlex: file of string[32];
  str: AnsiString;
  nooflexrecs: Integer;
begin
  AssignFile(morphlex, 'MorphIr.dat');
  Reset(morphlex);
  nooflexrecs := FileSize(morphlex);

  ShowMessage('Data file contains ' + IntToStr(nooflexrecs) + ' records');

  // Read strings using Windows 1252 encoding
  while not Eof(morphlex) do
  begin
    BlockRead(morphlex, str, SizeOf(str));
    // Convert string to Unicode if needed
    ShowMessage(TEncoding.GetEncoding(1252).GetString(PAnsiChar(str)));
  end;

  CloseFile(morphlex);
end;

//In this updated code snippet, we use `AnsiString` to
//read the string data from the file as it is not Unicode.
//We then convert the `AnsiString` to Unicode using
//the `TEncoding.GetEncoding(1252).GetString` method to
//interpret the Windows 1252 encoded strings correctly.
//Please note that this code assumes that the string data
//in the file is in Windows 1252 encoding.
//If the actual encoding is different, you may need to
//adjust the encoding parameter accordingly

 

Posted (edited)

The data file has 140,748 bytes. That does not divide evenly into 971 records, but it does divide into 951 records of 148 bytes each. That record size also matches the actual file content.
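
The byte count can be checked independently of any record declaration by opening the file untyped with a block size of 1, so that FileSize reports bytes. This is only a minimal sketch: it assumes MorphIr.dat is in the current directory, and the constant 148 is the record size inferred from the file size, not something stored in the file.

```delphi
program CountBytes;

{$APPTYPE CONSOLE}

const
  RecSize = 148; // record size inferred from the file size (assumption)

var
  f: file;       // untyped file: no record layout involved
  bytes: Int64;

begin
  FileMode := 0; // open read-only
  AssignFile(f, 'MorphIr.dat');
  Reset(f, 1);   // block size 1, so FileSize counts bytes
  try
    bytes := FileSize(f);
  finally
    CloseFile(f);
  end;
  Writeln(bytes, ' bytes = ', bytes div RecSize, ' records of ', RecSize,
    ' bytes, remainder ', bytes mod RecSize);
end.
```

A remainder of 0 is what you would expect if 148 really is the record size.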

 

You may get better results with this declaration:

  lexrec = record
    root_part: string32;
    filler: string[2];
    case speech_part: valid_range of
      -1:
        (suppletive_type: supp_rec);
      0:
        (nsupp_type: integer);
      1, 2, 3, 4, 5:
        (normal_type: normal)
  end;

 

Edited by Uwe Raabe
add better type declaration


Thanks, Holländer, but the program doesn't get as far as that.  After it misreports the number of records, any further reading of data (by 'seek' and 'read') just produces rubbish (out of range, etc).  I think I need to make it get the number of records correct before it can even separate the records, and I think that means some change to my definition of lexrec. 

 

Thanks Uwe.  951 records is correct, and that is what the program compiled in D5 reports. Compiled in XE, it reports 998 records, before going nuts.  Some of the things I have tried in place of string32, like shortstring or array[1..32] of char, produce other wrong numbers of records.


Adding filler: string [2] reduces the reported no of records in XE from 998 to 977, but this does not help, I'm still unable to retrieve them.

 

I found that changing the XE compiler option "record field alignment" from byte to word also reduces the number to 977.  But also no help.

 

The option in D5 is a check-box named "aligned record fields", which is checked by default, and this was probably the setting when that file was produced.

25 minutes ago, giomach said:

I found that changing the XE compiler option "record field alignment" from byte to word also reduces the number to 977.

Did you try Double Word?


I've just tried all values from off to quad word several times and they are all giving 998.  An hour ago, they were all giving 977 ...  I can't make any sense of it.
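
One way to read these numbers: FileSize on a typed file returns the byte size divided by SizeOf of the record type, discarding any remainder, so each wrong count points to a particular record size. The sketch below searches for the record sizes that would produce the counts reported in this thread. The 140,748-byte file size and the three counts come from the posts above; the search range of 1 to 256 bytes is an arbitrary assumption.

```delphi
program ImpliedSizes;

{$APPTYPE CONSOLE}

const
  FileBytes = 140748; // size of MorphIr.dat as reported in this thread
  Counts: array [1..3] of Integer = (998, 977, 951); // counts reported in this thread

var
  i, size: Integer;

begin
  for i := Low(Counts) to High(Counts) do
    for size := 1 to 256 do
      // a typed file reports whole records only: bytes div record size
      if FileBytes div size = Counts[i] then
        Writeln(Counts[i], ' records <- record size ', size,
          ', leftover bytes ', FileBytes mod size);
end.
```

Only a size that leaves no leftover bytes can match the layout the file was written with.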

 

If we don't know what alignment D5 does, and can't find out how to make XE do the same, the only solution may be to extend the program with a procedure to export the datafile to a text file, compile and run on D5, then extend the program with a procedure to import from the textfile and run that on XE.
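
The export half of that fallback might look roughly like the sketch below, to be compiled in D5 so that the original record layout is used. Everything about the text format is an assumption made for illustration: the output name MorphIr.txt, one value per line, and the idea that nform holds the number of strings in use (1 to 3).

```delphi
program ExportLex;

{$APPTYPE CONSOLE}

type
  valid_range = -1..5;
  string32 = string[32];
  supp_rec = record
    flag: integer;
    case nform: integer of
      1: (single: array [1..1] of string32);
      2: (double: array [1..2] of string32);
      3: (treble: array [1..3] of string32)
  end;
  normal = array [1..20] of integer;
  lexrec = record
    root_part: string32;
    case speech_part: valid_range of
      -1: (suppletive_type: supp_rec);
       0: (nsupp_type: integer);
      1, 2, 3, 4, 5: (normal_type: normal)
  end;

var
  src: file of lexrec;
  dst: TextFile;
  rec: lexrec;
  i: Integer;

begin
  AssignFile(src, 'MorphIr.dat');
  Reset(src);
  AssignFile(dst, 'MorphIr.txt'); // hypothetical output file name
  Rewrite(dst);
  while not Eof(src) do
  begin
    Read(src, rec);
    Writeln(dst, rec.root_part);
    Writeln(dst, rec.speech_part);
    case rec.speech_part of
      -1:
        begin
          Writeln(dst, rec.suppletive_type.flag);
          Writeln(dst, rec.suppletive_type.nform);
          // assumes nform is the number of strings in use (1..3)
          for i := 1 to rec.suppletive_type.nform do
            Writeln(dst, rec.suppletive_type.treble[i]);
        end;
      0:
        Writeln(dst, rec.nsupp_type);
    else
      for i := 1 to 20 do
        Writeln(dst, rec.normal_type[i]);
    end;
  end;
  CloseFile(dst);
  CloseFile(src);
end.
```

The import side in XE would read the same lines back in the same order, which sidesteps the binary layout question entirely.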

1 minute ago, giomach said:

I've just tried all values from off to quad word several times and they are all giving 998.

The compiler setting can be overridden in code by the {$A} or {$ALIGN} compiler directive. To avoid confusion, it is probably better to place an {$A4} directly in front of the record declaration.

 

BTW, I was able to read the file correctly with the declaration shown above with Delphi 12.
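
As a rough sketch of that suggestion, the directive can sit inside the type section just before the record types, and SizeOf can then be compared with the 148 bytes per record that the file size implies. The program name and the warning text are made up for illustration. Treat the size check as a sanity test, not a guarantee that every field lines up.

```delphi
program AlignCheck;

{$APPTYPE CONSOLE}

type
  valid_range = -1..5;
  string32 = string[32];
  normal = array [1..20] of integer;

  {$A4} // 4-byte field alignment from here on, whatever the project option says
  supp_rec = record
    flag: integer;
    case nform: integer of
      1: (single: array [1..1] of string32);
      2: (double: array [1..2] of string32);
      3: (treble: array [1..3] of string32)
  end;

  lexrec = record
    root_part: string32;
    filler: string[2];
    case speech_part: valid_range of
      -1: (suppletive_type: supp_rec);
       0: (nsupp_type: integer);
      1, 2, 3, 4, 5: (normal_type: normal)
  end;

begin
  Writeln('SizeOf(lexrec) = ', SizeOf(lexrec));
  if SizeOf(lexrec) <> 148 then
    Writeln('Layout does not match the 148-byte records in the file');
end.
```

The directive stays in effect for the rest of the unit unless another {$A} follows it, so any later records that need a different alignment need their own directive.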


I hadn't used any $A directives in the project, but I put in {$A4} as you suggested, and magically the number of records is 951.

 

There are still other errors to fix, but I seem to be able to access the records now.

 

Thank you very much for your help.

2 hours ago, giomach said:

Some of the things I have tried in place of string32, like shortstring or array[1..32] of char, produce other wrong numbers of records.

Neither of those types has the same size as string32.

 

string32, aka string[32], is 33 bytes (1 byte length + 32 AnsiChars). This has not changed over the years.

 

ShortString, aka string[255], is 256 bytes (1 byte length + 255 AnsiChars). This has not changed over the years.

 

array[1..32] of Char is 32 bytes in D5, but 64 bytes in XE. There is no length byte in the array, and Char itself is different. Before Delphi 2009, it was AnsiChar. From Delphi 2009 onward, it is WideChar.
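
These sizes are easy to confirm with SizeOf in whichever compiler you are using. A minimal sketch follows; the type names TChars32 and TAnsiChars32 are made up for the example. Note that even an array of AnsiChar comes to 32 bytes, so it is still one byte short of string32.

```delphi
program SizeCheck;

{$APPTYPE CONSOLE}

type
  string32 = string[32];
  TChars32 = array [1..32] of Char;
  TAnsiChars32 = array [1..32] of AnsiChar;

begin
  Writeln('string32     : ', SizeOf(string32));     // 33
  Writeln('ShortString  : ', SizeOf(ShortString));  // 256
  Writeln('TChars32     : ', SizeOf(TChars32));     // 32 in D5, 64 in XE
  Writeln('TAnsiChars32 : ', SizeOf(TAnsiChars32)); // 32
end.
```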
