ewong

TMemo and unicode


Hi,

I'm using Delphi 10.3.3 and I have two Memo fields and a button on a form. I copy and paste an RFC 2047-encoded UTF-8 string into the left memo. When I press the button, the decoded Unicode text should appear in the right memo, but I get gibberish instead.

 

procedure TForm1.convertbuttonclick(Sender: TObject);
var
  use_st,
  s, s2, s3 : string;
  b, q : boolean;

begin
  b := False;
  q := False;
  s := Memo1.lines[0];
  use_st := '?=';
  if pos('=?utf-8?', lowercase(s)) > 0 then
  begin
    s2 := stringreplace(s, '=?utf-8?', '', [rfReplaceAll, rfIgnoreCase]);
    if pos('b?', lowercase(s2)) > 0 then
    begin
      s2 := stringreplace(s2, 'b?', '', [rfReplaceAll, rfIgnoreCase]);
      b := True;
      use_st := '=?=';
    end
    else if pos('q?', lowercase(s2)) > 0 then
    begin
      s2 := stringreplace(s2, 'q?', '', [rfReplaceAll, rfIgnoreCase]);
      q := True;
    end;
  end
  else
    s2 := s;

  if pos(use_st, s2) > 0 then
    s2 := stringreplace(s2, use_st, '', [rfReplaceAll, rfIgnoreCase]);

  if q then
    s3 := idDecoderQuotedPrintable1.decodestring(s2)
  else if b then
    s3 := idDecoderMime1.decodeString(s2)
  else
    s3 := s2;

  memo2.lines.clear;
  memo2.lines.add(s3);
end;

 

Say I copy and paste "=?UTF-8?B?5aaC5L2V6K6TIGFydC1tYXRlIOaIkOeCug==?=" into the first memo. I press the button and get a string in which only "art-mate" is readable.

 

Can someone point out what I'm missing?

 

Thanks

 

Ed

 

 

 

 


This question has nothing to do with VCL and Memo. You should ask it in Delphi Third-Party -> Indy as you're using Indy to decode the data.

 

Most probably your data is malformed (I tried decoding it with https://www.base64encode.org/ and the result there is: 如何讓 art-mate 成為).

 

To decode UTF-8 characters, add IdGlobal to the uses clause and pass IndyTextEncoding_UTF8 to DecodeString:

s3 := idDecoderMime1.decodeString(s2, IndyTextEncoding_UTF8);
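
To illustrate why the encoding argument matters, here is a minimal console sketch. It deliberately sidesteps Indy and uses the RTL's System.NetEncoding and TEncoding instead, so it is a swapped-in technique rather than what the Indy decoder does internally; the program and variable names are hypothetical. It decodes the Base64 payload to raw bytes, then compares widening each byte to one character (presumably close to what happens when no UTF-8 encoding is specified) with interpreting the same bytes as UTF-8.

```pascal
program Utf8DecodeSketch; // hypothetical name, illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.NetEncoding;

const
  // Base64 payload from the question, without the '=?UTF-8?B?' and '?=' wrapper
  Payload = '5aaC5L2V6K6TIGFydC1tYXRlIOaIkOeCug==';

var
  Bytes: TBytes;
  ByteWise, Utf8Text: string;
  B: Byte;
begin
  // Step 1: Base64 decoding yields bytes, not characters.
  Bytes := TNetEncoding.Base64.DecodeStringToBytes(Payload);

  // Step 2a: treating every byte as one character keeps the ASCII part
  // ('art-mate') readable, but each multi-byte UTF-8 sequence becomes
  // several unrelated characters.
  ByteWise := '';
  for B in Bytes do
    ByteWise := ByteWise + Char(B);

  // Step 2b: interpreting the same bytes as UTF-8 restores the intended text.
  Utf8Text := TEncoding.UTF8.GetString(Bytes);

  Writeln('bytes:             ', Length(Bytes));
  Writeln('chars (byte-wise): ', Length(ByteWise));
  Writeln('chars (UTF-8):     ', Length(Utf8Text));
end.
```

For the sample payload this reports 25 bytes but only 15 characters once the bytes are read as UTF-8, which matches the symptom in the question: the single-byte ASCII characters survive either way, while the Chinese characters only come out right when the UTF-8 encoding is applied.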

 


The code shown is not the correct logic to parse an RFC 2047-encoded string. Try something more like this instead (though this could be simplified using IdGlobal.Fetch(), for instance):

uses
  IdGlobal, IdGlobalProtocols, IdCoderMIME, IdCoderQuotedPrintable;

procedure TForm1.convertbuttonclick(Sender: TObject);
var
  s, s2, charset, encoding, data : string;
  i, j: Integer;
begin
  s := Memo1.Lines[0];
  s2 := s; // fall back to the input if it is not a complete encoded-word

  // Expected layout: =?charset?encoding?data?=
  i := Pos('=?', s);
  if i > 0 then
  begin
    Inc(i, 2);
    j := Pos('?', s, i);
    if j > 0 then
    begin
      charset := Copy(s, i, j-i);
      i := j+1;
      j := Pos('?', s, i);
      if j > 0 then
      begin
        encoding := Copy(s, i, j-i);
        i := j + 1;
        j := Pos('?=', s, i);
        if j > 0 then
        begin
          data := Copy(s, i, j-i);
          // 'B' is Base64, 'Q' is quoted-printable; the charset selects the byte encoding
          if TextIsSame(encoding, 'B') then
            s2 := idDecoderMime1.DecodeString(data, CharsetToEncoding(charset))
          else if TextIsSame(encoding, 'Q') then
            s2 := idDecoderQuotedPrintable1.DecodeString(data, CharsetToEncoding(charset));
        end;
      end;
    end;
  end;

  Memo2.Lines.Clear;
  Memo2.Lines.Add(s2);
end;
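
As a rough idea of the Fetch()-based simplification mentioned above, the same parse could be sketched as a console program like the one below. The helper name DecodeFirstEncodedWord is hypothetical, and the sketch assumes Indy 10, where DecodeString can be called as a class function on the decoder classes so no components are needed. It also replaces '_' with a space before Q-decoding, since RFC 2047 uses '_' for a space in Q-encoded words.

```pascal
program FetchParseSketch; // hypothetical name, illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, IdGlobal, IdGlobalProtocols, IdCoderMIME,
  IdCoderQuotedPrintable;

// Hypothetical helper: decodes the first encoded-word found in S,
// or returns S unchanged when no complete encoded-word is present.
function DecodeFirstEncodedWord(const S: string): string;
var
  Rest, Charset, Encoding, Data: string;
begin
  Result := S;
  Rest := S;

  // Fetch() returns the text before the delimiter and removes both the text
  // and the delimiter from Rest. If the delimiter is absent, it consumes
  // everything and leaves Rest empty.
  Fetch(Rest, '=?');            // skip up to and including the opening '=?'
  Charset := Fetch(Rest, '?');
  Encoding := Fetch(Rest, '?');

  // Without a closing '?=' this is not a complete encoded-word.
  if Pos('?=', Rest) = 0 then
    Exit;
  Data := Fetch(Rest, '?=');

  if TextIsSame(Encoding, 'B') then
    Result := TIdDecoderMIME.DecodeString(Data, CharsetToEncoding(Charset))
  else if TextIsSame(Encoding, 'Q') then
  begin
    // RFC 2047 Q-encoding represents a space as '_'.
    Data := StringReplace(Data, '_', ' ', [rfReplaceAll]);
    Result := TIdDecoderQuotedPrintable.DecodeString(Data,
      CharsetToEncoding(Charset));
  end;
end;

begin
  // Print the length rather than the text, since the console code page
  // may not be able to display the decoded characters.
  Writeln(Length(DecodeFirstEncodedWord(
    '=?UTF-8?B?5aaC5L2V6K6TIGFydC1tYXRlIOaIkOeCug==?=')));
end.
```

Like the hand-written version, this only handles a single encoded-word and discards any surrounding text, so it is a sketch of the parsing idea and not a full RFC 2047 decoder.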

That being said, Indy already implements a decoder for RFC2047-encoded strings, in the DecodeHeader() function of the IdCoderHeader unit, eg:

uses
  IdCoderHeader;

procedure TForm1.convertbuttonclick(Sender: TObject);
var
  s, s2 : string;
begin
  s := Memo1.Lines[0]; // '=?UTF-8?B?5aaC5L2V6K6TIGFydC1tYXRlIOaIkOeCug==?='
  s2 := DecodeHeader(s);
  Memo2.Text := s2;
end;

 

Edited by Remy Lebeau

3 minutes ago, Remy Lebeau said:

TIdDecoderMIME is implemented in the IdCoderMIME unit, not in the IdGlobal unit.

IdGlobal is required for the function IndyTextEncoding_UTF8. 

9 hours ago, Remy Lebeau said:

That being said, Indy already implements a decoder for RFC2047-encoded strings, in the DecodeHeader() function of the IdCoderHeader unit, eg:
 


uses
  IdCoderHeader;

procedure TForm1.convertbuttonclick(Sender: TObject);
var
  s, s2 : string;
begin
  s := Memo1.Lines[0]; // '=?UTF-8?B?5aaC5L2V6K6TIGFydC1tYXRlIOaIkOeCug==?='
  s2 := DecodeHeader(s);
  Memo2.Text := s2;
end;

 

Hi Remy!

 

That's both short and sweet!  This is much better than the code I had. 

 

Edmund
