Amberel

UTF-8 with dynamically prepared web page

I'm using a web server based on the ICS 8 sample code. It appears to render UTF-8 correctly when the source is a fixed part of a web page, but not when I generate the content dynamically. Specifically, I am using a dynamically generated string as an Ajax response - this is the relevant part of the Delphi code:

 

Flags := hgWillSendMySelf;
s := 'Chaitén / αυτό είναι ένα τεστ';
TMyHttpConnection(Client).AnswerString(Flags, '', '', '', s);
Log('Unicode response: ' + s);
Result := TRUE;

 

The Unicode text is rendered correctly in the log file.

 

I have put a very short example file at http://www.amberel.com/test/unicodetest.htm

At the top of the page is a static paragraph that displays the Unicode text correctly. If you click the Ajax button, the response is shown as "Ajax response: Chait�n / a?t? e??a? ??a test".
 

The dynamically generated string can be accessed directly from the address bar of the browser using http://www.amberel.com/test/ajaxunicodetest.htm

In this case I see "Chaitén / a?t? e??a? ??a test". The first part appears correct but the second part does not. I think this is because, although the first part is UTF-8, the same character coincidentally exists in the single-byte ISO-8859-1 character set.

 

I am using ICS 8.41. It's not the latest version because, when I tried a later version, SSL stopped working. If the latest version is necessary for this to work, I will have another try at the update.

 

I've spent a lot of time making no headway on this - this stuff is far removed from my speciality! Any idea for fixing it, or even just ideas of how to narrow the problem down, gratefully received.

 

Rgds, Andy


Maybe s needs to be encoded to UTF-8?

There seem to be lots of routines for that.

 

[Attached image: UTF-8Conversions.jpg]


The ICS web server sends binary data from a stream. If you want to send UTF-8, you need to make sure that stream is loaded with 8-bit data encoded as UTF-8. I cannot be more specific, since there are many ways to build the response page, with several helpers.
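
To make "8-bit data as UTF-8" concrete, here is a minimal console sketch using only the Delphi RTL (no ICS calls). It prints the bytes a short Unicode string turns into under UTF-8 and under the system ANSI code page. The program name and the BytesToHex helper are made up for illustration, and the ANSI line depends on the machine's code page, so treat this as a diagnostic aid rather than a description of what the ICS server does internally.

```delphi
program Utf8BytesSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: render a byte array as space-separated hex.
function BytesToHex(const Bytes: TBytes): string;
var
  B: Byte;
begin
  Result := '';
  for B in Bytes do
    Result := Result + IntToHex(B, 2) + ' ';
end;

var
  S: string;
begin
  // 'Chaitén / αυ' written with character codes, so the result does not
  // depend on how this source file itself is encoded.
  S := 'Chait'#$00E9'n / '#$03B1#$03C5;

  // UTF-8: é becomes C3 A9, α becomes CE B1, υ becomes CF 85.
  Writeln('UTF-8: ', BytesToHex(TEncoding.UTF8.GetBytes(S)));

  // System ANSI code page: on a Western European setup (code page 1252,
  // an assumption) é is the single byte E9, and the Greek letters cannot
  // be represented, so they come out as substitute characters.
  Writeln('ANSI : ', BytesToHex(TEncoding.ANSI.GetBytes(S)));
end.
```

If the bytes reaching the browser look like the second line rather than the first, the conversion to UTF-8 did not happen before the data went into the response stream.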

 

ICS 8.41 is ancient; it only supported OpenSSL versions that are themselves no longer supported. You should be using ICS V8.68 and OpenSSL 3; there is a new version today.

 

Angus

 

 

 


I have ensured that the string s is UTF-8 encoded, but it has made no difference. I have declared

 

s: UTF8String; and tec: TEncodeType;

 

s := 'Chaitén / αυτό είναι ένα τεστ';
tec := System.WideStrUtils.DetectUTF8Encoding(s);
if tec = etUSASCII then begin
   Log('ASCII');
end
else if tec = etUTF8 then begin
   Log('UTF8');
end
else if tec = etANSI then begin
   Log('ANSI');
end
else begin
   Log('Encoding not defined');
end;

 

The log reports that s is UTF8 encoded.

 

I have also tried TMyHttpConnection(Client).AnswerString(Flags, '', '', '', System.UTF8Encode(s)); but that had no effect either.
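
One possible reason the UTF8Encode attempt changes nothing, assuming AnswerString declares its body parameter as a plain string (UnicodeString), which this thread does not confirm: Delphi converts a UTF8String back to UTF-16 when it is passed to such a parameter, so the UTF-8 encoding is undone before the method ever sees it. The sketch below uses a hypothetical stand-in procedure, TakesUnicodeString, not the ICS method.

```delphi
program ImplicitCastSketch;

{$APPTYPE CONSOLE}

// Hypothetical stand-in for any method whose body parameter is a UnicodeString.
procedure TakesUnicodeString(const Body: string);
begin
  Writeln('Received UTF-16 code units: ', Length(Body));
end;

var
  S: string;
  U: UTF8String;
begin
  S := 'Chait'#$00E9'n';   // 'Chaitén', 7 characters
  U := UTF8String(S);      // explicit conversion to UTF-8: é becomes two bytes
  Writeln('UTF-8 bytes: ', Length(U));   // 8

  // Passing U here makes the compiler insert a UTF-8 to UTF-16 conversion
  // (it warns with W1057, implicit string cast), so Body holds 7 code
  // units again and the UTF-8 encoding is gone.
  TakesUnicodeString(U);
end.
```

Under that assumption, the encoding has to be requested from the method that builds the response, rather than applied to the string beforehand.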

 

(These changes were tried on a test server on port 42080 and not on the live server, to avoid any disruption)

 

Angus - I will try the upgrade later, but I'd rather deal with one issue at a time unless the older version is directly responsible for this problem.

 

Rgds, Andy


OK, I think I found the solution: use AnswerStringEx and pass the code page explicitly.

 

TMyHttpConnection(Client).AnswerStringEx(Flags, '', '', '', s, CP_UTF8);

 

Very many thanks for the comments that got me going down the correct path 🙂

 

Rgds, Andy

