egnew 3 Posted September 15

I am converting all my internet support code to use the native TNetHttpClient and TNetHttpRequest components. For some web pages I download with the function shown below, I get the exception "No mapping for the Unicode character exists in the target multi-byte code page". Here is a call where the exception occurs:

    GetText('https://www.google.com/index.html');

I assume this is an encoding issue. What is the best way to handle this to get my string result?

    function TIndigoHttp.GetText(const p_Url: String): String;
    var
      v_Response: IHTTPResponse;
    begin
      f_Error := ''; // Used by OnRequestError
      try
        v_Response := f_NetHTTPRequest.Get(p_Url);
        f_StatusCode := v_Response.StatusCode;
        f_StatusText := v_Response.StatusText;
        Result := v_Response.ContentAsString;
      except
        on E: Exception do
        begin
          Result := '';
          // v_Response is still nil if Get itself raised, so guard before using it
          if Assigned(v_Response) then
          begin
            f_StatusCode := -1 * v_Response.StatusCode;
            f_StatusText := E.Message + ' [' + v_Response.StatusText + ']';
          end
          else
          begin
            f_StatusCode := -1;
            f_StatusText := E.Message;
          end;
        end;
      end;
    end;

ertank 27 Posted September 15

Hi,

I do not see anything in the shared code that would raise such an error. You might want to check the other events assigned to f_NetHTTPRequest; the exception may be raised in one of them.

If you are sure that TIndigoHttp.GetText() is where the error occurs, which line is it? What is the code page of the computer you are testing on?

BTW, your request might complete without an exception while the response you receive is still an error. I would check that "f_StatusCode" is in the successful response range. In my own code I check that it is ">= 200" and "<= 299".

egnew 3 Posted September 15

The status is 200 - OK, as there is no problem fetching the webpage. The exception occurs during the call to ContentAsString, when TEncoding.GetString is executed.

    function TEncoding.GetString(const Bytes: TBytes; ByteIndex, ByteCount: Integer): string;
    var
      Len: Integer;
    begin
      if (Length(Bytes) = 0) and (ByteCount <> 0) then
        raise EEncodingError.CreateRes(@SInvalidSourceArray);
      if ByteIndex < 0 then
        raise EEncodingError.CreateResFmt(@SByteIndexOutOfBounds, [ByteIndex]);
      if ByteCount < 0 then
        raise EEncodingError.CreateResFmt(@SInvalidCharCount, [ByteCount]);
      if (Length(Bytes) - ByteIndex) < ByteCount then
        raise EEncodingError.CreateResFmt(@SInvalidCharCount, [ByteCount]);

      Len := GetCharCount(Bytes, ByteIndex, ByteCount);
      if (ByteCount > 0) and (Len = 0) then
        raise EEncodingError.CreateRes(@SNoMappingForUnicodeCharacter);
      SetLength(Result, Len);
      GetChars(@Bytes[ByteIndex], ByteCount, PChar(Result), Len);
    end;

The value of Len is zero, which causes the EEncodingError exception.

As originally stated, I suspect the problem is related to encoding. The question is how to resolve the issue with native Http. I have no problem using Indy, as it seems to handle the necessary details on its own.

Thanks,
Sidney

ertank 27 Posted September 15

The code below works for me without any exception. I see the "All good" message, and debugging shows the data is actually in the LResult variable.

    uses
      System.Net.HttpClient,
      System.Net.HttpClientComponent;

    procedure TForm1.Button1Click(Sender: TObject);
    var
      LHttp: TNetHTTPClient;
      LResponse: IHTTPResponse;
      LResult: string;
    begin
      LHttp := TNetHTTPClient.Create(Self);
      try
        try
          LResponse := LHttp.Get('https://www.google.com/index.html');
        except
          on E: Exception do
          begin
            ShowMessage('Cannot communicate' + sLineBreak + E.Message);
            Exit();
          end;
        end;

        if (LResponse.StatusCode < 200) or (LResponse.StatusCode > 299) then
        begin
          ShowMessage('Error status received');
          Exit();
        end;

        LResult := LResponse.ContentAsString();
        ShowMessage('All good');
      finally
        LHttp.Free();
      end;
    end;

You may want to test this code in a new project.

If you do not get the exception for Google but do get it for some other URL, make sure that you are not downloading something binary. Some content retrieved with GET is binary and cannot simply be read as a string. For example, I download my application update setup executables using GET into a TStream.

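As a rough illustration of the "GET into a TStream" approach mentioned above, here is a minimal console sketch. It assumes the Get overload of TNetHTTPClient that accepts a response stream; the DownloadFile helper, the URL, and the file name are placeholders made up for the example, not anything from the posts.

```delphi
program DownloadToStreamSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  System.Net.URLClient, System.Net.HttpClient, System.Net.HttpClientComponent;

// Hypothetical helper: writes the raw response body to a file,
// without ever converting the bytes to a string.
procedure DownloadFile(const AUrl, AFileName: string);
var
  LHttp: TNetHTTPClient;
  LStream: TFileStream;
  LResponse: IHTTPResponse;
begin
  LHttp := TNetHTTPClient.Create(nil);
  try
    LStream := TFileStream.Create(AFileName, fmCreate);
    try
      // The body is written into LStream as it is received.
      LResponse := LHttp.Get(AUrl, LStream);
      if (LResponse.StatusCode < 200) or (LResponse.StatusCode > 299) then
        raise Exception.CreateFmt('HTTP %d %s',
          [LResponse.StatusCode, LResponse.StatusText]);
    finally
      LStream.Free;
    end;
  finally
    LHttp.Free;
  end;
end;

begin
  try
    // Placeholder URL and file name, for illustration only.
    DownloadFile('https://example.com/setup.exe', 'setup.exe');
    Writeln('Download finished');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

Because the bytes never pass through TEncoding, no "No mapping for the Unicode character" error can arise on this path, whatever the content is.
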
egnew 3 Posted September 15

Thanks, I copied your code into my program and it worked.

I traced the problem to the constructor of my TIndigoHttp type. I had recently added custom headers to match those Chrome was sending when I was having an issue logging into a website. After resolving the logon issue, I forgot to remove the custom headers.

The "No mapping for the Unicode character exists in the target multi-byte code page" error occurs when the two custom headers shown below are added to the request. The error does not occur when I comment out either Add. I am not sure why there is a conflict. Google's encoding is "br".

If you want to observe the issue, copy the custom header code below to immediately after you create LHttp. Comment out either Add and the mapping error will not occur.

Do you have an idea why the headers cause the error?

Thanks for your help,
Sidney

    LHttp.CustHeaders.Clear;
    LHttp.CustHeaders.Add('Accept-Encoding', 'gzip, deflate, br, zstd');
    LHttp.CustHeaders.Add('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36');

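One way to see what the server actually returns with those two headers is to inspect the response before calling ContentAsString. The following is only a diagnostic sketch: it reuses the headers from the post, reads the Content-Encoding header through the HeaderValue property, and prints the body size. What it prints depends on the server and is not guaranteed.

```delphi
program InspectContentEncodingSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  System.Net.URLClient, System.Net.HttpClient, System.Net.HttpClientComponent;

procedure Inspect(const AUrl: string);
var
  LHttp: TNetHTTPClient;
  LResponse: IHTTPResponse;
begin
  LHttp := TNetHTTPClient.Create(nil);
  try
    // Same two custom headers as in the post above.
    LHttp.CustHeaders.Clear;
    LHttp.CustHeaders.Add('Accept-Encoding', 'gzip, deflate, br, zstd');
    LHttp.CustHeaders.Add('User-Agent',
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36');

    LResponse := LHttp.Get(AUrl);

    // Look at the headers and raw size only; do not decode the body as text.
    Writeln('Status:           ', LResponse.StatusCode);
    Writeln('Content-Encoding: ', LResponse.HeaderValue['Content-Encoding']);
    Writeln('Body size:        ', LResponse.ContentStream.Size, ' bytes');
  finally
    LHttp.Free;
  end;
end;

begin
  try
    Inspect('https://www.google.com/index.html');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

A non-empty Content-Encoding value here would indicate that the body is compressed bytes rather than text in the page's character set.
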
Remy Lebeau 1392 Posted September 16

5 hours ago, egnew said: "Do you have an idea why the headers cause the error?"

Because you are explicitly giving the server permission to send a compressed response, even though by default IHTTPResponse DOES NOT support compressed responses. So, you are likely getting a compressed response in binary format, but IHTTPResponse does not decompress it, and then you try to convert the compressed data into a String, which fails.

You need to use the TNetHTTPClient.AutomaticDecompression property to enable handling of "gzip" and "deflate" compressions.

In general, DO NOT manipulate the "Accept-Encoding" header manually, unless you are prepared to decode the response manually (i.e., by receiving it as a TStream and decompressing it yourself). Just because a BROWSER sends that header (and browsers do support compression) does not mean YOU should send it.

TNetHTTPClient will manage the "Accept-Encoding" header for you. It will allow "gzip" and "deflate" compression if the AutomaticDecompression property enables them.

Indy's TIdHTTP does the same thing. It supports "gzip" and "deflate" compressions, and will set the "Accept-Encoding" header accordingly, if you have a Compressor assigned to it.
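
A minimal sketch of that advice, assuming a Delphi version whose TNetHTTPClient exposes AutomaticDecompression as a set of THTTPCompressionMethod values (the enum member names used here are taken from System.Net.HttpClient as I understand it; check them against your RTL version). The FetchText procedure is a made-up name for the example. Note that only the User-Agent header is customized; Accept-Encoding is left to the client.

```delphi
program AutoDecompressSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  System.Net.URLClient, System.Net.HttpClient, System.Net.HttpClientComponent;

// Hypothetical helper for illustration.
procedure FetchText(const AUrl: string);
var
  LHttp: TNetHTTPClient;
  LResponse: IHTTPResponse;
begin
  LHttp := TNetHTTPClient.Create(nil);
  try
    // Let the client advertise and decode gzip/deflate on its own.
    // Assumption: these enum members exist in your Delphi version.
    LHttp.AutomaticDecompression :=
      [THTTPCompressionMethod.Deflate, THTTPCompressionMethod.GZip];

    // A browser-like User-Agent is kept; no manual Accept-Encoding is added.
    LHttp.CustHeaders.Add('User-Agent',
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36');

    LResponse := LHttp.Get(AUrl);
    if (LResponse.StatusCode >= 200) and (LResponse.StatusCode <= 299) then
      Writeln(Length(LResponse.ContentAsString), ' characters received')
    else
      Writeln('HTTP ', LResponse.StatusCode, ' ', LResponse.StatusText);
  finally
    LHttp.Free;
  end;
end;

begin
  try
    FetchText('https://www.google.com/index.html');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

With this setup the server is only offered encodings the client can decode, so ContentAsString should receive decompressed bytes. "br" and "zstd" are deliberately not advertised.
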