Turan Can

WinInet API 4095-byte limit: How can I remove it?


Hi All,

 

When I want to download a page from a website, it downloads a maximum of 4095 bytes. How can I remove this limit?

 

function TDownloadFile.WebGetData(const UserAgent: string; const Url: string): string;
var
  hInet: HINTERNET;
  hURL: HINTERNET;
  Buffer: array [0 .. 10] of AnsiChar;
  BufferLen: Cardinal;
  dwTimeOut: DWORD;
begin
  Result := '';

  hInet := InternetOpen(PChar(UserAgent), INTERNET_OPEN_TYPE_PRECONFIG, nil, nil, 0);
  if hInet = nil then
    RaiseLastOSError;
  try
//    dwTimeOut := 2000; // Timeout in milliseconds
//    InternetSetOption(hInet, INTERNET_OPTION_CONNECT_TIMEOUT, @dwTimeOut, SizeOf(dwTimeOut));

    hURL := InternetOpenUrl(hInet, PChar(Url), nil, 0, INTERNET_FLAG_PRAGMA_NOCACHE or INTERNET_FLAG_NO_CACHE_WRITE or INTERNET_FLAG_RELOAD, 0);
    if hURL = nil then
      RaiseLastOSError;
    try
      repeat
        if not InternetReadFile(hURL, @Buffer, SizeOf(Buffer), BufferLen) then
          RaiseLastOSError;
        Result := Result + UTF8Decode(Copy(Buffer, 1, BufferLen))
      until BufferLen = 0;
    finally
      InternetCloseHandle(hURL);
    end;
  finally
    InternetCloseHandle(hInet);
  end;
end;

7 hours ago, Turan Can said:

When I want to download a page from a website, it downloads a maximum of 4095 bytes. How can I remove this limit?

I can't answer that.  I see nothing in your code that would limit the size of the download.

 

But, I can address other problems I see with your code.

 

For one thing, you are only downloading 11 bytes at a time. Why so small a buffer? That is pretty inefficient for a network transfer.

 

Also, when you are trying to decode the buffer, you are converting from an AnsiChar[] to a String and then copying BufferLen characters. That conversion to String is wrong, as it does not take BufferLen into account, and there is no guarantee that the number of bytes in the buffer matches the number of characters in the resulting String. You shouldn't be using Copy() in this manner at all - use TEncoding.UTF8.GetString() instead, which lets you pass in a byte array, a starting index, and a byte count.
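
For example, something like this (an untested sketch; ReadChunkAsUtf8() is just an illustrative helper, and it assumes Winapi.WinInet and System.SysUtils are in the uses clause):

// untested sketch: read one chunk and decode only the bytes actually received
function ReadChunkAsUtf8(hURL: HINTERNET): string;
var
  Buffer: TBytes;
  BufferLen: DWORD;
begin
  SetLength(Buffer, 8192); // a more reasonable read size than 11 bytes
  if not InternetReadFile(hURL, @Buffer[0], Length(Buffer), BufferLen) then
    RaiseLastOSError;
  // byte array + starting index + byte count, so only the bytes read are decoded
  Result := TEncoding.UTF8.GetString(Buffer, 0, BufferLen);
  // caveat: see the next point - decoding chunk-by-chunk like this can still
  // split a UTF-8 sequence across reads
end;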

 

But most importantly, you simply can't UTF8-decode arbitrary byte buffers while you are downloading them. You should download all of the bytes first and then decode them as a whole. If you really want to decode on the fly, you must take codepoint boundaries into account, decoding only complete byte sequences and saving incomplete sequences for later iterations to finish and decode. Otherwise you risk corrupting the decoded data.
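
Putting those points together, a corrected version could look something like this (an untested sketch of the approach, shown as a standalone function rather than your TDownloadFile method; it reads into an 8 KB buffer, collects the raw bytes in a TBytesStream, and decodes only once the whole body has been received):

uses
  Winapi.Windows, Winapi.WinInet, System.SysUtils, System.Classes;

function WebGetData(const UserAgent, Url: string): string;
var
  hInet, hURL: HINTERNET;
  Buffer: array[0..8191] of Byte; // 8 KB per read instead of 11 bytes
  BufferLen: DWORD;
  Data: TBytesStream;
begin
  Result := '';
  hInet := InternetOpen(PChar(UserAgent), INTERNET_OPEN_TYPE_PRECONFIG, nil, nil, 0);
  if hInet = nil then
    RaiseLastOSError;
  try
    hURL := InternetOpenUrl(hInet, PChar(Url), nil, 0,
      INTERNET_FLAG_PRAGMA_NOCACHE or INTERNET_FLAG_NO_CACHE_WRITE or INTERNET_FLAG_RELOAD, 0);
    if hURL = nil then
      RaiseLastOSError;
    try
      Data := TBytesStream.Create;
      try
        repeat
          if not InternetReadFile(hURL, @Buffer, SizeOf(Buffer), BufferLen) then
            RaiseLastOSError;
          if BufferLen > 0 then
            Data.WriteBuffer(Buffer, BufferLen); // collect raw bytes, no decoding yet
        until BufferLen = 0;
        // decode the complete response body in one go
        // (Data.Size, not Length(Data.Bytes), since the stream may over-allocate)
        // assumes UTF-8 - see the next point about checking the charset
        Result := TEncoding.UTF8.GetString(Data.Bytes, 0, Data.Size);
      finally
        Data.Free;
      end;
    finally
      InternetCloseHandle(hURL);
    end;
  finally
    InternetCloseHandle(hInet);
  end;
end;

Collecting into a stream also avoids the repeated string reallocation that Result := Result + ... causes on every iteration.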

 

And all of this assumes the data is even encoded in UTF-8 to begin with, which your code never verifies, either by looking at the HTTP Content-Type response header or by inspecting the body content (e.g. a <meta> charset tag in HTML).
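
For example, you could query that header with HttpQueryInfo() and only pick TEncoding.UTF8 when the charset actually says so (an untested sketch; GetContentType() is just an illustrative helper name, not part of WinInet):

// illustrative helper, not a WinInet function
function GetContentType(hRequest: HINTERNET): string;
var
  Buf: array[0..255] of Char;
  Len, Index: DWORD;
begin
  Result := '';
  Len := SizeOf(Buf);
  Index := 0;
  // Len comes back as the byte length of the header value (without the null terminator)
  if HttpQueryInfo(hRequest, HTTP_QUERY_CONTENT_TYPE, @Buf, Len, Index) then
    SetString(Result, Buf, Len div SizeOf(Char));
end;

// possible usage after InternetOpenUrl(), before decoding:
//   if Pos('utf-8', LowerCase(GetContentType(hURL))) > 0 then
//     Result := TEncoding.UTF8.GetString(Data.Bytes, 0, Data.Size)
//   else
//     ... pick another TEncoding, or fall back to a sensible default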
