Jump to content
Antero

Bug in THttpCli (and the SSL/TLS -version also)

Recommended Posts

I made my own Windows app using the TSslHttpCli component.

My app can run weeks without any problem, but then the next time when I click a button that should do a HttpCli.Get -operation (despite the name HttpCli, the component is really of type TSslHttpCli),
instead of the wanted operation, I get an Exception.

First time that happened, I was too tired and made a mistake:

I just shut down and restarted my App - that cleared the problem, but also left me without the knowledge - what exactly went wrong ?

 

The Next time the same happened, I now have more information:

"Request aborted on timeout" - this is the Message associated with the exception.

In unit: OverbyteIcsHttpProt


I found:

procedure THttpCli.DoRequestSync(Rq : THttpRequest);

And there (Line 3708), I found:

FReasonPhrase     := 'Request aborted on timeout';
FRequestDoneError := httperrCustomTimeOut;

I do not yet have 100% accurate information, how to avoid the problem, but the one thing I will do is: in my own copy of the ICS suite, I will do the following change to the unit OverbyteIcsUtils:


The utility function IcsCalcTickDiff:
I will do TWO things:

1. Move the function to the very end of the unit - immediately initialization section. If you don't want to do that for whatever reason, you can avoid moving the function to the very end of the unit

by doing something like {$UNDef RangeCheckWasOn} {$IFOPT R+} {$Define RangeCheckWasOn} {$ENDIF} {$R-}

and then after the function do: {$IFDEF RangeCheckWasOn} {$R+} {$ENDIF}

 

Update: The most important here is the Overflow checking - it must  be OFF here: {$Q-}


2. Change the content of that function as follows (My edited version here):

{$R-,Q-,O+}
function IcsCalcTickDiff(const StartTick, EndTick : LongWord): LongWord;
begin
  Result := EndTick - StartTick;
end;


// If you want to test my version of that function, do something like:


procedure TestIt(SL:TStrings);
const
  OldTime = $FFFFfffe; // = -2 as LongWord
  NewTime = 3;

var
  dT : LongWord;

begin
  //
  dT := IcsCalcTickDiff(OldTime, NewTime);

  asm nop end; // a convenient location for a breakpoint (in Delphi IDE debugger).

  // dT is expected to be = 5 ; since 3 - (-2) = 5.
  SL.Add( 'The value of dT is: ' + IntToStr(dT) );
end;


// Call the TestIt (from a TButton handler):

begin

  TestIt(Memo1.Lines);

end;

It MAY also be a useful idea to do this:

Instead of simply doing this:


    CursorSav := Screen.Cursor;
    Screen.Cursor := crHourGlass;
    try
      HttpCli.Get;

    finally
      Screen.Cursor := CursorSav;
    end;


I could change that into:


    CursorSav := Screen.Cursor;
    Screen.Cursor := crHourGlass;
    try

      try
      HttpCli.Get;

      except

          On E:Exception do LogException(E);

      end;

    finally
      Screen.Cursor := CursorSav;
    end;


where LogException(E) will write to a log file all information available about the Exception, such as:

E.ClassName - What Exception happened

E.Message     - The user-visible Message, in this case it would be: 'Request aborted on timeout'
ExceptAddr    - the location (Executable address) where the Exception happened.

.

One possible problem:  the EXE file (compiled using Delphi by me) is NOT marked with the "Allow ASLR - Address Space Layout Randomization" attribute  - but STILL it seems that for some crazy reason the exact RAM address where the Exception happens, still is not the same every time the Exception happens.

This makes it more difficult (impossible ?) to use Delphi IDE's "Search | Find Error" -menu item.

Any suggestions, What exactly should I do to make Delphi IDE's "Search | Find Error" -menu item usable despite Windows doing something crazy with the RAM address the Exception happens ?

IF this fix alone does still not correct the problem - what next ?

I think - maybe I should write my own version of the HttpCli.Get, where immediately BEFORE calling HttpCli.Get, I should reset some time -related values (inside the TSslHttpCli class) to a proper value to avoid the problem from occurring again.

At least I can say this:

When you want to use plain old http - ICS, Indy and Ararat Synapse all work.

But when it comes to https - instead of plain old http - ICS is the one that work most consistently.

 With Indy / Synapse, when you use https, with some servers it works without any problem, but with some other servers it always fails.

 

ICS is the one component suite, where even https works (so far with ALL servers I ever have tried).

 

What about Synopse (By Arnaud Bouchez) ?

I would not use it ! 

Why?

Because of this anti-pattern (that makes debugging a nightmare):

with TSomeClass.Create do 

try

  // Use TSomeClass (anonymously) - No chance to refer to the instance of TSomeClass just created in Delphi IDE's Run|Evaluate/Modify -window (Ctrl-F4 for those who use the IDE Classic key binding).

finally

  Free;

end;

This kind of coding means that when you (try to) use the Delphi IDE's Run|Evaluate/Modify -window, you do NOT have a variable name to reference the (newly created) instance of the TSomeClass class !

The Synopse source code unfortunately is full of code like that.

 

However - I am afraid that I just scratched just the surface of the problem - there may be more behind this !

My guess: The following function is of flawed design:

function TWSocketCounter.GetLastAliveTick   (unit OverbyteIcsWSocket)

It ALL boils down to the following fact:

The BEST way to handle timeouts (using GetTickCount as the base of timing) is this:

var

  T1, T2, dT : LongWord;


begin

  // Start operation:

  T1 := GetTickCount;

  // Do something potentially slow here.

  T2 := GetTickCount;

  dT := T2 - T1; // MUST have the Overflow checking OFF here. - option $Q- in Delphi.

  // Check dT (aka "Delta T") here - if too much time has elapsed, then react to that.

end;

IF you want to check the current timestamp (obtained by calling GetTickCount) and compare it to MORE than ONE past timestamp, then the correct way would be this:

dT1 := T2 - T1_opt1;

dT2 := T2 - T1_opt2;

dT3 := T2 - T1_opt3;

// Select either the smallest or biggest of (dT1, dT2, dT3) -> dT.

// Then use dT to determine if TimeOut or if normal processing.

There MAY be another way:

Instead of using the GetTickCount (an API function), use GetRelativeTickCount() - defined like this:

 

var
  GetTickCount_At_Start_Of_Operation : LongWord;

 

      // code to put before call to "HttpCli.Get":
      GetTickCount_At_Start_Of_Operation := GetTickCount;
      HttpCli.Get;

      // Other code ... same as it was before this change ...

 

function GetRelativeTickCount:LongWord;
begin
  Result := GetTickCount - GetTickCount_At_Start_Of_Operation;
end;


NOW, it becomes possible to use code similar to what is written in function TWSocketCounter.GetLastAliveTick (unit OverbyteIcsWSocket), BUT:

ALL of the FLastRecvTick, FLastSendTick, FConnectTick (class TWSocketCounter) should have been be initialized using GetRelativeTickCount, NOT GetTickCount !

Reason ?

The Windows API function  GetTickCount returns a 32-bit unsigned number that tells: How many milliseconds since start of this Windows session.'

Because of this, it rolls over (from $FFFFffff to zero) every 4294967296 milliseconds = about 49.71 days = 49 days and 17 hours and 2 minutes and 47.3 seconds.

 

I would NOT recommend of using GetTickCount64, for two reasons:

1) It does not totally erase the problem, it only pushes it to a much later date (about 584 million years).

2) It does not exist on older versions of Windows. So demanding it's existence would make ICS incompatible with those older versions of Windows ; requires Windows Vista or newer.

 

I think nobody wants to wait for an http/https operation to last over 49 days. (Unless you use a slow internet connection to download a DVD ISO image representing for example Knoppix Linux).

And in that case, the TOTAL waiting time (from the call of the Http Get until all of that 4.7 gigabytes have been downloaded) may exceed 49 days.

 

So in all other cases the use of my GetRelativeTickCount instead of GetTickCount would solve the problem, but if someone really uses ICS to download a DVD ISO image (about 4.7 Gigabytes) using a slow internet connection, then using GetRelativeTickCount will NOT solve the problem - but using my other idea by calculating the difference of TimeStamp_Now - TimeStamp_Last_Recv and calculating separately other differences like TimeStamp_Now - TimeStamp_LastSend, and then comparing those differences and selecting either smallest/biggest of those - that WILL solve the problem (although done that way, there is a little bit more to type).

 

So, it should be:

    FConnectDT    : TDateTime;
    FConnectTick  : Cardinal;
    FLastRecvTick : Cardinal;
    FLastSendTick : Cardinal;

 

function TWSocketCounter.GetLastAliveTick : Cardinal;
{$IFDEF PUREPASCAL}
var
  T2, SinceLastRecv_ms, SinceLastSend_ms, SinceLastConnect_ms : Longword;

begin
  T2 := GetTickCount;

  SinceLastRecv_ms    := T2 - FLastRecvTick;
  SinceLastSend_ms    := T2 - FLastSendTick;
  SinceLastConnect_ms := T2 - FConnectTick;


    if SinceLastRecv_ms > SinceLastSend_ms then
        if SinceLastRecv_ms > SinceLastConnect_ms then
            Result := SinceLastRecv_ms
        else
            Result := SinceLastConnect_ms
    else
        if SinceLastSend_ms > SinceLastConnect_ms then
            Result := SinceLastSend_ms
        else
            Result := SinceLastConnect_ms;
{$ELSE}

...

{$ENDIF}

Please NOTE:

After this change, GetLastAliveTick will return the number of ms passed since last Receive/Send/Connect event !

So, instead of returning a past GetTickCount value directly, after this change it instead returns number of milliseconds passed since such event - this means the need to revise all code USING this function !

 

To avoid missing something, my advice:

Please remove the TWSocketCounter.GetLastAliveTick -function, and add a new function instead, with a different name.

This way, code designed to call the original version of GetLastAliveTick will not accidentally end up calling the revised (differently named) function.

Suggested new name: GetTimeElapsed_Since_LastAliveTick_ms

Suggested alternate name: GetMillisecondsElapsed_Since_LastAliveTick

 

Share this post


Link to post

You are using an old version of ICS, all gettickcount functions now use Int64. 

 

    FConnectTick  : Int64;       { V8.71 changed from cardinal to Int64 }
    FLastRecvTick : Int64;       { V8.71 }
    FLastSendTick : Int64;       { V8.71 }

 

ICS has supported In64 ticks on all Windows platforms since 2009 using the OverbyteIcsTicks64 unit which uses QueryPerformanceCounter for ancient operating systems, although Cardinal ticks were only fully removed for the latest release. 

 

Angus

 

Share this post


Link to post

IIRC there was check which of the ticks is greater exactly to avoid overflow exception so you likely use VERY old version. Anyway now x64 ticks provide 99.(9)% insurance against rollover

Edited by Fr0sT.Brutal

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×