That's one of the great things they've been doing in the JIT/compiler, if you read through the blog article: it can detect when a list access cannot go out of range, so it can omit the superfluous range checks.
A few additional improvements to your source code:
function IntersectListSort(node1, node2: Pointer): Integer;
var
  pt1, pt2: ^TPoint64;
  i: Int64;
begin
  // note to self - can't return int64 values :)
  pt1 := @PIntersectNode(node1).pt;
  pt2 := @PIntersectNode(node2).pt;
  i := pt2.Y - pt1.Y;
  if (i = 0) then
  begin
    if (pt1 = pt2) then
    begin
      Result := 0;
      Exit;
    end;
    // Sort by X too. Not essential, but it significantly
    // speeds up the secondary sort in ProcessIntersectList.
    i := pt1.X - pt2.X;
  end;
  if i > 0 then Result := 1
  else if i < 0 then Result := -1
  else Result := 0;
end;
This eliminates as many repeated indirections as possible, which reduces register pressure (especially on x86).
The next improvement is in TClipperBase.ProcessIntersectList, which takes the majority of the overall time in the benchmark, specifically the inner loop that increments j:
for i := 0 to highI do
begin
  // make sure edges are adjacent, otherwise
  // change the intersection order before proceeding
  node := UnsafeGet(FIntersectList, i);
  if not EdgesAdjacentInAEL(node) then
  begin
    j := i;
    repeat
      inc(j);
    until EdgesAdjacentInAEL(UnsafeGet(FIntersectList, j));
    // now swap intersection order
    FIntersectList.List[i] := UnsafeGet(FIntersectList, j);
    FIntersectList.List[j] := node;
  end;
First, we store the node at i in a local variable because it is still the same after the loop, so there is no need to fetch it repeatedly.
Since you increment j by 1 before the first check anyway, I chose a repeat loop, because I know the compiler won't emit an initial jump to the loop's tail to check the condition first. It also generates slightly better code with an inlined function as its condition, which leads to the third improvement:
function EdgesAdjacentInAEL(node: PIntersectNode): Boolean;
  {$IFDEF INLINING} inline; {$ENDIF}
var
  active1, active2: PActive;
begin
  active1 := node.active1;
  active2 := node.active2;
  Result := (active1.nextInAEL = active2) or (active1.prevInAEL = active2);
end;
Same story as before: eliminate any repeated indirections by fetching active1 and active2 only once instead of accessing each of them twice (the with statement you used there before does not change that).
Another big chunk of time can probably be shaved off by using a hand-coded quicksort that performs the comparison inline instead of calling IntersectListSort for every pair of nodes.
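To illustrate the quicksort idea, here is a minimal, untested-against-Clipper sketch of what such a sort could look like. It is not the library's code: TNodeArray, NodeLess and QuickSortNodes are names I made up, the node record is cut down to just its pt field, and it sorts a plain dynamic array rather than FIntersectList. The ordering is meant to mirror IntersectListSort (larger Y first, then smaller X), but the comparison is a small inlined Boolean function, so no call through a comparer function pointer is needed.

```pascal
program InlineCompareSortSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TPoint64 = record
    X, Y: Int64;
  end;

  // Assumption: reduced stand-in for the real node record,
  // which also carries active1/active2.
  PIntersectNode = ^TIntersectNode;
  TIntersectNode = record
    pt: TPoint64;
  end;

  // Hypothetical container; the real code keeps the nodes in a list.
  TNodeArray = array of PIntersectNode;

// True when a must be ordered before b: larger Y first, then smaller X.
// Same ordering as IntersectListSort, expressed as a Boolean test.
function NodeLess(a, b: PIntersectNode): Boolean; inline;
begin
  if a^.pt.Y <> b^.pt.Y then
    Result := a^.pt.Y > b^.pt.Y
  else
    Result := a^.pt.X < b^.pt.X;
end;

// Classic quicksort with a middle pivot. It recurses into the smaller
// partition and loops on the larger one to keep the stack shallow.
procedure QuickSortNodes(var nodes: TNodeArray; first, last: Integer);
var
  i, j: Integer;
  pivot, tmp: PIntersectNode;
begin
  while first < last do
  begin
    i := first;
    j := last;
    pivot := nodes[first + (last - first) div 2];
    repeat
      while NodeLess(nodes[i], pivot) do Inc(i);
      while NodeLess(pivot, nodes[j]) do Dec(j);
      if i <= j then
      begin
        tmp := nodes[i];
        nodes[i] := nodes[j];
        nodes[j] := tmp;
        Inc(i);
        Dec(j);
      end;
    until i > j;
    if (j - first) < (last - i) then
    begin
      QuickSortNodes(nodes, first, j);
      first := i;
    end
    else
    begin
      QuickSortNodes(nodes, i, last);
      last := j;
    end;
  end;
end;

const
  NodeCount = 1000;
var
  nodes: TNodeArray;
  i: Integer;
begin
  RandSeed := 42;
  SetLength(nodes, NodeCount);
  for i := 0 to NodeCount - 1 do
  begin
    New(nodes[i]);
    nodes[i]^.pt.X := Random(100);
    nodes[i]^.pt.Y := Random(100);
  end;

  QuickSortNodes(nodes, 0, High(nodes));

  // Sanity check: no element may be ordered before its predecessor.
  for i := 1 to High(nodes) do
    if NodeLess(nodes[i], nodes[i - 1]) then
    begin
      WriteLn('not sorted at index ', i);
      Halt(1);
    end;
  WriteLn('sorted ', Length(nodes), ' nodes');

  for i := 0 to High(nodes) do
    Dispose(nodes[i]);
end.
```

Whether this actually beats the existing sort would have to be measured in the benchmark; the point is only that the comparison can be inlined into the partition loops once the sort is written by hand.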