Jump to content
Sign in to follow this  
Mike Torrettinni

Is variable value kept after For.. in ... do loop?

Recommended Posts

I know I should never rely on iterator value to be kept as iterated value when normal For loop ends, but what about in For in loop?

 

I never rely on i being the same value after loop as when 'condition' is met, so I assign i value to variable (vIndex)  so I can use the value of after the loop:

 

for i := 1 to n do
  if condition then
  begin
    vIndex := i;
    Break;
  end;
    

 

But in For..in loop I assume vItem still holds the same value as when condition is met, so I can use vItem afer loop

for vItem in Items do
   if condition then
     Break;

     
All my tests show that vItem keeps the value and can be used after For...in loop.

 

Am I right and I can use vItem without worries, or is there a case where vItem should not be expected to be the same value as when 'condition' is met?

 


 

Edited by Mike Torrettinni

Share this post


Link to post

It's the same issue for both types of for loops. The value of the loop variable is undefined if the for loop terminates normally. In your case that corresponds to the condition never having been met. 

 

Unfortunately though, whilst you are aware that there is an issue here, you have not correctly understood what the issue is. 

 

1 hour ago, Mike Torrettinni said:

I never rely on i being the same value after loop as when 'condition' is met, so I assign i value to variable (vIndex)  so I can use the value of i  after the loop

That reasoning is incorrect. The loop variable i does not change in case of a normal termination, e.g. the break in your code. 

 

Before you consider for in loops I recommend that you correct your misunderstanding of for loops. 

 

Executive summary. There is an issue with for loops of both kind. It is the same issue for both kind of for loop. But the issue is not what you have described. 

Edited by David Heffernan
  • Thanks 1

Share this post


Link to post

http://docwiki.embarcadero.com/RADStudio/Sydney/en/Declarations_and_Statements_(Delphi)#For_Statements

 

For to rules are pretty clear: 

Quote

After the for statement terminates (provided this was not forced by a Break or an Exit procedure), the value of counter is undefined.
Warning: The iteration variable counter cannot be modified within the loop. This includes assignment and passing the variable to a var parameter of a procedure. Doing so results in a compile-time warning.

 

For in rules are a bit general:

Quote

Warning: The iteration variable cannot be modified within the loop. This includes assignment and passing the variable to a var parameter of a procedure. Doing so results in a compile-time warning.

I am not sure how this would be a problem if we are talking about a loop over a list of objects, as we then would be passing a reference to the list object?

I have not really thought about how the variable of a for in loop over an array of records is kept.
Do we see a deep copy of the array element in the loop variable, or is there pointer magic involved?
I can see how this could be a problem if the reference passed is to a local stack copy.

  • Thanks 1

Share this post


Link to post
55 minutes ago, Lars Fosdal said:

I have not really thought about how the variable of a for in loop over an array of records is kept.
Do we see a deep copy of the array element in the loop variable, or is there pointer magic involved?
I can see how this could be a problem if the reference passed is to a local stack copy.

Loop variables are assigned using assignment semantics. That's all you need to know. 

Share this post


Link to post
21 minutes ago, David Heffernan said:

Loop variables are assigned using assignment semantics. That's all you need to know. 

So, looping an array of records using for v in array, can be significantly more expensive than indexed access using for vx := Low(array) to High(array), given the size of the record?

  • Like 1

Share this post


Link to post
8 minutes ago, Lars Fosdal said:

So, looping an array of records using for v in array, can be significantly more expensive than indexed access using for vx := Low(array) to High(array), given the size of the record?

Correct. 

  • Thanks 1

Share this post


Link to post

So actually, while looking cool and trendy it really is slow and not so cool at all? I got some refactoring to do... gnarf!

  • Like 2

Share this post


Link to post
6 minutes ago, Sherlock said:

So actually, while looking cool and trendy it really is slow and not so cool at all? I got some refactoring to do... gnarf!

Almost all syntax sugar is expensive. for-in means creating iterator, calling its methods etc instead of plain index addressing. But in most cases it doesn't matter. Anyway if speed is important, pointer-based iteration probably will be faster

  • Like 4

Share this post


Link to post

Which means lists and arrays of objects may be preferable over lists and arrays of records in certain situations.

Share this post


Link to post
15 minutes ago, Sherlock said:

So actually, while looking cool and trendy it really is slow and not so cool at all? I got some refactoring to do... gnarf!

Yeah... you are not the only one.

  • Like 1

Share this post


Link to post
19 minutes ago, Fr0sT.Brutal said:

for-in means creating iterator

This is why I always implement my enumerators as records so that they can be automatically allocated.

 

19 minutes ago, Fr0sT.Brutal said:

Anyway if speed is important, pointer-based iteration probably will be faster

Will it? Faster than array indexing? That seems to be something of a myth in my experience.

Share this post


Link to post
53 minutes ago, Sherlock said:

So actually, while looking cool and trendy it really is slow and not so cool at all?

Not necessarily. And in any case, if you wanted performance, you'd not have chosen Delphi.

  • Haha 1

Share this post


Link to post

Ah well, I chose Delphi because I like it. 😄 And if I can avoid a slow down whilst using Delphi I will try to do so. So thanks for the insight, and probably I'll do some benchmarking for my specific application, just to see if this refactoring is justified.

Share this post


Link to post

If you use an iterator outside the loop, the compiler will warn you of the incorrect use. At least in 10.4.

Share this post


Link to post

You guys are right, it really creates the whole record structure for variable iterator and copies the content, for each item iteration. So, if you have really big record structure it will use some memory, or will be considerably slower  than using normal iterator. Had a suspicion on this, but never really thought about it enough (or test it myself), but now I know for sure.

 

7 hours ago, David Heffernan said:

It's the same issue for both types of for loops. The value of the loop variable is undefined if the for loop terminates normally.

Thanks, I had a feeling is good to check with experts

5 hours ago, Lars Fosdal said:

After the for statement terminates (provided this was not forced by a Break or an Exit procedure), the value of counter is undefined.
Warning: The iteration variable counter cannot be modified within the loop. This includes assignment and passing the variable to a var parameter of a procedure. Doing so results in a compile-time warning.

or read the documentation 🙂

Share this post


Link to post
3 hours ago, Lars Fosdal said:

Yeah... you are not the only one.

Major rule of optimization says: Benchmark First Before Trying To Optimize 😉 . Syntax sugar is created for code simplicity and shortness and in most cases perf degradation will be insignificant

3 hours ago, David Heffernan said:

Will it? Faster than array indexing? That seems to be something of a myth in my experience.

I've really seen perf gain in string processing. With index loop you have ArrPtr + I*SizeOf(Item) each time which could, f.ex., cause cache misses. OTOH, Inc(CurrPtr, SizeOf(Item)) always operates on near memory area. Honestly I'm not a CPU-level expert and array loops could be well optimized already but in my case benches shown that pointer loop is faster.

3 hours ago, David Heffernan said:

 

  • Like 2

Share this post


Link to post
11 minutes ago, Fr0sT.Brutal said:

With index loop you have ArrPtr + I*SizeOf(Item) each time which could, f.ex., cause cache misses. OTOH, Inc(CurrPtr, SizeOf(Item)) always operates on near memory area. Honestly I'm not a CPU-level expert and array loops could be well optimized already but in my case benches shown that pointer loop is faster.

The hardware is designed to handle indexed access efficiently. I've seen plenty of cases where incrementing a pointer inside the loop is slower. After all, you have an extra variable that needs to be incremented. Hard to see why that would be faster.

 

11 minutes ago, Fr0sT.Brutal said:

With index loop you have ArrPtr + I*SizeOf(Item) each time which could, f.ex., cause cache misses.

No. The memory access pattern is identical. It's just a sequential pass across the array.

Edited by David Heffernan
  • Like 1

Share this post


Link to post
8 minutes ago, Fr0sT.Brutal said:

With index loop you have ArrPtr + I*SizeOf(Item) each time which could, f.ex., cause cache misses

Why would calculating a pointer cause a cache miss?

Share this post


Link to post

In this simple example you can see the the memory consumption jump when starting to iterate and also the time it takes to copy each record (not too much, but noticeable under debugger):

The memory consumption of each record is around 390MB so the effect of iterating with variable is very noticeable, while normal iterator is fast and no additional memory consumption:

 

 const
    cMaxItems = 100000000;

  type

    TData = record
      Values: array[1..cMaxItems] of string;
    end;

var
  gData: TArray<TData>;
  vItem: TData;
  
procedure TForm2.FormCreate(Sender: TObject);
var c: integer;
begin
  SetLength(gData, 3);

  gData[0].Values[1] := 'a';
  gData[1].Values[1] := 'b';
  gData[2].Values[1] := 'c';

  for vItem in gData do // slow and watch task manager for memory consumption
    inc(c);

  for c := Low(gData) to High(gData) do // fast 
   sleep(0);
end;  

 

 

Share this post


Link to post
22 minutes ago, Mike Torrettinni said:

The memory consumption of each record is around 390MB

I'd say that if you have records that size then the iterator is not the problem you should be focusing on :classic_smile:

  • Like 1

Share this post


Link to post
23 minutes ago, Anders Melander said:

I'd say that if you have records that size then the iterator is not the problem you should be focusing on :classic_smile:

If the iteration is the only thing that is affected with such large records, then is good to know what could be performance bottleneck.

I don't have any records of such size, so it is good to see what happens if you have nested records of large size, or smaller records but you have a large number of them, so iteration copies smaller memory but a lot of times - could be even worse than a few of 390MB records.

Share this post


Link to post
14 minutes ago, Mike Torrettinni said:

or smaller records but you have a large number of them, so iteration copies smaller memory but a lot of times

Why would there be an issue for small records? Presumably you are iterating over the array because you want to look at the content of the record. For a small enough record, there won't be any difference in perf between reading a field and copying the entire record.

Share this post


Link to post
6 minutes ago, David Heffernan said:

Why would there be an issue for small records? Presumably you are iterating over the array because you want to look at the content of the record. For a small enough record, there won't be any difference in perf between reading a field and copying the entire record.

Iterating a few large records (390MB) or lots of small records (10000x 0.039MB), could have similar performance effects.

Share this post


Link to post
40 minutes ago, Mike Torrettinni said:

Iterating a few large records (390MB) or lots of small records (10000x 0.039MB), could have similar performance effects.

No. Because when you iterate over an array, you typically don't read all of the content of each item.

 

Imagine that all you do is look inside each record for an integer ID. If the record is huge, then you can just read a single integer, and move on, if using a classic for loop with an array of record. But if you use a for in loop you have to copy the entire record before reading the single integer ID. That's wasteful.

 

In the case of a smaller record, let's say small enough to fit into a cache line, then reading an integer from the record has the essentially same cost as copying the entire record and then picking out the integer.

 

So the trade off depends hugely on the size of the record, and what proportion of it you actually need to access.

  • Like 1

Share this post


Link to post
51 minutes ago, David Heffernan said:

No. Because when you iterate over an array, you typically don't read all of the content of each item.

I was referring to variable iterating For..in loop - in this case full array record content is copied into variable. No? At least that's what I see from the simple example above.

Edited by Mike Torrettinni

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×