CodeKicks.com
Focus on Microsoft Technologies - Tutorials, Articles, Code Samples.

Monday, January 24, 2011

An observation on .NET loops – foreach, for, while, do-while

It’s very common that .NET programmers use “foreach” loop for iterating through collections. Following is my observation whilst I was testing simple scenario on loops. “for” loop is 30% faster than “foreach” and “while” loop is 50% faster than “foreach”. “do-while” is bit faster than “while”. Someone may feel that how does it make difference if I’m iterating only 1000 times in a loop.

This test case is only for simple iteration. According to the "Data structure" concepts, best and worst cases are completely based on the data we provide to the algorithm. so we can not conclude that a "foreach" algorithm is not good. All I want to tell that we need to be little cautious even choosing the loops.

Example:- You might want to chose quick sort when you want to sort more numbers. At the same time bubble sort may be effective than quick sort when you want to sort less numbers.


Take a simple scenario, a request of a simple web application fetches the data of 10000 (10K) rows and iterating them for some business logic. Think, this application is being accessed by 1000 (1K) people simultaneously. In this simple scenario you are ending up with 10000000 (10Million or 1 Crore) iterations.

below is the test scenario with simple console application to test 100 Million records.

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var sw = new Stopwatch();
var numbers = GetSomeNumbers();

sw.Start();
foreach (var item in numbers)
{

}
sw.Stop();

Console.WriteLine(
String.Format("\"foreach\" took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
for (int i = 0; i < numbers.Count; i++)
{

}
sw.Stop();

Console.WriteLine(
String.Format("\"for\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it = 0;
while (it++ < numbers.Count)
{

}
sw.Stop();

Console.WriteLine(
String.Format("\"while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it2 = 0;
do
{

} while (it2++ < numbers.Count);

sw.Stop();

Console.WriteLine(
String.Format("\"do-while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));
}

#region Get me 10Crore (100 Million) numbers
private static List<int> GetSomeNumbers()
{
var lstNumbers = new List<int>();
var count = 100000000;
for (var i = 1; i <= count; i++)
{
lstNumbers.Add(i);
}
return lstNumbers;
}
#endregion Get me some numbers
}
}


In above example, I was just iterating through 100 Million numbers. You can see the time to execute various  loops provided in .NET

Output
"foreach" took 1108 milliseconds

"for" loop took 727 milliseconds

"while" loop took 596 milliseconds

"do-while" loop took 594 milliseconds

  Press any key to continue . . .

NET_LOOPS_OUTPUT

So I feel we need to be careful while choosing the looping strategy. Please comment your thoughts.

Post a Comment

Steve Wortham said...

The results are a bit surprising with foreach. I mean, it makes sense that it would be slower, I just didn't expect that much of a difference. Of course, it's very plain to see that when there's nothing done inside the loop.

However, there's nothing wrong with the regular for loop. The only reason it's slower in this case is that you're not storing the count before you begin the tests.

If you call:
var NumberCount = numbers.Count;

And then use NumberCount in each loop then the for loop should be identical to the while loops in performance.

Sylvain said...

Hello,

I think your test is inadequate : you forget that foreach not only parse the list but affect too the value to item. If you try some item = numbers[i] in the others loops they are slower than foreach.
So in fact if you really need to access to each of part your array foreach is the faster solution.

Anonymous said...

Do something in one of those million loops, and you'll see that life is too short to worry about 'looping strategy'

Dhawal said...

Your program is flawed. The foreach loop assigns the contents of the list to a variable whereas the other loops don’t and hence the difference in execution times.

There is hardly any difference between the 3 looping methods. Even if there is any, it will be so small that it won't matter at all. So use whatever you like.

Anonymous said...

Was this done in debug mode? Try with optimisations enabled - they should end up being the same?

wekempf said...

"Premature optimization is the root of all evil."

These benchmarks are meaningless. Any difference in performance here will be entirely lost in the noise when you're loop actually does anything. This means that changing pretty much any real world foreach loop to be a do/while loop won't result in any significantly measurable difference in overall performance. Meanwhile, the likelyhood of introducing bugs is significantly more likely with a do/while than with a foreach.

http://blogs.msdn.com/b/ricom/archive/2004/06/30/170107.aspx

There's the opinion of the leading expert on .NET performance.

Anonymous said...

It seems to me that your test is flawed. The foreach example actually gets a hold of the item and assigns it to a variable while all other loops simply iterate over the item count.

Also, the loop performance greatly depends on the type of collection used.

Lloyd Cotten said...

Hmm... hope you don't mind a little criticism. While an interesting read, it's not very realistic, and there's also a couple of code problems with your post.

First the code issues:
The it++/it2++ call should be ++it/++it2 and for your while loop, your iterator should be initialized to -1. The it++ returns the variable and then increments the value, where as ++it increments the value and returns the result. As is, your code execution will go into the block when it is outside the bounds of the array. Making this change, and running it on my machine actually gets me better results.

Before code change:

"foreach" took 257 milliseconds
"for" loop took 101 milliseconds
"while" loop took 116 milliseconds
"do-while" loop took 112 milliseconds

After code change:

"foreach" took 243 milliseconds
"for" loop took 97 milliseconds
"while" loop took 47 milliseconds
"do-while" loop took 49 milliseconds

Secondly, you'll nearly always want to use the item in the array/list. When you change your code to assign the current index/item to a variable inside the loop, there is less of a difference and in fact the foreach is the fastest loop for me.

"foreach" took 299 milliseconds
"for" loop took 360 milliseconds
"while" loop took 343 milliseconds
"do-while" loop took 350 milliseconds

Here's the changed code (hope the formatting tuns out OK):

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var sw = new Stopwatch();
var numbers = GetSomeNumbers();

sw.Start();
foreach (var item in numbers)
{
var a = item;
}
sw.Stop();

Console.WriteLine(
String.Format("\"foreach\" took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
for (int i = 0; i < numbers.Count; i++)
{
var a = numbers[i];
}
sw.Stop();

Console.WriteLine(
String.Format("\"for\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it = -1;
while (++it < numbers.Count)
{
var a = numbers[it];
}
sw.Stop();

Console.WriteLine(
String.Format("\"while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it2 = 0;
do
{
var a = numbers[it2];
} while (++it2 < numbers.Count);

sw.Stop();

Console.WriteLine(
String.Format("\"do-while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

Console.ReadKey();
}

#region Get me 10Crore (100 Million) numbers
private static List GetSomeNumbers()
{
var lstNumbers = new List();
var count = 100000000;
for (var i = 1; i <= count; i++)
{
lstNumbers.Add(i);
}
return lstNumbers;
}
#endregion Get me some numbers
}
}

St├ęphane said...

Hello.

Within the foreach loop you have a direct reference to the item.

In the other loops, you must do another operation to get the item by its index.

I think your benchmark would be more relevant if you take this into account and retrieve each items inside the loops.

St├ęphane.

David said...

Still what if you are looping through complex items. Inside the loop you are still going to have to access the item in the loop. These are not fair comparisons in this case. The foreach is doing work for you that you will need to do inside your do-while. Access the member that your iterating over.

Try it again with a foreach of a complex type List .

kjordan said...

This is a little unfair to the foreach loop since you're not accessing the element in the List like the foreach automatically does for you. Also, I'm not sure how .NET does the Count field for the List, but if there's a getter involved, you'll get much better performance if you stick it in a variable first and use that in the loop. You also have to be aware of the size of what you want to iterate since an object wrapper is generated for the Iterator for a foreach which adds a little to it. If you have to iterate 100 million records each time you get a request, you're doing something wrong.

Anonymous said...

And what happens when you change the order in which the test is run?

My first suspicion is that it's painfully suspicious that the runtime decreases monotonically with testing order.

Anonymous said...

your not doing anything with the list in either the for or do-while loop where as the foreach is accessing each element.

I would put that down as a bad comparison.

Dennis Sellinger said...

Lets see, The fastest time is about half a second and the slowest is twice as long. And all this for 100 million rows.

The cost of performing the iteration itself is swamped by the cost of doing what every I want to do in the my loop (e.g performing a database request or loading the information from a cache). I would say that we should forget about optimizing our iteration and concentrate on optimizing the function we are doing in the iteration or trying to reduce number of iterations.

I assure you that if you are processing 100 simultaneousness requests, the half a second to do the iteration will not even be a blip in your consumed time.

Also, your test should rather launch each request in a thread and perform the 10000 iterations and sum the total and you could make your claim one way or another with better data.

Dennis Sellinger said...

Another problem with this test is that you are counting the iteration times, but not the dereferencing time. That is, in order to access your row, you have to referenced it from from an indexed array. When I went to school we learnt that the cost of dereferencing is not trivial and so (c programmers will) use pointer arithmetic in their for loop, rather than using a index. This is where the foreach loop is superior because this type of pointer dereference is used by default (rather than an indexed lookup).

Once again the test code should do the same thing. The foreach loop gives you a reference to the iterated object, the for and while loops are just incrementing an index.

Zack Owens said...

Please read my response blog post. http://weblogs.asp.net/zowens/archive/2011/01/24/try-to-avoid-foreach-for-loops-over-my-dead-body.aspx. It isn't personal, I just want to dispel this incorrect result.

Anonymous said...

maybe do an apples to apples comparison and actually get the value from the list? All your for/while loops do is count from 1 to 100000000.

Add var number = numbers[i]; in the loops at least? seriously, how fast your computer can count from 1 to 100000000 isn't that interesting...

Mirko said...

Hm, I think it might make a big difference that you do not access your numbers in the for and while loop. Maybe the difference will get lower when you include something like numbers.get(it) in your loops? Anyway, a factor of 2 is IMO not very much given the more clear syntax of foreach.

Regards
Mirko

Anonymous said...

First of all, you should try repeating this test multiple times and analyse the data, because this one time result not only means nothing, but also provides false information. The for, while and do-while loops are pretty much the same thing when you look at the IL representation. Therefore, ther is no good reason why for loop shoudl be slower than a while loop.
Second, you're definately exaggerating. Sacrificing code readability is not worth a less than 10ns gain per record. In a real world app you wouldn't even notice that because usually the loop body takes far more time than the looping mechanism.
Besides, who on earth would consider loop on the CLR through a recordset big enough to become a performance issue?! In most cases, if you end up with huge recordsets in the CLR then it means you're doing something wrong (e.g. both SQL and NoSQL databases offer way better performance when handling huge data sets).
However, when looping really becomes an issue, you should consider ditching the CLR and look for a native solution.

Anonymous said...

1. A test that is not repeated doesn't count. Even a simple average of a about 1000 results would be better.
2. No data accessing in the other loops means that they are not the same as the foreach loop. Try looking at the IL and you'll notice why is foreach so different.
3. For, while and do-while are pretty much the same thing. Again, try looking at the IL.
4. It's just ~5ns per record (according to your results), so it's nothing compared to what would normally be in the loop body. I really would like to work on a project where looping is performance issue :) So it's a ~5ns gain but far less readble code.
5. Looping through that much data means that you're probably doing something wrong. Both SQL and NoSQL databases offer way better mechanisms to deal with filtering and processing huge data sets.
6. If looping becomes an issue on your project, consider ditching the CLR and use something native.

Anonymous said...

You are right but i think you miss the point here the difference between foreach and other loop.
i have made few change and see the result.

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace TestLoopTimes
{
class Program
{
static void Main(string[] args)
{
var sw = new Stopwatch();
var numbers = GetSomeNumbers();
var z = 0;

sw.Start();
foreach (var item in numbers)
{
z = item;
}
sw.Stop();

Console.WriteLine(
String.Format("\"foreach\" took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
for (int i = 0; i < numbers.Count; i++)
{
z = numbers[i];
}
sw.Stop();

Console.WriteLine(
String.Format("\"for\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it = 0;
while (++it < numbers.Count)
{
z = numbers[it];
}
sw.Stop();

Console.WriteLine(
String.Format("\"while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

sw.Reset();
sw.Start();
var it2 = 0;
do
{
z = numbers[it2];
} while (++it2 < numbers.Count);

sw.Stop();

Console.WriteLine(
String.Format("\"do-while\" loop took {0} milliseconds",
sw.ElapsedMilliseconds));

Console.WriteLine("Press any key to continue ...");
Console.ReadKey();
}

#region Get me 10Crore (100 Million) numbers
private static List GetSomeNumbers()
{
var lstNumbers = new List();
var count = 100000000;
for (var i = 1; i <= count; i++)
{
lstNumbers.Add(i);
}
return lstNumbers;
}
#endregion Get me some numbers
}
}

Anonymous said...

Pure 100M foreach loop takes little above 1 sec...I guess more time consuming is what you do in the loop than loop it self. I think its 500 milisecs is acceptable when whole process can take much much more.

Anonymous said...

Try to iterate a Array or List with Foreach loop.
You will notice a significant performance in foreach loop than for loop

Mike Bosch - Software Engineer said...

Wow nice finding. These code optimization techniques really help to run the systems fast. These are some simple things every programmer but remind while doing coding.

Anonymous said...

I had some problems using the stopwatch in some tests and for a real compare I had to initialize it before really using it.

the code:
...
var sw = new Stopwatch();
var numbers = GetSomeNumbers();

sw.Start();
foreach (var item in numbers)
...

and should be:
...
var sw = new Stopwatch();

// the initialization mentioned
sw.Start();
sw.Stop();
sw.Reset();

var numbers = GetSomeNumbers();

sw.Start();
foreach (var item in numbers)
...