How does PHP 'foreach' actually work?
Let me prefix this by saying that I know what foreach is, does and how to use it. This question concerns how it works under the bonnet, and I don't want any answers along the lines of "this is how you loop an array with foreach".
For a long time I assumed that foreach worked with the array itself. Then I found many references to the fact that it works with a copy of the array, and I have since assumed this to be the end of the story. But I recently got into a discussion on the matter, and after a little experimentation found that this was not in fact 100% true.
Let me show what I mean. For the following test cases, we will be working with the following array:
$array = array(1, 2, 3, 4, 5);
foreach ($array as $item) {
    echo "$item\n";
    $array[] = $item;
}
print_r($array);
/* Output in loop: 1 2 3 4 5
$array after loop: 1 2 3 4 5 1 2 3 4 5 */
This clearly shows that we are not working directly with the source array - otherwise the loop would continue forever, since we are constantly pushing items onto the array during the loop. But just to be sure this is the case:
foreach ($array as $key => $item) {
    $array[$key + 1] = $item + 2;
    echo "$item\n";
}
print_r($array);
/* Output in loop: 1 2 3 4 5
$array after loop: 1 3 4 5 6 7 */
This backs up our initial conclusion: we are working with a copy of the source array during the loop, otherwise we would see the modified values during the loop. But...
If we look in the manual, we find this statement:
When foreach first starts executing, the internal array pointer is automatically reset to the first element of the array.
Right... this seems to suggest that foreach relies on the array pointer of the source array. But we've just proved that we're not working with the source array, right? Well, not entirely.
// Move the array pointer on one to make sure it doesn't affect the loop
var_dump(each($array));
foreach ($array as $item) {
    echo "$item\n";
}
var_dump(each($array));
/* Output
array(4) {
[1]=>
int(1)
["value"]=>
int(1)
[0]=>
int(0)
["key"]=>
int(0)
}
1
2
3
4
5
bool(false)
*/
So, despite the fact that we are not working directly with the source array, we are working directly with the source array pointer - the fact that the pointer is at the end of the array at the end of the loop shows this. Except this can't be true - if it were, then the first test case would loop forever.
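A note for anyone rerunning this on a recent PHP: each() was deprecated in PHP 7.2 and removed in 8.0, so the test above will not run there as written. The following is a minimal sketch of the same experiment using next() and key(), which are still available. The variable names $before and $after are my own. I have deliberately not stated an expected output, since the result depends on the PHP version; a NULL for $after would correspond to the bool(false) seen above.

```php
<?php
$array = array(1, 2, 3, 4, 5);

// Move the array pointer on one, as each() did in the test above.
next($array);
$before = key($array);   // key under the internal pointer before the loop

foreach ($array as $item) {
    echo "$item\n";
}

// key() returns NULL once the pointer has run off the end of the array.
$after = key($array);

var_dump($before, $after);
```

Comparing $before and $after on different PHP versions is a quick way to see whether the loop moved the source array's pointer at all.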
The PHP manual also states:
As foreach relies on the internal array pointer, changing it within the loop may lead to unexpected behavior.
Well, let's find out what that "unexpected behavior" is (technically, any behavior is unexpected since I no longer know what to expect).
foreach ($array as $key => $item) {
    echo "$item\n";
    each($array);
}
/* Output: 1 2 3 4 5 */
foreach ($array as $key => $item) {
    echo "$item\n";
    reset($array);
}
/* Output: 1 2 3 4 5 */
...nothing that unexpected there; in fact, it seems to support the "copy of source" theory.
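For completeness, here is a sketch of the same two experiments in a form that can be checked programmatically. It uses next() in place of each(), which no longer exists as of PHP 8.0, and collects the visited items into arrays instead of echoing them as it goes. The names $seenWithNext and $seenWithReset are purely illustrative. On PHP 7 and 8 I would expect both lines of output to read 1 2 3 4 5, matching the results above.

```php
<?php
$array = array(1, 2, 3, 4, 5);

// Variant 1: advance the internal pointer on every iteration.
$seenWithNext = array();
foreach ($array as $item) {
    $seenWithNext[] = $item;
    next($array);
}

// Variant 2: rewind the internal pointer on every iteration.
$seenWithReset = array();
foreach ($array as $item) {
    $seenWithReset[] = $item;
    reset($array);
}

echo implode(' ', $seenWithNext), "\n";
echo implode(' ', $seenWithReset), "\n";
```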
The Question
What is going on here? My C-fu is not good enough for me to be able to extract a proper conclusion simply by looking at the PHP source code, so I would appreciate it if someone could translate it into English for me.
It seems to me that foreach works with a copy of the array, but sets the array pointer of the source array to the end of the array after the loop.
- Is this correct and the whole story?
- If not, what is it really doing?
- Is there any situation where using functions that adjust the array pointer (each(), reset() et al.) during a foreach could affect the outcome of the loop?