adventures in perl: foreach returns pointers to elements

I’m not sure how I’ve never run into this issue before. In some work I was doing recently, I ran into what I thought was a nasty bug [in my code] but couldn’t explain it. Without thinking, I started trying to undo this, fix that, hack this, and ignore that.

After the meeting I was in finished, I was able to sit down and actually put some thought into what was happening. One thing stood out as interesting, but until then I had no clue that this was how Perl behaved in that scenario. So I wrote a short test script, which sure enough proved the theory, and was able to fix my error.

Take a look at this:

my @animals = ("cat","dog","emu","frog");
 
foreach my $animal(@animals) {
    printf("do not eat the %s\n",$animal);
}

which dumps out this:

do not eat the cat
do not eat the dog
do not eat the emu
do not eat the frog

In my head, the code above would work like this: for each element in the array @animals, copy the data from that element into a new scalar $animal, then print it out. Simple enough. Now, consider this sample:

my @animals = ("cat","dog","emu","frog");
 
foreach my $animal(@animals) {
    printf("do not eat the %s\n",$animal);
    $animal = sprintf("rabid %s",$animal);
}
 
printf("\n");
 
foreach my $animal(@animals) {
    printf("do not eat the %s\n",$animal);
}

I had *EXPECTED* it to output this:

do not eat the cat
do not eat the dog
do not eat the emu
do not eat the frog

do not eat the cat
do not eat the dog
do not eat the emu
do not eat the frog

For the first loop, again, my thinking was that we copy each array element’s data into the new scalar $animal, print it, and modify it (by prepending ‘rabid’ to it), but that the modification goes nowhere: we are just modifying $animal, which I assumed was assigned by copy, so the change would be lost when we iterate to the next element. Then, in the second loop, we iterate over @animals again, initializing $animal afresh for each element, so we just hit the unmodified @animals array and see the same thing.

[un]Surprisingly, I was totally wrong. This was the output I got:

do not eat the cat
do not eat the dog
do not eat the emu
do not eat the frog

do not eat the rabid cat
do not eat the rabid dog
do not eat the rabid emu
do not eat the rabid frog

Wow! What happened? Turns out, like the title suggests, when you use foreach in Perl, $animal is not a copy of the array element but an alias for it (think of it as a pointer or reference to the element, though strictly speaking it is the element itself under another name). Any change you make to $animal applies directly to the element of the array. As another example, its for-loop equivalent would be this:

my @animals = ("cat","dog","emu","frog");
 
for( my $index=0 ; $index < @animals ; $index++ ) {
    printf("do not eat the %s\n", $animals[$index] );
    $animals[$index] = sprintf( "rabid %s", $animals[$index] );
}
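
If you want to check the aliasing for yourself rather than take my word for it, here is a minimal sketch. It takes a reference to the loop variable and a reference to the matching array element and compares the two; if they point at the same scalar, the loop variable really is the element. The $same and $index counters are just names I made up for this test.

```perl
use strict;
use warnings;

my @animals = ("cat", "dog", "emu", "frog");

# Count how often the loop variable and the array element
# turn out to be the very same scalar.
my $same  = 0;
my $index = 0;

foreach my $animal (@animals) {
    # Comparing two references numerically compares the addresses
    # of the scalars they point to.
    $same++ if \$animal == \$animals[$index];
    $index++;
}

printf("%d of %d loop variables were aliases\n", $same, scalar @animals);
```

On my understanding of foreach, every iteration should count as an alias, which matches the behavior described above.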

Fun! Doing some research after the fact, it turns out there has been some mild discussion on the topic. Some consider it a bug, but in reality it works by design: the perlsyn documentation describes the foreach loop variable as an implicit alias for each item in the list. What does this mean for you? Well, if you happen to be iterating over an array using foreach, want to modify each element, and plan on looping over the array multiple times, make sure you understand what’s going on behind the scenes, or else you’ll run into the same issue I did. You can create a temporary variable inside the loop (which will assign by copy), i.e.:

my $new_animal = $animal;

or you can iterate using a for-loop and just do it like this:

my $animal = $animals[$index];
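
To put the copy approach in context, here is a small sketch of the earlier example reworked so that the original array survives. The @labels array and the $new_animal name are only illustrative; the point is that the modification happens on a copy, never on the loop variable itself.

```perl
use strict;
use warnings;

my @animals = ("cat", "dog", "emu", "frog");
my @labels;    # hypothetical array holding the modified copies

foreach my $animal (@animals) {
    # Plain assignment copies the value, so $new_animal is
    # independent of the array element that $animal aliases.
    my $new_animal = $animal;
    $new_animal = sprintf("rabid %s", $new_animal);
    push @labels, $new_animal;
}

# @animals is untouched; @labels holds the modified strings.
printf("do not eat the %s\n", $_) foreach @animals;
printf("\n");
printf("do not eat the %s\n", $_) foreach @labels;
```

The first batch of output lines shows the plain animal names and the second shows the rabid ones, which is what I had wanted all along.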

There of course may be some cases where you want to take advantage of the referencing feature, and if you do, all the more power to you.
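
As one example of using the aliasing on purpose, here is a hedged sketch that trims whitespace from every element of an array in place. The @names array and its contents are made up for illustration; the statement-modifier form of for aliases $_ to each element in the same way foreach aliases a named variable.

```perl
use strict;
use warnings;

# Hypothetical input with stray leading and trailing whitespace.
my @names = ("  cat ", "dog  ", " emu");

# $_ is an alias for each element, so the substitution edits
# @names directly; no index bookkeeping or second array needed.
s/^\s+|\s+$//g for @names;

printf("[%s]\n", $_) for @names;
```

After the loop, the brackets in the output hug each name with no padding, because the array itself was changed.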

Well, that’s all I have for now. Hopefully this helps someone out some day. Enjoy!
