Perl split gotcha

This is one of those things that is spelled out in the documentation, but most people (myself included) don’t read the fine manual until really, really forced to, and from the way it is described it is not immediately clear how it can bite you. From perldoc:

Empty trailing fields, on the other hand, are produced when there is a match at the end of the string (and when LIMIT is given and is not 0), regardless of the length of the match.

Now consider the following example:

print join(',', split(/,/, 'a,b,c,d,'));

This prints “a,b,c,d”, without the trailing comma, because when LIMIT is omitted (or 0) split drops the trailing empty field(s). To keep them, specify a negative LIMIT. As per the documentation:

If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.
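To make the difference concrete, here is a minimal sketch that runs the same split with and without a negative LIMIT. The variable names are only illustrative.

```perl
use strict;
use warnings;

my $input = 'a,b,c,d,';

# Default (no LIMIT): trailing empty fields are dropped.
my @default = split(/,/, $input);

# Negative LIMIT: trailing empty fields are kept.
my @all = split(/,/, $input, -1);

print scalar(@default), ': ', join(',', @default), "\n";   # 4: a,b,c,d
print scalar(@all),     ': ', join(',', @all),     "\n";   # 5: a,b,c,d,
```

With -1 the list has a fifth, empty element, so joining it back reproduces the original string exactly.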

So passing a negative LIMIT (for example -1) as the third argument to split in the example above keeps the trailing empty field. You might want this behavior or you might not. The point is to be aware of it so that you can apply it when needed. I needed it for stripping whitespace from lines in source code. Since the change was already intrusive, it was very important to preserve the number of newlines at the end of each file, which meant using a negative LIMIT when splitting at newlines.
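As a rough sketch of that use case (not the actual script), assume the goal is to remove trailing spaces and tabs from every line while leaving the newline count untouched. The helper name strip_trailing_whitespace and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing spaces/tabs from each line
# while keeping every newline, including those at the end of the text.
sub strip_trailing_whitespace {
    my ($text) = @_;

    # The -1 keeps the empty fields produced by newlines at the end;
    # without it they would be dropped and the trailing newlines lost.
    my @lines = split(/\n/, $text, -1);

    s/[ \t]+$// for @lines;

    return join("\n", @lines);
}

my $source = "foo  \nbar\t\n\n\n";
my $clean  = strip_trailing_whitespace($source);

print $clean eq "foo\nbar\n\n\n" ? "newlines preserved\n" : "newlines lost\n";
```

Without the -1, the same split would return only the two non-empty lines, and the joined result would lose all three trailing newlines.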
