I was using my code to parse a medium-sized CVS log and it was being slow (~1 min). So, like an idiot, I said to myself: I know the algorithm is quite slow, so I'll optimize it. Here is a version which is twice as fast as the original:
    use strict;
    use warnings;
    use Test::More tests => 8;

    is(cmp_cvs_tags('1.1',   '1.1'),    0);
    is(cmp_cvs_tags('1.1',   '1.1.1'), -1);
    is(cmp_cvs_tags('1.1',   '1.2'),   -1);
    is(cmp_cvs_tags('1.2',   '1.2'),    0);
    is(cmp_cvs_tags('1.2',   '1.3'),   -1);
    is(cmp_cvs_tags('1.1.1', '1.1'),    1);
    is(cmp_cvs_tags('1.2',   '1.1'),    1);
    is(cmp_cvs_tags('1.3',   '1.1'),    1);

    sub cmp_cvs_tags {
        my ($a, $b) = @_;
        return 0 if ($a eq $b);

        # eliminate common prefix
        my ($alen, $blen) = (length($a), length($b));
        my $minlen = ($alen < $blen) ? $alen : $blen;
        for (my $i = 0; $i < $minlen; ++$i) {
            if (substr($a, $i, 1) ne substr($b, $i, 1)) {
                # first different character, cut until this point
                $a = substr($a, $i);
                $b = substr($b, $i);
                last;
            }
        }

        # compare the remaining dot-separated components numerically;
        # the shorter revision sorts first when all shared parts are equal
        my @a_lst = split /\./, $a;
        my @b_lst = split /\./, $b;
        while (1) {
            my ($apart, $bpart) = (shift @a_lst, shift @b_lst);
            return -1 if (!defined $apart);
            return  1 if (!defined $bpart);
            next if ($apart == $bpart);
            return $apart <=> $bpart;
        }
    }
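The "twice as fast" figure is easy to check with the core Benchmark module. A minimal sketch, assuming the pre-optimization routine is still around under the hypothetical name cmp_cvs_tags_old:

    use Benchmark qw(cmpthese);

    my @pairs = (['1.1', '1.1.1'], ['1.2', '1.3'], ['1.3', '1.1']);

    # negative count = run each sub for at least 3 CPU seconds,
    # then print a comparison table of rates
    cmpthese(-3, {
        old => sub { cmp_cvs_tags_old(@$_) for @pairs },  # hypothetical old version
        new => sub { cmp_cvs_tags(@$_)     for @pairs },
    });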
However, the whole script was still running slowly. So I did what I should have done in the first place: run Devel::NYTProf on it. After looking at the reports it was clear that 99% of the time was being spent in the read loop. Dooh!
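For reference, profiling with Devel::NYTProf is a two-step affair: load it as a debugger module while running the script, then render the collected data (the script name here is just a placeholder):

    perl -d:NYTProf parse_cvs_log.pl   # writes ./nytprof.out
    nytprofhtml                        # renders HTML reports from nytprof.out

The HTML reports break time down per statement and per subroutine, which is what made it obvious the hot spot was the read loop, not the tag comparison.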