Current results of "dictionary word count" programs...

Jason Holt jason at lunkwill.org
Tue Mar 14 02:36:20 MST 2006


Weird, no Perl solutions for the dictionary problem?  I guess as long as I 
have my Perl hat on...

#!/usr/bin/perl
# dict.pl
open(DICT, "<$ARGV[0]") or die $!;
for(<DICT>) { chomp; $dict{lc($_)} = 0; }
open(IN, "<$ARGV[1]") or die $!;
my $in = join('', <IN>);
map { $dict{lc($_)}++ if exists($dict{lc($_)}) } split(/\W/, $in);
print("$_: $dict{$_}\n") for sort keys %dict;

Or, here's a more compact version that requires the File::Slurp module:

#!/usr/bin/perl
# dict2.pl
use File::Slurp;
%dict = map {chomp; (lc,1)} read_file($ARGV[0]);
map {$_=lc; $words{$_}++ if $dict{$_}} split(/[^a-zA-Z]/,read_file($ARGV[1]));
print("$_: $words{$_}\n") for sort keys %words;
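
One difference between the two scripts that is easy to miss: dict.pl seeds
every dictionary word with 0, so it prints a line for every word in the
dictionary, while dict2.pl only prints words that actually occur in the text.
A minimal sketch of that difference, using made-up sample data (the word list
and sentence below are assumptions for illustration, not the words.i or kjv10
files):

```perl
#!/usr/bin/perl
# Sketch only: @dict_words and $text are made-up sample data.
use strict;
use warnings;

my @dict_words = ('apple', 'Pear', 'fig');
my $text       = "Apple pie, apple tart; no PEAR here.";

# dict.pl style: seed every dictionary word with zero, then count.
my %dict = map { (lc($_), 0) } @dict_words;
for my $w (split /\W+/, $text) {
    my $k = lc $w;
    $dict{$k}++ if exists $dict{$k};
}

# dict2.pl style: membership hash plus a separate hash of counts.
my %known = map { (lc($_), 1) } @dict_words;
my %words;
for my $w (split /[^a-zA-Z]+/, $text) {
    my $k = lc $w;
    $words{$k}++ if $known{$k};
}

# Build both reports in the same "word: count" format the scripts print.
my $full   = join '', map { "$_: $dict{$_}\n" }  sort keys %dict;
my $sparse = join '', map { "$_: $words{$_}\n" } sort keys %words;
print "dict.pl style:\n", $full, "dict2.pl style:\n", $sparse;
```

Here "fig" shows up with a count of 0 in the first report and is absent from
the second. The two scripts also tokenize slightly differently: \W keeps
digits and underscores inside a token, while [^a-zA-Z] splits on them.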

And performance data (Athlon64 3000+):

[jason@erg] ~$ time ./dict2.pl words.i kjv10 >/dev/null

real    0m3.871s
user    0m3.733s
sys     0m0.135s

[jason@erg] ~$ time ./dict.pl words.i kjv10 >/dev/null

real    0m3.063s
user    0m2.915s
sys     0m0.146s

[jason@erg] ~$ time ./dict2.pl words.i kjv10x10 >/dev/null

real    0m28.032s
user    0m27.054s
sys     0m0.897s

[jason@erg] ~$ time ./dict.pl words.i kjv10x10 >/dev/null

real    0m21.702s
user    0m20.635s
sys     0m0.802s

So, about 10x slower than my C solution.

 					-J


