Interesting little regex

Alan Young alansyoungiii at gmail.com
Fri Feb 24 13:40:15 MST 2006


> Yes, what are the unique occurrences of text in that string?  I've run the
> code and I'm still not exactly sure what it's supposed to do.
>
> use Data::Dump qw/ dump /;
>
> $a="abcde"x4;
> $a=~s{((\w+?)(??{!$b{$^N}++?"(?=)":"(?!)"}))}{($1)}xg;
> print "$a\n";
> print dump(\%b), "\n";
>
> (a)(b)(c)(d)(e)(ab)(cd)(ea)(bc)(de)(abc)de
> { a => 3, ab => 2, abc => 1, b => 2, bc => 1, c => 2, cd => 1, d => 3, de
> => 2, e => 3, ea => 1 }

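Each count in %b is the number of times the regex tried that substring
as a token while scanning left to right: 'abc' was tried once (and
accepted), 'ab' twice (accepted the first time, rejected the second
time so the match grew to 'abc'), and 'a' three times (accepted once,
then rejected twice so the match grew to 'ab' and later 'abc').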
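
A rough sketch of the same scan written as a plain loop may make the
counts easier to follow. The names tokenize_unseen and %tried are
placeholders (the one-liner uses %b), and the sketch assumes the input
is all word characters, so it is an illustration of the idea and not
the code from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: at each position take the shortest substring
# that has not been tried before, counting every attempt on the way.
sub tokenize_unseen {
    my ($text) = @_;
    my (%tried, @tokens);
    my $pos = 0;
    my $len = length $text;

    while ($pos < $len) {
        my $taken = 0;
        for my $n (1 .. $len - $pos) {
            my $cand = substr($text, $pos, $n);
            # Same test as !$b{$^N}++ : true only the first time,
            # but the counter goes up on every attempt.
            if (!$tried{$cand}++) {
                push @tokens, $cand;
                $pos += $n;
                $taken = 1;
                last;
            }
        }
        # Nothing new starts here: move on one character, the way
        # the regex engine moves its starting position after a failure.
        $pos++ unless $taken;
    }
    return (\@tokens, \%tried);
}

my ($tokens, $tried) = tokenize_unseen("abcde" x 4);
print join("", map { "($_)" } @$tokens), "\n";
print join(", ", map { "$_ => $tried->{$_}" } sort keys %$tried), "\n";
```

Run against "abcde" x 4, the loop accepts the same eleven tokens as the
regex and %tried ends up with the counts shown in the dump. The trailing
'de' never becomes a token because 'd', 'de' and 'e' had all been tried
already.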

If you have a stream of text of variable size with delimiters of
varying length embedded in the string and values of varying length:

delim1abcdefgdel2hijklmnopqrstd3uvwxyz

(where we have delim1, del2 and d3 as delimiters, and abcdefg,
hijklmnopqrst and uvwxyz as values)

how can we get those out efficiently and without a lot of programming?
That regex does that.

The !$b{$^N}++ portion of the regex is replaced in live code with a
subroutine call that returns true or false based on whether we have a
recognized delimiter yet, and the regex is slightly different so that
we can capture the value as well.
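
The live regex is not shown here, so what follows is only a guess at
the shape it might take. The subroutine is_delimiter, the %known table
and the lookahead used to end each value are all assumptions made for
the sake of a runnable example; only the sample stream and the
(??{ ... }) trick come from the post.

```perl
use strict;
use warnings;

# Hypothetical delimiter table; the post does not list the real one.
my %known = map { $_ => 1 } qw(delim1 del2 d3);

# Hypothetical stand-in for the subroutine call mentioned above.
sub is_delimiter { return exists $known{ $_[0] } ? 1 : 0 }

my $stream = 'delim1abcdefgdel2hijklmnopqrstd3uvwxyz';
my @pairs;

while ($stream =~ m{
        (\w+?)                                        # group 1: candidate delimiter
        (??{ is_delimiter($^N) ? '(?=)' : '(?!)' })   # grow it until it is recognized
        (\w*?)                                        # group 2: the value
        (?= \z                                        # stop at end of input ...
          | (\w+?)                                    # ... or where a candidate
            (??{ is_delimiter($^N) ? '(?=)' : '(?!)' })   # is the next delimiter
        )
    }xg) {
    push @pairs, [ $1, $2 ];
}

# One "delimiter => value" line per pair found.
printf "%-6s => %s\n", @$_ for @pairs;
```

With the sample stream this should pair delim1 with abcdefg, del2 with
hijklmnopqrst and d3 with uvwxyz. It relies on no value containing
something that is itself a recognized delimiter, and the repeated
retries make it slow on long input.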

I just thought it was a neat little regex that qualified as a FWP and
passed it along to my Perl Mongers and user groups as well.
--
Alan



More information about the PLUG mailing list