Regex Help?

Jeff Anderson java.emitter at gmail.com
Sat Oct 8 14:11:36 MDT 2011


You probably don't need a regular expression at all.  Just split on
whitespace, then loop over the fields.  From the second field onward (the
description), join the words with spaces until you hit a field that starts
with a number.  From that field on, use commas as the delimiter.

As long as none of the description words start with a number and none of the
other fields contain whitespace, this will work.  I suspect none of them do.
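
A minimal sketch of the split approach in Python, under the same assumptions
(no description word starts with a digit, no other field contains
whitespace).  The names line_to_row and row_to_csv are made up for
illustration, and the description is always quoted so the output matches the
requested format.

```python
def line_to_row(line):
    """Split one price-list line into [item, description, other fields...]."""
    fields = line.split()  # split on any run of whitespace
    item = fields[0]
    description = []
    rest = []
    for field in fields[1:]:
        # Description words come first; the first field that starts with a
        # digit (the page number) ends the description for good.
        if not rest and not field[:1].isdigit():
            description.append(field)
        else:
            rest.append(field)
    return [item, " ".join(description)] + rest


def row_to_csv(row):
    """Join a row with commas, quoting the description (second field)."""
    # Double any embedded quotes, as CSV expects.
    quoted = '"' + row[1].replace('"', '""') + '"'
    return ",".join([row[0], quoted] + row[2:])


if __name__ == "__main__":
    sample = "12137 HOLIDAY PENGUIN FIGURINES 335 9.95 4.25 5.95 1 PR"
    print(row_to_csv(line_to_row(sample)))
```

The header line has no field that starts with a digit, so it should be
skipped before calling these functions.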

If I did use a regular expression, I'd start with the one below.  I haven't
tested it.
^(\d+)\s+(\D+)(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)$

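The second group will pick up the trailing whitespace before the page
number, so just trim that.  Because that group is \D+, the pattern also
assumes the description contains no digits at all.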
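
A rough sketch of how that pattern could be applied in Python.  The pattern
is the one above, unchanged; the function name regex_to_csv and the choice to
return None for lines that do not match are assumptions made for
illustration.

```python
import re

# The pattern from the message, unchanged: item, description, page,
# three prices, package quantity, package unit.
ROW = re.compile(
    r"^(\d+)\s+(\D+)(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)$"
)


def regex_to_csv(line):
    """Return the line as CSV, or None if the pattern does not match."""
    match = ROW.match(line.strip())
    if match is None:
        return None  # header line or a row the pattern cannot handle
    fields = list(match.groups())
    # Group 2 carries trailing whitespace: trim it, then quote it and
    # double any embedded quotes.
    fields[1] = '"' + fields[1].strip().replace('"', '""') + '"'
    return ",".join(fields)


if __name__ == "__main__":
    print(regex_to_csv("12137 HOLIDAY PENGUIN FIGURINES 335 9.95 4.25 5.95 1 PR"))
```

Lines that come back as None (the header, or a description containing a
digit) can be collected and fixed by hand.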

Regardless, you should read Mastering Regular Expressions by Jeffrey E.F.
Friedl.

On Sat, Oct 8, 2011 at 1:19 PM, S. Dale Morrey <sdalemorrey at gmail.com> wrote:

> Ok I'll admit it, I suck at regex.
> Unfortunately, I now have the task of importing a customer's catalog
> price list into a database and I'm not sure where to begin.
> I really think a simple regex could convert the whole thing to a CSV
> and I could then import the CSV directly.
>
> Here is an example of what I'm looking at.
> Item Item Description Page # Retail Member Wholesale Pkg
> 12137 HOLIDAY PENGUIN FIGURINES 335 9.95 4.25 5.95 1 PR
>
> As you can see all the fields are separated by whitespace, but then
> again so are the individual words in the description.
> In some descriptions there may be commas and so description text would
> need to be quoted, in addition to having all the non-description
> whitespace replaced with commas.  The last 2 fields should be a number
> and a unit; for instance, 1 PR is 1 pair, so 1,PR ought to be fine.
>
> Ideally, the output should look like this:
> 12137,"HOLIDAY PENGUIN FIGURINES",335,9.95,4.25,5.95,1,PR
>
> Where does one start with something like this?
>
> /*
> PLUG: http://plug.org, #utah on irc.freenode.net
> Unsubscribe: http://plug.org/mailman/options/plug
> Don't fear the penguin.
> */
>

