Utility Script: load_catalog.pl

Catalog entries are harder to represent in a flat file format, so I used XML.

Catalog Entries

XML format, very simple:

<productlist>

  <item catalog="64-2059954">
	<name>Chef's Choice Professional Knife Sharpener</name>
	<price>129.00</price>
	<description>
	This automatic sharpener turns out knives -- both straight-edge
	and serrated -- with a razor-sharp edge, and won't destroy the
	temper of fine kitchen blades.  It's quick and easy to use:
	Simply pass the knife once through each slot.  In a three-stage
	process, blades are sharpened and honed on 100% diamond
	abrasive pads, then stropped for a smooth edge.  Blades are
	aligned at the perfect angle, and you can select the type
	of edge, depending on the knife's use.  Made in the USA.
	</description>
  </item>

  <item catalog="64-137877">
	<name>Fini Aceto Balsamico</name>
        <price>10.50</price>
	<description>
	This superb balsamic vinegar, made in Modena, Italy, from
	sweet, white Trebbiano grapes, grows richer and darker as
	it's aged for about two years, all the while being transferred
	to smaller and smaller barrels of different aromatic woods
	(oak, chestnut, mulberry and juniper).  Recipes included. 
	8.8-oz bottle.
	</description>
	<image location="images/fini_balsamico.jpg"></image>
  </item>
...
</productlist>
  

load_catalog.pl Script

Marries DBI with XML, very strange.

#!/usr/local/bin/perl

use ProductParser;
use DBI;
use File::Basename;
use constant DB =>'DBI:mysql:perl_conference';

my $DB = DBI->connect( DB,undef,undef,{PrintError=>0} )
    || die "Connect failure: ",$DBI::errstr;

unshift @ARGV,'-' unless @ARGV;
while (my $file = shift) {
    my $h = ProductParser->new;
    $h->parse_file($file);
    insert_records($DB,$h);
}
$DB->disconnect;

sub insert_records {
    my $db   = shift;
    my $data = shift;
    my $insert = $db->prepare(<<END) || die $db->errstr;
INSERT INTO ProductList (name,description,price,image,catalog)
   VALUES (?,?,?,?,?)
END
    my $update = $db->prepare(<<END) || die $db->errstr;
UPDATE ProductList 
    SET name=?,description=?,price=?,image=?
    WHERE catalog=?
END
    ;
    for my $item ($data->list) {
	my $picture;
	if ($item->{'image'} && -e $item->{'image'}) {
	    open (I,$item->{'image'}) || die "open: $!";
	    local $/ = undef;
	    $picture = <I>;
	    close I;
	}
	$item->{'image'} =~ basename($item->{'image'});
	$insert->execute($item->{'name'},
			 $item->{'description'},
			 $item->{'price'},
			 $picture,
			 $item->{'catalog'}) 
	||
	$update->execute($item->{'name'},
			 $item->{'description'},
			 $item->{'price'},
			 $picture,
			 $item->{'catalog'}
			 ) 
	||
	warn "Can't load $item->{name}: ",$db->errstr,"\n";
    }
    $insert->finish; $update->finish;
}

__END__
Table definition:

CREATE TABLE ProductList 
    (
     catalog      char(10)     primary key,
     name         char(50)     not null,
     price        numeric(8,2) not null,
     description  text         not null,
     image        mediumblob
     );
  

ProductParser.pm module

Subclasses HTML::Parser (from LWP library) to parse XML. Tastes great! Less filling!

package ProductParser;
use HTML::Parser;

@ISA = 'HTML::Parser';

sub start {
    my $self = shift;
    my ($tag,$attr,$attrseq,$origtext) = @_;
    if ($tag eq 'productlist') {
	$self->{'list'} = [];
	return;
    }
    if ($tag eq 'item') {
	die "item without a productlist" unless $self->{'list'};
	die "no catalog number" unless $attr->{'catalog'};
	$self->{'currentitem'} = { catalog => $attr->{'catalog'} };
	return;
    }
    if ($tag eq 'image') {
	$self->{'currentitem'}->{'image'} = $attr->{'location'};
	return;
    }
    die "unknown tag" unless $tag =~ /^(price|description|name)$/;
    $self->{'current_text'} = '';
}

sub text {
    my $self = shift;
    my $text = shift;
    HTML::Entities::decode($text);
    $text =~ s/\n\s+/ /g;
    $text =~ s/^\s+//;
    $self->{'current_text'} .= $text;
}

sub end {
    my $self = shift;
    my $tag = shift;
    if ($tag eq 'item') {
	die "</item> without <item>" unless $self->{'currentitem'};
	push @{$self->{'list'}},$self->{'currentitem'};
	return;
    }
    if ($tag =~ /^(price|description|name)$/) {
	$self->{'currentitem'}->{$tag} = $self->{'current_text'};
    }    
}

sub list {
    return unless my $l = $_[0]->{'list'};
    return @$l;
}

1;
  

<< Previous
Contents >>

Lincoln D. Stein, lstein@cshl.org
Cold Spring Harbor Laboratory
Last modified: Sun Apr 25 14:09:22 EDT 1999