<script language="JavaScript"><!--
document.write('<a href="javascript:Load(\'stud.html#Five_Card_Stud\')">Five Card Stud</a>');
// --></script>
<noscript><a href="stud.html#Five_Card_Stud">Five Card Stud</a></noscript>
that uses JavaScript to write the links that rely on JavaScript and that isolates their static equivalents within <noscript> tags - but that posed a real maintenance nightmare. (Just look at how much messier the code above is than a simple <a href=> tag, and at how I've had to write the link target and link text twice.)
This seemed a perfect place to put down Programming Perl in the middle of chapter four, and actually write my first Perl script. Expand.pl scans its working directory for .psx files that are newer than the corresponding .html files and:
It was meant to solve the problem my site had with boilerplate code. A consistent look meant lots of identical code all over the site. This meant that the information unique to a page (like the title and the links to on-page anchors) was easy to miss under all the code that didn't vary from page to page - and that changing the look meant changing all that code.
The answer was to replace all the boilerplate with macros that look like
:This(), with the page-specific parameters inside the parentheses.
Macros are defined in a package that exports a symbol table that consists of
template text, with %1% and %2% substitution points, and a Perl function that
takes an argument list, massages it in any way necessary, and then spits out a
result list to be substituted into the macro template. This makes my source files
much smaller and easier to read, and makes it easier to change standard elements.
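As a rough illustration of the template-plus-function idea (written in Python rather than the Perl that Expand.pl uses), the sketch below expands :Name(args) calls. The macro names `Title` and `Link`, their templates, the comma-separated argument syntax, and the helper names are all hypothetical; only the %1%/%2% substitution points and the "massage the argument list, then substitute" flow come from the description above.

```python
import re

# Hypothetical macro table: name -> (template text, argument-massaging function).
MACROS = {
    "Title": (
        "<title>%1%</title>\n<h1>%1%</h1>",
        lambda args: args[:1],
    ),
    "Link": (
        '<a href="%1%">%2%</a>',
        # With a single argument, reuse the target as the link text.
        lambda args: args if len(args) > 1 else args * 2,
    ),
}

# Assumed call syntax: a colon, a name, and a parenthesised argument list.
MACRO_CALL = re.compile(r":(\w+)\(([^)]*)\)")


def fill(template, values):
    """Replace each %n% in the template with the n-th value (1-based)."""
    def pick(match):
        index = int(match.group(1)) - 1
        return values[index] if index < len(values) else ""
    return re.sub(r"%(\d+)%", pick, template)


def expand(source):
    """Expand every known macro call; leave unknown names untouched."""
    def replace(match):
        name, raw_args = match.group(1), match.group(2)
        if name not in MACROS:
            return match.group(0)
        template, massage = MACROS[name]
        args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
        return fill(template, massage(args))
    return MACRO_CALL.sub(replace, source)


print(expand(":Link(stud.html#Five_Card_Stud, Five Card Stud)"))
```

Leaving unknown names alone means that text such as javascript:Load(...) in a page passes through unchanged, since `Load` is not in the table.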
Initially, I placed a shortcut to this script on my Win98 desktop. I'd edit
a source file, click on the shortcut's icon on the 'desktop toolbar', and reload
the page in a browser. Recently, though, I wrote a simple 'daemon' that waits on
a change notification on my HTML source directory and then calls the
preprocessor script via PerlEz. This means that the preprocessor is invoked
every time I save a change to a source file, so that I just edit and reload,
almost as if the preprocessor weren't even involved.
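Because the preprocessor only touches .psx files that are newer than their .html counterparts, running it on every save stays cheap. A minimal sketch of that check, in Python rather than the original Perl; the function name is hypothetical, and treating a missing .html file as out of date is an assumption:

```python
from pathlib import Path


def stale_sources(directory):
    """Return the .psx files whose .html output is missing or older."""
    stale = []
    for source in sorted(Path(directory).glob("*.psx")):
        target = source.with_suffix(".html")
        # Assumption: a source with no generated page yet counts as stale.
        if not target.exists() or source.stat().st_mtime > target.stat().st_mtime:
            stale.append(source)
    return stale
```

A caller would loop over `stale_sources(".")` and regenerate each page.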
The search engine consists of two scripts: one to build a
concordance and the other to use it. The indexer uses HTML::Parser to
spider a local copy of my site and extract the plaintext. It lowercases all
words and does some other canonicalizing (like splitting hyphenated words into
a pair of words), then builds the concordance: a hash by word of a hash by
filename of lists of word positions. This data structure makes it easy to do
simple searches (pages containing this word), sequence searches (pages
containing this word followed by that word), and 'near' searches.
It then prunes the most common words, and writes the concordance
to disk as a Perl package that can be use-d by the search engine.
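The word-to-filename-to-positions structure can be sketched as nested dictionaries. This is an illustration in Python, not the Perl indexer itself: the function names are hypothetical, the tokenizer is a simplification, and the HTML parsing, common-word pruning, and writing to disk are left out.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on non-alphanumerics (so hyphenated words split too)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_concordance(pages):
    """pages maps filename -> plaintext; the result maps word -> filename -> [positions]."""
    concordance = defaultdict(lambda: defaultdict(list))
    for filename, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            concordance[word][filename].append(position)
    return concordance


def pages_with(concordance, word):
    """Simple search: the set of pages containing the word."""
    return set(concordance.get(word.lower(), {}))


def pages_with_sequence(concordance, first, second):
    """Sequence search: pages where `second` immediately follows `first`."""
    first_pages = concordance.get(first.lower(), {})
    second_pages = concordance.get(second.lower(), {})
    hits = set()
    for filename in first_pages.keys() & second_pages.keys():
        following = set(second_pages[filename])
        if any(position + 1 in following for position in first_pages[filename]):
            hits.add(filename)
    return hits
```

A 'near' search would follow the same pattern, comparing positions within some distance instead of exactly one apart.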
This script is the one that convinced me that Perl is fun: all the map and
foreach operations started to feel like symbolic processing in Lisp.
Equally non-obvious at the outset was the need to keep the original text, so
the user doesn't just have the title to tell them what a page contains. Currently,
I 'invert the concordance' to generate something like the original text. While
the code for this is pretty simple, it's rather expensive (it takes ten to one hundred
times as long to format 25 results as to generate 100 hits), and the result is more
than a bit weird looking, since the index ignores common words and is lower-cased. It's
good enough for a first pass, though, and I'd put the engine up, if I had a host
for it ....
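One way to 'invert the concordance' for a single page is to collect every (position, word) pair recorded for that file and sort by position. The sketch below is in Python and makes assumptions: the function name is hypothetical, and marking a gap left by a pruned word with "..." is an invention for illustration, not something the original script is known to do.

```python
def reconstruct(concordance, filename):
    """Rebuild an approximation of a page's text from word -> filename -> [positions]."""
    placed = sorted(
        (position, word)
        for word, pages in concordance.items()
        for position in pages.get(filename, [])
    )
    pieces = []
    previous = None
    for position, word in placed:
        # Assumption: show a gap where pruned (common) words used to be.
        if previous is not None and position != previous + 1:
            pieces.append("...")
        pieces.append(word)
        previous = position
    return " ".join(pieces)


concordance = {
    "five": {"a.html": [0]},
    "card": {"a.html": [1]},
    "stud": {"a.html": [2], "b.html": [0]},
    "fun": {"a.html": [4]},
}
print(reconstruct(concordance, "a.html"))  # five card stud ... fun
```

This version scans every word in the index for each page it formats, which suggests why inverting can cost far more than looking up hits.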
I've used these bindings to
Spider & Index
I've wanted a search engine for my web site for a long time, but my ISP won't
let me run any CGI on their server. For a while it looked like I'd be able to run
Perl scripts on a friend's Solaris box with a DSL line, and I thought a search
engine would make a great learning project. (The engine is 'done enough', but the friend
just doesn't have the time to do anything with her toy, so the engine's
not up, for now. Perhaps I'll look into changing ISPs.)
Search Engine
Indexing was easy; searching turned out to be surprisingly complex. Doing simple
lookups was pretty easy; doing embedded lookups ('bed' matches 'embedded') wasn't
much harder; even sequence matches were pretty straightforward. What turned out to
be difficult was scoring all the different possible combinations of full and
partial matches!
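An embedded lookup can be sketched as a substring scan over the index's words. This Python fragment is illustrative only: the function name is hypothetical, the concordance is assumed to map word -> filename -> positions, and no scoring of full versus partial matches is attempted.

```python
def embedded_matches(concordance, fragment):
    """Map each indexed word containing the fragment to the pages it appears on."""
    fragment = fragment.lower()
    return {
        word: set(pages)
        for word, pages in concordance.items()
        if fragment in word
    }


index = {
    "bed": {"a.html": [3]},
    "embedded": {"b.html": [1, 7]},
    "perl": {"a.html": [0]},
}
print(sorted(embedded_matches(index, "bed")))  # ['bed', 'embedded']
```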
Embedded Perl
Lately, I've been playing with using ActiveState's PerlEz library from Delphi.
I wrote 'flat' and object-oriented bindings, and started a ONElist mailing list about calling Perl from Delphi.
Quiz Maker
QuizMaker reads a relatively free-form text file, paragraph by paragraph, and
builds a self-contained HTML/JavaScript multiple choice quiz of the type common
in newspapers and magazines. Paragraphs that start with a ?, *, or + are
questions, score ranges, or quiz titles; all other paragraphs are passed
untouched, as raw HTML text. Each response may have a score value (the default
is 0), and a Score button pops up a score box scrolled to the appropriate range.
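The paragraph-by-paragraph reading can be sketched as a small classifier. This is Python, not QuizMaker's own code, and it rests on assumptions: that the markers map in the order listed (? to questions, * to score ranges, + to titles), that paragraphs are separated by blank lines, and that the names used here are hypothetical. How responses and their scores are marked is not described above, so they are not handled.

```python
import re

# Assumed mapping, taken from the order in which the markers are listed.
KINDS = {"?": "question", "*": "score_range", "+": "title"}


def classify_paragraphs(text):
    """Split text on blank lines and tag each paragraph by its leading marker."""
    result = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        kind = KINDS.get(paragraph[0], "html")
        # Marked paragraphs lose their marker; everything else passes through as raw HTML.
        body = paragraph[1:].lstrip() if kind != "html" else paragraph
        result.append((kind, body))
    return result
```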
Copyright © 1999, Jon Shemitz,
jon@midnightbeach.com