
Sam-I-Am's Builder Blog

Work type stuff - handy urls and notes on the trials and tribulations of a web builder.

Tuesday, February 18, 2003

HTML::Index on CPAN.
This is a set of Perl modules for creating, storing and searching indexes of HTML files, and it looks like a handy starting point for my HTML indexer. It seems I might be able to subclass it to use my own parser, storing the markup and throwing out the content. Then I could search for things like: which pages on site.com are still using font tags? Which ones call such-and-such stylesheet or JavaScript library?
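To make the "keep the markup, throw out the content" idea concrete, here is a rough sketch. It is in Python using only the standard library, not Perl, and it does not use or imitate the HTML::Index API. The names `TagCollector` and `build_index`, the `stylesheet:` and `script:` key prefixes, and the sample pages are all made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records tag names and stylesheet/script references; text is ignored."""

    def __init__(self):
        super().__init__()
        self.features = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        self.features.add(tag)
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        if tag == "link" and rel == "stylesheet" and attrs.get("href"):
            self.features.add("stylesheet:" + attrs["href"])
        if tag == "script" and attrs.get("src"):
            self.features.add("script:" + attrs["src"])


def build_index(pages):
    """Map each feature (tag or resource) to the set of pages that use it."""
    index = defaultdict(set)
    for name, html in pages.items():
        collector = TagCollector()
        collector.feed(html)
        collector.close()
        for feature in collector.features:
            index[feature].add(name)
    return index


# Hypothetical sample pages, inlined so the sketch runs on its own.
pages = {
    "index.html": '<html><head><link rel="stylesheet" href="main.css"></head>'
                  '<body><font size="2">Hi</font></body></html>',
    "about.html": '<html><head><script src="lib.js"></script></head>'
                  '<body><p>About</p></body></html>',
}
index = build_index(pages)
print(sorted(index["font"]))           # pages still using font tags
print(sorted(index["script:lib.js"]))  # pages calling a given script
```

Because no text handler is defined, the page content never reaches the index; only the structure does.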
The real trick is going to be getting useful search results for tag combinations.
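One simple way to approach tag combinations, assuming the index maps each tag or resource to a set of page names, is plain set intersection and difference. This is only a sketch of that assumption: the `search` function, its parameters, and the sample index are hypothetical, and it says nothing about ranking or about where on a page the tags occur.

```python
def search(index, with_all, without=()):
    """Return pages containing every feature in with_all and none in without."""
    sets = [index.get(feature, set()) for feature in with_all]
    if not sets:
        return set()
    hits = set.intersection(*sets)
    for feature in without:
        hits = hits - index.get(feature, set())
    return hits


# Hypothetical index: feature -> pages that use it.
index = {
    "font": {"index.html", "old.html"},
    "table": {"index.html", "old.html", "about.html"},
    "script:lib.js": {"about.html", "old.html"},
}

font_and_table = search(index, ["font", "table"])
font_no_lib = search(index, ["font"], without=["script:lib.js"])
print(sorted(font_and_table))
print(sorted(font_no_lib))
```

This only answers "which pages contain all of these"; nested combinations, such as a font tag inside a table cell, would need the parser to record parent/child relationships as well.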
And don't forget, I want to offer a download of the results in CSV or XLS format!
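The CSV half of that is easy to sketch with Python's standard `csv` module. The column names, the `results_to_csv` helper, and the sample rows below are assumptions for illustration; XLS output would need a third-party library and is not shown.

```python
import csv
import io


def results_to_csv(rows):
    """Serialize (page, feature) result rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["page", "feature"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical search results.
rows = [("index.html", "font"), ("old.html", "font")]
csv_text = results_to_csv(rows)
print(csv_text)
```

Returning a string keeps the helper independent of how the download is served; the same text could be written to a file or sent as an HTTP response body.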


© Copyright 1999-2005 Sam Foster