Sam-I-Am on Web Development

Sam Foster on the web and web-ish software development

Friday, November 30, 2007

Regexp to match only html filenames and directory names

I've been working on a script to create filtered directory tree listings. It can be configured with both include conditions and exclude conditions. If something passes the include filters, the script then checks whether it's explicitly excluded. For example, I want to exclude cgi-bin, but include all other directories and files. So it's useful to have a good catch-all pattern for including only the good stuff.

I needed only html files this time. So after some head scratching I came up with:

/(^.*\.htm(l?)$)|(^[^\.]+$)/

The first alternative matches any name ending in .htm or .html; the second matches any name with no dot in it at all, which is what lets directory names through.

I don't like using .*, so to improve it maybe I could use a character range instead, or a negated range like [^\.]+, but this works beautifully.
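To make the include-then-exclude flow concrete, here is a minimal sketch in JavaScript. It is not the actual script: the names isListed, includePattern and excludePatterns are made up for illustration, the sample entry names are invented, and treating the pattern as a JavaScript regex literal is itself an assumption.

```javascript
// The include pattern from the post: names ending in .htm/.html,
// or names containing no dot at all (assumed to be directories).
const includePattern = /(^.*\.htm(l?)$)|(^[^\.]+$)/;

// Hypothetical exclude list; cgi-bin is the only exclusion the post mentions.
const excludePatterns = [/^cgi-bin$/];

// An entry is listed only if it passes the include filter
// and is not matched by any exclude filter.
function isListed(name) {
  if (!includePattern.test(name)) {
    return false;
  }
  return !excludePatterns.some((re) => re.test(name));
}

// Invented sample names, one per case worth checking.
const names = [
  "index.html",    // html file
  "about.htm",     // .htm variant
  "docs",          // dotless name, treated as a directory
  "cgi-bin",       // passes include, then explicitly excluded
  "style.css",     // wrong extension
  "page.html.bak", // .html is not at the end of the name
  ".htaccess",     // contains a dot, does not end in .htm(l)
];

const listed = names.filter(isListed);
console.log(listed); // logs the three names index.html, about.htm and docs
```

One thing this sketch makes visible: the second alternative cannot tell a directory from a file, so a dotless file such as README would also be included, and a directory with a dot in its name such as v1.2 would be left out.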
