movable type, apache redirects, justifications for wasted time
Among the many things I like about my new job is that it's given me the opportunity to learn a bunch of new technologies. Email triggers, .htaccess files, SVN repositories, XML, and of course the power to raise the dead (from the command line!) — lots of good stuff.
And useful stuff. For instance, this morning I finally fixed our broken archive URLs. Movable Type 2.66whatever built entries in the format
- /blog/archives/000001.php
- /blog/archives/000002.php
- etc.
But since the upgrade to 3.2, we've switched to a more useful format, oriented around a value that MT generates called the basename (which is basically an abbreviated, URL-safe version of the title). The new URLs look like this:
- /blog/archives/2004/01/22/round_the_world/
- /blog/archives/2005/12/16/sweet_nostalgia/
The problem is that there are still lots of links to the old URLs, both within entries on this site and elsewhere. That's no good, since they don't get rebuilt when new comments are added, or when their entries are updated, or when we redesign, or when anything else happens on the site.
But now I can do something about it. And since it took me a few hours, and since it might help someone else, I thought I'd post my solution.
First, an .htaccess file. (To use it, you'd remove the "sample" and the ".txt", then add a period to the front, leaving it as ".htaccess".) This goes in the archives directory and tells the web server software to redirect all requests to that directory that take the form 00????.* (in MS-DOS parlance: that's two leading zeros, four more characters, then any extension) to "oldurl.php?x=[whatever the original request was for]". Using "00" as the identifier isn't great, but the regex had given me enough trouble, and this solution was good enough, that I gave up and took the easy route. Apache Rewrite rules are confusing.
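For anyone who finds the pattern easier to read as code, here's a rough Python sketch of what that rule is deciding. It is not the actual .htaccess file: the regex, the function name rewrite_target, and the RewriteRule quoted in the comment are my own approximation of "two zeros, four more characters, any extension", so treat them as assumptions.

```python
import re
from typing import Optional
from urllib.parse import quote

# A mod_rewrite rule along these lines would do roughly the same job in
# .htaccess (an approximation, not the file described in this post):
#   RewriteEngine On
#   RewriteRule ^(00.{4}\.[^/]+)$ oldurl.php?x=$1 [L]

# Two leading zeros, four more characters, a dot, then any extension.
OLD_ENTRY = re.compile(r"^00.{4}\.[^/]+$")


def rewrite_target(filename: str) -> Optional[str]:
    """Return the oldurl.php URL for an old-style filename, or None."""
    if OLD_ENTRY.match(filename):
        return "oldurl.php?x=" + quote(filename)
    return None


if __name__ == "__main__":
    for name in ("000001.php", "oldurl.php", "index.html"):
        print(name, "->", rewrite_target(name))
```

Note that "oldurl.php" itself doesn't start with "00", so the handler never gets rewritten back onto itself.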
Next comes oldurl.php, which is passed the originally requested filename in a querystring parameter (inventively named "x"). It opens up the original file (using direct filesystem access; if we went in through HTTP we'd just get redirected to oldurl.php again) and scans it for the entry's date and title. It tries two different formats for this, because we've changed templates before, so the useful RDF block of data (visible if you look at the source of this page) isn't always available. If you were going to adapt this script, you'd have to be sure that the date (and title) extraction is customized to match your own templates.
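The real script is PHP, but the try-one-format-then-the-other idea is easy to sketch in Python. Both patterns below are assumptions: the first looks for a dc:date attribute of the kind found in an RDF block, and the second looks for a made-up h2 date header standing in for whatever an older template printed. You'd swap in patterns that match your own templates.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed format 1: an RDF block carrying dc:date="2004-01-22T10:15:00-05:00".
RDF_DATE = re.compile(r'dc:date="(\d{4})-(\d{2})-(\d{2})T')

# Assumed format 2 (hypothetical older template): <h2 class="date">January 22, 2004</h2>
HEADER_DATE = re.compile(
    r'<h2 class="date">\s*([A-Z][a-z]+ \d{1,2}, \d{4})\s*</h2>'
)


def extract_entry_date(html: str) -> Optional[date]:
    """Try the RDF date first, then fall back to the header date."""
    m = RDF_DATE.search(html)
    if m:
        year, month, day = map(int, m.groups())
        return date(year, month, day)
    m = HEADER_DATE.search(html)
    if m:
        return datetime.strptime(m.group(1), "%B %d, %Y").date()
    return None  # the caller should report an error to the visitor
```

Returning None instead of guessing lines up with the "throw errors for failing to match the date" behavior: a wrong date would send the visitor to the wrong day's directory.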
We now have the date. From that, we can find the directory that contains all of that day's entries. We have the title, too, but we still don't reliably know how to get the basename. We could probably pore through Movable Type's documentation (or, shudder, its code), but it's easier to just iterate through the directories that represent the entries from that date, looking for entries whose content matches the title we found. If we find one match, we transparently redirect the user to it. If we find more than one, we show a list and let the user choose. And if we find none, we provide a link to a Google search for the title (constrained to only search this domain). And, earlier in the process, we throw errors for failing to find the file, failing to match the date, etc. But if the file exists, the transparent redirect really ought to work in pretty much all cases.
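Here's the matching step as a minimal Python sketch, again an illustration and not the PHP itself. It assumes each entry lives at archives/YYYY/MM/DD/basename/ with an index.php inside it (the index filename is a guess), and the function names find_candidates and choose_action are invented for the example.

```python
from datetime import date
from pathlib import Path
from typing import List, Tuple, Union


def find_candidates(archive_root: Path, entry_date: date, title: str) -> List[str]:
    """Return basenames of that day's entries whose page contains the title."""
    day_dir = archive_root / f"{entry_date:%Y/%m/%d}"
    if not day_dir.is_dir():
        return []
    matches = []
    for entry_dir in sorted(day_dir.iterdir()):
        index = entry_dir / "index.php"  # assumed index filename
        if index.is_file() and title in index.read_text(
            encoding="utf-8", errors="replace"
        ):
            matches.append(entry_dir.name)
    return matches


def choose_action(matches: List[str], title: str) -> Tuple[str, Union[str, List[str]]]:
    """One match: redirect. Several: let the user pick. None: offer a search."""
    if len(matches) == 1:
        return ("redirect", matches[0])
    if len(matches) > 1:
        return ("choose", matches)
    return ("search", title)
```

A plain substring test is crude, but since it only looks at one day's worth of entries, collisions should be rare, and the "choose" branch covers them when they happen.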
Slick, huh? Well, I think so. Linux: it's pretty neat. I don't know if anyone will be able to directly use this, but geeks-in-need can probably adapt it pretty easily.

Comments
Linux: it's pretty neat.
Now, why don't they use that as their official motto?
You could also use the technique detailed here but use this statement instead:
<MTEntries lastn="99999">
update mt_entry set entry_basename = "<$MTEntryID pad="1"$>" where entry_id = <$MTEntryID$>;
</MTEntries>
The idea being that you set your basename to whatever you were using before. Then MT will continue to print out your past archives in that fashion while using the more friendly form going forward.
You could also use the same technique for creating an Apache RewriteMap file if you wanted to convert your old entries to using the friendly form without breaking the permalinks...
Interesting! That's a pretty slick solution. Thanks, Jay.
Since I've got my own behemoth of a hack in place, though, I'll probably stick with it. It's nice to have everything using meaningful basenames.