The Ramblings of a Damaged Mind
KBlogger: Real World Tips - Redirects

This time around I thought I'd get off the standards bandwagon for a bit and just talk about some of the real-world considerations that go into running a site. I had to do some work on our server configuration today, so I figured that'd make a decent topic. It kind of ballooned from there as I worked on it, but things tend to happen that way.

To start with, give careful thought to your directory/URL structure. You want something that will be logical and easy to maintain. It should make sense to both you and your users and truncating the URL should do something productive. If you're in /toplevel/category/subcat/page.html, then cutting back to any level should bring you to a useful page - not an error or a directory listing. People do explore sites by playing with the URL. For example, consider how you can trim this URL at each level and progress to logically 'higher level' pages:
http://www.paycashwallet.com/consumer/Wallet/AddFunds/
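As a rough illustration of the URLs a curious visitor might try, here is a small Python sketch that lists every truncation of a URL, one level at a time. The function name `truncations` is made up for this example, and it assumes each level is a directory ending in a slash. Every URL it prints should lead to a useful page.

```python
from urllib.parse import urlsplit, urlunsplit


def truncations(url):
    """Return the URL itself, then each shorter parent URL up to the site root."""
    parts = urlsplit(url)
    # Drop empty pieces caused by leading/trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    results = []
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"  # assumption: every level is a directory
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results


for candidate in truncations("http://www.paycashwallet.com/consumer/Wallet/AddFunds/"):
    print(candidate)
```

Running a list like this through a link checker is one cheap way to find levels that return an error or a bare directory listing.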

Organize the site logically, and plan for growth. Don't just dump all the files into one 'flat' directory; create a hierarchy. Even if it seems silly at first, as your site grows you will thank yourself for your foresight. If you don't do it, you'll curse yourself endlessly. And give everything reasonable names - AddFunds, RemoveFunds, etc., and not Directory1, Directory2, etc. Even if you use a publishing tool that normally hides these things from you, someday you may need to do something by hand, or use a different tool. Or maybe your site grows and you need to hire more people to work on the site, and then they have to figure out what 'Directory1' means. These names also show in the URLs, so a customer who is confused can glance at the address bar, see 'AddFunds', and clue into the area of the site they're in, while 'Directory1' doesn't help them at all.

The next step is to set a custom 404 error handler. I actually ran into this again Wednesday. I took over the sites at work a while back, and I made a number of changes, but I never did this. Today I got passed a complaint because an external site had linked to a page that hasn't existed for a long time, if ever, or at least not since before I took over. So someone got a 404 by following the link, and we don't want that. So I changed our 404 handlers to redirect to the homepage of each site. Now when someone comes in via a bad link, or even mistypes a URL, they'll at least go to the home page and can navigate from there. And since most users following a link from a 3rd party site don't know exactly what to expect anyway, most of them will never even know they got an 'error', so their perception of our site is improved. This link will produce a 404.

At work we're running IIS, so I handled this by creating a PHP file that does the redirect (see below) and setting a Custom Error handler for 404 errors. First open the Properties for the server and select the Custom Errors tab. Then select the error you want to set a custom handler for, 404 in this case, and select Edit Properties.

Screenshot of IIS Custom Errors tab

Then select the new handler. In this case it is a URL I want to be called, and it is a file 'redirect-404.php' that I've installed at the root level of the site.

Screenshot of IIS Custom Errors Edit Properties

Yes, I know, most of you are probably using Apache. I usually do as well, but we don't always get to pick the platform we work on, so while I prefer Linux/Apache, I'm picking up Windows/IIS as well. You can do the same kind of thing in Apache by setting your ErrorDocument, either in httpd.conf or a local .htaccess file. Since the conf file has an example I'll use that instead of reinventing the wheel:

# Customizable error responses come in three flavors:
# 1) plain text 2) local redirects 3) external redirects
# Some examples:
#ErrorDocument 500 "The server made a boo boo."
#ErrorDocument 404 /missing.html
#ErrorDocument 404 "/cgi-bin/missing_handler.pl"
#ErrorDocument 402 http://www.example.com/subscription_info.html

What I've done in IIS is basically the same as a local redirect. So why didn't I just enter the URL of the page I wanted it to go to? Because when I did that IIS served the content of that page, but the URL in the browser didn't reflect the change, so relative links were FUBAR. Might there be a more elegant way to do this? Sure, but this took me 5 minutes and it worked. It took quite a bit longer to write this up than to do it, actually.
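To see why the relative links break, here is a small Python sketch of how a browser resolves a relative link against whatever URL is in the address bar. The bad URL and the `images/logo.gif` link are hypothetical, invented only for this illustration.

```python
from urllib.parse import urljoin

HOME = "http://www.paycash.us/"
# Hypothetical bad link that triggers the 404 handler.
BAD_URL = "http://www.paycash.us/old/missing/page.html"
# Hypothetical relative link found in the homepage markup.
RELATIVE_LINK = "images/logo.gif"


def resolve_link(address_bar_url, href):
    """Resolve href the way a browser would, relative to the current URL."""
    return urljoin(address_bar_url, href)


# Homepage content served in place, address bar still shows the bad URL:
print(resolve_link(BAD_URL, RELATIVE_LINK))
# After a real redirect, the address bar shows the homepage URL:
print(resolve_link(HOME, RELATIVE_LINK))
```

In the first case the link resolves under /old/missing/, which doesn't exist. In the second it resolves from the site root as intended, which is why a true redirect fixes the problem.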

Starting from the above points, once you have a solid structure try to make as few changes as possible. Try not to move directories or pages, especially anything you think may be linked from outside or a page a user is likely to bookmark. If you do need to move things, try to put in redirection from the old page. For example, as part of the ongoing site expansion and re-branding at work, all of the content that used to live at the PayCash site now lives at the PayCash Wallet site. If you try to go to http://www.paycash.us/consumer/ you will find yourself at http://www.paycashwallet.com/consumer/. Since the entire site structure is the same all of the pages redirect to their new locations.

When I need to redirect there are a few options. If I were on an Apache server I would probably insert a rule for mod_alias or mod_rewrite to redirect surfers to the new content. At work, as I said, I'm on IIS, so I don't have the options I would on Apache. But there are still options: I can map a resource in IIS to a redirect. That is how I handled the entire site move - I set up two virtual directories for /consumer and /merchant and made them redirects. It is also a way to put in 'fast shortcuts' to pages deeper into a site, short URLs users can enter to jump to specific pages. For example, try this one: http://www.tivo.com/adapters.

This is how I have IIS configured for PayCash.us:

Screenshot of IIS configuration

Note the virtual directories I created for consumer and merchant to redirect both, as they now live on the new domain. For ease of site development and management all of the other domains live under the same document root. The 'mall' directory is the content for PayCashMall.com, etc. This way when I'm working on my development machine I can work with all the domains simply by using http://localhost/mall/, http://localhost/billpay/, etc. I also like how it organizes the structure. However, for branding purposes, they don't really want people using http://www.paycash.us/wallet/ (go ahead, try it), they want people to use http://www.paycashwallet.com/. So I set up a redirect to handle those too. The directories 'add', 'remove', 'download', and 'APS' are shortcut links to specific pages.

This is how it is configured to redirect a directory:

Screenshot of consumer redirect configuration

And this is how to redirect to a specific page:

Screenshot of add funds shortcut configuration

It is pretty much the same - different URL, and for a specific page check 'The exact URL entered above' box. These are basically equivalent to the standard Apache mod_alias RedirectPermanent directive.
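The difference between the two kinds of redirect can be modeled in a few lines of Python. This is only a sketch of the behavior as I understand it, not how IIS implements it: a directory redirect keeps whatever follows the directory name, while an exact-URL redirect always lands on the same page. The two tables and the `resolve` function are invented for the example, and the target of the 'add' shortcut is assumed to be the Add Funds page.

```python
# Directory redirects: the rest of the requested path is carried over.
DIRECTORY_REDIRECTS = {
    "/consumer": "http://www.paycashwallet.com/consumer",
}

# Exact redirects: every request under the shortcut goes to one fixed URL.
# Assumption: 'add' points at the Add Funds page.
EXACT_REDIRECTS = {
    "/add": "http://www.paycashwallet.com/consumer/Wallet/AddFunds/",
}


def matches(path, prefix):
    """True if path is the prefix itself or something beneath it."""
    return path == prefix or path.startswith(prefix + "/")


def resolve(path):
    """Return the redirect target for a request path, or None if unmapped."""
    for prefix, target in EXACT_REDIRECTS.items():
        if matches(path, prefix):
            return target
    for prefix, target in DIRECTORY_REDIRECTS.items():
        if matches(path, prefix):
            return target + path[len(prefix):]
    return None


print(resolve("/consumer/Wallet/AddFunds/"))
print(resolve("/add"))
```

The first call keeps /Wallet/AddFunds/ on the end of the new location. The second goes straight to the one fixed page.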

If you can't set up a redirect on the server or if you want to redirect a dynamic page (PHP, ASP, .Net, etc) then you can simply replace the page with one that returns a redirect programmatically. In PHP this is trivial:

<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.paycash.us/");
exit; // stop here so nothing else is sent after the redirect headers

That's it, dead simple. That's actually the redirect-404.php file for PayCash.us that I set as the 404 handler above.
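The same trick works on pretty much any dynamic platform, since all it does is send a 301 status and a Location header. Purely as an illustration (this is not something running on the PayCash servers), here is roughly what it might look like as a Python WSGI application. The name `redirect_app` is made up for the example.

```python
TARGET = "http://www.paycash.us/"


def redirect_app(environ, start_response):
    """Answer every request with a permanent redirect to TARGET."""
    headers = [
        ("Location", TARGET),
        ("Content-Length", "0"),  # no body is needed for a redirect
    ]
    start_response("301 Moved Permanently", headers)
    return [b""]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("localhost", 8000, redirect_app).serve_forever() should serve it. The port number is an arbitrary choice.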

If you can't set up the redirect on the server, and you can't use a dynamic page, you can also use an HTML file with a meta refresh header or JavaScript redirect. Actually, I usually use a hat trick - meta refresh, JavaScript redirect, and a link in the page for anyone who doesn't get caught by the first two, which is rare but it happens. Here is an example of such a file:

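<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Redirect to PayCash Add Funds</title>
<meta http-equiv="Refresh" content="0; url=http://www.paycashwallet.com/consumer/Wallet/AddFunds/" />
<script type="text/javascript">window.location.replace("http://www.paycashwallet.com/consumer/Wallet/AddFunds/");</script>
</head>
<body>
<p>If your browser has not redirected, please <a
href="http://www.paycashwallet.com/consumer/Wallet/AddFunds/">jump to
the new site.</a></p>
</body>
</html>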

The meta refresh and JavaScript redirect, together, will catch nearly every user and redirect them to the correct page. If anyone falls through the cracks, they'll still get a link to take them there.

That's really the order of preference for doing a redirect - set it in the server, use a dynamic page to return redirect headers, or finally embed it into a page delivered to the browser. Best to avoid having to do redirects due to moving content, but they're useful tools to understand. And, in the case of the shortcut URLs, they can really improve usability of a site.

Well, that's all I have to say this time. I was planning to write something else entirely for this entry, but then I had to deal with this stuff today and it just put the bug in my head to write it up as an entry, so I did. I guess I'll save the other stuff until next time. Until then...

Anyone reading these? :-)

I am: hungry
Current Media: Chinga Chonga: Fuck Bush - Country Fried Chinese Girl

malone From: malone Date: June 9th, 2005 02:18 am (UTC) (Direct Link)
Oh yeah, well meta refresh JaveScript URL's headers HTML & Paycash to you too!

I read them Love, I don't understand one word of it - but I read it ;)

zonereyrie From: zonereyrie Date: June 9th, 2005 02:30 am (UTC) (Direct Link)
Well, it is nice to know someone is. We'll make a geek out of you yet. ;-)
malone From: malone Date: June 9th, 2005 02:34 am (UTC) (Direct Link)
Do geeks get to wear fuck me boots?
zonereyrie From: zonereyrie Date: June 9th, 2005 02:38 am (UTC) (Direct Link)
malone From: malone Date: June 9th, 2005 02:10 pm (UTC) (Direct Link)
starbucking From: starbucking Date: June 9th, 2005 05:11 am (UTC) (Direct Link)
I'm reading them as well. Thanks!
zonereyrie From: zonereyrie Date: June 9th, 2005 05:24 am (UTC) (Direct Link)
Thanks. :-)
rbarclay From: rbarclay Date: June 9th, 2005 06:50 am (UTC) (Direct Link)
I at least skim over it, but as I'm more of a general sys-/netadmin than a webmaster ... ;)
zonereyrie From: zonereyrie Date: June 9th, 2005 06:52 am (UTC) (Direct Link)
Yeah, most of these have been web stuff. I may go off on some tangent like TiVo sometime. :-)
fallenpegasus From: fallenpegasus Date: June 9th, 2005 09:04 am (UTC) (Direct Link)
I'm reading them.
zonereyrie From: zonereyrie Date: June 9th, 2005 03:39 pm (UTC) (Direct Link)
solipsistnation From: solipsistnation Date: June 9th, 2005 11:20 pm (UTC) (Direct Link)
I'm reading them! I'm all leet and web-server-adminny and stuff already, but you know more of the client-side things than I do, for sure.

I am also a BIG fan of mod_rewrite, which you can use to send people all over the place. We have some aliases and some RedirectRule entries, mostly for shortcuts, which we locally standardized on like this: http://www.wpi.edu/+CCC , and also for some dumb URLs people put on printed material ("What do you MEAN it's case sensitive? I have a pallet of flyers right here, and we can't send them back!") and things like redirecting user pages over to the user server and away from the main server and redirecting requests without ~'s back to the main server. For anything more complicated, though, mod_rewrite is the best thing ever. One of the systems we're using has something like five different web-based services on it, and depending on what referer, what IP, what URL, and whether you're coming in via http or https, it directs you all over the place to different virtual hosts, and forces https if you come in to webmail without it, and so on. It's very cool.

That'd be worth a whole article right there if you feel particularly enthusiastic. 8)
zonereyrie From: zonereyrie Date: June 9th, 2005 11:41 pm (UTC) (Direct Link)
Yeah, I didn't play with mod_rewrite a lot, but from what I have seen of it, and from many examples I've encountered, it looks like a hell of a lot of fun. I think most of my serious Apache experience pre-dates mod_rewrite, so I was using mod_alias for stuff at the time. mod_rewrite is very popular for blocking bandwidth theft, spam harvesters, etc.

You should start your own tech blog at WPI - from our conversations you have a shitload of interesting problems you've solved. If you wrote some of them up you'd have some interesting stuff there. Really.
solipsistnation From: solipsistnation Date: June 9th, 2005 11:48 pm (UTC) (Direct Link)
Yeah, but then what would we talk about over dinner? 8) (Oh, and you know... "Mmm, effort.")

I have a mighty mighty story for Sunday, too. Not going to post it here, but over dinner? Yeah. Remind me. Say "Soap opera," and I'll know what you mean. Heh heh.