Search Marketing Blog



The 404 Error Page: Don’t Offend the Search Robots!

Posted by Tyson Braun on 02.02.2009

According to Wikipedia, a 404 (or Not Found) Error message is “an HTTP standard response code indicating that the client was able to communicate with the server, but either the server could not find what was requested, or it was configured not to fulfill the request and did not reveal the reason why. 404 Errors should not be confused with ‘server not found’ or similar errors, in which a connection to the destination server could not be made at all.”

In a perfect world, dead-end 404 Errors should never happen. If content from a page is moved, a 301 Redirect to the new location should be a fundamental step. This way, the site retains the referred visitors from external links, along with the ‘link juice’ from the linking page.
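
As a rough illustration of the redirect step, here is a minimal sketch using Python’s standard http.server module. The MOVED_PAGES table and both paths in it are hypothetical, and a real site would normally configure its redirects in the Web server itself rather than in application code.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of moved pages: old path -> new path.
MOVED_PAGES = {"/old-product.html": "/products/new-product.html"}


def plan_response(path):
    """Return (status_code, new_location) for a requested path."""
    if path in MOVED_PAGES:
        # 301 tells browsers and search robots that the move is permanent.
        return 301, MOVED_PAGES[path]
    # Every other path is treated as missing in this sketch.
    return 404, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = plan_response(self.path)
        self.send_response(status)
        if location is not None:
            # The Location header carries the new address.
            self.send_header("Location", location)
        self.end_headers()
```

The lookup is kept in a plain function so the redirect rules can be checked without starting a server.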

Every time a human, or a robot such as a search engine spider, visits a Web page, two things happen: the server sends the content of the page, and it sends a ‘result code’ (the HTTP status code) indicating the page’s accessibility status. We humans do not care about a page’s result code.  However, robots (especially the extremely important Google search ‘bots’) view these result codes as critical elements that contribute to the accessibility of a page.
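
To see the result code that a robot sees, a short script can request a page and print the code instead of the content. The sketch below uses only Python’s standard library; fetch_status is a hypothetical helper name, not part of any tool mentioned in this post.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code the server sends for url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx and 5xx responses;
        # the status code is still available on the exception.
        return error.code
```

Calling fetch_status on a made-up address on your own site is a quick check: a correctly configured server should answer 404. Note that urlopen follows redirects, so a redirected URL reports the code of the final page rather than the 301 itself.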

Therefore, it is important that the code accurately represent its corresponding page.  In fact, Google considers it to be so important that it will not allow a new user to be ‘verified’ for Google Webmaster Tools until it is able to understand whether a page is accessible or not.  Google’s Web Standards require that a human user and a search engine have the same experience while visiting a site.  Thus, a conflicting result code is considered a flagrant violation of these standards.

A common result-code accessibility issue occurs when a Web site owner creates a custom 404 Error page for humans, but mistakenly presents a ‘Status: 200 Success’ result code (HTTP’s ‘200 OK’) to the search robots.  A 200 code means that the page is accessible by humans and by ‘bots,’ and that the content on the page is correct. A 404 Error, however, indicates that the page does not exist: the content has been either moved or erased, the search bots should not index the page, and the user needs to look elsewhere.

If content is removed from a site, such as an old product page on an e-commerce site, a user should be directed to a customized 404 Error page, with helpful links to get the user back on track. It’s common for a single-digit percentage of page views on a site to result in 404 Error pages.  Imagine if those visitors were moved successfully through your conversion funnel instead of being shown a ‘this page does not exist’ message!  It’s hard enough to get visitors to your site; don’t lose them by presenting dead ends.

When visitors land on a customized 404 page, the page must still communicate a result code of 404 to search ‘bots,’ even though the visitor is receiving useful information. A result code of 200 is a signal that the page is correct and should be indexed. You don’t want this, especially if, instead of building a customized 404 page, you redirected your 404 Error pages to your site map or homepage. You could end up with multiple indexed URLs carrying the same content, which risks duplicate content penalties, not to mention displacing the intended homepage or site map in the index!
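
The idea of a helpful page that still reports 404 can be sketched in a few lines of Python using the standard http.server module. This is only an illustration under stated assumptions: the PAGES table, the wording of the error page, and the /sitemap.html link are all hypothetical placeholders.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical site content: path -> HTML body.
PAGES = {"/": "<h1>Home</h1>"}

# Friendly page for visitors; the links are placeholders.
NOT_FOUND_HTML = (
    "<h1>Sorry, that page is gone</h1>"
    '<p>Try the <a href="/">homepage</a> or the '
    '<a href="/sitemap.html">site map</a>.</p>'
)


def build_response(path):
    """Return (status_code, html) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # Helpful content for humans, but still a 404 for robots.
    return 404, NOT_FOUND_HTML


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = build_response(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The visitor and the robot receive the same friendly page; only the status line tells the robot not to index it.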

Google Webmaster Tools, along with other Web analysis programs, can check for this common SEO misstep, and running such a check should be part of your regular SEO maintenance routine.  How you fix a result-code accessibility issue depends on the type of server that hosts your site. Below are solutions for several common servers.

ASP.NET Server
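The ‘web.config’ file in the application root must be edited to contain the following code. By default this setting sends visitors to the error page through a redirect, so the error page itself must still set the 404 status, as described in the next section: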

<configuration>
  <system.web>
    <customErrors mode="On" defaultRedirect="error.asp">
      <error statusCode="404" redirect="error404.asp" />
    </customErrors>
  </system.web>
</configuration>

Windows-Server/ASP Page
Add the following code to the top of the page:
<%
Response.Status = "404 Not Found"
%>

Apache Server
Add this line to your ‘.htaccess’ file:
ErrorDocument 404 /error404.php

PHP Page
Add this code to the top of the PHP error page, before any output is sent:
<?php
header("HTTP/1.0 404 Not Found");
?>

To ensure proper indexing of your site in the natural results of the major search engines, it is paramount that you incorporate effective 404 Error strategies.  If you are interested in learning more, or have a comment pertaining to the strategies in this post, please send a comment through our blog.
