tony spencer » Search Engine Optimization

Google Crawling HTML Forms IS Harmful to Your Rankings

tony — Tue, 03 Jun 2008 18:39:14 +0000

A couple of months ago Google officially announced it would be “exploring some HTML forms to try to discover new web pages”. I imagine more than a few SEOs were as baffled by this decision as I was, but probably not too concerned, since Google promised us all that “this change doesn’t reduce PageRank for your other pages” and that it would only increase your exposure in the engines.

During the month of April I began to notice that a lot of our internal search pages were not only indexed but outranking the relevant pages for a user’s query. For instance, if you Googled “SubConscious Subs” the first page to appear in the SERPs would be something like:
http://raleigh.ohsohandy.com/ads/search?q=tables

rather than the page for the establishment:
http://raleigh.ohsohandy.com/review/27571-sub-concious-subs

This wasn’t just a random occurrence. It was happening a lot, and in addition to the landing pages being far less relevant for the user, they weren’t optimized for the best placement in the search engines, so they were appearing in position #20 instead of, say, position #6. These local search pages even had PageRank, usually between 2 and 3.

Hmm, Just How Bad is This Problem

Eventually I began to realize how often I was running into this in Google, noticed my recent slow decline in traffic, and it occurred to me that this might be a real problem. I’ve never linked to any local search pages on OhSoHandy.com and I couldn’t see that anyone else had either. I queried to find out how many search pages Google had indexed:

Whoa. 5,000+ pages of junk in the index with PageRank. I slept on it for a night, got up the next morning and plugged in

Disallow: /ads/search?q=*

in robots.txt (and threw in a meta robots noindex on those pages for good measure). Within a week we saw a big improvement in rankings as properly optimized pages trumped the crap. Traffic is up 25% since the change and back to trending upwards weekly instead of a stagnant, slow decline.

Bit of Advice

The robots.txt disallow works but it is slow to remove the URLs from Google’s index. I added the meta noindex tag to the search pages a week later and saw much faster results.
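For anyone wiring this up in an app, the noindex half can be sketched in a few lines of Ruby. This is only an illustration, assuming the internal search pages live under /ads/search as in the Disallow rule above; the constant and both helper names are made up for the example and do not come from any framework.

```ruby
# Hypothetical helpers, invented for this sketch.
SEARCH_PATH_PREFIX = "/ads/search" # assumed location of internal search pages

# True when the path belongs to an internal search results page.
def internal_search_page?(path)
  path.start_with?(SEARCH_PATH_PREFIX)
end

# Returns the meta tag to place in the page <head>, or an empty string
# for pages that should stay in the index.
def robots_meta_tag(path)
  internal_search_page?(path) ? '<meta name="robots" content="noindex">' : ""
end

puts robots_meta_tag("/ads/search?q=tables")
puts robots_meta_tag("/review/27571-sub-concious-subs").inspect
```

Keeping the decision in one helper means the layout can call it on every request and only the search pages pick up the tag.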

Duplicate Content Can Penalize

tony — Mon, 04 Feb 2008 21:00:50 +0000

For some time now I’ve been telling clients and friends that publishing duplicate content will not cause you to receive a penalty but that Google will only choose one version of a unique piece of content that it believes to be the authority and refuse to allow other copies to be indexed. So if you publish a copy of one of my blog posts, Google will likely allow my original copy to rank but yours won’t be found.

I think I’ve discovered that enough duplicate content can actually do harm to a domain.

I had an old site we’ll call oldsite1.com. I was publishing fresh, unique, well-written content there several times a day. oldsite1.com would always enjoy nice rankings for the content published there and new content was indexed quickly. I had always intended to eventually 301 redirect all of oldsite1.com’s pages to newsite1.com, which would be hosting identical content. Past experience tells me that the 301 will cause all of oldsite1.com’s backlinks and authority to transfer over to newsite1.com and within days I’d see the new site perform nearly as well as the old site.

Now here is the mistake I made: some time ago I set up newsite1.com to mirror oldsite1.com (for some offline promotional reasons). I had zero backlinks to newsite1.com but it was crawled and indexed anyway. Obviously it was 100% duplicate content and nothing but duplicate content. But I didn’t worry too much about it. The day came to 301 redirect and within days the traffic plummeted. It’s been several weeks and no recovery has happened.

Compete.com’s new tool is so good I kinda feel dirty :)

tony — Tue, 17 Jul 2007 16:51:36 +0000

I received a beta account to test Compete.com’s competitive analysis tool dubbed “Search Analytics”. They should rename it to “Login to Your Competitor’s Web Stats” because the “Site Referrals” tool is exactly that.

For instance, I have a site that competes with YachtWorld.com, so I plugged it into the site referrals tool and it spat back a list of keywords, sorted by click volume, for which YachtWorld.com receives organic traffic. Whoa! That’s kind of scary.

I’m off to London but will write more later on Compete.com’s new tool that will leave you feeling dirty.

At London SES, Will Party with London SEO

tony — Thu, 15 Feb 2007 00:07:32 +0000

Last year I was unable to make it to the London SEO party but this year I’m there! If you are in London for SES don’t forget to round the corner of the ExCel conference center at 5PM and join us all for some free beer. I know that Cutts, Naylor, and Rand are coming.

Better Search Engine Friendly URL’s with Ruby on Rails

tony — Sun, 04 Feb 2007 16:59:40 +0000

In my last post about SEO and RoR I gave really BAD advice on how to accomplish clean URLs that are search engine friendly. Our first RoR project consists of pages that are nearly 100% unspiderable by the search engine bots because it’s an app that requires login to do anything, so I hadn’t really sat down and worked out the proper way to do URLs.

Obie Fernandez posted a great comment and reply on his blog showing how he taps into the power of to_param to easily generate clean URLs. But the problem with both my approach and his is that neither guarantees a permanent link. From an SEO perspective it’s very important that your URLs remain the same for eternity once they are generated.

For example:

You created a new company in your app a few months ago that resulted in the URL:
/companies/apple-computer

At Macworld Steve Jobs announces that Apple Computer is changing its name to Apple, Inc. A staff member updates your CMS to reflect the new name and now your URL is:
/companies/apple-inc

BAD. What if you were really good with your copy on that page and were fortunate to get lots of links to /companies/apple-computer? Now that page returns 404 and you lose all that link juice.

Solution

You need to store the slug in a field so that your URL remains the same forever, just as it’s done in most blog systems like WordPress. The plugin permalink_fu solves the problem by generating a slug that is automatically stored in the DB. Hat tip to Eadz, who goes into more detail on how to implement it:
http://www.seoonrails.com/even-better-looking-urls-with-permalink_fu
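The idea of a stored slug can be shown without any plugin at all. The following is a rough plain-Ruby sketch, not how permalink_fu is implemented; the `Company` class, its `rename` method and the `slugify` helper are all invented here, and in a real app the slug would live in a database column rather than an instance variable.

```ruby
# Minimal stand-in for a model; names are hypothetical.
class Company
  attr_reader :name, :slug

  def initialize(name)
    @name = name
    @slug = slugify(name) # generated once, at creation time
  end

  # Renaming updates the display name but deliberately leaves the slug alone.
  def rename(new_name)
    @name = new_name
  end

  def path
    "/companies/#{slug}"
  end

  private

  # Lowercase, then collapse anything that is not a letter or digit into a hyphen.
  def slugify(text)
    text.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
  end
end

company = Company.new("Apple Computer")
company.rename("Apple, Inc.")
puts company.name
puts company.path
```

Because the slug is only computed in the constructor, the rename changes what the page says but not where it lives, so the inbound links keep working.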

One More Solution for URL Slugs

I haven’t tried this one either but the Acts as Sluggable plugin sounds like it might do the trick.

If you have used either of these solutions please leave a comment!

SEO for Ruby on Rails

tony — Sat, 27 Jan 2007 01:10:11 +0000

Now that the NotSleepy camp is actively switching to mostly Rails development for our web apps I’ve been exploring how to accomplish the typical onpage stuff commonly needed with PHP and JSP to get Googlebot and Slurp to snuggle up all cozy with your site. The following tips are just a way to use Rails to accomplish common tasks you should be comfortable with in your current choice of web app/language.

Search Engine Friendly URLs

NO MORE MOD_REWRITE! Boy that feels good to scream. I’m sure many a SEO will agree that Mr. Ralf S. Engelschall’s creation was a beautiful one when he came up with the ‘Swiss Army knife of URL manipulation’ but damn it can be so difficult to debug. I also find it cumbersome to have to manage my URL functionality outside of my application code. This is especially difficult to manage when you have multiple developers working on different operating systems, filepaths, and httpd.confs.

Let’s say you had a list of companies you wanted to display but instead of a dynamic URL like

http://www.mysite.com/company?id=4

you wanted to show a nice static URL like

http://www.mysite.com/company/1/american-express

In your company list view you would create a link to a company like so:

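[source:ruby]
<%# companyshow_url comes from the named route in config/routes.rb %>
<a href="<%= companyshow_url(:id => company.id,
                             :name => company.name.downcase.gsub(/ /, '-')) %>">
  <%= company.name %>
</a>
[/source]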

Then in your config/routes.rb file you will add just one line:

[source:ruby]
map.companyshow 'company/:id/:name', :controller => 'company', :action => 'show'
[/source]

Now your clean URLs will work no matter where your app is deployed, whether it’s a naive developer still stuck on Windows or a test server running SuSE.
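To make the mapping concrete, here is a rough plain-Ruby imitation of what that route does when a request comes in. It is not how Rails implements routing; the regular expression and the `recognize` method are stand-ins invented for this sketch.

```ruby
# Stand-in for the 'company/:id/:name' route; not Rails' real implementation.
COMPANY_ROUTE = %r{\A/company/(?<id>[^/]+)/(?<name>[^/]+)\z}

# Returns the params a controller would receive, or nil if the path doesn't match.
def recognize(path)
  match = COMPANY_ROUTE.match(path)
  return nil unless match

  { :controller => "company", :action => "show", :id => match[:id], :name => match[:name] }
end

p recognize("/company/1/american-express")
p recognize("/company")
```

Only the `:id` segment is needed to look the record up; the `:name` segment is there for the search engines and the humans reading the URL.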

UPDATE: This is not a good solution for clean URLs. Better answer: see the newer post on search engine friendly URLs.

301 redirects

The permanent 301 redirect is the hammer in the SEO toolbox; a must for moving a nasty site written in ASP to a slick new CMS or for simply making sure that all non-www requests get redirected to www.yourdomain.com. Based on recent conversations with a friend from the darker side of the aisle, it appears that the big GOOG still isn’t doing a great job of dealing with 302 redirects, so make sure you get it right.

First generate a controller just for handling your redirects:

ruby script/generate controller Redir

Edit that controller so it looks like so:

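[source:ruby]
class RedirController < ApplicationController
  def index
    # Set the status first so the redirect goes out as a 301 rather than the default 302
    headers["Status"] = "301 Moved Permanently"
    # :newurl is supplied by the route in config/routes.rb
    redirect_to params[:newurl]
  end
end
[/source]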

Finally create a new route for your old URL to the new URL in the config/routes.rb

[source:ruby]
map.connect '/someoldcrap.asp', :controller => 'redir', :newurl => '/sweet-new-url'
[/source]

This is just a simple one-to-one redirect but you could easily extend this to something dynamic, like we did at the end of the SEOBook 301 redirects post, by adding a function to dynamically determine where to redirect to and placing it in helpers/redir_helper.rb.
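A dynamic lookup of that kind might look something like the sketch below. Everything in it is an assumption made for illustration: the `REDIRECTS` table, the `redirect_for` name and the /product.asp pattern are invented here and are not taken from the SEOBook post.

```ruby
# Hypothetical lookup table for one-to-one redirects.
REDIRECTS = {
  "/someoldcrap.asp" => "/sweet-new-url"
}.freeze

# Returns [status, location] for an old path, or nil when no redirect applies.
def redirect_for(old_path)
  # Exact matches win.
  return [301, REDIRECTS[old_path]] if REDIRECTS.key?(old_path)

  # Assumed pattern rule: /product.asp?id=42 becomes /products/42
  if (match = old_path.match(%r{\A/product\.asp\?id=(\d+)\z}))
    return [301, "/products/#{match[1]}"]
  end

  nil
end

p redirect_for("/someoldcrap.asp")
p redirect_for("/product.asp?id=42")
p redirect_for("/unknown.asp")
```

Returning nil for unknown paths lets the caller fall through to a normal 404 instead of redirecting everything somewhere vague.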

Boost Your Page Load Speed with Page Caching

This one isn’t so much an SEO thing as it is a general user experience nicety, but I don’t think it hurts you in the SERPs to have fast loading pages and it will definitely help your conversions if your visitors can see the full page before their super-short-American-attention-span decides to go look for Lindsay Lohan pics.

Implementation:

How simple is this!?
[source:ruby]
class SomeController < ApplicationController
  caches_page :index, :view, :list
end
[/source]

In the above example Rails will cache the view for index, view, and list by creating a flat file in public and serving that file up until you explicitly invalidate the cache in your controller during an action such as an update.
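The mechanism is easy to picture with a toy version. The sketch below is a plain-Ruby imitation of the idea (write the rendered page to a flat file, serve the file until it is expired) and is not Rails’ own code; the `PageCache` class and its methods are invented for the example.

```ruby
require "fileutils"
require "tmpdir"

# Toy file-based page cache: one flat file per URL path.
class PageCache
  def initialize(public_dir)
    @public_dir = public_dir
  end

  # Serve the cached file if present; otherwise render via the block and store it.
  def fetch(path)
    file = file_for(path)
    return File.read(file) if File.exist?(file)

    html = yield
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
    html
  end

  # Invalidate after an update so the next request re-renders.
  def expire(path)
    FileUtils.rm_f(file_for(path))
  end

  private

  def file_for(path)
    File.join(@public_dir, "#{path}.html")
  end
end

Dir.mktmpdir do |dir|
  cache = PageCache.new(dir)
  renders = 0
  2.times { cache.fetch("/company/list") { renders += 1; "<h1>Companies</h1>" } }
  puts renders # the block ran once; the second call read the file
  cache.expire("/company/list")
  cache.fetch("/company/list") { renders += 1; "<h1>Companies</h1>" }
  puts renders # expiring forced one more render
end
```

The expensive part (the block) only runs on a cache miss, which is why a cached page costs little more than serving a static file.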

Problems with Ruby on Rails page caching:

Page caching does not work with pages that have dynamic query strings, but of course you shouldn’t need query strings if you use the static URLs I detailed in the first segment of this post. Page caching also doesn’t work if you are dealing with pages that require authentication or rights management, but of course you can’t cache information in any environment that requires such checking.

Don’t Worry About Session IDs and URL Rewriting

With PHP, one of the many things an SEO has to remember to check for is nasty URL rewriting in which session IDs are appended to the URL and look like this:

http://www.mypoorsite.com/index.php?PHPSESSID=6791e39af103baf30274938da9dfdfac

You often see it in the URLs of apps such as phpBB, and those session IDs in the URL mean that the Google, Y!, and MSN bots see an infinite number of indexable URLs on your site and either bail on any attempt to index the relevant content or simply dampen your rankings.
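To see why the bots get confused, it helps to strip the session parameter and compare what is left. The sketch below is an illustration only; the `without_session_id` helper is invented here, and the second session ID in the usage example is a made-up value.

```ruby
require "uri"

# Strip an assumed session parameter so equivalent URLs compare equal.
def without_session_id(url, param = "PHPSESSID")
  uri = URI.parse(url)
  return url if uri.query.nil?

  pairs = URI.decode_www_form(uri.query).reject { |key, _| key == param }
  uri.query = pairs.empty? ? nil : URI.encode_www_form(pairs)
  uri.to_s
end

first_crawl  = "http://www.mypoorsite.com/index.php?PHPSESSID=6791e39af103baf30274938da9dfdfac"
second_crawl = "http://www.mypoorsite.com/index.php?PHPSESSID=00000000000000000000000000000000" # made-up ID

puts first_crawl == second_crawl
puts without_session_id(first_crawl) == without_session_id(second_crawl)
```

Each visit hands the bot a different URL for the same page, and only after the session ID is removed do the two collapse back into one.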

According to Lee Nussbaum, Rails handles sessions as follows:

* creates a cookie using an MD5 hash of data with tolerable amounts of entropy, though more would be desirable.
* seems to avoid session fixation attacks by taking session IDs only in cookies (which are basically site-specific) and not in links (which can be used to communicate information cross-site).
* makes a store for session state (e.g., @session['user']) available on the server, where it is found by session id and not subject to manipulation by the user.

In layman’s terms this means that you simply don’t have to worry about session IDs in the URL, as the default Rails setup uses cookies.

Easy Cloaking

Cloaking for legitimate reasons with no intent to deceive the end user is universally accepted by the major engines as far as I know and shouldn’t be a dirty word for any SEO. There is simply no reason to feel bad for hiding a nasty chunk of JavaScript or an unavoidable affiliate link from search engine bots. To perform simple user agent cloaking in Rails:

[source:ruby]
<%
@useragent = @request.user_agent
if @useragent.downcase =~ /googlebot/
%>
<%= render :partial => 'bot' %>
<%
else
%>
<%= render :partial => 'notabot' %>
<%
end
%>
[/source]

A partial is a beautiful feature built into Rails that lets you stick with the DRY principle even at the presentation level by allowing you to create chunks of HTML and Ruby that can be reused in multiple places (such as a contact form). In the above code, we choose a partial to show based on whether or not the request came from Googlebot. I am sure there is a way to do this without the <%= and so many open/close Rails tags but I don’t know it. If you do, please leave a comment.
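One possible way to cut down on the tags is to move the decision into a small helper that returns the partial name. This is a sketch under that assumption; `partial_for` is a name invented here, not something Rails provides.

```ruby
# Hypothetical helper: pick the partial name from the User-Agent string.
# to_s guards against requests that send no User-Agent header at all.
def partial_for(user_agent)
  (user_agent.to_s =~ /googlebot/i) ? "bot" : "notabot"
end

puts partial_for("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
puts partial_for("Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0")
puts partial_for(nil)
```

With a helper like that in place, the view should shrink to a single tag along the lines of `<%= render :partial => partial_for(@request.user_agent) %>`.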

Questions?

Have some questions about Ruby on Rails and SEO? Post them in a comment and we’ll try to work them out.

Cross Linking Sites Still Burns

Tony — Mon, 20 Nov 2006 19:15:58 +0000

It’s been widely known that cross linking your own sites will result in a penalty in Google. Just before leaving PubCon Vegas one of my bots notified me that something was seriously wrong with 2 of my sites. Yesterday when I got home I started trying to figure out what happened. DUH! Somehow in the heat of trying to do too many things at one time I had a brain lapse and decided to link to a page on siteB from siteA. They weren’t just on the same C-class, they were on the same IP. Boom! Both domains rendered useless.

IDIOT!

Virante’s super clever use of on page optimization

Tony — Wed, 23 Aug 2006 19:01:05 +0000

Russ IM’d me today to show me something cool. Very cool. Check out the clever use of on-page optimization to get rankings for a non-competitive phrase: five seo excuses

Pay particular attention to the subdomains and what they spell out. Very slick. I’m sure your phone will be ringing off the hook over at Virante now Russ.

Justilien – So Long and Thanks for All the Links

Tony — Fri, 18 Aug 2006 19:55:28 +0000

It’s been a busy week here at NotSleepy with several new hires, including another programmer who had done some horrifically tedious, blow your brains out data entry for us in the past. PHP and Java coding should be a far more gratifying work experience for him.

Justilien (formerly of WeBuildPages) just left our office to head home after spending a week here in Raleigh training some new link builders we’ve just added to the organization. Justilien did two days of very well organized presentations on link building as well as core SEO fundamentals and followed that up with three days of hands-on training. Judging from the work they’ve already accomplished in one week with Justilien’s guidance, I’d say we’re looking at some quality links coming down the pipeline very soon. Thanks much for the outstanding consulting you’ve provided this week, Justilien.

Andy Beal Leaves His Company Fortune Interactive

Tony — Fri, 04 Aug 2006 19:09:08 +0000

I had heard a rumor one day ago that Andy Beal may be leaving his own company Fortune Interactive due to some issues with the investors. Today I received an email from him saying that he is leaving the company even though he was the “co-founder, equity holder and CEO”. Andy says that “circumstances have taken place that unfortunately lead me to move on”.

How ironic that this comes on the same day that Mike Grehan announces leaving Marketsmart Interactive.

I wish you all the best Andy and I look forward to seeing you at SES San Jose.