
Archive for December, 2004

phpBB Exploit – howdark Exploit

December 11th, 2004 Tony 8 comments

There is a serious exploit in phpBB that requires an immediate patch.

Patch

If you maintain a site running phpBB forums, apply this patch immediately. Apparently the exploit allows an attacker to gain root access to MySQL. Many sites have been hacked, including phpBB’s own support forum. Ouch.

I haven’t been able to find much info on how the exploit works. The code change in this patch seems trivial, and I still don’t understand how it fixes the problem with the forums. If anyone has more insight, I would love to hear it.
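For anyone curious about the general class of bug being described (user input reaching the database unsanitized), here is a rough sketch of what that looks like. To be clear, this is a made-up illustration and not the actual phpBB code or the real exploit; the table name and parameter are invented, and sqlite3 is just standing in for MySQL.

    import sqlite3  # standing in for MySQL; the injection pattern is the same

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE topics (id INTEGER, title TEXT)")
    conn.execute("INSERT INTO topics VALUES (1, 'Welcome')")

    def get_topic_unsafe(topic_id):
        # Vulnerable: the request parameter is pasted straight into the SQL string,
        # so a crafted value becomes part of the query itself.
        return conn.execute("SELECT title FROM topics WHERE id = " + topic_id).fetchall()

    def get_topic_safe(topic_id):
        # Safer pattern: the value is bound as a parameter, never as SQL text.
        return conn.execute("SELECT title FROM topics WHERE id = ?", (topic_id,)).fetchall()

    print(get_topic_unsafe("1 UNION SELECT 'injected row'"))  # attacker-controlled result
    print(get_topic_safe("1"))                                # only the real topic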

Supposedly this is only a temporary patch, and we need to stay tuned for a permanent fix.

Full discussion of the security flaw, including input from ‘jessbunny’, the user who reported the problem and then proceeded to use the exploit against phpBB.com

Categories: Code Tags:

tracker2.php Pagejacking via HTTP 302 Redirect Google Bug

December 10th, 2004 Tony 15 comments

Google has a nasty bug these days that allows unsavory webmasters to hijack the content on your site. If your competitor wants to destroy your search engine rankings, he only needs to create a simple page that forces an HTTP 302 redirect to your site. Sounds harmless enough, right? Well, the problem is that Google follows the redirect to your site but gives the evil redirecting site credit for the content.

How Do I Know if a Site is Hijacking My Site?

1. Search in Google for allinurl:www.mysite.com
2. Look for any listings that are not your site but have exactly the same title as your site.
3. View the Google cache to see if it looks just like your site.
4. Use my HTTP Response Viewer to view all HTTP headers being returned.

If the title is the same, the cache is the same, and an HTTP 302 is being returned, you’ve been pagejacked. The most commonly talked about filename associated with this tactic is tracker2.php, but many more are popping up. Oftentimes this is accompanied by cloaking tactics that serve up the 302 redirect to Googlebot and a different page to normal visitors. My HTTP Response Viewer fakes the user-agent so that the server thinks it is a visit from Googlebot, letting you view HTTP response headers exactly as Google sees them.
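If you would rather run the same check from a script instead of my tool, something along these lines does the trick. This is only a sketch: the suspect URL is a placeholder, and the user-agent string is simply what Googlebot commonly reports, not anything pulled from my tool.

    import http.client
    from urllib.parse import urlsplit

    # Placeholder: swap in the page you suspect is redirecting to your site.
    suspect_url = "http://example.com/tracker2.php?id=123"

    parts = urlsplit(suspect_url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)

    # Pretend to be Googlebot so any cloaking kicks in and we see what Google sees.
    conn.request(
        "GET",
        (parts.path or "/") + ("?" + parts.query if parts.query else ""),
        headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    )

    resp = conn.getresponse()        # http.client never follows redirects on its own
    print(resp.status, resp.reason)  # a hijack page typically shows 302 here
    for name, value in resp.getheaders():
        print(name + ": " + value)   # look for a Location header pointing at your site
    conn.close()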

The bad site can either return an HTTP 302 redirect header or drop a simple one-liner meta-refresh tag.
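To make that concrete, here is a hypothetical sketch of what the server side of such a hijack page amounts to. The domain name is made up, and this is just the generic pattern, not any particular offender’s code.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    TARGET = "http://www.victim-site.com/"  # made-up domain standing in for your site

    class HijackPage(BaseHTTPRequestHandler):
        def do_GET(self):
            # Variant 1: a bare HTTP 302 pointing the crawler at the target site.
            self.send_response(302)
            self.send_header("Location", TARGET)
            self.end_headers()
            # Variant 2 would instead send a normal 200 response whose body is
            # nothing but a one-liner meta refresh:
            #   <meta http-equiv="refresh" content="0;url=http://www.victim-site.com/">

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), HijackPage).serve_forever()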

How Does This Hurt My Site?

Google sees two sites with identical content and one of them gets smacked out of the index. My experience is that it is often the original, innocent site.

Some claim that the offending site actually steals your PageRank in the process, but I’m not convinced this is true.

Are All 302 Redirects Malicious?

No. Remember that HTTP 302 is a valid method of reporting that a page has been temporarily redirected to another location. Google is to blame here. I have found that most webmasters are oblivious to the problem it poses. If you find a site that is redirecting to your site and cloaking in the process, you can consider it malicious, and a swift FedEx from your attorney is in order.

An interesting sidenote is that Business.com recently shot themselves in the foot. They used a 302 redirect to bounce all traffic from business.com to www.business.com. Soon thereafter they suffered the “PageRank 0 on the homepage, higher PageRank on internal pages” syndrome, as well as having their homepage removed from the Google index.

How Can I Stop my Pages from Being Hijacked?

Unfortunately your only course of action is to attempt to get the other site to remove the HTTP 302 redirect. As I said before, most webmasters have no idea of the havoc they are wreaking. I have found that a polite yet firm email nearly always results in a swift removal of the redirect, and it’s often followed by a puzzled reply of “What’s the problem?”. To make matters worse, it seems that a module for PHPNuke is creating HTTP 302 redirects. (View an example email that has worked extremely well for us)

What is Google Doing About This?

Who knows. They have requested examples of the bug, and at the recent WebmasterWorld conference they claimed to be working on a resolution, but it seems to be taking a long time. Yahoo had the same bug about 10 months ago and fixed it very quickly.

Categories: Search Engine Optimization Tags:

Google Sandbox Doesn’t Exist!

December 3rd, 2004 Tony No comments

I’m starting to buy into the idea that the Google sandbox is non-existent. I talked to quite a few peeps at the WebmasterWorld conference who had successfully launched new sites that quickly ranked well for competitive keywords.

I think caveman may be right:

I think it involves a series of dup filters, not limited to “content” in the “onpage” sense of the word. We believe it can involve text, and/or anchor text, and/or non-content code. Not too unlike good ol’ kw filters.

We guess that it involves an evolving db of kw’s, with the list changing in waves, though the biggest waves had names (e.g. Austin), and now the evolution of the list is more subtle. This could be a conclusion however that is way off base, with other things happening that just make it look like a kw based phenomenon. Some 2kw phrases that seem like they should be on the hit list are not, which is why we only guess at the kw list thing. And evidence abounds that it’s not just a money kw thing. I wish that would get put to bed.

We’d agree that there are multiple algo’s, and that G’s system factors in qualitative and evolutionary factors including age of pages, age of links, number of pages, number of links and rate of growth. There are lots of ways for a new cat to get skinned, and there have been ever since Florida. Just worse since April.

We also think a site has *hurdles* to get over that kick in or not at varying degrees depending upon their assessment of (what we view as) the evolutionary measures. And the hurdles also involve semantics, and measures of quality.

Trip a filter, you die. Sitewide. Fail to get over the hurdles, you die. Sitewide. (Did someone say G was all about pages?) ;-)

One thing I’d never do right now. Launch a 10,000 page site, or add 1000 pages to a 500 page site…unless I had a LOT of other things going for me.

I even seem to recall pages getting trimmed from the really big sites, which may be more related to all this than has been discussed much…especially if the algos are conceptually similar, but vary largely by degree.

This is so typical of the SEO business. A couple of years ago the standard way to launch a new site was to make sure you had all 100 to 20,000 pages up and ready to be indexed and then drop a link to the new site. You needed to make sure all your pages were ready to go before you cued Googlebot, because a deep crawl was nearly inevitable at first launch but it might be a long time before you saw another. Now it seems you’re better off starting with a few pages and trickling in new pages and inbound links. You definitely need software to manage this; it’s the only way I’ve seen success in the past year.
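The software doesn’t need to be fancy, either. A hypothetical drip-publisher could be as simple as the sketch below; the directory names and the pages-per-day number are pure assumptions.

    import shutil
    from pathlib import Path

    # Hypothetical layout: finished pages wait in a staging directory and get
    # moved into the live site a few at a time.
    STAGING = Path("staging")      # pages written and ready, but not yet public
    LIVE = Path("public_html")     # the directory the web server actually serves
    PAGES_PER_DAY = 5              # trickle rate; tune to taste

    def publish_todays_batch():
        queued = sorted(STAGING.glob("*.html"))[:PAGES_PER_DAY]
        for page in queued:
            shutil.move(str(page), str(LIVE / page.name))
            print("published " + page.name)

    if __name__ == "__main__":
        # Run once a day from cron; each run releases the next small batch.
        publish_todays_batch()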

Categories: Search Engine Optimization Tags: