Archive for the ‘Code’ Category

Ruby Could Replace my Python Crawler Pretty Soon

July 28th, 2008 tony 6 comments

One of my developers just sent me some truly incredible stats about Ruby 1.9 and its threading performance.

20 threads * 100,000 iterations
Ruby 1.9 = 1.54 s.
Ruby Enterprise = 3.01 s.
JRuby 1.1.2 = 5.82 s.
Jython 2.2.1 = 11.86 s.
Python 2.5.2 = 12.32 s.
Ruby 1.8.7 = 22.68 s.
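For anyone who wants to try a test of this shape, here is a minimal sketch in modern Python 3. The thread count and iteration count come from the numbers above, but the loop body is an assumption (a plain counting loop), since the workload used in the original test is not described here. The names `worker` and `run_benchmark` are made up for illustration, and timings from this sketch are not comparable to the figures in the list.

```python
import threading
import time

THREADS = 20          # thread count quoted in the post
ITERATIONS = 100_000  # iterations per thread quoted in the post


def worker(counts, index):
    # Assumed workload: a plain counting loop. The loop body of the
    # original test is not given, so this is only a stand-in.
    total = 0
    for _ in range(ITERATIONS):
        total += 1
    counts[index] = total


def run_benchmark():
    # Start all threads, wait for them to finish, and time the whole run.
    counts = [0] * THREADS
    threads = [
        threading.Thread(target=worker, args=(counts, i))
        for i in range(THREADS)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return counts, elapsed


if __name__ == "__main__":
    counts, elapsed = run_benchmark()
    print(f"{THREADS} threads * {ITERATIONS} iterations = {elapsed:.2f} s.")
```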

Our attempt at testing Ruby as a crawler really wasn’t all that much slower than Python, so it will be really interesting to see what happens with Ruby 1.9.

The blog post about the test (it’s in Polish)

Categories: Code, Crawlers Tags:

Python is Ugly but Damn She’s Beautiful

March 26th, 2008 tony 2 comments

Remember the Python crawler NotSleepy built to suck up all your internets and find your affiliate IDs? Well, we kept massaging the code and finally slapped that thing down on a fat pipe. WOW. The stats are rocking now. How about double time!

Latest Stats:
35.6 URLs per second
3.073 Million URLs per day!
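A quick sanity check of those two numbers, as a small sketch (the function names are made up for illustration). At exactly 35.6 URLs per second the daily total comes to 3,075,840, so the posted 3.073 million suggests the per-second figure was rounded from roughly 35.57.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def urls_per_day(urls_per_second):
    # Scale a per-second crawl rate up to a full day.
    return urls_per_second * SECONDS_PER_DAY


def urls_per_second(urls_per_day_total):
    # Work backwards from a daily total to a per-second rate.
    return urls_per_day_total / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(urls_per_day(35.6)))             # 3075840
    print(round(urls_per_second(3_073_000), 2))  # 35.57
```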

What’s most promising is that the new fat pipe is still the bottleneck, which means that if anybody really wants to party, all we need to do is lay down some greenbacks and an OC-12 will show us mass terabyte pleasure.

Categories: Code, Crawlers Tags:

Why Basecamp Sucks…

March 12th, 2008 tony 11 comments





Categories: Code Tags:

Easy Solution for Conflicting Rails Migration Version Numbering

March 2nd, 2008 tony 1 comment

That’s a crappy blog post title but the best I could come up with! You know the scenario: you are about to commit your latest Rails code to Subversion and you perform an update first. Rats. Someone has committed a new migration with the same number as yours. So you mess around with SQL, manually reversing changes and renaming files. It’s a pain, and I think this single problem with Rails causes so much stress because as Rails developers we are used to everything working so smoothly. Well, Steve Purcell has solved this problem in a beautiful way.

Install his plugin ‘renumber_migrations’:
script/plugin install

Next time you run into this migrations mess:
rake db:migrate:renumber

Problem solved!

One note: for some reason I couldn’t get it to work until I removed line 18:
raise "This task currently supports only subversion projects"

Don’t know why he added that line, but once it was removed it worked perfectly. Thank you very much Steve! Now if someone will write a nice script to set up a bunch of common ignore properties (log/, schema.rb, tmp/) in SVN when first importing a new Rails project… :)
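As a starting point for that wish, here is a rough sketch that only builds the `svn propset svn:ignore` commands and prints them rather than running anything. The directory-to-pattern mapping is an assumption based on the three paths named above, with schema.rb assumed to live in db/ (the Rails default). The names `IGNORES` and `propset_commands` are made up for illustration.

```python
import shlex

# Assumed mapping of directory -> ignore pattern, taken from the paths
# mentioned in the post. schema.rb is assumed to sit in db/.
IGNORES = [
    ("log", "*"),
    ("db", "schema.rb"),
    ("tmp", "*"),
]


def propset_commands(ignores=IGNORES):
    # Build one "svn propset svn:ignore PATTERN DIR" command per entry,
    # quoting each argument so a bare * is not expanded by the shell.
    return [
        "svn propset svn:ignore {} {}".format(
            shlex.quote(pattern), shlex.quote(directory)
        )
        for directory, pattern in ignores
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in propset_commands():
        print(command)
```

Printing the commands first makes it easy to check them before pasting them into a shell at the root of a working copy.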

Categories: Code, Ruby on Rails Tags:

Crazy Python Crawler

January 7th, 2008 tony 4 comments

Someone emailed me doubting my crawler could operate at the speeds I posted last week, so here is a video I took this morning. I should have waited a few minutes after launching it before starting the video, as it really starts cranking once all the threads get rocking; you can see that near the end of the video. Also notice my streaming internet radio going in and out thanks to no available bandwidth left on my 5 Mbps line.

You can also hear a ticking sound. That is my new 1TB drive. It makes these weird ticking noises even when it’s not in use. Really sounds like the arm hitting something it’s not supposed to hit. Hope it’s not defective.

Video link

Categories: Code, Crawlers, Python Tags: