Automating SERP Tracking

vcm

I'm wondering what you guys use to scrape your search engine results on a regular basis.

I'm currently trying to put something together with php/curl (using Smaxor's tutorials to get started), but wanted to know if there is already a program or bot that could do this.

In exchange for some decent answers, I give you Miss Hazell...

[attached: Keeley Hazell pics]
 


php/curl is what I would use. It's not too tricky if you already know PHP, and Smaxor's tutorials are easy to follow.
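Something along these lines is really all it takes. This is just a rough sketch; the query, user agent and regex are placeholders, not lifted from Smaxor's tutorials:

<?php
// Rough php/curl SERP grab -- the query, user agent and regex below are
// placeholder assumptions, not taken from Smaxor's tutorials.
$query = urlencode('your keyword here');
$ch = curl_init('http://www.google.com/search?q=' . $query . '&num=100');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // hand back the HTML instead of echoing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow any redirects
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1)'); // look like a browser
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
$html = curl_exec($ch);
curl_close($ch);

// Crude first pass: yank the hrefs out of the results page.
preg_match_all('/<a href="(https?:\/\/[^"]+)"/i', $html, $matches);
print_r($matches[1]);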
 
I would argue against using PHP here. For scripts that you want to run locally (more than likely if you're scraping SEs) and that generally take a long time to run, such as collecting large amounts of data, you're going to be using php-cli from the command line anyway, so why not use something like Python or Ruby that will be a lot more stable (IMO) for the task?

If you don't want to code something, though, just buy ScrapeBox and hook it up to some WinAutomation tool.
 
err... what?

Never had a problem scraping locally with PHP.

::emp::
 
I would argue against using PHP here... why not use something like Python or Ruby that will be a lot more stable (IMO) for the task?

good points man
 
In my personal experience (and this is probably down to bad coding), many long-running scripts in PHP, and by long-running I mean >24 hours, have run out of memory and crashed. There are also plenty of known memory leak problems associated with php-cli. Once I switched to writing my scripts in Python, the problem disappeared.
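For example, this is the shape of loop where it shows up. A rough sketch, with keywords.txt, the Google URL and the delays as placeholder assumptions; an unset() plus a memory_get_usage() check at least lets you see how fast usage is creeping:

<?php
// Long-running php-cli loop with an explicit memory check.
// keywords.txt, the Google URL and the delays are placeholder assumptions.
$keywords = file('keywords.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($keywords as $i => $keyword) {
    $ch = curl_init('http://www.google.com/search?q=' . urlencode($keyword));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);
    curl_close($ch);

    // ... parse and store the rankings here ...

    unset($html);   // free the page before the next iteration

    if ($i > 0 && $i % 100 === 0) {
        // watch for the slow climb people complain about with php-cli
        echo "After $i keywords: " . round(memory_get_usage(true) / 1048576, 1) . " MB\n";
    }
    sleep(rand(5, 15));   // don't hammer the engine
}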

By all means, use whatever works best for you; it just so happened that PHP didn't work best for me, and I would never go back to it for client-side scripting. Just my $0.02.
 
If you're using curl, be sure to use curl_multi_* if you're going to be checking more than one URL.
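Rough sketch of what that looks like (the URL list is just a placeholder):

<?php
// curl_multi_* sketch: fetch several result pages in parallel.
// The URL list is just a placeholder.
$urls = array(
    'http://www.google.com/search?q=keyword+one',
    'http://www.google.com/search?q=keyword+two',
    'http://www.google.com/search?q=keyword+three',
);

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1)');
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run every handle until all the transfers have finished.
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);   // wait for activity instead of spinning
} while ($running > 0);

foreach ($handles as $url => $ch) {
    $html = curl_multi_getcontent($ch);
    echo $url . ' -> ' . strlen($html) . " bytes\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);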

Also, don't use simple_html_dom or you will definitely see OOM problems when doing batches (in some cases with batches of little more than 10 URLs). See my post in the PHP warchest thread about how to parse Google SERPs quickly (really fucking fast by comparison) and cheaply. simple_html_dom is really easy to use, which is great in simple cases, but it has self-admitted memory leaks and is slow as fuck.
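If you can't find that thread, here's a rough sketch of one lighter option using PHP's built-in DOMDocument and DOMXPath. This isn't necessarily the same approach as the warchest post, and the XPath assumes Google's old h3.r result markup, so treat both as assumptions:

<?php
// One lighter alternative to simple_html_dom: PHP's built-in DOMDocument + DOMXPath.
// Not necessarily what the warchest post does -- just a sketch. The XPath assumes
// Google's old "h3 class=r" result markup, so adjust it if the layout changes.
$html = file_get_contents('serp.html');   // or whatever your curl call returned

$dom = new DOMDocument();
libxml_use_internal_errors(true);          // SERP HTML is never valid; silence the warnings
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$rank = 0;
foreach ($xpath->query('//h3[@class="r"]/a') as $link) {
    $rank++;
    echo $rank . '. ' . trim($link->textContent) . ' => ' . $link->getAttribute('href') . "\n";
}

unset($xpath, $dom);                       // plain objects to free, no ->clear() dance like simple_html_dom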