Anyone know where I'd buy a database dump of EzineArticles?



When you get it coded up, make sure it deals with their changing CSS stylesheets and with IP banning.

They make other frequent changes as well to stop scrapers. I've written half a dozen scrapers over the last year, and they change things so often it's just a pain in the ass.
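One way to soften the blow of that markup churn is to try a list of fallback selectors instead of hard-coding one. A minimal Python sketch of the idea, using the stdlib's limited XPath support (the class names here are made up for illustration, not EzineArticles' actual markup):

```python
import xml.etree.ElementTree as ET

# Hypothetical selector fallbacks, newest markup first. When the site
# renames a class, the scraper tries the next pattern instead of
# silently returning nothing.
SELECTORS = [
    './/div[@class="article-body"]',   # assumed current markup
    './/div[@class="content"]',        # assumed older markup
    './/td[@class="articlebody"]',     # assumed oldest markup
]

def extract_body(html):
    """Return the article text from the first selector that matches,
    or None so the caller can flag the page for a selector update."""
    root = ET.fromstring(html)
    for xpath in SELECTORS:
        hits = root.findall(xpath)
        if hits:
            return "".join(hits[0].itertext()).strip()
    return None
```

When `extract_body` starts returning None, you know the markup changed again and only the `SELECTORS` list needs updating, not the whole scraper.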
 
I wrote a tutorial on it a couple weeks ago using EzineArticles as an example: PHP Tutorial 2: Advanced Data Scraping Using cURL And XPATH | Matthew Watts

It's honestly not hard to do yourself. Just add database functionality and make sure to add an option in the cURL call to route requests through a proxy.
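The tutorial is PHP/cURL, but the proxy idea is the same anywhere. A rough Python equivalent of setting CURLOPT_PROXY, picking a random proxy per request (the proxy addresses are placeholders, not real endpoints):

```python
import random
import urllib.request

# Hypothetical proxy list -- substitute your own.
PROXIES = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]

def make_opener(proxy):
    """Build an opener that routes http/https through one proxy,
    roughly what CURLOPT_PROXY does in the PHP cURL call."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]  # look like a browser
    return opener

def fetch(url):
    """Fetch a page through a randomly chosen proxy."""
    return make_opener(random.choice(PROXIES)).open(url, timeout=30).read()
```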

Wish I'd skimmed that before I started. I didn't notice they had a next button on their site, so I ended up spamming Google trying to find all the article URLs.

Did you guys have problems initially, or did you get banned later on? I've hit it ~5k times so far and haven't been banned yet (proxies, obviously).

I didn't get banned; a message just comes up: "Hey, I see you're using ___, no problem, enter the captcha and keep reading." :)

Never thought of scraping the cache, excellent idea!
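For anyone trying the cache route: Google's cached copy of a page lives at a predictable URL, so you can hit Google's servers instead of the live site. A small helper that builds that URL (whether the cached copy actually exists for a given page is up to Google):

```python
from urllib.parse import quote

def cache_url(url):
    """Build the Google cache URL for a page, so requests go to
    Google's cached copy instead of the live site."""
    return ("https://webcache.googleusercontent.com/search?q=cache:"
            + quote(url, safe=""))
```

For example, `cache_url("http://example.com/a")` gives `https://webcache.googleusercontent.com/search?q=cache:http%3A%2F%2Fexample.com%2Fa`. Keep in mind Google rate-limits and captchas aggressive cache scraping too.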
 
This is why automating a browser is infinitely better than botting with cURL. Watir/Celerity to the rescue: run 10 concurrent instances with liberal pauses and a dedicated proxy per instance. EzineArticles isn't going anywhere; it's not a race.
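The "N instances, dedicated proxy each, liberal pauses" scheme sketched above looks something like this in Python. The `scrape` function is a stub; in real use it would drive an actual browser (Watir/Celerity in Ruby, or Selenium in Python) configured to use its worker's proxy. Proxy addresses and pause lengths are placeholders:

```python
import queue
import random
import threading
import time

# Hypothetical: one dedicated proxy per worker instance.
PROXIES = [f"http://127.0.0.1:{8000 + i}" for i in range(10)]

def scrape(url, proxy):
    # Stub -- in practice, drive a real browser configured to use `proxy`.
    return f"{url} via {proxy}"

def worker(urls, proxy, results):
    """Pull URLs off the shared queue, always through this worker's
    own proxy, with a random pause between hits."""
    while True:
        try:
            url = urls.get_nowait()
        except queue.Empty:
            return
        results.append(scrape(url, proxy))
        # Liberal pause -- use seconds or minutes in real use; it's not a race.
        time.sleep(random.uniform(0.01, 0.05))

def run(url_list, proxies=PROXIES):
    urls = queue.Queue()
    for u in url_list:
        urls.put(u)
    results = []  # list.append is atomic under the GIL, so this is safe here
    threads = [threading.Thread(target=worker, args=(urls, p, results))
               for p in proxies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each worker keeps the same proxy for its whole lifetime, which matches the "dedicated proxy per instance" advice: from the site's point of view, each IP looks like one slow, patient reader.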