Programming Language Used To Scrape?

I learned Perl way back when PHP, Rails and other shit didn't exist, and I've just stuck with it. Perl is great because you can manage your entire server with it, create web apps in mod_perl/CGI, and do pretty much anything you want with it.
 


Yup, C# is hard to get into quickly. If your sole purpose is to learn scraping and browser automation, I suggest going for VB.NET. It's child's play. It would take you a week at most to learn the basics of Windows Forms and controls. After that you can learn how to scrape, log in automatically, etc. Watch this video on YouTube:

http://www.youtube.com/watch?v=AV2tk2FTM0g

Say no to .NET, it's way too bloated. Its predecessor, VB, left a horrific scar on decent programming practices.

Yes, it's great for rapid prototyping and development, but it will severely limit projects long-term. Not to mention, you end up paying more long-term as well.
 
Really? I've been doing a bit of looking around and Python does seem to be a popular one with quite a big following, and as I understand it, it's quite new, so it has the benefit of looking back at other languages. I've had a little play with it in Komodo Edit and it seems pretty easy to learn so far... but I still need to look into .NET and maybe Perl.

I'm sure I looked into C# a while back and had a hard time with it, but that might have been because I wasn't too good with programming in general back then. Cheers anyway, I just need to do a bit more research, I think.

Python has been around since 1991, far from new. It's become more popular in the last 5 years because people realized you could write 5 lines of Python code to replace 20-30 lines of .NET or Java.
 
Curl.class.php + MasterCurl.class.php + HTMLParser.class.php + URLObject.class.php = saturate 100mbit link with 1-10,000 requests/second.
 
If you already know php, go with that. Get hold of the book freezeprogram linked and start.
As for which language is best, it depends on what you want to do.

.NET is well organised and designed to scale to the largest programs. It has no relation to Visual Basic; rather, it's very much a clone of Java with a few tweaks along the way. Speed-wise, .NET is about the same as Java: much, much faster than scripting languages but slower than C.

PHP's good in the sense that you can get going quickly and every hosting company on the planet supports it.
 

.net is incredibly slow to load, around the same load times as java. It doesn't scale well across multiple servers, especially if you take into account cost of software. Mono sucks ass, so don't even bring that up.



At least we can agree Ruby sucks



I'd probably like python a lot more if you could do shortcuts like

if (somevar = somefunction()):
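(For reference: since Python 3.8 this kind of shortcut does exist, spelled with := rather than =. A minimal sketch; find_title and the sample HTML are made-up names for illustration, not anything from this thread.)

```python
import re


def find_title(html):
    """Return the <title> text of an HTML string, or None if there is none."""
    # The := operator assigns the match and tests it in a single expression.
    if (match := re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)):
        return match.group(1).strip()
    return None


print(find_title("<html><title> Example page </title></html>"))
```

On anything older than 3.8 the assignment has to go on its own line before the if.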

I'd honestly say PHP is awesome for rapid development and single developer applications. Python is a lot easier to deal with in teams due to its anal enforcing ways, but it's less of a pain in the dick than Java.

php.net is awesome, can't really beat the documentation.

They are planning on enforcing <?php and typecasting in function signatures, like function blah((int) $gah), in newer versions though, which is probably about the time I'll start getting pissed off and leave PHP to rot. Fuck Facebook for encouraging that shit (a la HipHop).
 
If you know JavaScript, I highly recommend trying node.js and server-side jQuery. Using the various $() selectors will change the way you think about scraping.
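The select-by-tag-and-class idea behind something like $('a.result') can be sketched in other languages too. The following is a rough Python standard-library approximation, not jQuery and not a full CSS selector engine; LinkCollector, select_links and the sample markup are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags, optionally filtered by CSS class."""

    def __init__(self, css_class=None):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # A class attribute may hold several space-separated class names.
        classes = (attrs.get("class") or "").split()
        if self.css_class is None or self.css_class in classes:
            href = attrs.get("href")
            if href:
                self.links.append(href)


def select_links(html, css_class=None):
    """Return the hrefs of all matching <a> tags, in document order."""
    parser = LinkCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<a class="result big" href="/one">1</a>'
    '<a href="/two">2</a>'
    '<a class="result" href="/three">3</a>'
)
print(select_links(sample, "result"))
```

The point is the same as with $(): you describe which elements you want and let a parser find them, instead of hand-writing a pattern for every page.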
 
Don't make me copy and paste my question again...

I'm looking for some input on what PROGRAMMING LANGUAGE is preferred in here for scraping and bot type stuff related to internet marketing. I know it's different languages for different functionality, but I'm looking for the IM input.

If that's what I wanted I'd have asked you to link me to uBot or ScrapeBox... but I already have them... I'm looking for a little more advanced stuff. Post your toss in DP, or post me a Dickroll, but not a shitty torrent search.


Why do you care what we prefer? You should care about what you prefer. In your first post you said you wanted to look at another language for scraping, which makes me think you already have experience programming in some language, so look for a language that is close to what you already know.

Even stupid people would know to go to Google and type in queries like "scraping wikipedia" or "scrape html" and see that there are tons of pages and tutorials about PHP and a few on Ruby and Python. From that observation, someone with common sense would deduce that there is probably a lot of info on PHP and scraping, and that PHP is probably the most widely used choice, based on the sheer amount written about scraping with PHP.
 
I've always used PHP and curl, and the simple dom parser class comes in handy, though it's slower than using regular expressions or whatever. There's no multithreading, which is pretty frustrating, but I've had success with using a Python script with threads executing the PHP code.
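For comparison, the threaded fetching part can be done directly in Python. This is only a sketch under assumptions, not the script described above: fetch_url, fetch_all and the worker count are made-up names and values, and it downloads with urllib rather than calling out to PHP.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_url, max_workers=8):
    """Fetch every URL on a pool of threads.

    Returns a dict mapping each URL to its body, or to the exception
    that was raised while fetching it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the requests run concurrently.
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one bad URL should not stop the rest
                results[url] = exc
    return results
```

Calling fetch_all(list_of_urls) returns once every request has finished or failed. The fetch parameter is there so a different downloader (or a fake one for testing) can be swapped in.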

Server-side JS I'll have to check out. JS is probably my favorite language to code in (with jQuery), but I've only used it client side.
 
Ok. I just realized that my hosts file was pointing to the domain and it IS indeed expired. Thanks for trying to tell me, major fail on my part.