Scraping Businesses and downloading a file

Pb.com · May 27, 2014

In broad terms, let's say I want a specific PDF file that is hosted on a company's site, and I want to get this file for as many businesses in the U.S. as possible.

What is the best way to do this?

My thoughts:
1. Create a master list of businesses (cross-referencing various directories)
2. Extract the company name, address, phone number, and site URL
3. Have Mechanical Turk workers go through the list of URLs and paste either the URL of the exact file location (www.joesbiz.com/thefile.pdf) or the URL of the page that holds the information I'm looking for (www.joesbiz.com/thefilepage.html)
4. Download the file, or take a screenshot of the page and convert it into a PDF, stripping out the extra parts.
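
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under assumptions: the record keys (name, address, phone, url), the helper names normalize_domain and merge_directories, and the sample rows are all made up, and real directory exports would need their own parsing first.

```python
from urllib.parse import urlparse


def normalize_domain(url):
    """Reduce a site URL to a bare lowercase host, e.g. 'joesbiz.com'."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to locate the host
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def merge_directories(*directories):
    """Merge business records from several directories, keyed by domain.

    Each record is a dict with hypothetical keys: name, address, phone, url.
    The first non-empty value seen for each field wins.
    """
    master = {}
    for records in directories:
        for record in records:
            domain = normalize_domain(record.get("url", ""))
            if not domain:
                continue  # no site URL, so nothing to fetch later
            merged = master.setdefault(domain, {"domain": domain})
            for field in ("name", "address", "phone", "url"):
                if record.get(field) and not merged.get(field):
                    merged[field] = record[field]
    return list(master.values())


# Made-up sample rows standing in for two directory exports.
directory_a = [{"name": "Joe's Biz", "phone": "", "url": "www.joesbiz.com"}]
directory_b = [{"name": "Joes Biz LLC", "phone": "555-0100", "url": "http://joesbiz.com/"}]
master_list = merge_directories(directory_a, directory_b)
```

Keying on the normalized domain is one possible choice; it treats www.joesbiz.com and joesbiz.com as the same business, which helps when the same company appears in several directories under slightly different names.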

I apologize ahead of time for speaking in generic terms.

Dannko · Sep 21, 2014

The first point is correct.
The remaining steps can be handled by ZennoPoster.

scrape.it · Sep 26, 2014

You'd want to either write a scraper or use a tool that does this.
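
For a sense of what writing a scraper might involve, here is a minimal sketch using only the Python standard library. It assumes the file is linked from a page with an ordinary anchor tag whose href ends in .pdf; the names PdfLinkParser, find_pdf_links, and download are hypothetical, and the sample HTML is invented around the example URLs from the original question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Check the path only, so query strings like ?v=2 do not hide the extension.
        if href and urlparse(href).path.lower().endswith(".pdf"):
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return every PDF link found in the given HTML, resolved against base_url."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


def download(url, dest_path, timeout=30):
    """Fetch url and write the raw bytes to dest_path (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as response, open(dest_path, "wb") as out:
        out.write(response.read())


# Invented page content; a real run would fetch the HTML from each site first.
sample_html = '<a href="/thefile.pdf">The file</a> <a href="/about.html">About</a>'
links = find_pdf_links(sample_html, "http://www.joesbiz.com/thefilepage.html")
```

In practice each site in the list would be fetched, passed through find_pdf_links, and any hits handed to download. Sites that render links with JavaScript or name the file differently would not be caught by this approach.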

MarketingMall · Jan 4, 2019

If you need company data from different sources such as Yelp, Yellowpages, LinkedIn Companies, BBB, etc., I can help you. Don't reinvent the wheel.
I have already coded a tool and am constantly adding more features.
