Figured I'd share because it's saved me some time working out how to merge three inaccurate databases with a total of ~300k rows.
SQL Power - DQguru Data Cleansing & MDM Tool
Long story short, I've got a few data feeds with duplicate UPCs and wrong UPCs. DQguru makes it very easy to strain out the dupes. I'm working on a workflow that'll help me correct the wrong ones, hopefully by fuzzy-comparing (upc, product name) pairs against a known-good db of (upc, product name). First I need to find the latter, or maybe build it via the Amazon API.
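For anyone curious what the fuzzy-compare step might look like outside DQguru, here's a rough Python sketch using only the standard library's difflib. It is not what DQguru does internally. The function names (normalize, name_similarity, suggest_upc), the 0.85 threshold, and the sample UPCs and product names are all made up for illustration, and the reference db is assumed to be a plain dict of upc -> product name.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(name.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio between two product names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def suggest_upc(feed_upc, feed_name, reference, threshold=0.85):
    """Return (upc, score) for a feed row, or (None, best_score) if nothing fits.

    reference is a hypothetical known-good dict of upc -> product name.
    threshold is an arbitrary starting point, not a tuned value.
    """
    # Case 1: the feed's UPC exists in the reference and the names agree.
    known_name = reference.get(feed_upc)
    if known_name is not None:
        score = name_similarity(feed_name, known_name)
        if score >= threshold:
            return feed_upc, score

    # Case 2: the UPC looks wrong, so search for the closest product name.
    best_upc, best_score = None, 0.0
    for upc, name in reference.items():
        score = name_similarity(feed_name, name)
        if score > best_score:
            best_upc, best_score = upc, score

    if best_score >= threshold:
        return best_upc, best_score
    return None, best_score


if __name__ == "__main__":
    # Made-up reference rows, purely for demonstration.
    reference = {
        "012345678905": "Acme Widget 500ml",
        "036000291452": "Kleenex Tissues 100ct",
    }
    print(suggest_upc("999999999999", "Kleenex Tissue 100ct", reference))
```

The fallback search is a linear scan over the whole reference per feed row, so at ~300k rows you'd probably want to narrow the candidates first (by brand or first word, say) before scoring.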
PS: Anyone ever build a 100k+ page WordPress site? I never expected the permalink-to-post-ID lookup to be a bottleneck. Any other gotchas to look out for?