Yahoo! Search BOSS

Status
Not open for further replies.

Red_Virus

Jun 29, 2007
BOSS (Build your Own Search Service) is Yahoo!'s open search web services platform. The goal of BOSS is simple: to foster innovation in the search industry. Developers, start-ups, and large Internet companies can use BOSS to build and launch web-scale search products that utilize the entire Yahoo! Search index.
Yahoo! Search BOSS - YDN

So now Yahoo has an API with unlimited queries!
 


Trying to sign up, but how the hell do I know the Web Application URL and BBAuth Success URL before I've had a chance to build anything?

Do I just make up shit like example.com/y-search.php and example.com/y-results.php even though those pages don't exist?
 
OK, you can just enter your domain root for both of those fields. It will then ask you to create a uniquely named HTML file and upload it to the root of your domain (or probably whatever URL you entered in those boxes).

Once it validates your HTML file, you get your API key. The whole thing takes less than 5 minutes.
 
Finally got this shit working. Geez, I suck at PHP.

This example uses cURL and SimpleXML.

I created a quick search box for Photoshop tutorials. I strip out the words "photoshop", "tutorial", and "how to" if the user has entered them, then add "photoshop tutorial" back to the search.

None of that is important. All you need to do is build a proper $query and $url to pass to BOSS, using your own APPID.
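As a rough sketch of that step, the query and URL could be built in one helper. The function name buildBossUrl and the placeholder key are made up for illustration; the endpoint and the appid/format parameters are the ones from the code in this post. It uses urlencode(), which keeps the "+" for spaces and also escapes characters like & and ? that would otherwise break the URL.

```php
<?php
// Hypothetical helper: builds a BOSS v1 web search URL from raw user input.
function buildBossUrl($terms, $appId)
{
    // Drop the words we are going to add back ourselves (case-insensitive).
    $terms = str_ireplace(array('how to', 'photoshop', 'tutorial'), '', $terms);
    // Collapse the whitespace left behind by the removals.
    $terms = trim(preg_replace('/\s+/', ' ', $terms));
    $query = trim($terms . ' photoshop tutorial');

    // Encode the query (spaces become "+") and build the query string.
    $params = http_build_query(array('appid' => $appId, 'format' => 'xml'));

    return 'http://boss.yahooapis.com/ysearch/web/v1/' . urlencode($query) . '?' . $params;
}

// YOUR_APPID is a placeholder, not a real key.
echo buildBossUrl('fire text tutorial', 'YOUR_APPID'), "\n";
```

Whatever the user types ends up as a single encoded path segment, so stray characters in the search box can't change the rest of the URL.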


After we execute the cURL request, we load the results into a SimpleXMLElement and count how many results we got (the default is 10).

BOSS returns a clickurl, a dispurl (the display URL), an abstract, and some other junk we don't care about.

So we take the count of results and build up some arrays ($clickurls, $displayurls, $descriptions) to hold all the values. Note that the values are still SimpleXML objects, so you have to cast them to string or pass them to a function that will return a string.

Next we build up an associative array $urls that has all the data inside it. You could probably do this and the above step all at once.
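Here is a minimal sketch of doing both steps in one pass. The sample XML is made up and trimmed to the three fields used in this post (a real response carries more), and parseBossResults is a hypothetical name, not part of any API.

```php
<?php
// Made-up sample response, trimmed to the three fields used in this thread.
$sample = <<<XML
<ysearchresponse>
  <resultset_web>
    <result>
      <abstract><![CDATA[Learn a <b>fire text</b> effect.]]></abstract>
      <clickurl>http://example.com/click?id=1</clickurl>
      <dispurl>example.com/fire-text</dispurl>
    </result>
    <result>
      <abstract>Glitter brushes in five steps.</abstract>
      <clickurl>http://example.org/click?id=2</clickurl>
      <dispurl>example.org/glitter</dispurl>
    </result>
  </resultset_web>
</ysearchresponse>
XML;

// Hypothetical helper: one loop over the results, casting as we go.
function parseBossResults($xmlString)
{
    $xml = new SimpleXMLElement($xmlString, LIBXML_NOCDATA);
    $urls = array();
    foreach ($xml->resultset_web->result as $result) {
        // Each property is still a SimpleXMLElement until it is cast.
        $urls[] = array(
            'curl'        => (string) $result->clickurl,
            'durl'        => (string) $result->dispurl,
            'description' => trim((string) $result->abstract),
        );
    }
    return $urls;
}

print_r(parseBossResults($sample));
```

Iterating with foreach avoids the index counter and the three intermediate arrays, and every value in $urls is a plain string by the time it is stored.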

Lastly we loop through our $urls and build out some clickable URLs with descriptions.

Maybe someone good at PHP can clean this up and look for any form exploits and post some better code.

Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Photoshop Tutorial Search</title>
</head>
<body>
<?php
// Default to an empty string so $tutorial is always defined.
$tutorial = isset($_POST['tutorial']) ? trim($_POST['tutorial']) : '';
 
if ($tutorial != '')
{
$tutorial = str_replace('tutorial', '', $tutorial);
$tutorial = str_replace('photoshop', '', $tutorial);
$tutorial = str_replace('how to', '', $tutorial);
$query = $tutorial . " photoshop tutorial";
// urlencode() turns spaces into "+" and also escapes characters such as & and ? that would break the URL.
$query = urlencode($query);
// Boss Level - Must Defeat the BOSS!
$url = "http://boss.yahooapis.com/ysearch/web/v1/" . $query . "?appid=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&format=xml";
DefeatBOSSLevel($url);
}
function DefeatBOSSLevel($target_url) {
 
$userAgent = 'IE 7 - Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)';
$clickurls = array();
$displayurls = array();
$descriptions = array();
$urls = array();
// make the cURL request to $target_url
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$data = curl_exec($ch);
if (!$data) {
 // Read the error details before closing the handle.
 echo "<br />cURL error number: " . curl_errno($ch);
 echo "<br />cURL error: " . htmlspecialchars(curl_error($ch));
 curl_close($ch);
 exit;
}
curl_close($ch);
 
$xml = new SimpleXmlElement($data, LIBXML_NOCDATA);
$cnt = count($xml->resultset_web->result);
for($i=0; $i<$cnt; $i++)
{
  $clickurls[] = (string)$xml->resultset_web->result[$i]->clickurl;
  $displayurls[] = (string)$xml->resultset_web->result[$i]->dispurl;
  $descriptions[] = trim((string)$xml->resultset_web->result[$i]->abstract);
}
for ($i = 0; $i < $cnt; $i++) {
$urls[] = array('curl' => $clickurls[$i], 'durl' => $displayurls[$i], 'description' => $descriptions[$i]);
} 
foreach ($urls as $url) {
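// Escape the link target and close the <div> opened for each result; the display URL and abstract are printed as returned.
print ('<div><a href="' . htmlspecialchars($url["curl"]) . '">' . $url["durl"] . '</a><br />' . $url["description"] . '<br /><br /></div>');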
} 
}
?>
<form action="tutorialsearch.php" method="post" enctype="application/x-www-form-urlencoded" name="tutorialsearch">
<div><label>Tutorial Search</label><input name="tutorial" type="text" size="30" maxlength="50" value="<?php echo isset($tutorial) ? htmlspecialchars($tutorial) : ''; ?>" /><input name="submit" type="submit" value="Search!" /></div>
</form>
<div>Examples -</div>
<div>Glitter</div>
<div>Fire Text</div>
<div>Cool Text</div>
</body>
</html>
 
Screenshot of results -




I think the next thing I'll try (sometime tomorrow after work) is to also do an image search and, if any of the image domains match the query results, display the image next to the result.
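The matching part of that idea could look something like the sketch below. It makes no image search request at all: $imageUrls is assumed to be a list of image URLs fetched some other way, and hostOf and attachImages are made-up helper names. It only shows comparing hosts between the two result sets.

```php
<?php
// Hypothetical helper: lower-cased host of a URL, without a leading "www.".
function hostOf($url)
{
    // dispurl-style values have no scheme, so add one before parsing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return '';
    }
    return preg_replace('/^www\./', '', strtolower($host));
}

// Hypothetical helper: attach the first image whose host matches each result's host.
function attachImages(array $webResults, array $imageUrls)
{
    $byHost = array();
    foreach ($imageUrls as $imageUrl) {
        $host = hostOf($imageUrl);
        if ($host !== '' && !isset($byHost[$host])) {
            $byHost[$host] = $imageUrl;
        }
    }
    foreach ($webResults as $i => $result) {
        $host = hostOf($result['durl']);
        $webResults[$i]['image'] = isset($byHost[$host]) ? $byHost[$host] : null;
    }
    return $webResults;
}

$web = array(
    array('durl' => 'www.example.com/fire-text'),
    array('durl' => 'example.org/glitter'),
);
$images = array('http://example.com/img/fire.jpg', 'http://cdn.example.net/x.png');
print_r(attachImages($web, $images));
```

Indexing the images by host first keeps the lookup to one pass over each list instead of comparing every image against every result.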

And then eventually pass in site: commands to limit which sites I get results from.
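A possible sketch of the site: part, done purely by editing the query string before it gets encoded. limitToSites is a made-up name, and combining several sites with OR inside parentheses is an assumption about the search syntax that has not been checked against BOSS.

```php
<?php
// Hypothetical helper: appends site: operators to a search query.
// Assumption: several sites can be combined with OR inside parentheses.
function limitToSites($query, array $sites)
{
    $filters = array();
    foreach ($sites as $site) {
        $site = strtolower(trim($site));
        // Keep only plausible host names so nothing odd reaches the query.
        if (preg_match('/^[a-z0-9.-]+$/', $site)) {
            $filters[] = 'site:' . $site;
        }
    }
    if (count($filters) === 0) {
        return $query;
    }
    if (count($filters) === 1) {
        return $query . ' ' . $filters[0];
    }
    return $query . ' (' . implode(' OR ', $filters) . ')';
}

echo limitToSites('fire text photoshop tutorial', array('example.com')), "\n";
```

The result is still a plain query string, so it has to be URL-encoded before it goes into the request URL.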
 