gogogogo



Not really a great showcase for Go's unique features, but:

Code:
package main

import (
	"bufio"
	"fmt"
	"net/url"
	"os"
	"strings"
)

func main() {
	// Map keys are lowercased domains; the bool value doesn't matter.
	domainsMap := make(map[string]bool)
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		// Parse each line as a URL and keep its host.
		parsed, err := url.Parse(scanner.Text())
		if err == nil {
			domainsMap[strings.ToLower(parsed.Host)] = true
		}
	}
	fmt.Println(len(domainsMap))
}

It accepts URLs piped in on stdin.
 
The scanner reads each line of the piped-in data (from cat or whatever), parses the line as a URL, takes the domain, adds it as a key in a map, and finally prints how many keys the map holds.
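
If you want to sanity-check it without piping anything in, here's the same logic run over a hardcoded list (a quick hypothetical variant, just swapping os.Stdin for a strings.NewReader):

Code:
package main

import (
	"bufio"
	"fmt"
	"net/url"
	"strings"
)

func main() {
	// Same logic as above, but reading from an in-memory string
	// instead of os.Stdin, so it's easy to test.
	input := "http://Example.com/a\nhttp://example.com/b\nhttps://other.org/c\n"
	domains := make(map[string]bool)
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		parsed, err := url.Parse(scanner.Text())
		if err == nil {
			domains[strings.ToLower(parsed.Host)] = true
		}
	}
	fmt.Println(len(domains)) // prints 2
}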
 
oh, i see, so like "| cut -d/ -f3 | sort -u | wc -l", cool go-ry bro

#100th post
 
Bumping this back up because Go is awesome, and more people should try it out.

Here, have a spin function: give it spintax like "{Hello|Hi} there" and it returns one randomly chosen variant.

Code:
package main

import (
	"math/rand"
	"regexp"
	"strings"
	"time"
)

// The original post doesn't show spinRegex; this pattern, which captures
// the contents of the innermost {a|b|c} group, is an assumption.
var spinRegex = regexp.MustCompile(`\{([^{}]+)\}`)

func spin(textInput string) string {
	rand.Seed(time.Now().UTC().UnixNano())

	// A common error is a missing closing }
	if strings.Count(textInput, "{")-strings.Count(textInput, "}") == 1 {
		textInput = textInput + "}"
	}

	count := strings.Count(textInput, "{")

	// Replace each {a|b|c} group with one randomly chosen option.
	for i := 0; i < count; i++ {
		field := spinRegex.FindStringSubmatch(textInput)
		if field == nil {
			break // unbalanced braces, nothing left to spin
		}
		replacements := strings.Split(field[1], "|")
		textInput = strings.Replace(textInput, field[0], replacements[rand.Intn(len(replacements))], 1)
	}

	return textInput
}
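
To try it, here's a quick hypothetical usage; it assumes spin and spinRegex above live in the same package and that "fmt" is added to the imports:

Code:
func main() {
	// Each run picks one option from every {a|b} group at random.
	fmt.Println(spin("{Hello|Hi|Hey} there, have a {good|great} day"))
	fmt.Println(spin("{Go|Golang} is {awesome|rad")) // the missing } gets auto-closed
}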
 
Speaking of unique domains, anyone got Python code handy to delete duplicate ones? I'm using set() to dedupe URLs, but nothing for domains yet.
 

Code:
import urlparse

# Load the URL list; normalize Windows line endings before splitting.
urls = set(open('links.txt', 'r').read().replace('\r\n', '\n').split('\n'))
seen_domains = set()
output = []

for url in urls:
	domain = urlparse.urlparse(url).netloc.lower()
	# Keep the first URL seen for each domain; skip blank lines.
	if domain and domain not in seen_domains:
		seen_domains.add(domain)
		output.append(url)

print "Found %s unique domains" % len(output)

f = open('output.txt', 'w')
for i in output:
	f.write(i + "\n")
f.close()

Will keep one URL from each domain. (Untested, but it should work. Note it's Python 2, hence urlparse and the print statement.)
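
And since this is the Go thread, here's the same keep-one-URL-per-domain idea in Go. A rough sketch, not tested in anger; the links.txt / output.txt filenames are just copied from the Python above:

Code:
package main

import (
	"bufio"
	"fmt"
	"net/url"
	"os"
	"strings"
)

func main() {
	// Read URLs from links.txt, keep the first URL seen per domain,
	// and write the survivors to output.txt.
	in, err := os.Open("links.txt")
	if err != nil {
		panic(err)
	}
	defer in.Close()

	out, err := os.Create("output.txt")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	seen := make(map[string]bool)
	scanner := bufio.NewScanner(in)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		parsed, err := url.Parse(line)
		if err != nil {
			continue
		}
		domain := strings.ToLower(parsed.Host)
		if domain == "" || seen[domain] {
			continue
		}
		seen[domain] = true
		fmt.Fprintln(out, line)
	}
	fmt.Printf("Found %d unique domains\n", len(seen))
}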