Develop programs that examine all elements of a dataset. Keep track of what you've seen, how often you've seen it, and in what context it appears.
The more "semantic" your analysis, the more difficult this becomes. Work from structure toward the semantic items that interest you. Count what you are skipping to keep your results in perspective.
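The whole pattern fits in one pipeline: sort the items, count adjacent duplicates, then sort by count. A minimal sketch with made-up placeholder data (the fruit names are not from any log):

```shell
# Count occurrences of each distinct item, rarest first.
# The fruit names are placeholder data for illustration.
printf '%s\n' apple banana apple cherry apple banana \
  | sort | uniq -c | sort -n
```

Each output line is a count followed by the item, so the most frequent items land at the bottom where your eye ends up.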
Let's ask: who is scraping my site?
ssh c2.com 'head -1000 /var/log/httpd/access_log' \
  | perl -pe 's/ .*//' | sort | uniq -c | sort -n
Collapse hyphenated numbers in domain names.
s/-\d+/-/g
Grep for a mysterious site.
grep '192\.243\.55'
Hmm. 15,000 hits today?
1385 192.243.55.131
1388 192.243.55.133
1456 192.243.55.138
1479 192.243.55.136
1483 192.243.55.129
1555 192.243.55.130
1568 192.243.55.132
1576 192.243.55.137
1582 192.243.55.134
1594 192.243.55.135
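As a sanity check, you can total the first column of the counts shown above with awk:

```shell
# Sum the per-address counts (first column).
awk '{sum += $1} END {print sum}' <<'EOF'
1385 192.243.55.131
1388 192.243.55.133
1456 192.243.55.138
1479 192.243.55.136
1483 192.243.55.129
1555 192.243.55.130
1568 192.243.55.132
1576 192.243.55.137
1582 192.243.55.134
1594 192.243.55.135
EOF
```

The total is 15,066: ten addresses on one /24 block, each pulling roughly 1,500 pages.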
See Graph a Structure with Graphviz for presenting structured counts.