Tutorial: theHarvester – Collect a Company’s Email Addresses, Subdomains, Related Servers

Information gathering, the footprinting and scanning phase, is of the utmost importance. Good information gathering can make the difference between a successful penetration test and one that fails to provide maximum benefit to the client. Information is a weapon: a successful penetration test needs a lot of relevant information, which is why information gathering, also called footprinting, is the first step of hacking. Gathering valid login names and email addresses is therefore one of the most important parts of a penetration test. We can use them to profile our target, brute force authentication systems, send client-side attacks (through phishing), look through social networks for juicy info on platforms and technologies, etc.

What is theHarvester?

theHarvester was developed in Python by Christian Martorella. It is a tool that gathers e-mail accounts, user names and hostnames/subdomains from different public sources such as search engines and PGP key servers.

This tool is designed to help the penetration tester at an early stage of a test; it is effective, simple and easy to use. The sources supported are:

  • Google – emails, subdomains/hostnames
  • Google profiles – Employee names
  • Bing search – emails, subdomains/hostnames, virtual hosts
  • PGP servers – emails, subdomains/hostnames
  • LinkedIn – Employee names
  • Exalead – emails, subdomains/hostnames

New features:

  • Time delays between requests
  • XML results export
  • Search a domain in all sources
  • Virtual host verifier

Getting Started:

Go to Arsenal -> scanning -> web scanner -> theharvester.
If it is not available in your distribution, you can download it from http://code.google.com/p/theharvester/downlaod, where the latest version, 2.2, is available; simply download it and extract it.
Give theHarvester.py execute permission with [chmod 755 theHarvester.py].
Then run ./theHarvester.py without arguments; it will display the version and the options that can be used with the tool, each with a detailed description.

Example 1:

Command Syntax:

theHarvester -d [url] -l 300 -b [search engine name]

theHarvester -d matriux.com -l 300 -b google

In the above command:

  • -d [url] is the domain of the remote site from which you want to fetch the juicy information.
  • -l limits the search to the specified number of results.
  • -b specifies the name of the search engine (data source) to query.

From the email addresses returned, we can identify the pattern of the email addresses assigned to the employees of the organization. For example, some companies use the firstname.lastname@domain.com pattern, which can be useful for brute forcing the account of a specific person. The host information can be used to scan specific systems.
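The pattern-spotting step can be sketched in Python. This is a minimal illustration and not part of theHarvester: the function names, the short list of naming conventions, and the sample names and addresses are all assumptions made for the example. It takes a few employees whose name and address are both known, finds the most common convention, and applies it to another name.

```python
from collections import Counter

# Hypothetical naming conventions to test against; extend the table as needed.
PATTERNS = {
    "firstname.lastname": lambda f, l: f"{f}.{l}",
    "lastname.firstname": lambda f, l: f"{l}.{f}",
    "firstinitial+lastname": lambda f, l: f"{f[0]}{l}",
    "firstname": lambda f, l: f,
}


def guess_pattern(known):
    """Return the most common convention among (first, last, email) tuples.

    Returns None when no address matches any convention in PATTERNS.
    """
    counts = Counter()
    for first, last, email in known:
        f, l = first.strip().lower(), last.strip().lower()
        if not f or not l:
            continue  # skip incomplete names
        local_part = email.split("@", 1)[0].lower()
        for name, build in PATTERNS.items():
            if build(f, l) == local_part:
                counts[name] += 1
                break  # count each address once
    return counts.most_common(1)[0][0] if counts else None


def candidate_address(first, last, pattern, domain):
    """Build the address a person would have under the given convention."""
    local_part = PATTERNS[pattern](first.strip().lower(), last.strip().lower())
    return f"{local_part}@{domain}"


if __name__ == "__main__":
    # Made-up sample data, not real harvested output.
    known = [
        ("Alice", "Smith", "alice.smith@example.com"),
        ("Bob", "Jones", "bob.jones@example.com"),
        ("Carol", "White", "cwhite@example.com"),
    ]
    pattern = guess_pattern(known)
    print(pattern)
    print(candidate_address("Dave", "Brown", pattern, "example.com"))
```

Names could come from the LinkedIn source and addresses from the search engine sources; how well the guess holds depends on how consistently the organization applies its convention.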

Example 2:

Search all search engines.
Command:

theHarvester -d gtu.ac.in -l 300 -b all

This command will grab information from all the search engines supported by the installed version of theHarvester.

Example 3:

Save the results in an HTML file. Command:

theHarvester.py -d gtu.ac.in -l 300 -b all -f hackguru

The -f parameter saves the results to an HTML file, as shown in this example, where the file is named hackguru.
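The three example commands differ only in the values passed to -d, -l, -b and -f, so they are easy to generate from a script. The following is a rough sketch: the helper name build_harvester_command and its defaults are assumptions for illustration, and only the four options described in this tutorial are used.

```python
import shlex


def build_harvester_command(domain, source="google", limit=300,
                            report=None, script="theHarvester.py"):
    """Return the argument list for one theHarvester run.

    Only the options covered in this tutorial are used:
    -d (domain), -l (result limit), -b (source), -f (report file).
    """
    if not domain:
        raise ValueError("domain is required")
    argv = [script, "-d", domain, "-l", str(limit), "-b", source]
    if report:
        argv += ["-f", report]
    return argv


if __name__ == "__main__":
    # Rebuild the three examples from this tutorial and print them.
    examples = [
        build_harvester_command("matriux.com"),
        build_harvester_command("gtu.ac.in", source="all"),
        build_harvester_command("gtu.ac.in", source="all", report="hackguru"),
    ]
    for argv in examples:
        print(shlex.join(argv))
```

Because the function returns a list and not a single string, it can be handed directly to subprocess.run(argv) without shell quoting problems. Adjust the script argument to wherever theHarvester.py lives on your system.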

Conclusion

theHarvester is a handy tool that quickly fetches juicy information from public resources by active or passive means.

Suggestion

Exposed personal information is an advantage for every social engineer. Everything that you post on the Internet may stay there forever. So before you post something personal, think twice about whether it is really necessary to let other people know about you and your activities. Using different email addresses and usernames will also make the work of social engineers much more difficult.
