List websites on Shared Servers using Bing API

Finding websites that are hosted on a particular IP address or that are hosted on a shared web server is a very useful part of information gathering during a penetration test. Bing supports searching for websites that are indexed on a particular IP address, and there are a few websites that provide this service too, for example the reverse IP domain check tool on https://www.yougetsignal.com/. There is also an Nmap script that uses https://ip2hosts.com/.

A colleague at Dionach told me about the Bing API for web search, which can return JSON formatted results of search queries. I thought it would be a nice way to automate listing websites that are indexed for a particular IP address with Python. You can manually search for these using the Bing “ip:” keyword, for example “ip:204.79.197.200”, which is for Bing itself. You can find a list of keywords for Bing at  https://onlinehelp.microsoft.com/en-us/bing/ff808421.aspx. To use the API, you have to register and choose the web search service.  It is free up to 5,000 transactions a month: https://datamarket.azure.com/dataset/bing/searchweb . Once you register, you get an API account key which you need to use to do searches.

The search query is a simple GET request, but requires basic authentication using the API account key as both the username and password. As you can see in the request URL below, the query strings you need are mainly the “Query”, which needs to be a quoted string (hence the “%27” encoded quotes), “$format” set to “json”, as JSON is easy with Python, “$top” for the number of results per page, and “$offset” to return each page of results.

<code> https://api.datamarket.azure.com/Bing/Search/v1/Web?$skip=[offset]&$top=[results_per_page]&$format=json&Query=%27[query]%27 </code>

This request is used in the Python function “bing_api_query” in the code below to iterate over each page of results and return a list of results. Each result is a dictionary of the search result attributes, for example  “Url” and “Description”. The “get_sites_on_ipaddress” function uses “bing_api_query” to get all the results for the “ip:<ipaddress>” Bing query and returns a list of unique domains for that IP address. As an example, use “get_sites_on_ipaddress(‘204.79.197.200’)” to repeat the initial example above.

<code>import urllib, urllib2, urlparse, json, base64 

def bing_api_query(query):
    “””Gets all results for a bing web query (up to 50 pages), returns HTTP code and list of result dictionaries”””
    
    api_account_key = “your_bing_api_account_key”
    headers = {“Authorization”:”Basic %s” % ( base64.b64encode(‘%s:%s’ % (api_account_key, api_account_key)) )}
    results_per_page = 50
    offset = 0
    results = []
    while offset < results_per_page * 50:
        url = “https://api.datamarket.azure.com/Bing/Search/v1/Web?$skip=%s&$top=%s&$format=json&Query=%%27%s%%27” % (offset, results_per_page, urllib.quote(query))
        req = urllib2.Request(url, None, headers)
        try:      
            f = urllib2.urlopen(req)
            code, resp = f.code, f.read()
        except urllib2.HTTPError, e:
            code, resp = e.code, e.read()
        if code == 200:
            bing_data = json.loads(resp)
            d_bing = bing_data[‘d’]
            if ‘results’ in d_bing.keys():
                results.extend(d_bing[‘results’])
            if ‘__next’ in d_bing.keys():
                offset += page_count
            else:
               break
        else: # error
            return code, resp
        
    return code, results


def get_sites_on_ipaddress(ip_address):
    “””Returns a list of website domains for a given IP address”””

    code, results = bing_api_query(“ip:” + ip_address)
    if code == 200:
        return sorted(set([urlparse.urlparse(result[‘Url’]).netloc for result in results]))
    else:
        raise NameError(‘BING ERROR: code %s – %s’ % (code, results)) 
</code>
&nbsp;

Find out how we can help with your cyber challenge

Please enter your contact details using the form below for a free, no obligation, quote and we will get back to you as soon as possible. Alternatively, you can email us directly at [email protected]