python csv python-3.5 scopus

Export Python-Scopus API results into CSV


I'm very new to Python so not sure if this can be done but I hope it can!

I have accessed the Scopus API and managed to run a search query which gives me the following results in a pandas dataframe:

                                                            search-results
entry                    [{'@_fa': 'true', 'affiliation': [{'@_fa': 'tr...
link                     [{'@_fa': 'true', '@ref': 'self', '@type': 'ap...
opensearch:Query         {'@role': 'request', '@searchTerms': 'AFFIL(un...
opensearch:itemsPerPage                                                200
opensearch:startIndex                                                    0
opensearch:totalResults                                             106652

If possible, I'd like to export all 106652 results to a CSV file so that they can be analysed. Is this possible?


Solution

  • First, you need to fetch all the results (see the comments under the question). The data you need (the search results) is in the "entry" list of each response. Extract that list and append it to an accumulator list, repeating the request until you have all the results. In the loop below, each iteration subtracts the number of items requested per page (count) from the number of results still to be downloaded.

            import json
            import requests

            # self._url, MY_API_KEY, query and view come from the surrounding class/module
            found_items_num = None  # unknown until the first response arrives
            start_item = 0
            items_per_query = 25
            max_items = 2000
            JSON = []

            print('GET data from Search API...')

            while found_items_num is None or found_items_num > 0:

                resp = requests.get(self._url,
                                    headers={'Accept': 'application/json', 'X-ELS-APIKey': MY_API_KEY},
                                    params={'query': query, 'view': view, 'count': items_per_query,
                                            'start': start_item})

                print('Current query url:\n\t{}\n'.format(resp.url))

                if resp.status_code != 200:
                    # error: report the raw body, since it may not be valid JSON
                    raise Exception('ScopusSearchApi status {0}, response body:\n{1}\n'.format(resp.status_code, resp.text))

                results = resp.json().get('search-results', {})

                # on the first call, read the actual number of results
                if found_items_num is None:
                    found_items_num = int(results.get('opensearch:totalResults', 0))
                    print('GET returned {} articles.'.format(found_items_num))

                    # check whether the number of results exceeds the given limit
                    if found_items_num > max_items:
                        print('WARNING: too many results, truncating to {}'.format(max_items))
                        found_items_num = max_items

                    if found_items_num == 0:
                        break

                # write the fetched JSON page to its own file
                out_file = str(start_item) + '.json'

                with open(out_file, 'w') as f:
                    json.dump(resp.json(), f, indent=4)

                # combine the entries of every page into a single list
                JSON += results.get('entry', [])

                # set counters for the next cycle
                found_items_num -= items_per_query
                start_item += items_per_query
                print('Still {} results to be downloaded'.format(max(found_items_num, 0)))

            # end while - finished downloading JSON data
    

    Then, outside the while loop, you can save the combined results like this:

            out_file = 'articles.json'
            with open(out_file, 'w') as f:
                json.dump(JSON, f, indent=4)
    

    Alternatively, you can convert the JSON data to CSV by following one of the many guides available online (search for 'json to csv python'). I have not tested this myself.
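
    As a rough illustration of the JSON-to-CSV step, here is a minimal sketch using only the standard library. It assumes the combined entries are a list of dicts, like the JSON list built above. The helper names (flatten, entries_to_csv) and the sample entries are made up for this example, and the field names 'dc:title', 'affilname' and 'citedby-count' are only illustrative. Nested values such as the 'affiliation' list are stored as JSON strings in a single column, which is one possible choice, not the only one.

```python
import csv
import json
import os
import tempfile


def flatten(entry):
    """Return a flat copy of entry; nested dicts/lists become JSON strings."""
    flat = {}
    for key, value in entry.items():
        if isinstance(value, (dict, list)):
            flat[key] = json.dumps(value, ensure_ascii=False)
        else:
            flat[key] = value
    return flat


def entries_to_csv(entries, csv_path):
    """Write a list of entry dicts to csv_path and return the column names."""
    rows = [flatten(entry) for entry in entries]

    # Entries do not always share the same keys, so use the union of all
    # keys (in first-seen order) as the CSV header.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


# Hypothetical sample entries standing in for the downloaded data.
sample = [
    {'@_fa': 'true', 'dc:title': 'First paper',
     'affiliation': [{'@_fa': 'true', 'affilname': 'Univ A'}]},
    {'@_fa': 'true', 'dc:title': 'Second paper', 'citedby-count': '3'},
]

# Written to the temp directory here; any writable path works.
csv_path = os.path.join(tempfile.gettempdir(), 'articles.csv')
columns = entries_to_csv(sample, csv_path)
print(columns)
```

    With the real data you would pass the JSON list (or the list loaded back from articles.json with json.load) instead of sample. If you would rather have nested fields expanded into separate columns, pandas offers json_normalize for that, but I have not tried it on Scopus output.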