
**kwds not working in pybliometrics.scopus.ScopusSearch


I am currently using pybliometrics.scopus.ScopusSearch to get all results for a search. I want to constrain my search to a specific date range. However, the date keyword argument (passed through **kwds) is not working.

My code currently looks like this:

from pybliometrics.scopus import ScopusSearch

q = "TITLE(ecotourism) AND TITLE-ABS-KEY(quant*)"
scopus_search = ScopusSearch(q, date='2002-2007')

However, this also returns documents outside the specified date range. In fact, it returns every document that matches the query, regardless of publication date. I do not want to put the dates in the query itself, as Scopus allows a maximum of 8 connectors and my other queries are going to be much more complex than this example.

Any help is appreciated!


Solution

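  • The kwds do work, but they have no effect when cached results are reused. pybliometrics stores the results of each query on disk and, judging by the behaviour shown here, looks them up by the query string alone, so a later call with different kwds returns whatever was downloaded first. You probably ran the query without the kwds first, didn't you? Pass refresh=True together with the kwds and the results are downloaded again with the filter applied.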

    Compare this:

    import pandas as pd
    from pybliometrics.scopus import ScopusSearch
    
    q = "TITLE(ecotourism) AND TITLE-ABS-KEY(quant*)"
    
    # 1. Nothing cached yet: the date filter is sent to Scopus and applied.
    scopus_search = ScopusSearch(q, date='2002-2007')
    df = pd.DataFrame(scopus_search.results)
    df["year"] = df["coverDate"].str[:4].astype(int)
    print(df["year"].max())
    # 2007
    
    # 2. Same query without kwds: the cached (filtered) results are reused.
    scopus_search = ScopusSearch(q)
    df = pd.DataFrame(scopus_search.results)
    df["year"] = df["coverDate"].str[:4].astype(int)
    print(df["year"].max())
    # 2007
    
    # 3. refresh=True without kwds: everything is downloaded again, unfiltered.
    scopus_search = ScopusSearch(q, refresh=True)
    df = pd.DataFrame(scopus_search.results)
    df["year"] = df["coverDate"].str[:4].astype(int)
    print(df["year"].max())
    # 2024
    
    # 4. date kwd without refresh: the cached (unfiltered) results are reused.
    scopus_search = ScopusSearch(q, date='2002-2007')
    df = pd.DataFrame(scopus_search.results)
    df["year"] = df["coverDate"].str[:4].astype(int)
    print(df["year"].max())
    # 2024
    
    # 5. date kwd with refresh=True: the filter is applied again.
    scopus_search = ScopusSearch(q, date='2002-2007', refresh=True)
    df = pd.DataFrame(scopus_search.results)
    df["year"] = df["coverDate"].str[:4].astype(int)
    print(df["year"].max())
    # 2007
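
If you would rather keep one unfiltered download in the cache and try several date ranges without refreshing each time, you could also filter the cached results locally. This is only a minimal sketch and not part of pybliometrics: `filter_by_year` and the `Document` stand-in are hypothetical names, the sample records are made up, and it assumes each result has a `coverDate` string starting with a four-digit year, as the code above does.

```python
from collections import namedtuple

# Hypothetical stand-in for the namedtuples in ScopusSearch(...).results;
# only the two fields used in this sketch are modelled.
Document = namedtuple("Document", ["eid", "coverDate"])


def filter_by_year(results, first_year, last_year):
    """Keep documents whose coverDate year lies in [first_year, last_year]."""
    kept = []
    for doc in results:
        # coverDate is assumed to be a string such as "2005-03-01".
        if not doc.coverDate:
            continue  # skip records without a date
        year = int(doc.coverDate[:4])
        if first_year <= year <= last_year:
            kept.append(doc)
    return kept


# Made-up sample records, for illustration only.
sample = [
    Document("eid-1", "2001-12-31"),
    Document("eid-2", "2002-01-01"),
    Document("eid-3", "2007-06-15"),
    Document("eid-4", "2024-02-01"),
]

in_range = filter_by_year(sample, 2002, 2007)
print([doc.eid for doc in in_range])
# ['eid-2', 'eid-3']
```

With real data you would pass `scopus_search.results or []` instead of `sample` (the `or []` guards against a search with no results). This only gives the full picture if the cached download was made without a date restriction, so the refresh caveat above still applies.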