Tags: python, api, pandas, numpy, flickr

How to feed array of user_ids to flickr.people.getInfo()?


I have been working on extracting Flickr users' locations (not latitude and longitude, but each person's country) from their user_ids. I have built a dataframe (here's the dataframe) consisting of the photo id, the owner, and a few other columns. My plan was to pass each owner to a flickr.people.getInfo() query by iterating over the owner column of the dataframe. Here is my attempt:

for index, row in df.iterrows():
     A=np.array(df["owner"])
for i in range(len(A)):
    B=flickr.people.getInfo(user_id=A[i])

Unfortunately, this yields only one result. After careful examination, I found that it belongs to the last user in the dataframe. My dataframe has 250 observations, and I don't know how to extract the others. Any help is appreciated.


Solution

  • It seems you forgot to store the results while iterating over the dataframe: B is overwritten on every pass of the loop, so only the last response survives. I haven't used the API, but I think this snippet should do it.

    result_dict = {}
    for idx, owner in df['owner'].items():
        result_dict[owner] = flickr.people.getInfo(user_id=owner)
    

    The results are stored in a dictionary where the user id is the key.

    EDIT:

    Since the result is JSON, you can use the read_json function to parse it. Example:

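    import io
    import json

    result_list = []
    for idx, owner in df['owner'].items():
        raw = json.dumps(flickr.people.getInfo(user_id=owner))
        # StringIO: newer pandas versions deprecate passing a literal JSON string.
        result_list.append(pd.read_json(io.StringIO(raw), typ='series', orient='index'))
        # You may have to set the orient parameter.
        # Options for a Series: 'split', 'records', 'index'. Default is 'index'.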
    

    Note: I switched from the dictionary to a list, since it is more convenient.

    Afterwards you can concatenate the resulting pandas Series like this:

    df = pd.concat(result_list, axis=1).transpose()
    

    I added the transpose() since you probably want the ID as the index. Afterwards you should be able to sort by the column 'location'. Hope that helps.
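
To make the "store every result" idea concrete without needing API credentials, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `fake_get_info` is a hypothetical stand-in for `flickr.people.getInfo`, and the response shape (a `person` dict whose `location` holds a `_content` string) is assumed to match what a client configured for parsed JSON returns. Check it against a real response before relying on it.

```python
def fake_get_info(user_id):
    """Hypothetical stand-in for flickr.people.getInfo(user_id=...)."""
    sample = {
        "111@N01": "Berlin, Germany",
        "222@N02": "Lima, Peru",
    }
    person = {"nsid": user_id}
    # Assumption: users without a public location have no "location" key.
    if user_id in sample:
        person["location"] = {"_content": sample[user_id]}
    return {"person": person, "stat": "ok"}


def collect_locations(owners, get_info):
    """Call get_info once per unique owner and keep every result."""
    rows = []
    # dict.fromkeys de-duplicates while preserving first-seen order,
    # so each user is queried only once.
    for owner in dict.fromkeys(owners):
        info = get_info(owner)
        location = info.get("person", {}).get("location", {}).get("_content")
        rows.append({"owner": owner, "location": location})
    return rows


owners = ["111@N01", "222@N02", "111@N01", "333@N03"]
rows = collect_locations(owners, fake_get_info)
for row in rows:
    print(row)
```

With a real client, something like `collect_locations(df["owner"], lambda uid: flickr.people.getInfo(user_id=uid))` followed by `pd.DataFrame(rows)` should give a table that can be merged back onto the original dataframe on `owner`, assuming the client returns dicts rather than XML.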