python-3.x pandas tweets

Create variables from values in another column in a pandas dataframe


I have what seems like a rather simple question, but I can't wrap my head around it.

I have a pandas dataframe for Tweets. The location of the users is registered in a variable named "Location" in various ways:

When the location is well recorded, I often get:

{'country_code': 'tr', 'state': 'Central Anatolia Region', 'county': 'Çankaya', 'city': 'Ankara'}

or

{'country_code': 'tr', 'state': 'Black Sea Region', 'city': 'Trabzon'}

But sometimes, all I get is:

{'country_code': 'tr'}

{'country_code': 'tr', 'state': 'Batman'}

and often, there's nothing and all that's registered is this:

{}

I want to write a script that can create new variables in my pandas dataframe for these individual values. In other words, if country_code is registered for a specific row, then I want the value in question to be recorded in a variable named country_code. And so on for state, county, and city. If nothing is there, it can simply input a blank or an NA for all the missing variables in question (county, state, city).

The end result should be four new variables in my dataframe (country_code, state, county, and city), filled from the values registered in the "Location" variable and left empty wherever nothing is registered.
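For reference, here is a minimal sketch of one way this could be done in pandas. It assumes each "Location" cell holds either a Python dict or the string representation of one; the sample dataframe and the `to_dict` helper are hypothetical and only mirror the examples above.

```python
import ast

import pandas as pd

# Hypothetical sample data mirroring the examples in the question.
df = pd.DataFrame({
    "Location": [
        {"country_code": "tr", "state": "Central Anatolia Region",
         "county": "Çankaya", "city": "Ankara"},
        "{'country_code': 'tr', 'state': 'Black Sea Region', 'city': 'Trabzon'}",
        {"country_code": "tr"},
        {},
    ]
})


def to_dict(value):
    """Return value as a dict; fall back to an empty dict if it cannot be parsed."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        try:
            parsed = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


keys = ["country_code", "state", "county", "city"]
locations = df["Location"].apply(to_dict)

# dict.get returns None for an absent key, so missing fields become NA.
for key in keys:
    df[key] = locations.apply(lambda loc, k=key: loc.get(k))

print(df[keys])
```

Listing the four keys explicitly guarantees that all four columns exist even if, say, no row in the data happens to contain a county.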

Can someone help by any chance?

Thank you so much!


Solution

  • I was able to fix the problem by working with the original JSON file directly. Instead of using pandas-specific functions to split the data registered in the "Location" variable into separate variables in my dataset, I stored the location data in the different categories I was looking for with a for loop and if checks, similar to what others suggest here.
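The loop-based approach described above might look roughly like the sketch below. The layout of the original JSON file is not shown, so this assumes a list of tweet objects that each may carry a "Location" dict; the `raw` string and the "id" field are hypothetical stand-ins for the real file contents.

```python
import json

import pandas as pd

# Hypothetical stand-in for the contents of the original JSON file.
raw = """
[
  {"id": 1, "Location": {"country_code": "tr", "state": "Batman"}},
  {"id": 2, "Location": {}},
  {"id": 3}
]
"""
tweets = json.loads(raw)

keys = ["country_code", "state", "county", "city"]
rows = []

for tweet in tweets:
    # Treat a missing or empty "Location" as an empty dict.
    location = tweet.get("Location") or {}
    row = {"id": tweet.get("id")}
    for key in keys:
        if key in location:
            row[key] = location[key]
        else:
            row[key] = None  # nothing registered for this field
    rows.append(row)

df = pd.DataFrame(rows, columns=["id"] + keys)
print(df)
```

Building the rows as plain dicts first and creating the dataframe once at the end keeps the parsing logic independent of pandas.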