Pandas data cleansing to assign records below a row are assigned a particular value

In the below code, I'm using simple data manipulation to split the columns and remove unnecessary characters.

input_uni_towns = pd.read_fwf("university_towns.txt", sep = " ", header = None)
uni_towns = input_uni_towns.rename(columns={0: "Raw_data"})
uni_towns['Cleaned'] = uni_towns["Raw_data"].replace(regex=True,to_replace=[r'\[[^()]*\]'],value=r'')
uni_towns[["State","University"]] = uni_towns.Cleaned.str.split("(",n=1,expand=True) 
uni_towns["University"] = uni_towns["University"].str.rstrip(')')
cleaned_uni_towns = uni_towns[["State","University"]]

After this above step, I want to assign State to records whose above record has None assigned to it. For Ex: Auburn (Auburn University) current State is Auburn, but I want this to be updated to Alabama and similarly for records below Alabama till code encounters next State i.e. Alaska

This is the current output

This is the expected output

Solution

You can keep put NaN in State when University is not None (i.e. that's a row with a university), then fill the NaN in State with the most recent non-NaN value.

df = pd.DataFrame({'Cleaned': ['Alabama', 'Auburn (Auburn University)', 'Alaska']})
df[['State', 'University']] = df.Cleaned.str.split('(', n = 1, expand = True)
df.University = df.University.str.rstrip(')')

df.State = np.where(df.University.map(lambda u: u is None), df.State, np.nan)
df.State = df.State.fillna(method = 'ffill')