python regex dataframe data-science data-processing

Keep only the data inside square brackets in a data frame using Python (regex)

I have been working with a data frame in which each record has useful information inside square brackets and non-useful information outside them.

Sample Data frame:

 Record        Data
      1          Rohan is [age:10] with [height:130 cm].
      2          Girish is [age:12] with [height:140 cm].
      3          Both kids live in [location:Punjab] and [location:Delhi].
      4          They love to play [Sport:Cricket] and [Sport:Football].

Expected Output:

 Record        Data
      1          [age:10],[height:130 cm]
      2          [age:12],[height:140 cm]
      3          [location:Punjab],[location:Delhi]
      4          [Sport:Cricket],[Sport:Football]

I have been trying this but cannot get the desired output.

df['b'] = df['Record'].str.findall('([[][a-z \s]+[]])', expand=False).str.strip()
print(df['b'])

That doesn't seem to work.

I am new to Python.

Solution

I believe you need str.findall followed by str.join on the Data column:

df['b'] = df['Data'].str.findall(r'(\[.*?\])').str.join(', ')
print(df)

   Record                                               Data  \
0       1            Rohan is [age:10] with [height:130 cm].   
1       2           Girish is [age:12] with [height:140 cm].   
2       3   Both kids live in [location:Punjab] and [Delhi].   
3       4  They love to play [Sport:Cricket] and [Sport:F...   

                                   b  
0          [age:10], [height:130 cm]  
1          [age:12], [height:140 cm]  
2         [location:Punjab], [Delhi]  
3  [Sport:Cricket], [Sport:Football]
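
To see what the pattern does on a single string, here is a small sketch using the standard library's re.findall, which Series.str.findall applies to each element. The variable names are only for illustration. It compares the non-greedy pattern used above with a greedy one, and shows how moving the capture group inside the brackets changes what is returned.

```python
import re

text = "Rohan is [age:10] with [height:130 cm]."

# Non-greedy ".*?" stops at the first closing bracket,
# so each bracketed chunk is matched separately.
with_brackets = re.findall(r"\[.*?\]", text)

# With the capture group inside the brackets,
# findall returns only the inner text.
without_brackets = re.findall(r"\[(.*?)\]", text)

# Greedy ".*" for comparison: a single match running from
# the first "[" to the last "]".
greedy = re.findall(r"\[.*\]", text)

print(with_brackets)     # ['[age:10]', '[height:130 cm]']
print(without_brackets)  # ['age:10', 'height:130 cm']
print(greedy)            # ['[age:10] with [height:130 cm]']
```

The non-greedy quantifier is what keeps the text between two bracketed items (" with " here) out of the result.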

If you need the values as lists (without the brackets):

df['b'] = df['Data'].str.findall(r'\[(.*?)\]')
print(df)

   Record                                               Data  \
0       1            Rohan is [age:10] with [height:130 cm].   
1       2           Girish is [age:12] with [height:140 cm].   
2       3   Both kids live in [location:Punjab] and [Delhi].   
3       4  They love to play [Sport:Cricket] and [Sport:F...   

                                 b  
0          [age:10, height:130 cm]  
1          [age:12, height:140 cm]  
2         [location:Punjab, Delhi]  
3  [Sport:Cricket, Sport:Football]
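
For a self-contained check, the sketch below rebuilds the sample data from the question (the column names Record and Data are taken from it) and joins the matches with "," and no space, which is the format shown in the expected output. Note that it reads from the Data column, whereas the attempt in the question read from Record.

```python
import pandas as pd

# Sample data as given in the question.
df = pd.DataFrame({
    "Record": [1, 2, 3, 4],
    "Data": [
        "Rohan is [age:10] with [height:130 cm].",
        "Girish is [age:12] with [height:140 cm].",
        "Both kids live in [location:Punjab] and [location:Delhi].",
        "They love to play [Sport:Cricket] and [Sport:Football].",
    ],
})

# Find every bracketed chunk, then join each row's matches with a comma.
df["b"] = df["Data"].str.findall(r"\[.*?\]").str.join(",")

print(df["b"].tolist())
```

To replace the original text instead of adding a new column, assign the result back to df["Data"].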