I have a data frame called df that looks like this:
Text No
c0404079=0.00 34
c1444716<=0.00 45
1.0<c00226311 <= 0.00 36
c0001208 <= 0.00 32
0.00<c0243026<=2.00 85
c0036983 <= 0.00 55
c0036974=0.00 39
I want to create a new column in that df called "Code".
The code is the part of the first column that starts with the letter c and runs until the first non-alphanumeric character or the end of the line,
so the dataframe will be:
c0404079=0.00 34 c0404079
c1444716<=0.00 45 c1444716
1.0<c00226311 <= 0.00 36 c00226311
c0001208 <= 0.00 32 c0001208
0.00<c0243026<=2.00 85 c0243026
c0036983 <= 0.00 55 c0036983
c0036974=0.00 39 c0036974
Any idea how to do that?
I tried this, but I did not get the right results:
df['Code'] = df['Text'].str.extract(r'c^(\d[^\W_]{5,})')
Given your df, here is how to get everything from the letter c up to the first non-alphanumeric character:
df['extracted'] = df['text'].str.extract(r'(c[^\W]+)')
text extracted
0 c1444716<=0.00 c1444716
1 1.0c00226311 <= 0.00 c00226311
2 0.00<c0243026<=2.00 c0243026
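For completeness, here is a minimal self-contained sketch applied to the column names from the question ("Text", "No", "Code"). The sample frame is rebuilt by hand from the rows in the question, so treat it as an assumption about the real data. Two details may be worth knowing: in the original attempt, the `^` placed after `c` is a start-of-string anchor, so that pattern can never match; and `[^\W]` is equivalent to `\w`, which also matches underscores. The sketch uses `[^\W_]` so the match stops at an underscore as well, which may or may not matter for your data.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame).
df = pd.DataFrame({
    "Text": [
        "c0404079=0.00",
        "c1444716<=0.00",
        "1.0<c00226311 <= 0.00",
        "c0001208 <= 0.00",
        "0.00<c0243026<=2.00",
        "c0036983 <= 0.00",
        "c0036974=0.00",
    ],
    "No": [34, 45, 36, 32, 85, 55, 39],
})

# Capture the letter c followed by one or more alphanumeric characters.
# [^\W_] means "a word character that is not an underscore".
# expand=False returns a Series, since there is a single capture group.
df["Code"] = df["Text"].str.extract(r"(c[^\W_]+)", expand=False)

print(df)
```

Rows where no match is found get NaN in the new column, so a quick `df["Code"].isna().sum()` is a cheap way to check whether any line was missed.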