python dataframe object split calculated-columns

Need help splitting a column in my DataFrame (Python)

I have a pandas DataFrame "dt". One of its columns, "betName", holds strings (object dtype) that sometimes have a +/- number after the name. I'm trying to figure out how to split "betName" into 2 columns, "betName" and "line", where "betName" is just the name and "line" holds the +/- number or plain number.

Please see screenshots, thank you for helping!

[Screenshot: example of problem and desired result]

[Screenshot: dt["betName"]]

Solution

Try this (updated) code:

df2 = dt['betName'].str.split(r' (?=[+-]\d{1,}\.?\d{,}?)', expand=True).astype('str')
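A minimal runnable sketch of the line above. The sample rows are made up for illustration, since the real values were only shown in the screenshots; the column names "betName" and "line" come from the question.

```python
import pandas as pd

# Hypothetical sample data; the real values were only shown in screenshots.
dt = pd.DataFrame({"betName": ["Lakers -3.5", "Celtics +7", "Draw"]})

# Split on the space that is immediately followed by a signed number.
parts = dt["betName"].str.split(r" (?=[+-]\d{1,}\.?\d{,}?)", expand=True)

# Column 0 holds the name, column 1 the signed number
# (missing for rows that had no signed number).
dt["betName"] = parts[0]
dt["line"] = parts[1]

print(dt)
```

Note that if no row contains a signed number, the split produces only one column, so parts[1] would not exist; checking parts.shape[1] first is one way to guard against that.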

Explanation. str.split splits the text in each row into 2 or more columns wherever the regular expression matches:

  (?=[+-]\d{1,}\.?\d{,}?)

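' ' - A literal space comes first. It is the only part of the pattern that is consumed, so it is the only thing the split removes.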

() - Indicates the start and end of a group.

?= - Lookahead assertion. Matches if ... matches next, but doesn’t consume any of the string.

[+-] - a set of characters. It will match + or -.

\d{1,} - \d is a digit from 0 to 9, and {min,max} sets how many times it repeats. Here it means one or more digits: 1, 200, 4000, etc. (the same as \d+).

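\.? - \. matches a literal dot, and ? means 0 or 1 repetitions of the preceding group or symbol, so the decimal point is optional.

\d{,}? - zero or more digits after the dot ({,} leaves both bounds open, like \d*). The trailing ? makes the repetition lazy, which does not change the result inside a lookahead.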
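To see why the lookahead matters, the sketch below applies the same pattern to a single made-up string with Python's built-in re module, and then repeats the split with the lookahead removed.

```python
import re

text = "Lakers -3.5"  # made-up example value

# With the lookahead, only the space is consumed, so the number survives.
with_lookahead = re.split(r" (?=[+-]\d{1,}\.?\d{,}?)", text)
print(with_lookahead)  # ['Lakers', '-3.5']

# Without it, the number is part of the separator and is dropped.
without_lookahead = re.split(r" [+-]\d{1,}\.?\d{,}", text)
print(without_lookahead)  # ['Lakers', '']
```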

str.split(pat=None, n=-1, expand=False)

pat - string or regular expression to split on. If not specified, split on whitespace.

n - limits the number of splits in the output. None, 0 and -1 are all interpreted as "return all splits".

expand - whether to expand the split strings into separate columns.

True places the split pieces in separate columns.
False returns a Series/Index in which each row holds a list of strings.
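The difference between the two expand settings can be sketched as follows. The two sample values are made up for illustration.

```python
import pandas as pd

s = pd.Series(["Lakers -3.5", "Draw"])  # made-up values
pattern = r" (?=[+-]\d{1,}\.?\d{,}?)"

# expand=False: one list of pieces per row, all in a single Series.
as_lists = s.str.split(pattern, expand=False)

# expand=True: a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values.
as_columns = s.str.split(pattern, expand=True)

print(as_lists.tolist())  # [['Lakers', '-3.5'], ['Draw']]
print(as_columns.shape)   # (2, 2)
```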

.astype('str') converts every column of the resulting DataFrame to string type.
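The pattern above only splits before a number that carries a + or - sign, while the question also mentions plain numbers. One possible variant, which is not part of the original answer, makes the sign optional and requires the number to end the string so that digits inside a name are left alone. The pattern and the sample rows below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data, including a plain (unsigned) number
# and a name that itself contains a digit.
dt = pd.DataFrame({"betName": ["Lakers -3.5", "Over 210.5", "Team 7 +2", "Draw"]})

# Assumed variant: optional sign, and the number must end the string,
# so the "7" in "Team 7 +2" is not split off.
pattern = r" (?=[+-]?\d+(?:\.\d+)?$)"

parts = dt["betName"].str.split(pattern, expand=True)
dt["betName"] = parts[0]
dt["line"] = parts[1]

print(dt)
```

Whether this variant fits depends on the real data; names that legitimately end in a number would be split as well.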

The output.