I am trying to convert the items in the variable Formula_bit into variable-like names, where everything is lowercase and words are separated by _. My process is as follows: split the right-hand side by operators (+, -, *, /) or x (multiplication), convert the resulting items to lowercase, replace spaces with underscores, and remove opening and closing parentheses. Finally, remove any leading and trailing underscores. However, my output and expected output don't match. What could I do to fix this?
import re
Formula_bit = ['Σ (Dividends)', 'Dividend Payout Ratio * eps']
# Process the right-hand side of each formula to extract parameters
params = [
re.split(r'\s*[+\-*/]\s*| x ', re.sub(r'[+\-*/]', ',', item))[0] # Split the right-hand side by operators (+, -, *, /) or 'x' (multiplication)
.lower() # Convert to lowercase
.replace(" ", "_") # Replace spaces with underscores
.replace("(", "") # Remove opening parentheses
.replace(")", "") # Remove closing parentheses
for item in Formula_bit
]
# Remove leading and trailing underscores from each item and strip whitespace
params = [item.lstrip('_').rstrip('_').strip() for item in params]
Output:
['σ_dividends', 'dividend_payout_ratio_,_eps']
Expected output:
['σ_dividends', 'dividend_payout_ratio', 'eps']
Here is an example that converts the formulas to variable names:
import re
import string
Formula_bit = ['Σ (Dividends)', 'Dividend Payout Ratio * eps'] # Input formulas
splitter = "_" # Splitter character for replacing spaces
formula = ",".join(Formula_bit) # Join the formulas into a single string
formula = re.sub(r"[()]", "", formula.lower()) # Lowercase the formula string and remove parentheses
formula = re.sub(r"\s", splitter, formula) # Replace whitespace characters with the splitter
punctuation = string.punctuation.replace(splitter, "") # Punctuation excluding the splitter
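formula = re.sub(fr"[{re.escape(punctuation)}]", ",", formula) # Replace each remaining punctuation character (operators included) with a comma; re.escape keeps ], \, ^ and - literal inside the character class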
params = [s.strip(splitter) for s in formula.split(",")] # Split the formula string on commas to extract the parameters and strip splitter characters
print(params)
# ['σ_dividends', 'dividend_payout_ratio', 'eps']
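For comparison, here is a sketch that stays closer to the process described in the question. The original attempt appears to fail for two reasons: the inner re.sub turns every operator into a comma before re.split runs, so there is nothing left to split on, and the [0] keeps only one piece per formula, so the list can never grow beyond one entry per input item. The helper name to_params is hypothetical, and the sketch assumes that x only counts as multiplication when it is surrounded by whitespace.

```python
import re

formula_bit = ['Σ (Dividends)', 'Dividend Payout Ratio * eps']

def to_params(formulas):
    """Hypothetical helper: split each formula on operators, then normalise every piece."""
    params = []
    for item in formulas:
        # Split on +, -, *, / (optional surrounding whitespace) or a standalone 'x'
        for piece in re.split(r'\s*[+\-*/]\s*|\s+x\s+', item):
            name = re.sub(r'[()]', '', piece)                  # drop parentheses
            name = re.sub(r'\s+', '_', name.strip().lower())   # lowercase, whitespace -> underscore
            name = name.strip('_')                             # trim leading/trailing underscores
            if name:                                           # skip empty pieces
                params.append(name)
    return params

print(to_params(formula_bit))
# ['σ_dividends', 'dividend_payout_ratio', 'eps']
```

Because the pieces are collected in a flat list, a formula with several operands contributes several names. Note that a hyphen inside a name would also be treated as a minus sign by this pattern.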
Note that one check is missing here: to be a valid variable name, the first character must be a letter or an underscore, not a digit.
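One possible way to add that check is sketched below, using the built-in str.isidentifier() and keyword.iskeyword(). The helper name make_identifier and the choice of an underscore prefix are assumptions made for illustration, not part of the original example.

```python
import keyword

def make_identifier(name, prefix="_"):
    """Hypothetical helper: adjust `name` so that it is a valid Python identifier."""
    # A leading digit is not allowed, so prepend the (assumed) prefix
    if name and name[0].isdigit():
        name = prefix + name
    # Reserved words such as 'class' are syntactically identifiers but unusable as names
    if keyword.iskeyword(name):
        name = name + "_"
    # Anything still invalid (empty string, stray punctuation) is reported, not guessed at
    if not name.isidentifier():
        raise ValueError(f"cannot turn {name!r} into a valid identifier")
    return name

print(make_identifier("eps"))            # eps
print(make_identifier("3_year_growth"))  # _3_year_growth
print(make_identifier("class"))          # class_
```

Names such as 'σ_dividends' pass unchanged, since Python 3 accepts non-ASCII letters in identifiers.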