I'm relatively new to Python and pandas, so I'd appreciate some input here. I have multiple files whose names combine text, numbers, and a date. I want to rename them to a standard format: each word capitalized, words joined with underscores, and surrounding white space trimmed. For example:
FileName- ARA Inoc Start Times V34 20200418.xlsx to be named as Ara_Inoc_Start_Time_V34_20200418.xlsx
FileName- Batch Start Time V3 20200418.xlsx to be named as Batch_Start_Time_V3_20200418.xlsx
The challenges I'm facing are: 1) How do I add an underscore before the date? 2) For a filename containing a word like ARA Inoc Start, my code converts it to A_R_A _Inoc _Start. How can I adapt it to produce Ara_Inoc? This would involve trimming the white space as well. How can I add this to my current code?
import os

def change_case(name):
    # Inserts '_' before every uppercase letter, which is why
    # 'ARA Inoc' becomes 'A_R_A _Inoc'.
    res = [name[0].upper()]
    for c in name[1:]:
        if c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
            res.append('_')
            res.append(c.upper())
        else:
            res.append(c)
    return ''.join(res)

# Driver code
for filename in os.listdir("C:\\Users\\t\\Documents\\DummyData\\"):
    print(change_case(filename))
Split the string using str.split(), convert the first letter of each word using str.upper(), then join the words using str.join():
import os
for filename in [
    ' ARA Inoc Start Times V34 20200418.xlsx ',
    ' Batch_Start_Time_V3_20200418.xlsx '
]:  # os.listdir('C:\\Users\\t\\Documents\\DummyData\\')
    new_filename = '_'.join([i[:1].upper()+i[1:].lower() for i in filename.strip().split()])
    print(new_filename)
Output:
Ara_Inoc_Start_Times_V34_20200418.xlsx
Batch_start_time_v3_20200418.xlsx
Note the use of i[:1].upper()+i[1:].lower() instead of str.title(). You can use the latter, but it would also change the file extension to title case (.xlsx becomes .Xlsx), which is why I used the former. Alternatively, you can split the filename from the extension before doing the conversion:
import os
for filename in [
    ' ARA Inoc Start Times V34 20200418.xlsx ',
    ' Batch_Start_Time_V3_20200418.xlsx '
]:
    # Strip first so trailing white space does not end up in the extension.
    filename, ext = filename.strip().rsplit('.', 1)
    filename = '_'.join([i.title() for i in filename.lower().split()])
    new_filename = '.'.join([filename, ext])
    print(new_filename)
Output:
Ara_Inoc_Start_Times_V34_20200418.xlsx
Batch_Start_Time_V3_20200418.xlsx
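If you want a single reusable helper, the two approaches above can be combined roughly as follows. This is only a sketch: normalize_filename is a hypothetical name, and it assumes that both white space and underscores should be treated as word separators and that the extension should be left untouched. It uses os.path.splitext and re.split from the standard library.

```python
import os
import re


def normalize_filename(filename):
    """Hypothetical helper: capitalize each word and join with underscores."""
    # Trim surrounding white space, then split off the extension
    # so it is left untouched.
    stem, ext = os.path.splitext(filename.strip())
    # Assumption: runs of white space and/or underscores separate words.
    words = [w for w in re.split(r'[\s_]+', stem) if w]
    # str.capitalize() uppercases the first character and lowercases the rest.
    return '_'.join(w.capitalize() for w in words) + ext


for name in [
    ' ARA Inoc Start Times V34 20200418.xlsx ',
    ' Batch_Start_Time_V3_20200418.xlsx ',
]:
    print(normalize_filename(name))
```

Because the date is just another white-space-separated word, it gets its underscore from the join step like every other word, so no special handling is needed for it.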