I am having difficulty renaming JSON files spread across a large number of subfolders. I want to replace the number in each file name with a running count, because every one of the .json files is named message_1.json within its respective folder.
Here Person_1, Person_2, Person_3, ..., Person_n are individual subfolders inside the Inbox folder.
Example file structure:
- C:/abc/def/ghi/klmn/opq/rst/uvw/xyz/messages/Inbox:
    - Person_1:
        - message_1.json
    - Person_2:
        - message_1.json
    - Person_3:
        - message_1.json
    .
    .
    .
    .
    - Person_n:
        - message_1.json
Additionally, I want to load all of them into a single pandas DataFrame and later export it as a CSV file, so that I can keep working on the combined data.
Here is what I have tried so far, and where I am stuck:
import os
import sys

# folder that contains the running script
directory = os.path.dirname(os.path.realpath(sys.argv[0]))
for root, dirs, files in os.walk("C:/abc/def/ghi/klmn/opq/rst/uvw/xyz/messages/inbox/"):
    for name in files:
        if name.endswith(".json"):
            # path of each JSON file relative to the script folder
            folder_names = os.path.relpath(root, directory)
            json_files = os.path.join(folder_names, name)
The output I want to get:
- Person_1:
    - message_1.json
- Person_2:
    - message_2.json
- Person_3:
    - message_3.json
.
.
.
.
- Person_n:
    - message_n.json
OR
All the JSON files renamed as above, plus a single CSV file containing the data from all of the JSON files.
Any help will be deeply appreciated. I'm not able to wrap my head around how to get this.
Use pathlib to build the dataframe, then you can rename the files.
from pathlib import Path
import pandas as pd

pth = Path("C:/abc/def/ghi/klmn/opq/rst/uvw/xyz/messages/inbox/")

# collect path, parent folder, name without extension, and extension;
# sorted() keeps the numbering reproducible, since rglob guarantees no order
data = [(f, f.parent, f.stem, f.suffix)
        for f in sorted(pth.rglob('*.json'))]

# load into dataframe
df = pd.DataFrame(data=data, columns=['pth', 'dname', 'fname', 'suffix'])

# create new filename, e.g. message_1 -> message_<row number>.json
df['new_name'] = (
    df['fname'].str.split('_').str[0] +
    '_' +
    (df.index + 1).astype(str) +
    df['suffix']
)

# now rename each file
for row in df.itertuples():
    row.pth.rename(Path(row.dname) / row.new_name)
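For the second part of the question (one dataframe and one CSV for all the files), something along these lines might work. It is only a sketch: the layout of your JSON files isn't shown, so it assumes each file holds either a single JSON object or a list of objects. The function name `combine_json_to_csv` and the `source_folder` / `source_file` columns are made up for illustration, and would overwrite keys of the same name if your data already has them.

```python
import json
from pathlib import Path

import pandas as pd


def combine_json_to_csv(inbox, csv_path):
    """Load every .json file under `inbox` into one DataFrame and save it as CSV."""
    frames = []
    # sorted() gives a reproducible order; rglob itself guarantees none
    for f in sorted(Path(inbox).rglob("*.json")):
        with f.open(encoding="utf-8") as fh:
            payload = json.load(fh)
        # Assumption: payload is a dict or a list of dicts.
        # json_normalize flattens nested dicts into dotted column names.
        frame = pd.json_normalize(payload)
        # Hypothetical bookkeeping columns so each row can be traced back
        frame["source_folder"] = f.parent.name  # e.g. Person_1
        frame["source_file"] = f.name
        frames.append(frame)

    # pd.concat raises on an empty list, so fall back to an empty frame
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    combined.to_csv(csv_path, index=False)
    return combined
```

You would call it as `df_all = combine_json_to_csv(pth, "all_messages.csv")`, where the CSV name is just a placeholder. Because the folder name is stored with every row, the result stays traceable whether you run it before or after renaming the files.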