I'm trying to make a transmittal application which will take file names and turn them into a CSV record of documents issued.
So far I have been able to get the Python script to read all the file names in a given folder into a list and split each one into separate fields, i.e. documentnumber,revision,title.pdf becomes [document number, revision, title].
def getFiles():
    i = 0
    path = input("Paste in path for outgoing folder: ")
    numTitleRev = os.listdir(path)
    issueRec = []
    fileData = []
    totalList = len(numTitleRev)
    listNumber = str(totalList)
    print('\n' + "The total amount of documents in this folder is: " + listNumber + '\n')
    csvOutput = []
    while i < totalList:
        for item in numTitleRev:
            fileSplit = item.split(',', 2)
            fileTitle = fileSplit.pop(2)
            fileRev = fileSplit.pop(1)
            fileNum = fileSplit.pop(0)
        csvOutput.append([fileNum,fileRev,fileTitle])
        with open('output.csv', 'a') as writeCSV:
            writer = csv.writer(writeCSV)
            for row in csvOutput:
                writer.writerow(row)
        i += 1
    writeCSV.close()
    print("Writing complete")
The output I'm looking for is like so:
Number - Revision - Title
File1 - 01 - Title 1
File2 - 03 - Title 2 etc.
The code above splits each file name in the list on ',' which is how the file names are stored in the folder.
The issue I think I have with this code is that csvOutput only sends one result to the CSV, the last file name in the list.
That result is then written to the CSV once for every file in the folder, instead of splitting record one, sending it to the CSV, and repeating with record two.
The problem is I can't think of how I would store this information in variables when the total number of files is not constant.
Any help would be most appreciated.
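The main issue is the nested while/for loop. The inner for loop runs to completion before the append is reached, so only the values from the last file name are ever stored, and the outer while loop then repeats that once per file. I restructured the code a bit to make it testable locally (and runnable by just copy/pasting). That should also give you an idea of how to structure code to make it easier to ask for help.
I added a lot of comments explaining the changes I made.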
import csv
import os

# This part has been extracted from the main logic, to make the code runnable
# with sample data (see main() below)
def getFiles():
    path = input("Paste in path for outgoing folder: ")
    numTitleRev = os.listdir(path)
    print("\nThe total amount of documents in this folder is: %s\n" % len(numTitleRev))
    return numTitleRev

# This piece of logic contained the core error. The nested "while" loop was
# unnecessary. Additionally, the ".append" call was on the wrong indent-level.
# Removing the unnecessary while-loop makes this much clearer
def process_files(filenames):
    parsed = []
    for item in filenames:
        # Using "pop()" is a destructive operation (it modifies the list
        # in-place which may lead to bugs). In this case it's absolutely fine,
        # but I replaced it with a different syntax which in turn also makes
        # the code a bit nicer to read.
        fileNum, fileRev, fileTitle = item.split(',', 2)
        parsed.append([fileNum, fileRev, fileTitle])
    return parsed

# Similarly to "getFiles", I extracted this to make testing easier. Both
# "getFiles" and "write_output" are functions with "side-effects" which rely on
# external resources (the disk in this case). Extracting the main logic into
# "process_files" makes that function easily testable without the need of
# having the files really exist on the disk.
def write_output(parsed_data):
    # newline='' is what the csv module documentation recommends for files
    # passed to csv.writer; it avoids extra blank rows on Windows.
    with open('output.csv', 'a', newline='') as writeCSV:
        writer = csv.writer(writeCSV)
        for row in parsed_data:
            writer.writerow(row)
    print("Writing complete")

# This is just a simple main function to illustrate how the new functions are
# called.
def main():
    filenames = [  # <-- Some example data to make the SO answer runnable
        '0,1,this is an example.txt',
        '1,4,this is an example.txt',
        '2,200,this is an example, with a comma in the name.txt',
        '3,1,this is an example.txt',
    ]
    # filenames = getFiles()  <-- This needs to be enabled for the real code
    converted = process_files(filenames)
    write_output(converted)

# This special block prevents "import side-effects" when this Python file would
# be imported somewhere else.
if __name__ == '__main__':
    main()
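
Since process_files touches neither the disk nor the console, it can be checked with plain assertions. What follows is only a minimal sketch of such a check, not part of the code above: the function is repeated so the snippet runs on its own, and the sample file names are made up for illustration.

```python
def process_files(filenames):
    """Split 'number,revision,title' file names into [number, revision, title]."""
    parsed = []
    for item in filenames:
        # maxsplit=2 means only the first two commas separate fields;
        # any later commas stay inside the title.
        fileNum, fileRev, fileTitle = item.split(',', 2)
        parsed.append([fileNum, fileRev, fileTitle])
    return parsed


# Hypothetical sample names; no files need to exist on disk.
sample = [
    'File1,01,Title 1.pdf',
    'File2,03,Title 2, with a comma.pdf',
]
result = process_files(sample)

assert result[0] == ['File1', '01', 'Title 1.pdf']
assert result[1] == ['File2', '03', 'Title 2, with a comma.pdf']
```

Note that the title still carries the file extension. If that is not wanted, os.path.splitext could be applied to fileTitle before appending, but whether that fits depends on how the real file names look.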