I am working with a CSV dataset that looks like this:
img_id obj_id xcen ycen width height
0 94a69b66-23f0-11e9-a78e-2f2b7983ac0d 0 0.377734 0.091667 0.071094 0.183333
1 94a6a3a4-23f0-11e9-a78f-ebd9c88ef3e8 0 0.375781 0.090972 0.075000 0.181944
2 94a6a430-23f0-11e9-a790-2b5f72f1667a 0 0.378516 0.091667 0.069531 0.183333
3 94a6a48a-23f0-11e9-a791-fb958b6ab6b3 0 0.391406 0.106944 0.076563 0.213889
4 94a6a4da-23f0-11e9-a792-f320b734bd9b 0 0.395313 0.106250 0.068750 0.212500
5 94a6a534-23f0-11e9-a793-c7e8fecc9fa8 0 0.362109 0.127778 0.105469 0.225000
What I am trying to do is write each row to a separate text file, with the column values separated by commas on one line.
I am only dropping img_id from being written inside the text file because I am using img_id for naming the individual text files.
I have been trying different methods but I am having issues getting each row written to its respective text file. I have successfully gotten each individual text file to be named by its img_id.
For example, the text file for the first img_id would contain something like this:
0, 0.377734, 0.091667, 0.071094, 0.183333
Currently, I am trying to iterate over one column, but instead of each row going into its respective text file, the entire array that I got from the .values attribute is written into every text file, like this:
[0,0,0,0,0,0.......0,0,0,0,0,0]
Also, some of the img_ids are repeated. When more than one row has the same img_id, I don't want my code to create a second text file that (I assume) overwrites the first one with the same name. Instead, the row should be added to the existing text file, so that it now contains 2 lines like this:
Contents of 94a6a534-23f0-11e9-a793-c7e8fecc9fa8.txt:
0, 0.362109, 0.127778, 0.105469, 0.225000
0, 0.175781, 0.283642, 0.210913, 0.293922
Here is the code that I am currently working with.
file = '{}.txt'
a = df['img_id'].values
b = df['object_class'].values
c = df['xcen'].values
d = df['ycen'].values
e = df['width'].values
f = df['height'].values
b = str(b)
c = str(c)
d = str(d)
e = str(e)
f = str(f)
for x in a:
    with open(file.format(x), 'w') as f:
        for i in b:
            f.write(i)
Sample data:
>>> df
img_id object_class xcen ycen width height
0 b192cbd4-7958-4a82-8f90-42217076a66c 4 0.211284 0.428579 0.287383 0.683370
1 b192cbd4-7958-4a82-8f90-42217076a66c 2 0.840717 0.040433 0.192738 0.545159
2 9d452f25-aa60-4fe1-9165-a1a8a981a372 2 0.840717 0.040433 0.192738 0.545159
3 3fa5d0d9-c781-40ad-a8f5-a1eae7d51b98 9 0.741793 0.098438 0.707242 0.102758
4 706ad967-11a6-4e6f-85bc-24bc204597f4 4 0.786071 0.735364 0.661866 0.453724
5 1b577e42-d037-4f7b-918e-1c7e6cc7e7a1 17 0.513458 0.012236 0.856802 0.894129
6 c4c16c64-30cd-450b-ab08-543c4818f1f3 13 0.625725 0.765523 0.007714 0.678993
7 329b908e-ce41-4fd1-b671-b20909c3b31d 10 0.784206 0.831250 0.728761 0.809600
8 fd83b03c-2a84-4cb3-834d-714167475104 7 0.508803 0.137691 0.290492 0.206802
9 6ce64fd0-ca9b-47e8-ae1d-049a87468197 13 0.919442 0.168500 0.995826 0.250895
10 66ddffca-fdea-444f-ae79-d4ee284b9385 12 0.920211 0.803805 0.360863 0.866571
Export files:
for filename, data in df.groupby("img_id"):
    data.drop(columns="img_id").to_csv(f"{filename}.txt", header=None, index=None)
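Because groupby collects every row that shares an img_id into one group, each file is written once with all of its rows, so nothing gets overwritten. Below is a minimal self-contained sketch of the same idea. The tiny DataFrame, the placeholder ids (img-a, img-b), the export_rows helper, and the temporary output directory are all made up for illustration; float_format="%.6f" is an optional addition that keeps six decimal places, since to_csv would otherwise drop trailing zeros.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data: two rows share the same img_id.
df = pd.DataFrame(
    {
        "img_id": ["img-a", "img-a", "img-b"],
        "object_class": [4, 2, 9],
        "xcen": [0.211284, 0.840717, 0.741793],
        "ycen": [0.428579, 0.040433, 0.098438],
        "width": [0.287383, 0.192738, 0.707242],
        "height": [0.683370, 0.545159, 0.102758],
    }
)


def export_rows(frame, out_dir):
    """Write one text file per img_id; rows sharing an id land in the same file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for img_id, group in frame.groupby("img_id"):
        path = out_dir / f"{img_id}.txt"
        # Drop img_id (it is only used for the file name) and skip header and index.
        group.drop(columns="img_id").to_csv(
            path, header=False, index=False, float_format="%.6f"
        )
        written.append(path)
    return written


# Write into a temporary directory so the sketch does not clutter the working directory.
out_dir = tempfile.mkdtemp()
paths = export_rows(df, out_dir)
print(Path(out_dir, "img-a.txt").read_text())
```

Note that to_csv only accepts a single-character separator, so the values are separated by "," rather than ", ". If the space after each comma matters, the lines would have to be built by hand instead, for example with ", ".join(...).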