Tags: python, python-3.x, python-2.7, csv, data-preprocessing

Add random noise to each value in CSV rows


I am trying to add some random noise to every column of my CSV file except the last one.

This is my csv file:

z-1            z-2          z-3        z-4         z-5        z-6           z-7     class
0.1305512   0.1301835   0.1295706   0.1287125   0.1276091   0.1262605   0.1246666     1
0.151239    0.1508714   0.1502585   0.1494004   0.148297    0.1469484   0.1453545      0
0.1463833   0.1461299   0.1456313   0.1448875   0.1438984   0.1426641   0.1411845     1
0.1422839   0.1419962   0.1414633   0.1406851   0.1396616   0.138393    0.136879       0
0.1452986   0.1450747   0.1446055   0.1438911   0.1429314   0.1417265   0.1402764      1
0.1354216   0.1351467   0.1346265   0.1338611   0.1328504   0.1315945   0.1300933       0
0.1458855   0.1456223   0.1451139   0.1443602   0.1433613   0.1421172   0.1406278      1
0.149526    0.1492658   0.1487604   0.1480096   0.1470137   0.1457725   0.144286       0
0.1452744   0.1450098   0.1444999   0.1437448   0.1427444   0.1414988   0.1400079      1
0.146562    0.1462768   0.1457463   0.1449706   0.1439496   0.1426834   0.1411719      0

I want to add noise in the range [0.1, 0.12] to each value in the CSV rows.

This is my code:

import pandas as pd
import random

df = pd.read_csv("./data/raw.csv")

noise = [random.uniform(0.1,0.12) for _ in range(5)] # 5 random floats in the range of 0.1 to 0.12
#print(noise)


for i in noise:
    print(i)
    df1 = df + round((df.iloc[:,:-1] + i),7) # round values after adding noise

df1.to_csv("./data/new.csv", index=False) # make new csv

My code generates 5 noise values but adds only one of them to every value.

I want all of the generated noise values to be used, assigned randomly to the rows. That is, one of the 5 noise values is picked at random for the first row, another for the second row, and so on, and the result is written to a new CSV.

Is it possible?

Any help would be appreciated.

Thank you.


Solution

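  • If you want to add random noise to each value individually, in every column except the last (class) column, do something like this:

    df1 = df.copy()  # start from a copy so the class column is kept unchanged
    for c in df.columns[:-1]:  # every column except the last
        # draw one independent noise value per cell in this column
        df1[c] = df[c] + [random.uniform(0.1, 0.12) for _ in range(len(df))]


    If you want to add a single random noise value to each column (the same value for every row of that column):

    df1 = df.copy()
    for c in df.columns[:-1]:
        column_noise = random.uniform(0.1, 0.12)  # one value shared by the whole column
        df1[c] = df[c] + column_noise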
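
    The question can also be read as wanting a fixed pool of 5 noise values, with one of them picked at random for each row. A minimal sketch of that reading, assuming NumPy is available. The function name add_row_noise and its parameters are hypothetical, and the small DataFrame only stands in for ./data/raw.csv:

```python
import numpy as np
import pandas as pd


def add_row_noise(df, n_noise=5, low=0.1, high=0.12, seed=None):
    """Return a noisy copy of df plus the noise value used for each row.

    One value from a pool of n_noise random floats is picked per row and
    added to every column except the last one (assumed to be the label).
    """
    rng = np.random.default_rng(seed)
    pool = rng.uniform(low, high, size=n_noise)   # the candidate noise values
    row_noise = rng.choice(pool, size=len(df))    # one pool value per row
    out = df.copy()
    feature_cols = df.columns[:-1]                # leave the class column alone
    # Broadcast the per-row noise across all feature columns, then round.
    noisy = df[feature_cols].to_numpy() + row_noise[:, None]
    out[feature_cols] = noisy.round(7)
    return out, row_noise


# Stand-in for the first rows and columns of the CSV in the question.
df = pd.DataFrame({
    "z-1": [0.1305512, 0.151239, 0.1463833],
    "z-2": [0.1301835, 0.1508714, 0.1461299],
    "class": [1, 0, 1],
})

noisy, row_noise = add_row_noise(df, seed=0)
print(noisy)
```

    With the real file, df would come from pd.read_csv("./data/raw.csv") and the result could be saved with noisy.to_csv("./data/new.csv", index=False). Passing a seed only makes the run repeatable; leave it as None for different noise each time.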