python pandas faker

Creating fake data similar to a given template


I want to generate multiple fake Excel files containing data like the following:

DATE       CAR           Cost  Outlet  Code      
2012/01/01 BMW           100   AA      2187 
2012/01/01 Mercedes Benz 200   AA      2187    
2012/01/01 BMW           100   AA      2187 
2012/01/02 Volvo         100   AA      2187  
2012/01/02 BMW           50    AA      2187
2012/01/03 Mercedes Benz 75    AA      2187
...
2012/09/01 BMW           200   AA      2187
2012/09/02 Volvo         100   AA      2187  

The goal is to create fake data that follows the same template as the sample above. The values themselves can be random.

What is the best way to create fake tabulated data for data analytics?


Solution

  • You could use the pydbgen package to generate random data and return it as a pandas DataFrame:

    from pydbgen import pydbgen

    # Create a generator object, then request 5 rows with the listed fields
    myDB = pydbgen.pydb()
    df = myDB.gen_dataframe(5, ['name', 'city', 'phone', 'date'])
    print(df)
    

    This returns a pandas DataFrame with 5 rows and one column per requested field.
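
  • pydbgen's fields (name, city, phone, date) do not match the DATE/CAR/Cost/Outlet/Code template in the question, so as an alternative here is a minimal sketch that builds rows in that shape using only the standard library. The function name make_fake_rows, the value pools CARS and COSTS, and the default date range are all assumptions read off the sample table, not part of any library:

```python
import random
from datetime import date, timedelta

# Hypothetical value pools, taken from the sample table in the question.
CARS = ["BMW", "Mercedes Benz", "Volvo"]
COSTS = [50, 75, 100, 200]


def make_fake_rows(n_rows, start=date(2012, 1, 1), end=date(2012, 9, 2),
                   outlet="AA", code=2187, seed=None):
    """Return n_rows dicts shaped like the template, sorted by date."""
    rng = random.Random(seed)  # local generator so a seed gives repeatable data
    span_days = (end - start).days
    rows = []
    for _ in range(n_rows):
        day = start + timedelta(days=rng.randint(0, span_days))
        rows.append({
            "DATE": day.strftime("%Y/%m/%d"),
            "CAR": rng.choice(CARS),
            "Cost": rng.choice(COSTS),
            "Outlet": outlet,
            "Code": code,
        })
    # Zero-padded YYYY/MM/DD strings sort chronologically as plain text.
    rows.sort(key=lambda row: row["DATE"])
    return rows


rows = make_fake_rows(10, seed=0)
for row in rows[:3]:
    print(row)
```

    To get an Excel file, the rows can be passed to pandas.DataFrame(rows) and written with DataFrame.to_excel("fake_1.xlsx", index=False), which needs an Excel writer such as openpyxl installed. Calling the function in a loop with a different seed and file name each time would give multiple files.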