python pyspark sas-wps

How to read a .wpd SAS dataset in Python/PySpark


Hi, I am trying to import a .wpd SAS dataset into Python but have not been able to find a solution. Can anyone please help me out with this?

I have tried it using the class below in Python:

    import json
    import numpy as np

    class JSONData:
        def __init__(self, filename):
            # Parse the whole file as JSON
            with open(filename) as data_file:
                self.data = json.load(data_file)

        def getDatasetCount(self):
            return len(self.data['wpd']['dataSeries'])

        def getDatasetByIndex(self, index):
            return self.data['wpd']['dataSeries'][index]

        def getDatasetByName(self, name):
            return [x for x in self.data['wpd']['dataSeries'] if x['name'] == name][0]

        def getDatasetNames(self):
            return [x['name'] for x in self.data['wpd']['dataSeries']]

        def getDatasetValues(self, dataset):
            # Collect the 'value' field of every point in the dataset
            values = [val['value'] for val in dataset['data']]
            return np.array(values)

But no luck. Thanks in advance.


Solution

  • WPS allows you to save a dataset as a sas7bdat file. I recommend using WPS to create a new sas7bdat file instead of trying to read the WPS (.wpd) file directly.

    Then pandas, a Python library, can read the sas7bdat file into a DataFrame:

    https://pandas.pydata.org/docs/reference/api/pandas.read_sas.html
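
As a rough illustration of the second step, here is a minimal sketch that assumes the dataset has already been exported from WPS as a sas7bdat file. The helper name `read_sas7bdat` and the suffix check are hypothetical and only for illustration; `pandas.read_sas` with its `format` and `encoding` parameters is the actual pandas call.

```python
from pathlib import Path
from typing import Optional

import pandas as pd


def read_sas7bdat(path: str, encoding: Optional[str] = None) -> pd.DataFrame:
    """Read a sas7bdat file (exported from WPS) into a pandas DataFrame.

    Hypothetical helper: it only wraps pandas.read_sas and rejects other
    file types such as .wpd, which pandas.read_sas does not support.
    """
    suffix = Path(path).suffix.lower()
    if suffix != ".sas7bdat":
        raise ValueError(
            f"Expected a .sas7bdat file, got '{suffix}'. "
            "Export the dataset from WPS as sas7bdat first."
        )
    # Without an encoding, string columns are returned as bytes;
    # which encoding fits depends on how the file was written.
    return pd.read_sas(path, format="sas7bdat", encoding=encoding)
```

Calling `read_sas7bdat("mydata.sas7bdat", encoding="latin-1")` would return a DataFrame; the file name and the encoding here are placeholders, not values from the question. For PySpark, the resulting pandas DataFrame can then be passed to `spark.createDataFrame(df)`, assuming a SparkSession named `spark` already exists.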