python numpy matrix set symmetric-difference

Convert a set of tuples into a NumPy array of lists in Python

I've been using the set method "symmetric_difference" on two ndarray matrices in the following way:

x_set = list(set(tuple(i) for i in x_spam_matrix.tolist()).symmetric_difference(
                 set(tuple(j) for j in partitioned_x[i].tolist())))

x = np.array([list(i) for i in x_set])

This method works fine for me, but it feels a little clumsy. Is there any way to do this more elegantly?

Solution

A simple list of tuples:

In [146]: alist = [(1,2),(3,4),(2,1),(3,4)]

Put it in a set, which drops the duplicate (3, 4):

In [147]: aset = set(alist)
In [148]: aset
Out[148]: {(1, 2), (2, 1), (3, 4)}

np.array just wraps that set in a 0d array of object dtype:

In [149]: np.array(aset)
Out[149]: array({(1, 2), (3, 4), (2, 1)}, dtype=object)

But convert the set to a list first, and you get a 2d array:

In [150]: np.array(list(aset))
Out[150]: 
array([[1, 2],
       [3, 4],
       [2, 1]])
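
Applied to the question, the same list-of-tuples conversion can be wrapped in a small helper. This is only a sketch: the function name rows_symmetric_difference and the sample arrays a and b are made up for illustration, and it assumes both inputs are 2d arrays with the same number of columns.

```python
import numpy as np

def rows_symmetric_difference(a, b):
    """Return the rows that appear in exactly one of two 2d arrays.

    Hypothetical helper, not from the original post. Row order in the
    result is arbitrary because it comes from a set.
    """
    a_rows = set(map(tuple, a.tolist()))  # rows as hashable tuples
    b_rows = set(map(tuple, b.tolist()))
    # ^ is the operator form of set.symmetric_difference
    return np.array(list(a_rows ^ b_rows))

# Illustrative data only
a = np.array([[1, 2], [3, 4], [2, 1]])
b = np.array([[3, 4], [5, 6]])
result = rows_symmetric_difference(a, b)
```

Since np.array(list(...)) already turns equal-length tuples into rows, the extra [list(i) for i in x_set] step from the question is not needed. Note that if the two arrays contain exactly the same rows, the result is an empty array of shape (0,) rather than (0, 2).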

Since it is a list of tuples, it can also be made into a structured array:

In [151]: np.array(list(aset),'i,f')
Out[151]: array([(1, 2.), (3, 4.), (2, 1.)], dtype=[('f0', '<i4'), ('f1', '<f4')])
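
As a small sketch of what the structured form gives you: each tuple position becomes a named field (f0, f1 by default) that can be pulled out as its own 1d array. The variable names below are made up for illustration.

```python
import numpy as np

pairs = [(1, 2), (3, 4), (2, 1)]           # illustrative data
structured = np.array(pairs, dtype='i,f')  # one int field, one float field

# Default field names are 'f0' and 'f1'; each is a 1d column view
first = structured['f0']    # integer column
second = structured['f1']   # float column
```

The array itself stays 1d (one element per tuple), which differs from the 2d result you get without the dtype argument.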

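If the tuples vary in length, the list of tuples becomes a 1d array of tuples (object dtype). Older NumPy versions did this automatically; NumPy 1.24 and later raise a ValueError unless dtype=object is passed explicitly:

In [152]: np.array([(1,2),(3,4),(5,6,7)], dtype=object)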
Out[152]: array([(1, 2), (3, 4), (5, 6, 7)], dtype=object)
In [153]: _.shape
Out[153]: (3,)