I have dataset like :
profile category target
0 1 [5, 10] 1
1 2 [1] 0
2 3 [23, 5000] 1
3 4 [700, 4500] 0
How to handle category feature, this table may have others additional features too. One hot encoding lead to consume too much space.because number of rows is around 10 million. Any suggestion would be helpful.
MultiLabelBinarizer is solution for this kind of problem which gave sparse output low in memory you can convert other feature to sparse matrix than combine all features to feed into Machine learning model.