I have the following CSV format for users rating items:
A1YS,8F20,3.0
A3TS,8320,2.0
A3BU,1905,5.0
A3BU,3574,4.0
A14X,185A,1.0
The columns are UserID, ItemID, Rating.
I want to load it into a MATLAB matrix with one row per user and one column per item, where each cell holds the rating (an unknown rating equals zero). For example:
8F20, 1905, 3574, 185A
A1YS 3 , 0 , 0 , 0
A3TS 2 , 0 , 0 , 0
A3BU 0 , 5 , 4 , 0
A14X 0 , 0 , 0 , 1
Actually, the matrix can also be just the values, without the labels:
3 , 0 , 0 , 0
2 , 0 , 0 , 0
0 , 5 , 4 , 0
0 , 0 , 0 , 1
I'm quite new to MATLAB and tried some variations of
https://stackoverflow.com/a/13775907/1726419 and https://stackoverflow.com/a/19613301/1726419
without much success. I'd be very thankful for any assistance.
EDIT: What I've got so far is:
fid = fopen('ratings_sample.csv');
out = textscan(fid,'%s%s%d%d','delimiter',',');
fclose(fid);
c1 = out{1};
c2 = out{2};
c3 = out{3};
My problem is that I need to remove the duplicates from both c1 and c2 and then fill in the inner cells of the matrix properly. Also, I don't know whether this is the proper way to load the file.
If each UserID/ItemID pair appears only once, you can use crosstab:
UserID = categorical(c1);
ItemID = categorical(c2);
Rating = crosstab(UserID,ItemID);
Rating(Rating==1) = c3;
and get:
Rating =
3 0 0 0 0
0 0 0 0 1
0 2 5 0 0
0 0 0 4 0
If you want to organize it in a table, you need to first convert the item's ID to a valid variable name (that starts with a letter):
Items = cellfun(@(s) ['Item_' s],c2,'UniformOutput',false);
and then you can use a table to hold all the data:
Tbl = array2table(Rating,...
'RowNames',unique(c1,'stable'),...
'VariableNames',unique(Items,'stable'))
the result:
Tbl =
  4×5 table
            Item_8F20    Item_8320    Item_1905    Item_3574    Item_185A
            _________    _________    _________    _________    _________
    A1YS        3            0            0            0            0
    A3TS        0            0            0            0            1
    A3BU        0            2            5            0            0
    A14X        0            0            0            4            0
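One thing worth checking with the crosstab approach: as far as I can tell, categorical sorts its categories, so crosstab returns rows and columns in sorted order rather than file order, and Rating(Rating==1) = c3 fills the ones column by column. The ratings and the 'stable' labels can then end up misaligned. In the table above, for instance, A3TS shows a rating of 1 for 185A, while the CSV has A3TS rating 8320 with 2. Below is a minimal alternative sketch, not the method above, that takes the positions from unique and fills the matrix with accumarray. The names users, items, userIdx and itemIdx are my own, and the sample data is typed in inline (an assumption, so the snippet runs without ratings_sample.csv).

```matlab
% Sample data from the question, written out inline so the sketch runs
% without the CSV file (in practice c1, c2, c3 come from textscan).
c1 = {'A1YS';'A3TS';'A3BU';'A3BU';'A14X'};   % UserID
c2 = {'8F20';'8320';'1905';'3574';'185A'};   % ItemID
c3 = [3;2;5;4;1];                            % Rating

% Unique IDs in order of first appearance, plus, for every data row,
% its position in those unique lists.
[users,~,userIdx] = unique(c1,'stable');
[items,~,itemIdx] = unique(c2,'stable');

% Put each rating at (user row, item column); unrated cells stay zero.
% double() guards against c3 being an integer type after textscan.
Rating = accumarray([userIdx itemIdx],double(c3),...
    [numel(users) numel(items)]);

% Optional: label rows and columns. The 'Item_' prefix turns the item
% IDs into valid variable names.
Tbl = array2table(Rating,...
    'RowNames',users,...
    'VariableNames',strcat('Item_',items).');
```

For the sample data this puts 5 and 4 in the A3BU row under 1905 and 3574, and rows and columns follow the order of the file. If the same user/item pair occurs more than once, accumarray sums the ratings by default; passing a function handle such as @max as a fourth argument would be one way to change that.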