I have a CSV file with 3,850,000 entries. The columns are Source,Target,Date,Time, e.g. 7bc65f6f4342d49242514a50ce53d71d,555af6d82bb7f4c7475f7af29e8db147,2016-02-29,23:51:00
Is it possible for me to plot this CSV file in MATLAB? I have tried working with Excel, but the most rows it can handle is about 1.05 million. How can I plot Date against Time, and then count things like: a) the number of calls made daily, b) the number of calls made on Mondays, etc.?
Can anyone point me in the right direction? Am I even using the correct tool for such a task? Thank you!
Here are some directions and suggestions.
Part 1
First, you need to read the CSV file. The following code should work once you point it at your own file.
filename = 'G:\Desktop\new\Book1.csv'; % specify your file name and directory
fileID = fopen(filename,'r','n','UTF-8'); % open file
try
    % columns: source, target, date (e.g. 2016-02-29), time (e.g. 23:51:00)
    formatSpec = '%s%s%{yyyy-MM-dd}D%{HH:mm:ss}D%[^\n\r]'; % format specification
    data = textscan(fileID, formatSpec, 'Delimiter', ',', 'ReturnOnError', false); % read data
catch e
    fclose(fileID); % close file before re-raising the error
    rethrow(e)
end
fclose(fileID); % close file
clearvars -except data % clear variables
[source, target, dateData, timeData] = data{1:4}; % unpack the four columns
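If you later want a single timestamp per call (for example, to plot calls over time), the date and time columns can be combined. This is only a minimal sketch: it assumes the time column was read with %{HH:mm:ss}D and therefore carries a placeholder date, and dateSample and timeSample are made-up stand-ins for one row of dateData and timeData, taken from the example row in the question.

```matlab
% Sample values standing in for one row of dateData and timeData (assumption)
dateSample = datetime('2016-02-29', 'InputFormat', 'yyyy-MM-dd');
timeSample = datetime('23:51:00', 'InputFormat', 'HH:mm:ss'); % date part defaults to today

% timeofday drops the placeholder date and keeps the time elapsed since midnight
timestamp = dateSample + timeofday(timeSample);
disp(timestamp)
```

The same expression works element-wise on the full columns, e.g. dateData + timeofday(timeData).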
Part 2
Now that the data is loaded into MATLAB, you can group it by date. Assuming the rows are sorted by date, you can find the row numbers where the date changes:
iDate = find(diff(dateData) ~= 0); % last row of each date (the next row has a different date)
iDate = [0, iDate(:)', numel(dateData)]; % prepend 0 and append the last row, so group i spans rows iDate(i)+1 to iDate(i+1)
You can use similar code to find changes in time.
Then you can group your data:
dataCollection = cell(1, numel(iDate) - 1); % one cell per date
for i = 1:numel(iDate) - 1
    rows = iDate(i)+1:iDate(i+1); % rows belonging to the i-th date
    dataCollection{i}.date = dateData(rows(1));
    dataCollection{i}.source = source(rows);
    dataCollection{i}.target = target(rows);
    dataCollection{i}.timeData = timeData(rows);
end
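The boundary detection is easy to get off by one, so it can help to check it on a few rows first. Below is a small sketch with made-up dates; sampleDates, bounds, firstRows and lastRows are illustrative names, not part of the code above.

```matlab
% Five made-up rows covering two dates, sorted by date (assumption)
sampleDates = datetime(2016, 2, [28; 28; 29; 29; 29]);

% The date changes between rows 2 and 3, so find returns 2:
% the index of the LAST row of the first date
lastRowOfDate = find(diff(sampleDates) ~= 0);

% Prepend 0 and append the total row count to get the group boundaries
bounds    = [0, lastRowOfDate(:)', numel(sampleDates)];  % [0 2 5]
firstRows = bounds(1:end-1) + 1;                         % [1 3]
lastRows  = bounds(2:end);                               % [2 5]
```

Here the first date occupies rows 1 to 2 and the second date rows 3 to 5, with no row shared between groups.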
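You can add another loop inside the loop above to group the data by time. For example, assuming iTime holds the time boundaries within date i (a leading 0 followed by the last row of each time slot):
for j = 1:numel(iTime) - 1
    dataCollection{i}.timeCollection{j}.source = dataCollection{i}.source(iTime(j)+1:iTime(j+1));
    ...% more data
end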
Next, count the number of entries for each date and, if required, for each time slot. These lines can be inserted into the existing loops:
dataCollection{i}.dataNum = numel(dataCollection{i}.source);
dataCollection{i}.timeCollection{j}.dataNum = numel(dataCollection{i}.timeCollection{j}.source);
Finally, you can plot the counts and set the x-axis tick labels to the date, time, weekday, etc.
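For the "calls made on Monday" part of the question, one possible approach is to count rows per weekday with weekday and accumarray instead of going through the structure. This is a sketch under assumptions: sampleDates is made-up data standing in for dateData, and callsPerWeekday is an illustrative name.

```matlab
% Made-up sample standing in for dateData; 2016-02-28 is a Sunday
sampleDates = datetime(2016, 2, [28; 28; 29; 29; 29]);

dayNumber = weekday(sampleDates);                    % 1 = Sunday ... 7 = Saturday
callsPerWeekday = accumarray(dayNumber, 1, [7 1]);   % count rows for each weekday

% Bar chart with weekday names as x-axis tick labels
bar(callsPerWeekday)
set(gca, 'XTick', 1:7, 'XTickLabel', {'Sun','Mon','Tue','Wed','Thu','Fri','Sat'})
ylabel('Number of calls')
```

Because the output size is fixed at 7 rows, weekdays without any calls still appear in the plot with a count of zero.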
Part 2 - alternative
The approach above builds a well-organized structure for your data. If you do not need that, the code can be much simpler (assuming the rows are sorted by date):
iDate = find(diff(dateData) ~= 0); % last row of each date
iDate = [0, iDate(:)', numel(dateData)]; % prepend 0 and append the last row
dataNumByDate = diff(iDate); % number of entries for each date
xLabelStr = datestr(dateData(iDate(2:end))); % convert dates to strings for the x labels
Then plot the figures. No loop required.
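If the rows are not guaranteed to be sorted by date, a variant based on unique and accumarray gives per-date counts without relying on row order. Note that this swaps in a different technique from the diff-based code above; sampleDates is made-up data standing in for dateData, and callsPerDate is an illustrative name.

```matlab
% Made-up, unsorted sample standing in for dateData (assumption)
sampleDates = datetime(2016, 2, [29; 28; 29; 28; 29]);

% Sorted distinct dates, plus the index of the matching distinct date for each row
[uniqueDates, ~, groupIdx] = unique(sampleDates);
callsPerDate = accumarray(groupIdx, 1);   % number of rows for each distinct date

% Bar chart with the dates as x-axis tick labels
bar(callsPerDate)
set(gca, 'XTick', 1:numel(uniqueDates), ...
    'XTickLabel', cellstr(datestr(uniqueDates, 'yyyy-mm-dd')))
ylabel('Number of calls')
```

With several million rows, unique has to sort the dates, so the diff-based version is likely cheaper when the file is already in date order.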
If you have a question about a particular part of the code, I suggest you open a new question.