Tags: performance, matlab, import, xlsx

Importdata from large xlsx file


I have an xlsx file which contains 750,000x33 cells.

When I tried to use:

[FileName, PathName] = uigetfile('*.xlsx','XLSX Files');
fid = fopen(FileName);
T = importdata(FileName);

The computation took over an hour.

Is there anything I can do to speed the process?

I also tried to use xlsread, but it didn't work either.

  • I have managed to use importdata on a 550,000x33 file before in a few minutes; I don't see a reason why the time it takes should grow that much.

Thank you.


Solution

  • The fastest way would be:

    • using the xlsread function to read the data;
    • having MS Excel installed as well (not mandatory, but it helps with the speed and the data-loading options).

    So, try this:

    [file_name, path_name] = uigetfile('*.xlsx','XLSX Files');
    [num, txt, ~] = xlsread(fullfile(path_name, file_name));
    

    After this, anything that could be converted to numbers will be in the numeric matrix num, and everything else will be in the cell array of strings txt. Check the function's help for further tuning of the data loading.
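
    For example, here is a minimal sketch of one way to use the two outputs (continuing from the snippet above; the assumption that the first row of the sheet holds column names is hypothetical and depends on your layout):

    % Hypothetical layout: the first row of the sheet contains column names,
    % so xlsread puts it in txt and keeps the numeric block in num.
    headers   = txt(1, :);                                % column names as a cell array
    fprintf('Read a %d x %d numeric block.\n', size(num, 1), size(num, 2));
    first_col = num(:, 1);                                % pull one numeric column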

    Later edit: If this is still slow, it is most likely because xlsread grows arrays in memory in basic mode, and the memory is fragmented or too small. Options (they are not mutually exclusive):

    • convert the file to .CSV, then use textscan to load the data (see the sketch after this list);
    • close MATLAB and open it again before reading the file (the best way of defragmenting the array memory);
    • increase the size of the virtual memory on your system;
    • add more RAM to your machine.
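
    A minimal sketch of the .CSV route, assuming the converted file is named data.csv, has no header row, and holds 33 comma-separated numeric columns (the file name and layout are assumptions about your data):

    fid = fopen('data.csv', 'r');                         % hypothetical file name
    if fid == -1
        error('Could not open data.csv');
    end
    fmt = repmat('%f', 1, 33);                            % 33 numeric columns, comma-delimited
    C = textscan(fid, fmt, 'Delimiter', ',', 'CollectOutput', true);
    fclose(fid);
    data = C{1};                                          % 750,000x33 double matrix

    This path reads the data as plain text in a single pass, which avoids the array growth that the basic-mode spreadsheet reader runs into.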