I'm in need of optimizing import of .xls files to matlab due to xlsread being very time consuming with large amount of files. Current xlsread script as follows:
scriptName = mfilename('fullpath');
[currentpath, filename, fileextension]= fileparts(scriptName);
xlsnames = dir(fullfile(currentpath,'*.xls'));
xlscount = length(xlsnames);
xlsimportdata = zeros(7,6,xlscount);
for k = 1:xlscount
xlsimport = xlsread(xlsnames(k).name,'D31:I37');
xlsimportdata(:,1:size(xlsimport,2),k) = xlsimport;
end
I have close to 10k files per week that needs processing and with approx. 2sec per file processed on my current workstation, it comes in at about 5½ hours.
I have read that ActiveX can be used for this purpose however that is far beyond my current programming skills and have not been able to find a solution elsewhere. Any help on how to make this would be appreciated.
If it is simple to perform with ActiveX (or other proposed method), I would also be interested in data on cells D5 and G3, which I am currently grabbing from 'xlsnames(k,1).name' and 'xlsnames(k,1).date'
EDIT: updated to reflect the solution
% Get path to .m script
scriptName = mfilename('fullpath');
[currentpath, filename, fileextension]= fileparts(scriptName);
% Generate list of .xls file data
xlsnames = dir(fullfile(currentpath,'*.xls'));
xlscount = length(xlsnames);
SampleInfo = cell(xlscount,2);
xlsimportdata = cell(7,6,xlscount);
% Define xls data ranges to import
SampleID = 'G3';
SampleRuntime = 'D5';
data_range = 'D31:I37';
% Initiate progression bar
h = waitbar(0,'Initiating import...');
% Start actxserver
exl = actxserver('excel.application');
exlWkbk = exl.Workbooks;
for k = 1:xlscount
% Restart actxserver every 100 loops due limited system memory
if mod (k,100) == 0
exl.Quit
exl = actxserver('excel.application');
exlWkbk = exl.Workbooks;
end
exlFile = exlWkbk.Open([dname filesep xlsnames(k).name]);
exlSheet1 = exlFile.Sheets.Item('Page 0');
rngObj1 = exlSheet1.Range(SampleID);
xlsimport_ID = rngObj1.Value;
rngObj2 = exlSheet1.Range(SampleRuntime);
xlsimport_Runtime = rngObj2.Value;
rngObj3 = exlSheet1.Range(data_range);
xlsimport_data = rngObj3.Value;
SampleInfo(k,1) = {xlsimport_ID};
SampleInfo(k,2) = {xlsimport_Runtime};
xlsimportdata(:,:,k) = xlsimport_data;
% Progression bar updater
progress = round((k / xlscount) * 100);
importtext = sprintf('Importing %d of %d', k, xlscount);
waitbar(progress/100,h,sprintf(importtext));
disp(['Import progress: ' num2str(k) '/' num2str(xlscount)]);
end
%close actxserver
exl.Quit
% Close progression bar
close(h)
Give this a try. I am not an ActiveX Excel guru by any means. However, this works for me for my small amount of test XLS files (3). I never close
the exlWkbk
so I don't know if memory usage is building or if it automatically cleaned up when descoped after the next is opened in its place ... so use at your own risk. I am seeing an almost 2.5x speed increase which seems promising.
>> timeit(@getSomeXLS)
ans =
1.8641
>> timeit(@getSomeXLS_old)
ans =
4.6192
Please leave some feedback if this work on large number of Excel sheets because I am curious how it goes.
function xlsimportdata = getSomeXLS()
scriptName = mfilename('fullpath');
[currentpath, filename, fileextension]= fileparts(scriptName);
xlsnames = dir(fullfile(currentpath,'*.xls'));
xlscount = length(xlsnames);
xlsimportdata = zeros(7,6,xlscount);
exl = actxserver('excel.application');
exlWkbk = exl.Workbooks;
dat_range = 'D31:I37';
for k = 1:xlscount
exlFile = exlWkbk.Open([currentpath filesep xlsnames(k).name]);
exlSheet1 = exlFile.Sheets.Item('Sheet1'); %Whatever your sheet is called.
rngObj = exlSheet1.Range(dat_range);
xlsimport = cell2mat(rngObj.Value);
xlsimportdata(:,:,k) = xlsimport;
end
exl.Quit