Search code examples
matlabcell-array

matlab: merging cell-arrays, converting to time series


i'm fiddling with matlab (no toolboxes added) and am trying to understand how to create x-minute bars from a trade database.

accessing the data from SQL server, i get a cell array, Buy_Data:

    Attributes: []
          Data: {619134x2 cell}
DatabaseObject: [1x1 database]
      RowLimit: 0
      SQLQuery: [1x210 char]
       Message: []
          Type: 'Database Cursor Object'
     ResultSet: [1x1 com.microsoft.sqlserver.jdbc.SQLServerResultSet]
        Cursor: [1x1 com.mathworks.toolbox.database.sqlExec]
     Statement: [1x1 com.microsoft.sqlserver.jdbc.SQLServerStatement]
         Fetch: [1x1 com.mathworks.toolbox.database.fetchTheData]

now, the content looks something like:

'2012-08-28 16:48:56.0'    [24800]
'2012-08-28 16:48:56.0'    [24800]
'2012-08-28 16:49:14.0'    [24800]
'2012-08-28 16:49:14.0'    [24800]
'2012-08-28 16:49:21.0'    [24799]
'2012-08-28 16:49:51.0'    [24800]
'2012-08-28 16:49:51.0'    [24800]
'2012-08-28 16:49:55.0'    [24800]
'2012-08-28 16:49:55.0'    [24800]
'2012-08-28 16:49:56.0'    [24800]
'2012-08-28 16:49:56.0'    [24800]
'2012-08-28 16:49:56.0'    [24800]
'2012-08-28 16:49:56.0'    [24800]
'2012-08-28 16:49:56.0'    [24800]
'2012-08-28 17:00:12.0'    [24839]

how do i go about combining 2 of these cell-arrays into a x-minute time series?


Solution

  • Well, combining the cell arrays is as simple as

    new_Data = [Buy_data.Data; Buy_data2.Data]
    

    where obviously, Buy_data and Buy_data2 are your two different structures.

    Combining these into an x-minute candlestick format can be done like this:

    % example data
    new_cell = {
        '2012-08-28 16:48:56.0'    [24800]
        '2012-08-28 16:48:56.0'    [24800]
        '2012-08-28 16:49:14.0'    [24800]
        '2012-08-28 16:49:14.0'    [24800]
        '2012-08-28 16:49:21.0'    [24799]
        '2012-08-28 16:49:51.0'    [24800]
        '2012-08-28 16:49:51.0'    [24800]
        '2012-08-28 16:49:55.0'    [24800]
        '2012-08-28 16:49:55.0'    [24800]
        '2012-08-28 16:49:56.0'    [24800]
        '2012-08-28 16:49:56.0'    [24800]
        '2012-08-28 16:49:56.0'    [24800]
        '2012-08-28 16:49:56.0'    [24800]
        '2012-08-28 16:49:56.0'    [24800]
        '2012-08-28 17:00:12.0'    [24839]
        };
    
    % your period (in minutes)
    x = 1; 
    
    
    
    % convert dates to numbers and make sure dates are sorted
    [dateNums, sortInds] = sort(datenum(new_cell(:,1)));
    prices = [new_cell{:,2}].';
    prices = prices(sortInds);
    
    % find minimum date, maximum date
    minDate = min(dateNums);
    maxDate = max(dateNums);
    
    % group data into bins
    x = x/24/60;
    numBins = ceil( (maxDate-minDate)/x );
    [N,X] = hist(dateNums,numBins);
    
    % create candlesticks
    inds          = [1 cumsum(N)];
    candles.min   = zeros(numBins,1);
    candles.max   = zeros(numBins,1);
    candles.open  = zeros(numBins,1);
    candles.close = zeros(numBins,1);
    candles.dates = cell(numBins,1);
    for ii = 1:numBins
    
        % we already know the dates
        candles.dates{ii} = datestr(X(ii));
    
        % continue only if there's data    
        if N(ii)
            % extract the prices
            subSet = prices(inds(ii):inds(ii+1));
            % find max, min, open, close
            candles.open(ii)  = subSet(1);
            candles.close(ii) = subSet(end);
            candles.max(ii)   = max(subSet);
            candles.min(ii)   = min(subSet);        
        end    
    
    end
    

    Note that you need to do some tweaking, as these would be pretty unrepresentable candlesticks :) Anyway, it should be enough to get you started.