Search code examples
sqlrrodbc

SQL-query loses information


I am doing a SQL query in R, using RODBC.

The query gives me data in the format

Date(POSIXct); var1:var29
%Y-%m-%d %H:%M:%S; numeric_values

The problem is, once the query exceeds a certain length, 6960 obs or so with 29-30 variables, including date, the information delivered to R starts to look like this:

Date(POSIXct); var1:var30
%Y-%m-%d; numeric_values

Hence, I lose the "%H:%M:%S" information. I have no idea why. If I decrease the number of variables, I can increase the time length before this happens.

It is ts-sql on a windows-server 2007. (I believe)

SQL-call example in R:

sqlQuery(database, "SELECT [datetime], [0] as SYS, [1] as NO1, [2] as NO2, [7] as NO3, [9] as NO4, [19] as NO5, [5] as DK1,[6] as DK2, [25] as SE1,
         [26] as SE2,[27] as SE3, [28] as SE4, [4] as FIN, [13] as DE, [14] as NL, [16] as FR, [15] as CH, [17] as AT, [20] as EE,
         [36] as LT, [45] as LV, [42] as SI, [50] as IT, [44] as ES, [43] as BE, [74] as HU, [75] as CZ, [41] as UK

         From
         (
         SELECT [area_id],[pris],[datetime]
         FROM [BigData].[dbo].[Prices]
         WHERE area_id in (0,1,2,7,9,19,5,6,25,26,27,28,4,13,14,16,15,17, 20, 36, 45, 42, 50, 44, 43, 74, 75, 41)
         AND [datetime]>= cast(GETDATE()-290 as date) 
         AND [datetime]< cast(GETDATE()+0 as date)
         ) p
         PIVOT(SUM([pris])
         FOR [area_id] IN
         ([0], [1], [2], [7], [9], [19], [5],[6], [25],[26],[27], [28], [4], [13], [14], [16], [15], [17], [20],
         [36], [45], [42], [50], [44], [43], [74], [75], [41]))
         AS pvt
         ORDER BY [datetime] asc   ") -> prices

Solution

  • Solution #1

    You can use the answer from kristang: (call sqlQuery with as.is option, get timestamps in a string and convert the column with as.POSIXct in R).

    Solution #2

    But I believe more efficient solution is to get the datetime values in numeric type via SQL expression (example for SQL server):

    sqlQuery( "select convert(float, my_date)*3600*24 as my_date from ...")
    

    and convert it to POSIXct from number:

    df1$my_date <- as.POSIXct(df1$my_date, origin = "1900-01-01", tz = "UTC")
    

    Since POSIXct is numeric at heart, fetching and type converting runs faster than normal sqlQuery with RODBC. RODBC converts every timestamp from text string (look as.POSIXct usage in the sqlGetResults function). So the solution is reasonable even with RODBC returning complete timestamps.

    P.S. If you really like converting from text see fastPOSIXct from fasttime package.