Subsetting odd rows in R using seq

I hope this is not too much of a newbie question.

I am trying to subset rows from the GDP UK dataset that can be downloaded from here: http://www.ons.gov.uk/ons/site-information/using-the-website/time-series/index.html

The data frame looks more or less like this:

       X    ABMI
1   1948    283297
2   1949    293855
3   1950    304395

....

300 2013 Q2 381318
301 2013 Q3 384533
302 2013 Q4 387138
303 2014 Q1 390235

For my analysis I only need the data for the years 2004-2013, with one result per year, so I want to get every fourth row of the dataset between rows 263 and 303.

On the basis of the following websites:

https://stat.ethz.ch/pipermail/r-help/2008-June/165634.html (plus a few that I cannot quote due to the link limit)

I tried the following, each time getting some error message:

> GDPUKodd <- seq(GDPUKsubset[263:302,], by = 4)
Error in seq.default(GDPUKsubset[263:302, ], by = 4) : 
  'from' must be of length 1

> OddGDPUK <- GDPUK[seq(263, 302, by = 4)]
    Error in `[.data.frame`(GDPUK, seq(263, 302, by = 4)) : 
  undefined columns selected

> OddGDPUKprim <- GDPUK[seq(263:302), by = 4]
Error in `[.data.frame`(GDPUK, seq(263:302), by = 4) : 
  unused argument (by = 4)

> OddGDPUK <- GDPUK[seq(from=263, to=302, by = 4)]
Error in `[.data.frame`(GDPUK, seq(from = 263, to = 302, by = 4)) : 
  undefined columns selected

> OddGDPUK <- GDPUK[seq(from=GDPUK[263,] to=GDPUK[302,] by = 4)]
Error: unexpected symbol in "OddGDPUK <- GDPUK[seq(from=GDPUK[263,] to"

> GDPUK[seq(1,nrows(GDPUK),by=4),]
Error in seq.default(1, nrows(GDPUK), by = 4) : 
  could not find function "nrows"

To cut a long story short: help!

Solution

Instead of trying to extract data based on row ids, you can use the subset function with appropriate filters based on the values.

For example, if your data frame has a year column with values 1948...2014 and a quarter column with values Q1..Q4, then you can get the right subset with:

subset(data, year >= 2004 & year <= 2013 & quarter == 'Q1')
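If you would rather stick with row positions, the attempts in the question mostly fail for two reasons: a data frame indexed with a single argument selects columns, so the row index needs a trailing comma, and the row-count function is nrow, not nrows. Below is a minimal sketch on a made-up data frame (the values are invented for illustration; only the indexing pattern matters). On the real data, assuming the row positions given in the question, the equivalent call would be GDPUK[seq(263, 302, by = 4), ].

```r
# Toy stand-in for the GDP data: 12 quarterly rows with invented values.
GDPUK <- data.frame(
  X    = paste(rep(2004:2006, each = 4), c("Q1", "Q2", "Q3", "Q4")),
  ABMI = seq(100, by = 10, length.out = 12),
  stringsAsFactors = FALSE
)

# Row indices go before the comma; the empty slot after it keeps all columns.
every_fourth <- GDPUK[seq(1, nrow(GDPUK), by = 4), ]
print(every_fourth)
```

This keeps rows 1, 5 and 9 of the toy data, i.e. the Q1 row of each year.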

UPDATE

I see your source data is dirty, with no proper year and quarter columns. You can clean it like this:

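x <- read.csv('http://www.ons.gov.uk/ons/datasets-and-tables/downloads/csv.csv?dataset=pgdp&cdid=ABMI')
# ABMI may be read as a factor; go through character to get the numeric values
x$ABMI <- as.numeric(as.character(x$ABMI))
# year: drop everything from the first non-digit character onwards
x$year <- as.numeric(gsub('[^0-9].*', '', x$X))
# quarter: keep only the Q1..Q4 part of labels such as "2013 Q2"
x$quarter <- gsub('[0-9]{4} (Q[1-4])', '\\1', x$X)
# one row (Q1) per year for 2004-2013
subset(x, year >= 2004 & year <= 2013 & quarter == 'Q1')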
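To see what the two regular expressions do without downloading anything, here is a small sketch on a hypothetical vector of labels in the same mixed format as the X column (the labels and the toy data frame are made up for illustration). Note that annual rows such as "1948" have no quarter part, so the second gsub leaves them unchanged; that is harmless here because the filter only keeps rows where quarter is exactly 'Q1'.

```r
# Hypothetical labels mimicking the mixed annual/quarterly X column.
labels <- c("1948", "2004", "2004 Q1", "2004 Q2", "2013 Q1", "2014 Q1")

# Year: strip everything from the first non-digit character onwards.
year <- as.numeric(gsub("[^0-9].*", "", labels))

# Quarter: replace "YYYY Qn" by "Qn"; labels without a quarter stay as they are.
quarter <- gsub("[0-9]{4} (Q[1-4])", "\\1", labels)

toy <- data.frame(label = labels, year = year, quarter = quarter,
                  stringsAsFactors = FALSE)

# Same filter as above: Q1 rows for 2004-2013 only.
result <- subset(toy, year >= 2004 & year <= 2013 & quarter == "Q1")
print(result)
```

Only the "2004 Q1" and "2013 Q1" rows pass the filter: the annual "2004" row fails the quarter test and "2014 Q1" falls outside the year range.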