Search code examples
rrbindcbind

Putting several rows into one column in R


I am trying to run a time series analysis on the following data set:

Year     1771   1772   1773  1774   1775   1776  1777   1778  1779   1780

Number   101     82     66    35     31      7     20     92   154   125

Year     1781   1782   1783  1784   1785   1786  1787   1788  1789   1790

Number    85     68     38    23     10     24    83    132   131    118

Year     1791   1792   1793  1794   1795   1796  1797   1798  1799   1800

Number    90     67     60    47     41     21    16      6     4      7

Year     1801   1802   1803  1804   1805   1806  1807   1808  1809   1810

Number    14     34     45    43     48     42    28     10     8      2

Year     1811   1812   1813  1814   1815   1816  1817   1818  1819   1820

Number     0      1      5    12     14     35    46     41    30     24

Year     1821   1822   1823  1824   1825   1826  1827   1828  1829   1830

Number    16      7      4     2      8     17    36     50    62     67

Year     1831   1832   1833  1834   1835   1836  1837   1838  1839   1840

Number    71     48     28     8     13     57   122    138   103     86

Year     1841   1842   1843  1844   1845   1846  1847   1848  1849   1850

Number     63     37     24    11     15     40    62     98   124     96

Year     1851   1852   1853  1854   1855   1856  1857   1858  1859   1860

Number    66     64     54    39     21      7     4     23    55     94

Year     1861   1862   1863  1864   1865   1866  1867   1868  1869   1870

Number    96     77     59    44     47     30    16      7    37     74

My problem is that the data is placed in multiple rows. I am trying to make two columns from the data. One for Year and one for Number, so that it is easily readable in R. I have tried

> library(tidyverse)
> sun.df = data.frame(sunspots)
> Year = filter(sun.df, sunspots == "Year")

to isolate the Year data, and it works, but I am unsure of how to then place it in a column.

Any suggestions?


Solution

  • Try this:

    library(tidyverse)
    
    df <- read_csv("test.csv",col_names = FALSE)
    df
    
    # A tibble: 6 x 4
    # X1        X2    X3    X4
    # <chr>  <dbl> <dbl> <dbl>
    # 1 Year     123   124   125
    # 2 Number     1     2     3
    # 3 Year     126   127   128
    # 4 Number     4     5     6
    # 5 Year     129   130   131
    # 6 Number     7     8     9
    
    # Removing first column and transpose it to get a dataframe of numbers
    df_number <- as.data.frame(as.matrix(t(df[,-1])),row.names = FALSE)
    df_number
    #   V1 V2  V3 V4  V5 V6
    # 1 123  1 126  4 129  7
    # 2 124  2 127  5 130  8
    # 3 125  3 128  6 131  9
    
    # Keep the first two column (V1,V2) and assign column names
    df_new <- df_number[1:2]
    colnames(df_new) <- c("Year","Number")
    
    # Iterate and rbind with subsequent columns (2 by 2) to df_new
    for(i in 1:((ncol(df_number) - 2 )/2)) {
      df_mini <- df_number[(i*2+1):(i*2+2)]
      colnames(df_mini) <- c("Year","Number")
      df_new <- rbind(df_new,df_mini)
    }
    df_new
    #    Year Number
    # 1  123      1
    # 2  124      2
    # 3  125      3
    # 4  126      4
    # 5  127      5
    # 6  128      6
    # 7  129      7
    # 8  130      8
    # 9  131      9