I wanted to look at the GDP of a few states over a span of 4 years. After importing the .csv file, I renamed the columns and removed the irrelevant rows. As a result, the row numbering skips 10: it goes from 1 to 9, then continues at 11.
When I did the same with a similar data frame imported from an .xls file, the row numbering did not skip 10.
library(dplyr)  # provides %>% and rename()
gdp <- read.csv("GDP_per.csv", skip = 4)
gdp <- gdp %>%
  rename(
    "2014" = X2013.2014,
    "2015" = X2014.2015,
    "2016" = X2015.2016,
    "2017" = X2016.2017,
    "2018" = X2017.2018
  )
# Drop row 10 and rows 53 to 64
gdp <- gdp[c(-10, -(53:64)), ]
library(readxl)  # provides read_excel()
gdp2 <- read_excel("GDP_dol.xls", skip = 5)
gdp2 <- gdp2[, c(2, 20:24)]
gdp2 <- gdp2[c(-10, -(53:64)), ]
9 Delaware 10.7 5.5 -0.7 2.5 3.9
11 Florida 4.9 6.5 5.0 4.4 5.8
vs.
9 Delaware 67178.9 70896.2 70379.8 72167.2 74973.3
10 Florida 839706.0 894044.0 938370.3 979464.6 1036323.2
The read.csv function returns a data.frame, while read_excel returns a tibble. They are not the same and do not necessarily behave the same way. A data frame retains its original row names until you change them, e.g.
(x <- data.frame(V1=1:10, V2=11:20))
(x2 <- x[-5, ]) # Row name 5 is missing
rownames(x2) <- NULL
x2 # Row names 1 - 9
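To make the point that row names are labels rather than positions, here is a minimal sketch that rebuilds the same toy data frame (so the block stands alone) and indexes it both ways:

```r
x <- data.frame(V1 = 1:10, V2 = 11:20)
x2 <- x[-5, ]    # drop the 5th row; the remaining rows keep their old labels

rownames(x2)     # labels still run 1-4 and 6-10, with no "5"
nrow(x2)         # 9 rows remain

x2[5, ]          # positional index: the 5th remaining row, labelled "6"
x2["6", ]        # label index: the same row, looked up by its row name
```

So the gap in the printed numbering is only a label; the row positions are still contiguous.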
A tibble automatically renumbers the rows:
library(tibble)  # home of as_tibble(); tidyr only re-exports it
xt <- as_tibble(x)
(xt[-5, ])  # Rows are numbered 1 - 9
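Applied to the question, one option is to reset the row names right after dropping rows. The sketch below uses `gdp_toy`, a hypothetical three-row stand-in for the imported `gdp` data frame (the placeholder row and the column name `growth_2018` are assumptions for illustration only):

```r
# Hypothetical stand-in for `gdp`; the middle row plays the role of the row to drop
gdp_toy <- data.frame(
  state       = c("Delaware", "(row to drop)", "Florida"),
  growth_2018 = c(3.9, NA, 5.8)
)

gdp_toy <- gdp_toy[-2, ]     # row labels are now "1" and "3"
rownames(gdp_toy) <- NULL    # reset to the default sequence: "1" and "2"
gdp_toy
```

With the real data, the same `rownames(gdp) <- NULL` line after the subsetting step should give the csv-based data frame the same contiguous numbering as the tibble from read_excel.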