I just finished a course on R for data analysis and I am now working on a case study on my own.
Since I am a beginner, please help me understand a problem I did not run into during the course.
I have imported CSV files and I want to assign them to variables with better names.
I am using the following packages: tidyverse, readr, lubridate, ggplot2, janitor, tidyr, skimr.
This is my code:
daily_Activity <- read_csv("../input/bellabeat-dataset/dailyActivity_merged.csv")
daily_Calories <- read_csv("../input/bellabeat-dataset/dailyCalories_merged.csv")
daily_Intensities <- read_csv("../input/bellabeat-dataset/dailyIntensities_merged.csv")
daily_Steps <- read_csv("../input/bellabeat-dataset/dailySteps_merged.csv")
hourly_Calories <- read_csv("../input/bellabeat-dataset/hourlyCalories_merged.csv")
sleep_Day <- read_csv("../input/bellabeat-dataset/sleepDay_merged.csv")
weight_Log <- read_csv("../input/bellabeat-dataset/weightLogInfo_merged.csv")
When I run the code, the tables are created with the new names, but the console also shows me this message:
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
I don't quite understand if this is a problem or if I should just ignore it.
Resources:
Column specification
It would be tedious if you had to specify the type of every column when reading a file. Instead, readr uses some heuristics to guess the type of each column. You can access these results yourself using guess_parser():
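As a small illustration of that guessing step, the sketch below calls guess_parser() on a few character vectors. The input values are made up for the example and are not taken from the files in the question.

```r
library(readr)

# guess_parser() returns the name of the parser readr would pick
# for a character vector. The inputs below are illustrative only.
guess_date <- guess_parser("2010-10-10")
guess_lgl  <- guess_parser(c("TRUE", "FALSE"))
guess_chr  <- guess_parser("hello")

print(guess_date)  # "date"
print(guess_lgl)   # "logical"
print(guess_chr)   # "character"
```

read_csv() applies the same kind of guess to each column, which is what the column specification message summarises.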
Column specification describes the type of each column and the strategy readr uses to guess types so you don’t need to supply them all.
df <- read_csv(readr_example("mtcars.csv"))
will give:
Rows: 32 Columns: 11
-- Column specification ---------------------
Delimiter: ","
dbl (11): mpg, cyl, disp, hp, drat, wt, q...
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
If we then use spec(df):
spec(df)
We will get:
cols(
  mpg = col_double(),
  cyl = col_double(),
  disp = col_double(),
  hp = col_double(),
  drat = col_double(),
  wt = col_double(),
  qsec = col_double(),
  vs = col_double(),
  am = col_double(),
  gear = col_double(),
  carb = col_double()
)
readr will guess the data types if there is no specification, which may take some time. Sometimes readr can't guess the data type correctly (for example, with messy date input). In that case, spec() shows what was guessed, and we have to identify the affected column and set its type ourselves.
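So the message is informational rather than an error. Here is a minimal sketch of the two things it suggests, using readr's bundled mtcars.csv so it runs without the files from the question. It assumes readr 2.0 or later. The choice of integer columns, the inline date data and the "%m/%d/%Y" format are assumptions made for the example, not something taken from the Bellabeat files.

```r
library(readr)

path <- readr_example("mtcars.csv")

# Option 1: keep the guessed types and silence the message for this call
df_quiet <- read_csv(path, show_col_types = FALSE)

# Option 2: state the column types yourself; the message is not shown
# when col_types is supplied. Columns not named here use .default.
df_typed <- read_csv(
  path,
  col_types = cols(
    cyl = col_integer(),
    gear = col_integer(),
    .default = col_double()
  )
)
spec(df_typed)

# A date column in a non-ISO format needs an explicit format string.
# The inline data below is hypothetical.
dates <- read_csv(
  I("id,when\n1,04/12/2016\n2,04/13/2016\n"),
  col_types = cols(
    id = col_integer(),
    when = col_date(format = "%m/%d/%Y")
  )
)
```

If the guessed types already look right in spec(), option 1 is enough. Setting options(readr.show_col_types = FALSE) once at the top of a script should have the same effect for every read_csv() call in the session.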