I am learning ggvis and want to understand how to create a histogram equivalent to the one produced by hist. Specifically, how do you set the bin width and the lower and upper bounds of x in a ggvis histogram? What am I missing?
Question: How do I get the ggvis histogram output to match the hist output?
Let me provide an example:
require(psych)
require(RCurl)
require(ggvis)
if (!exists("impact")) {
  url <- "https://dl.dropboxusercontent.com/u/8272421/stat/stat_one.txt"
  myCsv <- getURL(url, ssl.verifypeer = FALSE)
  impact <- read.csv(textConnection(myCsv), sep = "\t")
  impact$subject <- factor(impact$subject)
}
describe(impact)
hist(impact$verbal_memory_baseline,
     main = "Distribution of verbal memory baseline scores",
     xlab = "score", ylab = "frequency")
OK, let's try to reproduce this with ggvis. The output does not match:
impact %>%
  ggvis(x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5) %>%
  add_axis("x", title = "score") %>%
  add_axis("y", title = "frequency")
How do I get the ggvis output to match the hist output?
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.2 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] psych_1.5.6 knitr_1.11 ggvis_0.4.2.9000 setwidth_1.0-4 colorout_1.1-1 vimcom_1.2-3
loaded via a namespace (and not attached):
[1] Rcpp_0.12.0 digest_0.6.8 dplyr_0.4.3.9000 assertthat_0.1 mime_0.3
[6] R6_2.1.1 jsonlite_0.9.16 xtable_1.7-4 DBI_0.3.1 magrittr_1.5
[11] lazyeval_0.1.10.9000 rstudioapi_0.3.1 rmarkdown_0.7 tools_3.2.2 shiny_0.12.2
[16] httpuv_1.3.3 yaml_2.1.13 parallel_3.2.2 rsconnect_0.4.1.4 mnormt_1.5-3
[21] htmltools_0.2.6
Try:
impact %>%
  ggvis(x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5, boundary = 5) %>%
  add_axis("y", title = "frequency") %>%
  add_axis("x", title = "score", ticks = 5)
The official documentation is a bit cryptic about how boundary and center work. Have a look at DataCamp's How to Make a Histogram with ggvis in R:
The width argument already set the bin width to 5, but where do bins start and where do they end? You can use the center or boundary argument for this. center should refer to one of the bins’ center value, which automatically determines the other bins location. The boundary argument specifies the boundary value of one of the bins. Here again, specifying a single value fixes the location of all bins. As these two arguments specify the same thing in a different way, you should set at most one of center or boundary.
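To make the relationship between width, boundary and center concrete, here is a minimal base R sketch. The helper bin_breaks and the scores vector are hypothetical, made up for illustration; this is not how ggvis computes bins internally, only the arithmetic the description above implies.

```r
# Hypothetical helper (not part of ggvis): derive bin edges from a width plus
# either a boundary or a center, following the description quoted above.
bin_breaks <- function(x, width, boundary = NULL, center = NULL) {
  # Set at most one of boundary or center.
  stopifnot(is.null(boundary) || is.null(center))
  if (is.null(boundary)) {
    # A bin centered at `center` has an edge half a width below it.
    boundary <- if (is.null(center)) 0 else center - width / 2
  }
  # Shift the grid of edges so that `boundary` falls exactly on one of them,
  # then extend it far enough to cover the range of the data.
  lo <- boundary + floor((min(x) - boundary) / width) * width
  hi <- boundary + ceiling((max(x) - boundary) / width) * width
  seq(lo, hi, by = width)
}

scores <- c(76, 81, 84, 88, 89, 90, 92, 93, 95, 97)  # made-up values

by_boundary <- bin_breaks(scores, width = 5, boundary = 5)
by_center   <- bin_breaks(scores, width = 5, center = 77.5)

print(by_boundary)
print(isTRUE(all.equal(by_boundary, by_center)))
```

With these made-up scores both calls give edges at 75, 80, ..., 100, which is why boundary = 5 and center = 77.5 describe the same set of bins.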
If you want the same result using center instead of boundary, try:
impact %>%
  ggvis(x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5, center = 77.5) %>%
  add_axis("y", title = "frequency") %>%
  add_axis("x", title = "score", ticks = 5)
Here you specify the center of one bin (77.5, the midpoint of the bin from 75 to 80), and that determines all the other bins automatically.
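If you are not sure which width and boundary to pass, one option is to ask hist for the breaks it chose and read the values off those. This is only a sketch: the scores vector is made up and stands in for impact$verbal_memory_baseline, and it assumes hist picked equally spaced breaks, which is what its default settings produce.

```r
scores <- c(76, 81, 84, 88, 89, 90, 92, 93, 95, 97)  # made-up values

# plot = FALSE returns the histogram object without drawing anything.
h <- hist(scores, plot = FALSE)

# With equally spaced breaks, any gap is the bin width
# and any break can serve as the boundary.
width    <- diff(h$breaks)[1]
boundary <- h$breaks[1]

print(h$breaks)
print(c(width = width, boundary = boundary))
```

The resulting width and boundary can then be passed to layer_histograms(width = width, boundary = boundary).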