This is my first time posting here after years of reading anonymously, so please bear with me if the question is not formatted correctly.
My dataset stores the number of iterations and particles for each run, together with the associated llh and se values and the run time.
          llh           se Particles Iterations    Time
1         NaN           NA       500          5   7.222
2  -2087.0886  41.53552846      1000          5  14.149
3  -1903.6823 171.30398540      1500          5  19.488
4  -2474.3789           NA      2000          5  25.336
5  -1229.1886   1.33015305      3000          5  37.858
6  -1331.1882   9.66674817      5000          5  60.994
7  -2330.5701  35.17979986      7500          5  92.654
8  -1753.6308 137.62546543       500         10  13.891
9  -1468.1730  64.58164086      1000         10  26.474
10 -2221.8960  73.11124703      1500         10  37.651
11 -2606.5620  46.51251610      2000         10  51.719
12 -1301.0474  12.59814717      3000         10  75.776
13  -927.7820   0.18559457      5000         10 125.121
14 -1180.8230  10.55185851      7500         10 151.593
15 -3109.6442  55.29536888       500         15  15.997
16 -1959.0457  44.58603179      1000         15  39.391
17 -1268.8367  24.06368751      1500         15  58.382
18 -2832.5527           NA      2000         15  76.853
19  -845.2781   0.21124844      3000         15  99.497
20  -845.4272   0.02649884      5000         15 147.611
21 -1446.8511  17.06673528      7500         15 217.608
or if dput() is preferred:
> dput(logliks[1:21,])
structure(list(llh = c(NaN, -2087.08855486818, -1903.6823477862,
-2474.37893002966, -1229.18856210967, -1331.18815912831, -2330.57009669248,
-1753.63084316259, -1468.17297841903, -2221.89596236152, -2606.56196704478,
-1301.0473771866, -927.782003670307, -1180.82300393742, -3109.64417468708,
-1959.04572793909, -1268.83669965093, -2832.5527445189, -845.278087151579,
-845.427210637555, -1446.85110262111), se = c(NA, 41.5355284568715,
171.303985396005, NA, 1.33015305002498, 9.66674817155666, 35.1797998633679,
137.625465433877, 64.5816408601655, 73.1112470277094, 46.5125161022654,
12.5981471672579, 0.185594570806789, 10.5518585121374, 55.2953688797359,
44.5860317855338, 24.0636875106622, NA, 0.21124844438021, 0.0264988432776242,
17.0667352804977), Particles = c(500, 1000, 1500, 2000, 3000,
5000, 7500, 500, 1000, 1500, 2000, 3000, 5000, 7500, 500, 1000,
1500, 2000, 3000, 5000, 7500), Iterations = c(5, 5, 5, 5, 5,
5, 5, 10, 10, 10, 10, 10, 10, 10, 15, 15, 15, 15, 15, 15, 15),
Time = c(7.222, 14.149, 19.488, 25.336, 37.858, 60.994, 92.654,
13.891, 26.474, 37.651, 51.719, 75.776, 125.121, 151.593,
15.997, 39.391, 58.382, 76.853, 99.497, 147.611, 217.608)), row.names = c(NA,
21L), class = "data.frame")
I am trying to draw a box plot, but the boxes are not grouped as expected. I tried discretizing the x-axis, following another post I found here, but that gives me the error "Error: Discrete value supplied to continuous scale".
Here's my code:
library(ggthemes)
g <- ggplot(logliks,aes(x=factor(Iterations), y=llh, group=Particles, fill=factor(Particles)))+
geom_boxplot(position=position_dodge(1))+
ylim(-4000,-400)+
xlim(5,250)+
theme(axis.text.x = element_text(angle=65, vjust=0.6))+
labs(title="log Likelihoods",
subtitle = TeX(paste("For one guess of $\\epsilon$ and $\\kappa$ each")),
caption="Likelihoods with respect to iterations and particles",
x="Iterations",
y ="log Likelihood",
fill = paste("Particles"))+
scale_fill_manual(values = colour_palette_parts)+
guides(colour = guide_legend(override.aes = list(size=6,shape = 20),nrow=2))+
theme_bw()+
themespecs
I have defined my own colour palette colour_palette_parts
colour_palette_parts <- c("#ffbe0b", "#ff8e09", "#ff5d07", "#ff2b05", "#ff040e", "#ff023e", "#ff006e")
and also called library(latex2exp) for the LaTeX symbols in captions/subtitles.
Here's what I want (image taken from another website):
Here's what I get using the above code, except without discretizing the x-axis (i.e. using ...aes(x=Iterations,... instead of ...aes(x=factor(Iterations),...).
I even get the error "position_dodge() requires non-overlapping x intervals".
How can I separate them into little boxes? Kindly help me out. Thanks in advance!
Update: I have found out how to discretize the x-axis without the error:
...aes(x=factor(Iterations,levels=c(5,10,15,20,30,50,100,150,200,250)), y=llh,...
Now it generates an image, but the grouping is still missing: the updated image does not group the boxes by number of particles within each iteration.
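I think this will put the boxes where you want them. geom_boxplot() already dodges the boxes automatically on a per-fill basis, so neither group=Particles nor position_dodge() is needed. I also left out xlim(5,250): numeric limits ask for a continuous x scale, which conflicts with factor(Iterations) and is what triggers "Discrete value supplied to continuous scale". In your example data frame there is only one data point per box, so each box collapses to a thin line, but I think with your full dataset it will look as you expect.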
library(ggplot2)
library(latex2exp) # provides TeX(), as used in the question

ggplot(logliks, aes(x=factor(Iterations), y=llh, fill=factor(Particles)))+
geom_boxplot() +
theme_bw() + # complete theme first, so the axis-text tweak below is not overwritten
theme(axis.text.x = element_text(angle=65, vjust=0.6))+
labs(title="log Likelihoods",
subtitle = TeX("For one guess of $\\epsilon$ and $\\kappa$ each"),
caption="Likelihoods with respect to iterations and particles",
x="Iterations",
y ="log Likelihood",
fill = "Particles")+
scale_fill_manual(values = colour_palette_parts)
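To see the grouping with more than one point per box, here is a minimal sketch on made-up data. The data frame sim, its 20 replicates per combination, and the rnorm() parameters are all assumptions for illustration, not your real results. It counts the boxes ggplot2 actually draws, then shows that adding a numeric xlim() to the discrete x-axis brings the error from the question back.

```r
library(ggplot2)

# Hypothetical data: 20 simulated llh draws for every
# Iterations x Particles combination (NOT the real results).
set.seed(1)
sim <- expand.grid(
  Particles  = c(500, 1000, 1500),
  Iterations = c(5, 10, 15),
  rep        = 1:20
)
sim$llh <- rnorm(nrow(sim), mean = -1500, sd = 300)

# No group= and no position_dodge(): the default grouping is the
# combination of the discrete variables, here x and fill.
p <- ggplot(sim, aes(x = factor(Iterations), y = llh, fill = factor(Particles))) +
  geom_boxplot() +
  labs(x = "Iterations", y = "log Likelihood", fill = "Particles") +
  theme_bw()

# layer_data() returns one row per drawn box.
n_boxes <- nrow(layer_data(p))
print(n_boxes)

# Numeric limits on the discrete x axis make the plot fail when it is built;
# capture the message instead of stopping the script.
res <- tryCatch(
  {
    ggplot_build(p + xlim(5, 250))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(res)
```

Printing p draws three dodged boxes at each of the three iteration values. If you do need to restrict the y range, ylim() or coord_cartesian(ylim = ...) is fine, since y is continuous here.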