
How can I make the circles of my plot smaller in R?


This is the code I used:

resources <- read.csv("https://raw.githubusercontent.com/umbertomig/intro-prob-stat-FGV/master/datasets/resources.csv")

res <- subset(resources, select = c("cty_name", "year", "regime",
                             "oil", "logGDPcp", "illit"))
resNoNA <- na.omit(res)
resNoNAS <- scale(resNoNA[, 3:6])
colMeans(resNoNA[, 3:6])
apply(resNoNA[, 3:6], 2, sd)
cluster2 <- kmeans(resNoNAS, centers = 2)
table(cluster2$cluster)
## this gives standardized answer, which is hard to interpret
cluster2$centers
## better to subset the original data and then compute means
g1 <- resNoNA[cluster2$cluster == 1, ]
colMeans(g1[, 3:6])
g2 <- resNoNA[cluster2$cluster == 2, ]
colMeans(g2[, 3:6])

plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
xlab = "GDP per Capita",  ylab = "Illiteracy", 
col = cluster2$cluster, cex = resNoNA$oil)

I would like to make the circles smaller so that they fit within the limits of the graph.



Solution

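  • The circle size is controlled by cex=, which scales each symbol relative to the default size. Dividing resNoNA$oil by a constant shrinks every circle; compare your original call with the /3 and /5 versions: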

    plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
    xlab = "GDP per Capita",  ylab = "Illiteracy", 
    col = cluster2$cluster, cex = resNoNA$oil)
    plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
    xlab = "GDP per Capita",  ylab = "Illiteracy", 
    col = cluster2$cluster, cex = resNoNA$oil/3)
    plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
    xlab = "GDP per Capita",  ylab = "Illiteracy", 
    col = cluster2$cluster, cex = resNoNA$oil/5)
    


    Note, however, that if you are using this in an automated report generator (e.g., rmarkdown, shiny), you may also need to come at it from the other angle and control the extent of the plot itself: update xlim and ylim so the circles have room.
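
Both ideas can be combined in a small sketch. The data below are made-up stand-ins for resNoNA$logGDPcp, resNoNA$illit, resNoNA$oil and cluster2$cluster, and the helpers rescale_cex() and pad_range() are hypothetical names introduced only for this illustration, not functions from base R or any package. Rescaling the size variable into a fixed cex range is an alternative to dividing by a hand-picked constant, and the 10% padding is an arbitrary choice you would tune by eye.

```r
# Stand-in data (assumption): replace with resNoNA$logGDPcp, resNoNA$illit,
# resNoNA$oil and cluster2$cluster from the question.
set.seed(1)
x    <- runif(30, 5, 10)
y    <- runif(30, 0, 90)
oil  <- runif(30, 0, 10)
grp  <- sample(1:2, 30, replace = TRUE)

# Hypothetical helper: map a numeric vector linearly onto [lo, hi]
# so the largest circle never exceeds a known cex value.
rescale_cex <- function(v, lo = 0.5, hi = 3) {
  r <- range(v, na.rm = TRUE)
  if (diff(r) == 0) return(rep((lo + hi) / 2, length(v)))
  lo + (v - r[1]) / diff(r) * (hi - lo)
}

# Hypothetical helper: widen the data range by a fraction on each side
# so circles centred near the edge are less likely to be clipped.
pad_range <- function(v, frac = 0.1) {
  r <- range(v, na.rm = TRUE)
  r + c(-1, 1) * frac * diff(r)
}

sizes <- rescale_cex(oil)
xl    <- pad_range(x)
yl    <- pad_range(y)

plot(x = x, y = y, main = "Illiteracy v GDP",
     xlab = "GDP per Capita", ylab = "Illiteracy",
     col = grp, cex = sizes, xlim = xl, ylim = yl)
```

With the real data, you would pass resNoNA$oil to rescale_cex() and the two plotted columns to pad_range(); whether the padding is enough depends on the device size, so it may still need adjusting in a report.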