Tags: r, ggplot2, chemistry

How to display counts using the periodic table with ggplot?


I have a list of elemental compositions, and I'd like to count how many compositions include each element and display those counts on the periodic table (e.g. CH4 would increase the count for both H and C by one).

How can I do this with ggplot? Is there a map I can use?


Solution

  • With a bit of searching, I found information about the periodic table in this example code project. It included an Access database with element information, which I've exported to this gist. You can import the data using the httr library with:

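    library(httr)
    # Fetch the raw gist and parse the response body as comma-separated text
    url <- "https://gist.githubusercontent.com/MrFlick/c1183c911bc5398105d4/raw/715868fba2d0d17a61a8081de17c468bbc525ab1/elements.txt"
    dd <- read.table(text=content(GET(url), as="text", encoding="UTF-8"), sep=",", header=TRUE)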
    

    (You should probably create your own local version for easier loading in the future.)
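
    One possible way to keep a local copy is sketched below. The function name load_elements and the default file name are hypothetical, and the sketch uses base R's download.file() and read.csv() rather than httr, so treat it as a starting point rather than a finished solution.

    ```r
    # Sketch: read the element table from a local file, downloading it only
    # when that file does not exist yet. `load_elements` and the default
    # path "elements.csv" are illustrative names, not part of any package.
    load_elements <- function(url, path = "elements.csv") {
        if (!file.exists(path)) {
            download.file(url, destfile = path, quiet = TRUE)
        }
        read.csv(path, header = TRUE, stringsAsFactors = FALSE)
    }

    # Usage (pass the gist URL shown above):
    # dd <- load_elements(gist_url)
    ```

    After the first call the file is read from disk, so later sessions no longer need network access.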

    The other challenge is decomposing something like "CH4" into the raw element counts. I've written this helper function, which I think does what you need:

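    decompose <- function(x) {
        # Match each element symbol (e.g. "C" or "Ca") with its optional count;
        # this handles simple formulas only (no parentheses or hydrates)
        m <- gregexpr("([A-Z][a-z]?)(\\d*)", x, perl=TRUE)
        dx <- Map(function(parts, formula) {
            ElementSymbol <- gsub("\\d", "", parts)
            cnt <- as.numeric(gsub("\\D", "", parts))
            cnt[is.na(cnt)] <- 1  # no digits means a count of one
            # xtabs() sums the counts of a symbol that appears more than once
            cbind(Sym=formula, as.data.frame(xtabs(cnt~ElementSymbol)))
        }, regmatches(x, m), x)
        do.call(rbind, dx)
    }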
    

    Here I test the function:

    test_input <- c("H2O","CH4")
    decompose(test_input)
    
    #   Sym ElementSymbol Freq
    # 1 H2O             H    2
    # 2 H2O             O    1
    # 3 CH4             C    1
    # 4 CH4             H    4
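
    Note that Freq here is the number of atoms, while the question asks how many compositions contain each element. Since decompose() returns one row per element per input composition, counting rows per element is one way to get that number. The sketch below assumes the output shape shown above (typed in by hand so it runs on its own); element_presence is a hypothetical helper name.

    ```r
    # Hand-typed copy of the decompose() output shown above.
    counts <- data.frame(
        Sym = c("H2O", "H2O", "CH4", "CH4"),
        ElementSymbol = c("H", "O", "C", "H"),
        Freq = c(2, 1, 1, 4),
        stringsAsFactors = FALSE
    )

    # Hypothetical helper: number of input compositions containing each
    # element. Assumes one row per element per composition.
    element_presence <- function(d) {
        as.data.frame(table(ElementSymbol = d$ElementSymbol),
                      responseName = "Freq", stringsAsFactors = FALSE)
    }

    element_presence(counts)
    #   ElementSymbol Freq
    # 1             C    1
    # 2             H    2
    # 3             O    1
    ```

    The result keeps the ElementSymbol and Freq column names, so it could be merged with the reference table in place of the raw decompose() output.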
    

    Now we can combine the data and the reference information to make a plot:

    library(ggplot2)
    ggplot(merge(decompose("CH4"), dd), aes(Column, -Row)) + 
        geom_tile(data=dd, aes(fill=GroupName), color="black") + 
        geom_text(aes(label=Freq))
    


    Clearly there are opportunities for improvement, but this should give you a good start.
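
    As one example of such an improvement, the count could drive the tile colour while the text shows the element symbol. This is only a sketch: dd_demo is a tiny hypothetical stand-in for dd, assuming it has ElementSymbol, Row and Column columns, and the colours are arbitrary.

    ```r
    library(ggplot2)

    # Tiny stand-in for `dd` with the columns this sketch assumes;
    # the real table has one row per element.
    dd_demo <- data.frame(
        ElementSymbol = c("H", "He", "C", "O"),
        Row = c(1, 1, 2, 2),
        Column = c(1, 18, 14, 16),
        stringsAsFactors = FALSE
    )
    # Per-element counts, e.g. from H2O and CH4 together.
    counts <- data.frame(ElementSymbol = c("C", "H", "O"), Freq = c(1, 2, 1))

    # all.x = TRUE keeps elements without a count; their Freq is NA.
    plot_data <- merge(dd_demo, counts, by = "ElementSymbol", all.x = TRUE)

    p <- ggplot(plot_data, aes(Column, -Row)) +
        geom_tile(aes(fill = Freq), color = "black") +
        geom_text(aes(label = ElementSymbol)) +
        scale_fill_gradient(low = "white", high = "steelblue", na.value = "grey90") +
        coord_fixed()
    p
    ```

    Merging from the reference table with all.x = TRUE means a single data frame feeds both layers, and elements that never occur are drawn in the na.value colour.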

    You might want a more robust decomposition function. The CHNOSZ package appears to provide one, makeup():

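    library(CHNOSZ)
    data(thermo)  # loads CHNOSZ's data; may be unnecessary in recent versions
    decompose <- function(x) {
        do.call(rbind, lapply(x, function(formula) {
            z <- makeup(formula)  # named vector of element counts
            cbind(data.frame(ElementSymbol=names(z), Freq=z), Sym=formula)
        }))
    }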
    ggplot(merge(decompose("CaAl2Si2O7(OH)2*H2O"), dd), aes(Column, -Row)) + 
        geom_tile(data=dd, aes(fill=GroupName), color="black") + 
        geom_text(aes(label=Freq))