Search code examples
juliacomposition

How to Histogram Unitful Units in Julia?


I am trying to figure out how to bin/histogram an array of data in Julia. I have an array of units from the Unitful.jl package and want to use Histogram from StatsBase to bin the data. The first error I got was an error saying there were no methods for log10 to use Unitful.FreeUnits so I wrote one. Now I get a dimensional error. I thought Unitful was just going to work with other stuff.. guess not. The following is where I am at.

using Unitful
using StatsBase

data = [rand()*100*1u"MHz" for x in 1:10000]

function Base.log10(x::Quantity{})
    u = unit(x)
    return log10(x.val)u
end

# eventually I want to define my bin width manually.. but this is a start.
fit(Histogram, data)

ERROR

ERROR: DimensionError: 0.0 and 0.8237981449864736 MHz are not dimensionally compatible.
Stacktrace:
 [1] _lt at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:274 [inlined]
 [2] <(::Quantity{Float64,NoDims,Unitful.FreeUnits{(),NoDims,nothing}}, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:264
 [3] <(::Int64, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:266
 [4] <=(::Int64, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at ./operators.jl:326
 [5] >=(::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Int64) at ./operators.jl:350
 [6] histrange(::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Int64, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:51
 [7] histrange(::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}, ::Int64, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:39
 [8] (::StatsBase.var"#127#128"{Symbol})(::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}, ::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:103
 [9] map(::StatsBase.var"#127#128"{Symbol}, ::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}, ::Tuple{Int64}) at ./tuple.jl:176
 [10] histrange(::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}, ::Tuple{Int64}, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:102
 [11] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}; closed::Symbol, nbins::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:332
 [12] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}; closed::Symbol, nbins::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:276
 [13] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:276
 [14] fit(::Type{Histogram}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:383
 [15] fit(::Type{Histogram}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:383
 [16] top-level scope at REPL[183]:1
 [17] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288

Solution

  • [EDITED because I misunderstood the question]

    Are you looking to plot a histogram of the data? If yes, instead of StatsBase.jl, I would use the histogram function of a plotting package like Plots.jl. For instance, to combine plots from Plots.jl with units from Unitful.jl, you can use the UnitfulRecipes.jl package. See this MWE that may be what you are after:

    using Unitful: MHz
    using Plots
    using UnitfulRecipes
    data = 100 * exp.(randn(10000)) * MHz
    histogram(data)
    

    will output

    histogram of data with units


    [BEFORE EDIT]

    I don't think it makes sense mathematically or physically to take the logarithm of a non-nondimensional variable (i.e., a variable with a unit). That is, your redefinition of log10 is not a good idea IMHO. Instead, I would non-dimensionalize the data before taking the log, with something similar to

    using Unitful: MHz
    using StatsBase
    data = 100 * rand(10000) * MHz
    data_nodim = log10.(data / MHz) # <- this is valid
    fit(Histogram, data_nodim)      # <- this is valid too