I am trying to figure out how to bin/histogram an array of data in Julia. I have an array of units from the Unitful.jl
package and want to use Histogram
from StatsBase
to bin the data. The first error I got was an error saying there were no methods for log10
to use Unitful.FreeUnits
so I wrote one. Now I get a dimensional error. I thought Unitful was just going to work with other stuff.. guess not. The following is where I am at.
using Unitful
using StatsBase
data = [rand()*100*1u"MHz" for x in 1:10000]
function Base.log10(x::Quantity{})
u = unit(x)
return log10(x.val)u
end
# eventually I want to define my bin width manually.. but this is a start.
fit(Histogram, data)
ERROR
ERROR: DimensionError: 0.0 and 0.8237981449864736 MHz are not dimensionally compatible.
Stacktrace:
[1] _lt at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:274 [inlined]
[2] <(::Quantity{Float64,NoDims,Unitful.FreeUnits{(),NoDims,nothing}}, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:264
[3] <(::Int64, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at /home/mcamp/.julia/packages/Unitful/1t88N/src/quantities.jl:266
[4] <=(::Int64, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}) at ./operators.jl:326
[5] >=(::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Int64) at ./operators.jl:350
[6] histrange(::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}}, ::Int64, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:51
[7] histrange(::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}, ::Int64, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:39
[8] (::StatsBase.var"#127#128"{Symbol})(::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}, ::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:103
[9] map(::StatsBase.var"#127#128"{Symbol}, ::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}, ::Tuple{Int64}) at ./tuple.jl:176
[10] histrange(::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}, ::Tuple{Int64}, ::Symbol) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:102
[11] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Tuple{Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}}; closed::Symbol, nbins::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:332
[12] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}; closed::Symbol, nbins::Int64) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:276
[13] fit(::Type{Histogram{Int64,N,E} where E where N}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:276
[14] fit(::Type{Histogram}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:383
[15] fit(::Type{Histogram}, ::Array{Quantity{Float64,𝐓^-1,Unitful.FreeUnits{(MHz,),𝐓^-1,nothing}},1}) at /home/mcamp/.julia/packages/StatsBase/EA8Mh/src/hist.jl:383
[16] top-level scope at REPL[183]:1
[17] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288
[EDITED because I misunderstood the question]
Are you looking to plot a histogram of the data? If yes, instead of StatsBase.jl, I would use the histogram
function of a plotting package like Plots.jl. For instance, to combine plots from Plots.jl with units from Unitful.jl, you can use the UnitfulRecipes.jl package. See this MWE that may be what you are after:
using Unitful: MHz
using Plots
using UnitfulRecipes
data = 100 * exp.(randn(10000)) * MHz
histogram(data)
will output
[BEFORE EDIT]
I don't think it makes sense mathematically or physically to take the logarithm of a non-nondimensional variable (i.e., a variable with a unit). That is, your redefinition of log10
is not a good idea IMHO. Instead, I would non-dimensionalize the data before taking the log, with something similar to
using Unitful: MHz
using StatsBase
data = 100 * rand(10000) * MHz
data_nodim = log10.(data / MHz) # <- this is valid
fit(Histogram, data_nodim) # <- this is valid too