
Converting column types in Julia by looping over a vector of columns


After quite a few attempts, I can't seem to access the columns of the data frame pictured below (partially) for type conversion. They are all strings after import through CSV.read, and I need them to be numbers.

Thanks in advance for your time!

Showing the data frame with the first few columns to be converted (please note that the Series Code column is not one of the columns to be converted).

I've tried (1)

col_years = contains.(names(df),"YR")
transform!(df, names(df)[col_years] .=> ByRow(x->parse(Number, x)), renamecols = false)

with no success, as well as (2)

for c in names(df)[col_years]
    df[!, c] = convert.(Number, df[!, c])
end

Error message for (1)

MethodError: no method matching parse(::Type{Number}, ::String31)
Closest candidates are:
  parse(::Type{T}, ::AbstractString; base) where T<:Integer at parse.jl:240
  parse(::Type{T}, ::AbstractString; kwargs...) where T<:Real at parse.jl:379
  parse(::Type{P}, ::AbstractString; kwargs...) where P<:FilePathsBase.AbstractPath at ~/.julia/packages/FilePathsBase/9kSEl/src/path.jl:77

When I try using the closest candidates it still fails, but when substituting Int for Number above, I get

ArgumentError: invalid base 10 digit '.' in "80.0511140452116"

(Error message for (2) is highly similar)

A data frame similar to the one I'm working with (please note that Mr. Kaminski's answer works for it, apparently because Julia likes 'String' but not 'String31' for Float64 conversions?)

using DataFrames

# Named df so it can be used directly with the snippets above
df = DataFrame(Series = ["SP.POP.DPND"],
               YR1960 = ["80.0511140452116"],
               YR1961 = ["80.2223403638055"],
               YR1962 = ["80.4019428356728"])

Solution

  • Parse the values as Float64:

    julia> parse(Float64, "12.3")
    12.3
    

    However, if CSV.read did not parse them as Float64 automatically, it means that some values in your data frame are invalid and cannot be parsed as numbers. In such a case you can either:

    1. find them and fix them (I would recommend this option),
    2. or ignore them by using tryparse.
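
The two options above could look roughly like the sketch below. It is only an illustration on toy data modeled on the question: the second row and its "n/a" cell are invented to stand in for a bad value, and the names year_cols and to_float are hypothetical helpers, not part of DataFrames.jl.

```julia
using DataFrames

# Toy data modeled on the question; the "n/a" cell is an invented bad value
df = DataFrame(Series = ["SP.POP.DPND", "SP.POP.TOTL"],
               YR1960 = ["80.0511140452116", "n/a"],
               YR1961 = ["80.2223403638055", "3.5"])

# Pick the year columns by name (hypothetical helper variable)
year_cols = filter(startswith("YR"), names(df))

# Option 1: locate the values that cannot be parsed, so they can be fixed
for c in year_cols
    bad = findall(x -> tryparse(Float64, x) === nothing, df[!, c])
    isempty(bad) || println(c, ": unparseable rows ", bad, " => ", df[bad, c])
end

# Option 2: convert anyway, turning unparseable values into missing.
# tryparse returns nothing on failure; something() swaps that for missing.
to_float(x) = something(tryparse(Float64, x), missing)
transform!(df, year_cols .=> ByRow(to_float), renamecols = false)
```

After Option 2 the year columns hold Float64 values, with missing wherever the original string was not a valid number, so it is worth running Option 1 first to see what would be dropped.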