I have the following dataframe called df:
df = DataFrame(i=1:5,
x=[missing, missing, missing, missing, missing],
y=[missing, missing, 1, 3, 6])
5×3 DataFrame
Row │ i x y
│ Int64 Missing Int64?
─────┼─────────────────────────
1 │ 1 missing missing
2 │ 2 missing missing
3 │ 3 missing 1
4 │ 4 missing 3
5 │ 5 missing 6
I would like to remove the columns where all values are missing. In this case it should remove column x because it has only all missing
values. with dropmissing
it removes all rows, but that's not what I want. So I was wondering if anyone knows how to remove only columns where all values are missing in a dataframe Julia?
df1 = DataFrame()
foreach(
x->all(ismissing, df[!, x]) ? nothing : df1[!, x] = df[!, x],
propertynames(df)
)
df
# 5×2 DataFrame
# Row │ i y
# │ Int64 Int64?
# ─────┼────────────────
# 1 │ 1 missing
# 2 │ 2 missing
# 3 │ 3 1
# 4 │ 4 3
# 5 │ 5 6
df[:, map(x->!all(ismissing, df[!, x]), propertynames(df))]
# 5×2 DataFrame
# Row │ i y
# │ Int64 Int64?
# ─────┼────────────────
# 1 │ 1 missing
# 2 │ 2 missing
# 3 │ 3 1
# 4 │ 4 3
# 5 │ 5 6
# OR
df[!, map(x->!all(ismissing, x), eachcol(df))]
# 5×2 DataFrame
# Row │ i y
# │ Int64 Int64?
# ─────┼────────────────
# 1 │ 1 missing
# 2 │ 2 missing
# 3 │ 3 1
# 4 │ 4 3
# 5 │ 5 6
#Or
df[!, Not(names(df, all.(ismissing, eachcol(df))))]
# I omitted the result to prevent this answer from becoming extensively lengthy.
#Or
df[!, Not(all.(ismissing, eachcol(df)))]
deleteat!
function:deleteat!(permutedims(df), all.(ismissing, eachcol(df))) |> permutedims
# 5×2 DataFrame
# Row │ i y
# │ Int64 Int64?
# ─────┼────────────────
# 1 │ 1 missing
# 2 │ 2 missing
# 3 │ 3 1
# 4 │ 4 3
# 5 │ 5 6
select!
function, as Dan noted:select!(df, [k for (k,v) in pairs(eachcol(df)) if !all(ismissing, v)])
# 5×2 DataFrame
# Row │ i y
# │ Int64 Int64?
# ─────┼────────────────
# 1 │ 1 missing
# 2 │ 2 missing
# 3 │ 3 1
# 4 │ 4 3
# 5 │ 5 6