Using the DataFramesMeta.jl module, operations can be chained in Julia using the following approach:
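using DataFrames, DataFramesMeta
import Lazy: @>  # the @> threading macro comes from Lazy.jl
df = DataFrame(a = collect(1:5), b = ["a","b","c","d","e"])
# thread df through each step as the first argument
@> begin
    df
    @where(:a .> 2)
    @select(:a, :b, c = :a*2)
end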
# or:
print(
@linq df |>
@where(:a .> 2) |>
@select(:a,:b, c = :a*2)
)
3x3 DataFrames.DataFrame
| Row | a | b | c |
|-----|---|-----|----|
| 1 | 3 | "c" | 6 |
| 2 | 4 | "d" | 8 |
| 3 | 5 | "e" | 10 |
The idea is that df is the first argument of the @where macro, and the whole @where statement feeds in as the first argument of the @select macro.
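To make the first-argument threading concrete, here is a minimal sketch with plain strings instead of a DataFrame, so it only needs Lazy. The variable names are mine and purely illustrative.

```julia
import Lazy: @>

# Each step receives the previous result as its FIRST argument.
threaded = @> "hello" uppercase() string("!")

# The same computation written as ordinary nested calls.
nested = string(uppercase("hello"), "!")

println(threaded)  # HELLO!
```

Both forms should give the same value; the macro only rewrites the call order.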
However, we might want the top line to become the second argument, or we might want to use it in several places. For R users (in R, operations can be chained using %>%), the dplyr package enables this with a dot (.) notation, so the following would work:
library(dplyr)
df = data.frame(a = 1:5, b = c("a","b","c","d","e"))
df %>% filter(a > 2) %>% mutate(c = nrow(.):1) %>% select(b,c)
b c
1 c 3
2 d 2
3 e 1
I was looking for a way to mimic R's dot notation, but unfortunately it does not work in Julia, and I couldn't find anything about this in the package documentation.
If anyone knows how this can be achieved, please let us know.
When using Lazy, the @as macro lets you name the threaded argument:
@as _ x f(_, y) g(z, _) == g(z, f(x, y))
With the @as macro, the task from the question could be done like this:
julia> import Lazy.@as
julia> using DataFrames, DataFramesMeta
julia> df = DataFrame(a = collect(1:5), b = ["a","b","c","d","e"])
5x2 DataFrames.DataFrame
| Row | a | b |
|-----|---|-----|
| 1 | 1 | "a" |
| 2 | 2 | "b" |
| 3 | 3 | "c" |
| 4 | 4 | "d" |
| 5 | 5 | "e" |
julia> @as _ df @where(_, :a .> 2) @select(_,:a, :b, c = :a*2)
3x3 DataFrames.DataFrame
| Row | a | b | c |
|-----|---|-----|----|
| 1 | 3 | "c" | 6 |
| 2 | 4 | "d" | 8 |
| 3 | 5 | "e" | 10 |
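For a case closer to the R example, where the threaded value is not the first argument and is used more than once in a step, here is a minimal sketch with a plain vector instead of a DataFrame, so it only needs Lazy. The placeholder name x and the variable names are my own choices for illustration. A regular name is used rather than _ because, as far as I know, Julia 1.0 and later reject _ when it is read as a value.

```julia
import Lazy: @as

data = [1, 2, 3, 4, 5]

# x names the threaded value, so it can go in any argument position
# and can appear several times within one step.
paired = @as x data filter(v -> v > 2, x) collect(zip(x, length(x):-1:1))

println(paired)  # [(3, 3), (4, 2), (5, 1)]
```

The second step plays the role of mutate(c = nrow(.):1) in the dplyr pipeline: it pairs each remaining value with a descending counter based on the length of the filtered result.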