Given a price frame priceFrame
as
28881 29021 29399
2010-01-01 00:00:00 -> 123.535878499576 195.28635425580265 189.92210186152082
2010-01-04 00:00:00 -> 124.19087548338847 198.10448102247753 190.1571733631235
2010-01-05 00:00:00 -> 123.82028508465247 197.8259452373992 190.31388769752525
2010-01-06 00:00:00 -> 124.17363872065654 197.80956077945342 189.98478759528152
2010-01-07 00:00:00 -> 123.4583130672824 197.58017836821244 190.31388769752527
2010-01-08 00:00:00 -> 124.23396739021821 198.10448102247756 190.25120196376457
2010-01-11 00:00:00 -> 125.12166067091142 197.87509861123658 190.73701640041008
2010-01-12 00:00:00 -> 124.9234378994945 195.0569718445617 191.41088803833776
2010-01-13 00:00:00 -> 125.06133200134975 195.64681233060992 191.50491663897884
2010-01-14 00:00:00 -> 124.97514818769021 196.28580619049552 191.56760237273951
2010-01-15 00:00:00 -> 123.71686450826103 192.5829186947483 192.08475967626538
2010-01-18 00:00:00 -> 123.71686450826103 194.10667328370621 192.31983117786805
2010-01-19 00:00:00 -> 123.15666971947407 195.87619474185092 191.94371677530378
2010-01-20 00:00:00 -> 121.5622691667727 191.79646471335064 192.82131704795376
2010-01-21 00:00:00 -> 121.5450324040408 188.38849746062752 192.9937028157957
2010-01-22 00:00:00 -> 121.81220222638535 186.8647428716696 192.9937028157957
2010-01-25 00:00:00 -> 121.94147794687466 184.83307008639233 192.9937028157957
2010-01-26 00:00:00 -> 121.38990153945363 185.9799821425972 193.19743145051802
2010-01-27 00:00:00 -> 120.94174570842405 184.91499237612123 193.3541457849198
2010-01-28 00:00:00 -> 120.44187958919875 182.5392459739825 193.22877431739838
2010-01-29 00:00:00 -> 119.4938576389439 183.75169586197052 193.35414578491978
and a dividends frame divFrame
as
28881 29021 29399
2010-01-04 00:00:00 -> 1.3 <missing> <missing>
2010-01-13 00:00:00 -> <missing> 1.3 <missing>
2010-01-22 00:00:00 -> <missing> <missing> 1.3
I want to combine them such that I have prices + dividends where dividends exists, otherwise keep the prices as they were.
Both of the following attempts
let dfZipped1 = priceFrame.Zip(divFrame, JoinKind.Left, JoinKind.Left, Lookup.Exact, false, fun (p:float) d -> p + d)
dfZipped1.Print()
let dfZipped2 = priceFrame.Zip(divFrame, JoinKind.Left, JoinKind.Left, Lookup.Exact, true, fun (p:float) d -> p + d)
dfZipped2.Print()
result in the same output
28881 29021 29399
2010-01-01 00:00:00 -> <missing> <missing> <missing>
2010-01-04 00:00:00 -> 125.49087548338846 <missing> <missing>
2010-01-05 00:00:00 -> <missing> <missing> <missing>
2010-01-06 00:00:00 -> <missing> <missing> <missing>
2010-01-07 00:00:00 -> <missing> <missing> <missing>
2010-01-08 00:00:00 -> <missing> <missing> <missing>
2010-01-11 00:00:00 -> <missing> <missing> <missing>
2010-01-12 00:00:00 -> <missing> <missing> <missing>
2010-01-13 00:00:00 -> <missing> 196.94681233060993 <missing>
2010-01-14 00:00:00 -> <missing> <missing> <missing>
2010-01-15 00:00:00 -> <missing> <missing> <missing>
2010-01-18 00:00:00 -> <missing> <missing> <missing>
2010-01-19 00:00:00 -> <missing> <missing> <missing>
2010-01-20 00:00:00 -> <missing> <missing> <missing>
2010-01-21 00:00:00 -> <missing> <missing> <missing>
2010-01-22 00:00:00 -> <missing> <missing> 194.2937028157957
2010-01-25 00:00:00 -> <missing> <missing> <missing>
2010-01-26 00:00:00 -> <missing> <missing> <missing>
2010-01-27 00:00:00 -> <missing> <missing> <missing>
2010-01-28 00:00:00 -> <missing> <missing> <missing>
2010-01-29 00:00:00 -> <missing> <missing> <missing>
The numbers that are not missing are correct. But I want to retain the prices that do not have dividends.
let dfZipped3 = priceFrame.Zip(divFrame, JoinKind.Left, JoinKind.Left, Lookup.Exact, false, fun (p:float) d -> p + (d |> Option.defaultValue 0.0))
dfZipped3.Print()
results in
28881 29021 29399
2010-01-01 00:00:00 -> 123.535878499576 195.28635425580265 189.92210186152082
2010-01-04 00:00:00 -> 124.19087548338847 198.10448102247753 190.1571733631235
2010-01-05 00:00:00 -> 123.82028508465247 197.8259452373992 190.31388769752525
2010-01-06 00:00:00 -> 124.17363872065654 197.80956077945342 189.98478759528152
2010-01-07 00:00:00 -> 123.4583130672824 197.58017836821244 190.31388769752527
2010-01-08 00:00:00 -> 124.23396739021821 198.10448102247756 190.25120196376457
2010-01-11 00:00:00 -> 125.12166067091142 197.87509861123658 190.73701640041008
2010-01-12 00:00:00 -> 124.9234378994945 195.0569718445617 191.41088803833776
2010-01-13 00:00:00 -> 125.06133200134975 195.64681233060992 191.50491663897884
2010-01-14 00:00:00 -> 124.97514818769021 196.28580619049552 191.56760237273951
2010-01-15 00:00:00 -> 123.71686450826103 192.5829186947483 192.08475967626538
2010-01-18 00:00:00 -> 123.71686450826103 194.10667328370621 192.31983117786805
2010-01-19 00:00:00 -> 123.15666971947407 195.87619474185092 191.94371677530378
2010-01-20 00:00:00 -> 121.5622691667727 191.79646471335064 192.82131704795376
2010-01-21 00:00:00 -> 121.5450324040408 188.38849746062752 192.9937028157957
2010-01-22 00:00:00 -> 121.81220222638535 186.8647428716696 192.9937028157957
2010-01-25 00:00:00 -> 121.94147794687466 184.83307008639233 192.9937028157957
2010-01-26 00:00:00 -> 121.38990153945363 185.9799821425972 193.19743145051802
2010-01-27 00:00:00 -> 120.94174570842405 184.91499237612123 193.3541457849198
2010-01-28 00:00:00 -> 120.44187958919875 182.5392459739825 193.22877431739838
2010-01-29 00:00:00 -> 119.4938576389439 183.75169586197052 193.35414578491978
all the prices are there but none of the dividends have been added
let dfZipped4 = priceFrame.Zip(divFrame, JoinKind.Left, JoinKind.Left, Lookup.Exact, true, fun (p:float) d -> p + (d |> Option.defaultValue 0.0))
dfZipped4.Print()
results in nothing but missing values.
How do I add the prices to the dividends when they align, but otherwise leave the prices unchanged?
Update
I have timed the execution of each of the answers from Frocha and zuzhu. The second answer from zyzhu does not, as it stands, produce the correct result.
For 1000 sequential runs of each tecnique I get typical times of
frocha1: 572.974400
frocha2: 562.867600
zyzhu1: 1099.057100
frocha2 is consistently slightly faster than frocha1. zyzhu1 is always slower than the others. So for now I am accepting Frocha's answer.
However if zyzhu2 can be made to work, it may end up being the fastest, since it is the simplest. In that case I will change the accepted answer.
My approach does not consider speed constraints and is as follow: 1) rename the columns to be able to perform a Join without error 2) join the frames. 3) replace missing values with zeros. 4) sum the corresponding columns. 5) drop the dividends columns. 6) Optional: change priceFrame's col name to the original type if the "string" transformation is not desired.
module Frame =
//I usually add this handy function to the Frame module
let mapReplaceCol col f frame =
frame
|> Frame.replaceCol col (Frame.mapRowValues f frame)
let priceFrame' = priceFrame |> Frame.mapColKeys string
//appends a "D" in the col key to eliminate col with same name
let dividends' =
dividends
|> Frame.mapColKeys (string >> (+) "D")
let joinedFrame =
priceFrame'
|> Frame.join JoinKind.Right dividends'
|> Frame.fillMissingWith 0.
(joinedFrame,priceFrame'.ColumnKeys |> List.ofSeq)
||> List.fold (fun acc elem ->
acc|> Frame.mapReplaceCol elem (fun row ->
row.GetAs<float>("D" + elem) + row.GetAs<float>(elem))
|> Frame.dropCol ("D" + elem))
EDIT
Another approach using Zip.
//generate a dividends frame with the same rows of priceFrame
let dividends2 =
(priceFrame,priceFrame.ColumnKeys |> List.ofSeq)
||> List.fold (fun acc elem -> acc|> Frame.dropCol elem) //empty frame
|> Frame.join JoinKind.Outer dividends
|> Frame.fillMissingWith 0.
(priceFrame,dividends2) ||> Frame.zip (fun (p : float) (d : float) -> p + d)