The function add_ints correctly adds two integer columns
A,B
2,3
5,7
9,11
in a CSV file.
Why does the function add_strings not correctly concatenate two string columns
L,R
"a","b"
"c","d"
"e","f"
into a third column
L,R,C
"a","b","ab"
"c","d","cd"
"e","f","ef"
when starting from a similar CSV file?
using Deedle;
using System.IO;

namespace NS
{
    class TwoColumnOps
    {
        static void Main(string[] args)
        {
            string root = "path/to";
            add_ints(root);
            add_strings(root);
        }

        static void add_ints(string root)
        {
            Deedle.Frame<int, string> df = Frame.ReadCsv(Path.Combine(root, "data_ints.csv"));
            Series<int, int> a = df.GetColumn<int>("A");
            Series<int, int> b = df.GetColumn<int>("B");
            Series<int, int> c = a + b;
            df.AddColumn("C", c);
            df.Print();
        }

        static void add_strings(string root)
        {
            Deedle.Frame<int, string> df = Frame.ReadCsv(Path.Combine(root, "data_strings.csv"));
            Series<int, string> a = df.GetColumn<string>("L");
            Series<int, string> b = df.GetColumn<string>("R");
            // Series<int, string> c = a + b;
            // Series<int, string> c = $"{a} and {b}";
            Series<int, string> c = string.Concat(a, b);
            df.AddColumn("C", c);
            df.Print();
        }
    }
}
The error for all three styles of concatenation is:
Error CS0029 Cannot implicitly convert type 'string' to 'Deedle.Series<int, string>'
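The + operator works on series of numbers because the series type defines an overloaded + operator for numerical series. Unfortunately, that overload exists only for numeric value types. string.Concat does not help either: called with two series, it resolves to the ordinary string.Concat(object, object) overload, which returns a single string rather than a series, hence error CS0029.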
For non-numeric series, the easiest option is to use ZipInner to align the two series on their keys. This gives you a series of tuples. You can then use Select to transform the values element-wise:
var df = Frame.ReadCsv("/some/test/file.csv");
var s1 = df.GetColumn<string>("first");
var s2 = df.GetColumn<string>("second");
var added = s1.ZipInner(s2).Select(t => t.Value.Item1 + t.Value.Item2);
df.AddColumn("added", added);
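Applied to the question's add_strings, the same approach could look roughly like the sketch below. It assumes data_strings.csv has the columns L and R shown in the question. The class name StringColumnOps and the helper ConcatColumns are hypothetical names introduced here for illustration and are not part of Deedle.

```csharp
using System.IO;
using Deedle;

static class StringColumnOps
{
    // Hypothetical helper (not a Deedle API): joins two string series key by key.
    public static Series<int, string> ConcatColumns(
        Series<int, string> left, Series<int, string> right)
    {
        // ZipInner keeps only the keys present in both series and pairs their values;
        // Select then maps each (left, right) tuple to the concatenated string.
        return left.ZipInner(right)
                   .Select(kvp => kvp.Value.Item1 + kvp.Value.Item2);
    }

    // Reworked add_strings from the question.
    // Assumes data_strings.csv contains the string columns L and R.
    public static void add_strings(string root)
    {
        Frame<int, string> df = Frame.ReadCsv(Path.Combine(root, "data_strings.csv"));
        Series<int, string> a = df.GetColumn<string>("L");
        Series<int, string> b = df.GetColumn<string>("R");
        df.AddColumn("C", ConcatColumns(a, b));
        df.Print();
    }
}
```

Because ZipInner performs an inner join on the row keys, a row missing from either column has no entry in the resulting series, so no half-built string is produced for it.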