I have a list of entries. Each entry contains a string and a numerical value. The same string can appear more than once in the list, each time with a different numerical value. I am looking for an elegant way to process the list. The end result should be a list of unique pairs, with each unique string associated with the sum of all numerical values for that string. I am considering creating a HashSet<KeyValuePair<string, double>> and adding each string to it, but I am not sure how to add all the values together in a single loop. The code below works, but it is inefficient.
SortedSet<string> symbols = new SortedSet<string>();
HashSet<KeyValuePair<string, double>> results = new HashSet<KeyValuePair<string, double>>();
string file = openFileDialog1.FileName;
string[] lines = File.ReadAllLines(file);
foreach (string line in lines) {
    string[] values = line.Split('\t');
    string symbol = values[0].Trim();
    symbols.Add(symbol);
}
foreach (string uniqueSymbol in symbols) {
    double value = 0;
    foreach (string line in lines) {
        string[] values = line.Split('\t');
        string symbol = values[0].Trim();
        string sProfit = values[7].Trim();
        double fProfit = Convert.ToDouble(sProfit);
        if (symbol == uniqueSymbol) {
            value += fProfit;
        }
    }
    results.Add(new KeyValuePair<string, double>(uniqueSymbol, value));
}
Like several others here, I agree that a Dictionary sounds like what suits your needs best. If using LINQ is an option, here is an approach that should work:
Dictionary<string, double> results = lines
    .Select(line => line.Split('\t'))
    .Select(lineEntries => (
        Symbol: lineEntries[0].Trim(),
        Value: Convert.ToDouble(lineEntries[7].Trim()))) // Some validation should be done here
    .GroupBy(symbolAndValue => symbolAndValue.Symbol)
    .ToDictionary(
        gr => gr.Key,
        gr => gr.Select(symbolAndValue => symbolAndValue.Value).Sum());
I'm creating a tuple (string Symbol, double Value) for each line, then grouping them by Symbol, then creating a dictionary using Symbol as the Key and the sum of values connected to Symbol as the Value for each dictionary entry.
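The "some validation should be done here" comment could be addressed along these lines. This is only a sketch: the class name ProfitSums, the helper ParseOrNull, the choice to silently skip malformed lines, and the use of the invariant culture for parsing are all assumptions of mine, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ProfitSums
{
    // Hypothetical helper: sums column 7 per symbol (column 0),
    // skipping lines that are too short or whose value does not parse.
    public static Dictionary<string, double> SumBySymbol(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('\t'))
            .Where(fields => fields.Length > 7) // need at least 8 columns
            .Select(fields => (
                Symbol: fields[0].Trim(),
                Parsed: ParseOrNull(fields[7])))
            .Where(entry => entry.Parsed.HasValue)
            .GroupBy(entry => entry.Symbol)
            .ToDictionary(
                g => g.Key,
                g => g.Sum(entry => entry.Parsed.Value));
    }

    // Assumption: values use '.' as the decimal separator (invariant culture).
    // Returns null instead of throwing when the text is not a number.
    private static double? ParseOrNull(string text)
    {
        return double.TryParse(
                text.Trim(),
                NumberStyles.Float,
                CultureInfo.InvariantCulture,
                out double value)
            ? value
            : (double?)null;
    }
}
```

With the lines from the question this would be called as ProfitSums.SumBySymbol(lines). Whether bad lines should be skipped, logged, or treated as an error depends on the data, so adjust the two Where filters accordingly.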
A slightly more concise approach would be to merge the two Select calls into one, at the cost of performing line.Split('\t') twice:
    .Select(line => (
        Symbol: line.Split('\t')[0].Trim(),
        Value: Convert.ToDouble(line.Split('\t')[7].Trim())))
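If LINQ is not an option, the single loop asked about in the question can be sketched with a plain Dictionary and TryGetValue. The class and method names below are made up for illustration; the column positions and Convert.ToDouble are taken from the question's code, and as there, no validation is done.

```csharp
using System;
using System.Collections.Generic;

public static class SingleLoopSums
{
    // Hypothetical helper: one pass over the lines, keeping a running total per symbol.
    public static Dictionary<string, double> SumBySymbol(IEnumerable<string> lines)
    {
        var totals = new Dictionary<string, double>();

        foreach (string line in lines)
        {
            string[] values = line.Split('\t');
            string symbol = values[0].Trim();
            double profit = Convert.ToDouble(values[7].Trim());

            // TryGetValue leaves runningTotal at 0 when the symbol is not in the dictionary yet.
            totals.TryGetValue(symbol, out double runningTotal);
            totals[symbol] = runningTotal + profit;
        }

        return totals;
    }
}
```

This reads every line once instead of once per unique symbol, which is where the original nested loops lose time. If the sorted order of the original SortedSet matters, a SortedDictionary<string, double> should work as a drop-in replacement for the Dictionary here.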