So what I want to do is essentially do an inner join of two tables in c++. However, the only guide I could is this one from geeksforgeeks: https://www.geeksforgeeks.org/joining-tables-using-multimaps/
An example below is as follows:
// First Column Name: Numbers
// Second Column Name: Alphabets
vector<vector<string>> v1 = { vector<string> { "Numbers", "Alphabets" }
vector<string> { "1", "a" },
vector<string> { "2", "b" },
vector<string> { "3", "a" } }
// First Column Name: Fruits
// Second Column Name: Alphabets
vector<vector<string>> v2 = { vector<string> { "Fruits", "Alphabets" }
vector<string> { "apple", "a" },
vector<string> { "pear", "a" },
vector<string> { "peach", "c" },
vector<string> { "orange", "b" } }
So if I want to do a inner join via alphabets, (like the solution in the link, I can pass two arguments which is the index of the columns I want to compare) and get the following output:
vector<vector<string>> result = { vector<string> { "Numbers", "Alphabets", "Fruits" }
vector<string> { "1", "a", "apple" },
vector<string> { "1", "a", "pear" },
vector<string> { "3", "a", "apple" }
vector<string> { "3", "a", "pear" },
vector<string> { "2", "b", "orange" }, }
As you can see, I currently also need the column names in the final table, in the sense that I need to know that column 1 refers to Numbers, column 2 is Alphabets, and column 3 is Fruits, as I would need to extract out the data in a specific column later in my code, kind of like a SQL "get Numbers from result" which I can easily do with a for loop.
How do I do so? I've been stuck on this since I'm sure when building the new table, I would need to omit the first row of headers for both table when doing the comparison etc, which makes things complicated if there a strings of the same value.
EDIT: My data does not necessarily have to be in that format of vector<vector>. How I do it is I have an API that gives me a vector once I pass it my column Name, so I can easily store this in an unordered_map as well. I modeled into a "table-like" structure after that solution.
My original code looked something like this:
joinTwoTable(vector<vector<string>>& v1, int columnIndex1,
vector<vector<string>>& v2, int columnIndex2)
{
vector<vector<string>> result;
vector<string> header;
// Add in the header_row
for (string s : v1[0]) {
current.push_back(s);
}
for (string s2 : v2[0]) {
// Skip the common column again
if (s2 == v2[0][columnIndex2]) {
continue;
}
current.push_back(s2);
}
for (vector<string> v : v1) {
for (vector<string> v2 : v2) {
if (v[columnIndex1] == v2[columnIndex2]) {
vector<string> current;
for (string s : v) {
current.push_back(s);
}
for (string s2 : v2) {
// Skip the common column again
if (s2 == v2[columnIndex2]) {
continue;
}
current.push_back(s2);
}
result.push_back(current);
}
}
}
}
But the issue is also because here I can only inner join on one column, but what if I need to inner join on every common column names?
Thank you in advance!
In your code, you are actually computing for the column header names two times. First you are pushing them manually, then in the main loop you have for (vector<string> v : v1)
which actually will start from the first row, which is again those column names.
Instead you can use a normal for loop and start from the second row.
Another problem is here
for (string s2 : v2) {
// Skip the common column again
if (s2 == v2[columnIndex2]) {
continue;
}
current.push_back(s2);
}
Here you are assuming that no other row item has the same string value as the row item of common column, which is not a good assumption at all. This worked in case of headings as column names have to be unique in SQL.
Here is my try
joinTwoTable(vector<vector<string>>& v1, int columnIndex1, vector<vector<string>>& v2, int columnIndex2)
{
vector<vector<string>> result;
vector<string> columns;
string commonColumn = v1[0][columnIndex1];
for(string s: v1[0]) {
columns.push_back(s);
}
for(string s: v2[0]) {
if(s == commonColumn) continue;
columns.push_back(s);
}
// push headers
result.push_back(columns);
for(int i=1; i!=v1.size(); i++) {
string joiner = v1[i][columnIndex1];
for(int j=1; j!=v2.size(); j++) {
if(v2[j][columnIndex2] == joiner) {
// push a new joined row
vector<string> temp;
for(string s1: v1[i]) {
temp.push_back(s1);
}
for(auto k=0; k!=v2[j].size(); k++) {
if(k==columnIndex2) continue;
temp.push_back(v2[j][k]);
}
result.push_back(temp);
}
}
}
return result;
}