c++ stdvector

Copying non-sequential columns from one array into another in C++ and removing duplicates based on one column


I want to copy columns from a std::vector<std::vector<double> > into another std::vector<std::vector<double> > in C++. This question answers that but only deals with the case where all the columns are in a sequence. In my case, the inner std::vector has 8 elements {C1, C2, C3, C4, C5, C6, C7, C8}. The new object needs to contain {C4, C5, C6, C8} and all the rows. Is there a way to do it directly?

After this step, I will remove the duplicate rows and write the result to a file. Also, please suggest which to do first: dropping the "columns" or removing the duplicates.

To put things in perspective: the outer std::vector has ~2 billion elements, and after removing duplicates I will end up with ~50. So a fast, memory-efficient method is highly preferred.


Solution

  • I would use std::transform.

    It could look like this:

    #include <algorithm> // transform
    #include <vector>
    #include <iostream>
    #include <iterator>  // back_inserter
    
    int main() {
        std::vector<std::vector<double>> orig{
            {1,2,3,4,5,6,7,8},
            {11,12,13,14,15,16,17,18},
        };
    
        std::vector<std::vector<double>> result;
        result.reserve(orig.size());
    
        std::transform(orig.begin(), orig.end(), std::back_inserter(result),
            [](auto& iv) -> std::vector<double> {
                return {iv[3], iv[4], iv[5], iv[7]};
            });
    
        // print the result:
        for(auto& inner : result) {
            for(auto val : inner) std::cout << val << ' ';
            std::cout << '\n';
        }
    }
    

    Output:

    4 5 6 8 
    14 15 16 18 
    

    Note: If any inner vector<double> in orig has fewer than 8 elements, the lambda will index it out of bounds, which is undefined behavior. Make sure every row has the required number of elements.
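
    One way to guard against that out-of-bounds access is to use checked indexing. The following is a minimal sketch, not part of the approach above: select_columns is a hypothetical helper name, the column indices are passed in rather than hard-coded, and throwing std::out_of_range on a short row is an assumed policy (skipping such rows would be another option).

    ```cpp
    #include <cstddef>
    #include <stdexcept>  // std::out_of_range (thrown by vector::at)
    #include <utility>    // std::move
    #include <vector>

    // Hypothetical helper: copy the columns listed in `cols` from every row.
    // at() throws std::out_of_range for a row that is too short,
    // instead of invoking undefined behavior like operator[] would.
    std::vector<std::vector<double>> select_columns(
        const std::vector<std::vector<double>>& orig,
        const std::vector<std::size_t>& cols) {
        std::vector<std::vector<double>> result;
        result.reserve(orig.size());
        for (const auto& row : orig) {
            std::vector<double> picked;
            picked.reserve(cols.size());
            for (std::size_t c : cols) picked.push_back(row.at(c));
            result.push_back(std::move(picked));
        }
        return result;
    }
    ```

    Calling select_columns(orig, cols) with cols holding 3, 4, 5 and 7 selects the same columns as the lambda above. The bounds check adds a small per-element cost, so it may be worth validating row sizes once up front instead if the data is trusted.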

    Alternatively, C++20 ranges can build the resulting vector from a transform view:

    #include <iostream>
    #include <ranges>  // views::transform
    #include <vector>
    
    int main() {
        std::vector<std::vector<double>> orig{
            {1, 2, 3, 4, 5, 6, 7, 8},
            {11, 12, 13, 14, 15, 16, 17, 18},
        };
    
        auto trans = [](auto&& iv) -> std::vector<double> {
            return {iv[3], iv[4], iv[5], iv[7]};
        };
    
        auto tview = orig | std::views::transform(trans);
        
        std::vector<std::vector<double>> result(tview.begin(), tview.end());
    
        // print the result:
        for (auto& inner : result) {
            for (auto val : inner) std::cout << val << ' ';
            std::cout << '\n';
        }
    }
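
    On the question of which step to do first: with roughly 2 billion rows collapsing to about 50 unique ones, it may be possible to avoid materializing a full projected copy at all. The sketch below selects the columns and removes duplicates in a single pass, so only the unique rows are ever stored. It rests on several assumptions: two rows count as duplicates when all four selected values compare exactly equal (the question does not spell out the criterion, and the title mentions a single column), every row has at least 8 elements, and no value is NaN (which would break the ordering std::set relies on). Row and unique_selected_rows are illustrative names.

    ```cpp
    #include <array>
    #include <set>
    #include <vector>

    // Fixed-size row of the four selected columns; std::array compares
    // lexicographically, so it can serve directly as a std::set key.
    using Row = std::array<double, 4>;

    // Hypothetical helper: project each row to {C4, C5, C6, C8} and keep
    // only one copy of each distinct projected row.
    std::set<Row> unique_selected_rows(
        const std::vector<std::vector<double>>& orig) {
        std::set<Row> unique;
        for (const auto& iv : orig) {
            // insert() is a no-op when an equal row is already present
            unique.insert(Row{iv[3], iv[4], iv[5], iv[7]});
        }
        return unique;
    }
    ```

    The resulting set is sorted rather than kept in first-seen order. With so few unique rows, its memory footprint stays tiny and each lookup is cheap, so the run time is dominated by the single scan over the input.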