
Generating a table of random numbers from 1 to 100,000,000 without duplicates in a short amount of time


For a project, I need to create a table that stores 100,000,000 numbers in random order without any duplicates, which is then saved as a .csv file.

void Anonym_Option::GenerateTable(){
    ui->progressBar->setValue(0);
    QList<int> l(100000000);
    std::iota(l.begin(), l.end(), 0);

    QVector<QList<int>::iterator> v(l.size());
    std::iota(v.begin(), v.end(), l.begin());

    ui->progressBar->setValue(10);

    unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
    auto rng = std::default_random_engine {seed};

    QCoreApplication::processEvents();
    std::shuffle(v.begin(), v.end(), rng);

    QString SortString;
    QString CombinedString;

    ui->progressBar->setValue(30);

    for (auto z: v){
        QCoreApplication::processEvents();
        SortString += QString::number(*z) + "," + "\n";
    }

    ui->progressBar->setValue(70);

    CombinedString = SortString.replace(QString("\n;"), QString("\n"));

    QString Table = "Generated ID; \n" + CombinedString;

    ui->progressBar->setValue(90);

    QString Path = QDir::currentPath();
    QFile file(Path + "/Table.csv");
    if (!file.open(QFile::WriteOnly | QFile::Text)){
        QMessageBox::warning(this, "ACHTUNG","ACHTUNG! Der Anonymisierungs-Table kann nicht generiert werden! Bitte Kontaktieren sie den Support.");
        return;
    }
    else{
        QTextStream stream(&file);
        QCoreApplication::processEvents();
        stream << Table;
        ui->progressBar->setValue(100);
        hide();
        anonymisierung = new Anonymisierung();
        QTimer::singleShot(1500,anonymisierung,SLOT(show()));
    }
}

The purpose of the table is to replace numbers in the customer file so that it is anonymised. The problem with my code is that with 10,000,000 numbers it takes around 8 minutes to finish, but with 100,000,000 it seems to take more RAM and time than is practical. I was able to narrow the problem down to this loop:

    for (auto z: v){
        QCoreApplication::processEvents();
        SortString += QString::number(*z) + "," + "\n";
    }

whose whole purpose is to append a "," and "\n" after each number, so that the values are separated properly and can be used later on. Any ideas how to speed this up?

Note: I use Qt 6 and was hoping for ranges support, but it is not implemented yet, so that is not an option I can use.


Solution

  • If you are storing the keys anyway, then a shuffle is just as fast. I tried to keep things as similar as possible, but using a std::linear_congruential_engine for the shuffle doesn't make sense, and it took ~4 times longer.

    I include both methods so you can comment one out and test them yourself. While not super-scientific, my shell prompt shows the execution time, and both methods show 10 s. I'm running in WSL with the files stored on the Windows side.

    My compiler flags: clang++ -Wall -Wextra -O2 -std=c++17

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <numeric>
    #include <random>
    #include <vector>
    
    int main() {
      constexpr std::uint32_t upper = 100'000'000;
      std::vector<std::uint32_t> rando(upper);
    
      std::iota(rando.begin(), rando.end(), 1);
      std::shuffle(rando.begin(), rando.end(),
                   std::mt19937(std::random_device{}()));
    
      for (std::uint32_t i = 345; i < 355; ++i) {
        std::cout << rando[i] << ' ';
      }
      std::cout << '\n';
    }
    
    // #include <cstdint>
    // #include <iostream>
    // #include <vector>
    
    // int main()
    // {
    //     constexpr std::uint32_t upper = 100000000;
    //     std::vector<std::uint32_t> rando;
    //     rando.reserve(upper);
    
    //     // Full-period LCG modulo 2^32: every 32-bit value appears once per cycle.
    //     std::uint32_t I = 128;
    //     for (std::uint32_t i = 0; i < upper;){
    //         I = 1664525 * I + 1013904223;
    //         // Keep only values in 1..upper; 0 is outside the requested range.
    //         if (I >= 1 && I <= upper){
    //             rando.push_back(I);
    //             ++i;
    //         }
    //     }
    
    //     for (int i = 345; i < 355; ++i) {
    //         std::cout << rando[i] << ' ';
    //     }
    //     std::cout << '\n';
    // }
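
The question localises the slowdown in the loop that builds one huge QString and calls processEvents() for every number. A minimal sketch of writing the shuffled values to the file in chunks instead, using only the standard library, might look like the following. The helper name write_shuffled_csv, its parameters, the 1 MiB chunk size, and the one-value-per-line layout (without the trailing comma) are assumptions made for illustration; they are not part of the original code.

```cpp
#include <algorithm>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <numeric>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper (not from the question or the answer): shuffles the
// values 1..upper and writes them to `path`, one per line, below a header.
// Returns false if the file cannot be opened or a write fails.
bool write_shuffled_csv(const std::string& path, std::uint32_t upper,
                        std::uint32_t seed) {
  std::vector<std::uint32_t> ids(upper);
  std::iota(ids.begin(), ids.end(), 1u);
  std::mt19937 rng(seed);
  std::shuffle(ids.begin(), ids.end(), rng);

  std::ofstream out(path, std::ios::binary);
  if (!out) {
    return false;
  }

  // Assumed chunk size: flush roughly every 1 MiB instead of growing
  // a single string that holds the whole table.
  constexpr std::size_t flush_at = std::size_t{1} << 20;
  std::string chunk;
  chunk.reserve(flush_at + 16);
  chunk += "Generated ID\n";

  char digits[16];  // a 32-bit value needs at most 10 decimal digits
  for (std::uint32_t id : ids) {
    const auto res = std::to_chars(digits, digits + sizeof digits, id);
    chunk.append(digits, static_cast<std::size_t>(res.ptr - digits));
    chunk += '\n';
    if (chunk.size() >= flush_at) {
      out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
      chunk.clear();
    }
  }
  // Write whatever is left in the last, partially filled chunk.
  out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
  return static_cast<bool>(out);
}
```

Calling write_shuffled_csv("Table.csv", 100'000'000, std::random_device{}()) would produce the full table. In a Qt application, the progress bar could be updated once per flushed chunk rather than once per number, which should keep the event-loop overhead small.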