
Merge numbered files with variable names


I have a number of numbered files, e.g. alpha_01.txt, alpha_02.txt, beta_01.txt, beta_02.txt. I want to run a single-line bash command that merges them by their variable prefix (e.g. alpha, beta, ...), producing alpha.txt and beta.txt.

I can do so for a single prefix:

cat alpha_*.txt(n) >>alpha.txt 2>/dev/null

But I don't know the names before _*.txt in advance. Can I use a wildcard here? Or what would be the best solution?


Solution

  • To concatenate all the alpha_xxx.txt files into alpha.txt, the arguments of that cat call must not include any beta_xxx.txt files, so each prefix needs its own cat invocation.

    As @tripleee said, the easiest way would be to use a for loop where you list all the prefixes:

    for name in alpha beta
    do
        cat "$name"_*.txt > "$name".txt
    done
    

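    Now, if you don't know the prefixes in advance, you can always work something out with awk. The script below maps each input file named <prefix>_<digits>.<extension> to <prefix>.<extension> and writes every line to the mapped output file: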

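    awk '
        BEGIN {
            # ARGV[1] .. ARGV[ARGC-1] hold the file names given on the command line
            for (i = 1; i < ARGC; i++) {
                filename = ARGV[i]
    
                # only handle names of the form <prefix>_<digits>.<extension>
                if (filename !~ /^(.*\/)?[^\/]+_[0-9]+\.[^\/.]+$/)
                    continue
    
                # prefix: everything up to, but not including, the last underscore
                match(filename, /^(.*\/)?[^\/]+_/)
                prefix = substr(filename, RSTART, RLENGTH-1)
    
                # suffix: the extension, including the leading dot
                match(filename, /\.[^.\/]+$/)
                suffix = substr(filename, RSTART, RLENGTH)
    
                # e.g. ./alpha_01.txt -> ./alpha.txt
                outfile[filename] = prefix suffix
            }
        }
        # send each line of a mapped input file to its merged output file
        FILENAME in outfile { print $0 > outfile[FILENAME] }
    ' ./*.txt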
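
If you would rather stay in bash, the same idea (derive the prefix from each file name, then append to the matching output) can be sketched as a small function. This is only a sketch under a few assumptions: the function name merge_numbered is made up for illustration, the inputs are named <prefix>_<digits>.txt, the numbers are zero-padded so that glob order equals numeric order, and bash 4 or later is available for associative arrays.

```shell
#!/usr/bin/env bash
# merge_numbered [DIR]  (hypothetical helper, not part of the original answer)
# For every DIR/<prefix>_<digits>.txt, append its contents to DIR/<prefix>.txt.
merge_numbered() {
    local dir=${1:-.} file prefix out
    local re='^(.+)_[0-9]+\.txt$'   # captures everything before the last _<digits>
    local -A seen=()                # prefixes whose output was already reset (bash 4+)

    # The glob expands in sorted order, so alpha_01.txt comes before alpha_02.txt.
    for file in "$dir"/*_*.txt; do
        # Skip an unmatched glob and names that do not end in _<digits>.txt.
        [[ -f $file && ${file##*/} =~ $re ]] || continue
        prefix=${BASH_REMATCH[1]}
        out=$dir/$prefix.txt

        if [[ -z ${seen[$prefix]:-} ]]; then
            : > "$out"              # first file of this group: start from an empty output
            seen[$prefix]=1
        fi
        cat -- "$file" >> "$out"
    done
}
```

Calling merge_numbered with no argument works on the current directory; merge_numbered some/dir works on that directory instead. Unlike the plain cat with >>, the output is emptied once per run, so running it twice does not duplicate the content.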