Editing header by adding "pipe" in fasta file


I want to edit the headers in my FASTA file by adding pipes, but I have been unable to do so. The header looks like this:

KX035646.1 Name:NADH domain ATGCGGGGCTGC..

I want it to look like this:

sp|KX035646.1| Name:NADH domain

The accession number is different for every sequence. Can you please help me do this? Thanks.


Solution

  • You can try a simple sed one-liner (note that the \+ quantifier is a GNU sed extension):

    cat test.fasta
    >KX035646.1 Name:NADH domain ATGCGGGGCTGC..
    ACGT
    CTTT
    >KX035646.2 Name:NADH domain ATGCGGGGCTGC..43214
    GCAT
    
    sed 's/^>\([a-zA-Z0-9.]\+\)\(.*\)/>sp|\1|\2/' test.fasta
    >sp|KX035646.1| Name:NADH domain ATGCGGGGCTGC..
    ACGT
    CTTT
    >sp|KX035646.2| Name:NADH domain ATGCGGGGCTGC..43214
    GCAT
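
If your sed does not accept \+ (it is a GNU extension to basic regular expressions), a variant using extended regular expressions might be easier to port. The following is only a sketch under a few assumptions: the accession is taken to be everything between ">" and the first whitespace, the sample file contents and the output name test.renamed.fasta are made up for illustration, and your sed supports -E (both GNU and BSD sed do).

```shell
# Hypothetical sample input, modelled on the headers in the question.
printf '>KX035646.1 Name:NADH domain\nACGT\nCTTT\n>KX035646.2 Name:NADH domain\nGCAT\n' > test.fasta

# -E switches to extended regular expressions, so + and ( ) need no backslashes.
# [^[:space:]]+ captures the accession (assumed: everything up to the first whitespace).
# Only lines starting with ">" match, so sequence lines pass through unchanged.
sed -E 's/^>([^[:space:]]+)/>sp|\1|/' test.fasta > test.renamed.fasta

# Show the rewritten file.
cat test.renamed.fasta
```

Writing to a new file rather than editing in place keeps the original intact, so you can compare the two before replacing anything.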