I have a Python script called renaming.py that I want to use to generate many Bash scripts (over 500). The Python script looks like so:
#!/usr/bin/python
#Script to make multiple Bash scripts based on a .txt file with names of files
#The .txt file contains names of files, one name per line
#The .txt file must be passed as an argument.
import os
import sys
script_tpl="""#!/bin/bash
#BSUB -J "renaming_{line}"
#BSUB -e /scratch/username/renaming_SNPs/renaming_{line}.err
#BSUB -o /scratch/username/renaming_SNPs/renaming_{line}.out
#BSUB -n 8
#BSUB -R "span[ptile=4]"
#BSUB -q normal
#BSUB -P DBCDOBZAK
#BSUB -W 168:00
cd /scratch/username/renaming_SNPs
awk '{sub(/.*/,$1 "_" $3,$2)} 1' {file}.gen > {file}.renamed.gen
"""
with open(sys.argv[1],'r') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        line = line.strip(".gen")
        script = script_tpl.format(line=line)
        with open('renaming_{}.sh'.format(line), 'w') as output:
            output.write(script)
The .txt file I pass as an argument to this Python script looks like so:
chr10.10.merged.no_unwanted.formatted.gen
chr10.11.merged.no_unwanted.formatted.gen
chr10.12.merged.no_unwanted.formatted.gen
chr10.13.merged.no_unwanted.formatted.gen
chr10.14.merged.no_unwanted.formatted.gen
chr10.15.merged.no_unwanted.formatted.gen
etc
When I run the Python script, I get the following error message:
Traceback (most recent call last):
File "renaming.py", line 33, in <module>
script = script_tpl.format(line=line)
KeyError: 'sub(/'
I am not entirely sure what is happening, but here is what I think:
Something is wrong with line 33, but I cannot tell what. I have used very similar scripts before. On line 33, I am replacing every {line} instance in script_tpl with an entry from the .txt file (this happens 500 times, once for each line of the .txt file).
I am very confused by the KeyError. I am working on a Linux HPC server (from a Mac laptop). The awk command works without problems when I type it directly into the terminal as a Bash command. However, Python seems to get confused when I try to include it in the template string.
Any help would be deeply appreciated.
When you use .format, every { and } in your string is interpreted as part of a replacement field. Since your awk command contains literal braces, you must escape them. To do that, double them, writing {{ and }}:
script_tpl="""#!/bin/bash
#BSUB -J "renaming_{line}"
#BSUB -e /scratch/username/renaming_SNPs/renaming_{line}.err
#BSUB -o /scratch/username/renaming_SNPs/renaming_{line}.out
#BSUB -n 8
#BSUB -R "span[ptile=4]"
#BSUB -q normal
#BSUB -P DBCDOBZAK
#BSUB -W 168:00
cd /scratch/username/renaming_SNPs
awk '{{sub(/.*/,$1 "_" $3,$2)}} 1' {line}.gen > {line}.renamed.gen
"""
Note that the awk line now also uses {line} where the question had {file}. Only line is passed to .format, so {file} would raise a KeyError of its own.
Here are the relevant docs.
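To see the escaping rule in isolation, here is a minimal sketch. The one-line templates and the names template, broken and name are made up for illustration and are not part of the original script. It shows that doubled braces come out as single literal braces, while single braces are parsed as a replacement field and reproduce the KeyError from the question.

```python
# {name} is a replacement field; {{ and }} are literal braces after formatting.
template = "awk '{{sub(/.*/,$1 \"_\" $3,$2)}} 1' {name}.gen > {name}.renamed.gen"

rendered = template.format(name="chr10.10")
print(rendered)

# With single braces, str.format treats the awk block as a replacement field
# and looks up the text before the first "." as a keyword argument.
broken = "awk '{sub(/.*/,$1 \"_\" $3,$2)} 1' {name}.gen"
try:
    broken.format(name="chr10.10")
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]
    print("KeyError:", repr(error_key))
```

The field name is cut at the first "." because str.format treats a dot as attribute access, which is why the error message shows only 'sub(/' rather than the whole awk block.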