I would like to convert a large number of HTML files to text files. I have downloaded the inscript command-line tool from GitHub, but I am struggling to apply it to all the HTML files, which are located in subdirectories, and to save each result as a text file in the same directory as its HTML file.
I have tried:
for f in ./ do inscript.py -o test.txt done
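The following should work in zsh (the (N) glob qualifier, which makes a pattern with no matches expand to nothing, is zsh-specific):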
for d in ./**/*/; do
    pushd "$d"
    for f in *.html(N); do
        out=test-${f%.html}.txt
        inscript.py -o "$out" "$f"
    done
    popd
done
The pattern ./**/*/ recursively matches every subdirectory below the current directory (but not the current directory itself, so HTML files sitting directly in it are skipped). pushd changes to a directory but remembers the current working directory. inscript.py does its thing, then popd returns to the original working directory so that the next value of d continues to be a valid relative directory.
Changing the working directory isn't strictly necessary; it just simplifies the file paths involved, because you focus on the name of the file and ignore the rest of the path.
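For comparison, here is a rough sketch of the variant that never changes directory, which shows the extra path handling you take on in exchange. It uses find and plain sh parameter expansion rather than zsh globbing, so it is a swapped-in technique, not the loop above. The function name html_to_txt_tree is made up for illustration, the inscript.py invocation is copied from the loop above, and the argument is assumed to be a directory.

```shell
# Hypothetical helper: convert every .html file under a root directory
# (default: the current directory) without changing the working directory.
html_to_txt_tree() {
    root=${1:-.}
    # find hands batches of matching paths to the inner sh, which loops over them.
    find "$root" -type f -name '*.html' -exec sh -c '
        for f do
            dir=${f%/*}      # directory part of the path
            base=${f##*/}    # file name without the directory
            out="$dir/test-${base%.html}.txt"
            inscript.py -o "$out" "$f"
        done
    ' sh {} +
}
```

Calling html_to_txt_tree with no argument starts at the current directory. Unlike the pushd loop, it also picks up HTML files that sit directly in the starting directory.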