I have a source code management system where I write annotated source and use sed
to convert that into pure source, markdown documentation, and test cases.
I would like to have an annotation that allows me to write:
other text...
PR(
eval (internal);e expr env env -> val
PR)
other text...
and end up having the string inside PR tags converted into a table:
other text...
<table>
  <thead>
    <tr>
      <th colspan="2">eval (internal)</th>
    </tr>
  </thead>
  <tr>
    <td>e</td>
    <td>a Lisp expression</td>
  </tr>
  <tr>
    <td>env</td>
    <td>an Environment</td>
  </tr>
  <tr>
    <td><i>Returns:</i></td>
    <td>val</td>
  </tr>
</table>
other text...
editor's note (@Fravadona): the indentation doesn't matter in the expected output.
The basic algorithm: the text before the ; becomes the table header, and the rest of the line is read two tokens at a time. If the first token of a pair is a name, it goes into the first td as is; if it is "->", the text "Returns:" goes there instead. The second token of the pair is a key into a dictionary, and the value found there fills the second td. The dictionary goes something like this:
env -> an Environment
val -> a Lisp value
vals -> some Lisp values
lvals -> a Lisp list of Lisp values
num -> a number
nums -> some numbers
...
The dictionary lookup is done by keeping key/value pairs of C strings and traversing them with strcmp().
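To make the pairing concrete, here is a minimal shell sketch of the split I have in mind. The function name split_pr_line is made up for this illustration, and the dictionary lookup is left out on purpose:

```shell
# Illustration only: split one PR line into a header and token pairs.
# split_pr_line is a hypothetical helper name, not part of the real tool chain.
split_pr_line() {
    line=$1
    header=${line%%;*}      # text before the first ";"
    rest=${line#*;}         # text after the first ";"
    printf 'header: %s\n' "$header"
    set -f                  # no globbing while the shell splits $rest
    set -- $rest            # deliberately unquoted: split on whitespace
    set +f
    while [ "$#" -ge 2 ]; do
        printf '%s | %s\n' "$1" "$2"
        shift 2
    done
}

split_pr_line 'eval (internal);e expr env env -> val'
# prints:
# header: eval (internal)
# e | expr
# env | env
# -> | val
```

Each printed pair corresponds to one table row: the left part goes into the first td (or becomes "Returns:" for "->"), and the right part is the dictionary key.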
I may have reached the end of my sed skills here; I don't even know if it is possible. I have written the conversion program myself in C, but I don't know how to plug it in with sed.
I'm experimenting with the e command of sed. This works:
cat constcl.md | sed 's/\(eval (.*);.*\)/printf "%s" "$(echo "\1" | tr e i)"/e' |less
But if I try to simplify the regex or substitute my own command, it all goes bonkers.
I have to say, sed isn't ideal for this task. An Awk/Python/Perl/etc. solution is probably required.
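One likely reason the e experiments are fragile: in GNU sed, the e flag of the s command runs the whole pattern space as a shell command once the substitution is done, not just the replacement text. A tiny sketch of that behaviour, assuming GNU sed (the e flag is a GNU extension) and a made-up input line:

```shell
# Assumes GNU sed: the "e" flag of s/// is a GNU extension.
# After the substitution, the whole pattern space ("echo world") is
# executed by the shell and replaced with the output of that command.
printf 'echo hello\n' | sed 's/hello/world/e'
# prints: world
```

So any text on the line that the regex leaves unmatched, and any quotes, $ or backticks in the captured text, get parsed by the shell as well.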
Let's assume that your dictionary is stored in a dict.txt file with this format:
env -> an Environment
val -> a Lisp value
vals -> some Lisp values
lvals -> a Lisp list of Lisp values
num -> a number
nums -> some numbers
expr -> an Expression
And that your "template" is in the following template.txt file:
other text...
PR(
eval (internal);e expr env env -> val
PR)
other text...
Then here's how you could expand the PR blocks using Awk.
The main idea is to load the key/values from dict.txt first, and then process template.txt to generate the HTML tables. But don't forget to escape your strings for HTML text! I added a function for it.
awk '
    # remove the potential CR characters in the input line
    { gsub(/\r/, ""); }
    # load the key/value pairs from dict.txt
    # NOTE: NR is equal to FNR only while processing the first file
    NR == FNR {
        if (match($0, /[[:space:]]*->[[:space:]]*/))
            dict[substr($0, 1, RSTART-1)] = substr($0, RSTART+RLENGTH);
        next;
    }
    # expand the PR blocks as HTML tables in the remaining file(s)
    $1 == "PR(" { inside_pr_block = 1; next; }
    $1 == "PR)" { inside_pr_block = 0; next; }
    inside_pr_block {
        if (match($0, /;/)) {
            printf "<table>";
            th = substr($0, 1, RSTART-1);
            # colspan belongs on the th cell, not on the tr row
            printf "<thead><tr><th colspan=\"2\">%s</th></tr></thead>", \
                html_textify(th);
            $0 = substr($0, RSTART+RLENGTH);
            for (i = 1; i <= NF; i += 2) {
                td1 = ($i == "->" ? "Returns:" : $i);
                td2 = dict[$(i+1)];
                printf "<tr><td>%s</td><td>%s</td></tr>", \
                    html_textify(td1), html_textify(td2);
            }
            print "</table>";
        }
        next;
    }
    # output non PR lines
    { print; }
    # minimalist function that encodes a string as HTML text;
    # "\\&" yields a literal "&" (a bare "&" would mean the matched text)
    function html_textify(str) {
        gsub(/&/, "\\&amp;", str);
        gsub(/</, "\\&lt;", str);
        gsub(/>/, "\\&gt;", str);
        return str;
    }
' dict.txt template.txt
With the given input files, Awk outputs (the indentation is added by me):
other text...
<table>
  <thead>
    <tr>
      <th colspan="2">eval (internal)</th>
    </tr>
  </thead>
  <tr>
    <td>e</td>
    <td>an Expression</td>
  </tr>
  <tr>
    <td>env</td>
    <td>an Environment</td>
  </tr>
  <tr>
    <td>Returns:</td>
    <td>a Lisp value</td>
  </tr>
</table>
other text...
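As a quick check of the escaping step on its own, here is a standalone sketch. The wrapper name escape_html and the sample string are made up for illustration. The & substitution runs first so that the entities produced for < and > are not escaped a second time, and since a bare & in an Awk replacement string stands for the matched text, a literal one is written as \\&.

```shell
# Hypothetical wrapper: HTML-escape each line read from standard input.
escape_html() {
    awk '
        function html_textify(str) {
            gsub(/&/, "\\&amp;", str);   # must come first
            gsub(/</, "\\&lt;", str);
            gsub(/>/, "\\&gt;", str);
            return str;
        }
        { print html_textify($0); }
    '
}

printf '%s\n' 'car & cdr -> <list>' | escape_html
# prints: car &amp; cdr -&gt; &lt;list&gt;
```

Without this step, a header or dictionary value containing < or & would be interpreted as markup once the table lands in the generated documentation.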