Tags: bash, shell, curl, html-parsing

Send unserialized & unescaped HTML file data to an API with a bash script


I want to create a bash script that takes an HTML file and sends it to several APIs.

I have a test.html file with unserialized HTML data like this:

<h2 id="overview">Overview</h2>
<p>Have the source of truth in your own space at <strong>somewhere</strong></p>
<pre>
<code class="lang-javascript">function go() {
  console.log(&#39;code blocks can be a pain&#39;);
}
go();
</code>
</pre>

I need to send the content of the file somehow to an API, like this:

curl --location --request POST 'https://devo.to/api/articles' \
--header 'api-key: askldjfalefjw02ijef02eifj20' \
--header 'Content-Type: application/json' \
--data-raw '{
  "article": {
    "title": "Blog Article",
    "body_markdown": "@test.html"
  }
}'

The only way I can think of so far is to serialize/escape the HTML file beforehand, read it into a variable as a string (like TEST_HTML=$(cat serialized_test.html)), and then pass it to "body_markdown".

Would it be possible to serialize/escape the HTML in one step inside the bash script or is there maybe a better way?


Solution

  • I'd use jq to build the JSON argument, and let it deal with properly escaping quotes, newlines and such in the included HTML file:

    curl --location --request POST 'https://devo.to/api/articles' \
    --header 'api-key: askldjfalefjw02ijef02eifj20' \
    --header 'Content-Type: application/json' \
    --data-raw "$(jq -n --arg html "$(< test.html)" '{article:{title:"Blog Article",body_markdown:$html}}')"
    

    The jq invocation puts the contents of test.html in a string variable $html, and evaluates to:

        {
          "article": {
            "title": "Blog Article",
            "body_markdown": "<h2 id=\"overview\">Overview</h2>\n<p>Have the source of truth in your own space at <strong>somewhere</strong></p>\n<pre>\n<code class=\"lang-javascript\">function go() {\n  console.log(&#39;code blocks can be a pain&#39;);\n}\ngo();\n</code>\n</pre>"
          }
        }
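As a rough sketch, the jq step can be wrapped in a small function so that the title and file name become parameters. build_payload is a hypothetical helper name, not part of the original answer, and the sketch assumes bash and jq are installed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the JSON payload for one article.
#   $1 = article title, $2 = path to the HTML file
build_payload() {
  local title=$1 html_file=$2
  # --arg binds each value as a JSON string, so jq escapes quotes,
  # backslashes and newlines in the HTML for us.
  jq -n --arg title "$title" --arg html "$(< "$html_file")" \
    '{article: {title: $title, body_markdown: $html}}'
}
```

Calling build_payload 'Blog Article' test.html should print the JSON shown above, and piping that through jq -r '.article.body_markdown' should give back the original HTML, which is a quick way to check that nothing was mangled.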
    

    $(< filename) is a bash substitution that evaluates to the contents of the given file. It's preferred over $(cat filename) in bash because it doesn't spawn another process. Like any command substitution, it strips trailing newlines from the result.
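For larger files, passing the HTML as a command-line argument can run into the operating system's argument-length limit. One possible variation, sketched below, lets jq read the file itself with --rawfile (available in jq 1.6 and later) and streams the resulting JSON to curl on stdin. The function names build_payload_rawfile and post_article are hypothetical, and the URL and API key are supplied by the caller.

```shell
#!/usr/bin/env bash

# Hypothetical helper: like the jq call above, but jq reads the file itself.
# Unlike "$(< file)", --rawfile keeps the file's trailing newline.
#   $1 = article title, $2 = path to the HTML file
build_payload_rawfile() {
  local title=$1 html_file=$2
  jq -n --arg title "$title" --rawfile html "$html_file" \
    '{article: {title: $title, body_markdown: $html}}'
}

# Hypothetical wrapper: send the payload to one endpoint.
#   $1 = URL, $2 = API key, $3 = article title, $4 = path to the HTML file
post_article() {
  local url=$1 api_key=$2 title=$3 html_file=$4
  # "@-" tells curl to read the request body from stdin, so the JSON
  # never has to fit on the command line.
  build_payload_rawfile "$title" "$html_file" |
    curl --location --request POST "$url" \
      --header "api-key: $api_key" \
      --header 'Content-Type: application/json' \
      --data-binary @-
}
```

A call would then look like post_article 'https://devo.to/api/articles' "$API_KEY" 'Blog Article' test.html, where API_KEY is assumed to be set in the environment. Other APIs will likely expect a different JSON shape, so each would need its own jq filter.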