javascript, webpack, webpack-html-loader, posthtml

Parse an HTML template as you require it in JavaScript


I have spent hours on a basic webpack configuration and still cannot get it to work. My aim is to have an HTML template parsed when it is imported into a JavaScript file. This looks like a common use case, so there must be something wrong either in my webpack configuration or in my understanding.

I looked at configurations for html-loader, html-webpack-plugin, posthtml and pug, and read the documentation for each, but none of them worked.

According to the PostHTML README:

PostHTML is a tool for transforming HTML/XML with JS plugins. PostHTML itself is very small. It includes only a HTML parser, a HTML node tree API and a node tree stringifier.

Since it looked the most promising, here is my attempt with posthtml:

   rules: [
      {
        test: /\.html$/,
        use: [
          {
            loader: "html-loader",
            options: {
              minimize: true,
              interpolation: false
            }
          },
          {
            loader: "posthtml-loader"
          }
        ]
      }
    ]

It doesn't raise any error, but it looks like posthtml-loader is being ignored entirely: when I execute import template from 'src/test.html' I get the template as a string (which is what html-loader does on its own).

In my understanding, loaders are supposed to compile/transform files of different formats and make them available to JavaScript. Since HTML is the most common file type in a front-end project I assumed this would be easy, but I can't find anything on the internet related to this question.

What I expect is to have a DOM tree object or, anyway, something that can be used by JavaScript.

Is anyone able to help me?

EDIT: My question is about getting a webpack configuration up and working. I know many solutions for parsing HTML strings, but they're not applicable here.


Solution

  • To me it seems like posthtml-loader is primarily a tool that helps to "prepare" your HTML during the build. Its parser options let you hook into the string -> PostHTML AST step, and its plugin options let you modify the tree. It then stringifies the tree back to HTML, so what the next loader receives is still a string.

    I couldn't find an option in the loader to return the intermediate tree format.
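
    To illustrate why the import in the question still yields a string, here is a rough sketch of how the two loaders chain. fakePosthtmlLoader, fakeHtmlLoader and runLoaders are hypothetical stand-ins written for this example, not the real loaders or webpack internals. They only model two things: each loader receives a string and returns a string, and webpack applies the `use` array from the last entry to the first.

    ```javascript
    // Hypothetical stand-in for posthtml-loader: HTML string in, HTML string out.
    // (The real loader parses to a tree, runs plugins, then stringifies again.)
    function fakePosthtmlLoader(source) {
      const tree = { raw: source }; // pretend "parse" step
      tree.raw = tree.raw.trim();   // pretend "plugin" step
      return tree.raw;              // pretend "stringify" step
    }

    // Hypothetical stand-in for html-loader: wraps the HTML in a JS module
    // that exports it as a string.
    function fakeHtmlLoader(source) {
      return `module.exports = ${JSON.stringify(source)};`;
    }

    // webpack runs the loaders in `use` from the last entry to the first.
    function runLoaders(use, source) {
      return use.reduceRight((current, loader) => loader(current), source);
    }

    const moduleSource = runLoaders(
      [fakeHtmlLoader, fakePosthtmlLoader],
      "  <p>Hello</p>\n"
    );

    // Evaluate the generated module roughly the way a bundler runtime would.
    const mod = { exports: {} };
    new Function("module", moduleSource)(mod);

    console.log(typeof mod.exports); // the export is a plain string
    ```

    So posthtml-loader is not being ignored in that setup; its tree simply never leaves the loader.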


    You could write a small custom loader to parse HTML strings to DOM objects:

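    // custom-loader.js
    module.exports = function (content) {
      // JSON.stringify turns the HTML into a valid JS string literal, so quotes
      // and newlines in the template cannot break the generated module.
      return `module.exports = (function () {
        const parser = new DOMParser();
        const doc = parser.parseFromString(${JSON.stringify(content)}, "text/html");
        return doc.body.children;
      }());`;
    };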
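
    A quick way to sanity-check such a loader outside webpack is to call it as a plain function and evaluate the code it returns. The sketch below assumes the template is embedded with JSON.stringify so that quotes and newlines survive. StubDOMParser is a hypothetical stand-in for the browser's DOMParser and exists only so the check can run in Node.

    ```javascript
    // Same idea as custom-loader.js, with the HTML embedded via JSON.stringify.
    function customLoader(content) {
      return `module.exports = (function () {
        const parser = new DOMParser();
        const doc = parser.parseFromString(${JSON.stringify(content)}, "text/html");
        return doc.body.children;
      }());`;
    }

    // Hypothetical stub standing in for the browser's DOMParser. It just echoes
    // its arguments back so we can see what the generated code passed in.
    class StubDOMParser {
      parseFromString(html, type) {
        return { body: { children: { html, type } } };
      }
    }

    // A template containing double quotes and a newline, which would break
    // naive string interpolation.
    const html = '<p class="greeting">Hello\nworld</p>';
    const generated = customLoader(html);

    // Run the generated module source with the stub in place of DOMParser.
    const mod = { exports: {} };
    new Function("module", "DOMParser", generated)(mod, StubDOMParser);

    console.log(mod.exports.html === html); // the template reached the parser unchanged
    ```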
    

    Then, in your webpack.config.js, you can tell webpack to pass .html files through this loader:

    // webpack.config.js
    const path = require('path'); // needed for path.resolve below

    module.exports = {
      mode: 'development',
      entry: './main.js',
      output: {
        path: path.resolve(__dirname, 'dist'),
        filename: 'main.bundle.js'
      },
      devtool: "eval-source-map",
      module: {
        rules: [
          {
            // Send every .html import through the custom loader
            test: /\.html$/,
            use: [ './custom-loader' ]
          }
        ]
      }
    };
    

    Now, whenever you type something like const template = require('./template.html'); you'll get an HTMLCollection instance rather than just a string.


    Note that the code generated by this loader depends on DOMParser, which is only available in the browser, and that the parsing itself happens at runtime rather than during the build. You could replace DOMParser with something like jsdom if you want to run in non-browser environments.
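
    One way to keep the parsing step independent of the environment is to pass the parser in rather than relying on the global. This is only a sketch: parseTemplate and StubDOMParser are hypothetical names made up for this example, and the jsdom lines are left as a comment so that the snippet has no dependencies.

    ```javascript
    // Parse with whatever DOMParser-like constructor is supplied. In a browser the
    // default is the global DOMParser; elsewhere the caller has to provide one.
    function parseTemplate(html, ParserCtor = globalThis.DOMParser) {
      if (typeof ParserCtor !== "function") {
        throw new Error("No DOMParser available; pass one in explicitly");
      }
      const doc = new ParserCtor().parseFromString(html, "text/html");
      return doc.body.children;
    }

    // With jsdom installed, a parser could be supplied roughly like this:
    //   const { JSDOM } = require("jsdom");
    //   const children = parseTemplate(html, new JSDOM("").window.DOMParser);

    // Hypothetical stub used here only to exercise the function without a DOM.
    class StubDOMParser {
      parseFromString(html) {
        return { body: { children: [html] } };
      }
    }

    const children = parseTemplate("<p>Hi</p>", StubDOMParser);
    console.log(children.length);
    ```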