How can I access un-rendered (markdown) content in Jekyll with liquid tags?


From reading the documentation on Jekyll's template data, one might think that the way to access un-rendered content would be page.content; but as far as I can tell, this provides the content of the post as already rendered by the markdown parser.

I need a solution that accesses the raw (original markdown) content directly, rather than simply trying to convert the html back to markdown.

Background on use case

My use case is the following: I use the pandoc plugin to render markdown for my Jekyll site, using the 'mathjax' option to get pretty equations. However, MathJax requires JavaScript, so the equations do not display in the RSS feed, which I generate by looping over site.posts and emitting each post.content like so:

 {% for post in site.posts %}
 <entry>
   <title>{{ post.title }}</title>
   <link href="{{ site.production_url }}{{ post.url }}"/>
   <updated>{{ post.date | date_to_xmlschema }}</updated>
   <id>{{ site.production_url }}{{ post.id }}</id>
   <content type="html">{{ post.content | xml_escape }}</content>
 </entry>
 {% endfor %}

As the xml_escape filter implies, post.content here is HTML. If I could get the raw content (imagine post.contentraw or similar existed), then I could easily add a filter that would use pandoc with the "webtex" option to generate images for equations when generating the RSS feed, e.g.:

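require 'pandoc-ruby'

# Liquid filter that runs its input through pandoc with the "webtex" option,
# so equations come out as images rather than MathJax markup.
module TextFilter
  def webtex(input)
    PandocRuby.new(input, "webtex").to_html
  end
end

# Make the filter available in templates as {{ ... | webtex }}.
Liquid::Template.register_filter(TextFilter)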

But as I get content with the equations already rendered in html+mathjax instead of the raw markdown, I'm stuck. Converting back to markdown doesn't help, since it doesn't convert the mathjax (simply garbles it).

Any suggestions? Surely there's a way to call the raw markdown instead?


Solution

  • Here's the trouble I think you'll have, based on these two source files: https://github.com/mojombo/jekyll/blob/master/lib/jekyll/convertible.rb https://github.com/mojombo/jekyll/blob/master/lib/jekyll/site.rb

    From my reading, for a given post/page self.content is replaced by the result of running self.content through Markdown and Liquid, at line 79 in convertible.rb:

    self.content = Liquid::Template.parse(self.content).render(payload, info)
    

    Posts are rendered before pages, as seen at lines 37-44 and 197-211 in site.rb:

    def process
      self.reset
      self.read
      self.generate
      self.render
      self.cleanup
      self.write
    end
    
    ... ...
    
    def render
      payload = site_payload
      self.posts.each do |post|
        post.render(self.layouts, payload)
      end
    
      self.pages.each do |page|
        page.render(self.layouts, payload)
      end
    
      self.categories.values.map { |ps| ps.sort! { |a, b| b <=> a } }
      self.tags.values.map { |ps| ps.sort! { |a, b| b <=> a } }
    rescue Errno::ENOENT => e
      # ignore missing layout dir
    end
    

    By the time the page that loops over the posts is rendered, each post's self.content has already been replaced with HTML, so it isn't a case of stopping it from rendering. It's already done.

    However, Generators (https://github.com/mojombo/jekyll/wiki/Plugins) run before the render stage. As far as I can tell from reading the source, you should be able to write a fairly trivial generator that copies self.content into another attribute (such as self.raw_content), which you can later access as raw Markdown in your templates with {{ page.raw_content }}.
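
A minimal sketch of such a generator, under a few assumptions that are not from the Jekyll source quoted above: the module name RawContent, the class name RawContentGenerator and the key 'raw_content' are made up for illustration. The copy is stored in each item's front-matter data hash rather than in a new attribute, on the assumption that entries in data are what get exposed to Liquid. The check for docs is there because newer Jekyll versions appear to wrap site.posts in a collection object. This has not been tested against any particular Jekyll version.

```ruby
# Hypothetical plugin file, e.g. _plugins/raw_content.rb

module RawContent
  # Copy each item's still-unrendered source text into its data hash,
  # so it survives after render overwrites item.content with HTML.
  def self.stash(items)
    items.each do |item|
      item.data['raw_content'] = item.content.to_s.dup
    end
  end
end

# Only define the generator when running inside Jekyll, so the helper
# above can also be loaded and exercised on its own.
if defined?(Jekyll::Generator)
  class RawContentGenerator < Jekyll::Generator
    def generate(site)
      # Assumption: older versions expose site.posts as a plain Array,
      # newer ones as a collection whose documents live in #docs.
      posts = site.posts.respond_to?(:docs) ? site.posts.docs : site.posts
      RawContent.stash(posts)
      RawContent.stash(site.pages)
    end
  end
end
```

If this works as hoped, the feed entry could use {{ post.raw_content | webtex | xml_escape }} in place of {{ post.content | xml_escape }}. Note that the stashed text is the source before Liquid processing as well, so any Liquid tags inside a post would reach the filter unexpanded.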