Iterate over links in markdown content in Jekyll


I am wondering whether it is possible in Jekyll to iterate over elements of the processed page content with Liquid, in particular to iterate over all the <a> elements on the page and get their content and href attributes.

I'd like to be able to have something like this in a page:

<ul>
{% for link in content.links %}
  <li>do something here with {{link.contents}} and {{link.href}}.</li>
{% endfor %}
</ul>

Is there any native feature or plugin that would allow something like this? It does not have to be compatible with GitHub Pages.


Edit:

I ended up making a modified version of the plugin Christian suggested in their answer that captures reference-style links as well and excludes images.

Jekyll::Hooks.register :site, :pre_render do |site|
    site.collections.each do |collection, files|
        files.docs.each do |file|

            links = []

            # Inline links: [text](url). The lookbehind skips images: ![alt](src).
            inline_regex = /(?<!!)\[([^\]]+)\]\(([^)]+)\)/
            # Reference-style uses: [text][ref] or [text]. The lookahead skips
            # inline links and the reference definitions themselves.
            referenced_regex = /(?<![!\]])\[([^\]]+)\](?:\[([^\]]+)\])?(?![:(\[])/
            # Reference definitions at the start of a line: [ref]: url
            references_regex = /^\[([^\]]+)\]: *(\S+)/

            # scan yields one [text, url] pair per inline link
            file.content.scan(inline_regex).each do |text, url|
                links << { "text" => text, "ref" => text, "url" => url }
            end

            # Map each reference label to its URL once, up front
            definitions = file.content.scan(references_regex).to_h

            file.content.scan(referenced_regex).each do |text, ref|
                ref ||= text  # a bare [text] uses its own text as the label
                url = definitions[ref]
                links << { "text" => text, "ref" => ref, "url" => url } if url
            end

            # Exposes the result as page.links, with keys text, ref and url
            file.merge_data!({"links" => links})
        end
    end
end
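
To try regular expressions like these outside of Jekyll, the extraction can be run on a plain string. This is only a minimal sketch: `extract_links` and the sample text are hypothetical names made up for illustration, and the patterns assume images are marked by a leading `!` and that each reference definition sits at the start of its own line.

```ruby
# Hypothetical helper, for illustration only: extracts inline and
# reference-style Markdown links from a string, skipping images.
def extract_links(markdown)
  inline_regex     = /(?<!!)\[([^\]]+)\]\(([^)]+)\)/
  referenced_regex = /(?<![!\]])\[([^\]]+)\](?:\[([^\]]+)\])?(?![:(\[])/
  references_regex = /^\[([^\]]+)\]: *(\S+)/

  # Inline links: one [text, url] pair per match
  links = markdown.scan(inline_regex).map do |text, url|
    { "text" => text, "ref" => text, "url" => url }
  end

  # Reference definitions, as a label => url lookup table
  definitions = markdown.scan(references_regex).to_h

  # Reference-style uses; kept only if a matching definition exists
  markdown.scan(referenced_regex).each do |text, ref|
    ref ||= text
    url = definitions[ref]
    links << { "text" => text, "ref" => ref, "url" => url } if url
  end

  links
end

sample = <<~MARKDOWN
  [google](https://www.google.com) and ![logo](logo.png).
  See [the docs][jekyll] or [liquid].

  [jekyll]: https://jekyllrb.com
  [liquid]: https://shopify.github.io/liquid/
MARKDOWN

extract_links(sample).each { |link| puts "#{link["text"]} -> #{link["url"]}" }
```

With this sample, the image is skipped and the two reference-style links are resolved through their definitions. Nested constructs such as an image wrapped in a link are not handled by patterns this simple.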

Solution

  • You can loop over a dynamically generated front matter attribute.

    The page front matter will contain the link texts and URLs matched by a regular expression.

    Jekyll include file

    To avoid repetition, I have created an include file for your code snippet in _includes/link_info.html.

    <ul>
        {% for link in page.links %}
        <li>do something here with {{link.link_text}} and {{link.link_url}}.</li>
        {% endfor %}
    </ul>
    

    You can insert the include file code in your layout(s) or each post individually. A sample post could be:

    ---
    layout: default
    title:  "Your page title"
    ---
    
    {% include link_info.html %}
    
    [google](https://www.google.com)
    

    Plugin to generate the links front matter attribute on each page

    I have created a plugin using the pre-render hook in _plugins/links_in_documents.rb:

    # This plugin dynamically adds the front matter attribute. It covers documents in all collections, including posts.
    # It does not cover non-collection pages such as index, search or 404 pages, on which the attribute has to be set manually.
    Jekyll::Hooks.register :site, :pre_render do |site|
      site.collections.each do |collection, files|

        if files.docs.any?
          files.docs.each do |file|

            # empty page.links array for the particular file
            links = []

            # Regex from https://stackoverflow.com/questions/9268407/how-to-convert-markdown-style-links-using-regex
            # Any other link types would require adjusting the regex to match the different types.
            regex = /\[([^\]]+)\]\(([^)]+)\)/

            # insert every link into the array; scan returns all matches,
            # whereas match would stop at the first one
            file.content.scan(regex).each do |link_text, link_url|
              ## debug output on jekyll serve
              # puts "link #{link_text} found in #{file.relative_path}"
              links << { "link_text" => link_text, "link_url" => link_url }
            end

            # merges data in the page front matter
            file.merge_data!({"links" => links})
          end
        end
      end
    end
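
The plugin comment notes that non-collection pages (index, search, 404) are not covered. One possible way to fill `links` for them as well is to walk `site.pages` in the same hook. This is a sketch under assumptions, not a tested part of the plugin above: `inline_links` is a hypothetical helper name, only pages with a Markdown extension are scanned, and a `links` value set manually in the front matter is left untouched.

```ruby
# Hypothetical helper: returns every inline Markdown link in `text`
# as a hash with the keys used by the include file (link_text, link_url).
# Unlike the regex in the plugin above, the lookbehind skips images.
def inline_links(text)
  text.to_s.scan(/(?<!!)\[([^\]]+)\]\(([^)]+)\)/).map do |link_text, link_url|
    { "link_text" => link_text, "link_url" => link_url }
  end
end

# Register the hook only when running inside Jekyll, so the helper
# can also be tried out in plain Ruby.
if defined?(Jekyll)
  Jekyll::Hooks.register :site, :pre_render do |site|
    site.pages.each do |page|
      # Assumption: only Markdown pages should be scanned.
      next unless %w[.md .markdown].include?(page.ext)

      # Keep a manually set front matter value if there is one.
      page.data["links"] ||= inline_links(page.content)
    end
  end
end
```

Pages are handled separately from documents here because `site.collections` only contains collection documents such as posts.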