How to parse Org headers for exports

Parsing Org headers to add meta information to a Markdown document

What is this all about?

TL;DR: How to retrieve meta information from an Org document when exporting it to Markdown, in order to build the header needed by the Pelican static site generator.

So, I have been playing with Emacs for a little while now, coming initially from Vim. I like Vim, but I have always wanted to try out Emacs. What really appealed to me is its customizability; I tried a few times to fiddle around with Vimscript, but I never found it easy to work with. I don't blame Vimscript, but it was enough for me to want to try something else. It is also a way for me to work on how strongly I can feel about developer tools, which is most of the time a waste of it. What really made Emacs click for me is Org Mode. I really got into it: it offers all the things that Markdown lacks, in a consistent and standard way. If you're not aware, Org Mode describes itself as:

Org mode is for keeping notes, maintaining TODO lists, planning projects, and authoring documents with a fast and effective plain-text system.

You can browse its documentation or read about it in a few other places, but the point is that it became what I want to use for all of my structured text editing: everything I used to use Markdown for, and more. This blog you're reading is generated with Pelican, the static site generator I use. But Pelican cannot parse the Org format to generate HTML, not yet anyway.

Thankfully, Org Mode has an export feature, which can be extended and has A LOT of options. It might seem weird at first to write an Org document only to export it to Markdown, but it turns out to be worth it. Org Mode can export to Beamer, HTML, LaTeX, Markdown, even OpenDocument, and its export dispatcher can be extended to handle your format of choice. That makes it a universal format to write in. It is the one I chose anyway.

But here's the catch: while I found a package that upgrades the generated Markdown to the more popular GitHub Flavored Markdown[1], making it easier to read on its own, I still need to populate the generated document with the meta information needed by Pelican. It took me a lot of time to find a decent way to do it, and I didn't find many similar use cases online, hence this post. But I see it as worth it since it is in Elisp, and the time invested in understanding the inner workings of my editor of choice will make it easier in the long term to make it work however I want.

Let's get into it

After reading a bunch of documentation, both on the way Org Mode works and on Elisp, I found a few ways to make it work. I'll only present the one I am currently using (which is to be improved anyway) because it is the easiest for me to use right now. I ended up on this page of the manual, which was a pretty good way of getting started. So I now have a way to write my own functions to shape the generated document as I want, with what Emacs calls export filters.

Since I want to put something at the very top of the resulting document, I figured I'd have to use the final-output filter (org-export-filter-final-output-functions), since there is no handy hook as there would be if we were exporting to HTML. I also need a way to get the Org headers that already exist and match the meta information I want to give to Pelican. To make it easier, the ox-html.el file[2] already has a metadata builder to populate its header tag. I got the following snippet working, which is largely based on it:

(defun org-pelican-build-meta-info (info)
  "Return Pelican metadata lines for the exported document.
INFO is a plist used as a communication channel."
  (let* ((title (org-export-data (plist-get info :title) info))
         (author (org-export-data (plist-get info :author) info))
         (date (org-export-data (plist-get info :date) info))
         (description (plist-get info :description))
         ;; File-wide tags are called filetags.
         (tags (plist-get info :filetags)))
    ;; Each `when' yields nil for a missing field, which `concat' skips.
    (concat
     (when (org-string-nw-p title)
       (format "Title: %s\n" title))
     (when (org-string-nw-p author)
       (format "Author: %s\n" author))
     (when (org-string-nw-p description)
       (format "Summary: %s\n" description))
     (when (org-string-nw-p date)
       (format "Date: %s\n" date))
     (when tags
       (format "Tags: %s\n" (mapconcat #'identity tags ", "))))))

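(defun my-md-filter-final-output (text backend info)
  "Prepend Pelican metadata to TEXT for Markdown-derived exports.
BACKEND is the export backend name and INFO is a plist used as a
communication channel."
  ;; This checks against the default Markdown backend; change `md'
  ;; if you want to target another one.
  (when (org-export-derived-backend-p backend 'md)
    (concat (org-pelican-build-meta-info info) text)))

;; Register the filter once the export framework is loaded.
(with-eval-after-load 'ox
  (add-to-list 'org-export-filter-final-output-functions
               #'my-md-filter-final-output))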

So for instance, for a dummy Org file such as this one:

#+TITLE: This is a post
#+AUTHOR: myself
#+DATE: 08-10-2018
#+DESCRIPTION: This is a dummy post for export purposes
#+FILETAGS: :org:dummy:post:
#+OPTIONS: toc:nil
* Introduction
  Hey! You should check out how awesome Org mode is!
  But let's not forget that this is just a example paragraph.

  And this is too, but just another one.

* Conclusion
  Here's another header as another example so this isn't too empty.

Exporting it with the default Markdown backend gives this:

Title: This is a post
Author: myself
Summary: This is a dummy post for export purposes
Date: 08-10-2018
Tags: org, dummy, post

# Introduction

Hey! You should check out how awesome Org mode is!
But let's not forget that this is just a example paragraph.

And this is too, but just another one.


# Conclusion

Here's another header as another example so this isn't too empty.
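
To check a filter like this without going through the export dispatcher, Org can also export a string directly from Lisp. Below is a minimal sketch, assuming Org's Markdown exporter (ox-md) is available. The names my-demo-title-filter and my-demo-export are hypothetical and only exist for this illustration, the filter is a cut-down version that handles the title alone, and it is bound with let so that nothing is registered globally.

```elisp
(require 'ox-md)

;; Hypothetical, cut-down filter: it only handles the title.
(defun my-demo-title-filter (text backend info)
  "Prepend a Pelican-style Title line to TEXT for Markdown exports.
BACKEND is the export backend name, INFO the export plist."
  (when (org-export-derived-backend-p backend 'md)
    (concat (format "Title: %s\n"
                    (org-export-data (plist-get info :title) info))
            text)))

;; Hypothetical helper: export ORG-SOURCE to Markdown with only the
;; demo filter active, by let-binding the filter list.
(defun my-demo-export (org-source)
  "Export the string ORG-SOURCE to Markdown using the demo filter."
  (let ((org-export-filter-final-output-functions
         '(my-demo-title-filter)))
    (org-export-string-as org-source 'md)))

(my-demo-export
 "#+TITLE: This is a post\n#+OPTIONS: toc:nil\n* Introduction\nHey!\n")
```

Evaluating the last form (for example with C-x C-e) should return the exported Markdown as a string that starts with the Title line, which makes it easy to iterate on the filter before registering it for real.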

There are still a few things that could be improved, especially the date format, which differs between Org and Pelican. For now, I am changing the format in Org itself this way:

(setq org-time-stamp-formats
      '("%d-%m-%Y" . "%a %d-%m-%Y %H:%M"))

According to Stack Overflow, this is not recommended, as it is a global variable. I'll update this post the moment I find a better solution, but for the moment, I need to move on.
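
One possible direction, offered as a sketch rather than a tested replacement: Org's export library provides org-export-get-date, which takes an optional format string and applies it when #+DATE: holds a single Org timestamp (for example <2018-10-08 Mon>). The helper name my-pelican-date and the "%Y-%m-%d" format are assumptions made for this example, so check which date formats your Pelican setup accepts.

```elisp
(require 'ox)

;; Hypothetical helper, not part of Org or of the snippet above.
(defun my-pelican-date (info)
  "Return the document date from INFO as a string.
When #+DATE: is a single Org timestamp, format it as YYYY-MM-DD.
Otherwise, fall back to exporting the date as plain data."
  (let ((date (org-export-get-date info "%Y-%m-%d")))
    (if (stringp date)
        ;; A lone timestamp was found and formatted for us.
        date
      ;; Free-form date (or none): export it as-is.
      (org-export-data date info))))
```

It could then stand in for (org-export-data (plist-get info :date) info) in org-pelican-build-meta-info, leaving org-time-stamp-formats untouched. The trade-off is that the date has to be written as an Org timestamp for the format to apply.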

And that is pretty much it. I am only getting started with Elisp, but hopefully I'll know my way around a bit more next time I have to dive in.

You are more than welcome to email me if you found it helpful, if you know of a better way of doing something similar or if there is any mistake I made.

Footnotes

[1] The Markdown standard it uses is that one.

[2] You can browse that file locally by looking up a function it defines: there is a link to the file in the function's description. E.g. C-h f org-html-code RET. I didn't find a better way to look it up yet.
