DevLog 6: Extending Markdown

Markdown is ubiquitous nowadays. LLMs speak Markdown. Text editors embrace Markdown. I've been working with it both in my text editor side-project and at my main job. Markdown is great as a lightweight, easy-to-read markup syntax, but it has its limitations. Everyone ends up working around those limitations in one way or another, which makes good interoperability difficult.

The reason for this is that Markdown was originally envisioned as a text-to-HTML conversion tool but ended up having to support a lot more.

As I mentioned in Markdown and WYSIWYG, the de facto standard for Markdown is CommonMark. CommonMark defines the basics - bold, italics, and so on. It also defines lists, quotes, images. But it misses a lot of fairly common formatting elements and entities. I called out tables as an example in my previous post. Everyone is familiar with the table syntax, but it is not part of CommonMark; it is specified as an extension in GitHub's GitHub Flavored Markdown (GFM) spec.

GFM also introduced task list items:

- [x] Todo 1
- [ ] Todo 2

These are commonly understood by parsers but are not part of CommonMark. The same goes for strikethrough, written with ~~ markers, as in ~~struck~~.

Two Ways To Extend

Custom Syntax

One way the syntax gets extended is with custom markers. The above strikethrough is an example. There's also underline, which is commonly a single ~, as in ~underlined~.

Footnotes are another common extension, where a footnote is specified inline via [^1] and, at the bottom of the document, [^1]: <Footnote content>.

Math is another one, when we want to render some LaTeX-syntax math using a library like MathJax. Inline math ends up between $ markers and blocks end up between $$ markers.

Highlights are written with ==, as in ==highlighted==.
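A minimal sketch of how a parser could pick up the strikethrough, underline, and highlight markers described above. It assumes the single-~ underline convention from this post and ignores nesting and escaping; the names PATTERN and find_spans are made up for illustration.

```python
import re

# One alternative per marker. "~~" is listed before "~" so that
# strikethrough is never mistaken for two underline markers.
PATTERN = re.compile(
    r"~~(?P<strikethrough>.+?)~~"
    r"|==(?P<highlight>.+?)=="
    r"|~(?P<underline>[^~]+)~"
)

def find_spans(text: str) -> list[tuple[str, str]]:
    """Return (kind, inner_text) pairs for each marked span, in document order."""
    spans = []
    for match in PATTERN.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        spans.append((kind, match.group(kind)))
    return spans

print(find_spans("a ~~struck~~ b ~underlined~ c ==highlighted=="))
```

The ordering of the alternatives is the only subtle part: because ~ is a prefix of ~~, the longer marker has to be tried first.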

Tradeoffs

The first extension mechanism, custom syntax, stays true to the spirit of Markdown. You can read the plain text and see that ==highlighted== is a highlight. HTML-style tags, the second mechanism, are a bit harder to parse with our eyes.

On the flip side, tags allow us to add rich custom extensions (think additional attributes on the tag) while keeping the document standard-conforming. Any parser will notice there's some custom stuff there and skip over it.

Rich Text Editing

Both of the above work to a certain extent but neither is quite good enough for interoperability between editors.

As I've been working on my text editor side-project, I'm reaching a point where I want to go beyond the basics and implement some common editor features that do not have a direct Markdown representation.

Up to this point, the extensions my editor supports are quite common: underline, strikethrough, highlights. Now it gets more interesting. I want to add support for multiple highlight colors.

What Bear does, which I ended up adopting, is using a block color emoji to define the highlight color, like ==🟩text== to signify a green highlight. This looks very neat and you can read it without a parser to understand what it does.

Another option is to introduce a custom tag, like <highlight color="green">.

The question is, at what point does a Markdown document stop being Markdown? If we load the custom markup in a different Markdown editor, the most likely thing to happen is for the ==🟩text== to show up as a yellow highlighted 🟩text. If we use the tags instead, a good editor will hide the tags, and leave you with text. By Markdown editor here I don't mean a plain-text editor, I mean something like the app I'm building or any similar solution.

I ended up implementing the Bear solution for this, though as far as I know no other editor supports this syntax.
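To make the emoji convention concrete, here is a rough sketch of reading the color off a highlight. The emoji-to-color table and the yellow default are assumptions for illustration; the set of squares a real editor accepts may differ, and this is not the actual parser code.

```python
import re

# Hypothetical emoji-to-color table; the supported set is an assumption.
COLOR_EMOJI = {
    "🟥": "red",
    "🟧": "orange",
    "🟨": "yellow",
    "🟩": "green",
    "🟦": "blue",
    "🟪": "purple",
}
DEFAULT_COLOR = "yellow"  # assumed color for a plain ==highlight==

HIGHLIGHT = re.compile(r"==(.+?)==")

def parse_highlights(text: str) -> list[tuple[str, str]]:
    """Return (color, highlighted_text) pairs for every ==...== span."""
    results = []
    for match in HIGHLIGHT.finditer(text):
        body = match.group(1)
        color = COLOR_EMOJI.get(body[0])
        if color is not None and len(body) > 1:
            # Leading color square: consume it and keep the rest as the text.
            results.append((color, body[1:]))
        else:
            # No recognized prefix: treat the whole body as highlighted text.
            results.append((DEFAULT_COLOR, body))
    return results

print(parse_highlights("==🟩text== and ==plain=="))
```

An editor that doesn't know the convention effectively always takes the else branch, which is why the square shows up as part of the highlighted text.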

Highlights are a relatively easy problem. Here's a more complex one: comments. A good editor should support comments, which would anchor on some text and contain additional text rendered outside the flow of the document. How would we represent comments in Markdown?

There's the custom syntax approach, like [this is text]{and this is a comment}. Or maybe something like footnotes, which is another common extension. We can add a [#comment] inline and store it at the end of the doc as [#comment]: ....

Or use tags: <comment text="this is the comment">and this is the text</comment>. Or use a reference, <comment refid="1">this is the text</comment>, and somewhere else have:

<comments>
    <comment refid="1">this is the comment</comment>
</comments>

All of these are viable options but it should be pretty clear they all diverge quite far from Markdown. Maybe the first option, with some custom bracket combination, is the most Markdown-ish. I haven't yet decided how I will represent comments.
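For the bracket option, a small sketch of what extraction could look like. It assumes the [text]{comment} form shown above with no nesting, and the function name is hypothetical. It returns the document with the comments lifted out, plus the (anchor, comment) pairs an editor would render outside the flow.

```python
import re

# [anchor text]{comment text}, with no nested brackets or braces.
COMMENT = re.compile(r"\[(?P<anchor>[^\[\]]+)\]\{(?P<comment>[^{}]+)\}")

def extract_comments(text: str) -> tuple[str, list[tuple[str, str]]]:
    """Strip comment markup, returning (clean_text, [(anchor, comment), ...])."""
    comments: list[tuple[str, str]] = []

    def keep_anchor(match: re.Match) -> str:
        comments.append((match.group("anchor"), match.group("comment")))
        return match.group("anchor")  # the anchored text stays in the document

    clean = COMMENT.sub(keep_anchor, text)
    return clean, comments

print(extract_comments("See [this is text]{and this is a comment} here."))
```

One thing this sketch glosses over: a regular Markdown link is [text](url), so the curly braces are what keep the two from colliding.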

This raises the question: is Markdown really the best format to use? One option would be to switch my editor to a custom format that can easily support all the features I want. This would make advanced stuff like comments easy to support and get rid of some of the quirks Markdown has, for example the double newline required to create a new paragraph. But, as I said at the beginning, Markdown is ubiquitous. I do want files created with my editor to be easily understood by reading them, opening them in another text editor, or sending them to an LLM.

So inevitably, the more features I add, the further I diverge from any common Markdown implementation. My app's parser/renderer will be the only ones that can properly interpret any document created with my app.

Solutions

The pragmatic solution I will end up implementing, regardless of what flavor I end up choosing to represent comments, is an "Export to common Markdown" feature. Better to be explicit about it: strip custom markup and tags and, while losing fidelity, provide an easy way to get the document closer to the "standard".
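A sketch of what such an export pass might do for the inline extensions discussed so far. The choices here are assumptions: highlights (with or without a color square) and single-~ underline are stripped down to their text, while ~~strikethrough~~ is kept because GFM parsers commonly understand it. The function name and emoji set are illustrative only.

```python
import re

# Assumed set of color squares that may prefix a highlight.
COLOR_EMOJI = "🟥🟧🟨🟩🟦🟪"

def export_common_markdown(text: str) -> str:
    """Strip editor-specific inline markup, keeping only the wrapped text."""
    # Highlights: drop the == markers and an optional leading color square.
    text = re.sub(rf"==[{COLOR_EMOJI}]?(.+?)==", r"\1", text)
    # Underline: a single ~ on each side. The lookarounds skip "~~" so that
    # GFM strikethrough passes through untouched.
    text = re.sub(r"(?<!~)~(?!~)([^~]+)~(?!~)", r"\1", text)
    return text

print(export_common_markdown("==🟩green== ~under~ ~~struck~~ ==plain=="))
```

The export is lossy by design: the color and the underline are gone, but any common parser can now read the result.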

I keep putting "standard" in quotes because there is no real standard. It's a combination of CommonMark, GFM, and support for footnotes, math, etc.

I think this is a good compromise between supporting all the features I want, having a native document format that can still be read as plain text or by an LLM, and also providing a "standard" representation of it.

That said, I do think Markdown would benefit from a well-defined way to add custom extensions. The format has been so successful that it has moved far beyond its plain-text-to-HTML origins. Dozens of text editors embraced it, AI chat embraced it, and every endpoint brought its own additions.

The HTML tag pass-through was a good solution for the initial intended purpose, but it feels like a crutch as an extensibility point. What would work better would be a syntax akin to (:<extension>: text). This would make it explicit to a parser that a custom extension to the syntax is being used and let it handle it as appropriate: discard it and keep the text only, delegate to the extension if available, replace it with a placeholder etc. Not very different from the HTML tags, but unlike HTML tags, this wouldn't be simply pass-through, it would have semantic meaning to the parser. Something like this would also make it easier for both humans and AI to reason over a document, even when the extension interpretation is missing.

The proposed syntax is just an example. It doesn't have to be this; it just has to be something everyone agrees on that meets the requirements.

With this, we would represent a green highlight as (:highlight green: text). Easy to read as plain text and an editor missing the extension could just render text. Maybe even have some fallback to hint to the user the text is marked but the markup is unsupported. A comment could be written as (:comment "this is the comment": text).
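To show how the fallback behavior could work, here is a minimal sketch of a renderer for the proposed (:extension args: text) form. The grammar is an assumption built from the two examples above (an optional bare or quoted argument, no nesting), and the handler registry is hypothetical.

```python
import re
from typing import Callable

EXTENSION = re.compile(
    r'\(:(?P<name>[A-Za-z][\w-]*)'          # extension name
    r'(?:\s+(?P<args>"[^"]*"|[^:"()]+))?'   # optional argument, quoted or bare
    r':\s*(?P<text>[^()]*)\)'               # wrapped text (no nesting here)
)

# A handler receives (args, text) and returns the rendered replacement.
Handler = Callable[[str, str], str]

def render(source: str, handlers: dict[str, Handler]) -> str:
    """Delegate known extensions to their handler; unknown ones keep only the text."""
    def replace(match: re.Match) -> str:
        text = match.group("text")
        handler = handlers.get(match.group("name"))
        if handler is None:
            return text  # fallback: discard the markup, keep the text
        args = (match.group("args") or "").strip('"')
        return handler(args, text)

    return EXTENSION.sub(replace, source)

doc = 'Some (:highlight green: text) and (:comment "this is the comment": more text).'
print(render(doc, {}))
print(render(doc, {"highlight": lambda color, text: f'<mark class="{color}">{text}</mark>'}))
```

With an empty registry, both extensions collapse to their plain text. With only a highlight handler registered, the highlight is rendered and the comment still degrades gracefully, which is the behavior described above.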

HTML tags would be used as originally intended, for custom HTML markup and not to add features. LLMs would be able to clearly distinguish between custom extensions and tags as the distinction would be explicit in the syntax rather than inferred.

The Markdown ecosystem is extremely fragmented as of today and without an agreed-upon solution to bring things back together it will just continue to diverge. Custom notations like ~underline~ and ==🟩green== will keep being invented and supported inconsistently, while custom tags will continue to provide non-portable extensibility. I wish we had a better way of doing this and I'm worried we're too far down the road by now to retrofit it in.