Mup, short for Markup Parser, is a cross-platform library written in C#. It targets .NET Standard 1.0, making it available to a wide variety of devices and applications. The main purpose of the library is to parse Lightweight Markup Languages into various output formats, such as HTML, XHTML, XML, Word documents, Excel documents, or any other type of document. The library does not expose types for each of these formats, but it is designed to be extensible: any parsed text can be run through a custom visitor that traverses the resulting parse tree, letting the developer specify exactly what is generated at every step. To stay lightweight, the library currently provides only a parser for Creole and an HTML visitor that generates HTML from parsed text. Each increment (or major version) will bring a new parser into the fold, thereby supporting more languages. The end goal is to support most, if not all, Lightweight Markup Languages.
Name | Summary |
---|---|
Mup | Contains the base Mup types and implemented parsers. |
Mup.Elements | Contains the parse nodes representing a mark-up document. |
The library has three core types that make it work: IMarkupParser, ParseTreeRootElement, and ParseTreeVisitor.
Each parser (currently just Creole) implements the IMarkupParser interface, which exposes methods for parsing a string or text from a TextReader. Each parser supports both synchronous and asynchronous models, so users can consume the API whichever way they prefer.
The result of any parse method is ultimately a ParseTreeRootElement. Perhaps surprisingly, this type does not expose a root node or anything else one would expect on seeing the word "tree".
This is because trees can have different representations. The usual one has a root node exposing a property that holds its child nodes, each of which exposes the same kind of property for its own children, and so on. A different representation is a flat one, where the entire tree is stored as a list of elements that mark the beginning and end of each node.
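The two representations can be illustrated with a minimal sketch. The types below (Node, Marker, TreeFlattener) are hypothetical and exist only for this example; they are not part of Mup and say nothing about how Mup stores its parse trees. The sketch converts a nested tree into the flat begin/end form to show that both carry the same information.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; these are not part of Mup.

// Nested representation: every node holds a list of its child nodes.
public sealed class Node
{
    public Node(string name, params Node[] children)
    {
        Name = name;
        Children = children;
    }

    public string Name { get; }
    public IReadOnlyList<Node> Children { get; }
}

public enum MarkerKind { Begin, End }

// Flat representation: the tree is a list of begin/end markers.
public sealed class Marker
{
    public Marker(MarkerKind kind, string name)
    {
        Kind = kind;
        Name = name;
    }

    public MarkerKind Kind { get; }
    public string Name { get; }
}

public static class TreeFlattener
{
    // Emits a Begin marker before a node's children and an End marker after them.
    public static List<Marker> Flatten(Node root)
    {
        var markers = new List<Marker>();
        Append(root, markers);
        return markers;
    }

    private static void Append(Node node, List<Marker> markers)
    {
        markers.Add(new Marker(MarkerKind.Begin, node.Name));
        foreach (var child in node.Children)
            Append(child, markers);
        markers.Add(new Marker(MarkerKind.End, node.Name));
    }
}

public static class Program
{
    public static void Main()
    {
        var document = new Node("document",
            new Node("heading", new Node("text")),
            new Node("paragraph", new Node("text")));

        // Prints one marker per line, in document order.
        foreach (var marker in TreeFlattener.Flatten(document))
            Console.WriteLine($"{marker.Kind} {marker.Name}");
    }
}
```

A consumer that only ever walks the tree from start to end cannot tell which of the two forms sits underneath, which is one reason a root type might avoid exposing nodes directly.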
Regardless of how a parse tree is represented, we need to be able to traverse it in order to generate a specific output, say HTML. This is where a ParseTreeVisitor comes into play. Any ParseTreeRootElement exposes methods that accept a ParseTreeVisitor, and the entire logic for traversing the tree is encapsulated inside the root element itself. Each time a node is visited, the method specific to that node is called on the visitor. This keeps the interface clean and completely decouples the language being parsed from the desired output format: any new markup parser will work with existing visitors, and any new visitor will work with any existing parser.
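The decoupling described above can be sketched as follows. All types here (SketchVisitor, SketchRootElement, Element, and the two concrete visitors) are hypothetical, simplified stand-ins written for this example; Mup's actual ParseTreeVisitor and ParseTreeRootElement have a larger surface and may be shaped differently. The point is only that the root element owns the traversal, while each visitor decides what output a node produces.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical, simplified stand-ins for illustration; they are not Mup's types.
public abstract class SketchVisitor
{
    public virtual void VisitParagraphBeginning() { }
    public virtual void VisitParagraphEnding() { }
    public virtual void VisitStrongBeginning() { }
    public virtual void VisitStrongEnding() { }
    public virtual void VisitText(string text) { }
}

public enum ElementKind { ParagraphBeginning, ParagraphEnding, StrongBeginning, StrongEnding, Text }

public sealed class Element
{
    public Element(ElementKind kind, string text = null) { Kind = kind; Text = text; }
    public ElementKind Kind { get; }
    public string Text { get; }
}

public sealed class SketchRootElement
{
    private readonly IReadOnlyList<Element> _elements;

    public SketchRootElement(params Element[] elements) { _elements = elements; }

    // The traversal logic lives here, so visitors never need to know how the
    // tree is stored (a flat list in this sketch).
    public void Accept(SketchVisitor visitor)
    {
        foreach (var element in _elements)
        {
            switch (element.Kind)
            {
                case ElementKind.ParagraphBeginning: visitor.VisitParagraphBeginning(); break;
                case ElementKind.ParagraphEnding: visitor.VisitParagraphEnding(); break;
                case ElementKind.StrongBeginning: visitor.VisitStrongBeginning(); break;
                case ElementKind.StrongEnding: visitor.VisitStrongEnding(); break;
                case ElementKind.Text: visitor.VisitText(element.Text); break;
            }
        }
    }
}

// One output format: HTML.
public sealed class HtmlSketchVisitor : SketchVisitor
{
    private readonly StringBuilder _html = new StringBuilder();
    public override void VisitParagraphBeginning() { _html.Append("<p>"); }
    public override void VisitParagraphEnding() { _html.Append("</p>"); }
    public override void VisitStrongBeginning() { _html.Append("<strong>"); }
    public override void VisitStrongEnding() { _html.Append("</strong>"); }
    public override void VisitText(string text) { _html.Append(WebUtility.HtmlEncode(text)); }
    public override string ToString() { return _html.ToString(); }
}

// Another output format: plain text, ignoring all structure.
public sealed class PlainTextSketchVisitor : SketchVisitor
{
    private readonly StringBuilder _text = new StringBuilder();
    public override void VisitText(string text) { _text.Append(text); }
    public override string ToString() { return _text.ToString(); }
}

public static class Program
{
    public static void Main()
    {
        var root = new SketchRootElement(
            new Element(ElementKind.ParagraphBeginning),
            new Element(ElementKind.Text, "Hello, "),
            new Element(ElementKind.StrongBeginning),
            new Element(ElementKind.Text, "world"),
            new Element(ElementKind.StrongEnding),
            new Element(ElementKind.ParagraphEnding));

        var html = new HtmlSketchVisitor();
        root.Accept(html);
        Console.WriteLine(html); // <p>Hello, <strong>world</strong></p>

        var plain = new PlainTextSketchVisitor();
        root.Accept(plain);
        Console.WriteLine(plain); // Hello, world
    }
}
```

In this sketch, adding a third output format means writing one more visitor subclass, and nothing in SketchRootElement or in whatever produced it has to change.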
The one rule common to all parse trees is that they are traversed in pre-order, meaning each node is visited before its children (see Tree Traversal (Wikipedia) for more about this topic).
Mup Copyright © 2020 Andrei Fangli