Announcing: An improved Clojure brush for Syntax Highlighter
Alex Gorbatchev’s Syntax Highlighter is a JavaScript library that implements the nuts and bolts of highlighting plain text source code embedded in a web page. The library supports around 30 languages in its standard distribution using an extensible system of brushes. Many additional languages, such as Clojure, have third-party brushes available.
Clojure’s current brush is passable, though not comprehensive, because it follows the guide for implementing a brush using a simple regular expression pattern matcher. Simple regex-based syntax highlighting has limited utility for languages in the LISP family; the lexical cues that provide an adequate heuristic for languages with more concrete syntax are absent. Daniel Solano Gómez has taken this limited toolkit to its limit, but there is room for a better approach.
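To make the limitation concrete, here is a minimal sketch of how a keyword-style regular expression can misfire on Clojure source. The pattern and the sample source are hypothetical and are not taken from the existing brush; they only illustrate that `\b` word boundaries do not line up with Clojure symbols, which may contain hyphens.

```javascript
// Hypothetical keyword pattern, in the style a regex-based brush might use.
const keywordPattern = /\b(?:for|let|defn)\b/g;

// `for-each` is a single Clojure symbol, not the `for` macro.
const source = "(defn for-each [f coll] (doseq [x coll] (f x)))";

// Collect every match together with its offset in the source.
const matches = [...source.matchAll(keywordPattern)].map(m => ({
  text: m[0],
  index: m.index,
}));

console.log(matches);
```

The pattern finds `defn` as intended, but it also reports `for` inside the symbol `for-each`, because the regex engine treats the hyphen as a word boundary.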
My new project, (inc clojure-brush)[1], replaces the entire regular expression machinery that Syntax Highlighter expects you to employ with a custom tokenizer, expression tree builder, and form annotator[2]. This allows for structural analysis of the program to determine how to highlight the tokens. For example, the following features are available:
- Highlighting `comment` forms, and those that have been skipped with the `#_` reader macro. Like in Clojure, `comment` forms still appear in the annotator's expressions, while `#_` forms are skipped entirely.
- The head of any list form is treated as a function, method or constructor and highlighted accordingly.
- Metadata is supplied an additional ‘meta’ class in addition to the appropriate classes for the data it represents. Quoted and quasiquoted forms are supplied an additional ‘quoted’ class. Default style rules are provided in the `shClojureExtra.css` stylesheet.
- Heuristics can be supported for common forms (such as `for`, `let`, `defn`, etc.) to provide additional rules for highlighting local variables.
- Incorrectly balanced closing tokens are identified.
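The general shape of a tokenizer, tree builder and annotator pipeline can be sketched as follows. This is a simplified illustration and not the project's actual code: it ignores strings, comments, reader macros and metadata, and every function name and CSS class name in it (`tokenize`, `buildTree`, `annotate`, `functions`, `plain`, `error`) is a placeholder chosen for the example.

```javascript
// Opening delimiters mapped to their expected closers.
const OPENERS = new Map([["(", ")"], ["[", "]"], ["{", "}"]]);
const CLOSERS = new Set(OPENERS.values());

// Tokenizer: split the source into delimiters and atoms, keeping offsets.
function tokenize(source) {
  const pattern = /[()\[\]{}]|[^\s()\[\]{}]+/g;
  return [...source.matchAll(pattern)].map(m => ({ text: m[0], index: m.index }));
}

// Tree builder: nest tokens into forms and record closers that match nothing.
function buildTree(tokens) {
  const root = { type: "root", children: [] };
  const stack = [root];
  for (const token of tokens) {
    const top = stack[stack.length - 1];
    if (OPENERS.has(token.text)) {
      const form = { type: "form", open: token, children: [] };
      top.children.push(form);
      stack.push(form);
    } else if (CLOSERS.has(token.text)) {
      if (top.type === "form" && OPENERS.get(top.open.text) === token.text) {
        stack.pop(); // closer matches the innermost open form
      } else {
        top.children.push({ type: "unbalanced", token });
      }
    } else {
      top.children.push({ type: "atom", token });
    }
  }
  return root;
}

// Annotator: walk the tree and assign a CSS class to atoms and stray closers.
function annotate(node, out = []) {
  node.children.forEach((child, i) => {
    if (child.type === "form") {
      annotate(child, out);
    } else if (child.type === "unbalanced") {
      out.push({ text: child.token.text, index: child.token.index, css: "error" });
    } else {
      // Only the first atom of a parenthesised list counts as its head.
      const isHead = node.type === "form" && node.open.text === "(" && i === 0;
      out.push({
        text: child.token.text,
        index: child.token.index,
        css: isHead ? "functions" : "plain",
      });
    }
  });
  return out;
}

// The sample source has one closing parenthesis too many.
const annotated = annotate(buildTree(tokenize("(let [x 1] (inc x)))")));
console.log(annotated.map(a => `${a.text}:${a.css}`).join(" "));
```

Because the annotator sees the tree rather than a flat string, `let` and `inc` are classified as list heads, the `x` inside the binding vector is not, and the surplus closing parenthesis is flagged on its own.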
Without the aid of the runtime environment, accurate highlighting of a LISP is still inhibited by the potentially limitless transformations that may be achieved through macro expansion. This means that while common forms have heuristics supplied by the new brush, there will still be deficiencies for macros introduced by third-party libraries that define (for instance) binding forms. Implementing macro expansion correctly in JavaScript is obviously beyond the scope of this project.
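A rough sketch of what such a heuristic table might look like, and where it falls short, is given here. It is an illustration under stated assumptions rather than the brush's real implementation: forms are modelled as nested arrays of strings, destructuring and multi-arity `defn` are ignored, and the names `BINDING_FORMS`, `localsOf` and the macro `my-let` are all hypothetical.

```javascript
// Names sit at the even positions of a binding vector: [x 1 y 2] -> x, y.
const everyOther = items => items.filter((_, i) => i % 2 === 0);

// The first vector argument of a form, or an empty list if there is none.
const firstVector = args => args.find(Array.isArray) ?? [];

// Hypothetical table: form name -> function extracting the locals it introduces.
const BINDING_FORMS = new Map([
  ["let", args => everyOther(firstVector(args))],
  ["defn", args => firstVector(args)], // (defn name [params] body...)
]);

// Return the local names a form introduces, if its head is a known binding form.
function localsOf(form) {
  const [head, ...args] = form;
  const extract = BINDING_FORMS.get(head);
  return extract ? extract(args) : [];
}

console.log(localsOf(["let", ["x", "1", "y", "2"], ["+", "x", "y"]]));
console.log(localsOf(["defn", "add", ["a", "b"], ["+", "a", "b"]]));
// A third-party macro that expands to `let` is invisible to the table.
console.log(localsOf(["my-let", ["x", "1"], ["inc", "x"]]));
```

The first two calls recover the locals, while the `my-let` call returns an empty list: without expanding the macro, nothing tells the highlighter that its first argument is a binding vector.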
Edit December, 2024: The examples previously included here have been removed as I no longer use Syntax Highlighter for this blog (and neither should you).
1. I assure you that the project is better than its name.
2. Based on what limited documentation exists for custom syntaxes, I can only assume I have overridden an internal method of the brushes, `findMatches`, to achieve this.