A custom Datastar attribute plugin that uses the CSS Custom Highlight API to add syntax highlighting to elements.
<pre data-highlight="json">
{
"foo": "bar",
"baz": false
}
</pre>
The plugin expects you to provide an import map that specifies the location of the datastar module, as well as the tokenizer module for each language you want to support. To include a language, create an import mapping for it using the module name format data-highlight:<lang_code>.
Then, it's a simple matter of including a <script type="module"> element for the plugin and a <link rel="stylesheet"> with the relevant CSS. For example, the following configuration will enable highlighting for css, json, and html:
<script type="importmap">
{
"imports": {
"datastar": "https://cdn.jsdelivr.net/gh/starfederation/datastar@1.0.0-RC.6/bundles/datastar.js",
"data-highlight:css": "https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/tokenizers/css.js",
"data-highlight:json": "https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/tokenizers/json.js",
"data-highlight:html": "https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/tokenizers/html.js"
}
}
</script>
<link href="https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/data-highlight.css" rel="stylesheet">
<script type="module" src="https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/data-highlight.js"></script>
The plugin is a flexible implementation of syntax highlighting that can adapt to dynamic content changes in the DOM. If you often use the data-json-signals attribute, you may find that this plugin pairs well with it.
The plugin utilises the CSS Custom Highlight API so you only need to provide the raw text; no need to wrap your code snippets in dozens of additional span elements. This makes the plugin ideal when you are serving a static HTML page and/or do not have a backend set up to parse code snippets and render them with syntax highlighting.
In order to stay lightweight and take advantage of resource caching, the plugin is designed to be extensible: tokenizers are loaded dynamically, on demand, and each one is served as its own module. This means that the client will only download tokenizers for languages that are actually used on the page, keeping the footprint as small as possible, while also enabling you to include custom tokenizers for any language you wish to support.
Another powerful advantage of this method of highlighting is that it is easily customisable with CSS. The plugin provides some sample CSS to get you started, but you remain in full control. It uses the ::highlight() pseudo-element, meaning you can support multiple themes, change them on-the-fly, and easily add additional rules for any token types your custom tokenizers may return. No need to worry about trying to override baked-in inline styles on span elements.
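For readers who have not used the CSS Custom Highlight API before, the general technique can be sketched as follows. This is an illustration only, not the plugin's source code: the function names groupByType and applyHighlights are hypothetical, and the sketch assumes tokens with type, start, and end fields relative to a single text node.

```typescript
type Token = { type: string; start: number; end: number; value: string };

// Pure step: collect [start, end) offsets per token type.
function groupByType(tokens: Token[]): Record<string, Array<[number, number]>> {
  // Null prototype so a token type such as "constructor" cannot collide with inherited keys.
  const groups: Record<string, Array<[number, number]>> = Object.create(null);
  for (const { type, start, end } of tokens) {
    if (!groups[type]) groups[type] = [];
    groups[type].push([start, end]);
  }
  return groups;
}

// Browser-only step: one Range per token, one Highlight per token type.
// `textNode` is expected to be a DOM Text node. The DOM globals are read through
// an `any` cast only so that the sketch compiles without DOM typings.
function applyHighlights(textNode: object, tokens: Token[]): void {
  const { Range, Highlight, CSS } = globalThis as any;
  const groups = groupByType(tokens);
  for (const type of Object.keys(groups)) {
    const ranges = groups[type].map(([start, end]) => {
      const range = new Range();
      range.setStart(textNode, start);
      range.setEnd(textNode, end);
      return range;
    });
    // Registers the ranges under `type`, so `::highlight(<type>)` rules apply to them.
    // Note: set() replaces any Highlight already registered under that name.
    CSS.highlights.set(type, new Highlight(...ranges));
  }
}
```

Because the ranges only point into existing text, the DOM itself is left unchanged, which is why no wrapper span elements are needed. A real implementation would also have to merge ranges from several elements that share a token type, which this sketch does not attempt.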
There are, unfortunately, some drawbacks to the plugin's implementation with the CSS Custom Highlight API. Some are unavoidable; others could perhaps be improved upon.
- Requires a modern browser; while the CSS Custom Highlight API is baseline widely available, if you need to support older browsers you will want to consider a different approach.
- Flashes of unstyled content; there will be no styling before the plugin has processed the text on first page load, and FOUCs will occur any time the text node updates. This is unfortunately unavoidable.
- Increased client-side processing; if your code snippet is static and never changes, then processing the syntax highlighting once server-side and sending static HTML will be more performant, compared to re-calculating each time on the client.
- Limited set of stylable CSS properties; this is a limitation of the ::highlight() pseudo-element. Only a small subset of CSS properties can be used. If you need to apply other styles while highlighting, this plugin will not be suitable.
- Cannot style inputs or textareas; this is a current limitation of the CSS Custom Highlight API. Perhaps in the future such functionality will be included in the specification. In the meantime, you could consider using the overlay approach (see the live demo page examples).
- Only operates on text nodes; this decision was made in order to keep the plugin simple, reliable, and performant. Ranges can start and end on different nodes, so the plugin will attempt to highlight text that may spread across multiple text nodes.
The plugin adds two new attributes that you can use on elements: data-highlight and data-highlight-text.
It is recommended to view the live demo and documentation page in order to see the plugin in action, interact with the examples, and benefit from the additional context they provide.
To use the data-highlight attribute, simply pass the language as the attribute's key or value and ensure the element only contains text nodes (i.e. no nested elements) with the code snippet you want highlighted:
<!-- using the attribute key -->
<pre data-highlight:json>{ "foo": "bar" }</pre>
<!-- or, using the value -->
<pre data-highlight="json">{ "foo": "bar" }</pre>
If the element with the data-highlight attribute contains nested elements, the plugin will return early and not apply any highlights to that element.
The value of the attribute must be a language string; it is not a Datastar expression. However, you are still able to dynamically change the target language with a signal, if you wish. You can compose it with the standard data-attr attribute to set the value whenever the signal changes, for example:
<!-- assuming the $language signal is defined elsewhere -->
<pre data-attr:data-highlight="$language"></pre>
The attribute supports the following modifiers:
- __debug: When included, the plugin logs the input, language code, and array of tokens to the browser developer console each time the tokenizer is invoked for that element. This is particularly useful when you are customising your CSS theme and need to identify which token types were assigned to certain text ranges.
The data-highlight-text attribute functions similarly to data-text, except that it is written specifically to work well with data-highlight by retaining existing highlights, where possible, when updating text. This helps to avoid flashes of unstyled content after an update.
It accomplishes this by splitting text on every new line, creating a distinct text node for each line, and later re-using those same nodes when possible, so that previous highlight ranges can be repurposed rather than orphaned and cleaned up.
For example:
<!-- Prints all signals, with json syntax highlighting: -->
<pre
data-highlight:json
data-highlight-text="JSON.stringify($, null, 2)">
</pre>
<!-- which is equivalent to: -->
<pre data-highlight:json data-json-signals></pre>
This plugin is ideal for content that updates with a high frequency, for example signals that change on short intervals, or when handling text input, where new highlighting may be applied with every keystroke.
For static, or infrequently changing content, it is likely not necessary; in these instances, outputting text as raw HTML content, using the data-text attribute, should suffice.
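The line-based node re-use described above can be pictured with a small planning function. This is a simplified sketch under stated assumptions, not the plugin's actual algorithm: it compares lines by position only, and the names LineOp and planLineOps are hypothetical. A "keep" entry stands for a text node (and its highlight ranges) that could be left untouched.

```typescript
type LineOp =
  | { kind: "keep"; index: number }
  | { kind: "update"; index: number; text: string }
  | { kind: "append"; index: number; text: string }
  | { kind: "remove"; index: number };

// Compare the old and new text line by line and decide, per line position,
// whether the existing node can be kept, must be rewritten, added, or dropped.
function planLineOps(previous: string, next: string): LineOp[] {
  const before = previous.split("\n");
  const after = next.split("\n");
  const ops: LineOp[] = [];
  const length = Math.max(before.length, after.length);
  for (let i = 0; i < length; i++) {
    if (i >= after.length) {
      ops.push({ kind: "remove", index: i });
    } else if (i >= before.length) {
      ops.push({ kind: "append", index: i, text: after[i] });
    } else if (before[i] === after[i]) {
      ops.push({ kind: "keep", index: i });
    } else {
      ops.push({ kind: "update", index: i, text: after[i] });
    }
  }
  return ops;
}
```

For a pretty-printed JSON object where a single value changes, only the line holding that value is marked "update"; every other line is "keep", so its highlights would not need to be recomputed.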
You may well wish to use PrismJS, as it supports far more languages than the data-highlight plugin will ever have first-party tokenizers for. For this reason, the plugin provides a compatibility layer that lets you use the tokenizers from PrismJS.
In order to use PrismJS, you must include the following script tag somewhere on the page, pointing to the relevant JS file:
<script src="/path/to/prism.js" data-manual></script>
Note that you won't be able to use Prism's autoloader script; instead, download your own custom bundle with the languages you wish to support from the PrismJS download page.
It is also important that you include the data-manual attribute on the script tag, to indicate to Prism that it should not parse the page and try to apply any highlights itself. The data-highlight plugin manages that instead and simply uses the language tokenizers from PrismJS.
Finally, you must add the Prism compatibility script in your import map, with a module entry for each language that you wish to use. The module names should follow the same format of data-highlight:<lang_code> for each supported language, but all should point to the same script. The language codes must match those defined by Prism, per its supported languages section, in order for the plugin to be able to identify the correct tokenizer.
For example, the following enables support for JavaScript and Rust via Prism:
{
"imports": {
// ... your other imports
"data-highlight:js": "https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/tokenizers/prism.js",
"data-highlight:rust": "https://cdn.jsdelivr.net/gh/regaez/data-highlight@0.1.0/dist/tokenizers/prism.js"
}
}
Then it's as simple as using the language code with the data-highlight attribute, like any other:
<pre>
<code data-highlight:js>
let greeting = "hello world";
alert(greeting);
</code>
</pre>
There is no need to include the Prism CSS file, as it will not be used.
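To give a rough idea of what a compatibility layer of this kind has to do, the sketch below flattens a Prism-style token stream into flat tokens with character offsets. It is an assumption about the general approach, not the plugin's actual prism.js module; PrismNode and flattenPrismTokens are hypothetical names, and PrismNode models only the type and content fields of the values that Prism.tokenize() returns.

```typescript
type Token = { type: string; start: number; end: number; value: string };

// Structural subset of a Prism token stream: plain strings for unclassified
// text, and token objects whose content may itself be nested.
type PrismNode = string | { type: string; content: PrismNode | PrismNode[] };

function flattenPrismTokens(stream: PrismNode[]): Token[] {
  const tokens: Token[] = [];
  let offset = 0; // running character index into the original input

  const walk = (node: PrismNode | PrismNode[], type: string | null): void => {
    if (Array.isArray(node)) {
      for (const child of node) walk(child, type);
    } else if (typeof node === "string") {
      // Text inside a token takes the nearest enclosing token type;
      // top-level strings are unclassified and only advance the offset.
      if (type !== null && node.length > 0) {
        tokens.push({ type, start: offset, end: offset + node.length, value: node });
      }
      offset += node.length;
    } else {
      walk(node.content, node.type);
    }
  };

  walk(stream, null);
  return tokens;
}
```

In a browser with Prism loaded, the input stream would come from something like Prism.tokenize(input, Prism.languages[language]), which is why the language codes need to match the ones Prism defines.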
In order to create your own custom tokenizer, you must create an ES Module which exports a default function that satisfies the following signature:
type Token = {
type: string; // The type should map to a CSS ::highlight() pseudo-element name
start: number; // The index of the first character of the token
end: number; // The index of the first character AFTER the token has ended
value: string; // The text slice of the input between the start and end indices
};
export default function(
input: string, // The element's textContent value
language: string // The language as specified by the data-highlight attribute key/value
): Token[]
As a tokenizer can return any string value for the type field, you may also need to extend your CSS to handle styling any custom tokens that aren't covered by the plugin's provided styles, or your existing stylesheet. See the custom themes section for more information.
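As an illustration of the signature above, here is a minimal sketch of a tokenizer for a made-up KEY=value format with # comments. The format itself and the token type names (comment, property, operator, string) are assumptions chosen for the example; whether the provided stylesheet already covers those names is not something this sketch relies on.

```typescript
type Token = { type: string; start: number; end: number; value: string };

// `language` is unused here; a module mapped to several language codes
// could branch on it.
export default function tokenize(input: string, language: string): Token[] {
  const tokens: Token[] = [];
  const push = (type: string, start: number, end: number): void => {
    // Skip empty slices so every token covers at least one character.
    if (end > start) tokens.push({ type, start, end, value: input.slice(start, end) });
  };

  let lineStart = 0; // index of the current line's first character within input
  for (const line of input.split("\n")) {
    const hash = line.indexOf("#");
    const codeEnd = hash === -1 ? line.length : hash;
    const eq = line.indexOf("=");
    if (eq !== -1 && eq < codeEnd) {
      push("property", lineStart, lineStart + eq);
      push("operator", lineStart + eq, lineStart + eq + 1);
      push("string", lineStart + eq + 1, lineStart + codeEnd);
    }
    if (hash !== -1) push("comment", lineStart + hash, lineStart + line.length);
    lineStart += line.length + 1; // +1 for the "\n" removed by split
  }
  return tokens;
}
```

To try something like this, you would serve the compiled module from a path of your choosing, map it in the import map under a name in the data-highlight:<lang_code> format, and use that same language code with the data-highlight attribute.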
The plugin provides an example stylesheet with some pre-defined styles assigned to the most common token types. You can either import this on your page via a <link> tag (see "Getting started"), or better yet: simply copy the contents into your own stylesheet and tweak it to suit your needs/preferences.
When working with third-party tokenizers, or when building your own custom ones, you may need to extend the CSS rules to include tokens that are not supported out-of-the-box. You can accomplish this by adding new ::highlight() pseudo-element selectors with the appropriate token name and the CSS properties you need.
For example, to style the token type foo to have blue text:
::highlight(foo) {
color: blue;
}
Note that only a small subset of CSS properties can be applied to the ::highlight() pseudo-element.
MIT