
tokenize

const tokenize: unique symbol

Defined in: prism/core.d.ts:44

The symbol used to add a custom tokenizer to a grammar.

For example, the markdown code block grammar uses a custom tokenizer to highlight code blocks. This custom tokenizer first tokenizes the code as normal, then finds the language of the code block. If that language has a registered grammar, the content of the code block is tokenized using that language’s grammar.

See CustomTokenizer for the type definition of a custom tokenizer.

It’s very important that you use withoutTokenizer and not tokenizeText inside a custom tokenizer: tokenizeText calls the custom tokenizer again, which leads to infinite recursion.

// A custom tokenizer will often look more or less like this:
const myGrammar = {
  // some tokens ...
  [tokenize](code, grammar) {
    const tokens = withoutTokenizer(code, grammar);
    // Do something with the tokens
    return tokens;
  }
};
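
To make the recursion warning concrete, here is a minimal, self-contained sketch of how the three pieces could interact. It does not use the library: the local `tokenize` symbol, `tokenizeText`, `withoutTokenizer`, `matchRules`, and the simplified `Grammar` shape (with a `rules` field) are all hypothetical stand-ins written for illustration, and the real implementations may differ. The assumption illustrated is that `withoutTokenizer` temporarily disables the custom tokenizer while it tokenizes, which is why calling it from inside a custom tokenizer does not re-enter that tokenizer.

```typescript
// Hypothetical local stand-in for the library's `tokenize` symbol.
const tokenize: unique symbol = Symbol("tokenize");

type Token = { type: string; content: string };
type TokenStream = (string | Token)[];
type CustomTokenizer = (code: string, grammar: Grammar) => TokenStream;

// Simplified grammar shape (an assumption, not the library's type).
interface Grammar {
  rules: Record<string, RegExp>;
  [tokenize]?: CustomTokenizer;
}

// Toy rule matcher: splits plain strings wherever a rule's pattern matches.
function matchRules(code: string, grammar: Grammar): TokenStream {
  let stream: TokenStream = [code];
  for (const type of Object.keys(grammar.rules)) {
    const re = new RegExp(grammar.rules[type].source, "g");
    const next: TokenStream = [];
    for (const part of stream) {
      if (typeof part !== "string") { next.push(part); continue; }
      let last = 0;
      let m: RegExpExecArray | null;
      re.lastIndex = 0;
      while ((m = re.exec(part)) !== null) {
        if (m[0] === "") { re.lastIndex++; continue; } // skip empty matches
        if (m.index > last) next.push(part.slice(last, m.index));
        next.push({ type, content: m[0] });
        last = m.index + m[0].length;
      }
      if (last < part.length) next.push(part.slice(last));
    }
    stream = next;
  }
  return stream;
}

// Dispatches to the custom tokenizer when the grammar defines one.
function tokenizeText(code: string, grammar: Grammar): TokenStream {
  const custom = grammar[tokenize];
  return custom ? custom(code, grammar) : matchRules(code, grammar);
}

// Temporarily removes the custom tokenizer, tokenizes, then restores it.
function withoutTokenizer(code: string, grammar: Grammar): TokenStream {
  const custom = grammar[tokenize];
  grammar[tokenize] = undefined;
  try {
    return tokenizeText(code, grammar);
  } finally {
    grammar[tokenize] = custom;
  }
}

let calls = 0; // counts how often the custom tokenizer runs

const myGrammar: Grammar = {
  rules: { number: /\d+/ },
  [tokenize](code: string, grammar: Grammar): TokenStream {
    calls++;
    const tokens = withoutTokenizer(code, grammar);
    // Post-process: wrap every remaining plain string in a "text" token.
    return tokens.map(t =>
      typeof t === "string" ? { type: "text", content: t } : t
    );
  },
};

const result = tokenizeText("a 12 b", myGrammar);
console.log(calls, result);
```

In this sketch the custom tokenizer runs exactly once per `tokenizeText` call. Replacing `withoutTokenizer` with `tokenizeText` inside the `[tokenize]` method would make the method call itself without end, which is the infinite recursion described above.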