Fast Go syntax highlighter library #
gosyntax is a library for lexing and rendering syntax-highlighted source code. It was built to power syntax highlighting on Gitpatch.
It does not use regexes. Instead, it tokenizes the file and splits it into primary ranges: comments, strings, and embedded blocks or tags. These ranges identify where each type of content is located. For best performance, the lexer does not parse or look for any other tokens at this stage.
After that, each range can be processed further, for example to split out keywords or numbers. This makes it possible to skip extra processing for parts of the file that do not need to be rendered, and to skip tokens that are not going to be highlighted.
The library supports any number of languages in the same file, including JSX tags and embedded script tags.
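To make the idea of primary ranges concrete, here is a minimal standard-library sketch of a single-pass scanner for a JavaScript-like syntax. It is not gosyntax's implementation: the `RangeType`, `Range` and `splitRanges` names are hypothetical, and the rules (only `//`, `/* */` and quoted strings) are deliberately simplified.

```go
package main

import (
	"bytes"
	"fmt"
)

// RangeType labels a primary range. Hypothetical names for illustration;
// these are not gosyntax's actual types.
type RangeType int

const (
	RangeText RangeType = iota
	RangeComment
	RangeString
)

// Range is a half-open byte span [Start, End) of the source.
type Range struct {
	Type       RangeType
	Start, End int
}

// splitRanges makes one pass over src and returns comments, strings and the
// plain text between them. It uses no regexes and looks for no other tokens.
func splitRanges(src []byte) []Range {
	var out []Range
	textStart := 0
	// emit flushes any pending plain text, then appends the special range.
	emit := func(t RangeType, start, end int) {
		if start > textStart {
			out = append(out, Range{RangeText, textStart, start})
		}
		out = append(out, Range{t, start, end})
		textStart = end
	}
	for i := 0; i < len(src); {
		switch {
		case src[i] == '/' && i+1 < len(src) && src[i+1] == '/':
			// Line comment: runs up to, but not including, the newline.
			end := i + 2
			for end < len(src) && src[end] != '\n' {
				end++
			}
			emit(RangeComment, i, end)
			i = end
		case src[i] == '/' && i+1 < len(src) && src[i+1] == '*':
			// Block comment: runs through the closing "*/" or to end of input.
			end := len(src)
			if j := bytes.Index(src[i+2:], []byte("*/")); j >= 0 {
				end = i + 2 + j + 2
			}
			emit(RangeComment, i, end)
			i = end
		case src[i] == '"' || src[i] == '\'':
			// Quoted string with backslash escapes.
			quote := src[i]
			end := i + 1
			for end < len(src) && src[end] != quote {
				if src[end] == '\\' {
					end++ // skip the escaped byte
				}
				end++
			}
			if end < len(src) {
				end++ // include the closing quote
			} else {
				end = len(src) // unterminated string: clamp to input
			}
			emit(RangeString, i, end)
			i = end
		default:
			i++
		}
	}
	if textStart < len(src) {
		out = append(out, Range{RangeText, textStart, len(src)})
	}
	return out
}

func main() {
	src := []byte("x = \"a//b\"; // note\ny = 1 /* c */")
	for _, r := range splitRanges(src) {
		fmt.Printf("%d %q\n", r.Type, src[r.Start:r.End])
	}
}
```

Note that the `//` inside the string literal is not mistaken for a comment, because the scanner is already inside a string range when it reaches it.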
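The second pass can be sketched the same way. The snippet below splits one plain-text range into keyword and non-keyword pieces, and would only need to run on ranges that are actually rendered. The `Piece` type, the `splitKeywords` function and the tiny keyword set are assumptions made for this example, not the library's real API or keyword table.

```go
package main

import "fmt"

// keywords is a small illustrative set, not the library's real table.
var keywords = map[string]bool{
	"function": true, "return": true, "import": true, "from": true,
}

// Piece is a chunk of text flagged as keyword or not (hypothetical type).
type Piece struct {
	Text      string
	IsKeyword bool
}

// isIdentByte reports whether c can be part of an ASCII identifier.
func isIdentByte(c byte) bool {
	return c == '_' || c == '$' ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
}

// splitKeywords walks text once. Whole identifiers found in the keyword set
// become their own pieces; everything in between is merged into plain pieces.
func splitKeywords(text string) []Piece {
	var out []Piece
	plainStart := 0
	for i := 0; i < len(text); {
		if !isIdentByte(text[i]) {
			i++
			continue
		}
		// Consume the whole identifier so "returned" does not match "return".
		j := i
		for j < len(text) && isIdentByte(text[j]) {
			j++
		}
		if keywords[text[i:j]] {
			if i > plainStart {
				out = append(out, Piece{text[plainStart:i], false})
			}
			out = append(out, Piece{text[i:j], true})
			plainStart = j
		}
		i = j
	}
	if plainStart < len(text) {
		out = append(out, Piece{text[plainStart:], false})
	}
	return out
}

func main() {
	for _, p := range splitKeywords("function f() { return x; }") {
		fmt.Printf("%v %q\n", p.IsKeyword, p.Text)
	}
}
```

Because this step is separate from range splitting, a renderer that shows only part of a file can skip it for everything outside the visible region.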
Supported Languages #
The list of languages is a work in progress, and many more will be added soon.
- C
- Go, Templ
- JavaScript, JSX
- Markdown
- HTML
Usage #
package main

import (
	"fmt"

	"github.com/fatih/color"
	"gitpatch.com/se/gosyntax"
)

func main() {
	// Lex a JSX snippet. The source is passed as a byte slice.
	lexer := gosyntax.NewLexer(gosyntax.LangJSX, []byte(`import React from 'react';

function Welcome(props) {
  // The component receives 'props' as an argument, which is an object
  // containing all the properties passed to it.
  return <h1>Hello, {props.name}</h1>;
}

function App() {
  return (
    <div>
      {/* Render the Welcome component and pass a 'name' prop */}
      <Welcome name="O'Conner" />
    </div>
  );
}
`))

	// One terminal color per range type.
	colorComment := color.RGB(0x49, 0x62, 0x82)
	colorString := color.RGB(0x82, 0xb0, 0x9c)
	colorTag := color.RGB(0x61, 0x90, 0xbf)
	colorKeyword := color.New(color.Bold)

	// First pass: iterate over the primary ranges.
	for r := range lexer.Ranges() {
		switch r.Type {
		case gosyntax.RangeComment:
			colorComment.Printf("%s", r.String())
		case gosyntax.RangeString:
			colorString.Printf("%s", r.String())
		case gosyntax.RangeTag:
			colorTag.Printf("%s", r.String())
		default:
			// Second pass: split the remaining text into keywords and the rest.
			for isKeyword, r := range r.SplitKeywords() {
				if isKeyword {
					colorKeyword.Printf("%s", r.String())
				} else {
					fmt.Printf("%s", r.String())
				}
			}
		}
	}
}
Line numbers #
Line numbers can be added with the lexer.Lines() function, which additionally splits ranges into lines.
lineNum := 0
for foundNewline, line := range lexer.Lines() {
	// Print the number before the first range and after every newline.
	if lineNum == 0 || foundNewline {
		lineNum++
		fmt.Printf("% 7d ", lineNum)
	}
	// print ranges here...
}
CLI usage #
This package includes a gosyntax program that can be used to render files in the terminal. Example:
go run ./cmd/gosyntax -- ./snippets/test.jsx
Performance #
gosyntax currently provides syntax highlighting only for comments, strings and keywords, so the comparison below is not like-for-like. However, in some performance-sensitive use cases this may be sufficient, and it comes with a ~100x speedup (about 98x in the run below).
For more complete syntax highlighting, use Chroma or another library for now.
The benchmark uses the non-minified jQuery v2.1.4 source code.
goos: darwin
goarch: arm64
pkg: gitpatch.com/se/gosyntax/benchmark
cpu: Apple M1 Max
BenchmarkGoSyntax
BenchmarkGoSyntax-10 805 1450758 ns/op
BenchmarkChroma
BenchmarkChroma-10 8 142631505 ns/op
PASS
ok gitpatch.com/se/gosyntax/benchmark 2.522s
License #
MIT