Treerack Manual
This manual describes the primary use cases and workflows supported by Treerack.
Prerequisites
We assume a working installation of the standard Go tooling.
This manual relies on the treerack command-line tool. We can install it using one of the following methods.
A. source installation (requires make):
- clone the repository:
git clone https://code.squareroundforest.org/arpio/treerack
- navigate to the source directory and run:
make install
To install it to a custom location, use the prefix environment variable, e.g. run:
prefix=~/.local make install
- verify the installation by running:
treerack version
man treerack
B. via go install:
Alternatively, we may be able to install directly using the Go toolchain:
- run:
go install code.squareroundforest.org/arpio/treerack/cmd/treerack
(when run outside a Go module, current Go versions require a version suffix, e.g. @latest)
- verify:
treerack help
Hello syntax
A trivial syntax definition looks like this:
hello = "Hello, world!"
This definition matches only the exact string "Hello, world!" and nothing else. To test the validity of this rule, run:
treerack check-syntax --syntax-string 'hello = "Hello, world!"'
If successful, the command exits silently with code 0. (We can append && echo ok to make the successful execution visible.)
To test the syntax against actual input content:
treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
To visualize the resulting Abstract Syntax Tree (AST), use the show subcommand:
treerack show --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
The output will be raw JSON:
{"name":"hello","from":0,"to":13,"text":"Hello, world!"}
For a more readable output, add the --pretty flag:
treerack show --pretty --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
...then the output will look like this:
{
"name": "hello",
"from": 0,
"to": 13,
"text": "Hello, world!"
}
Handling errors
If our syntax definition is invalid, check-syntax will fail:
treerack check-syntax --syntax-string 'foo = bar'
The above command will fail because the parser called foo references an undefined parser bar.
We can use check or show to detect when the input content does not match a valid syntax. Using the hello
syntax, we can try the following:
treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hi!'
The output reports that parsing the input failed, and that it failed while applying the parser hello.
Basic syntax - An arithmetic calculator
In this section, we will build a simple arithmetic calculator. It will read a line from standard input, parse it as an arithmetic expression, compute the result, print it, and start over, effectively creating a REPL (Read-Eval-Print Loop).
We will support addition +, subtraction -, multiplication *, division /, and grouping with parentheses ().
acalc.treerack:
// Define whitespace characters.
// The :ws flag marks this as the global whitespace handler.
ignore:ws = " " | [\t] | [\r] | [\n];
// Define the number format.
//
// The :nows flag ensures we do not skip whitespace *inside* the number token. We support integers, floats, and
// scientific notation (e.g., 1.5e3). Arbitrary leading zeros are disallowed to prevent confusion with octal
// literals.
num:nows = "-"? ("0" | [1-9][0-9]*) ("." [0-9]+)? ([eE] [+\-]? [0-9]+)?;
// define the supported operators:
add = "+";
sub = "-";
mul = "*";
div = "/";
// Grouping logic.
//
// Expressions can be enclosed in parentheses. This references 'expression', which is defined later,
// demonstrating recursive definitions. The :alias flag prevents 'group' from creating its own node in the AST;
// only the child 'expression' will appear.
group:alias = "(" expression ")";
// Operator Precedence.
//
// We group operators by precedence levels to ensure correct order of operations.
//
// Level 0 (high): multiplication/division
op0:alias = mul | div;
// Level 1 (low): addition/subtraction
op1:alias = add | sub;
// Operands for each precedence level.
//
// operand0 can be a raw number or a grouped expression.
operand0:alias = num | group;
// operand1 can be a higher-precedence operand or a completed binary0 operation.
operand1:alias = operand0 | binary0;
// Binary Expressions.
//
// We define these hierarchically. 'binary0' handles high-precedence operations (mul/div).
binary0 = operand0 (op0 operand0)+;
binary1 = operand1 (op1 operand1)+;
binary:alias = binary0 | binary1;
// The generalized Expression.
//
// An expression is either a raw number, a group, or a binary operation.
expression:alias = num | group | binary;
// Root Definition.
//
// The final result is either a valid expression or the "exit" command. Since 'expression' is an alias, we need
// a concrete root parser to anchor the AST. Note: The :root flag is optional here because this is the last
// definition in the file.
result = expression | "exit"
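To get a feel for which inputs the num definition accepts, the following Go sketch mirrors it with a regular expression. The pattern and the names numPattern and isNum are illustrative assumptions, not part of Treerack, and the generated parser does not work with regular expressions; the sketch only approximates the rule for experimentation.

```go
package main

import (
	"fmt"
	"regexp"
)

// numPattern is a hand-written approximation of the 'num' definition:
// optional sign, integer part without leading zeros, optional fraction,
// optional exponent.
var numPattern = regexp.MustCompile(`^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?$`)

// isNum reports whether s as a whole has the shape of a 'num' token.
func isNum(s string) bool {
	return numPattern.MatchString(s)
}

func main() {
	// the last three candidates are rejected: leading zeros, a fraction
	// without digits, and a missing integer part.
	for _, s := range []string{"42", "-0.5", "1.5e3", "007", "1.", ".5"} {
		fmt.Printf("%q: %v\n", s, isNum(s))
	}
}
```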
Testing the syntax
1. Simple number
treerack show --pretty --syntax acalc.treerack --input-string 42
Output:
{
"name": "result",
"from": 0,
"to": 2,
"nodes": [
{
"name": "num",
"from": 0,
"to": 2,
"text": "42"
}
]
}
2. Basic operation
treerack show --pretty --syntax acalc.treerack --input-string "42 + 24"
Output:
{
"name": "result",
"from": 0,
"to": 7,
"nodes": [
{
"name": "binary1",
"from": 0,
"to": 7,
"nodes": [
{
"name": "num",
"from": 0,
"to": 2,
"text": "42"
},
{
"name": "add",
"from": 3,
"to": 4,
"text": "+"
},
{
"name": "num",
"from": 5,
"to": 7,
"text": "24"
}
]
}
]
}
3. Precedence check
treerack show --pretty --syntax acalc.treerack --input-string "42 + 24 * 2"
Output:
{
"name": "result",
"from": 0,
"to": 11,
"nodes": [
{
"name": "binary1",
"from": 0,
"to": 11,
"nodes": [
{
"name": "num",
"from": 0,
"to": 2,
"text": "42"
},
{
"name": "add",
"from": 3,
"to": 4,
"text": "+"
},
{
"name": "binary0",
"from": 5,
"to": 11,
"nodes": [
{
"name": "num",
"from": 5,
"to": 7,
"text": "24"
},
{
"name": "mul",
"from": 8,
"to": 9,
"text": "*"
},
{
"name": "num",
"from": 10,
"to": 11,
"text": "2"
}
]
}
]
}
]
}
4. Grouping override
treerack show --pretty --syntax acalc.treerack --input-string "(42 + 24) * 2"
Notice that the 'group' alias produces no node of its own. Instead, the addition (binary1) now appears directly as an operand of the multiplication (binary0):
{
"name": "result",
"from": 0,
"to": 13,
"nodes": [
{
"name": "binary0",
"from": 0,
"to": 13,
"nodes": [
{
"name": "binary1",
"from": 1,
"to": 8,
"nodes": [
{
"name": "num",
"from": 1,
"to": 3,
"text": "42"
},
{
"name": "add",
"from": 4,
"to": 5,
"text": "+"
},
{
"name": "num",
"from": 6,
"to": 8,
"text": "24"
}
]
},
{
"name": "mul",
"from": 10,
"to": 11,
"text": "*"
},
{
"name": "num",
"from": 12,
"to": 13,
"text": "2"
}
]
}
]
}
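If we want to post-process the output of treerack show in our own tooling, the JSON can be decoded into a small recursive structure. The sketch below assumes only the fields visible in the outputs above (name, from, to, text, nodes); the astNode type and the outline and decodeOutline helpers are hypothetical names chosen for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// astNode mirrors the fields visible in the JSON output above.
type astNode struct {
	Name  string     `json:"name"`
	From  int        `json:"from"`
	To    int        `json:"to"`
	Text  string     `json:"text,omitempty"`
	Nodes []*astNode `json:"nodes,omitempty"`
}

// outline writes one line per node, indented by its depth in the tree.
func outline(n *astNode, depth int, b *strings.Builder) {
	fmt.Fprintf(b, "%s%s [%d:%d]", strings.Repeat("  ", depth), n.Name, n.From, n.To)
	if n.Text != "" {
		fmt.Fprintf(b, " %q", n.Text)
	}
	b.WriteByte('\n')
	for _, c := range n.Nodes {
		outline(c, depth+1, b)
	}
}

// decodeOutline parses a JSON document and returns its outline.
func decodeOutline(doc string) (string, error) {
	var root astNode
	if err := json.Unmarshal([]byte(doc), &root); err != nil {
		return "", err
	}
	var b strings.Builder
	outline(&root, 0, &b)
	return b.String(), nil
}

func main() {
	// the compact form of the "Simple number" output:
	const doc = `{"name":"result","from":0,"to":2,"nodes":[{"name":"num","from":0,"to":2,"text":"42"}]}`
	out, err := decodeOutline(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```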
Generator - Implementing the calculator
We will now generate the Go parser code and integrate it into a CLI application.
Initialize the project:
go mod init acalc && go mod tidy
Generate the parser:
treerack generate --syntax acalc.treerack > parser.go
Implement the application logic in main.go.
main.go:
package main
import (
"bufio"
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"log"
"os"
"strings"
)
var errExit = errors.New("exit")
// repl runs the Read-Eval-Print Loop.
func repl(input io.Reader, output io.Writer) {
// use buffered io, to read the input line-by-line from the reader passed in:
buf := bufio.NewReader(input)
// our REPL:
for {
// print an input prompt marker:
if _, err := output.Write([]byte("> ")); err != nil {
log.Fatalln(err)
}
// read the input and handle the errors:
expr, err := read(buf)
// handle EOF (Ctrl+D). We return instead of calling os.Exit, so that tests can call repl, too:
if errors.Is(err, io.EOF) {
output.Write([]byte{'\n'})
return
}
// handle the explicit exit command:
if errors.Is(err, errExit) {
return
}
// handle parser errors (allow the user to retry):
var perr *parseError
if errors.As(err, &perr) {
log.Println(err)
continue
}
// handle possible I/O errors:
if err != nil {
log.Fatalln(err)
}
// evaluate and print:
result := eval(expr)
if err := print(output, result); err != nil {
log.Fatalln(err)
}
}
}
func read(input *bufio.Reader) (*node, error) {
line, err := input.ReadString('\n')
if err != nil {
return nil, err
}
// parse the line using the generated parser:
expr, err := parse(bytes.NewBufferString(line))
if err != nil {
return nil, err
}
if strings.TrimSpace(expr.Text()) == "exit" {
return nil, errExit
}
// based on our syntax, the root node always has exactly one child: either a number or a binary operation.
return expr.Nodes[0], nil
}
// eval always returns the calculated result as a float64:
func eval(expr *node) float64 {
var value float64
switch expr.Name {
case "num":
// the number format in our syntax is based on the JSON spec, so we can piggy-back on it for the number
// parsing. In a real application, we would need to handle the errors here anyway, even if our parser
// already validated the input:
json.Unmarshal([]byte(expr.Text()), &value)
return value
default:
// handle binary expressions. Format: Operand [Operator Operand]...
value, expr.Nodes = eval(expr.Nodes[0]), expr.Nodes[1:]
for len(expr.Nodes) > 0 {
var (
operator string
operand float64
)
operator, operand, expr.Nodes = expr.Nodes[0].Name, eval(expr.Nodes[1]), expr.Nodes[2:]
switch operator {
case "add":
value += operand
case "sub":
value -= operand
case "mul":
value *= operand
case "div":
value /= operand // float64 division by zero yields +/-Inf (or NaN for 0/0) instead of panicking
}
}
}
return value
}
func print(output io.Writer, result float64) error {
// we can use fmt.Fprintln, which formats the float64 and appends a newline:
_, err := fmt.Fprintln(output, result)
return err
}
func main() {
// for testability, we define the REPL in a separate function so that the test code can call it with
// in-memory buffers as input and output. Our main function calls it with the stdio handles:
repl(os.Stdin, os.Stdout)
}
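The eval function above consumes expr.Nodes in place, so a given node can be evaluated only once. If the tree needs to stay intact, for example to evaluate it repeatedly in tests, an index-based loop is one alternative. The following is a minimal sketch under stated assumptions: tnode is a hypothetical stand-in for the generated node type (with Text as a plain field rather than a method), and strconv.ParseFloat is swapped in for json.Unmarshal so that conversion errors are returned.

```go
package main

import (
	"fmt"
	"strconv"
)

// tnode is a hypothetical stand-in for the generated node type.
type tnode struct {
	Name  string
	Text  string
	Nodes []*tnode
}

// evalNode evaluates a num node or a binary node of the form
// Operand [Operator Operand]... without modifying the tree.
func evalNode(n *tnode) (float64, error) {
	if n.Name == "num" {
		return strconv.ParseFloat(n.Text, 64)
	}
	if len(n.Nodes)%2 == 0 {
		return 0, fmt.Errorf("malformed %s node", n.Name)
	}
	value, err := evalNode(n.Nodes[0])
	if err != nil {
		return 0, err
	}
	for i := 1; i < len(n.Nodes); i += 2 {
		operand, err := evalNode(n.Nodes[i+1])
		if err != nil {
			return 0, err
		}
		switch n.Nodes[i].Name {
		case "add":
			value += operand
		case "sub":
			value -= operand
		case "mul":
			value *= operand
		case "div":
			value /= operand
		default:
			return 0, fmt.Errorf("unknown operator %q", n.Nodes[i].Name)
		}
	}
	return value, nil
}

// small constructors to keep the example tree readable:
func num(text string) *tnode { return &tnode{Name: "num", Text: text} }
func op(name string) *tnode  { return &tnode{Name: name} }

func main() {
	// hand-built tree for: 42 + 24 * 2
	tree := &tnode{Name: "binary1", Nodes: []*tnode{
		num("42"), op("add"),
		{Name: "binary0", Nodes: []*tnode{num("24"), op("mul"), num("2")}},
	}}
	value, err := evalNode(tree)
	fmt.Println(value, err)
}
```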
Running the calculator
Our arithmetic calculator is now ready. We can run it via the go run command. An example session may look like this:
$ go run .
> (42 + 24) * 2
132
> 42 + 24 * 2
90
> 1 + 2 + 3
6
> exit
We can find the source files for this example here: ./examples/acalc.
Important note: unescaping
Treerack does not automatically handle escape sequences (e.g., converting \n to a literal newline). If our
syntax supports escaped characters - common in string literals - the user code is responsible for "unescaping"
the raw text from the AST node.
This is analogous to how we needed to interpret the numbers in the calculator example to convert the string representation of a number into a Go float64.
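As an illustration, suppose a syntax defines double-quoted string literals whose escape sequences happen to be compatible with Go's, and the AST node text includes the surrounding quotes. Both are assumptions made for this sketch and not something Treerack prescribes. In that case, the user code could delegate the unescaping to the standard library; the unescape helper name is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape converts the raw text of a hypothetical string-literal node,
// including its surrounding double quotes, into the value it denotes.
// It relies on the assumption that the syntax uses Go-compatible escapes.
func unescape(raw string) (string, error) {
	return strconv.Unquote(raw)
}

func main() {
	// raw node text: the backslash and the 'n' are two separate characters here.
	raw := `"Hello,\nworld!"`
	s, err := unescape(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// the unescaped value contains a real newline and no quotes:
	fmt.Println(s)
}
```

For a syntax with its own escaping rules, a hand-written conversion would take the place of strconv.Unquote.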
Programmatically loading syntaxes
While generating static code via treerack generate is the recommended approach, we can also load definitions dynamically at runtime.
package parser
import (
"io"
"code.squareroundforest.org/arpio/treerack"
)
func initAndParse(syntax, content io.Reader) (*treerack.Node, error) {
s := &treerack.Syntax{}
if err := s.ReadSyntax(syntax); err != nil {
return nil, err
}
if err := s.Init(); err != nil {
return nil, err
}
return s.Parse(content)
}
Caution: be mindful of security implications when loading syntax definitions from untrusted sources.
Programmatically defining syntaxes
In rare cases where a syntax must be constructed computationally, we can define rules via the Go API:
package parser
import (
"io"
"code.squareroundforest.org/arpio/treerack"
)
func initAndParse(content io.Reader) (*treerack.Node, error) {
s := &treerack.Syntax{}
// whitespace:
s.Class("whitespace-chars", treerack.Alias, false, []rune{' ', '\t', '\r', '\n'}, nil)
s.Choice("whitespace", treerack.Whitespace, "whitespace-chars")
// digits, as a single character range from '0' to '9':
s.Class("digit", treerack.Alias, false, nil, [][]rune{{'0', '9'}})
s.Sequence("number", treerack.NoWhitespace, treerack.SequenceItem{Name: "digit", Min: 1})
s.Class("operator", treerack.None, false, []rune{'+', '-'}, nil)
s.Sequence(
"expression",
treerack.Root,
treerack.SequenceItem{Name: "number"},
treerack.SequenceItem{Name: "operator"},
treerack.SequenceItem{Name: "number"},
)
if err := s.Init(); err != nil {
return nil, err
}
return s.Parse(content)
}
Summary
We have demonstrated how to use the treerack tool to define, test, and implement a parser. We recommend the following workflow:
- draft: define a syntax in a .treerack file.
- verify: use treerack check and treerack show to validate building blocks incrementally.
- generate: use treerack generate to create embeddable Go code.
Links:
- the detailed documentation of the treerack definition language: ./syntax.md
- treerack command help: ../cmd/treerack/readme.md or, if the command is installed, man treerack, or path/to/treerack help
- the arithmetic calculator example: ./examples/acalc
- additional examples: ./examples
Happy parsing!