arpio/treerack

Arpad Ryszka 2fe6f88ed6 error fixes and doc updates

2026-01-21 20:54:16 +01:00

Treerack Manual

This manual describes the primary use cases and workflows supported by Treerack.

Prerequisites

We assume a working installation of the standard Go tooling.

This manual relies on the treerack command-line tool. We can install it using one of the following methods.

A. source installation (requires make):

clone the repository: git clone https://code.squareroundforest.org/arpio/treerack
navigate to the source directory and run: make install. To install to a custom location, set the prefix environment variable, e.g. run prefix=~/.local make install
verify the installation: run treerack version and man treerack

B. via go install:

Alternatively, we may be able to install directly using the Go toolchain:

run go install code.squareroundforest.org/arpio/treerack/cmd/treerack@latest
verify: treerack help

Hello syntax

A trivial syntax definition looks like this:

hello = "Hello, world!"

This definition matches only the exact string "Hello, world!" and nothing else. To test the validity of this rule, run:

treerack check-syntax --syntax-string 'hello = "Hello, world!"'

If successful, the command exits silently with code 0. (We can append && echo ok to make the success visible.)

To test the syntax against actual input content:

treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'

To visualize the resulting Abstract Syntax Tree (AST), use the show subcommand:

treerack show --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'

The output will be raw JSON:

{"name":"hello","from":0,"to":13,"text":"Hello, world!"}

For a more readable output, add the --pretty flag:

treerack show --pretty --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'

...then the output will look like this:

{
    "name": "hello",
    "from": 0,
    "to": 13,
    "text": "Hello, world!"
}
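
If we want to consume this output from a Go program, the JSON can be mapped onto a small recursive struct. The following is a minimal sketch, assuming only the field names visible in the outputs of this manual (name, from, to, text, and nodes for child nodes). The names astNode and decodeAST are hypothetical and not part of Treerack:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// astNode mirrors the JSON printed by `treerack show`.
// The type name is our own; it is not provided by Treerack.
type astNode struct {
	Name  string     `json:"name"`
	From  int        `json:"from"`
	To    int        `json:"to"`
	Text  string     `json:"text,omitempty"`
	Nodes []*astNode `json:"nodes,omitempty"`
}

// decodeAST parses the JSON output into an astNode tree.
func decodeAST(data []byte) (*astNode, error) {
	var n astNode
	if err := json.Unmarshal(data, &n); err != nil {
		return nil, err
	}

	return &n, nil
}

func main() {
	// the raw output of the show subcommand from the example above:
	out := []byte(`{"name":"hello","from":0,"to":13,"text":"Hello, world!"}`)
	n, err := decodeAST(out)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}

	fmt.Printf("%s [%d, %d): %q\n", n.Name, n.From, n.To, n.Text)
}
```

The same struct should also fit the nested outputs shown later in this manual, since child nodes repeat the shape of their parent.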

Handling errors

If our syntax definition is invalid, check-syntax will fail:

treerack check-syntax --syntax-string 'foo = bar'

The above command will fail because the parser called foo references an undefined parser bar.

We can use check or show to detect when the input content does not match a valid syntax. Using the hello syntax, we can try the following:

treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hi!'

The output reports that parsing the input failed and that the failure occurred in the parser hello.

Basic syntax - An arithmetic calculator

In this section, we will build a simplistic arithmetic calculator. It will read a line from standard input, parse it as an arithmetic expression, compute the result, print it, and start over - effectively creating a REPL (Read-Eval-Print Loop).

We will support addition +, subtraction -, multiplication *, division /, and grouping with parentheses ().

acalc.treerack:

// Define whitespace characters.
// The :ws flag marks this as the global whitespace handler.
ignore:ws = " " | [\t] | [\r] | [\n];

// Define the number format.
//
// The :nows flag ensures we do not skip whitespace *inside* the number token. We support integers, floats, and
// scientific notation (e.g., 1.5e3). Arbitrary leading zeros are disallowed to prevent confusion with octal
// literals.
num:nows = "-"? ("0" | [1-9][0-9]*) ("." [0-9]+)? ([eE] [+\-]? [0-9]+)?;

// define the supported operators:
add = "+";
sub = "-";
mul = "*";
div = "/";

// Grouping logic.
//
// Expressions can be enclosed in parentheses. This references 'expression', which is defined later,
// demonstrating recursive definitions. The :alias flag prevents 'group' from creating its own node in the AST;
// only the child 'expression' will appear.
group:alias = "(" expression ")";

// Operator Precedence.
//
// We group operators by precedence levels to ensure correct order of operations.
//
// Level 0 (high): multiplication/division
op0:alias = mul | div;

// Level 1 (low): addition/subtraction
op1:alias = add | sub;

// Operands for each precedence level.
//
// operand0 can be a raw number or a grouped expression.
operand0:alias = num | group;

// operand1 can be a higher-precedence operand or a completed binary0 operation.
operand1:alias = operand0 | binary0;

// Binary Expressions.
//
// We define these hierarchically. 'binary0' handles high-precedence operations (mul/div).
binary0 = operand0 (op0 operand0)+;
binary1 = operand1 (op1 operand1)+;
binary:alias = binary0 | binary1;

// The generalized Expression.
//
// An expression is either a raw number, a group, or a binary operation.
expression:alias = num | group | binary;

// Root Definition.
//
// The final result is either a valid expression or the "exit" command. Since 'expression' is an alias, we need
// a concrete root parser to anchor the AST. Note: The :root flag is optional here because this is the last
// definition in the file.
result = expression | "exit"
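
To get a feel for which inputs the num rule is meant to accept, it can help to compare it with an equivalent regular expression. The sketch below is our own reading of the rule translated to Go's regexp syntax. It is an illustration only and not how Treerack matches input; the names numPattern and isNum are hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// numPattern is a regular-expression rendering of the num rule:
// optional sign, integer part without extra leading zeros,
// optional fraction, optional exponent.
var numPattern = regexp.MustCompile(`^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?$`)

// isNum reports whether s, taken as a whole, has the shape of a num.
func isNum(s string) bool {
	return numPattern.MatchString(s)
}

func main() {
	for _, s := range []string{"0", "42", "-1.5e3", "2E-7", "007", "1.", ".5"} {
		fmt.Printf("%q: %v\n", s, isNum(s))
	}
}
```

In this reading, 007 is rejected because of the leading zeros, while 1. and .5 are rejected because both the integer part and the digits after the decimal point are mandatory.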

Testing the syntax

1. Simple number

treerack show --pretty --syntax acalc.treerack --input-string 42

Output:

{
    "name": "result",
    "from": 0,
    "to": 2,
    "nodes": [
        {
            "name": "num",
            "from": 0,
            "to": 2,
            "text": "42"
        }
    ]
}

2. Basic operation

treerack show --pretty --syntax acalc.treerack --input-string "42 + 24"

Output:

{
    "name": "result",
    "from": 0,
    "to": 7,
    "nodes": [
        {
            "name": "binary1",
            "from": 0,
            "to": 7,
            "nodes": [
                {
                    "name": "num",
                    "from": 0,
                    "to": 2,
                    "text": "42"
                },
                {
                    "name": "add",
                    "from": 3,
                    "to": 4,
                    "text": "+"
                },
                {
                    "name": "num",
                    "from": 5,
                    "to": 7,
                    "text": "24"
                }
            ]
        }
    ]
}

3. Precedence check

treerack show --pretty --syntax acalc.treerack --input-string "42 + 24 * 2"

Output:

{
    "name": "result",
    "from": 0,
    "to": 11,
    "nodes": [
        {
            "name": "binary1",
            "from": 0,
            "to": 11,
            "nodes": [
                {
                    "name": "num",
                    "from": 0,
                    "to": 2,
                    "text": "42"
                },
                {
                    "name": "add",
                    "from": 3,
                    "to": 4,
                    "text": "+"
                },
                {
                    "name": "binary0",
                    "from": 5,
                    "to": 11,
                    "nodes": [
                        {
                            "name": "num",
                            "from": 5,
                            "to": 7,
                            "text": "24"
                        },
                        {
                            "name": "mul",
                            "from": 8,
                            "to": 9,
                            "text": "*"
                        },
                        {
                            "name": "num",
                            "from": 10,
                            "to": 11,
                            "text": "2"
                        }
                    ]
                }
            ]
        }
    ]
}

4. Grouping override

treerack show --pretty --syntax acalc.treerack --input-string "(42 + 24) * 2"

Notice that the 'group' alias node is not present; instead, the addition expression now appears as an operand of the multiplication:

{
    "name": "result",
    "from": 0,
    "to": 13,
    "nodes": [
        {
            "name": "binary0",
            "from": 0,
            "to": 13,
            "nodes": [
                {
                    "name": "binary1",
                    "from": 1,
                    "to": 8,
                    "nodes": [
                        {
                            "name": "num",
                            "from": 1,
                            "to": 3,
                            "text": "42"
                        },
                        {
                            "name": "add",
                            "from": 4,
                            "to": 5,
                            "text": "+"
                        },
                        {
                            "name": "num",
                            "from": 6,
                            "to": 8,
                            "text": "24"
                        }
                    ]
                },
                {
                    "name": "mul",
                    "from": 10,
                    "to": 11,
                    "text": "*"
                },
                {
                    "name": "num",
                    "from": 12,
                    "to": 13,
                    "text": "2"
                }
            ]
        }
    ]
}

Generator - Implementing the calculator

We will now generate the Go parser code and integrate it into a CLI application.

Initialize the project:

go mod init acalc && go mod tidy

Generate the parser:

treerack generate --syntax acalc.treerack > parser.go

Implement the application logic in main.go.

main.go:

package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"os"
	"strings"
)

var errExit = errors.New("exit")

// repl runs the Read-Eval-Print Loop.
func repl(input io.Reader, output io.Writer) {

	// use buffered I/O to read the input line by line. We wrap the input argument rather than os.Stdin, so
	// that tests can pass in their own reader:
	buf := bufio.NewReader(input)

	// our REPL:
	for {
		// print an input prompt marker:
		if _, err := output.Write([]byte("> ")); err != nil {
			log.Fatalln(err)
		}

		// read the input and handle the errors:
		expr, err := read(buf)

		// handle EOF (Ctrl+D):
		if errors.Is(err, io.EOF) {
			output.Write([]byte{'\n'})
			os.Exit(0)
		}

		// handle the explicit exit command:
		if errors.Is(err, errExit) {
			os.Exit(0)
		}

		// handle parser errors (allow the user to retry):
		var perr *parseError
		if errors.As(err, &perr) {
			log.Println(err)
			continue
		}

		// handle possible I/O errors:
		if err != nil {
			log.Fatalln(err)
		}

		// evaluate and print:
		result := eval(expr)
		if err := print(output, result); err != nil {
			log.Fatalln(err)
		}
	}
}

func read(input *bufio.Reader) (*node, error) {
	line, err := input.ReadString('\n')
	if err != nil {
		return nil, err
	}

	// parse the line using the generated parser:
	expr, err := parse(bytes.NewBufferString(line))
	if err != nil {
		return nil, err
	}

	if strings.TrimSpace(expr.Text()) == "exit" {
		return nil, errExit
	}

	// based on our syntax, the root node always has exactly one child: either a number or a binary operation.
	return expr.Nodes[0], nil
}

// eval always returns the calculated result as a float64:
func eval(expr *node) float64 {
	var value float64
	switch expr.Name {
	case "num":

		// the number format in our syntax is based on the JSON spec, so we can piggy-back on it for the number
		// parsing. In a real application, we would need to handle the errors here anyway, even if our parser
		// already validated the input:
		json.Unmarshal([]byte(expr.Text()), &value)
		return value
	default:

		// handle binary expressions. Format: Operand [Operator Operand]...
		value, expr.Nodes = eval(expr.Nodes[0]), expr.Nodes[1:]
		for len(expr.Nodes) > 0 {
			var (
				operator string
				operand  float64
			)

			operator, operand, expr.Nodes = expr.Nodes[0].Name, eval(expr.Nodes[1]), expr.Nodes[2:]
			switch operator {
			case "add":
				value += operand
			case "sub":
				value -= operand
			case "mul":
				value *= operand
			case "div":
				value /= operand // Go handles division by zero as +/-Inf
			}
		}
	}

	return value
}

func print(output io.Writer, result float64) error {
	// print the result followed by a newline:
	_, err := fmt.Fprintln(output, result)
	return err
}

func main() {
	// for testability, we define the REPL in a separate function so that the test code can call it with
	// in-memory buffers as input and output. Our main function calls it with the stdio handles:
	repl(os.Stdin, os.Stdout)
}
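
As the comment in eval notes, a real application should handle errors from the number conversion even though the parser has already validated the format, for example when the value is out of range for a float64. A minimal sketch of such a helper follows. It swaps json.Unmarshal for strconv.ParseFloat, and the name parseNum is hypothetical. To use it, eval would need to return an error as well:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseNum converts the text of a num node to a float64. It reports
// conversion failures, such as values that overflow float64, instead of
// silently ignoring them.
func parseNum(text string) (float64, error) {
	v, err := strconv.ParseFloat(text, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid number %q: %w", text, err)
	}

	return v, nil
}

func main() {
	for _, text := range []string{"1.5e3", "1e999"} {
		v, err := parseNum(text)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}

		fmt.Println(v)
	}
}
```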

Running the calculator

Our arithmetic calculator is now ready. We can start it with go run . from the project directory. An example session may look like this:

$ go run .
> (42 + 24) * 2
132
> 42 + 24 * 2
90
> 1 + 2 + 3
6
> exit

We can find the source files for this example here: ./examples/acalc.

Important note: unescaping

Treerack does not automatically handle escape sequences (e.g., converting \n to a literal newline). If our syntax supports escaped characters - common in string literals - the user code is responsible for "unescaping" the raw text from the AST node.

This is analogous to how we needed to interpret the numbers in the calculator example to convert the string representation of a number into a Go float64.
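
As an illustration, suppose a syntax defines double-quoted string literals whose escape sequences happen to be compatible with Go's. In that case, the user code could unescape the raw node text with the standard library. This is a sketch under that assumption; a syntax with different escape rules would need its own conversion, and the name unescape is hypothetical:

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape converts the raw text of a double-quoted string literal,
// including the surrounding quotes, to the string value it denotes.
// Assumption: the literal uses Go-compatible escape sequences.
func unescape(raw string) (string, error) {
	return strconv.Unquote(raw)
}

func main() {
	// raw text as it might appear in the text of an AST node:
	raw := `"first line\nsecond line"`
	s, err := unescape(raw)
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}

	fmt.Println(s)
}
```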

Programmatically loading syntaxes

While generating static code via treerack generate is the recommended approach, we can also load definitions dynamically at runtime.

package parser

import (
	"io"
	"code.squareroundforest.org/arpio/treerack"
)

func initAndParse(syntax, content io.Reader) (*treerack.Node, error) {
	s := &treerack.Syntax{}
	if err := s.ReadSyntax(syntax); err != nil {
		return nil, err
	}

	if err := s.Init(); err != nil {
		return nil, err
	}

	return s.Parse(content)
}

Caution: be mindful of security implications when loading syntax definitions from untrusted sources.

Programmatically defining syntaxes

In rare cases where a syntax must be constructed computationally, we can define rules via the Go API:

package parser

import (
	"io"
	"code.squareroundforest.org/arpio/treerack"
)

func initAndParse(content io.Reader) (*treerack.Node, error) {
	s := &treerack.Syntax{}

	// whitespace:
	s.Class("whitespace-chars", treerack.Alias, false, []rune{' ', '\t', '\r', '\n'}, nil)
	s.Choice("whitespace", treerack.Whitespace, "whitespace-chars")

	s.Class("digit", treerack.Alias, false, nil, [][]rune{{'0', '9'}})
	s.Sequence("number", treerack.NoWhitespace, treerack.SequenceItem{Name: "digit", Min: 1})
	s.Class("operator", treerack.None, false, []rune{'+', '-'}, nil)
	s.Sequence(
		"expression",
		treerack.Root,
		treerack.SequenceItem{Name: "number"}, 
		treerack.SequenceItem{Name: "operator"}, 
		treerack.SequenceItem{Name: "number"}, 
	)

	if err := s.Init(); err != nil {
		return nil, err
	}

	return s.Parse(content)
}

Summary

We have demonstrated how to use the treerack tool to define, test, and implement a parser. We recommend the following workflow:

draft: define a syntax in a .treerack file.
verify: use treerack check and treerack show to validate building blocks incrementally.
generate: use treerack generate to create embeddable Go code.

Links:

the detailed documentation of the treerack definition language: ./syntax.md
treerack command help: ../cmd/treerack/readme.md or, if the command is installed, man treerack, or path/to/treerack help
the arithmetic calculator example: ./examples/acalc.
additional examples: ./examples

Happy parsing!
