# Treerack Manual
This manual describes the primary use cases and workflows supported by Treerack.
## Prerequisites
We assume a working installation of the standard Go tooling.
This manual relies on the treerack command-line tool. We can install it using one of the following methods.
**A. source installation (requires make):**
1. clone the repository `git clone https://code.squareroundforest.org/arpio/treerack`
2. navigate to the source directory and run `make install`. To install to a custom location, set the `prefix`
environment variable, e.g. `prefix=~/.local make install`
3. verify the installation: run `treerack version` and `man treerack`
**B. via go install:**
Alternatively, we _may be able to_ install directly using the Go toolchain:
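1. run `go install code.squareroundforest.org/arpio/treerack/cmd/treerack@latest` (the Go toolchain requires a
version suffix such as `@latest` when `go install` is run outside of a module)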
2. verify: `treerack help`
## Hello syntax
A trivial syntax definition looks like this:
```
hello = "Hello, world!"
```
This definition matches only the exact string "Hello, world!" and nothing else. To test the validity of this
rule, run:
```
treerack check-syntax --syntax-string 'hello = "Hello, world!"'
```
If successful, the command exits silently with code 0. (We can append `&& echo ok` to make the successful
execution visible.)
To test the syntax against actual input content:
```
treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
```
To visualize the resulting Abstract Syntax Tree (AST), use the `show` subcommand:
```
treerack show --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
```
The output will be raw JSON:
```
{"name":"hello","from":0,"to":13,"text":"Hello, world!"}
```
For a more readable output, add the `--pretty` flag:
```
treerack show --pretty --syntax-string 'hello = "Hello, world!"' --input-string 'Hello, world!'
```
...then the output will look like this:
```
{
    "name": "hello",
    "from": 0,
    "to": 13,
    "text": "Hello, world!"
}
```
### Handling errors
If our syntax definition is invalid, `check-syntax` will fail:
```
treerack check-syntax --syntax-string 'foo = bar'
```
The above command will fail because the parser called `foo` references an undefined parser `bar`.
We can use `check` or `show` to detect when the input content does not match a valid syntax. Using the hello
syntax, we can try the following:
```
treerack check --syntax-string 'hello = "Hello, world!"' --input-string 'Hi!'
```
The command reports that parsing the input failed, and that the failure occurred in the `hello` parser.
## Basic syntax - An arithmetic calculator
In this section, we will build a simplistic arithmetic calculator. It will read a line from standard input,
parse it as an arithmetic expression, compute the result, print it, and start over - effectively creating a REPL
(Read-Eval-Print Loop).
We will support addition `+`, subtraction `-`, multiplication `*`, division `/`, and grouping with parentheses `()`.
acalc.treerack:
```
// Define whitespace characters.
// The :ws flag marks this as the global whitespace handler.
ignore:ws = " " | [\t] | [\r] | [\n];
// Define the number format.
//
// The :nows flag ensures we do not skip whitespace *inside* the number token. We support integers, floats, and
// scientific notation (e.g., 1.5e3). Arbitrary leading zeros are disallowed to prevent confusion with octal
// literals.
num:nows = "-"? ("0" | [1-9][0-9]*) ("." [0-9]+)? ([eE] [+\-]? [0-9]+)?;
// define the supported operators:
add = "+";
sub = "-";
mul = "*";
div = "/";
// Grouping logic.
//
// Expressions can be enclosed in parentheses. This references 'expression', which is defined later,
// demonstrating recursive definitions. The :alias flag prevents 'group' from creating its own node in the AST;
// only the child 'expression' will appear.
group:alias = "(" expression ")";
// Operator Precedence.
//
// We group operators by precedence levels to ensure correct order of operations.
//
// Level 0 (high): multiplication/division
op0:alias = mul | div;
// Level 1 (low): addition/subtraction
op1:alias = add | sub;
// Operands for each precedence level.
//
// operand0 can be a raw number or a grouped expression.
operand0:alias = num | group;
// operand1 can be a higher-precedence operand or a completed binary0 operation.
operand1:alias = operand0 | binary0;
// Binary Expressions.
//
// We define these hierarchically. 'binary0' handles high-precedence operations (mul/div).
binary0 = operand0 (op0 operand0)+;
binary1 = operand1 (op1 operand1)+;
binary:alias = binary0 | binary1;
// The generalized Expression.
//
// An expression is either a raw number, a group, or a binary operation.
expression:alias = num | group | binary;
// Root Definition.
//
// The final result is either a valid expression or the "exit" command. Since 'expression' is an alias, we need
// a concrete root parser to anchor the AST. Note: The :root flag is optional here because this is the last
// definition in the file.
result = expression | "exit"
```
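To get a feel for what the `num` rule accepts, we can mirror it as a regular expression in Go. This is only an illustrative sketch: Treerack does not use regular expressions for this rule, and the names `numPattern` and `isNum` are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// numPattern mirrors the num rule from acalc.treerack as a regular expression:
// an optional minus sign, an integer part without leading zeros, an optional
// fraction, and an optional exponent. It is an illustration only.
var numPattern = regexp.MustCompile(`^-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?$`)

// isNum reports whether s has the shape described by the num rule.
func isNum(s string) bool {
	return numPattern.MatchString(s)
}

func main() {
	for _, s := range []string{"0", "42", "-1.5e3", "007", "1.", ".5"} {
		fmt.Printf("%q: %v\n", s, isNum(s))
	}
}
```

Inputs such as `007`, `1.` and `.5` are rejected, which matches the intent stated in the comments of the syntax: no arbitrary leading zeros, and a fraction requires digits on both sides of the dot.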
### Testing the syntax
#### 1. Simple number
```
treerack show --pretty --syntax acalc.treerack --input-string 42
```
Output:
```
{
    "name": "result",
    "from": 0,
    "to": 2,
    "nodes": [
        {
            "name": "num",
            "from": 0,
            "to": 2,
            "text": "42"
        }
    ]
}
```
#### 2. Basic operation
```
treerack show --pretty --syntax acalc.treerack --input-string "42 + 24"
```
Output:
```
{
    "name": "result",
    "from": 0,
    "to": 7,
    "nodes": [
        {
            "name": "binary1",
            "from": 0,
            "to": 7,
            "nodes": [
                {
                    "name": "num",
                    "from": 0,
                    "to": 2,
                    "text": "42"
                },
                {
                    "name": "add",
                    "from": 3,
                    "to": 4,
                    "text": "+"
                },
                {
                    "name": "num",
                    "from": 5,
                    "to": 7,
                    "text": "24"
                }
            ]
        }
    ]
}
```
#### 3. Precedence check
```
treerack show --pretty --syntax acalc.treerack --input-string "42 + 24 * 2"
```
Output:
```
{
    "name": "result",
    "from": 0,
    "to": 11,
    "nodes": [
        {
            "name": "binary1",
            "from": 0,
            "to": 11,
            "nodes": [
                {
                    "name": "num",
                    "from": 0,
                    "to": 2,
                    "text": "42"
                },
                {
                    "name": "add",
                    "from": 3,
                    "to": 4,
                    "text": "+"
                },
                {
                    "name": "binary0",
                    "from": 5,
                    "to": 11,
                    "nodes": [
                        {
                            "name": "num",
                            "from": 5,
                            "to": 7,
                            "text": "24"
                        },
                        {
                            "name": "mul",
                            "from": 8,
                            "to": 9,
                            "text": "*"
                        },
                        {
                            "name": "num",
                            "from": 10,
                            "to": 11,
                            "text": "2"
                        }
                    ]
                }
            ]
        }
    ]
}
```
#### 4. Grouping override
```
treerack show --pretty --syntax acalc.treerack --input-string "(42 + 24) * 2"
```
Notice that the `group` alias produces no node of its own. Instead, the addition (`binary1`) now appears
directly as an operand of the multiplication (`binary0`):
```
{
    "name": "result",
    "from": 0,
    "to": 13,
    "nodes": [
        {
            "name": "binary0",
            "from": 0,
            "to": 13,
            "nodes": [
                {
                    "name": "binary1",
                    "from": 1,
                    "to": 8,
                    "nodes": [
                        {
                            "name": "num",
                            "from": 1,
                            "to": 3,
                            "text": "42"
                        },
                        {
                            "name": "add",
                            "from": 4,
                            "to": 5,
                            "text": "+"
                        },
                        {
                            "name": "num",
                            "from": 6,
                            "to": 8,
                            "text": "24"
                        }
                    ]
                },
                {
                    "name": "mul",
                    "from": 10,
                    "to": 11,
                    "text": "*"
                },
                {
                    "name": "num",
                    "from": 12,
                    "to": 13,
                    "text": "2"
                }
            ]
        }
    ]
}
```
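Since `show` prints plain JSON, its output can also be consumed by other tools. The following is a minimal sketch in Go that decodes such a document and prints a compact outline of the tree. The struct fields are taken from the outputs above; the names `astNode`, `decode` and `outline` are assumptions made for this example and are not part of Treerack.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// astNode mirrors the JSON fields printed by `treerack show`.
// The type itself is a stand-in defined only for this sketch.
type astNode struct {
	Name  string     `json:"name"`
	From  int        `json:"from"`
	To    int        `json:"to"`
	Text  string     `json:"text,omitempty"`
	Nodes []*astNode `json:"nodes,omitempty"`
}

// decode parses the JSON output into a tree of astNode values.
func decode(doc string) (*astNode, error) {
	var root astNode
	if err := json.Unmarshal([]byte(doc), &root); err != nil {
		return nil, err
	}
	return &root, nil
}

// outline renders one line per node, indented by two spaces per level.
// Leaf nodes also show their quoted text.
func outline(n *astNode, depth int) string {
	var b strings.Builder
	b.WriteString(strings.Repeat("  ", depth))
	b.WriteString(n.Name)
	if n.Text != "" {
		fmt.Fprintf(&b, " %q", n.Text)
	}
	b.WriteString("\n")
	for _, child := range n.Nodes {
		b.WriteString(outline(child, depth+1))
	}
	return b.String()
}

// sample is the output shown above for "(42 + 24) * 2", in compact form.
const sample = `{"name":"result","from":0,"to":13,"nodes":[{"name":"binary0","from":0,"to":13,"nodes":[{"name":"binary1","from":1,"to":8,"nodes":[{"name":"num","from":1,"to":3,"text":"42"},{"name":"add","from":4,"to":5,"text":"+"},{"name":"num","from":6,"to":8,"text":"24"}]},{"name":"mul","from":10,"to":11,"text":"*"},{"name":"num","from":12,"to":13,"text":"2"}]}]}`

func main() {
	root, err := decode(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Print(outline(root, 0))
}
```

The outline makes the nesting easy to scan: `binary1` sits one level below `binary0`, which is how the grouping overrides the default precedence.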
## Generator - Implementing the calculator
We will now generate the Go parser code and integrate it into a CLI application.
Initialize the project:
```
go mod init acalc && go mod tidy
```
Generate the parser:
```
treerack generate --syntax acalc.treerack > parser.go
```
Implement the application logic in main.go.
main.go:
```
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"os"
	"strings"
)

var errExit = errors.New("exit")

// repl runs the Read-Eval-Print Loop.
func repl(input io.Reader, output io.Writer) {
	// use buffered I/O to read the input line by line:
	buf := bufio.NewReader(input)

	// our REPL:
	for {
		// print an input prompt marker:
		if _, err := output.Write([]byte("> ")); err != nil {
			log.Fatalln(err)
		}

		// read the input and handle the errors:
		expr, err := read(buf)

		// handle EOF (Ctrl+D):
		if errors.Is(err, io.EOF) {
			output.Write([]byte{'\n'})
			os.Exit(0)
		}

		// handle the explicit exit command:
		if errors.Is(err, errExit) {
			os.Exit(0)
		}

		// handle parser errors (allow the user to retry):
		var perr *parseError
		if errors.As(err, &perr) {
			log.Println(err)
			continue
		}

		// handle possible I/O errors:
		if err != nil {
			log.Fatalln(err)
		}

		// evaluate and print:
		result := eval(expr)
		if err := print(output, result); err != nil {
			log.Fatalln(err)
		}
	}
}

func read(input *bufio.Reader) (*node, error) {
	line, err := input.ReadString('\n')
	if err != nil {
		return nil, err
	}

	// parse the line using the generated parser:
	expr, err := parse(bytes.NewBufferString(line))
	if err != nil {
		return nil, err
	}

	if strings.TrimSpace(expr.Text()) == "exit" {
		return nil, errExit
	}

	// based on our syntax, the root node always has exactly one child: either a number or a binary operation.
	return expr.Nodes[0], nil
}

// eval always returns the calculated result as a float64:
func eval(expr *node) float64 {
	var value float64
	switch expr.Name {
	case "num":
		// the number format in our syntax is based on the JSON spec, so we can piggy-back on it for the number
		// parsing. In a real application, we would need to handle the errors here anyway, even if our parser
		// already validated the input:
		json.Unmarshal([]byte(expr.Text()), &value)
		return value
	default:
		// handle binary expressions. Format: Operand [Operator Operand]...
		value, expr.Nodes = eval(expr.Nodes[0]), expr.Nodes[1:]
		for len(expr.Nodes) > 0 {
			var (
				operator string
				operand  float64
			)

			operator, operand, expr.Nodes = expr.Nodes[0].Name, eval(expr.Nodes[1]), expr.Nodes[2:]
			switch operator {
			case "add":
				value += operand
			case "sub":
				value -= operand
			case "mul":
				value *= operand
			case "div":
				value /= operand // float64 division by zero yields +/-Inf (NaN for 0/0), it does not panic
			}
		}
	}

	return value
}

func print(output io.Writer, result float64) error {
	// fmt.Fprintln formats the float64 and appends a newline:
	_, err := fmt.Fprintln(output, result)
	return err
}

func main() {
	// for testability, we define the REPL in a separate function so that the test code can call it with
	// in-memory buffers as input and output. Our main function calls it with the stdio handles:
	repl(os.Stdin, os.Stdout)
}
```
### Running the calculator
Our arithmetic calculator is now ready. We can run it via `go run .`. An example session may look like this:
```
$ go run .
> (42 + 24) * 2
132
> 42 + 24 * 2
90
> 1 + 2 + 3
6
> exit
```
We can find the source files for this example here: [./examples/acalc](./examples/acalc).
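To see the evaluation step in isolation, here is a self-contained sketch that folds a hand-built tree for `42 + 24 * 2` from left to right. The `treeNode` type is a hypothetical stand-in for the node type of the generated parser, and `evalTree`, `num`, `op` and `precedenceTree` are names invented for this example. As a variation on `eval` above, it leaves the tree unmodified, parses numbers with `strconv.ParseFloat`, and returns errors instead of ignoring them.

```go
package main

import (
	"fmt"
	"strconv"
)

// treeNode is a hypothetical stand-in for the generated parser's node type,
// reduced to what the evaluation needs.
type treeNode struct {
	Name  string
	Text  string
	Nodes []*treeNode
}

// evalTree evaluates a num node, or folds a binary node from left to right.
// A binary node has the shape: operand (operator operand)...
func evalTree(n *treeNode) (float64, error) {
	if n.Name == "num" {
		return strconv.ParseFloat(n.Text, 64)
	}
	// a well-formed binary node always has an odd number of children:
	if len(n.Nodes)%2 == 0 {
		return 0, fmt.Errorf("malformed %s node: %d children", n.Name, len(n.Nodes))
	}
	value, err := evalTree(n.Nodes[0])
	if err != nil {
		return 0, err
	}
	for i := 1; i < len(n.Nodes); i += 2 {
		operand, err := evalTree(n.Nodes[i+1])
		if err != nil {
			return 0, err
		}
		switch n.Nodes[i].Name {
		case "add":
			value += operand
		case "sub":
			value -= operand
		case "mul":
			value *= operand
		case "div":
			value /= operand
		default:
			return 0, fmt.Errorf("unknown operator %q", n.Nodes[i].Name)
		}
	}
	return value, nil
}

// num and op are small helpers for building trees by hand.
func num(text string) *treeNode { return &treeNode{Name: "num", Text: text} }
func op(name string) *treeNode  { return &treeNode{Name: name} }

// precedenceTree builds the tree shown earlier for "42 + 24 * 2".
func precedenceTree() *treeNode {
	product := &treeNode{Name: "binary0", Nodes: []*treeNode{num("24"), op("mul"), num("2")}}
	return &treeNode{Name: "binary1", Nodes: []*treeNode{num("42"), op("add"), product}}
}

func main() {
	value, err := evalTree(precedenceTree())
	fmt.Println(value, err)
}
```

Because the multiplication is already nested as a `binary0` node inside `binary1`, the evaluator needs no precedence logic of its own: the tree shape carries it, and the result is 90, as in the session above.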
## Important note: unescaping
Treerack does not automatically handle escape sequences (e.g., converting `\n` to a literal newline). If our
syntax supports escaped characters - common in string literals - the user code is responsible for "unescaping"
the raw text from the AST node.
This is analogous to how we needed to interpret the numbers in the calculator example to convert the string
representation of a number into a Go float64.
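As an illustration, the following sketch unescapes the raw text of a double-quoted string literal. The supported escape set (`\n`, `\t`, `\r`, `\\`, `\"`) is an assumption made for this example, since every syntax defines its own escapes, and the function name `unescape` is hypothetical. In a real application, the input would be the text of the corresponding AST node.

```go
package main

import (
	"fmt"
	"strings"
)

// unescape converts the raw text of a double-quoted string literal into the
// value it denotes. The escape set handled here is assumed for this sketch.
func unescape(raw string) (string, error) {
	if len(raw) < 2 || raw[0] != '"' || raw[len(raw)-1] != '"' {
		return "", fmt.Errorf("not a quoted string: %s", raw)
	}

	// strip the surrounding quotes and process the body rune by rune:
	body := raw[1 : len(raw)-1]
	var b strings.Builder
	escaped := false
	for _, r := range body {
		if !escaped {
			if r == '\\' {
				escaped = true
			} else {
				b.WriteRune(r)
			}
			continue
		}

		// the previous rune was a backslash, so r selects the escape:
		escaped = false
		switch r {
		case 'n':
			b.WriteRune('\n')
		case 't':
			b.WriteRune('\t')
		case 'r':
			b.WriteRune('\r')
		case '\\', '"':
			b.WriteRune(r)
		default:
			return "", fmt.Errorf("unknown escape sequence: \\%c", r)
		}
	}

	if escaped {
		return "", fmt.Errorf("dangling backslash in: %s", raw)
	}

	return b.String(), nil
}

func main() {
	value, err := unescape(`"Hello,\n\"world\"!"`)
	fmt.Printf("%q %v\n", value, err)
}
```

When the syntax happens to follow Go's own string literal rules, the standard library function `strconv.Unquote` can do this work instead of a hand-written loop.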
## Programmatically loading syntaxes
While generating static code via `treerack generate` is the recommended approach, we can also load definitions
dynamically at runtime.
```
package parser

import (
	"io"

	"code.squareroundforest.org/arpio/treerack"
)

func initAndParse(syntax, content io.Reader) (*treerack.Node, error) {
	s := &treerack.Syntax{}
	if err := s.ReadSyntax(syntax); err != nil {
		return nil, err
	}

	if err := s.Init(); err != nil {
		return nil, err
	}

	return s.Parse(content)
}
```
Caution: be mindful of security implications when loading syntax definitions from untrusted sources.
## Programmatically defining syntaxes
In rare cases where a syntax must be constructed computationally, we can define rules via the Go API:
```
package parser

import (
	"io"

	"code.squareroundforest.org/arpio/treerack"
)

func initAndParse(content io.Reader) (*treerack.Node, error) {
	s := &treerack.Syntax{}

	// whitespace:
	s.Class("whitespace-chars", treerack.Alias, false, []rune{' ', '\t', '\r', '\n'}, nil)
	s.Choice("whitespace", treerack.Whitespace, "whitespace-chars")

	// number: a sequence of at least one digit, given as the rune range '0' to '9':
	s.Class("digit", treerack.Alias, false, nil, [][]rune{{'0', '9'}})
	s.Sequence("number", treerack.NoWhitespace, treerack.SequenceItem{Name: "digit", Min: 1})

	// operator: either '+' or '-':
	s.Class("operator", treerack.None, false, []rune{'+', '-'}, nil)

	// expression, the root parser: number operator number
	s.Sequence(
		"expression",
		treerack.Root,
		treerack.SequenceItem{Name: "number"},
		treerack.SequenceItem{Name: "operator"},
		treerack.SequenceItem{Name: "number"},
	)

	if err := s.Init(); err != nil {
		return nil, err
	}

	return s.Parse(content)
}
```
## Summary
We have demonstrated how to use the treerack tool to define, test, and implement a parser. We recommend the
following workflow:
1. draft: define a syntax in a .treerack file.
2. verify: use `treerack check` and `treerack show` to validate building blocks incrementally.
3. generate: use `treerack generate` to create embeddable Go code.
**Links:**
- the detailed documentation of the treerack definition language: [./syntax.md](./syntax.md)
- treerack command help: [../cmd/treerack/readme.md](../cmd/treerack/readme.md) or, if the command is installed,
`man treerack`, or `path/to/treerack help`
- the arithmetic calculator example: [./examples/acalc](./examples/acalc).
- additional examples: [./examples](./examples)
Happy parsing!