# treerack
**A parser generator for Go.**
Treerack defines and generates recursive descent parsers for arbitrary syntaxes, processing input content into
its Abstract Syntax Tree (AST) representation. It utilizes a custom syntax definition format derived from EBNF
(Extended Backus-Naur Form), allowing for clear and concise grammar descriptions.
## Examples
- **JSON**: [docs/examples/json.treerack](docs/examples/json.treerack)
- **Scheme**: [docs/examples/scheme.treerack](docs/examples/scheme.treerack)
- **Treerack (self-definition)**: [syntax.treerack](syntax.treerack)
## Overview
Treerack operates without a separate lexing phase, parsing character streams directly to produce an AST. The
syntax language supports recursive references, enabling the definition of context-free grammars.
We can define syntaxes during development and use the provided tool to generate static Go code, which is then
built into the application. Alternatively, the library supports loading syntaxes dynamically at runtime.
The parser engine handles recursive references and left recursion internally. This makes it easier to write
intuitive grammar definitions and allows context-free languages to be defined without complex workarounds.
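
To illustrate what parsing without a separate lexing phase means in general, here is a minimal hand-written sketch in Go. It is not Treerack's implementation or API: the `Node` type, the `Parse` function and the tiny S-expression grammar are all hypothetical, and the sketch does not attempt left-recursion handling. It only shows a recursive descent parser that reads characters directly and builds a tree.

```go
package main

import (
	"fmt"
	"unicode"
)

// Node is a hypothetical AST node; it is not Treerack's node type.
type Node struct {
	Name     string
	Text     string
	Children []*Node
}

// parser reads runes directly; there is no token stream in between.
type parser struct {
	input []rune
	pos   int
}

// skipSpace advances past whitespace, a job a lexer would normally do.
func (p *parser) skipSpace() {
	for p.pos < len(p.input) && unicode.IsSpace(p.input[p.pos]) {
		p.pos++
	}
}

// parseExpr handles the rule: expr = symbol | "(" expr* ")".
func (p *parser) parseExpr() (*Node, error) {
	p.skipSpace()
	if p.pos >= len(p.input) {
		return nil, fmt.Errorf("unexpected end of input at %d", p.pos)
	}
	if p.input[p.pos] == '(' {
		return p.parseList()
	}
	return p.parseSymbol()
}

// parseList parses a parenthesized list, recursing into parseExpr per item.
func (p *parser) parseList() (*Node, error) {
	start := p.pos
	p.pos++ // consume "("
	list := &Node{Name: "list"}
	for {
		p.skipSpace()
		if p.pos >= len(p.input) {
			return nil, fmt.Errorf("unclosed list starting at %d", start)
		}
		if p.input[p.pos] == ')' {
			p.pos++ // consume ")"
			list.Text = string(p.input[start:p.pos])
			return list, nil
		}
		child, err := p.parseExpr()
		if err != nil {
			return nil, err
		}
		list.Children = append(list.Children, child)
	}
}

// parseSymbol consumes runes up to whitespace or a parenthesis.
func (p *parser) parseSymbol() (*Node, error) {
	start := p.pos
	for p.pos < len(p.input) {
		r := p.input[p.pos]
		if unicode.IsSpace(r) || r == '(' || r == ')' {
			break
		}
		p.pos++
	}
	if p.pos == start {
		return nil, fmt.Errorf("expected symbol at %d", start)
	}
	return &Node{Name: "symbol", Text: string(p.input[start:p.pos])}, nil
}

// Parse parses a document that contains exactly one expression.
func Parse(src string) (*Node, error) {
	p := &parser{input: []rune(src)}
	n, err := p.parseExpr()
	if err != nil {
		return nil, err
	}
	p.skipSpace()
	if p.pos != len(p.input) {
		return nil, fmt.Errorf("unexpected input at %d", p.pos)
	}
	return n, nil
}

func main() {
	root, err := Parse("(define (id x) x)")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(root.Name, len(root.Children)) // prints "list 3"
}
```

A generated or dynamically loaded Treerack parser replaces hand-written functions like these with a syntax definition. See the manual for the actual usage.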
## Installation
From source (recommended):
```
git clone https://code.squareroundforest.org/arpio/treerack
cd treerack
make install
```
Alternatively ("best effort" basis):
```
go install code.squareroundforest.org/arpio/treerack/cmd/treerack@latest
```
## Documentation
- [Manual](docs/manual.md): a guide to the main use cases supported by Treerack.
- [Syntax Definition](docs/syntax.md): detailed reference for the Treerack definition language.
- [Library Documentation](https://godocs.io/code.squareroundforest.org/arpio/treerack): GoDoc reference for the
runtime library.
## Developer Notes
We use a Makefile to manage the build and verification lifecycle.
Important: generating the parser for the Treerack syntax itself (bootstrapping) requires multiple phases.
Consequently, running the standard `go build` or `go test` commands alone may miss subtle consistency problems.
The definitive way to verify changes is via the Makefile:
```
make check
```
## Limitations
- Lexer & UTF-8: Treerack does not require a lexer, which simplifies the architecture. However, because the parser
reads characters directly, the input must be UTF-8 encoded. Support for custom tokenizers has been considered as a
potential future improvement.
- Whitespace-delimited languages: because of the recursive descent design and the lack of a dedicated lexer state,
whitespace-delimited syntaxes (such as Python-style indentation) can be difficult to define with the current
feature set.
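
Because the input is expected to be UTF-8, a caller may want to validate untrusted input before handing it to a parser. The following is a minimal sketch using only the Go standard library. `CheckUTF8` is a hypothetical helper, not part of Treerack, and how Treerack itself reacts to invalid input is not covered here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// CheckUTF8 returns the byte offset of the first invalid UTF-8 sequence in
// input, or -1 if the whole input is valid. Hypothetical helper, not a
// Treerack function.
func CheckUTF8(input []byte) int {
	offset := 0
	for offset < len(input) {
		r, size := utf8.DecodeRune(input[offset:])
		// DecodeRune signals an invalid encoding with (RuneError, 1);
		// a correctly encoded U+FFFD has size 3 and is accepted.
		if r == utf8.RuneError && size == 1 {
			return offset
		}
		offset += size
	}
	return -1
}

func main() {
	valid := []byte("héllo")
	invalid := []byte{'o', 'k', 0xff, '!'}
	fmt.Println(CheckUTF8(valid))   // prints -1
	fmt.Println(CheckUTF8(invalid)) // prints 2
}
```

When only a yes/no answer is needed, `utf8.Valid` from the same package is sufficient. The offset is useful for error messages.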