Abstract Syntax Trees (ASTs) in Go

An Abstract Syntax Tree (AST) is a tree representation of the structure of a piece of source code, used mainly in compilers. ASTs also let us traverse existing code and use the information we collect to generate new code, which is what we are going to do in this post! We are going to build the structor, a command line utility that takes a domain package containing struct definitions and automatically generates getter and setter functions for the fields of those structs.

Basic Go AST structure of a single file

How to transform code to an AST

Go already provides this functionality in three standard library packages: go/ast, go/parser and go/token. In the example below I create a new FileSet (it records position information and can be shared when parsing multiple files), parse the contents of a file and print a textual representation of the AST for that file.

fset := token.NewFileSet()
f, err := parser.ParseFile(fset, fn, src, 0)
if err != nil {
	return nil, err
}
ast.Print(fset, f)

Parsing the code of a file can be done either by passing only the location of the file (with src set to nil) or by first reading the contents of the file into a src variable, like I am doing here. Keeping the contents around is useful if, for example, you want to use the AST position information to navigate to a specific line/column in the original file.
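To make the snippet above easier to try out, here is a minimal, self-contained sketch that parses source held in memory. The helper name parseSource and the inline sample source are my own assumptions for illustration, not code from the repository.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// parseSource is a hypothetical helper. Because src is non-nil, the
// filename is only used when recording position information.
func parseSource(filename, src string) (*token.FileSet, *ast.File, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, nil, err
	}
	return fset, f, nil
}

func main() {
	src := "package domain\n\ntype Person struct {\n\tname string\n}\n"
	fset, f, err := parseSource("person.go", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("package:", f.Name.Name)
	// The FileSet resolves a token.Pos back to file:line:column.
	fmt.Println("first declaration at:", fset.Position(f.Decls[0].Pos()))
}
```

Replacing the last two Println calls with ast.Print(fset, f) produces a dump in the style shown in the next section.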

🖥️ The full code is on my GitHub repo: https://github.com/efrag/blog-posts

What does an AST look like?

As we said, an AST is a tree representation of our code. So let’s take a simplified version of the domain/person.go file from the repository and see what printing its AST produces.

*ast.File {
.  Package: src/github.com/efrag/blog-posts/abstract_syntax_trees/domain/person.go:1:1
.  Name: *ast.Ident {
.  .  NamePos: src/github.com/efrag/blog-posts/abstract_syntax_trees/domain/person.go:1:9
.  .  Name: "domain"
.  }
.  Decls: []ast.Decl (len = 1) {
.  .  0: *ast.GenDecl {
.  .  .  TokPos: src/github.com/efrag/blog-posts/abstract_syntax_trees/domain/person.go:3:1
.  .  .  Tok: type
.  .  .  Specs: []ast.Spec (len = 1) {
.  .  .  .  0: *ast.TypeSpec {
.  .  .  .  .  Name: *ast.Ident {...}
.  .  .  .  .  Type: *ast.StructType {
.  .  .  .  .  .  Struct: src/github.com/efrag/blog-posts/abstract_syntax_trees/domain/person.go:3:13
.  .  .  .  .  .  Fields: *ast.FieldList {
.  .  .  .  .  .  .  List: []*ast.Field (len = 1) {...}
.  .  .  .  .  .  }
.  .  .  .  .  }
.  .  .  .  }
.  .  .  }
.  .  }
.  }
}

The file is represented by an ast.File struct with the following fields: Package, which is the position of the package keyword; Name, which is itself an ast.Ident identifier; and Decls, which is a list of ast.Decl. An ast.Decl represents a top-level declaration: imports, constants, variables, types (including structs) and functions. Pretty impressive, right?

Building the structor

The structor is a command line utility that accepts a directory of a domain package as an input and generates getters and setters for the fields of every structure it identifies from within that directory.

What does the end result look like?

Let’s take the domain package from the sample code. It contains two Go files, person.go and address.go, each with a single struct in it. Running the structor against this directory creates two additional files suffixed with _accessors.go (one for each structure that we read).

.
├── address_accessors.go (structor generated file)
├── address.go
├── person_accessors.go (structor generated file)
└── person.go

Looking at the person.go file we have:

package domain

import (
	"time"

	"github.com/efrag/blog-posts/abstract_syntax_trees/utils"
)

type Person struct {
	name        string
	dateOfBirth time.Time
	phone       utils.Phone
}

and then looking at the generated person_accessors.go file we have:

// DO NOT EDIT: file has been automatically generated
package domain

import "time"
import "github.com/efrag/blog-posts/abstract_syntax_trees/utils"

func (t *Person) GetName() string {
	return t.name
}

func (t *Person) SetName(f string) {
	t.name = f
}

func (t *Person) GetDateOfBirth() time.Time {
	return t.dateOfBirth
}

func (t *Person) SetDateOfBirth(f time.Time) {
	t.dateOfBirth = f
}

func (t *Person) GetPhone() utils.Phone {
	return t.phone
}

func (t *Person) SetPhone(f utils.Phone) {
	t.phone = f
}

The reason I chose to put the getters and setters in a separate file is simply that the code in the _accessors.go file is auto-generated and will be overwritten every time we run the structor.

For example, if we add a new field to the Person struct we would want to re-run the structor to update the person_accessors.go file and then commit the result along with our code changes. In addition, this allows us to put additional functions (that are not boilerplate code) in the person.go file without worrying about the auto-generated code interfering with our hand-written functions.

How are we going to get there?

In order to generate the _accessors.go files we need to know the following information:

  • the name of the package we are generating
  • the list of imports that are required for the fields that we are using
  • the name of the struct
  • the names of the fields
  • the types of the fields

The good news is we can get all that information from our AST!
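One way to hold the five pieces of information listed above is a pair of small structs. The sketch below reuses the names that appear in the parsing code later in this post (parsedFile, parsedField, newParsedFile, SName, FName, CName, FType, Imports, StructName); the exact struct layout and the Package field name are my assumptions, so the repository may differ.

```go
package main

import "fmt"

// parsedField holds what one getter/setter pair needs.
type parsedField struct {
	SName string // struct name, e.g. "Person"
	FName string // field name as written, e.g. "dateOfBirth"
	CName string // capitalised field name used in the method name
	FType string // field type as source text, e.g. "time.Time"
}

// parsedFile collects everything needed to render one _accessors.go file.
type parsedFile struct {
	Package    string   // assumed field name for the package name
	Imports    []string // import paths, still quoted as in the source
	StructName string
	Fields     []parsedField
}

// newParsedFile starts an empty record for the given package name.
func newParsedFile(pkg string) *parsedFile {
	return &parsedFile{Package: pkg}
}

func main() {
	pf := newParsedFile("domain")
	pf.StructName = "Person"
	pf.Imports = append(pf.Imports, `"time"`)
	pf.Fields = append(pf.Fields, parsedField{
		SName: "Person", FName: "name", CName: "Name", FType: "string",
	})
	fmt.Printf("%+v\n", *pf)
}
```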

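Once we have parsed the information from the AST into our custom structs we can pass the data into our templates and eventually write the code out to the generated files. For this we are going to use one more of Go’s packages, html/template. (text/template offers the same API without HTML escaping, which matters once a field type contains a character such as <, as in chan<- int.)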

Parsing the AST

So far we have seen how to read and parse a file, and what a printed version of an AST looks like. To extract the information that we need, though, we have to step through the AST and identify the details mentioned above. Structs are especially interesting because each one is represented by a sub-tree: they are composite types with their own fields and field types.

We can use the ast.Inspect function provided by the go/ast package to step through the tree and extract the information that we need.

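pFile := newParsedFile(f.Name.Name)
// ast.Inspect walks the tree depth-first; returning true keeps descending.
ast.Inspect(f, func(n ast.Node) bool {
    switch t := n.(type) {
    case *ast.TypeSpec:
        // Only type declarations that define a struct are of interest.
        e, ok := t.Type.(*ast.StructType)
        if ok {
            pFile.StructName = t.Name.Name
            for _, f := range e.Fields.List {
                // Embedded fields have no names; skip them to avoid a panic.
                if len(f.Names) == 0 {
                    continue
                }
                pFile.Fields = append(pFile.Fields, parsedField{
                    SName: t.Name.Name,
                    FName: f.Names[0].Name,
                    CName: strings.Title(f.Names[0].Name),
                    // This file is the first one added to fset, so its base
                    // is 1 and Pos()-1 is the byte offset into src.
                    FType: string(src[f.Type.Pos()-1 : f.Type.End()-1]),
                })
            }
        }
    case *ast.ImportSpec:
        // Path.Value keeps the surrounding quotes, e.g. "\"time\"".
        pFile.Imports = append(pFile.Imports, t.Path.Value)
    }
    return true
})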

From all the declarations that we can find in the AST of a file, we extract only the structs and the imports.

For each specification that represents a type in Go (ast.TypeSpec) we focus only on the ones that declare structures (ast.StructType). For each structure we then loop through the list of fields, extracting the name and the type.

The easiest way to find the type of a field is to use the contents of the file we read together with the position information for the field type from the AST. The position information (f.Type.Pos() and f.Type.End()) gives us the start and end position of the field type in the original file.
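The position-based extraction can also be written in terms of byte offsets reported by the FileSet, which keeps working when several files share one FileSet. The following is a sketch under my own assumptions: fieldTypes and exportName are hypothetical helpers, and exportName is just one possible stand-in for strings.Title, which is deprecated in recent Go releases.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"unicode"
	"unicode/utf8"
)

// exportName upper-cases the first rune of a field name.
func exportName(name string) string {
	r, size := utf8.DecodeRuneInString(name)
	if r == utf8.RuneError {
		return name
	}
	return string(unicode.ToUpper(r)) + name[size:]
}

// fieldTypes maps each named struct field in src to its type as text.
func fieldTypes(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	ast.Inspect(f, func(n ast.Node) bool {
		st, ok := n.(*ast.StructType)
		if !ok {
			return true
		}
		for _, field := range st.Fields.List {
			// Offsets are relative to the start of this file.
			start := fset.Position(field.Type.Pos()).Offset
			end := fset.Position(field.Type.End()).Offset
			// Names is empty for embedded fields and has several
			// entries for declarations such as "a, b int".
			for _, name := range field.Names {
				out[name.Name] = src[start:end]
			}
		}
		return true
	})
	return out, nil
}

func main() {
	src := "package domain\n\nimport \"time\"\n\ntype Person struct {\n\tname string\n\tdateOfBirth time.Time\n}\n"
	fields, err := fieldTypes(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for name, typ := range fields {
		fmt.Printf("Get%s() %s\n", exportName(name), typ)
	}
}
```

If you would rather not slice the source at all, types.ExprString from go/types renders an ast.Expr back to text and could replace the offset arithmetic.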

Function code templates

We also need to define the templates that are going to be used to generate the functions. The templates accept the data they need in the form of a struct, which is why, when parsing the AST, we store the information in the custom parsedField struct that our templates can recognize.

// getter template
template.Must(template.New("getter").Parse(`
func (t *{{ .SName }}) Get{{ .CName }}() {{ .FType }} {
	return t.{{ .FName }}
}
`))

// setter template
template.Must(template.New("setter").Parse(`
func (t *{{ .SName }}) Set{{ .CName }}(f {{ .FType }}) {
	t.{{ .FName }} = f
}
`))

Running the structor

Finally, once everything is hooked together in our code we can run the structor against a domain directory and see the generated code.

$ go run github.com/efrag/blog-posts/structor \
-domain=./src/github.com/efrag/blog-posts/abstract_syntax_trees/domain
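For orientation, the command line side of such a tool could look roughly like the sketch below. Only the -domain flag and the _accessors.go suffix come from this post; the helpers accessorsPath and isInput, and the decision to skip test files, are my assumptions, and the real boilerplate lives in the repository.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const suffix = "_accessors.go"

// accessorsPath maps ".../person.go" to ".../person_accessors.go".
func accessorsPath(src string) string {
	return strings.TrimSuffix(src, ".go") + suffix
}

// isInput reports whether a file name should be parsed: Go files that
// are neither previously generated output nor tests.
func isInput(name string) bool {
	return strings.HasSuffix(name, ".go") &&
		!strings.HasSuffix(name, suffix) &&
		!strings.HasSuffix(name, "_test.go")
}

func main() {
	domain := flag.String("domain", ".", "directory of the domain package")
	flag.Parse()

	entries, err := os.ReadDir(*domain)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, e := range entries {
		if e.IsDir() || !isInput(e.Name()) {
			continue
		}
		in := filepath.Join(*domain, e.Name())
		// A real run would parse `in` and write the rendered code here.
		fmt.Printf("%s -> %s\n", in, accessorsPath(in))
	}
}
```

Skipping files that already end in _accessors.go keeps a second run from generating accessors for the generated files themselves.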

So, there you have it 🎆! A command line utility that generates getters and setters for our structs, built using Abstract Syntax Trees and templates!

Tip: The best way to understand the code above is to look at the repo on GitHub, as there is some boilerplate that I haven’t mentioned here.
