Linting naked returns with Semgrep

November 15, 2020

Naked returns

The Go language supports a feature called Named return values that allows the developer to explicitly name return values of functions. These can be really useful for adding clarity to functions that return more than one argument.

func Split(s string) (head, tail string) {
  if len(s) < 1 {
    return s, ""
  }

  return s[:1], s[1:]
}

When return values are named in this way they are treated as variables defined at the start of the function. It is valid to use what is called a naked return to return these values implicitly.

func Split(s string) (head, tail string) {
  if len(s) < 1 {
    return 
  }

  return s[:1], s[1:]
}

Although this can be clear in shorter functions, when functions become longer and more complex it can lead to readability issues and potentially bugs. Even the official Golang Code Review Guidelines advises against the use of naked returns.

When working as part of a large team it is almost impossible to prevent use of language features like naked returns without an automated check. But how can we easily write our own linter to achieve this?

Introducing Semgrep

semgrep is an open source static analysis tool that allows you to write and run your own rules very easily. semgrep rules are defined as patterns that match code. For example the pattern func add(a int, b int) int would match:

func add(a int, b int) int {
  return a + b
}

You can also use the wildcard operator ... to match zero or more arguments, so changing the pattern to func add(...) int we can now match both variations in the snippet below:

func add(a int, b int) int {
  return a + b
}

func add(a, b int) int {
  return a + b
}

When looking for naked returns we only want to match functions that have one or more return variables too. This can be achieved using variables - the pattern func add(...) $RET will still match both examples above, but won't match a function with no return value like so:

func add(a int, b int) {
  return fmt.Println(a + b)
}

Using these three features, we can construct our rule to match any function containing a naked return as follows:

func $FUNC(...) $RET {
  ...
  return
  ...
}

There is loads more you can do with the pattern syntax, a full description can be found in the semgrep online documentation.

A Semgrep rule for naked returns

We've now defined a pattern than will match functions with naked returns so let's turn it into something we can use to detect naked returns in our own Go code!

Rules for semgrep are defined in simple yaml files. We can define our new rule, called naked-return, in the file shown below.

naked-return.yaml

rules:
- id: naked-return
  languages: [ go ]
  pattern: |
    func $FUNC(...) $RET {
      ...
      return
      ...
    }
  message: |
    Naked return should be avoided for readability
  severity: WARNING

semgrep provides the handy --test option to allow us to easily test rules during their development. We can write some test cases to define what we do and what we do not want to match. A test case is simply Go code with either //ruleid:{rule-name} or //ok:{rule-name} comment directives next to lines we expect the rule to flag or not flag respectively.

Our test cases look like so:

naked-return.go

package test

//ruleid: naked-return
func foo() (err error) {
    return
}

//ruleid: naked-return
func bar(s int) (err error) {
    return
}

//ok: naked-return
func baz(s int) (err error) {
    return err
}

//ruleid: naked-return
func biz(s int) (err error) {
    if s == 20 {
        return
    }

    return err
}

//ok: naked-return
func boz() {
    return
}

//ruleid: naked-return
func far() (thing string, err error) {
    return
}

type t struct{}

//ruleid: naked-return
func (t) faz() (err error) {

To test this we can run semgrep --test.

╰─ semgrep --test
1 yaml files tested
check id scoring:
================================================================================
(TODO: 0) naked-return.yaml
	✔ naked-return                                         TP: 5 TN: 2 FP: 0 FN: 0
================================================================================
final confusion matrix: TP: 5 TN: 2 FP: 0 FN: 0
================================================================================

It's all well and good running over a few tests but how about if we run it on a real codebase? Let's have a go with github.com/google/go-cmp:

╰─ semgrep --config /path/to/naked-return.yaml
using config from /path/to/naked-return.yaml. 
Visit https://semgrep.dev/registry to see all public rules.
running 1 rules...
cmp/internal/diff/debug_enable.go
severity:warning rule:naked-return: Naked return should be avoided for readability

71:func (dbg *debugger) Begin(nx, ny int, f EqualFunc, p1, p2 *EditScript) EqualFunc {
72:	dbg.Lock()
73:	dbg.fwdPath, dbg.revPath = p1, p2
74:	top := "┌─" + strings.Repeat("──", nx) + "┐\n"
75:	row := "│ " + strings.Repeat("· ", nx) + "│\n"
76:	btm := "└─" + strings.Repeat("──", nx) + "┘\n"
77:	dbg.grid = []byte(top + strings.Repeat(row, ny) + btm)
78:	dbg.lines = strings.Count(dbg.String(), "\n")
79:	fmt.Print(dbg)
80:
81:	// Wrap the EqualFunc so that we can intercept each result.
82:	return func(ix, iy int) (r Result) {
83:		cell := dbg.grid[len(top)+iy*len(row):][len("│ ")+len("· ")*ix:][:len("·")]
84:		for i := range cell {
85:			cell[i] = 0 // Zero out the multiple bytes of UTF-8 middle-dot
86:		}
87:		switch r = f(ix, iy); {
88:		case r.Equal():
89:			cell[0] = '\\'
90:		case r.Similar():
91:			cell[0] = 'X'
92:		default:
93:			cell[0] = '#'
94:		}
95:		return
96:	}
97:}

cmp/internal/diff/diff.go
severity:warning rule:naked-return: Naked return should be avoided for readability

55:func (es EditScript) stats() (s struct{ NI, NX, NY, NM int }) {
56:	for _, e := range es {
57:		switch e {
58:		case Identity:
59:			s.NI++
60:		case UniqueX:
61:			s.NX++
62:		case UniqueY:
63:			s.NY++
64:		case Modified:
65:			s.NM++
66:		default:
67:			panic("invalid edit-type")
68:		}
69:	}
70:	return
71:}
ran 1 rules on 42 files: 2 findings

I would say that's a success - we found two true positives 💥