Linting naked returns with Semgrep
Naked returns
The Go language supports a feature called Named return values that allows the developer to explicitly name return values of functions. These can be really useful for adding clarity to functions that return more than one argument.
func Split(s string) (head, tail string) {
if len(s) < 1 {
return s, ""
}
return s[:1], s[1:]
}
When return values are named in this way they are treated as variables defined at the start of the function. It is valid to use what is called a naked return to return these values implicitly.
func Split(s string) (head, tail string) {
if len(s) < 1 {
return
}
return s[:1], s[1:]
}
Although this can be clear in shorter functions, when functions become longer and more complex it can lead to readability issues and potentially bugs. Even the official Golang Code Review Guidelines advises against the use of naked returns.
When working as part of a large team it is almost impossible to prevent use of language features like naked returns without an automated check. But how can we easily write our own linter to achieve this?
Introducing Semgrep
semgrep
is an open source static analysis tool that
allows you to write and run your own rules very easily. semgrep
rules are
defined as patterns that match code. For example the pattern
func add(a int, b int) int
would match:
func add(a int, b int) int {
return a + b
}
You can also use the wildcard operator ...
to match zero or more arguments, so
changing the pattern to func add(...) int
we can now match both variations in
the snippet below:
func add(a int, b int) int {
return a + b
}
func add(a, b int) int {
return a + b
}
When looking for naked returns we only want to match functions that have one or
more return variables too. This can be achieved using variables - the pattern
func add(...) $RET
will still match both examples above, but won't match a
function with no return value like so:
func add(a int, b int) {
return fmt.Println(a + b)
}
Using these three features, we can construct our rule to match any function containing a naked return as follows:
func $FUNC(...) $RET {
...
return
...
}
There is loads more you can do with the pattern syntax, a full description can
be found in the semgrep
online documentation.
A Semgrep rule for naked returns
We've now defined a pattern than will match functions with naked returns so let's turn it into something we can use to detect naked returns in our own Go code!
Rules for semgrep
are defined in simple yaml
files. We can define our new
rule, called naked-return
, in the file shown below.
rules:
- id: naked-return
languages: [ go ]
pattern: |
func $FUNC(...) $RET {
...
return
...
}
message: |
Naked return should be avoided for readability
severity: WARNING
semgrep
provides the handy --test
option to allow us to easily test rules
during their development. We can write some test cases to define what we do and
what we do not want to match. A test case is simply Go code with either
//ruleid:{rule-name}
or //ok:{rule-name}
comment directives next to lines we
expect the rule to flag or not flag respectively.
Our test cases look like so:
package test
//ruleid: naked-return
func foo() (err error) {
return
}
//ruleid: naked-return
func bar(s int) (err error) {
return
}
//ok: naked-return
func baz(s int) (err error) {
return err
}
//ruleid: naked-return
func biz(s int) (err error) {
if s == 20 {
return
}
return err
}
//ok: naked-return
func boz() {
return
}
//ruleid: naked-return
func far() (thing string, err error) {
return
}
type t struct{}
//ruleid: naked-return
func (t) faz() (err error) {
To test this we can run semgrep --test
.
╰─ semgrep --test
1 yaml files tested
check id scoring:
================================================================================
(TODO: 0) naked-return.yaml
✔ naked-return TP: 5 TN: 2 FP: 0 FN: 0
================================================================================
final confusion matrix: TP: 5 TN: 2 FP: 0 FN: 0
================================================================================
It's all well and good running over a few tests but how about if we run it on a
real codebase? Let's have a go with github.com/google/go-cmp
:
╰─ semgrep --config /path/to/naked-return.yaml
using config from /path/to/naked-return.yaml.
Visit https://semgrep.dev/registry to see all public rules.
running 1 rules...
cmp/internal/diff/debug_enable.go
severity:warning rule:naked-return: Naked return should be avoided for readability
71:func (dbg *debugger) Begin(nx, ny int, f EqualFunc, p1, p2 *EditScript) EqualFunc {
72: dbg.Lock()
73: dbg.fwdPath, dbg.revPath = p1, p2
74: top := "┌─" + strings.Repeat("──", nx) + "┐\n"
75: row := "│ " + strings.Repeat("· ", nx) + "│\n"
76: btm := "└─" + strings.Repeat("──", nx) + "┘\n"
77: dbg.grid = []byte(top + strings.Repeat(row, ny) + btm)
78: dbg.lines = strings.Count(dbg.String(), "\n")
79: fmt.Print(dbg)
80:
81: // Wrap the EqualFunc so that we can intercept each result.
82: return func(ix, iy int) (r Result) {
83: cell := dbg.grid[len(top)+iy*len(row):][len("│ ")+len("· ")*ix:][:len("·")]
84: for i := range cell {
85: cell[i] = 0 // Zero out the multiple bytes of UTF-8 middle-dot
86: }
87: switch r = f(ix, iy); {
88: case r.Equal():
89: cell[0] = '\\'
90: case r.Similar():
91: cell[0] = 'X'
92: default:
93: cell[0] = '#'
94: }
95: return
96: }
97:}
cmp/internal/diff/diff.go
severity:warning rule:naked-return: Naked return should be avoided for readability
55:func (es EditScript) stats() (s struct{ NI, NX, NY, NM int }) {
56: for _, e := range es {
57: switch e {
58: case Identity:
59: s.NI++
60: case UniqueX:
61: s.NX++
62: case UniqueY:
63: s.NY++
64: case Modified:
65: s.NM++
66: default:
67: panic("invalid edit-type")
68: }
69: }
70: return
71:}
ran 1 rules on 42 files: 2 findings
I would say that's a success - we found two true positives 💥