Walking with filesystems: Go's new fs.FS interface
To understand recursion, you must first understand recursion.
—Traditional
The new io/fs
package introduced in Go 1.16 gives us a
powerful new way of working with filesystems: that is, trees of
files. In fact, the fs.FS
interface can be used with more
than just files: it abstracts the idea of a path-value map.
Introducing io/fs
In principle, any set of objects that can be addressed by a hierarchy of pathnames can be represented by an fs.FS. A tree of disk files is the obvious example, but if we design our program to operate on an fs.FS value, it can also process ZIP and tar archives, Go modules, arbitrary JSON, YAML, or CUE data, or even Web resources addressed by URLs.
Walk with me, then, as we take a tour of the new io/fs
package, the fs.FS
interface in particular, and the power
of the filesystem abstraction.
A simple file counter
Suppose we have been tasked with writing a tool that will count the number of Go source files contained in some tree (for example, a project repository).
Opening a tree of files, addressed by some path, is straightforward. We can do this by calling os.DirFS:
fsys := os.DirFS("testdata/tree")
Walking the tree
Now, how do we walk this tree? In other words, how do we recursively traverse each folder within the tree, and visit every file, no matter how deeply nested?
The fs.WalkDir
function does exactly this. It takes a
filesystem and some starting path within it, and recursively
walks the tree, visiting every file and folder (in lexical
order; that is, alphabetically).
For each one it finds, it calls some function that you provide, passing it the pathname. For example:
var count int
fsys := os.DirFS("testdata/tree")
fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
	if filepath.Ext(p) == ".go" {
		count++
	}
	return nil
})
fmt.Println(count)
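The callback above ignores its err and d arguments, which is fine for a proof of concept. As a rough sketch of a more defensive variant, the version below returns any error that fs.WalkDir reports and skips directories, so that a folder whose name happens to end in ".go" is not counted. The names countGoFiles and sampleTree are hypothetical, the tree contents are made up, and the sketch uses path.Ext (fs.FS paths are always slash-separated) and an in-memory fstest.MapFS so that it runs without any files on disk.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// countGoFiles is a hypothetical helper: it walks fsys and counts the
// non-directory entries whose names end in ".go". It stops at the first
// error reported by the walk.
func countGoFiles(fsys fs.FS) (int, error) {
	count := 0
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			// WalkDir could not read p; returning the error aborts the walk.
			return err
		}
		if !d.IsDir() && path.Ext(p) == ".go" {
			count++
		}
		return nil
	})
	return count, err
}

// sampleTree returns an illustrative in-memory filesystem (made-up contents).
func sampleTree() fs.FS {
	return fstest.MapFS{
		"file.go":                {},
		"notes.txt":              {},
		"subfolder/subfolder.go": {},
		"vendor.go/readme.md":    {}, // lives in a directory named "vendor.go"
	}
}

func main() {
	n, err := countGoFiles(sampleTree())
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(n)
}
```

Whether to abort on the first error or to log it and carry on is a design choice; returning nil from the callback instead of err would keep the walk going.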
A file-finding tree-walker
It looks like using a filesystem and fs.WalkDir
will
work for our file-finding program, so let’s see how to turn it into a
full-fledged, well-tested Go package.
To do that, let’s expand our ambitions a bit. Counting files can be useful, but it seems a shame to go to all the trouble of finding the files, only to throw away everything but the number of files we found.
Suppose users wanted to get a list of those files; well, it’s bad luck for them, if all they have is the value of count. They’d have to walk the tree all over again.
On the other hand, if we have the list of files, it’s very
easy to count them: just use the built-in len
function.
Finding files is the more general problem, so let’s try to solve that in
a useful way.
As usual, let’s first think about the main
function we’d
like to write, with absolutely minimal paperwork. Something like this
would be nice:
func main() {
	paths := findgo.Files(os.Args[1])
	for _, p := range paths {
		fmt.Println(p)
	}
}
It wouldn’t actually be that simple, in practice, since we’d need to
check that os.Args[1]
exists, report errors, and so on. But
the CLI isn’t the point of this example, so let’s take it as read for
now, and see how findgo.Files
would work.
It would need to take the pathname of some folder as its argument, and it would walk the tree rooted at that folder finding Go files, in the way that we’ve already done as a proof of concept. Let’s write a test for that.
func TestFilesCorrectlyListsFilesInTree(t *testing.T) {
	t.Parallel()
	want := []string{
		"file.go",
		"subfolder/subfolder.go",
		"subfolder2/another.go",
		"subfolder2/file.go",
	}
	got := findgo.Files("testdata/tree")
	if !cmp.Equal(want, got) {
		t.Error(cmp.Diff(want, got))
	}
}
We’ll copy our example tree of files into testdata/tree
so the test has something to work on. So the test is saying that if we
call Files
with this path, in which there are four Go
files, it should return the expected slice of strings. Over to you to
make this work.
GOAL: Implement Files.
Well, we’ve already more or less done it, haven’t we? We can take the code from our main.go proof of concept and move it straight into the findgo package. All we need to change is that, instead of incrementing a counter every time we find a file, we append its path to a slice.
func Files(path string) (paths []string) {
	fsys := os.DirFS(path)
	fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if filepath.Ext(p) == ".go" {
			paths = append(paths, p)
		}
		return nil
	})
	return paths
}
Excellent! The program works perfectly on our little test tree. But we can imagine that a program with more complicated logic might run into problems, especially in large and complicated filesystems. How could we test cases like that?
The fstest.MapFS
type is a neat way to test code that
traverses filesystems, without needing any disk access. Instead, it’s an
fs.FS
that lives entirely in memory, based on a Go map.
Let’s see how to rewrite our test for Files
using a
MapFS
instead of regular disk files.
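Before rewriting the test, here is a small sketch of what a MapFS looks like when its files have contents. Each map key is a slash-separated path and each value is a *fstest.MapFile whose Data field holds the file's bytes. The paths, contents, and the helper names memFS and readGreeting are all invented for illustration; fs.ReadFile and fstest.TestFS are standard library functions.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// memFS returns an illustrative in-memory filesystem; the paths and
// contents are made up for this sketch.
func memFS() fstest.MapFS {
	return fstest.MapFS{
		"docs/hello.txt": &fstest.MapFile{Data: []byte("hello, filesystem")},
		"main.go":        &fstest.MapFile{Data: []byte("package main\n")},
	}
}

// readGreeting is a hypothetical helper that reads one file from any fs.FS.
func readGreeting(fsys fs.FS) (string, error) {
	data, err := fs.ReadFile(fsys, "docs/hello.txt")
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	fsys := memFS()
	// fstest.TestFS checks that the filesystem behaves consistently and
	// contains at least the files named here.
	if err := fstest.TestFS(fsys, "docs/hello.txt", "main.go"); err != nil {
		fmt.Println("invalid filesystem:", err)
		return
	}
	s, err := readGreeting(fsys)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(s)
}
```

Note that the "docs" directory is never listed explicitly: MapFS synthesizes parent directories from the file paths.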
func TestFilesCorrectlyListsFilesInMapFS(t *testing.T) {
	t.Parallel()
	fsys := fstest.MapFS{
		"file.go":                {},
		"subfolder/subfolder.go": {},
		"subfolder2/another.go":  {},
		"subfolder2/file.go":     {},
	}
	want := []string{
		"file.go",
		"subfolder/subfolder.go",
		"subfolder2/another.go",
		"subfolder2/file.go",
	}
	got := findgo.Files(fsys)
	if !cmp.Equal(want, got) {
		t.Error(cmp.Diff(want, got))
	}
}
We’ll need to update Files to take an fs.FS as its parameter instead of a pathname. And since we’re receiving the filesystem now, we needn’t open it ourselves using os.DirFS, so we can remove that call.
Here’s the modified Files
function:
func Files(fsys fs.FS) (paths []string) {
	fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if filepath.Ext(p) == ".go" {
			paths = append(paths, p)
		}
		return nil
	})
	return paths
}
Using fs.FS in APIs
There’s nothing stopping you from writing your own fs.FS
implementation, and it’s quite straightforward. Indeed, whenever you’re
writing Go code to deal with data that could in principle be addressed
as a path-value tree, you might like to consider accepting an
fs.FS
as input, or making your data type satisfy
fs.FS
itself. It all helps to make your libraries more
flexible, useful, powerful, and friendly.
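To give a feel for how little the interface demands, here is one possible sketch of a custom fs.FS. The only required method is Open. Rather than building files from scratch (which would also mean implementing fs.File), this hypothetical recordingFS type wraps another filesystem, delegates to it, and remembers every name that was opened. The type name, the demo function, and the sample file are assumptions made for this example.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// recordingFS is a hypothetical wrapper type. It satisfies fs.FS by
// implementing Open, delegating to an inner filesystem and recording
// every name that was requested.
type recordingFS struct {
	inner  fs.FS
	opened []string
}

// Compile-time check that *recordingFS satisfies fs.FS.
var _ fs.FS = (*recordingFS)(nil)

// Open implements fs.FS. Implementations are expected to reject invalid
// names with a *fs.PathError wrapping fs.ErrInvalid.
func (r *recordingFS) Open(name string) (fs.File, error) {
	if !fs.ValidPath(name) {
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrInvalid}
	}
	r.opened = append(r.opened, name)
	return r.inner.Open(name)
}

// demo reads one file through the wrapper and returns the recorded names.
func demo() ([]string, error) {
	rec := &recordingFS{inner: fstest.MapFS{
		"a/b.go": &fstest.MapFile{Data: []byte("package a\n")},
	}}
	if _, err := fs.ReadFile(rec, "a/b.go"); err != nil {
		return nil, err
	}
	return rec.opened, nil
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(names)
}
```

Because helpers such as fs.ReadFile and fs.WalkDir fall back to Open when a filesystem offers nothing more specialized, a wrapper like this can be passed to them unchanged.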
We can see the effect of this with our file-finder example. Initially, because it took a disk pathname, the only thing we could use it to search was a disk-based filesystem. Now that we’ve updated it to accept fs.FS, it can operate on anything satisfying that interface. Our test can pass it a MapFS and it works just fine.
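As a quick illustration of that flexibility, the sketch below calls one function with two different filesystems: a temporary directory on disk, opened with os.DirFS, and an in-memory MapFS. The function goFiles mirrors Files under a local, hypothetical name so that the example is self-contained, and the file names disk.go and memory.go are made up.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"testing/fstest"
)

// goFiles mirrors the article's Files function under a local name.
func goFiles(fsys fs.FS) (paths []string) {
	fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if filepath.Ext(p) == ".go" {
			paths = append(paths, p)
		}
		return nil
	})
	return paths
}

// fromDisk writes one empty Go file into a temporary directory and
// searches that directory through os.DirFS.
func fromDisk() ([]string, error) {
	dir, err := os.MkdirTemp("", "findgo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "disk.go"), nil, 0o644); err != nil {
		return nil, err
	}
	return goFiles(os.DirFS(dir)), nil
}

// fromMemory searches an in-memory filesystem with the same function.
func fromMemory() []string {
	return goFiles(fstest.MapFS{"memory.go": {}})
}

func main() {
	disk, err := fromDisk()
	if err != nil {
		fmt.Println("disk example failed:", err)
		return
	}
	fmt.Println(disk)
	fmt.Println(fromMemory())
}
```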
So what else would work? We mentioned earlier some examples of other things that satisfy fs.FS. Just for fun, let’s try Files with a filesystem derived from a ZIP archive: after all, it should work, shouldn’t it?
First, let’s zip up our test tree
folder and its
contents using the zip
command. If you don’t have that
command, you can use anything that creates standard ZIP files, including
the macOS Finder’s “Compress” action.
cd testdata
zip -r files.zip tree/
adding: tree/ (stored 0%)
adding: tree/subfolder/ (stored 0%)
adding: tree/subfolder/subfolder.go (stored 0%)
adding: tree/subfolder2/ (stored 0%)
adding: tree/subfolder2/another.go (stored 0%)
adding: tree/subfolder2/file.go (stored 0%)
adding: tree/file.go (stored 0%)
All these files are empty, which is why zipping them doesn’t seem to save much space, but that’s not the point: we just want a ZIP file to play with. Now, how do we open it as a filesystem?
Helpfully, Go provides facilities for reading ZIP files in the standard library package archive/zip, so here’s our test:
func TestFilesCorrectlyListsFilesInZIPArchive(t *testing.T) {
	t.Parallel()
	fsys, err := zip.OpenReader("testdata/files.zip")
	if err != nil {
		t.Fatal(err)
	}
	// Release the underlying archive file when the test finishes.
	defer fsys.Close()
	want := []string{
		"tree/file.go",
		"tree/subfolder/subfolder.go",
		"tree/subfolder2/another.go",
		"tree/subfolder2/file.go",
	}
	got := findgo.Files(fsys)
	if !cmp.Equal(want, got) {
		t.Error(cmp.Diff(want, got))
	}
}
We call zip.OpenReader with the pathname of our test ZIP file, and the result is a value that satisfies fs.FS, so we can pass it directly to Files. And, of course, it gives us the correct answer:
PASS
Reassuring!
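Notice that the paths in the ZIP test carry a "tree/" prefix, because the archive contains the tree folder itself. If we wanted the same relative paths as in the earlier tests, one option (not the only one) is fs.Sub, which returns a filesystem rooted at a subdirectory of another. The sketch below builds a small archive in memory so that it runs without testdata/files.zip; the helper names buildZip and goFilesUnder and the archive contents are assumptions made for this example.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io/fs"
	"path"
)

// buildZip creates a small ZIP archive in memory (illustrative contents)
// and returns a reader for it. A *zip.Reader satisfies fs.FS.
func buildZip() (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	names := []string{"tree/file.go", "tree/subfolder/subfolder.go", "tree/README.md"}
	for _, name := range names {
		// Create adds an empty file entry with the given name.
		if _, err := w.Create(name); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

// goFilesUnder lists the ".go" files below dir, with paths relative to dir.
func goFilesUnder(fsys fs.FS, dir string) ([]string, error) {
	sub, err := fs.Sub(fsys, dir)
	if err != nil {
		return nil, err
	}
	var paths []string
	err = fs.WalkDir(sub, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && path.Ext(p) == ".go" {
			paths = append(paths, p)
		}
		return nil
	})
	return paths, err
}

func main() {
	r, err := buildZip()
	if err != nil {
		fmt.Println("building archive failed:", err)
		return
	}
	paths, err := goFilesUnder(r, "tree")
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```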