A generic Set type in Go
You wanted generics. You got them. Now it’s time to pay the price.
In order to be irreplaceable, one must always be different.
—Coco Chanel
In Functional programming in Go, we
looked at some interesting generic functions that we can write on slices
(and, by extension, maps): for example, Map
,
Reduce
, and Filter
. Maps and slices are so
useful and ubiquitous in Go precisely because they can be used as
containers: values that store a collection of elements of any
type.
Before the introduction of generics in Go 1.18, though, we were more or less limited to just maps and slices, because it wasn’t possible to write user-defined containers of arbitrary types.
Now that we can, what container types, or other generic data structures, would it be interesting to write?
Sets
A set is like a kind of compromise between a slice and a map. Like a slice, a set is a collection of elements, all of the same type. Like a map, there’s no defined order, and elements must be unique.
We very often need something like a set in Go, but there’s no built-in or standard library set type, so we tend to roll our own using a map. For example:
var validCategory = map[string]bool{
"Autobiography": true,
"Large Print Romance": true,
"Particle Physics": true,
}
The value type of the map doesn’t matter here, since we’re
only really interested in the keys. But it’s convenient to use bool
values, since it
means we can use them in conditional expressions:
if validCategory[category] {
As you probably know, a map index expression like this returns the
zero value if the given key is not in the map. Since the zero value for
bool
is false
, this means the if
condition is false if category
is not in the “set”.
As neat as this is, it feels a little hacky, so it’s nice that we can
now write a proper Set
type using generics. Let’s try.
Operations on sets
What kind of things would we like to be able to do with a set? There are many possibilities, but perhaps the most basic are adding an element, and checking whether the set contains a given element.
In other words, we’d like to be able to write something like this:
:= NewSet[int]()
s .Add(1)
s.Println(s.Contains(1))
fmt// true
Let’s get to work!
Designing the Set
type
We need something to keep the actual data in. A map would make sense, since it gives us the “unique key” property for free.
And we know the map keys will need to be of arbitrary (but
comparable) type, since they represent the elements of our
Set
. What about the value type?
We could use bool
again, but maybe struct{}
is a better choice in this case, since it takes up no space.
So suppose we wrote some type definition like this:
type Set[E comparable] map[E]struct{}
In other words, for some comparable type E, a Set[E]
is
a map of E to empty struct.
If you’re already familiar with Go’s generics syntax, you’ll
recognise E as a type parameter. It
means that we’re not just defining a single type Set
:
instead, we’re defining a Set
of something. A
Set
of int
, for example, or perhaps
string
.
In fact, we can use any element type E, provided it satisfies the
constraint we put on it: comparable
. That’s a
shorthand way of saying that E must be a type that you can use the
==
operator with. That makes sense, since we’re using a map
as the underlying type for Set
, and map keys must be
comparable.
Since we can’t do much with a map until it’s been initialised, let’s
provide a constructor function NewSet
:
func NewSet[E comparable]() Set[E] {
return Set[E]{}
}
The Add
method
Let’s write an Add
method on this type. It should take
some value and store it in the set. We can do that by assigning to the
map:
func (s Set[E]) Add(v E) {
[v] = struct{}{}
s}
Easy, right? It’s just like adding a new key to a map, which happens
to be exactly what we’re doing. We just don’t have to explicitly mention
the map’s key type, because it doesn’t matter—so long as it’s
comparable
. The compiler will check this for us when
someone tries to instantiate a Set
of that type, and
complain if it doesn’t satisfy the constraint.
The Contains
method
The next job is to write Contains
. How should it behave?
That seems straightforward: given some value, it should return
true
if the value is in the set, or false
otherwise.
You probably know that a map index expression in Go can return two
values. The second, conventionally named ok
, tells us
whether the key was in fact in the map. And that’s the information we
want here:
func (s Set[E]) Contains(v E) bool {
, ok := s[v]
_return ok
}
Let’s try out some simple set operations:
:= NewSet[string]()
s .Add("hello")
s.Println(s.Contains("hello"))
fmt// true
Great! This is already useful. But let’s see what we can do to make it even better.
Initialising with multiple values
It seems as though it would be convenient to initialise a set with a
bunch of elements supplied to NewSet
, rather than having to
create the set first and then add each element one by one. In
other words, we’d like to write, for example:
:= NewSet(1, 2, 3) // no need to say “[int]” s
Not only does this save time, but now we don’t need to instantiate
NewSet
explicitly, because the compiler can infer the type
of E from the values we pass to it.
Let’s modify NewSet
to take any number of E values:
func NewSet[E comparable](vals ...E) Set[E] {
:= Set[E]{}
s for _, v := range vals {
[v] = struct{}{}
s}
return s
}
While we’re at it, we’ll make the same change to Add
, so
that we can add any number of new members in one go:
func (s Set[E]) Add(vals ...E) {
for _, v := range vals {
[v] = struct{}{}
s}
}
We’re ready to roll:
:= NewSet(true, true, true)
s .Add(true, true)
s.Println(s.Contains(false))
fmt// false
What else could we do?
Getting a set’s members
It’ll almost certainly be convenient to get all the members of the set as a slice. One of the more annoying limitations of using maps as sets in Go is that it’s not straightforward to do this.
Let’s make it straightforward!
func (s Set[E]) Members() []E {
:= make([]E, 0, len(s))
result for v := range s {
= append(result, v)
result }
return result
}
Now we can write:
:= NewSet(1, 2, 3)
s .Println(s.Members())
fmt// [1 2 3]
A String
method
In fact, let’s add a String
method so that we can print
the set in a nice way:
func (s Set[E]) String() string {
return fmt.Sprintf("%v", s.Members())
}
Now we don’t need to call Members
to print out the set,
and we can write simply:
.Println(s)
fmt// [2 3 1]
Notice that the members came out in a different order this time, and that’s okay!
It’s part of the definition of a set that its members aren’t in any special order, just like a map. And because we’re using a map to store the elements, we get that behaviour for free.
The union of two sets
Let’s do something more fun. A common operation on sets is to ask for the union of two sets: that is, the set that contains all the elements of both sets. How could we implement that?
One way would be to create a new set, using the members of set 1, then add the members of set 2. Here’s what that looks like:
func (s Set[E]) Union(s2 Set[E]) Set[E] {
:= NewSet(s.Members()...)
result .Add(s2.Members()...)
resultreturn result
}
Making NewSet
and Add
variadic has paid off
already, since it makes Union
much easier to write, with no
looping.
If this works, then we should be able to find the union of two small sets of integers:
:= NewSet(1, 2, 3)
s1 := NewSet(3, 4, 5)
s2 .Println(s1.Union(s2))
fmt// [1 2 3 4 5]
Excellent! As we’d expect, Union
has not just added
together the two sets, it’s also removed any duplicates. We didn’t have
to write any code to do this: again, it came free with the map.
An Intersection
method
Let’s write Intersection
, too, just for completeness.
The intersection of two sets is the set of members that they have in
common.
This is a little more difficult to write, but I think we can do it:
func (s Set[E]) Intersection(s2 Set[E]) Set[E] {
:= NewSet[E]()
result for _, v := range s.Members() {
if s2.Contains(v) {
.Add(v)
result}
}
return result
}
In other words, for each member of s
, we check whether
it is also present in s2
. If so, we add it to the result
set.
Does it work? Here goes:
:= NewSet(1, 2, 3)
s1 := NewSet(3, 4, 5)
s2 .Println(s1.Intersection(s2))
fmt// [3]
Encouraging!
Logic on sets
Let’s try out our new Set
type in a semi-realistic
application: checking job applications against the required set of
skills for a vacancy.
Since we’ve just written Intersection
, this should be
easy. We just need to define the set of required skills, and find its
intersection with the candidate’s skills.
If there are any members in this set, then the candidate has at least one of the required skills, so we can hire them.
For example, suppose there’s a job available which requires either Go or Java skills, but I happen to know Go and Ruby. Would I qualify?
Let’s find out:
:= NewSet("go", "java")
jobSkills := NewSet("go", "ruby")
mySkills := jobSkills.Intersection(mySkills)
matches if len(matches) > 0 {
.Println("You're hired!")
fmt}
// You're hired!
If only it were that easy.
Challenge: Stack overflow
Now it’s your turn! See if you can solve this programming challenge using Go generics. Your task is to implement a stack: an abstract data type which stores items in a last-in-first-out way. The only thing you can do with a stack is “push” a new item onto it, or “pop” the top item off it.
You’ll be building a generic Stack[E]
type that holds an
ordered collection of values of as broad a range of types as
possible.
You’ll need to write a Push
method that can append any
number of items to the stack, in order, and a Pop
method
that retrieves (and removes) the last item from the stack.
Pop
should also return a second ok
value
indicating whether an item was actually popped. If the stack is empty,
then Pop
should return false
for this second
value, but true
otherwise.
Also, you should provide a Len
method that returns the
number of items in the stack.
For example:
:= Stack[int]{}
s .Println(s.Pop())
fmt// 0 false
.Push(5, 6)
s.Println(s.Len())
fmt// 2
, ok := s.Pop()
v.Println(v)
fmt// 6
.Println(ok)
fmt// true
If you’d like some hints, check out one possible solution.
What will you build next?
In pre-generics days, we had to write, or generate, lots of duplicated code to implement data structures on multiple specific types, so there was an obvious disincentive to put too much effort into it. Since it was impractical to provide general-purpose libraries of such data structures, we tended to hack together our own versions as needed.
That being so, they generally weren’t very sophisticated: no one wants to maintain N different versions of the same sophisticated data structure. So we often went without neat features like concurrency safety, high performance, state-of-the-art algorithms, and so on.
That’s no longer the case, and we can write generic abstract data types that are as sophisticated as we like. There are some examples in my book Know Go: Generics that may help to inspire you. See what you can come up with, and have fun!