# Queue data structure for Go [![GoDoc](https://godoc.org/github.com/phf/go-queue/queue?status.png)](http://godoc.org/github.com/phf/go-queue/queue) [![Go Report Card](https://goreportcard.com/badge/github.com/phf/go-queue)](https://goreportcard.com/report/github.com/phf/go-queue) A double-ended queue (aka "deque") built on top of a slice. All operations except pushes are constant-time; pushes are *amortized* constant-time. Benchmarks compare favorably to [container/list](https://golang.org/pkg/container/list/) as well as to Go's channels. I tried to stick close to the conventions [container/list](https://golang.org/pkg/container/list/) seems to follow even though I disagree with several of them (see [`RANT.md`](https://github.com/phf/go-queue/blob/master/RANT.md)). In other words, it's ready for the standard library (hah!). ## Background In 2013 I was hacking a breadth-first search in Go and needed a queue, but all I could find in the standard library was [container/list](https://golang.org/pkg/container/list/). Now in *principle* there's nothing wrong with [container/list](https://golang.org/pkg/container/list/), but I had just admonished my students to *always* think carefully about the number of memory allocations their programs make. In other words, it felt wrong for me to use a data structure that allocates memory for *every* single vertex we visit during a breadth-first search. After I got done with my project, I decided to clean up the queue code a little and to push it here to give everybody else what I really wanted to find in the standard library: A queue abstraction that doesn't allocate memory on every single insertion. ## Performance The benchmarks are not very sophisticated but we seem to be *almost* twice as fast as [container/list](https://golang.org/pkg/container/list/) ([speedup](https://en.wikipedia.org/wiki/Speedup) of 1.85-1.93). We're also a bit faster than Go's channels (speedup of 1.38). Anyway, here are some numbers from my old home machine: ``` $ go test -bench . -benchmem BenchmarkPushFrontQueue-2 20000 85886 ns/op 40944 B/op 1035 allocs/op BenchmarkPushFrontList-2 10000 158998 ns/op 57392 B/op 2049 allocs/op BenchmarkPushBackQueue-2 20000 85189 ns/op 40944 B/op 1035 allocs/op BenchmarkPushBackList-2 10000 160718 ns/op 57392 B/op 2049 allocs/op BenchmarkPushBackChannel-2 10000 117610 ns/op 24672 B/op 1026 allocs/op BenchmarkRandomQueue-2 10000 144867 ns/op 45720 B/op 1632 allocs/op BenchmarkRandomList-2 5000 278965 ns/op 90824 B/op 3243 allocs/op PASS ok github.com/phf/go-queue/queue 12.472s $ go version go version go1.7.5 linux/amd64 $ uname -p AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ $ date Sat Apr 22 11:26:40 EDT 2017 ``` (The number of allocations seems off, since we grow by doubling we should only allocate memory O(log n) times.) The same benchmarks on a more recent laptop: ``` $ go test -bench=. -benchmem PASS BenchmarkPushFrontQueue-4 10000 107377 ns/op 40944 B/op 1035 allocs/op BenchmarkPushFrontList-4 10000 205141 ns/op 57392 B/op 2049 allocs/op BenchmarkPushBackQueue-4 10000 107339 ns/op 40944 B/op 1035 allocs/op BenchmarkPushBackList-4 10000 204100 ns/op 57392 B/op 2049 allocs/op BenchmarkPushBackChannel-4 10000 174319 ns/op 24672 B/op 1026 allocs/op BenchmarkRandomQueue-4 10000 190498 ns/op 45720 B/op 1632 allocs/op BenchmarkRandomList-4 5000 364802 ns/op 90825 B/op 3243 allocs/op ok github.com/phf/go-queue/queue 11.881s $ go version go version go1.6.2 linux/amd64 $ cat /proc/cpuinfo | grep "model name" | uniq model name : AMD A10-4600M APU with Radeon(tm) HD Graphics $ date Fri Apr 28 17:20:57 EDT 2017 ``` So that's a [speedup](https://en.wikipedia.org/wiki/Speedup) of 1.90 over [container/list](https://golang.org/pkg/container/list/) and of 1.62 over Go's channels. The same benchmarks on an old [Raspberry Pi Model B Rev 1](https://en.wikipedia.org/wiki/Raspberry_Pi): ``` $ go test -bench . -benchmem PASS BenchmarkPushFrontQueue 2000 788316 ns/op 16469 B/op 12 allocs/op BenchmarkPushFrontList 1000 2629835 ns/op 33904 B/op 1028 allocs/op BenchmarkPushBackQueue 2000 776663 ns/op 16469 B/op 12 allocs/op BenchmarkPushBackList 1000 2817162 ns/op 33877 B/op 1028 allocs/op BenchmarkPushBackChannel 2000 1229474 ns/op 8454 B/op 1 allocs/op BenchmarkRandomQueue 2000 1325947 ns/op 16469 B/op 12 allocs/op BenchmarkRandomList 500 4929491 ns/op 53437 B/op 1627 allocs/op ok github.com/phf/go-queue/queue 17.798s $ go version go version go1.3.3 linux/arm $ cat /proc/cpuinfo | grep "model name" model name : ARMv6-compatible processor rev 7 (v6l) $ date Sat Apr 22 18:04:16 UTC 2017 ``` So that's a [speedup](https://en.wikipedia.org/wiki/Speedup) of **3.34**-**3.72** over [container/list](https://golang.org/pkg/container/list/) and of 1.58 over Go's channels. (Also the number of allocations seems to be correct here for some reason?) ### Go's channels as queues Go's channels *used* to beat our queue implementation by about 22% for `PushBack`. That seemed sensible considering that channels are built into the language and offer a lot less functionality: We have to size them correctly if we want to use them as a simple queue in an otherwise non-concurrent setting, they are not double-ended, and they don't support "peeking" at the next element without removing it. It all changed with [two](https://github.com/phf/go-queue/commit/5652cbe39198516d853918fe64a4e70948b42f1a) [commits](https://github.com/phf/go-queue/commit/aa6086b89f98eb5cfd8df918e57612271ae1c137) that replaced the "manual" loop when a queue has to grow with [copy](https://golang.org/ref/spec#Appending_and_copying_slices) and the `%` operations to wrap indices around the slice with equivalent `&` operations. (The code was originally written without these "hacks" because I wanted to show it to my "innocent" Java students.) Those two changes *really* paid off. (I used to call channels "*ridiculously* fast" before and recommended their use in situations where nothing but performance matters. Alas that may no longer be good advice. Either that, or I am just benchmarking incorrectly.) ## Kudos Hacking queue data structures in Go seems to be a popular way to spend an evening. Kudos to... - [Rodrigo Moraes](https://github.com/moraes) for posting [this gist](https://gist.github.com/moraes/2141121) which reminded me of Go's [copy](https://golang.org/ref/spec#Appending_and_copying_slices) builtin and a similar trick I had previously used in Java. - [Evan Huus](https://github.com/eapache) for sharing [his queue](https://github.com/eapache/queue) which reminded me of the old "replace % by &" trick I had used many times before. - [Dariusz Górecki](https://github.com/canni) for his [commit](https://github.com/eapache/queue/commit/334cc1b02398be651373851653017e6cbf588f9e) to [Evan](https://github.com/eapache)'s queue that simplified [Rodrigo](https://github.com/moraes)'s snippet and hence mine. If you find something in my code that helps you improve yours, feel free to run with it!