Fork from github.com/phf/go-queue with exposed internal list

Queue data structure for Go

A double-ended queue (aka "deque") built on top of a slice. All operations except pushes are constant-time; pushes are amortized constant-time. Benchmarks compare favorably to container/list as well as channels (see below).
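The core idea can be sketched as a slice used as a ring buffer that doubles in size when full. This is only an illustration of the technique, not the package's actual code; the type and method names here are made up for the example:

```go
package main

import "fmt"

// Deque sketches a double-ended queue on top of a slice used as a
// ring buffer. Pops and peeks are constant-time; pushes are amortized
// constant-time because the buffer grows by doubling.
type Deque struct {
	buf   []interface{}
	head  int // index of the front element
	count int
}

// grow doubles the buffer and unrolls the ring into the new slice
// with two copy calls (grow is only called when the buffer is full).
func (d *Deque) grow() {
	buf := make([]interface{}, 2*len(d.buf)+1)
	n := copy(buf, d.buf[d.head:])
	copy(buf[n:], d.buf[:d.head])
	d.buf, d.head = buf, 0
}

func (d *Deque) PushBack(v interface{}) {
	if d.count == len(d.buf) {
		d.grow()
	}
	d.buf[(d.head+d.count)%len(d.buf)] = v
	d.count++
}

func (d *Deque) PushFront(v interface{}) {
	if d.count == len(d.buf) {
		d.grow()
	}
	d.head = (d.head - 1 + len(d.buf)) % len(d.buf)
	d.buf[d.head] = v
	d.count++
}

func (d *Deque) PopFront() interface{} {
	if d.count == 0 {
		return nil
	}
	v := d.buf[d.head]
	d.buf[d.head] = nil // drop the reference so it can be collected
	d.head = (d.head + 1) % len(d.buf)
	d.count--
	return v
}

func main() {
	var d Deque
	d.PushBack(1)
	d.PushBack(2)
	d.PushFront(0)
	fmt.Println(d.PopFront(), d.PopFront(), d.PopFront()) // 0 1 2
}
```

Wrapping the head index modulo the buffer length is what makes both ends cheap: neither PushFront nor PopFront ever shifts the existing elements.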

I tried to stick close to the conventions container/list seems to follow even though I disagree with several of them (see RANT.md). In other words, it's ready for the standard library (hah!).

Background

In 2013 I was hacking a breadth-first search in Go and needed a queue, but all I could find in the standard library was container/list.

Now in principle there's nothing wrong with container/list, but I had just admonished my students to always think carefully about the number of memory allocations their programs make. So it felt wrong to use a data structure that allocates memory for every single vertex we visit during a breadth-first search.

After I finished my project, I cleaned up the queue code a little and pushed it here to give everybody else what I had really wanted to find in the standard library: a queue abstraction that doesn't allocate memory on every single insertion.
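The breadth-first search pattern that motivated all this looks roughly like the following. The graph representation and the plain-slice queue here are illustrative, not the package's API; the point is that, apart from the queue's occasional growth, nothing is allocated per visited vertex:

```go
package main

import "fmt"

// bfs returns the order in which vertices are visited starting from
// src. The queue is a plain slice: pushing a vertex only allocates
// when the slice has to grow, unlike container/list, which allocates
// a list node for every single PushBack.
func bfs(adj [][]int, src int) []int {
	visited := make([]bool, len(adj))
	queue := []int{src}
	visited[src] = true
	var order []int
	for len(queue) > 0 {
		v := queue[0]
		queue = queue[1:]
		order = append(order, v)
		for _, w := range adj[v] {
			if !visited[w] {
				visited[w] = true
				queue = append(queue, w)
			}
		}
	}
	return order
}

func main() {
	// Small undirected graph: 0-1, 0-2, 1-3.
	adj := [][]int{{1, 2}, {0, 3}, {0}, {1}}
	fmt.Println(bfs(adj, 0)) // [0 1 2 3]
}
```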

Performance comparison

The benchmarks are not very sophisticated, but we seem to be almost twice as fast as container/list (speedup of 1.85-1.93). We're also a bit faster than Go's channels (speedup of 1.38). Anyway, here are the (latest) numbers:

$ go test -bench . -benchmem
BenchmarkPushFrontQueue-2    	   20000	     85886 ns/op	   40944 B/op	    1035 allocs/op
BenchmarkPushFrontList-2     	   10000	    158998 ns/op	   57392 B/op	    2049 allocs/op
BenchmarkPushBackQueue-2     	   20000	     85189 ns/op	   40944 B/op	    1035 allocs/op
BenchmarkPushBackList-2      	   10000	    160718 ns/op	   57392 B/op	    2049 allocs/op
BenchmarkPushBackChannel-2   	   10000	    117610 ns/op	   24672 B/op	    1026 allocs/op
BenchmarkRandomQueue-2       	   10000	    144867 ns/op	   45720 B/op	    1632 allocs/op
BenchmarkRandomList-2        	    5000	    278965 ns/op	   90824 B/op	    3243 allocs/op
PASS
ok  	github.com/phf/go-queue/queue	12.472s
$ go version
go version go1.7.5 linux/amd64
$ uname -p
AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
$ date
Sat Apr 22 11:26:40 EDT 2017

(The number of allocations seems off: since we grow by doubling, we should only allocate memory O(log n) times. The extra allocations presumably come from boxing each int into an interface{}, which costs a heap allocation per push on recent versions of Go.)
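As a sanity check on the O(log n) claim, a tiny helper (made up for this example) counts how many doublings a buffer needs to reach a given capacity:

```go
package main

import "fmt"

// doublings counts how often a buffer of capacity `from` must double
// to reach at least capacity `to`.
func doublings(from, to int) int {
	n := 0
	for from < to {
		from *= 2
		n++
	}
	return n
}

func main() {
	// Growing from 16 to a million elements takes only 16
	// reallocations, not a million.
	fmt.Println(doublings(16, 1000000)) // 16
}
```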

The same benchmarks on an old Raspberry Pi Model B Rev 1:

$ go test -bench . -benchmem
PASS
BenchmarkPushFrontQueue     2000            788316 ns/op           16469 B/op         12 allocs/op
BenchmarkPushFrontList      1000           2629835 ns/op           33904 B/op       1028 allocs/op
BenchmarkPushBackQueue      2000            776663 ns/op           16469 B/op         12 allocs/op
BenchmarkPushBackList       1000           2817162 ns/op           33877 B/op       1028 allocs/op
BenchmarkPushBackChannel    2000           1229474 ns/op            8454 B/op          1 allocs/op
BenchmarkRandomQueue        2000           1325947 ns/op           16469 B/op         12 allocs/op
BenchmarkRandomList          500           4929491 ns/op           53437 B/op       1627 allocs/op
ok      github.com/phf/go-queue/queue   17.798s
$ go version
go version go1.3.3 linux/arm
$ cat /proc/cpuinfo | grep model
model name      : ARMv6-compatible processor rev 7 (v6l)
$ date
Sat Apr 22 18:04:16 UTC 2017

Here we're over three times faster than container/list and almost 60% faster than Go's channels. (Also the number of allocations looks right here, presumably because this old Go version could still store a word-sized value directly in an interface without a heap allocation. And in terms of raw performance, Go's memory allocator seems to have improved quite a bit in later versions.)

Go's channels as queues

Go's channels used to beat our queue implementation by about 22% for PushBack. That seemed sensible considering that channels are built into the language and offer a lot less functionality: we have to size them correctly if we want to use them as a simple queue in an otherwise non-concurrent setting, they are not double-ended, and they don't support "peeking" at the next element without removing it. Apparently replacing the "manual" loop with the built-in copy when the queue has to grow has paid off.
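For comparison, here is what using a buffered channel as a plain FIFO queue looks like. The limitations mentioned above show up directly: the capacity is fixed at creation, there is no PushFront, and the only way to look at the next element is to receive it:

```go
package main

import "fmt"

func main() {
	// A buffered channel works as a simple FIFO queue even in
	// single-goroutine code, but it must be sized up front: a send
	// on a full channel with nobody receiving would deadlock.
	q := make(chan int, 8)
	q <- 1
	q <- 2
	q <- 3
	// len(q) reports how many elements are queued, so we can drain
	// it without blocking.
	for len(q) > 0 {
		fmt.Println(<-q)
	}
}
```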

(In fact I used to call channels "ridiculously fast" before and recommended their use in situations where nothing but performance matters. Alas that may no longer be good advice. Either that, or I am just benchmarking incorrectly.)

Kudos

Hacking queue data structures in Go seems to be a popular way to spend an evening. Kudos to...

If you find something in my code that helps you improve yours, feel free to run with it!