
Some copy-editing, benchmarks from laptop.

Peter Froehlich 8 years ago
parent
commit
3a80a3b421
1 changed file with 50 additions and 19 deletions
README.md (+50, -19)

@@ -8,7 +8,7 @@ All operations except pushes are constant-time; pushes are
 *amortized* constant-time.
 Benchmarks compare favorably to
 [container/list](https://golang.org/pkg/container/list/) as
-well as channels (see below).
+well as to Go's channels.
 
 I tried to stick close to the conventions
 [container/list](https://golang.org/pkg/container/list/) seems to
@@ -27,7 +27,7 @@ Now in *principle* there's nothing wrong with
 had just admonished my students to *always* think carefully about
 the number of memory allocations their programs make.
 In other words, it felt wrong for me to use a data structure that
-will allocate memory for *every* single vertex we visit during a
+allocates memory for *every* single vertex we visit during a
 breadth-first search.
 
 After I got done with my project, I decided to clean up the queue
@@ -36,13 +36,13 @@ really wanted to find in the standard library:
 A queue abstraction that doesn't allocate memory on every single
 insertion.
 
-## Performance comparison
+## Performance
 
 The benchmarks are not very sophisticated but we seem to be *almost*
 twice as fast as [container/list](https://golang.org/pkg/container/list/)
 ([speedup](https://en.wikipedia.org/wiki/Speedup) of 1.85-1.93).
 We're also a bit faster than Go's channels (speedup of 1.38).
-Anyway, here are the (latest) numbers:
+Anyway, here are some numbers from my old home machine:
 
 ```
 $ go test -bench . -benchmem
@@ -65,7 +65,30 @@ Sat Apr 22 11:26:40 EDT 2017
 
 (The number of allocations seems off, since we grow by doubling we should
 only allocate memory O(log n) times.)
+The same benchmarks on a more recent laptop:
 
+```
+$ go test -bench=. -benchmem
+PASS
+BenchmarkPushFrontQueue-4 	   10000	    107377 ns/op	   40944 B/op	    1035 allocs/op
+BenchmarkPushFrontList-4  	   10000	    205141 ns/op	   57392 B/op	    2049 allocs/op
+BenchmarkPushBackQueue-4  	   10000	    107339 ns/op	   40944 B/op	    1035 allocs/op
+BenchmarkPushBackList-4   	   10000	    204100 ns/op	   57392 B/op	    2049 allocs/op
+BenchmarkPushBackChannel-4	   10000	    174319 ns/op	   24672 B/op	    1026 allocs/op
+BenchmarkRandomQueue-4    	   10000	    190498 ns/op	   45720 B/op	    1632 allocs/op
+BenchmarkRandomList-4     	    5000	    364802 ns/op	   90825 B/op	    3243 allocs/op
+ok  	github.com/phf/go-queue/queue	11.881s
+$ go version
+go version go1.6.2 linux/amd64
+$ cat /proc/cpuinfo | grep "model name" | uniq
+model name	: AMD A10-4600M APU with Radeon(tm) HD Graphics
+$ date
+Fri Apr 28 17:20:57 EDT 2017
+```
+
+So that's a [speedup](https://en.wikipedia.org/wiki/Speedup) of 1.90 over
+[container/list](https://golang.org/pkg/container/list/) and of 1.62 over
+Go's channels.
 The same benchmarks on an old
 [Raspberry Pi Model B Rev 1](https://en.wikipedia.org/wiki/Raspberry_Pi):
 
@@ -82,18 +105,18 @@ BenchmarkRandomList          500           4929491 ns/op           53437 B/op
 ok      github.com/phf/go-queue/queue   17.798s
 $ go version
 go version go1.3.3 linux/arm
-$ cat /proc/cpuinfo | grep model
+$ cat /proc/cpuinfo | grep "model name"
 model name      : ARMv6-compatible processor rev 7 (v6l)
 $ date
 Sat Apr 22 18:04:16 UTC 2017
 ```
 
-Here we're over three times faster than
-[container/list](https://golang.org/pkg/container/list/)
-and almost 60% faster than Go's channels.
-(Also the number of allocations seems to be correct. And in terms of
-raw performance, Go's memory allocator seems to have improved quite
-a bit in later versions.)
+So that's a [speedup](https://en.wikipedia.org/wiki/Speedup) of
+**3.34**-**3.72** over
+[container/list](https://golang.org/pkg/container/list/) and of 1.58 over
+Go's channels.
+(Also the number of allocations seems to be correct here for some
+reason?)
 
 ### Go's channels as queues
 
@@ -105,14 +128,22 @@ We have to size them correctly if we want to use them as a simple
 queue in an otherwise non-concurrent setting, they are not
 double-ended, and they don't support "peeking" at the next element
 without removing it.
-Apparently replacing the "manual" loop when a queue has to grow with
-[copy](https://golang.org/ref/spec#Appending_and_copying_slices) has
-paid off.
-
-(In fact I used to call channels "*ridiculously* fast" before and
-recommended their use in situations where nothing but performance
-matters. Alas that may no longer be good advice. Either that, or I
-am just benchmarking incorrectly.)
+
+It all changed with
+[two](https://github.com/phf/go-queue/commit/5652cbe39198516d853918fe64a4e70948b42f1a)
+[commits](https://github.com/phf/go-queue/commit/aa6086b89f98eb5cfd8df918e57612271ae1c137)
+that replaced the "manual" loop when a queue has to grow with
+[copy](https://golang.org/ref/spec#Appending_and_copying_slices)
+and the `%` operations to wrap indices around the slice with
+equivalent `&` operations.
+(The code was originally written without these "hacks" because I wanted to
+show it to my "innocent" Java students.)
+Those two changes *really* paid off.
+
+(I used to call channels "*ridiculously* fast" before and recommended their
+use in situations where nothing but performance matters.
+Alas that may no longer be good advice.
+Either that, or I am just benchmarking incorrectly.)
 
 ## Kudos
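
Note on the last hunk: the README credits two changes for the speedup, growing the queue with `copy` instead of a "manual" loop, and wrapping indices with `&` instead of `%`. The following is a minimal sketch of both ideas, not the code from the linked commits. The `ring` type and its methods are hypothetical names made up for this illustration, and the sketch assumes the backing slice always has a power-of-two length, which is what makes `i & (len-1)` equivalent to `i % len` for non-negative `i`.

```go
package main

import "fmt"

// ring is a hypothetical, minimal ring buffer for illustration only;
// it is not the code from the go-queue repository.
type ring struct {
	rep    []int // backing slice; len(rep) is always a power of two
	front  int   // index of the oldest element
	length int   // number of stored elements
}

func newRing() *ring { return &ring{rep: make([]int, 4)} }

// wrap maps i into [0, len(rep)). Since len(rep) is a power of two,
// i & (len(rep)-1) equals i % len(rep) for non-negative i.
func (r *ring) wrap(i int) int { return i & (len(r.rep) - 1) }

// grow doubles the backing slice. Two calls to the built-in copy
// replace an element-by-element loop.
func (r *ring) grow() {
	big := make([]int, 2*len(r.rep))
	n := copy(big, r.rep[r.front:]) // from front to the end of the slice
	copy(big[n:], r.rep[:r.front])  // the part that had wrapped around
	r.rep, r.front = big, 0
}

// pushBack appends v, growing first if the buffer is full.
func (r *ring) pushBack(v int) {
	if r.length == len(r.rep) {
		r.grow()
	}
	r.rep[r.wrap(r.front+r.length)] = v
	r.length++
}

// popFront removes and returns the oldest element, or false if empty.
func (r *ring) popFront() (int, bool) {
	if r.length == 0 {
		return 0, false
	}
	v := r.rep[r.front]
	r.front = r.wrap(r.front + 1)
	r.length--
	return v, true
}

func main() {
	r := newRing()
	for i := 0; i < 6; i++ {
		r.pushBack(i)
	}
	r.popFront()
	r.popFront()
	for i := 6; i <= 10; i++ {
		r.pushBack(i) // the last push grows a wrapped-around buffer
	}
	for v, ok := r.popFront(); ok; v, ok = r.popFront() {
		fmt.Println(v)
	}
}
```

Growing only when the buffer is full keeps the two `copy` calls simple: everything from `front` to the end of the slice comes first, then everything before `front`. Doubling from a power-of-two start preserves the invariant that `wrap` depends on.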