|
| 1 | +# SmoothieMap |
| 2 | + |
| 3 | +`SmoothieMap` is a `java.util.Map` implementation whose worst-case write (`put(k, v)`) latencies |
| 4 | +are more than 100 times smaller than in ordinary hash table implementations like `java.util.HashMap`. |
| 5 | +For example, when inserting 10 million entries into a `HashMap`, the longest insertion (when about |
| 6 | +6 million entries are already in the map) takes about 42 milliseconds. The longest insertion into |
| 7 | +`SmoothieMap` takes only 0.27 milliseconds (when about 8 million entries are already inserted). |
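
The worst-case `put()` latency quoted above could be measured roughly as in the sketch below. It is an illustration only, not the benchmark behind the numbers in this README: the class and method names (`WorstPutLatency`, `worstPutNanos`) are made up for this example, there is no warm-up, and `System.nanoTime()` adds its own overhead. `HashMap` is used so the code runs with the JDK alone; since `SmoothieMap` is a `java.util.Map`, a `SmoothieMap` factory could be passed instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class WorstPutLatency {

    /** Inserts n entries one by one and returns the longest single put(), in nanoseconds. */
    static long worstPutNanos(Supplier<Map<Integer, Integer>> mapFactory, int n) {
        Map<Integer, Integer> map = mapFactory.get();
        long worst = 0;
        for (int i = 0; i < n; i++) {
            Integer key = i; // box outside the timed region
            long start = System.nanoTime();
            map.put(key, key);
            long elapsed = System.nanoTime() - start;
            if (elapsed > worst) {
                worst = elapsed;
            }
        }
        if (map.size() != n) {
            throw new AssertionError("unexpected map size: " + map.size());
        }
        return worst;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // smaller than the 10 million entries quoted above
        long worst = worstPutNanos(HashMap::new, n);
        System.out.printf("HashMap worst put(): %.3f ms%n", worst / 1e6);
    }
}
```

A single run like this is noisy (GC pauses and JIT compilation also show up as long insertions), so any real comparison should repeat the measurement many times.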
| 8 | + |
| 9 | +Another important property of `SmoothieMap` is that it produces very little garbage as it grows: |
| 10 | +about 50 times less than e. g. `HashMap`, by total size of the objects that are going to be GC'ed. |
| 11 | +When it shrinks, `SmoothieMap` doesn't generate any garbage, not counting the mapped keys and |
| 12 | +values themselves, which may or may not be GC'ed depending on the application logic. |
| 13 | + |
| 14 | +The final major advantage of `SmoothieMap` over the traditional `HashMap` is its memory footprint. |
| 15 | + |
| 16 | +<table> |
| 17 | + <tr> |
| 18 | + <th></th> |
| 19 | + <th><code>new SmoothieMap(expectedSize)</code></th> |
| 20 | + <th><code>new SmoothieMap()</code>, i. e. without an expected size provided</th> |
| 21 | + <th><code>HashMap</code>, regardless of which constructor is called</th> |
| 22 | + </tr> |
| 23 | + <tr> |
| 24 | + <th>Java ref size</th> |
| 25 | + <th colspan=3>Map footprint, bytes/entry (excluding keys and values themselves)</th> |
| 26 | + </tr> |
| 27 | + <tr> |
| 28 | + <td>4 bytes, <code>-XX:+UseCompressedOops</code></td> |
| 29 | + <td>16.1-22.5</td> |
| 30 | + <td>17.5-23.0</td> |
| 31 | + <td>37.3-42.7</td> |
| 32 | + </tr> |
| 33 | + <tr> |
| 34 | + <td>8 bytes, <code>-XX:-UseCompressedOops</code></td> |
| 35 | + <td>26.7-34.3</td> |
| 36 | + <td>28.2-37.2</td> |
| 37 | + <td>58.7-69.3</td> |
| 38 | + </tr> |
| 39 | +</table> |
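
To put the per-entry figures from the table into perspective, here is a small back-of-the-envelope calculation for a map of 10 million entries with 4-byte references. The helper name `totalMegabytes` is invented for this sketch; the per-entry ranges are the ones from the table, and keys and values themselves are excluded, as in the table.

```java
class FootprintEstimate {

    /** Converts a per-entry footprint (bytes/entry) into total megabytes (10^6 bytes). */
    static double totalMegabytes(double bytesPerEntry, long entries) {
        return bytesPerEntry * entries / 1_000_000.0;
    }

    public static void main(String[] args) {
        long entries = 10_000_000L;
        // Ranges from the table, 4-byte references (-XX:+UseCompressedOops)
        System.out.printf("new SmoothieMap(expectedSize): %.0f-%.0f MB%n",
                totalMegabytes(16.1, entries), totalMegabytes(22.5, entries));
        System.out.printf("HashMap: %.0f-%.0f MB%n",
                totalMegabytes(37.3, entries), totalMegabytes(42.7, entries));
    }
}
```

With these inputs the map overhead comes to roughly 161-225 MB for `SmoothieMap` versus roughly 373-427 MB for `HashMap`.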
| 40 | + |
| 41 | +<hr> |
| 42 | + |
| 43 | +These properties make `SmoothieMap` interesting for low-latency scenarios and *real-time |
| 44 | +applications*, e. g. implementing services that have hard latency requirements defined by an SLA. |
| 45 | + |
| 46 | +On the other hand, *amortized* performance of read (`get(k)`) and write operations on `SmoothieMap` |
| 47 | +is approximately equal to `HashMap`'s performance (sometimes slower, sometimes faster, but always |
| 48 | +in the same ballpark, regardless of map size), unlike, e. g. `java.util.TreeMap`. |
| 49 | + |
| 50 | +If you are curious how this has been achieved and what algorithm is behind `SmoothieMap`, read the |
| 51 | +implementation comment in [`SmoothieMap` |
| 52 | +class](https://github.com/OpenHFT/SmoothieMap/blob/master/src/main/java/net/openhft/SmoothieMap.java) |
| 53 | +(at the top of the class body). |
| 54 | + |
| 55 | +### Other players in the "real-time `Map`" field |
| 56 | + |
| 57 | + - [Javolution](http://javolution.org/)'s `FastMap` appears to be an ordinary open-addressing hash |
| 58 | + table with linear probing, i. e. its `put()` calls have very bad latency when a hash table resize is |
| 59 | + triggered. `FastSortedMap` does have real-time `put()`s, but it is a tree with `log(N)` operation |
| 60 | + complexity, so it is better compared to `java.util.TreeMap`. |
| 61 | + |
| 62 | + - [`PauselessHashMap`](https://github.com/giltene/PauselessHashMap) offers truly real-time `put()`s |
| 63 | + with constant worst-case latencies (while `SmoothieMap`'s latencies still grow linearly with the map |
| 64 | + size, though with a very small coefficient), but has different downsides: |
| 65 | + |
| 66 | + - Other operations, like `remove()`, `putAll()`, `clear()`, and the derivation of keysets and |
| 67 | + such *will* block for pending resize operations. |
| 68 | + |
| 69 | + - It produces garbage at nearly the same rate as `HashMap`. |
| 70 | + |
| 71 | + - It runs a background Executor with resizer threads, which could be undesirable, or lead to |
| 72 | + problems or stalls if the resizer threads starve. `SmoothieMap` is simply single-threaded. |
| 73 | + |
| 74 | + - `PauselessHashMap` also has amortized read and write performance close to `HashMap`'s, but |
| 75 | + `PauselessHashMap` is consistently slower. |
| 76 | + |
| 77 | + - `PauselessHashMap` has footprint characteristics similar to `HashMap`'s, i. e. it consumes |
| 78 | + more memory than `SmoothieMap`. |
| 79 | + |
| 80 | +### Should I use SmoothieMap? |
| 81 | + |
| 82 | +Points for: |
| 83 | + |
| 84 | + - :smile: You have hard latency requirements for `put()` or `remove()` operations on the |
| 85 | + map. |
| 86 | + |
| 87 | + - :grinning: You don't want the map to produce garbage when growing and/or shrinking (Entry objects). |
| 88 | + |
| 89 | + - :grinning: You are worried about `HashMap`'s memory footprint. `SmoothieMap` lets you reduce it. |
| 90 | + |
| 91 | + - :blush: You run your application on a modern, powerful CPU with a wide pipeline and support for [bit |
| 92 | + manipulation extensions](https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets), |
| 93 | + preferably Intel, preferably Haswell or newer architecture. `SmoothieMap` tends to perform better |
| 94 | + on newer CPUs. |
| 95 | + |
| 96 | +Points against: |
| 97 | + |
| 98 | + - :confused: You run your application on an outdated CPU (but desktop- or server-class). |
| 99 | + |
| 100 | + - :worried: Your map(s) are not very large (say, smaller than 1000 entries), particularly |
| 101 | + :persevere: if smaller than 32 entries. In this case even a full `HashMap` resize could complete in |
| 102 | + less than 1 microsecond. Meanwhile, `SmoothieMap` cannot even be configured to hold fewer than 32 |
| 103 | + entries, so if you want to hold only a few entries, you are going to waste memory. |
| 104 | + |
| 105 | + - :persevere: You run your application on a 32-bit or mobile-class CPU, like ARM. `SmoothieMap` is |
| 106 | + tailored for 64-bit CPUs and should perform badly on those without fast native 64-bit arithmetic |
| 107 | + operations and addressing. |
| 108 | + |
| 109 | + However, a `SmoothieMap` version optimized for CPUs without native (or with slow) 64-bit arithmetic |
| 110 | + could be implemented; it's just not here yet. |
| 111 | + |
| 112 | + - :dizzy_face: There is some non-zero possibility that 32 or more keys collide in the 30 lowest bits of |
| 113 | + their hash codes. In this situation `SmoothieMap` is not operational and throws |
| 114 | + `IllegalStateException`. |
| 115 | + |
| 116 | + Fortunately, unless somebody purposely inserts keys with colliding hash |
| 117 | + codes, performing a hash DoS attack, this is practically impossible for any decent hash code |
| 118 | + implementation. Moreover, you can override the [`SmoothieMap.keyHashCode()` |
| 119 | + ](http://openhft.github.io/SmoothieMap/apidocs/net/openhft/smoothie/SmoothieMap.html#keyHashCode-java.lang.Object-) |
| 120 | + method, for example to add some random salt, excluding even the possibility of a hash DoS attack. |
| 121 | + |
| 122 | + - :dizzy_face: You run on an old Java version. SmoothieMap sets Java 8 as the compatibility baseline. |
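
The collision condition and the salting idea from the hash-code point above can be sketched in plain Java. This is a standalone illustration, not `SmoothieMap`'s code: `collideInLow30Bits` and `saltedHash` are hypothetical helpers, the signature of the real `keyHashCode()` method is not reproduced here (check the linked JavaDocs before overriding it), and the mixing step is the well-known SplitMix64 finalizer, chosen only as an example of a decent bit mixer.

```java
import java.util.concurrent.ThreadLocalRandom;

class SaltedHashSketch {

    /** Mask selecting the 30 lowest bits of a hash code. */
    static final long LOW_30_BITS = (1L << 30) - 1;

    /** True if the two hash codes agree in their 30 lowest bits. */
    static boolean collideInLow30Bits(long h1, long h2) {
        return (h1 & LOW_30_BITS) == (h2 & LOW_30_BITS);
    }

    /** Hypothetical salted hash: XOR in the salt, then apply the SplitMix64 finalizer. */
    static long saltedHash(Object key, long salt) {
        long h = (key == null ? 0 : key.hashCode()) ^ salt;
        h = (h ^ (h >>> 30)) * 0xBF58476D1CE4E5B9L;
        h = (h ^ (h >>> 27)) * 0x94D049BB133111EBL;
        return h ^ (h >>> 31);
    }

    public static void main(String[] args) {
        // The salt is assumed to be chosen once per map (or per process) and kept secret.
        long salt = ThreadLocalRandom.current().nextLong();
        long h1 = saltedHash("first key", salt);
        long h2 = saltedHash("second key", salt);
        System.out.println("collide in low 30 bits: " + collideInLow30Bits(h1, h2));
    }
}
```

One caveat of this particular sketch: it salts the result of `hashCode()`, so keys whose `hashCode()` values are fully equal (for example the strings `"Aa"` and `"BB"`) still produce equal salted hashes. Guarding against that would require hashing the key's contents with a keyed function rather than its `hashCode()`.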
| 123 | + |
| 124 | +### Quick start |
| 125 | + |
| 126 | +Add the [`net.openhft:smoothie-map:1.0-rc` |
| 127 | +](http://search.maven.org/#artifactdetails%7Cnet.openhft%7Csmoothie-map%7C1.0-rc%7Cjar) dependency |
| 128 | +to your project (you can copy a snippet for your favourite build system on the linked page). |
| 129 | + |
| 130 | +E. g. Maven: |
| 131 | + |
| 132 | + <dependency> |
| 133 | + <groupId>net.openhft</groupId> |
| 134 | + <artifactId>smoothie-map</artifactId> |
| 135 | + <version>1.0-rc</version> |
| 136 | + </dependency> |
| 137 | + |
| 138 | +Then use it in Java: |
| 139 | + |
| 140 | + Map<K, V> map = new net.openhft.smoothie.SmoothieMap<>(); |
| 141 | + |
| 142 | +See [JavaDocs](http://openhft.github.io/SmoothieMap/apidocs/net/openhft/smoothie/SmoothieMap.html) |
| 143 | +for more information. |
| 144 | + |
| 145 | +### Production use considerations |
| 146 | + |
| 147 | + - SmoothieMap supports Java 8 or newer only |
| 148 | + |
| 149 | + - SmoothieMap is licensed under [LGPL, version 3 |
| 150 | +](https://tldrlegal.com/license/gnu-lesser-general-public-license-v3-(lgpl-3)) |
| 151 | + |
| 152 | + - There are some unit tests, including ones generated with `guava-testlib`. |
| 153 | + |
| 154 | + |
| 155 | +### Anticipated questions |
| 156 | + |
| 157 | +#### Is `SmoothieMap` safe for concurrent use from multiple threads? |
| 158 | + |
| 159 | +No, `SmoothieMap` is not synchronized. It competes with `HashMap`, not `ConcurrentHashMap`. However, |
| 160 | +a concurrent version could be implemented naturally, similar to how `ConcurrentHashMap` was |
| 161 | +implemented in JDK 5, 6 and 7. |
| 162 | + |
| 163 | +Similarly, `SmoothieMap` could be tweaked to add some sort of LRU ordering, making it a good choice |
| 164 | +for implementing caches. |
| 165 | + |
| 166 | +#### How does `SmoothieMap` compare to `HashObjObjMap` from [Koloboke](https://github.com/OpenHFT/Koloboke)? |
| 167 | + |
| 168 | + - `HashObjObjMap` doesn't have latency guarantees, so it is similar to `HashMap` in this regard. |
| 169 | + |
| 170 | + - `HashObjObjMap` has a smaller footprint on average in case of 4-byte Java refs |
| 171 | + (`-XX:+UseCompressedOops`), but with greater variance: 12-24 bytes per entry. If references are |
| 172 | + 8-byte (`-XX:-UseCompressedOops`), `HashObjObjMap` takes even more memory than |
| 173 | + `SmoothieMap`: 24-48 bytes per entry, 36 bytes on average. |
| 174 | + |
| 175 | + - The same goes for performance: on average `HashObjObjMap` is faster (especially if |
| 176 | + keys are effectively compared by identity `==`, and/or only successful queries are performed, i. e. |
| 177 | + `get(key)` calls always find some value mapped for the key). But `HashObjObjMap` has greater |
| 178 | + variance in performance, depending on the workload. |
| 179 | + |
| 180 | +#### What about primitive specializations for keys and/or values? |
| 181 | + |
| 182 | +`SmoothieMap` is specially designed for `Object` keys and values. For primitive Map specializations |
| 183 | +you are better off using [Koloboke](https://github.com/OpenHFT/Koloboke) or other similar libs. |
| 184 | + |
| 185 | +#### How does `SmoothieMap` compare to [Chronicle Map](https://github.com/OpenHFT/Chronicle-Map)? |
| 186 | + |
| 187 | +Chronicle Map stores keys and values off-heap (in shared memory), while SmoothieMap is an ordinary |
| 188 | +vanilla Java `Map` implementation. Actually, SmoothieMap is the result of the idea "what if we try to |
| 189 | +move Chronicle Map's design decisions back to the Java heap?" |
| 190 | + |
| 191 | +<hr> |
| 192 | + |
| 193 | +### Author |
| 194 | + |
| 195 | +[Roman Leventov](https://github.com/leventov) |