| # base/containers library |
| |
| [TOC] |
| |
| ## What goes here |
| |
| This directory contains some STL-like containers. |
| |
| Things should be moved here that are generally applicable across the code base. |
| Don't add things here just because you need them in one place and think others |
| may someday want something similar. You can put specialized containers in |
| your component's directory and we can promote them here later if we feel there |
| is broad applicability. |
| |
| ### Design and naming |
| |
| Containers should adhere as closely to STL as possible. Functions and behaviors |
| not present in STL should only be added when they are related to the specific |
| data structure implemented by the container. |
| |
| For STL-like containers our policy is that they should use STL-like naming even |
| when it may conflict with the style guide. So functions and class names should |
| be lower case with underscores. Non-STL-like classes and functions should use |
| Google naming. Be sure to use the base namespace. |
| |
| ## Map and set selection |
| |
| ### Usage advice |
| |
| * Generally avoid `std::unordered_set` and `std::unordered_map`. In the |
| common case, query performance is unlikely to be sufficiently higher than |
| `std::map` to make a difference, insert performance is slightly worse, and |
| the memory overhead is high. This makes sense mostly for large tables where |
| you expect a lot of lookups. |
| |
| * Most maps and sets in Chrome are small and contain objects that can be |
| moved efficiently. In this case, consider `base::flat_map` and |
| `base::flat_set`. You need to be aware of the maximum expected size of |
| the container since individual inserts and deletes are O(n), giving O(n^2) |
| construction time for the entire map. But because it avoids mallocs in most |
| cases, inserts are better or comparable to other containers even for |
| several dozen items, and efficiently-moved types are unlikely to have |
| performance problems for most cases until you have hundreds of items. If |
| your container can be constructed in one shot, the constructor from vector |
| gives O(n log n) construction times and it should be strictly better than |
| a `std::map`. |
| |
| * `base::small_map` has better runtime memory usage without the poor |
| mutation performance of large containers that `base::flat_map` has. But this |
| advantage is partially offset by additional code size. Prefer in cases |
| where you make many objects so that the code/heap tradeoff is good. |
| |
| * Use `std::map` and `std::set` if you can't decide. Even if they're not |
| great, they're unlikely to be bad or surprising. |
| |
| ### Map and set details |
| |
| Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the |
| container is mutated. |
| |
| | Container | Empty size | Per-item overhead | Stable iterators? | |
| |:------------------------------------------ |:--------------------- |:----------------- |:----------------- | |
| | `std::map`, `std::set` | 16 bytes | 32 bytes | Yes | |
| | `std::unordered_map`, `std::unordered_set` | 128 bytes | 16 - 24 bytes | No | |
| | `base::flat_map`, `base::flat_set` | 24 bytes | 0 (see notes) | No | |
| | `base::small_map` | 24 bytes (see notes) | 32 bytes | No | |
| |
| **Takeaways:** `std::unordered_map` and `std::unordered_map` have high |
| overhead for small container sizes, so prefer these only for larger workloads. |
| |
| Code size comparisons for a block of code (see appendix) on Windows using |
| strings as keys. |
| |
| | Container | Code size | |
| |:-------------------- |:---------- | |
| | `std::unordered_map` | 1646 bytes | |
| | `std::map` | 1759 bytes | |
| | `base::flat_map` | 1872 bytes | |
| | `base::small_map` | 2410 bytes | |
| |
| **Takeaways:** `base::small_map` generates more code because of the inlining of |
| both brute-force and red-black tree searching. This makes it less attractive |
| for random one-off uses. But if your code is called frequently, the runtime |
| memory benefits will be more important. The code sizes of the other maps are |
| close enough it's not worth worrying about. |
| |
| ### std::map and std::set |
| |
| A red-black tree. Each inserted item requires the memory allocation of a node |
| on the heap. Each node contains a left pointer, a right pointer, a parent |
| pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit |
| platforms). |
| |
| ### std::unordered\_map and std::unordered\_set |
| |
| A hash table. Implemented on Windows as a `std::vector` + `std::list` and in libc++ |
| as the equivalent of a `std::vector` + a `std::forward_list`. Both implementations |
| allocate an 8-entry hash table (containing iterators into the list) on |
| initialization, and grow to 64 entries once 8 items are inserted. Above 64 |
| items, the size doubles every time the load factor exceeds 1. |
| |
| The empty size is `sizeof(std::unordered_map)` = 64 + the initial hash table |
| size which is 8 pointers. The per-item overhead in the table above counts the |
| list node (2 pointers on Windows, 1 pointer in libc++), plus amortizes the hash |
| table assuming a 0.5 load factor on average. |
| |
| In a microbenchmark on Windows, inserts of 1M integers into a |
| `std::unordered_set` took 1.07x the time of `std::set`, and queries took 0.67x |
| the time of `std::set`. For a typical 4-entry set (the statistical mode of map |
| sizes in the browser), query performance is identical to `std::set` and |
| `base::flat_set`. On ARM, `std::unordered_set` performance can be worse because |
| integer division to compute the bucket is slow, and a few "less than" operations |
| can be faster than computing a hash depending on the key type. The takeaway is |
| that you should not default to using unordered maps because "they're faster." |
| |
| ### base::flat\_map and base::flat\_set |
| |
| A sorted `std::vector`. Seached via binary search, inserts in the middle require |
| moving elements to make room. Good cache locality. For large objects and large |
| set sizes, `std::vector`'s doubling-when-full strategy can waste memory. |
| |
| Supports efficient construction from a vector of items which avoids the O(n^2) |
| insertion time of each element separately. |
| |
| The per-item overhead will depend on the underlying `std::vector`'s reallocation |
| strategy and the memory access pattern. Assuming items are being linearly added, |
| one would expect it to be 3/4 full, so per-item overhead will be 0.25 * |
| sizeof(T). |
| |
| `flat_set` and `flat_map` support a notion of transparent comparisons. |
| Therefore you can, for example, lookup `base::StringPiece` in a set of |
| `std::strings` without constructing a temporary `std::string`. This |
| functionality is based on C++14 extensions to the `std::set`/`std::map` |
| interface. |
| |
| You can find more information about transparent comparisons in [the `less<void>` |
| documentation](https://en.cppreference.com/w/cpp/utility/functional/less_void). |
| |
| Example, smart pointer set: |
| |
| ```cpp |
| // Declare a type alias using base::UniquePtrComparator. |
| template <typename T> |
| using UniquePtrSet = base::flat_set<std::unique_ptr<T>, |
| base::UniquePtrComparator>; |
| |
| // ... |
| // Collect data. |
| std::vector<std::unique_ptr<int>> ptr_vec; |
| ptr_vec.reserve(5); |
| std::generate_n(std::back_inserter(ptr_vec), 5, []{ |
| return std::make_unique<int>(0); |
| }); |
| |
| // Construct a set. |
| UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES); |
| |
| // Use raw pointers to lookup keys. |
| int* ptr = ptr_set.begin()->get(); |
| EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin()); |
| ``` |
| |
| Example `flat_map<std::string, int>`: |
| |
| ```cpp |
| base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2},{"b", 2}}, |
| base::KEEP_FIRST_OF_DUPES); |
| |
| // Does not construct temporary strings. |
| str_to_int.find("c")->second = 3; |
| str_to_int.erase("c"); |
| EXPECT_EQ(str_to_int.end(), str_to_int.find("c")->second); |
| |
| // NOTE: This does construct a temporary string. This happens since if the |
| // item is not in the container, then it needs to be constructed, which is |
| // something that transparent comparators don't have to guarantee. |
| str_to_int["c"] = 3; |
| ``` |
| |
| ### base::small\_map |
| |
| A small inline buffer that is brute-force searched that overflows into a full |
| `std::map` or `std::unordered_map`. This gives the memory benefit of |
| `base::flat_map` for small data sizes without the degenerate insertion |
| performance for large container sizes. |
| |
| Since instantiations require both code for a `std::map` and a brute-force search |
| of the inline container, plus a fancy iterator to cover both cases, code size |
| is larger. |
| |
| The initial size in the above table is assuming a very small inline table. The |
| actual size will be `sizeof(int) + min(sizeof(std::map), sizeof(T) * |
| inline_size)`. |
| |
| ## Deque |
| |
| ### Usage advice |
| |
| Chromium code should always use `base::circular_deque` or `base::queue` in |
| preference to `std::deque` or `std::queue` due to memory usage and platform |
| variation. |
| |
| The `base::circular_deque` implementation (and the `base::queue` which uses it) |
| provide performance consistent across platforms that better matches most |
| programmer's expectations on performance (it doesn't waste as much space as |
| libc++ and doesn't do as many heap allocations as MSVC). It also generates less |
| code tham `std::queue`: using it across the code base saves several hundred |
| kilobytes. |
| |
| Since `base::deque` does not have stable iterators and it will move the objects |
| it contains, it may not be appropriate for all uses. If you need these, |
| consider using a `std::list` which will provide constant time insert and erase. |
| |
| ### std::deque and std::queue |
| |
| The implementation of `std::deque` varies considerably which makes it hard to |
| reason about. All implementations use a sequence of data blocks referenced by |
| an array of pointers. The standard guarantees random access, amortized |
| constant operations at the ends, and linear mutations in the middle. |
| |
| In Microsoft's implementation, each block is the smaller of 16 bytes or the |
| size of the contained element. This means in practice that every expansion of |
| the deque of non-trivial classes requires a heap allocation. libc++ (on Android |
| and Mac) uses 4K blocks which eliminates the problem of many heap allocations, |
| but generally wastes a large amount of space (an Android analysis revealed more |
| than 2.5MB wasted space from deque alone, resulting in some optimizations). |
| libstdc++ uses an intermediate-size 512-byte buffer. |
| |
| Microsoft's implementation never shrinks the deque capacity, so the capacity |
| will always be the maximum number of elements ever contained. libstdc++ |
| deallocates blocks as they are freed. libc++ keeps up to two empty blocks. |
| |
| ### base::circular_deque and base::queue |
| |
| A deque implemented as a circular buffer in an array. The underlying array will |
| grow like a `std::vector` while the beginning and end of the deque will move |
| around. The items will wrap around the underlying buffer so the storage will |
| not be contiguous, but fast random access iterators are still possible. |
| |
| When the underlying buffer is filled, it will be reallocated and the constents |
| moved (like a `std::vector`). The underlying buffer will be shrunk if there is |
| too much wasted space (_unlike_ a `std::vector`). As a result, iterators are |
| not stable across mutations. |
| |
| ## Stack |
| |
| `std::stack` is like `std::queue` in that it is a wrapper around an underlying |
| container. The default container is `std::deque` so everything from the deque |
| section applies. |
| |
| Chromium provides `base/containers/stack.h` which defines `base::stack` that |
| should be used in preference to `std::stack`. This changes the underlying |
| container to `base::circular_deque`. The result will be very similar to |
| manually specifying a `std::vector` for the underlying implementation except |
| that the storage will shrink when it gets too empty (vector will never |
| reallocate to a smaller size). |
| |
| Watch out: with some stack usage patterns it's easy to depend on unstable |
| behavior: |
| |
| ```cpp |
| base::stack<Foo> stack; |
| for (...) { |
| Foo& current = stack.top(); |
| DoStuff(); // May call stack.push(), say if writing a parser. |
| current.done = true; // Current may reference deleted item! |
| } |
| ``` |
| |
| ## Safety |
| |
| Code throughout Chromium, running at any level of privilege, may directly or |
| indirectly depend on these containers. Much calling code implicitly or |
| explicitly assumes that these containers are safe, and won't corrupt memory. |
| Unfortunately, [such assumptions have not always proven |
| true](https://bugs.chromium.org/p/chromium/issues/detail?id=817982). |
| |
| Therefore, we are making an effort to ensure basic safety in these classes so |
| that callers' assumptions are true. In particular, we are adding bounds checks, |
| arithmetic overflow checks, and checks for internal invariants to the base |
| containers where necessary. Here, safety means that the implementation will |
| `CHECK`. |
| |
| As of 8 August 2018, we have added checks to the following classes: |
| |
| - `base::StringPiece` |
| - `base::span` |
| - `base::Optional` |
| - `base::RingBuffer` |
| - `base::small_map` |
| |
| Ultimately, all base containers will have these checks. |
| |
| ### Safety, completeness, and efficiency |
| |
| Safety checks can affect performance at the micro-scale, although they do not |
| always. On a larger scale, if we can have confidence that these fundamental |
| classes and templates are minimally safe, we can sometimes avoid the security |
| requirement to sandbox code that (for example) processes untrustworthy inputs. |
| Sandboxing is a relatively heavyweight response to memory safety problems, and |
| in our experience not all callers can afford to pay it. |
| |
| (However, where affordable, privilege separation and reduction remain Chrome |
| Security Team's first approach to a variety of safety and security problems.) |
| |
| One can also imagine that the safety checks should be passed on to callers who |
| require safety. There are several problems with that approach: |
| |
| - Not all authors of all call sites will always |
| - know when they need safety |
| - remember to write the checks |
| - write the checks correctly |
| - write the checks maximally efficiently, considering |
| - space |
| - time |
| - object code size |
| - These classes typically do not document themselves as being unsafe |
| - Some call sites have their requirements change over time |
| - Code that gets moved from a low-privilege process into a high-privilege |
| process |
| - Code that changes from accepting inputs from only trustworthy sources to |
| accepting inputs from all sources |
| - Putting the checks in every call site results in strictly larger object code |
| than centralizing them in the callee |
| |
| Therefore, the minimal checks that we are adding to these base classes are the |
| most efficient and effective way to achieve the beginning of the safety that we |
| need. (Note that we cannot account for undefined behavior in callers.) |
| |
| ## Appendix |
| |
| ### Code for map code size comparison |
| |
| This just calls insert and query a number of times, with `printf`s that prevent |
| things from being dead-code eliminated. |
| |
| ```cpp |
| TEST(Foo, Bar) { |
| base::small_map<std::map<std::string, Flubber>> foo; |
| foo.insert(std::make_pair("foo", Flubber(8, "bar"))); |
| foo.insert(std::make_pair("bar", Flubber(8, "bar"))); |
| foo.insert(std::make_pair("foo1", Flubber(8, "bar"))); |
| foo.insert(std::make_pair("bar1", Flubber(8, "bar"))); |
| foo.insert(std::make_pair("foo", Flubber(8, "bar"))); |
| foo.insert(std::make_pair("bar", Flubber(8, "bar"))); |
| auto found = foo.find("asdf"); |
| printf("Found is %d\n", (int)(found == foo.end())); |
| found = foo.find("foo"); |
| printf("Found is %d\n", (int)(found == foo.end())); |
| found = foo.find("bar"); |
| printf("Found is %d\n", (int)(found == foo.end())); |
| found = foo.find("asdfhf"); |
| printf("Found is %d\n", (int)(found == foo.end())); |
| found = foo.find("bar1"); |
| printf("Found is %d\n", (int)(found == foo.end())); |
| } |
| ``` |