path: root/doc
Commit message (Author, Age, Files, Lines)
* Update performance logs (Sven Gothel, 2021-01-07, 4 files, -550/+550)
|
* Update performance logs amd64 and arm64-raspi4 (Sven Gothel, 2021-01-06, 4 files, -574/+574)
|
* Update benchmarks .. adding hashset (Sven Gothel, 2021-01-06, 4 files, -372/+756)
|
* Update test_cow_darray_perf01 logs amd64 and arm64-raspi4 (Sven Gothel, 2021-01-04, 2 files, -366/+366)
|
* Update performance logs amd64 and add arm64-raspi4 (Sven Gothel, 2021-01-02, 2 files, -181/+531)
|
* Add: darray, cow_darray, cow_iterator; Adjust cow_vector to cow_darray findings (deprecated); Performance enhancements (Sven Gothel, 2021-01-02, 5 files, -540/+350)
|
| cow_darray, cow_vector design changes / notes:
| - Aligned all typedef names etc. between both implementations for easier review.
| - Removed direct element access, as this would only be valid if Value_type is a std::shared_ptr,
|   assuming the underlying shared storage has been pulled. Use iterator!
| - Introducing immutable const_iterator (a jau::cow_ro_iterator) and mutable iterator (a jau::cow_rw_iterator).
|   Both types hold the underlying iterator and also either the lock-free shared snapshot (const_iterator)
|   or hold the lock and a copy of the storage (iterator).
|   This guarantees an efficient std API aligned operation, keeping alive std::shared_ptr references.
| - Removed for_each_cow: Use std::for_each with our new const_iterator or iterator etc...
|
| Performance changes / notes:
| - cow_darray: Use fixed golden-ratio for grow-factor in push_back(), reducing too many copies.
| - cow_darray::push_back(..): No copy on size < capacity, just push_back into the underlying storage,
|   dramatically reducing copies.
|   Guaranteed correct using cow_darray + darray, as darray increments the end_ iterator
|   after the new element has been added.
| - Always use pre-increment/decrement, avoiding the copy of the post-* variant.
|
| cow_vector fixes (learned from working with cow_darray):
| - reserve(): Only if new_capacity > capacity, then need copy_ctor storage
| - operator=(cow_vector&& x): Hold both cow_vector's write-mutex
| - pop_back(): Only if not empty
|
| +++
|
| Performance seems to fluctuate with the allocator and we might want to resolve this
| with a custom pooled allocator. This is obvious when comparing the 'empty' samples
| with 'reserved', the latter reserving the whole memory of the std::vector and jau::darray upfront.
|
| Performance on arm64-raspi4 jau::cow_vector vs jau::cow_darray:
| - sequential fill and list O(1): cow_vector is ~30 times slower (starting empty)
|   (delta due to cow_darray capacity usage, fewer copies)
| - unique fill and list O(n*n): cow_vector is 2-3 times slower (starting empty)
|   (most time used for equal time dereferencing)
|
| Performance on arm64-raspi4 std::vector vs jau::darray:
| - sequential fill and list iterator O(1): jau::darray is ~0% - 40% slower (50 .. 1000) (starting empty)
|   (we may call this almost equal, probably allocator related)
| - unique fill and list iterator O(n*n): std::vector is ~0% - 23% slower (50 .. 1000) (starting empty)
|   (almost equal, most time used for equal time dereferencing)
|
| +++
|
| Performance on amd64 jau::cow_vector vs jau::cow_darray:
| - sequential fill and list O(1): cow_vector is ~38 times slower (starting empty)
|   (delta due to cow_darray capacity usage, fewer copies)
| - unique fill and list O(n*n): cow_vector is ~2 times slower (starting empty)
|   (most time used for equal time dereferencing)
|
| Performance on amd64 std::vector vs jau::darray:
| - sequential fill and list iterator O(1): jau::darray is ~0% - 20% slower (50 .. 1000) (starting empty)
|   (we may call this almost equal, probably allocator related)
| - unique fill and list iterator O(n*n): std::vector is ~0% - 30% slower (50 .. 1000) (starting empty)
|   (almost equal, most time used for equal time dereferencing)
|
| +++
|
| Memory ratio allocation/size jau::cow_vector vs jau::cow_darray having size:
| - 50: 2 vs 1.1
| - 100: 2 vs 1.44
| - 1000: 2 vs 1.6
| - Hence cow_darray golden-ratio growth factor is more efficient on size + perf.
|
| Memory ratio allocation/size std::vector vs jau::darray having size:
| - 50: 1.28 vs 1.10
| - 100: 1.28 vs 1.44
| - 1000: 1.03 vs 1.60
| - Hence cow_darray golden-ratio growth factor is less efficient on big sizes (but configurable)
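|
| Note on the grow-factor: the commit above credits the golden-ratio grow-factor and the
| "no copy on size < capacity" rule for the reduced number of storage copies. A minimal
| sketch of that idea follows. It is not jau's code: next_capacity, count_growth_copies and
| the constant 1.618 are hypothetical names and values chosen for illustration only.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical constant: a fixed grow-factor close to the golden ratio.
constexpr double golden_ratio_sketch = 1.618;

// Hypothetical helper: next capacity once the storage is full.
// Grows by at least one element so that capacities 0 and 1 still make progress.
std::size_t next_capacity(std::size_t capacity) {
    const std::size_t scaled =
        static_cast<std::size_t>(static_cast<double>(capacity) * golden_ratio_sketch + 0.5);
    return std::max<std::size_t>(capacity + 1, scaled);
}

// Counts how often the storage must be reallocated (and hence copied)
// while appending n elements to initially empty storage.
// An append with size < capacity costs no copy at all.
std::size_t count_growth_copies(std::size_t n) {
    std::size_t capacity = 0;
    std::size_t size = 0;
    std::size_t copies = 0;
    while (size < n) {
        if (size == capacity) {
            capacity = next_capacity(capacity);
            ++copies; // storage is reallocated only here
        }
        ++size;
    }
    return copies;
}
```

| A container that copies its storage on every append would need n copies for n elements,
| while the count returned here grows only logarithmically with n. The measured ratios in
| the commit message come from the real benchmarks, not from this sketch.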
* Add benchmark tests: cowvector and hashset, also using new counting_allocator measuring memory footprint (Sven Gothel, 2020-12-25, 4 files, -0/+540)
|
| On arm64, raspi4:
|
| cow_vector<T> uses ~50% more memory than vector<T>
|
| cow_vector<T> is 9-16 times slower than vector<T> (find)
| - 25 elements: 9x slower
| - 50 elements: 12x slower
| - 100 elements: 15x slower
| - 200 elements: 16x slower
| - 1000 elements: 10x slower
|
| +++
|
| unordered_set<T> uses ~17% more memory than vector<T>
|
| unordered_set<T> performance is <= vector<T> up until 100 elements (find)
| - 25 elements: ~97% slower
| - 50 elements: ~44% slower
|
| unordered_set<T> performance is > vector<T> (find):
| - 100 elements: equal
| - 200 elements: ~38% faster
| - 1000 elements: ~90% faster
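|
| Note on measuring the memory footprint: the commit above mentions a counting_allocator,
| but its interface is not shown in this log. The following is only a sketch of how such a
| measurement could work with a standard-conforming allocator. The names alloc_stats,
| counting_alloc_sketch and allocation_ratio are hypothetical and are not jau's API.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bookkeeping record shared by all copies of the allocator.
struct alloc_stats {
    std::size_t bytes_in_use = 0; // currently allocated bytes
    std::size_t alloc_calls = 0;  // number of allocate() calls
};

// Hypothetical allocator that forwards to std::allocator and counts bytes.
template <class T>
struct counting_alloc_sketch {
    using value_type = T;
    alloc_stats* stats;

    explicit counting_alloc_sketch(alloc_stats* s) noexcept : stats(s) {}

    // Rebinding constructor required by the allocator model.
    template <class U>
    counting_alloc_sketch(const counting_alloc_sketch<U>& o) noexcept : stats(o.stats) {}

    T* allocate(std::size_t n) {
        T* p = std::allocator<T>().allocate(n);
        stats->bytes_in_use += n * sizeof(T);
        ++stats->alloc_calls;
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        stats->bytes_in_use -= n * sizeof(T);
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const counting_alloc_sketch<T>& a, const counting_alloc_sketch<U>& b) noexcept {
    return a.stats == b.stats;
}

template <class T, class U>
bool operator!=(const counting_alloc_sketch<T>& a, const counting_alloc_sketch<U>& b) noexcept {
    return !(a == b);
}

// Fills a std::vector with n ints and returns allocated bytes / used bytes,
// i.e. an allocation/size ratio. Returns 0.0 for n == 0.
double allocation_ratio(std::size_t n) {
    alloc_stats stats;
    counting_alloc_sketch<int> alloc(&stats);
    std::vector<int, counting_alloc_sketch<int>> v(alloc);
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    if (v.empty()) {
        return 0.0;
    }
    return static_cast<double>(stats.bytes_in_use) /
           static_cast<double>(v.size() * sizeof(int));
}
```

| The ratio depends on the standard library's growth policy, so the value returned by
| allocation_ratio will differ between toolchains and is not expected to match the
| figures reported in the commit messages.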