Closed Bug 1460674 Opened 6 years ago Closed 6 years ago

decrease hash table item storage for 64-bit platforms

Categories

(Core :: XPCOM, enhancement)

Tracking

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: froydnj, Assigned: froydnj)

References

(Blocks 1 open bug)

Details

(Whiteboard: [overhead:87k])

Attachments

(6 files, 3 obsolete files)

Our current PLDHashTable setup requires that all items stored inherit from PLDHashEntryHdr:

struct PLDHashEntryHdr {
  // PLDHashNumber is a uint32_t.
  PLDHashNumber mKeyHash;
};

class MyType : public PLDHashEntryHdr {
  // Data members, etc.
};

PLDHashEntryHdr::mKeyHash is used to cache the computed hash value of the entry, so we aren't rehashing entries on every lookup/add/etc.

Because of structure layout requirements on 64-bit platforms, the data members of MyType will typically start at offset 8:

MyType, offset 0: mKeyHash
MyType, offset 4: padding required by alignment
MyType, offset 8: first data member of MyType
MyType, offset N: ...

The padding at offset 4 is dead, unused space.  For a lot of our hashtables:

https://searchfox.org/mozilla-central/search?q=nsTHashtable%3C&regexp=true&path=

(nsTHashtable is backed by PLDHashTable; it's harder to look at all the PLDHashTables because it's not immediately obvious what type they're storing.  Templated C++ hashtables are more common than PLDHashTables, anyway.)

it's extremely common to have a MyType that looks like the moral equivalent of:

template<typename T>
class MyType : public PLDHashEntryHdr {
  T* mPtr;
};

which means that 25% of MyType's bytes are taken up by padding.  (mPtr can also be a smart pointer type, which is the same size as a pointer.)  That is, 25% of the bytes allocated by PLDHashTable to store MyType entries are taken up by padding.  String keys are also common, the moral equivalent of:

template<typename Char>
class MyType : public PLDHashEntryHdr {
  nsTString<Char> mString;
};

nsTString<Char> is 16 bytes, so we're wasting 4/24 bytes, or ~16%.
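To make the padding concrete, here is a minimal standalone sketch (EntryHdr and PtrEntry are hypothetical stand-ins for PLDHashEntryHdr and a pointer-keyed entry; the exact numbers assume a typical 64-bit ABI):

#include <cstdint>
#include <cstdio>

using PLDHashNumber = uint32_t;

// Stand-ins for PLDHashEntryHdr and a pointer-keyed entry type.
struct EntryHdr {
  PLDHashNumber mKeyHash;  // cached hash: 4 bytes
};

struct PtrEntry : EntryHdr {
  void* mPtr;  // pointer payload: 8 bytes on 64-bit platforms
};

int main() {
  // On a typical 64-bit ABI this prints 16: 4 bytes of hash, 4 bytes of
  // ABI-mandated padding, 8 bytes of pointer, i.e. 25% of the entry is padding.
  std::printf("sizeof(PtrEntry) = %zu\n", sizeof(PtrEntry));
  return 0;
}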

For other templated hashtables:

https://searchfox.org/mozilla-central/search?q=ns..%2BHashtable%3C&case=false&regexp=true&path=

the space waste is less noticeable, but it's probably ~5-10%.

We can do better than this.

The usual way this is handled (Rust does this, for instance) is to allocate one giant block of memory for hashtable storage, but internally store the entries and the cached hashes in separate regions of the block:

[ entry1, entry2, ..., entryN, hash1, hash2, ..., hashN ]

(Putting the hashes after the entries means that you almost certainly don't have to worry about alignment, since the alignment requirements and sizes of the entries mean that the memory after entryN is already correctly aligned for the cached hashes, whereas the reverse is not necessarily true.)
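As a rough sketch of how one block could be carved up this way (the names and layout here are illustrative only, not the eventual PLDHashTable code):

#include <cstdint>

using PLDHashNumber = uint32_t;

// One allocation: N entries first, then N cached hashes.  Keeping the hashes
// last means the entries keep their natural alignment at the start of the
// block, and the 4-byte hashes only need the end of the entries region to be
// 4-byte aligned, which holds for any realistic entry type.
struct SplitStorage {
  char* mBlock;         // single malloc'd block
  uint32_t mEntrySize;  // sizeof(MyType)
  uint32_t mCapacity;   // number of entries/hashes

  char* EntryAt(uint32_t aIndex) const {
    return mBlock + aIndex * mEntrySize;
  }
  PLDHashNumber* HashAt(uint32_t aIndex) const {
    char* hashes = mBlock + mCapacity * mEntrySize;
    return reinterpret_cast<PLDHashNumber*>(hashes) + aIndex;
  }
};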

For PLDHashTable, we'd remove mKeyHash from PLDHashEntryHdr (making everything not inherit from PLDHashEntryHdr would be another way to do this, but it's probably simpler to do that in a separate bug), and extensively modify PLDHashTable to understand this new layout for its internal storage.

Splitting the cached hash from the entry means that when we find a hash that matches, the entry is almost certainly not in the same cache line as the hash itself, so we're going to take a cache miss to examine the entry and see if it matches.  Whether this is noticeable is up for debate, since the tighter packing of cached hashes and entries has its own cache-related benefits.

I am slightly worried about whether making the internal logic for combing through entries + hashes more expensive will be worth it; the hashtable code is fairly sensitive to small details (see PLDHashTable::SearchTable).  We may want to *not* do this change for 32-bit platforms, as they don't need it, though that would significantly complicate the code.

(A really wild idea would be to "butterfly" the entry layouts:

[entry1, hash1, hash2, entry2, entry3, hash3, hash4, ...]

I am not sure that all the necessary logic for handling that can be implemented efficiently, though.)
I too have been thinking about this. I was waiting on the Robin Hood work (bug 1402910) because then the hashes are probed linearly, which means having all the hashes laid out consecutively is really good in terms of cache behaviour.
Nathan, do you have any guess as to how much overhead this could save (say once we've loaded about:blank)?
Flags: needinfo?(nfroyd)
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #2)
> Nathan, do you have any guess as to how much overhead this could save (say
> once we've loaded about:blank)?

I don't.  For the about:blank case, it would depend on the number of hashtable entries we have and the amount of wastage in each one.  That would require a lot of measurement, but let's say we thought we could save 100k and we were wasting 4 bytes/entry.  That works out to 25k hashtable entries, which a) sounds like a lot and b) is probably overly generous, since not all hashtable entries waste that much.

Given that thought experiment, I'd guess 10-20k?  But that's *really* just a guess.
Flags: needinfo?(nfroyd)
Whiteboard: [overhead:>20k]
So, from a quick hack, it looks like the numbers are probably about:

Memory reporting finished. content=0 hash overhead=168,724B pct=15
Memory reporting finished. content=1 hash overhead=35,788B pct=15
Memory reporting finished. content=1 hash overhead=35,788B pct=15
Memory reporting finished. content=1 hash overhead=44,540B pct=16
Memory reporting finished. content=1 hash overhead=49,768B pct=16

These numbers take into account only the extra 4 bytes in PLDHashEntryHdr, and assume all entries have them.
(This is with the patches from bug 1471025, by the way. Prior to that, we had an extra 3,000 pref hashtable entries per process, so an extra 12K or so.)
Okay, let's bump our guess to 50k.
Whiteboard: [overhead:>20k] → [overhead:50k]
(In reply to Nathan Froyd [:froydnj] (away until 16 July 2018) from comment #0)
> (A really wild idea would be to "butterfly" the entry layouts:
> 
> [entry1, hash1, hash2, entry2, entry3, hash3, hash4, ...]
> 
> I am not sure that all the necessary logic for handling that can be
> implemented efficiently, though.)


If we decided to go this route, I think we could probably handle it pretty efficiently with a compromise:

  struct EntryTuple {
    PLDHashNumber mHashes[2];
    Entry mEntries[2];
  };

  EntryTuple& tuple = entries[index >> 1];
  PLDHashNumber hash = tuple.mHashes[index & 1];
  Entry& entry = tuple.mEntries[index & 1];

A particularly clever compiler could probably do it in one SHR operation and then a CF check. What real-world compilers would do, I don't know.
I ran this again using capacity rather than entry count (which is obviously what really matters):

Hash entry content=0 overhead=243,040B pct=15
Hash entry content=1 overhead=86,624B pct=15
Hash entry content=1 overhead=86,112B pct=15
Hash entry content=1 overhead=86,624B pct=15
Whiteboard: [overhead:50k] → [overhead:87k]
This change is satisfying insofar as it removes a load from every
iteration step over the hashtable, but it doesn't seem to affect
our collection benchmarks in any way.  The change does not make
iterators any larger, as there is padding at the end of the iterator
structure on both 32- and 64-bit platforms.

This is a small precursor patch that fell out while I was working on the other
hashtable stuff.  Like bug 1506730, it's small, probably doesn't matter much,
but it at least removes instructions, so I think it's a good thing.
Attachment #9025116 - Flags: review?(n.nethercote)
As a first step to laying out the hash table storage differently, we have to
make the internals of PLDHashTable not access PLDHashEntryHdr items directly,
but layer an abstraction on top.  We call this abstraction "slots": a slot
represents storage for the cached hash key and the associated object, and can
produce a PLDHashEntryHdr* on demand.  The rest of code can deal with slots and
mostly not care about the particular way that hash keys and the objects
themselves are stored.

This is not perfect, and I have a couple other commits in my tree that might
clean this up.  But I suspect there will be some changes required in any event,
so posting this to get some feedback.
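Purely for illustration, here is a stripped-down sketch of what such a slot abstraction could look like while the hash still lives inside the entry (EntryHdr stands in for PLDHashEntryHdr; the real Slot in the patch differs in detail):

#include <cstdint>

using PLDHashNumber = uint32_t;

// Stand-in for the real entry header type.
struct EntryHdr {
  PLDHashNumber mKeyHash;
};

// A Slot is the table's view of "one unit of storage": the cached hash plus
// the entry.  While the hash still lives inside the entry, a Slot is just a
// wrapped entry pointer; once the storage is split, KeyHash()/SetKeyHash()
// can be redirected at a separate hashes array without changing callers.
class Slot {
public:
  explicit Slot(EntryHdr* aEntry) : mEntry(aEntry) {}

  EntryHdr* ToEntry() const { return mEntry; }
  PLDHashNumber KeyHash() const { return mEntry->mKeyHash; }
  void SetKeyHash(PLDHashNumber aHash) { mEntry->mKeyHash = aHash; }

private:
  EntryHdr* mEntry;
};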
Attachment #9025117 - Flags: feedback?(n.nethercote)
This patch obviously needs some comments and things describing what's going on, but it's good enough to run benchmarks and pass some basic tests locally.

The idea is that instead of laying out the storage:

+---------+---------+---------+
| object0 | object1 | ...     |
+---------+---------+---------+

where objectN has dead padding space as discussed previously in the bug, we're going to lay things out:

+-------+-------+-------+-------+---------+---------+---------+
| hash0 | hash1 | ..... | hashN | object0 | object1 | ...     |
+-------+-------+-------+-------+---------+---------+---------+

where objectN is the actual C++ object; these changes make PLDHashEntryHdr empty, so (via the empty base optimization) it takes up no space in its subclasses.  EntryStore is the class that knows about most of this; you can change the particulars of the layout by altering bits of EntryStore and the attendant details of Slot.

Through a happy coincidence, because our capacity is always a power-of-two, object0 will always be properly aligned for whatever C++ type we might be using, so we don't have to deal with the particulars of alignment.
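A back-of-the-envelope sketch of that alignment argument (kMinCapacity = 8 is an assumption about the table's minimum size; the exact constant lives in PLDHashTable):

#include <cstddef>
#include <cstdint>

using PLDHashNumber = uint32_t;

// Assumed minimum capacity; the real constant is defined by PLDHashTable.
constexpr uint32_t kMinCapacity = 8;

// With the hashes stored first, the entries start at capacity *
// sizeof(PLDHashNumber) bytes into the block.  Capacity is always a power of
// two >= kMinCapacity, so that offset is a power of two >= 32 and hence a
// multiple of any plausible entry alignment (8 or 16).
constexpr std::size_t EntriesOffset(uint32_t aCapacity) {
  return std::size_t(aCapacity) * sizeof(PLDHashNumber);
}

static_assert(EntriesOffset(kMinCapacity) % alignof(long double) == 0,
              "even the smallest table aligns object0 correctly");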

A happy result of this change, besides making hash tables take up less space, is that the collection benchmark for PLDHashTable is ~10% faster.  Successful and failing lookups account for the bulk of that, being about 20% faster apiece; the insert/remove and iteration bits of the benchmark are basically unchanged.  Workloads that deal heavily in removes might suffer, as we've introduced a divide into the removal path, but it is entirely possible that the increased memory locality makes up for it.  I have ideas about how this might be managed for our C++ hashtable wrappers, but haven't implemented them yet.
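For reference, the divide presumably comes from having to map an entry pointer back to its slot index so the matching cached hash can be located; roughly, with a hypothetical helper (not the patch's exact code):

#include <cstdint>

// Hypothetical helper: recover an entry's slot index so the matching cached
// hash in the separate hashes array can be marked free/removed.  The integer
// division by the entry size is the extra cost on the removal path.
uint32_t SlotIndexOfEntry(const char* aEntriesStart,
                          const char* aEntry,
                          uint32_t aEntrySize) {
  return uint32_t(aEntry - aEntriesStart) / aEntrySize;
}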
Assignee: nobody → nfroyd
Status: NEW → ASSIGNED
Attachment #9025133 - Flags: feedback?(n.nethercote)
Comment on attachment 9025116 [details] [diff] [review]
store the hash table entry size in iterators

Review of attachment 9025116 [details] [diff] [review]:
-----------------------------------------------------------------

Seems fine, though you just removed `mStart`, so you seem to be going back and forth about whether to save space or remove pointer indirections...
Attachment #9025116 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #12)
> Seems fine, though you just removed `mStart`, so you seem to be going back
> and forth about whether to save space or remove pointer indirections...

It doesn't take up any more space, because it fits into the tail padding of the current layout.
Comment on attachment 9025117 [details] [diff] [review]
part 1 - change PLDHashTable internals to work on slots

Review of attachment 9025117 [details] [diff] [review]:
-----------------------------------------------------------------

Overall this seems good.

My main concern is about terminology. AIUI:

- Currently "keyhash + value = entry", though "value" isn't really mentioned explicitly, it's just everything in a subclass of PLDHashEntryHdr.

- With these patches applied, "keyhash + entry = slot", where "entry" is everything in a subclass of PLDHashEntryHdr.

Is that right?

Please make sure that these terms are used consistently. I think they are, though the changes being split across two patches makes it a bit hard to tell, and I found one place below that is imprecise.

(I don't especially like "slot", though I'm having trouble thinking of something better. "KeyhashAndEntry" is descriptive but verbose. Another possibility would be to not have an explicit name/concept/type for the keyhash + entry pair, and just have KeyhashForIndex(), EntryForIndex(), and maybe functions for getting from an entry to a keyhash and vice versa. Now I'm wondering if that would end up making the code less verbose.)

Your comments in the bug include some nice ASCII diagrams. It would be good to have one of those in the code, in order to explain (a) what the layout is and (b) why it's like that. It should also mention how the power-of-two entry capacity guarantees sufficient alignment.

::: xpcom/ds/PLDHashTable.cpp
@@ +448,5 @@
>  //      structure.
>  // Avoiding the need for |aKey| means we can avoid needing a way to map entries
>  // to keys, which means callers can use complex key types more easily.
> +MOZ_ALWAYS_INLINE auto
> +PLDHashTable::FindFreeSlot(PLDHashNumber aKeyHash) const -> Slot

Is the use of `auto ... -> Slot` necessary?

@@ +673,5 @@
> +                                   RawRemove(slot.ToEntry());
> +                                   ShrinkIfAppropriate();
> +                                 },
> +                                 [&]() {
> +                                   /*nothing*/;

// nothing

@@ +759,5 @@
>    , mEntrySize(aOther.mEntrySize)
>  {
>    // No need to change |mChecker| here.
>    aOther.mTable = nullptr;
> +  // We don't really have the concept of a null slot, so leave mLimit/mCurrent.

Why not? It would just contain a null pointer(s).

@@ +783,5 @@
>    if (ChaosMode::isActive(ChaosFeature::HashTableIteration) &&
>        mTable->Capacity() > 0) {
>      // Start iterating at a random entry. It would be even more chaotic to
>      // iterate in fully random order, but that's harder.
> +    uint32_t i = ChaosMode::randomUint32LessThan(mTable->Capacity());

FWIW: supporting chaos mode really slows down iteration :(

::: xpcom/ds/PLDHashTable.h
@@ +225,5 @@
>  class PLDHashTable
>  {
>  private:
> +  // A slot represents a cached hash value and its associated object stored
> +  // in the hash table.

Here's an example of imprecise terminology: please use "entry" instead of "object".

This comment should also make it clearer that the two parts of the slot (once the next patch is applied!) conceptually belong together, but they are separate in memory.

@@ +245,5 @@
> +    Slot(Slot&& aOther)
> +      : mEntry(aOther.mEntry)
> +    {}
> +
> +    Slot& operator=(Slot&& aOther) {

Moz style puts the brace on the next line :(  But a lot of the subsequent methods can be made into one-liners.

@@ +321,5 @@
> +    Slot SlotForIndex(uint32_t aIndex, uint32_t aEntrySize) {
> +      return Slot(Get() + aIndex * aEntrySize);
> +    }
> +
> +    Slot SlotForIndex(uint32_t aIndex, uint32_t aEntrySize) const {

Do you really need both variants of `SlotForIndex()`?

@@ +653,5 @@
>    PLDHashNumber ComputeKeyHash(const void* aKey) const;
>  
>    enum SearchReason { ForSearchOrRemove, ForAdd };
>  
> +  // Use prefixed names to avoid collisions with macro names from including files.

I don't understand this comment. Which names are prefixed?

@@ +655,5 @@
>    enum SearchReason { ForSearchOrRemove, ForAdd };
>  
> +  // Use prefixed names to avoid collisions with macro names from including files.
> +  template <SearchReason Reason, typename PLDSuccess, typename PLDFailure>
> +  auto 

trailing whitespace
Attachment #9025117 - Flags: feedback?(n.nethercote) → feedback+
Comment on attachment 9025133 [details] [diff] [review]
rearrange the internal storage of PLDHashTable

Review of attachment 9025133 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/PLDHashTable.h
@@ +327,5 @@
>  
> +    Slot SlotForIndex(uint32_t aIndex, uint32_t aEntrySize,
> +                      uint32_t aCapacity) const {
> +      char* entries = Entries(aCapacity);
> +      auto* entry = reinterpret_cast<PLDHashEntryHdr*>(entries + aIndex * aEntrySize);

I just use `auto` (no `*`) for casts like these.
Attachment #9025133 - Flags: feedback?(n.nethercote) → feedback+
> It doesn't take up any more space, because it fits into the tail padding of the current layout.

Oh right, reading comprehension fail on my part.
(In reply to Nicholas Nethercote [:njn] from comment #14)
> ::: xpcom/ds/PLDHashTable.cpp
> @@ +448,5 @@
> >  //      structure.
> >  // Avoiding the need for |aKey| means we can avoid needing a way to map entries
> >  // to keys, which means callers can use complex key types more easily.
> > +MOZ_ALWAYS_INLINE auto
> > +PLDHashTable::FindFreeSlot(PLDHashNumber aKeyHash) const -> Slot
> 
> Is the use of `auto ... -> Slot` necessary?

It's not strictly necessary, but using `auto` here enables `Slot` to be looked up in the context of `PLDHashTable`.  If one instead wrote:

MOZ_ALWAYS_INLINE Slot
PLDHashTable::FindFreeSlot(...)

that would not be valid, because `Slot` is not visible at global scope.  You'd have to write PLDHashTable::Slot, which isn't valid until you make `Slot` public in `PLDHashTable`.  Given all that, `auto` seemed like the better choice.
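A tiny standalone illustration of the lookup difference (Table/First are made-up names):

struct Table {
  struct Slot { int mIndex; };
  Slot First();
};

// OK: with a trailing return type, `Slot` is looked up in Table's scope,
// because lookup happens after the compiler has seen `Table::First`.
auto Table::First() -> Slot { return Slot{0}; }

// Would not compile: here `Slot` is looked up at namespace scope, before
// `Table::` has been seen, so the name is not found.
// Slot Table::First() { return Slot{0}; }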

> @@ +759,5 @@
> >    , mEntrySize(aOther.mEntrySize)
> >  {
> >    // No need to change |mChecker| here.
> >    aOther.mTable = nullptr;
> > +  // We don't really have the concept of a null slot, so leave mLimit/mCurrent.
> 
> Why not? It would just contain a null pointer(s).

I guess.  It seemed nice to know that Slot's pointers are always non-null (even if that fact isn't reflected in the code), and the particular values of the Slots in the moved-from iterator don't matter in any event.

> @@ +245,5 @@
> > +    Slot(Slot&& aOther)
> > +      : mEntry(aOther.mEntry)
> > +    {}
> > +
> > +    Slot& operator=(Slot&& aOther) {
> 
> Moz style puts the brace on the next line :(  But a lot of the subsequent
> methods can be made into one-liners.

I can't even remember what the coding style is supposed to be this week.
The only place this is used where the const-ness matters is AddressEntry, and
that use const_casts away the const-ness.  So let's just ditch the method
that attempts to return const pointers.
Attachment #9026540 - Flags: review?(n.nethercote)
This is a cleaned-up version of the patch posted previously.  I think I
addressed all your review comments; the extensive comments describing
the data layout are coming in the next patch.
Attachment #9026541 - Flags: review?(n.nethercote)
Attachment #9025117 - Attachment is obsolete: true
As discussed in the previous commit message, PLDHashTable's storage
wastes space for common entry types.  This commit reorganizes the
storage to store all the hashes packed together, followed by all the
entries, which eliminates said waste.
Attachment #9026542 - Flags: review?(n.nethercote)
Attachment #9025133 - Attachment is obsolete: true
We do this with two changes, both of which complement and reinforce one
another:

1) Inlining the iteration methods, which eliminates a good chunk of
   function call overhead; and

2) Exploiting the structure of the entry store for a more efficient inner
   loop when moving to the next live entry.

Together, the two changes cut the time taken by the iteration portion of the
collections benchmark in half.
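Roughly, the inner loop in part 3 amounts to something like this sketch (simplified; the real iterator also tracks how many live entries remain, so the scan cannot run off the end):

#include <cstdint>

using PLDHashNumber = uint32_t;

// Stored hash values 0 and 1 mark free and removed slots; live slots always
// hold a value >= 2 (PLDHashTable's long-standing convention).
static bool IsLiveHash(PLDHashNumber aHash) { return aHash >= 2; }

// Scan only the packed hashes array, using the power-of-two capacity as a
// mask for wraparound, and compute the entry's address only after a live
// slot is found.  The caller must know at least one live slot remains.
uint32_t AdvanceToNextLive(const PLDHashNumber* aHashes,
                           uint32_t aCapacity,  // always a power of two
                           uint32_t aSlotIndex) {
  uint32_t mask = aCapacity - 1;
  do {
    aSlotIndex = (aSlotIndex + 1) & mask;
  } while (!IsLiveHash(aHashes[aSlotIndex]));
  return aSlotIndex;
}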
Attachment #9026543 - Flags: review?(n.nethercote)
Gah, rebase problems.
Attachment #9026560 - Flags: review?(n.nethercote)
Attachment #9026542 - Attachment is obsolete: true
Attachment #9026542 - Flags: review?(n.nethercote)
Attachment #9026540 - Flags: review?(n.nethercote) → review+
Comment on attachment 9026541 [details] [diff] [review]
part 1 - change PLDHashTable internals to work on slots

Review of attachment 9026541 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/PLDHashTable.cpp
@@ +373,5 @@
>  // these differences are worthwhile. (It's also hot enough that
>  // MOZ_ALWAYS_INLINE makes a significant difference.)
> +template <PLDHashTable::SearchReason Reason, typename Success, typename Failure>
> +MOZ_ALWAYS_INLINE
> +auto

I'd put the `auto` on the same line as the `MOZ_ALWAYS_INLINE`.
Attachment #9026541 - Flags: review?(n.nethercote) → review+
Comment on attachment 9026560 [details] [diff] [review]
part 2 - reorganize PLDHashTable's internal storage

Review of attachment 9026560 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/PLDHashTable.cpp
@@ +126,5 @@
>  
>  static bool
>  SizeOfEntryStore(uint32_t aCapacity, uint32_t aEntrySize, uint32_t* aNbytes)
>  {
> +  uint32_t entrySize = aEntrySize + sizeof(PLDHashNumber);

Call this `slotSize`?

::: xpcom/ds/PLDHashTable.h
@@ +222,5 @@
>  class PLDHashTable
>  {
>  private:
>    // A slot represents a cached hash value and its associated entry stored
> +  // in the hash table.

Mention that they aren't stored together?

@@ +230,2 @@
>        : mEntry(aEntry)
> +      , mEntryHash(aEntryHash)

I think `mKeyHash` (and `aKeyHash`) would be better. We've always used "keyHash" for this value, and it's a better name than "entryHash" because we don't necessarily hash the whole entry.

@@ +278,5 @@
> +  // | entry0 | entry1 | ...     |
> +  // +--------+--------+---------+
> +  //
> +  // where the entries themselves contain the cached hash stored as their
> +  // first member.  PLDHashTable did this for a long time, with entries looking

How attached are you to putting two spaces after the period? (I used to do that, as I was taught in 8th grade typing class. But after years of negative comments I switched to single space a couple of years ago, and now two spaces looks bad to me...)

@@ +283,5 @@
> +  // like:
> +  //
> +  // class PLDHashEntryHdr
> +  // {
> +  //   ...

This `...` should be removed.

@@ +295,5 @@
> +  //
> +  // The problem with this setup is that, depending on the layout of `MyEntry`,
> +  // there may be platform ABI-mandated padding between `mKeyHash` and the
> +  // first member of `MyEntry`.  This ABI-mandated padding is wasted space,
> +  // and was surprisingly common.

", e.g. when MyEntry contains a single pointer on 64-bit platforms."

@@ +718,5 @@
>  //                      entry storage.
>  //  clearEntry          Run dtor on entry.
>  //
>  // Note the reason why initEntry is optional: the default hooks (stubs) clear
> +// entry storage: On successful Add(tbl, key), the returned entry pointer

Pre-existing, but you might as well fix while you're here: two colons in a sentence is weird.
Attachment #9026560 - Flags: review?(n.nethercote) → review+
Comment on attachment 9026543 [details] [diff] [review]
part 3 - make PLDHashTable iteration faster

Review of attachment 9026543 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/PLDHashTable.cpp
@@ +784,5 @@
>    }
>  
>    // Advance to the first live entry, if there is one.
>    if (!Done()) {
> +    if (!IsOnNonLiveEntry()) {

Double negatives are hard to read. I think this is the only call site for this function, so you might as well invert its sense, giving `if (IsOnLiveEntry())`.

::: xpcom/ds/PLDHashTable.h
@@ +633,5 @@
> +      // The idea is that since we are really only iterating through the
> +      // stored hashes and because we know that there are a power-of-two
> +      // number of hashes, we can use masking to implement the wraparound for
> +      // us.  This method does have the downside of needing to recalculate
> +      // where the associated entry is once we've found it, but that seems OK.

I'm surprised this is faster, it feels like a lot of code.

How does the extra inlining affect code size?

@@ +645,5 @@
> +      uint32_t slotIndex = p - hashes;
> +
> +      do {
> +        slotIndex = (slotIndex + 1) & mask;
> +      } while (hashes[slotIndex] < 2);

`< 2` is mysterious. Please use some kind of `!IsLive()` formulation.

@@ +652,5 @@
> +      // and the slot.
> +      auto entries = reinterpret_cast<char*>(&hashes[capacity]);
> +      char* entryPtr = entries + slotIndex * mEntrySize;
> +      auto entry = reinterpret_cast<PLDHashEntryHdr*>(entryPtr);
> +          

Trailing whitespace and extraneous blank lines.
Attachment #9026543 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #24)
> @@ +278,5 @@
> > +  // | entry0 | entry1 | ...     |
> > +  // +--------+--------+---------+
> > +  //
> > +  // where the entries themselves contain the cached hash stored as their
> > +  // first member.  PLDHashTable did this for a long time, with entries looking
> 
> How attached are you to putting two spaces after the period? (I used to do
> that, as I was taught in 8th grade typing class. But after years of negative
> comments I switched to single space a couple of years ago, and now two
> spaces looks bad to me...)

I'm pretty attached: Emacs does sensible things with reflowing comments and movement by sentences when you have two spaces, but not one, and the alternative seems crapped to me.  But pre-existing style in this file looks to be single-spaced, so when in Rome...?

(In reply to Nicholas Nethercote [:njn] from comment #25)
> ::: xpcom/ds/PLDHashTable.h
> @@ +633,5 @@
> > +      // The idea is that since we are really only iterating through the
> > +      // stored hashes and because we know that there are a power-of-two
> > +      // number of hashes, we can use masking to implement the wraparound for
> > +      // us.  This method does have the downside of needing to recalculate
> > +      // where the associated entry is once we've found it, but that seems OK.
> 
> I'm surprised this is faster, it feels like a lot of code.
> 
> How does the extra inlining affect code size?

Ooo, good point.  An opt but not --enable-release build says that this change contributes 80K of code size.  That doesn't seem worth it, so I'll keep the optimization but uninline things, assuming there's still some performance benefit.
I remembered the pretty-printer for nsTHashtable and friends needs to be updated before this lands:

https://searchfox.org/mozilla-central/source/third_party/python/gdbpp/gdbpp/thashtable.py#97

I'll do that next week.
(In reply to Nathan Froyd [:froydnj] from comment #26)
> (In reply to Nicholas Nethercote [:njn] from comment #24)
> > @@ +278,5 @@
> > > +  // | entry0 | entry1 | ...     |
> > > +  // +--------+--------+---------+
> > > +  //
> > > +  // where the entries themselves contain the cached hash stored as their
> > > +  // first member.  PLDHashTable did this for a long time, with entries looking
> > 
> > How attached are you to putting two spaces after the period? (I used to do
> > that, as I was taught in 8th grade typing class. But after years of negative
> > comments I switched to single space a couple of years ago, and now two
> > spaces looks bad to me...)
> 
> I'm pretty attached: Emacs does sensible things with reflowing comments and
> movement by sentences when you have two spaces, but not one, and the
> alternative seems crapped to me.

Oooh, unfortunate typo: that should be "cramped"...
Depends on: 1481577
This depends on the change in bug 1481577.
Attachment #9027569 - Flags: review?(ttromey)
Comment on attachment 9027569 [details] [diff] [review]
part 4 - update gdb pretty-printing to grok the new layout

Review of attachment 9027569 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.  This looks good.
Attachment #9027569 - Flags: review?(ttromey) → review+
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9869912ed6d7
part 0a - store the hash table entry size in iterators; r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/927af324b93d
part 0b - stop trying to be const-correct in Get(); r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/90e48634e1c3
part 1 - change PLDHashTable internals to work on slots; r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c938490f21c
part 2 - reorganize PLDHashTable's internal storage; r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/d94b469a0faa
part 3 - make PLDHashTable iteration faster; r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/79b6eb03c0c9
part 4 - update gdb pretty-printing to grok the new layout; r=tromey