Friday 17 March 2017

Hash indexes are faster than Btree indexes?


PostgreSQL have supported Hash Index for a long time, but they are not much used in production mainly because they are not durable.  Now, with the next version of PostgreSQL, they will be durable.  The immediate question is how do they perform as compared to Btree indexes. There is a lot of work done in the coming version to make them faster. There are multiple ways in which we can compare the performance of Hash and Btree indexes, like the time taken for creation of the index, search or insertion in the index.  This blog will mainly focus on the search operation. By definition, hash indexes are O(1) and Btree indexes are O(log n), however with duplicates that is not exactly true.

To start with let us see the impact of work being done to improve the performance of hash indexes. Below is the performance data of the pgbench read-only workload to compare the performance difference of Hash indexes between 9.6 and HEAD on IBM POWER-8 having 24 cores, 192 hardware threads, 492GB RAM.




The workload is such that all the data fits in shared buffers (scale factor is 300 (~4.5GB) and shared_buffers is 8GB).  As we can see from the above graph, that the performance has increased at all client counts in the range of 7% to 81% and the impact is more pronounced at higher client counts. The main work which has led to this improvement is 6d46f478 (Improve hash index bucket split behavior.) and 293e24e5 (Cache hash index's metapage in rel->rd_amcache.).

The first commit 6d46f478 has changed the heavyweight locks (locks that are used for logical database objects to ensure the database ACID properties) to lightweight locks (locks to protect shared data structures) for scanning the bucket pages.  In general, acquiring the heavyweight lock is costlier as compare to lightweight locks.  In addition to reducing the locking cost, this also avoids locking out scans and inserts for the lifetime of the split.

The second commit 293e24e5 avoids a significant amount of contention for accessing metapage. Each search operation needs to access metapage to find the bucket that contains tuple being searched which leads to high contention around metapage.  Each access to metapage needs to further access buffer manager. This work avoids that contention by caching the metapage information in backend local cache which helps bypassing all the buffer manager related work and hence the major contention in accessing the metapage.


The next graph shows how the hash index performs as compared to the btree index.  In this run we have changed hash to btree index in pgbench read-only tests.



We can see here that the hash index performs better than the btree index and the performance difference is in the range of 10 to 22%.  In some other workloads we have seen a better performance like with hash index on varchar columns and even in the community, it has been reported that there is performance improvement in the range of 40-60% when hash indexes are used for unique index columns.


The important thing to note about the above data is that it is only on some of the specific workloads and it mainly covers Selects as that is the main area where performance improvement work has been done for PostgreSQL10.  The other interesting parameters to compare are the size of the index and update on the index which needs more study and experiments.

In the end, I would like to thank my colleagues who were directly involved in this work and my employer EnterpriseDB who has supported this work.  Firstly I would like to thank, Robert Haas who has envisioned all this work and is the committer of this work, and Mithun C Y who was the author of commit 293e24e5.  Also, I would like to extend sincere thanks to all the community members who are involved in this work and especially Jeff Janes and Jesper Pedersen who have reviewed and tested this work.

13 comments:

  1. Just wanted to say I really appreciate all the work that went into this. I've been silently watching the development since it's inception, and it's been no small feat.

    One thing I hadn't really seen discussed before (I may have missed it), is how Hash indexes are for UUID compared to Integer primary keys... Do you know if there has been any benchmarking done on that?

    ReplyDelete
    Replies
    1. Thanks for appreciating the work.

      No, I am not aware of tests for UUID datatype. Theoretically, we should see similar performance improvements for UUID type as for unique integers. If you get chance to run any such benchmark, do share the data of same with me or in PostgreSQL community.

      Delete
    2. More specifically, I'm curious about the performance of UUIDs that are random, and thus cause unpredictable inserts of any value, versus the performance of inserts/selects on integers that are monotonically increasing.

      Delete
    3. Do you have any production workload which has data generation pattern like that? If so, can you share some details on what is the objective of such workload and how exactly it performs the search operation?

      Delete
  2. This article doesn't make sense to me.

    Hash indexes aren't used very much because they are functionally less useful than Btree indexes.

    Crucially, "range scans" cannot be satisfied by a hash index. So you can't use it to optimise the situation

    SELECT somecols FROM tbl WHERE indexedcol BETWEEN v1 AND v2;

    This also includes the use of a prefix, so a single hash index on two colums cannot be used to satisfy a query to match the first one.

    I had no idea that hash indexes were not durable in earlier versions. But they are generally far less useful than Btree because you can't use them for range-scans or prefix queries!

    Maybe you felt this was too obvious to mention, but I think you could catch out a lot of database-design newbies that way.

    ReplyDelete
    Replies
    1. I agree that there are many cases where hash indexes can't be used and that is true by definition of hash indexes. However, they could be used in pointed queries (where clause contains equal to condition). In databases, there are many workloads where we use it that way and on unique columns like columns of type UUID. Now, I think why this article didn't make sense to you is that either you don't have come across such a use-case and more importantly I should have mentioned it that this article is for cases where both hash and btree indexes can be used, but I presumed that one can compare performance of two types of indexes when they get used.

      Delete
    2. We use a hash index on all of our uuid tables in production because nobody will ever use anything on them except equality checking. Our indexes are much smaller. We are on pg 10.

      Delete
  3. I hope hash index can help speedup uuid lookups when used as a primary key because currently i have seen more projects moving this way instead of using a serial, specifically where range comparison dosen make sense. So i hope hash indices will work great with the uuid type.

    ReplyDelete
    Replies
    1. Yes, I think it should, however, I have not personally done any comparison for uuid types.

      Delete
  4. I am using UUID as primary keys as well and also wanted to see more benchmarks on Hash Indexes for UUID. Hash indexes sound great for UUID primery keys, since the only search that makes sense is equality.

    But one thing still missing is the UNIQUE check for the Hash index. Although UUID are pretty much unique, it's good to have a check in case some bug on application code tries to create the same row twice, leading to inconsistencies in the database that are hard to find and debug.

    ReplyDelete
    Replies
    1. I agree with you that UNIQUE is an important functionality for hash indexes that is missing. I don't have time to implement that myself, but if you want to work on it or you know someone who is interested in working on it, then I can help.

      Delete
  5. It is a question of handling hash collisions and collision rate of the hashing algorithm.
    What about using a trie instead? Eg. Judy Trie, that is a highly optimized C implementation of a prefix trie. That is not only very fast but also collision free.

    ReplyDelete
    Replies
    1. I think trie is a different data structure to handle different kind of problems. Hash and Btree indexes in postgres serves the need for which they are built.

      Delete