tag:blogger.com,1999:blog-8673752770146776575.post5260796850149094573..comments2024-01-02T05:04:05.147-08:00Comments on PostgreSQL and Databases in general: Hash indexes are faster than Btree indexes?Amit Kapilahttp://www.blogger.com/profile/01948926447381079550noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-8673752770146776575.post-15783982108097862422022-02-25T10:19:13.187-08:002022-02-25T10:19:13.187-08:00We use a hash index on all of our uuid tables in p...We use a hash index on all of our uuid tables in production because nobody will ever use anything on them except equality checking. Our indexes are much smaller. We are on pg 10.Brandon Millerhttps://www.blogger.com/profile/07671277250070853501noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-88186944399184949102019-05-18T04:40:33.783-07:002019-05-18T04:40:33.783-07:00I think trie is a different data structure to hand...I think trie is a different data structure to handle different kind of problems. Hash and Btree indexes in postgres serves the need for which they are built.Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-46950363100098558502019-05-15T01:45:10.243-07:002019-05-15T01:45:10.243-07:00It is a question of handling hash collisions and c...It is a question of handling hash collisions and collision rate of the hashing algorithm. <br />What about using a trie instead? Eg. Judy Trie, that is a highly optimized C implementation of a prefix trie. That is not only very fast but also collision free.salmonixhttps://www.blogger.com/profile/17208927260824609475noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-29219406512195849032019-05-14T05:54:41.153-07:002019-05-14T05:54:41.153-07:00I agree with you that UNIQUE is an important funct...I agree with you that UNIQUE is an important functionality for hash indexes that is missing. I don't have time to implement that myself, but if you want to work on it or you know someone who is interested in working on it, then I can help.Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-51997881645663830602019-05-09T23:53:23.021-07:002019-05-09T23:53:23.021-07:00I am using UUID as primary keys as well and also w...I am using UUID as primary keys as well and also wanted to see more benchmarks on Hash Indexes for UUID. Hash indexes sound great for UUID primery keys, since the only search that makes sense is equality.<br /><br />But one thing still missing is the UNIQUE check for the Hash index. Although UUID are pretty much unique, it's good to have a check in case some bug on application code tries to create the same row twice, leading to inconsistencies in the database that are hard to find and debug. Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-76093067244482604232017-04-16T22:12:15.318-07:002017-04-16T22:12:15.318-07:00Yes, I think it should, however, I have not person...Yes, I think it should, however, I have not personally done any comparison for uuid types.Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-36279868357359361622017-04-16T18:49:01.034-07:002017-04-16T18:49:01.034-07:00I hope hash index can help speedup uuid lookups wh...I hope hash index can help speedup uuid lookups when used as a primary key because currently i have seen more projects moving this way instead of using a serial, specifically where range comparison dosen make sense. So i hope hash indices will work great with the uuid type.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-84330523540512603402017-04-03T05:30:59.882-07:002017-04-03T05:30:59.882-07:00I agree that there are many cases where hash index...I agree that there are many cases where hash indexes can't be used and that is true by definition of hash indexes. However, they could be used in pointed queries (where clause contains equal to condition). In databases, there are many workloads where we use it that way and on unique columns like columns of type UUID. Now, I think why this article didn't make sense to you is that either you don't have come across such a use-case and more importantly I should have mentioned it that this article is for cases where both hash and btree indexes can be used, but I presumed that one can compare performance of two types of indexes when they get used.Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-9485051166383743472017-04-03T01:57:54.334-07:002017-04-03T01:57:54.334-07:00This article doesn't make sense to me.
Hash i...This article doesn't make sense to me.<br /><br />Hash indexes aren't used very much because they are functionally less useful than Btree indexes. <br /><br />Crucially, "range scans" cannot be satisfied by a hash index. So you can't use it to optimise the situation<br /><br />SELECT somecols FROM tbl WHERE indexedcol BETWEEN v1 AND v2;<br /><br />This also includes the use of a prefix, so a single hash index on two colums cannot be used to satisfy a query to match the first one.<br /><br />I had no idea that hash indexes were not durable in earlier versions. But they are generally far less useful than Btree because you can't use them for range-scans or prefix queries!<br /><br />Maybe you felt this was too obvious to mention, but I think you could catch out a lot of database-design newbies that way.Mark Robsonhttps://www.blogger.com/profile/15864507044869250062noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-49314672708362350782017-03-20T19:14:01.830-07:002017-03-20T19:14:01.830-07:00Do you have any production workload which has data...Do you have any production workload which has data generation pattern like that? If so, can you share some details on what is the objective of such workload and how exactly it performs the search operation?Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-90329661670107816102017-03-20T15:49:24.832-07:002017-03-20T15:49:24.832-07:00More specifically, I'm curious about the perfo...More specifically, I'm curious about the performance of UUIDs that are random, and thus cause unpredictable inserts of any value, versus the performance of inserts/selects on integers that are monotonically increasing.Aaron Zinmanhttp://azinman.comnoreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-33266735991550092672017-03-18T06:05:44.657-07:002017-03-18T06:05:44.657-07:00Thanks for appreciating the work.
No, I am not aw...Thanks for appreciating the work.<br /><br />No, I am not aware of tests for UUID datatype. Theoretically, we should see similar performance improvements for UUID type as for unique integers. If you get chance to run any such benchmark, do share the data of same with me or in PostgreSQL community.Amit Kapilahttps://www.blogger.com/profile/01948926447381079550noreply@blogger.comtag:blogger.com,1999:blog-8673752770146776575.post-34850044005199039372017-03-17T17:11:59.658-07:002017-03-17T17:11:59.658-07:00Just wanted to say I really appreciate all the wor...Just wanted to say I really appreciate all the work that went into this. I've been silently watching the development since it's inception, and it's been no small feat.<br /><br />One thing I hadn't really seen discussed before (I may have missed it), is how Hash indexes are for UUID compared to Integer primary keys... Do you know if there has been any benchmarking done on that?Tostinohttps://www.blogger.com/profile/17330018990492429849noreply@blogger.com