Comments on: Skip List Indices in AvocadoDB | ArangoDB Blog 2012 https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/ The database for graph and beyond Wed, 26 Jun 2024 09:36:33 +0000 hourly 1 https://wordpress.org/?v=6.7.1 By: martin Schönert https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1112 Tue, 12 Jun 2012 11:09:00 +0000 http://www.arangodb.com/?p=269#comment-1112 @cbe805f7b5beef4c5fa24a04bb0788ec:disqus : thank you for the link to the very interesting article
 “Fast Intersection of Sorted Lists Using SSE Instructions”.

Unfortunately the title is a slight misnomer, it should be
“Fast Intersection of Sorted Arrays Using SSE Instructions”.
That is, the algorithm described really only works on arrays
(sometimes also called vectors).

It does not work on linked lists (basically that is because the
SSE instructions work on arrays/vectors and not on linked
lists).  So it does not really work on the skip-lists (which are
basically multiple linked lists with different strides).

]]>
By: Xdevelopers https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1111 Sat, 09 Jun 2012 03:24:00 +0000 http://www.arangodb.com/?p=269#comment-1111 FYI https://highlyscalable.wordpress.com/2012/06/05/fast-intersection-sorted-lists-sse/
Fast Intersection of Sorted Lists Using SSE Instructions

]]>
By: Anders https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1110 Sun, 29 Apr 2012 08:45:00 +0000 http://www.arangodb.com/?p=269#comment-1110 Why not use a ‘hits’ counter and ‘watermark’ levels to determine the skiplist nodes?
That way frequent occuring searches would be much faster.

]]>
By: martin Schönert https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1109 Thu, 12 Apr 2012 19:43:00 +0000 http://www.arangodb.com/?p=269#comment-1109 I wrote the following comment yesterday but only got around to post it here now.  I am aware that you came to the realization that doubly-linked lists will not improve the speed yourself.  Still, writing this comment was enough effort so that I do not want to throw it away 😉

Skiplists are not going to give faster search times when you use doubly linked lists.

I think the easiest way to see this is to realise that skiplists are basically multiple linked lists and they behave a lot like ordinary ordered linked lists.  And just like ordinary linked lists do not gain in search time from the backward link, neither do skiplists.

If you are willing to use more memory to decrease the search time (the average search time) with

skiplists, you can simply use more pointers on the higher layers.  So instead of adding another link

on the next layer with 50% probability (so you have about 1/2 as many links on each layer than on

the layer below) you can add another link on the next layer with e.g. 60% probability.  There is a limit beyond which speed will actually become worse again (because you start to spend more time walking the layers downward than you save from not having to scan the nodes on each layer rightward).  Actually on a perfectly balanced skiplist (where the nodes on every layer are equidistant), probability 50% is optimal.  But on a skiplist that is not perfectly balanced a slightly higher probability is a little bit better (my handwaving argument is that you loose more on

skips that are longer than average than you win on the skips that are shorter than average).

The backward links can still be useful on a plain linked list.  Namely if you may receiver a pointer to a node from some other part of your program and need to remove that node from the linked list. Than a backward link will allow you to remove that node in constant time from the linked list. Without the backward link you will need to search for the node, which will take O(N) time.  On a skiplist the same holds, i.e. backward links will allow you to remove random access nodes in constant time.  But since searching for the node only takes O(log(N)) time, the gain is smaller.

Hope this helps to answer the question. If not please feel free to ask further questions.

]]>
By: NoSQLFreak https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1108 Wed, 11 Apr 2012 22:05:00 +0000 http://www.arangodb.com/?p=269#comment-1108 Scratch what I asked, just put some thought to it and realised you wont achieve anything by having a doubly-linked list.
I was thinking more on keeping the previous pointer as you traverse the list so as to narrow the search but even that wont help a there is no relationship into which node to start from when switching to the finer resolution list.

]]>
By: NoSQLFreak https://arangodb.com/2012/04/skip-list-indices-in-avocadodb/#comment-1107 Wed, 11 Apr 2012 14:29:00 +0000 http://www.arangodb.com/?p=269#comment-1107 Would the extra effort in keeping a doubly-linked list give faster search results?
Maybe the extra ram required is too much a price for the marginal speed gain?

]]>