Number of nodes and links

Preston Marshall preston at synergyeoc.com
Sat May 15 18:17:41 EDT 2010


See some answers below.
On May 15, 2010, at 5:05 PM, Chris Hicks wrote:

> Where some of the B level objects (each level separated into different buckets) are linked to each other but in very limited numbers. In some of these areas there will be a ton of rapid updates and in others much rarer updates. Since a decent portion of my data modifications involve nothing more than shifting which C level object a D level object is associated with, is there anything I should keep in mind when planning for a lot of link-changing operations? For example, though for some reason I can't find it now, I remember reading that the number of links one could have per object was something like 170K links, is that correct? I understand performance would degrade quite a bit when one has that amount of data for a single object and my project won't call for anywhere near that, but I just want to understand the nuances of the whole process.
> 
I won't address performance issues, as I don't know enough about Riak to do so.  The one thing that you need to keep in mind is that Riak doesn't do transactions.  You can easily spend days debugging an issue that turns out to be a race condition.  A simple example is a shopping cart.  Let's assume your software sucks and it takes 10 seconds to add something to a shopping cart before saving to the database, but since you use AJAX, many requests can be triggered quickly.  If a customer adds more than one item to their cart within 10 seconds, whichever request finishes LAST will win, overwriting the other additions.  This happens because when all of the requests were made, there was nothing in the cart, so each request's internal representation of the document reflects an empty cart.  When they all go to save back, each one transmits the entire document, and it replaces the previous version.  There is no way for Riak to magically merge these together, BUT it does give you the ability to handle the situation.  To handle this, you need to enable allow_mult=true on the bucket, and be sure you are sending vector clocks with each request.  You can then write a JS function that merges the different versions together as you wish, or just throws out all but the latest.  I'm not sure where this function goes, as I've never done this before.
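
To make the race concrete, here is a rough sketch in Python.  It is a simulation only: a plain dict stands in for the database, no Riak client calls are made, and the names last_write_wins and merge_siblings are made up for illustration.  The first function reproduces the lost update; the second shows one possible merge (a union of the conflicting carts) of the kind you could apply once conflicting versions are kept instead of overwritten.

```python
# Simulation only: a dict stands in for the database. Nothing here is Riak API.

def last_write_wins(items):
    """Each request reads the (empty) cart, adds one item, then writes the whole cart back."""
    store = {"cart": []}
    # Every request takes its snapshot before any of them has saved.
    snapshots = [list(store["cart"]) for _ in items]
    for snapshot, item in zip(snapshots, items):
        # The entire document is written back, replacing the previous version.
        store["cart"] = snapshot + [item]
    return store["cart"]


def merge_siblings(siblings):
    """Hypothetical merge: union of all conflicting carts, keeping first-seen order."""
    merged = []
    for cart in siblings:
        for item in cart:
            if item not in merged:
                merged.append(item)
    return merged


if __name__ == "__main__":
    items = ["book", "lamp", "mug"]
    print(last_write_wins(items))                 # only the last save survives
    print(merge_siblings([[i] for i in items]))   # all three additions are kept
```

Note that a plain union is only one choice.  It keeps every addition, but it cannot tell that an item was deliberately removed in one version, so a removed item would reappear after the merge.
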
> Also, sort of related, I plan on running this whole thing on a single dedicated server machine (unless I get major usage and get the money to upgrade) that will have multiple CPUs. Should I just operate one physical node on that machine or should I match the number of CPUs with the number of nodes, essentially dedicating each physical CPU to handling a hardware node (if that is possible)? What would the pros and cons be to a single or multi-node system on that sort of hardware?
> 
Erlang should take advantage of all of your processors as long as SMP is enabled (in the Erlang VM).  However, it might be a good idea to run multiple nodes so you can kill a node to test the N, R, and W parameters and see how Riak merges your data back together.
