best practices on data storage?

Antonio Rohman Fernandez rohman at
Thu Jul 28 02:47:37 EDT 2011


I also thought about that... but then the "User" object could become
really big... imagine I post 10 statuses every day and thousands of
friends comment on them all the time... also, wouldn't it be troublesome
to update the "User" object when friends comment on statuses? You would
have to retrieve the data to insert the new comment nested inside, and if
several friends comment at the same time I can see data getting lost along
the way... or am I missing something?
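To make that lost-update worry concrete, here is a minimal Python sketch of the race: plain dicts stand in for Riak objects, and all names are made up. Two clients read the same user document, each nests a new comment, and then write back; with naive last-write-wins, one comment disappears:

```python
import copy

# Toy in-memory "store": one key holding a user document with nested
# statuses/comments, standing in for a Riak object. All names hypothetical.
store = {"rohman": {"statuses": [{"id": 1, "text": "hello", "comments": []}]}}

# Two clients each read the current value...
client_a = copy.deepcopy(store["rohman"])
client_b = copy.deepcopy(store["rohman"])

# ...and each appends a comment to their own copy.
client_a["statuses"][0]["comments"].append("nice! -- friend A")
client_b["statuses"][0]["comments"].append("congrats! -- friend B")

# Naive last-write-wins: B writes back first, then A overwrites it.
store["rohman"] = client_b
store["rohman"] = client_a

print(store["rohman"]["statuses"][0]["comments"])
# -> ['nice! -- friend A']  (friend B's comment is lost)
```

Riak itself can keep both concurrent writes as siblings (with `allow_mult` enabled on the bucket) so the application can merge them, but the application then has to do that merge work for every nested update.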


On Wed, 27 Jul 2011 23:40:30 -0700, Sylvain Niles wrote:

> Hi Rohman, the conversation yesterday got us thinking, and Basho
> confirmed that buckets are a form of key prefix. So no matter how small
> the bucket, a MapReduce will traverse the whole key space. We sat down
> and thought about how to structure our data differently, as we have a
> use case similar to yours, and decided on nested docs using Ripple. In
> our case we had special buckets for each user like you describe below.
> Now that bucket is a nested JSON struct inside the user object instead
> of a separate bucket. In your use case you could have all statuses as a
> nested struct on your user object, and display would be a matter of
> link-walking all of a user's friends and parsing status content with
> some time-based sorting.
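To illustrate the shape Sylvain is describing, here is a hedged Python sketch (field names are invented, plain dicts stand in for the JSON documents): each user document carries its statuses with comments nested inside, and building a feed is just "collect friends' nested statuses, sort by time" — the friend lookup standing in for Riak's link-walking step:

```python
# Hypothetical nested user documents: statuses (and their comments) live
# inside the user object instead of in separate buckets.
users = {
    "rohman": {
        "friends": ["fyodor"],  # keys you would link-walk in Riak
        "statuses": [
            {"ts": 100, "text": "first post",
             "comments": [{"by": "fyodor", "text": "welcome!"}]},
            {"ts": 300, "text": "third post", "comments": []},
        ],
    },
    "fyodor": {
        "friends": ["rohman"],
        "statuses": [{"ts": 200, "text": "hi all", "comments": []}],
    },
}

def feed_for(user_key):
    """Collect the user's own and their friends' nested statuses, newest first."""
    keys = [user_key] + users[user_key]["friends"]
    statuses = [s for k in keys for s in users[k]["statuses"]]
    return sorted(statuses, key=lambda s: s["ts"], reverse=True)

print([s["text"] for s in feed_for("rohman")])
# -> ['third post', 'hi all', 'first post']
```

The trade-off is the one raised above: reads become cheap single-object fetches, but every comment is a read-modify-write of the whole user document.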
> On Wed, Jul 27, 2011 at 11:26 PM, Antonio Rohman Fernandez wrote:
>> Yesterday, somebody suggested that by distributing the data across
>> smaller buckets, Riak's MapReduce operations would be faster... while
>> nobody at Basho has confirmed that yet, I'm now wondering what the best
>> way to store data is... let's imagine this simple exercise:
>> 1. We have the entities users, friends, statuses and comments in a web app
>> 2. Users can make friends with other users
>> 3. Users can post statuses
>> 4. Friends (users) can comment on users' statuses
>> At first I thought of having a bucket called "users" with all users,
>> and then for friend linkage I was thinking of having per-user buckets
>> like "rohman_friends", "fyodor_friends", etc., holding the keys of the
>> users, instead of one big "friends" bucket, for easy querying... but it
>> seems I'm wrong... so...
>> How would you distribute the data across buckets? And how would you
>> run the MapReduce jobs? Would you use a supporting SQL database to
>> store the relationships between keys, or is it possible with Riak only?
>> thanks
>> Rohman 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at



Antonio Rohman Fernandez
CEO, Founder & Lead Engineer
rohman at



More information about the riak-users mailing list