Riak newbie, need to know if I can switch from Cassandra for a messageQueue For Erlang

Eric Moritz eric at themoritzfamily.com
Fri May 11 10:31:07 EDT 2012


I am guessing that in Cassandra each user's mailbox has a single row key
and each message uses a lexicographic key or timestamp as its column key to
preserve order.

This can be emulated by using a bucket name as the row key, something like
"mailbox-{user-key}" for the bucket and the lexicographic key for the
object key.  So a message's fully qualified URI in Riak would be something
like /buckets/mailbox-ericmoritz/keys/2012-05-01T09:41:00.0324Z-0.0.0.29

The message ID I generated is the message's timestamp as ISO-8601 with a
slugified reference() tacked on the end.  This makes it unique and
lexicographically sortable.
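
A minimal sketch of how such a key could be built in Erlang.  The module
and function names (mailbox_key, bucket/1, new_key/0,2) are made up for
illustration, and I am assuming a fixed-width microsecond timestamp so that
string order matches time order.  A reference is only guaranteed unique on
the node that created it, so treat the uniqueness as "good enough" rather
than absolute.

```erlang
-module(mailbox_key).
-export([bucket/1, new_key/0, new_key/2]).

%% Hypothetical helper: bucket name for one user's mailbox.
bucket(UserKey) when is_binary(UserKey) ->
    <<"mailbox-", UserKey/binary>>.

%% Build a key from the current time and a fresh reference.
new_key() ->
    new_key(os:timestamp(), make_ref()).

%% Now is an erlang:timestamp() tuple {MegaSecs, Secs, MicroSecs}.
new_key({_Mega, _Sec, Micro} = Now, Ref) ->
    {{Y, Mo, D}, {H, Mi, S}} = calendar:now_to_universal_time(Now),
    %% Zero-padded, fixed-width fields keep the keys sortable as strings.
    iolist_to_binary(
      io_lib:format("~4..0B-~2..0B-~2..0BT~2..0B:~2..0B:~2..0B.~6..0BZ-~s",
                    [Y, Mo, D, H, Mi, S, Micro, slugify_ref(Ref)])).

%% ref_to_list/1 gives something like "#Ref<0.0.0.29>";
%% keep only the digits and dots.
slugify_ref(Ref) ->
    [C || C <- erlang:ref_to_list(Ref),
          (C >= $0 andalso C =< $9) orelse C =:= $.].
```

Calling mailbox_key:new_key() returns a binary shaped like the key in the
URI above, only with six digits of fractional seconds.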

To query the messages, you can add a secondary index for the user key on
the object.  You will have to sort the objects by key at query time.  If
sorting on reads is an issue, you can store an ordset() index under a
different bucket.  You'll have to turn off last write wins, allow siblings
(allow_mult), and do your own conflict resolution, but resolving a set is
as easy as ordsets:union().  This ordset() is going to be much smaller and
easier to manage than keeping a sorted list of messages as a single
document.  Granted, you'll have to pair it with a tombstone set for
removing messages from the index document, and worry about garbage
collecting the deleted items.
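
Roughly, the index document and its conflict resolution could look like the
sketch below.  The module name and the {Added, Removed} tuple layout are my
own assumptions, not anything Riak prescribes.  The siblings would come
from whatever client you use, decoded back into terms before merging.
Garbage collection of old tombstones is left out.

```erlang
-module(mailbox_index).
-export([new/0, add/2, remove/2, merge/1, live_keys/1]).

%% Index document: {Added, Removed}, both ordsets of message keys.
new() ->
    {ordsets:new(), ordsets:new()}.

%% Record a new message key in the index.
add(Key, {Added, Removed}) ->
    {ordsets:add_element(Key, Added), Removed}.

%% Deleting a message only adds a tombstone; nothing is dropped here.
remove(Key, {Added, Removed}) ->
    {Added, ordsets:add_element(Key, Removed)}.

%% Resolve a list of sibling index documents by taking the union
%% of each set.
merge(Siblings) ->
    lists:foldl(fun({A, R}, {AccA, AccR}) ->
                        {ordsets:union(A, AccA), ordsets:union(R, AccR)}
                end, new(), Siblings).

%% Keys still in the mailbox, already sorted: added minus tombstoned.
live_keys({Added, Removed}) ->
    ordsets:subtract(Added, Removed).
```

Because a tombstoned key can never be re-added, this only works when
message keys are never reused, which the timestamp-plus-reference key
gives you.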

This technique may or may not end up performing worse than sorting in the
MapReduce query, so you'll have to benchmark it all yourself.

Since these queues are temporary, have you thought about using a persistent
message queue system?  I have not really explored how to set up a highly
available message queue such as RabbitMQ, but it may be worth exploring for
you.

TL;DR You can do it in Riak, Cassandra may be better, a messaging queue
might be best.

Eric.

On Wed, May 9, 2012 at 4:25 PM, Morgan Segalis <msegalis at gmail.com> wrote:

> Hi Bogunov,
>
> Thank you for your fast answer.
>
> If I understand your thought correctly, for every insert I should retrieve
> the list of messages, append the new message, and then store the list again?
> If so, isn't that expensive? Retrieving a whole list (which can be long if
> the user has not connected for a long time), appending a new message, and
> storing it means two operations just for storing… Or is there
> a way to append data directly to a key?
>
> Best regards,
>
> On May 9, 2012, at 22:16, Bogunov wrote:
>
> Hi, morgan.
>
>
>> - Store from 1 to X messages per registered user.
>
> Store all messages as one key.
>
>> - Get the number of stored messages per user. (may be stored on a variable)
>
> Yes.
>
>> - retrieve all messages from an user at once.
>
> Get one key =)
>
>> - delete all messages from an user at once.
>
> Delete one key.
>
>> - delete all messages that are older than X months no matter the user
>
> You can store an index entry like "written in X, X1 month", find all users
> who have old messages, and truncate them.
>
>> Is it realistic to have a bucket per user ?
>
> A bucket is a prefix. Per bucket you keep your default preferences: R/W/N,
> post/pre-commit hooks, etc. Not much gain in doing so.
>
>
>
> On Wed, May 9, 2012 at 11:51 PM, Morgan Segalis <msegalis at gmail.com> wrote:
>
>> Hi everyone !
>>
>> I have followed with interest the riak evolution.
>>
>> I have a chat server written in Erlang from scratch with my own protocol.
>> Right now I'm using MySQL in order to store Users credentials and friend
>> list.
>>
>> I'm using Cassandra via Thrift to store the messages that an offline user
>> has received, until the user retrieves them.
>>
>> My Cassandra data model is quite simple: one Column Family, where each row
>> is a user and each column is a message (title = timestamp, for getting
>> time-ordered data; value = message).
>>
>> The thing is, despite the fact that I'm happy with Cassandra performance
>> and TimeToLive feature, I would like to avoid the hassle of thrift to
>> update my code.
>>
>> From what I have seen on multiple websites, Riak is the closest thing to
>> Cassandra (but simpler).
>>
>> However, the Riak paradigm seems somewhat different, with buckets, which
>> I'm not yet familiar with.
>>
>> Before getting to know Riak better I would like to have some expert
>> opinion on the matter.
>>
>> I need to do several things :
>>
>> - Store from 1 to X messages per registered user.
>> - Get the number of stored messages per user. (may be stored on a
>> variable)
>> - retrieve all messages from an user at once.
>> - delete all messages from an user at once.
>> - delete all messages that are older than X months no matter the user
>>
>> I would really love your opinion: does Riak fit my needs, and if so,
>> what would the data model be?
>> Is it realistic to have a bucket per user?
>>
>> Best regards,
>>
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
>
> --
> email: bogunov at gmail.com
> skype: i.bogunov
> phone: +7 903 131 8499
> Regards, Bogunov Ilya