Looking for Riak recommendations for modeling data with N:1 references

Mark Phillips mark at basho.com
Mon Jun 4 17:53:53 EDT 2012


Hi,

Assuming I'm understanding your use case correctly, the first option seems
like your best bet.

Primary key lookups are where Riak really shines, and if you're only
GET-ing one key and then manipulating the associated json-dict with some
app code, things should perform quite well (assuming your objects aren't
huge).
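
For what it's worth, here is a rough sketch of that two-step lookup in Python. It uses a small in-memory class as a stand-in for the buckets, so `Bucket`, `store_widget`, and `fetch_widget` are made-up names for illustration and not the Riak client API. With a real client, each `get`/`put` would be a network round trip.

```python
import json
import uuid


class Bucket:
    """In-memory stand-in for a key/value bucket (hypothetical, not a Riak API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)


def store_widget(refs, widgets, vendor_keys, widget_data):
    """Write the widget under a fresh UUID, then one reference per vendor key."""
    widget_id = str(uuid.uuid1())
    # The widget carries its own vendor keys for later cleanup/cross-reference.
    doc = dict(vendor_keys, widgetData=widget_data)
    # Store the widget first so a reference never points at a missing object.
    widgets.put(widget_id, json.dumps(doc))
    for vendor_key in vendor_keys.values():
        refs.put(vendor_key, widget_id)
    return widget_id


def fetch_widget(refs, widgets, vendor_key):
    """Two primary-key GETs: vendor key -> UUID, then UUID -> widget."""
    widget_id = refs.get(vendor_key)
    if widget_id is None:
        return None
    raw = widgets.get(widget_id)
    return json.loads(raw) if raw is not None else None


refs, widgets = Bucket(), Bucket()
wid = store_widget(
    refs,
    widgets,
    {"vendorAkey": "widget001", "vendorBkey": "bluewidget6", "vendorCkey": "sprocket42"},
    {"color": "blue"},
)
widget = fetch_widget(refs, widgets, "bluewidget6")
```

Any of the three vendor keys resolves to the same widget, and an unknown key simply returns None after the first GET.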

Does that make sense? Or did I misunderstand things?

Mark

On Sun, May 27, 2012 at 5:45 PM, elij <elij.mx at gmail.com> wrote:

> Hello,
>
> I am evaluating Riak for a project, and am looking for some
> recommendations on modeling data for optimal performance. The data is a
> single 'object' (henceforth named 'Widget') that needs to be looked up via
> N possible attributes, and should have a reference to these keys. There
> will be many hundreds of millions of such Widgets (keys will not fit in
> cluster RAM). Currently this data is housed in an RDBMS, but we are looking
> at a few alternatives due to single-node scaling issues, the desire for
> easier operations, and growth to more datacenters.
>
> Consider the following example Widget object:
>
> Widget:
>  vendorAkey: "widget001"
>  vendorBkey: "bluewidget6"
>  vendorCkey: "sprocket42"
>  widgetData: <json blob of data>
>
> My ideas so far are to:
>
> 1) Have 'reference lookups' performed application-side, with a 'widget'
> bucket at the end.
>
> widget001 = 282ec0a1-a842-11e1-83cd-34159e0284ea
> bluewidget6 = 282ec0a1-a842-11e1-83cd-34159e0284ea
> sprocket42 = 282ec0a1-a842-11e1-83cd-34159e0284ea
>
> then finally the 'real widget'
> 282ec0a1-a842-11e1-83cd-34159e0284ea = <Widget json dict>
>
> With the idea being that the application code would fetch the uuid1 value
> by vendor key, and then perform another fetch of the actual widget data
> based on the response of the first (if found). The widget json dict would
> contain the vendor keys as well (for any needed cleanup down the road,
> cross reference, etc).
>
> 2) Use secondary indexes and have each vendor 'key' be a secondary index.
> I heard[1] that secondary indexes are slow though.
>
> 3) Use the layout of the first solution, but with links instead of
> application-side lookups. I hear[2] that links are slow too.
>
> I am leaning towards #1, but would like to hear of any better
> recommendations.
>
> Thanks.
>
> [1]: http://basho.com/blog/technical/2012/05/25/Scaling-Riak-At-Kiip/
> [2]: http://www.infoq.com/presentations/Case-Study-Riak-on-Drugs
>