Proper data filtration and pagination

Eric Redmond eredmond at basho.com
Mon Jan 13 12:39:24 EST 2014


Anton,

Depending on how soon you plan to be in production, this sounds like a good use case for Yokozuna (the new Riak Search), coming in 2.0 (sometime in Q1). It has built-in support for handling semi-structured data like JSON or XML, even nested, and will allow you to query by multiple fields at once. The most recent version also has excellent pagination support.

To toy with it right now, you can find more information at https://github.com/basho/yokozuna
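
As a rough illustration of "query by multiple fields at once" with pagination, here is a minimal Python sketch that only builds the query parameters and sends nothing to a server. It assumes Yokozuna accepts the standard Solr parameters (q, start, rows, sort, wt) and that nested JSON fields are indexed under dotted names such as category.key. Both are assumptions to check against the Yokozuna README for your version. The helper names solr_quote and build_search_params are hypothetical.

```python
from urllib.parse import urlencode


def solr_quote(value):
    """Quote a value as a Solr phrase, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_params(filters, page, page_size, sort=None):
    """Build Solr-style query parameters for one page of filtered results.

    filters maps an indexed field name to the exact value it must match.
    The dotted field names used below are an assumption about how nested
    JSON gets indexed; adjust them to your actual schema.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # AND the per-field clauses together so every filter must match.
    clauses = [f"{field}:{solr_quote(value)}"
               for field, value in sorted(filters.items())]
    params = {
        "q": " AND ".join(clauses) if clauses else "*:*",
        "start": (page - 1) * page_size,  # zero-based offset of the first hit
        "rows": page_size,                # number of hits per page
        "wt": "json",
    }
    if sort:
        params["sort"] = sort
    return params


# Page 3 of products in one category and one sub-category, 20 per page.
params = build_search_params(
    {"category.key": "shoes", "sub_categories.key": "boots"},
    page=3,
    page_size=20,
)
query_string = urlencode(params)
```

The resulting query_string would be appended to whatever search URL your Riak node exposes for the index. Passing an explicit sort helps keep page boundaries stable between requests.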

Eric


On Jan 13, 2014, at 5:46 AM, Anton Yakushin <anton at istylemyself.com> wrote:

> Hi,
> 
> Could anyone advise on how to properly organize a data structure so
> that it can be efficiently filtered and the results paginated?
> 
> A structure example from our current system.
> We have products stored in the following JSON:
> {
>     "name": <string>,
>     "category": {
>         "key" : <string>,
>         "name": <string>
>     },
>     "sub_categories": [
>         {
>             "key" : <string>,
>             "name": <string>
>         }
>     ],
>     ... lots of other attributes: strings, arrays, objects.
> }
> 
> What we need is to be able to fetch products filtered by their
> attributes, with paginated results.
> 
> What we have tried:
> 1. Implemented filtering using key filters. So currently product
> keys look like: "<unique_id>-c:<category_key>-sc:<subcategory_key>..."
> The disadvantages were that the keys could change and that they
> were quite long. It was also hard to add a new attribute to a key.
> 
> 2. MapReduce was used to filter by all attributes that were not in
> product keys. Results from key filters were also passed through
> MapReduce to decrease the number of requests.
> 
> 3. Secondary indexes were used for some attributes. But it's very
> inconvenient that one can't filter by several indexes at a time.
> 
> 4. At first it seemed that Riak Search could solve all these
> problems, but it transforms product objects into a flat structure,
> making them hard to use. Search combined with MapReduce was too slow
> and didn't allow us to paginate the data.
> 
> 5. Pagination with MapReduce was too slow and put exceptional load
> on our servers.
> 
> So the issue that keeps our system from growing and gaining new
> features is: how should the data be reorganized so that it can be
> filtered easily and efficiently, with pagination?
> 
> We would highly appreciate any suggestions. It's a really big issue now,
> so we are even thinking about switching to another database (e.g. MongoDB) if
> we cannot find a suitable solution in Riak.
> 
> Best regards,
> Anton
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
