Why is Riak Search using the leveldb backend?

Ryan Zezeski rzezeski at basho.com
Fri Nov 11 20:57:47 EST 2011


Elias,

This is an implementation detail of Search.  It stores something we call a
"proxy object" under the bucket _rsid_<index name> [1].  It does this so it
knows which entries to remove from the index when an object is
updated/deleted.  To achieve your goal you should be able to set the
buckets `_rsid_bucket1` and `_rsid_bucket2` to use the `bucket1` and
`bucket2` backends, respectively.
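
A minimal sketch of that step, not a definitive procedure. It assumes the
node's HTTP interface listens on 127.0.0.1:8098, that bucket properties can be
set with a PUT to `/riak/<bucket>`, and that the multi backend is selected
through the `backend` bucket property. The helper names are hypothetical and
only for illustration.

```python
import json
import urllib.request

# Assumption: default HTTP listener of a local node; adjust for your cluster.
RIAK_URL = "http://127.0.0.1:8098"


def proxy_bucket(index_name):
    """Return the bucket in which Search stores proxy objects for an index."""
    return "_rsid_" + index_name


def backend_request(index_name, backend, base_url=RIAK_URL):
    """Build (without sending) a PUT that sets the proxy bucket's backend."""
    body = json.dumps({"props": {"backend": backend}}).encode("utf-8")
    return urllib.request.Request(
        url="%s/riak/%s" % (base_url, proxy_bucket(index_name)),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_backend(index_name, backend, base_url=RIAK_URL):
    """Send the request to a live node and return the HTTP status code."""
    with urllib.request.urlopen(backend_request(index_name, backend, base_url)) as resp:
        return resp.status
```

Calling `set_backend("bucket1", "bucket1")` and `set_backend("bucket2", "bucket2")`
against a running node would then point each proxy bucket at the backend of
the bucket it indexes.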

-Ryan

[1]:
https://github.com/basho/riak_search/blob/1.0.1/src/riak_indexed_doc.erl#L314

On Fri, Nov 11, 2011 at 6:02 PM, Elias Levy <fearsome.lucidity at gmail.com> wrote:

> I've set up a brand new cluster, configured two buckets in it, and
> configured search on both.  The cluster is using the multibackend, and I've
> created three instances of the leveldb backend, one each for my two
> buckets, and a third one as a default, just in case.
>
> The config looks something like:
>
> {storage_backend, riak_kv_multi_backend},
> {multi_backend_default, <<"leveldb">>},
> {<<"leveldb">>, riak_kv_eleveldb_backend, [{data_root, "/data/riak/leveldb"}]},
> {<<"bucket1">>, riak_kv_eleveldb_backend, [{data_root, "/data/riak/bucket1"}, other config items ]},
> {<<"bucket2">>, riak_kv_eleveldb_backend, [{data_root, "/data/riak/bucket2"}, other config items ]},
>
> and
>
>  {eleveldb, [ {data_root, "/data/riak/leveldb"} ]},
>
> Both bucket1 and bucket2 have had their bucket properties set to use the
> backend named after them.
>
> The idea behind this is that we wanted to segregate data within different
> buckets in the cluster, as they have different traffic patterns.  It should
> allow us to set leveldb parameters, such as the cache size, that are
> appropriate for each set of data.  We may also want to back them up at
> different schedules.
>
> Having set up this cluster, I loaded a day's worth of data into it.  Now,
> when I look at the data folder, I see data in /data/riak/bucket1,
> /data/riak/bucket2, and /data/riak/merge_index.  That much I expected.
> But I also seem to have data in /data/riak/leveldb.
>
> The client loading the data only inserts into bucket1 and bucket2.
>
> I can look at the leveldb data files under bucket1 and bucket2 and see my
> data.  Running strings over the leveldb data files under
> /data/riak/leveldb shows data that appears related to Riak Search.  E.g.:
>
> X-Riak-VTaga2asadaFaHakaKaraOa9aYaSa2aTaxamaQawaQaDa5awjjl
> indexjjjl
> X-Riak-Last-Modifiedh
> $>jjjh
> riak_idx_docm
> bucket1m
> ?00b1a8ce42a54d81bf46d9bb7a7b4b21_5661725545713369108_1318223203l
> i_ag_tsm
> +00b1a8ce42a54d81bf46d9bb7a7b4b21_1318204800l
> +00b1a8ce42a54d81bf46d9bb7a7b4b21_1318204800k
> i_bg_tsm
> +eacc2a8e434f4498a70aa6ce904efe19_1318222800l
> +eacc2a8e434f4498a70aa6ce904efe19_1318222800k
>
>
> i_ag_ts and i_bg_ts are two of our indexed fields, and those are the
> values being indexed.  So why is Riak Search storing data in the leveldb
> backend?  I thought it only used the merge_index backend.  Is that wrong?
>
> Elias Levy