Indexing of intermediate nested fields in Riak search

Rusty Klophaus rusty at basho.com
Tue Nov 1 12:10:46 EDT 2011


Hi Elias,

Yes, your workaround should work: you should be able to index and query on
subobjects that share the same field names. I've included two examples below
that work for me locally. Please take a look; they may help uncover what's
going wrong.

Best,
Rusty

bin/search-cmd install mybucket

# Example 1

curl -v -X PUT \
-H "Content-Type: application/json" \
-d @- \
http://127.0.0.1:8098/riak/mybucket/mykey1 \
<<EOF
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
EOF

# All three of these queries should return mybucket/mykey1
bin/search-cmd search mybucket "menu_value:File"
bin/search-cmd search mybucket "menu_popup_menuitem_value:New"
bin/search-cmd search mybucket "menu_popup_menuitem_value:Open"


# Example 2

bin/search-cmd install mybucket

curl -v -X PUT \
-H "Content-Type: application/json" \
-d @- \
http://127.0.0.1:8098/riak/mybucket/mykey2 \
<<EOF
{"foo": [ { "bar" : "baz" }, {"bar":"yyy"} ] }
EOF

# Both of these queries should return mybucket/mykey2
bin/search-cmd search mybucket "foo_bar:baz"
bin/search-cmd search mybucket "foo_bar:yyy"
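
# Sketch: how the field names above are derived

The field names used in the queries above come from joining nested JSON keys with underscores and emitting one entry per array element. Below is a rough Python sketch of that flattening, for illustration only. It is not the actual Erlang extractor, and the function name flatten_json is hypothetical; it assumes the behaviour is as described by the extractor's test expectations quoted further down.

```python
import json


def flatten_json(value, prefix=""):
    """Return (field_name, value) pairs for a decoded JSON value.

    Assumed behaviour (illustrative, not the real extractor):
    - object keys are appended to the prefix with an underscore
    - array elements reuse the parent prefix, so repeated keys
      yield repeated field names
    """
    if isinstance(value, dict):
        pairs = []
        for key, child in value.items():
            name = f"{prefix}_{key}" if prefix else key
            pairs.extend(flatten_json(child, name))
        return pairs
    if isinstance(value, list):
        pairs = []
        for item in value:
            pairs.extend(flatten_json(item, prefix))
        return pairs
    # Scalar leaf: emit it under the accumulated field name.
    return [(prefix, value)]


doc = json.loads('{"foo": [ { "bar" : "baz" }, {"bar":"yyy"} ] }')
fields = flatten_json(doc)
print(fields)
```

Under these assumptions, the Example 2 document yields two foo_bar entries, one per array element, which is why both foo_bar:baz and foo_bar:yyy are expected to match.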



On Mon, Oct 31, 2011 at 5:44 PM, Elias Levy <fearsome.lucidity at gmail.com> wrote:

> Any ideas on this? Should indexing for sub-objects in an array with the
> same field names in a JSON document work?
>
>
>
> On Sun, Oct 30, 2011 at 6:56 AM, Elias Levy <fearsome.lucidity at gmail.com> wrote:
>
>> On Sat, Oct 29, 2011 at 9:59 PM, Elias Levy <fearsome.lucidity at gmail.com> wrote:
>>
>>> I am wondering if Riak search can index intermediate nested fields.
>>>  When indexing json data through the KV precommit hook, the underscore is
>>> understood in the schema as indicating nesting.  Thus, foo_bar will index
>>> the value "bah" of field "bar" in the json document { "foo" : { "bar" :
>>> "bah" } }.
>>>
>>> What I'd like to know is if it can instead index the key "bar" in the
>>> same json document.  In my current use case I want to be able to find
>>> documents with certain values for "bar" for these types of documents.
>>>
>>> Can this be done by simply indexing the field "foo"?  Does search know
>>> to index all keys in "foo" if foo is a hash, or all its values if it is an
>>> array?
>>>
>>
>> My testing on 1.0.0 shows that this appears not to work.  Looking at the
>> source for the search kv extractor gives the impression that a workaround
>> would be to instead store { "foo" : [ { "x": "bar", "y": "bah"}, { "x":
>> "woo", "y": "zoo" }, ... ]} and index "foo_x" to be able to search for
>> "bar" and "woo".  I.e. it appears the extractor will index each subdocument
>> in the array.
>>
>> At least that is what the json_text() function's tests imply:
>>
>> {<<"
>> {\"menu\": {
>>   \"id\": \"file\",
>>   \"value\": \"File\",
>>   \"popup\": {
>>     \"menuitem\": [
>>       {\"value\": \"New\", \"onclick\": \"CreateNewDoc()\"},
>>       {\"value\": \"Open\", \"onclick\": \"OpenDoc()\"},
>>       {\"value\": \"Close\", \"onclick\": \"CloseDoc()\"}
>>     ]
>>   }
>> }}">>,
>>               [{<<"menu_id">>, <<"file">>},
>>                {<<"menu_value">>, <<"File">>},
>>                {<<"menu_popup_menuitem_value">>, <<"New">>},
>>                {<<"menu_popup_menuitem_onclick">>, <<"CreateNewDoc()">>},
>>                {<<"menu_popup_menuitem_value">>, <<"Open">>},
>>                {<<"menu_popup_menuitem_onclick">>, <<"OpenDoc()">>},
>>                {<<"menu_popup_menuitem_value">>, <<"Close">>},
>>                {<<"menu_popup_menuitem_onclick">>, <<"CloseDoc()">>}]},
>>              %% From http://www.ibm.com/developerworks/library/x-atom2json.html
>>
>>
>> The implication of the above code is that you can search for
>> menu_popup_menuitem_value:New or menu_popup_menuitem_value:Open and find
>> the doc.  But my testing shows this not to work.  If any of the documents
>> in the array have the same fields, those fields will not be indexed.
>>
>> E.g. if I set my schema to index foo_bar and insert {"foo": [ { "bar" :
>> "baz" }, {"xxx":"yyy"} ] }, I can search for foo_bar:baz and receive a
>> match.  If I instead insert '{"foo": [ { "bar" : "baz" }, {"bar":"yyy"} ]
>> }' and search for foo_bar:baz I receive no match.
>>
>> Is this expected behavior or a bug?
>>
>> Elias
>>
>>
>>
>
>


-- 
Rusty Klophaus (@rustyio)
*Basho Technologies, Inc.*
www.basho.com