MapReduce "not_found" from Solr search index

Ellis Pritchard ellis.pritchard at ft.com
Tue Nov 17 09:16:08 EST 2015


Thanks, that certainly gets the answer for "how many matches" PDQ.

Still concerns me that MapReduce over search indexes doesn't seem to be
working for me, but if there's no obvious answer, I'll leave it at that
until I need it!

Ellis.
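
For anyone retracing this, the MapReduce request quoted below is easier to read when built programmatically. The following is a minimal sketch, not a fix for the "not_found" results: the helper names `build_mapred_count_job` and `count_not_found` are hypothetical, and the sample entry is copied from the kept map output quoted below.

```python
import json


def build_mapred_count_job(index, query):
    """Build the JSON body for a /mapred job that counts search matches."""
    return {
        "inputs": {
            "module": "yokozuna",
            "function": "mapred_search",
            "arg": [index, query],
        },
        "query": [
            # Map phase: emit 1 per input object (not kept in the final result).
            {"map": {"language": "javascript", "keep": False,
                     "source": "function(v) { return [1]; }"}},
            # Reduce phase: sum the emitted values.
            {"reduce": {"language": "javascript", "keep": True,
                        "name": "Riak.reduceSum"}},
        ],
    }


def count_not_found(map_results):
    """Count entries shaped like {"not_found": {...}} in kept map output."""
    return sum(1 for item in map_results
               if isinstance(item, dict) and "not_found" in item)


job = build_mapred_count_job("erights-users", "country.code:ANT")
print(json.dumps(job))

# One entry taken from the thread's kept map-phase output.
sample_map_results = [
    {"not_found": {"bucket_type": "default", "bucket": "missing",
                   "key": "0063aac8-bb45-4051-a502-d541b41d327b",
                   "keydata": {}}},
]
print(count_not_found(sample_map_results))
```

The printed JSON can be sent as the body of a POST to `http://localhost:8098/mapred` with `Content-Type: application/json`, as in the curl command below.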


On 17 November 2015 at 14:09, Magnus Kessler <mkessler at basho.com> wrote:

> On 16 November 2015 at 11:47, Ellis Pritchard <ellis.pritchard at ft.com>
> wrote:
>
>> Hi,
>>
>> I've configured a Solr search index ("erights-users") for my bucket
>> (named "missing", default type), containing a bunch of JSON documents, with
>> a search schema ("erightsuser"); this seems to be working OK for simple
>> queries, i.e. I can run a Solr query against it and it returns expected
>> results:
>>
>> $ curl http://localhost:8098/types/default/buckets/missing/props
>>
>>
>> {"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dvv_enabled":false,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"missing","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"erights-users","small_vclock":50,"w":"quorum","write_once":false,"young_vclock":20}}
>>
>>
>> $ curl http://localhost:8098/search/query/erights-users?q=country.code:ANT
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <response><lst name="responseHeader"><int name="status">0</int><int
>> name="QTime">58</int><lst name="params"><str
>> name="q">country.code:ANT</str><str name="shards">
>> 127.0.0.1:8093/internal_solr/erights-users</str><str name="127.0.0.1:8093">_yz_pn:64
>> OR (_yz_pn:61 AND (_yz_fpn:61)) OR _yz_pn:60 OR _yz_pn:57 OR _yz_pn:54 OR
>> _yz_pn:51 OR _yz_pn:48 OR _yz_pn:45 OR _yz_pn:42 OR _yz_pn:39 OR _yz_pn:36
>> OR _yz_pn:33 OR _yz_pn:30 OR _yz_pn:27 OR _yz_pn:24 OR _yz_pn:21 OR
>> _yz_pn:18 OR _yz_pn:15 OR _yz_pn:12 OR _yz_pn:9 OR _yz_pn:6 OR
>> _yz_pn:3</str></lst></lst><result name="response" numFound="20" start="0"
>> maxScore="12.103038"><doc><str name="position.code">PR</str><str
>> name="country.code">ANT</str><str name="responsibility.code">FIN</str><str
>> name="industry.code">ENC</str><str
>> name="contactAddress.country.code">ANT</str><str name="email">xxx at xxx.com</str><str
>> name="gid">10783b99-9483-414d-a6f8-eb330ff6dfac</str><str
>> name="userId">10422205</str><str
>> name="_yz_id">1*default*missing*10783b99-9483-414d-a6f8-eb330ff6dfac*51</str><str
>> name="_yz_rk">10783b99-9483-414d-a6f8-eb330ff6dfac</str><str
>> name="_yz_rt">default</str><str name="_yz_rb">missing</str></doc> ...
>>
>>
>> However, I'm trying to do a simple MapReduce on it, initially to count
>> the documents (following the example in the 2.1.1 riakdocs) and I always
>> seem to get 0 as a result:
>>
>> $ curl -XPOST http://localhost:8098/mapred \
>>      -H 'Content-Type: application/json' \
>>      -d '{"inputs":{"module":"yokozuna","function":"mapred_search","arg":["erights-users","country.code:ANT"]},"query":[{"map":{"language":"javascript","keep":false,"source":"function(v) { return [1]; }"}},{"reduce":{"language":"javascript","keep":true,"name":"Riak.reduceSum"}}]}'
>>
>> [0]
>>
>>
>> If I run with {"keep": true} on the map operation, I get the following:
>>
>>
>> [[{"not_found":{"bucket_type":"default","bucket":"missing","key":"0063aac8-bb45-4051-a502-d541b41d327b","keydata":{}}},...
>>
>> (NB confusingly, my bucket is called "missing"!).
>>
>> Doing a GET for the keys that come back "not_found" works fine.
>>
>>
>> What am I missing?
>>
>>
>> Ellis.
>>
>> (Riak 2.1.1 MacOS X)
>>
>>
> Hi Ellis,
>
> You don't mention why your use case requires MapReduce, but to simply
> obtain the number of indexed objects there's a much easier way, using only
> Solr query features:
>
>     curl -s "http://localhost:8098/search/query/erights-users?wt=json&q=country.code:ANT&rows=0" | python -mjson.tool | grep numFound
>
> The above asks Solr to return its results as JSON ('wt=json') and, with
> 'rows=0', requests no actual objects, just the header information. The
> remainder of the line uses the python 'json.tool' module to pretty-print
> the response, and 'grep' picks out the line containing 'numFound'.
>
> The various Riak clients also offer APIs to obtain search results and may
> make it easier to extract the desired information.
>
> Please let me know if this helped.
>
> Regards,
>
> Magnus
>
>
> --
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
>
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>
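
The count query described above can also be issued from a script rather than piped through grep. This is a minimal sketch using only the Python standard library; the helper names `build_count_url` and `extract_num_found` are hypothetical, and the sample body is an illustrative, trimmed Solr JSON response (the count of 20 mirrors the `numFound` in the XML output earlier in the thread), not captured output.

```python
import json
from urllib.parse import urlencode


def build_count_url(base_url, index, query):
    """Build a Riak Search URL asking Solr for JSON output and zero rows."""
    params = urlencode({"wt": "json", "q": query, "rows": 0})
    return f"{base_url}/search/query/{index}?{params}"


def extract_num_found(body):
    """Return response.numFound from a Solr JSON response body."""
    return json.loads(body)["response"]["numFound"]


# Illustrative response body, trimmed to the fields used here (assumption).
sample_body = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 20, "start": 0, "docs": []}}'
)

url = build_count_url("http://localhost:8098", "erights-users",
                      "country.code:ANT")
print(url)
print(extract_num_found(sample_body))
```

Against a live node, the body would come from something like `urllib.request.urlopen(url).read()`; the parsing step stays the same.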
