MapReduce "not_found" from Solr search index

Magnus Kessler mkessler at basho.com
Tue Nov 17 09:09:11 EST 2015


On 16 November 2015 at 11:47, Ellis Pritchard <ellis.pritchard at ft.com>
wrote:

> Hi,
>
> I've configured a Solr search index ("erights-users") for my bucket (named
> "missing", default type), containing a bunch of JSON documents, with a
> search schema ("erightsuser"); this seems to be working OK for simple
> queries, i.e. I can run a Solr query against it and it returns expected
> results:
>
> $ curl http://localhost:8098/types/default/buckets/missing/props
>
>
> {"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dvv_enabled":false,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"missing","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"erights-users","small_vclock":50,"w":"quorum","write_once":false,"young_vclock":20}}
>
>
> $ curl http://localhost:8098/search/query/erights-users?q=country.code:ANT
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <response><lst name="responseHeader"><int name="status">0</int><int
> name="QTime">58</int><lst name="params"><str
> name="q">country.code:ANT</str><str name="shards">
> 127.0.0.1:8093/internal_solr/erights-users</str><str name="127.0.0.1:8093">_yz_pn:64
> OR (_yz_pn:61 AND (_yz_fpn:61)) OR _yz_pn:60 OR _yz_pn:57 OR _yz_pn:54 OR
> _yz_pn:51 OR _yz_pn:48 OR _yz_pn:45 OR _yz_pn:42 OR _yz_pn:39 OR _yz_pn:36
> OR _yz_pn:33 OR _yz_pn:30 OR _yz_pn:27 OR _yz_pn:24 OR _yz_pn:21 OR
> _yz_pn:18 OR _yz_pn:15 OR _yz_pn:12 OR _yz_pn:9 OR _yz_pn:6 OR
> _yz_pn:3</str></lst></lst><result name="response" numFound="20" start="0"
> maxScore="12.103038"><doc><str name="position.code">PR</str><str
> name="country.code">ANT</str><str name="responsibility.code">FIN</str><str
> name="industry.code">ENC</str><str
> name="contactAddress.country.code">ANT</str><str name="email">xxx at xxx.com</str><str
> name="gid">10783b99-9483-414d-a6f8-eb330ff6dfac</str><str
> name="userId">10422205</str><str
> name="_yz_id">1*default*missing*10783b99-9483-414d-a6f8-eb330ff6dfac*51</str><str
> name="_yz_rk">10783b99-9483-414d-a6f8-eb330ff6dfac</str><str
> name="_yz_rt">default</str><str name="_yz_rb">missing</str></doc> ...
>
>
> However, I'm trying to do a simple MapReduce on it, initially to count the
> documents (following the example in the 2.1.1 riakdocs) and I always seem
> to get 0 as a result:
>
> $ curl -XPOST http://localhost:8098/mapred -H 'Content-Type: application/json' -d '{"inputs":{"module":"yokozuna","function":"mapred_search","arg":["erights-users","country.code:ANT"]},"query":[{"map":{"language":"javascript","keep":false,"source":"function(v) { return [1]; }"}},{"reduce":{"language":"javascript","keep":true,"name":"Riak.reduceSum"}}]}'
>
> [0]
>
>
> If I run with {"keep": true} on the map operation, I get the following:
>
>
> [[{"not_found":{"bucket_type":"default","bucket":"missing","key":"0063aac8-bb45-4051-a502-d541b41d327b","keydata":{}}},...
>
> (NB confusingly, my bucket is called "missing"!).
>
> Doing a GET for the keys that come back "not_found" works fine.
>
>
> What am I missing?
>
>
> Ellis.
>
> (Riak 2.1.1 MacOS X)
>
>
Hi Ellis,

You don't mention why your use case requires MapReduce, but to simply
obtain the number of indexed objects there's a much easier way, using only
Solr query features:

    curl -s "http://localhost:8098/search/query/erights-users?wt=json&q=country.code:ANT&rows=0" | python -mjson.tool | grep numFound

The above asks Solr to return its results as JSON ('wt=json') and, with
'rows=0', requests no actual documents, just the response metadata, which
includes the match count. The remainder of the line uses the Python
'json.tool' module to pretty-print the response, and grep picks out the
'numFound' line.
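
If you would rather read the count in a script than grep for it, the same
request can be sketched in Python using only the standard library. This is
a rough illustration, not a tested client: the helper names
(build_count_url, extract_num_found, count_matches) are made up for this
example, and it assumes the node answers on localhost:8098 as in your
commands and that the JSON reply carries the count under
response.numFound, as Solr's JSON writer normally does.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_url(base_url, index, query):
    """Build a search URL asking for JSON output and zero result rows."""
    # urlencode percent-escapes the query, so "country.code:ANT"
    # is sent as "country.code%3AANT".
    params = urlencode({"wt": "json", "q": query, "rows": 0})
    return f"{base_url}/search/query/{index}?{params}"


def extract_num_found(body):
    """Return response.numFound from a Solr JSON response body."""
    return json.loads(body)["response"]["numFound"]


def count_matches(base_url, index, query):
    """Query a live node and return the number of matching documents."""
    with urlopen(build_count_url(base_url, index, query)) as resp:
        return extract_num_found(resp.read().decode("utf-8"))


# Example call (needs a running node, so it is left commented out):
# count_matches("http://localhost:8098", "erights-users", "country.code:ANT")
```

Keeping the URL building and the JSON parsing in separate functions means
both can be checked without a running node.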

The various Riak clients also offer APIs to obtain search results and may
make it easier to extract the desired information.

Please let me know if this helped.

Regards,

Magnus


-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431