Some questions about Riak Search and Riak itself

Bryan Fink bryan at
Tue Oct 12 13:42:40 EDT 2010

On Tue, Oct 12, 2010 at 3:16 AM, Dmitry Demeshchuk <demeshchuk at> wrote:
> 1. I tried to put some Erlang terms into Riak bucket that is being
> indexed by Riak Search. I hoped that key-value lists like this
> Is there a way to send Erlang proplists into Riak and process them
> using Riak Search?

Hi, Dmitry.  We've filed a bug for doing exactly this:

In the meantime, you could also write your own extractor.  See the
"Other Data Encodings" section of

Or on the wiki:

> 2. Is there a way to query Erlang buckets indexes using any other APIs
> than REST API? The only way to query the bucket I found was
> /solr/some_bucket/select
> and my attempts of using Riak Search shell and Erlang API just failed.

If you could post details about the ways in which your attempts
failed (error messages, etc.), we might be able to help you
troubleshoot them.

The other main way of querying Search indexes is using the map/reduce
Search input.  The "Querying via HTTP/Curl" section has an example of
how to hook this up:

And it's also possible to specify the same map/reduce input using any
of the Erlang clients (native, protocol buffer, or http).  Though
there is a small bug with the non-streaming native Erlang client at
the moment (  For an
example of using that syntax, have a look at the Wriaki project:
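
As a rough sketch of what the map/reduce Search input might look like
from the protocol buffers client: the module name search_mapred_sketch
and its function names are made up for illustration, and the
{modfun, riak_search, mapred_search, [Index, Query]} input form is an
assumption based on the Search docs of this era, so check it against
your version before relying on it.

```erlang
%% Hypothetical helper module; not part of Riak or Riak Search.
-module(search_mapred_sketch).
-export([search_input/2, value_phase/0, run/3]).

%% Build the map/reduce input that asks Search for matching documents.
%% ASSUMPTION: riak_search:mapred_search is the input function, taking
%% the index (bucket) name and the query string, both as binaries.
search_input(Index, Query) when is_binary(Index), is_binary(Query) ->
    {modfun, riak_search, mapred_search, [Index, Query]}.

%% A single map phase that returns each matching object's value.
value_phase() ->
    [{map, {modfun, riak_kv_mapreduce, map_object_value}, none, true}].

%% Run the job over an already connected riakc_pb_socket process.
run(Pid, Index, Query) ->
    riakc_pb_socket:mapred(Pid, search_input(Index, Query), value_phase()).
```

With a node listening on the default protocol buffers port, something
like {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087) followed
by search_mapred_sketch:run(Pid, <<"some_bucket">>, <<"title:riak">>)
would be the intended usage. The query string here is only a
placeholder.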

> 3. Is there a way to write custom analyzers in non-java languages? I
> saw the same question and found an answer that analyzer automatically
> tries to start JVM for its needs. The problem is that we don't have
> good Java and JVM developers so it would be better to use some other
> solutions (like OCaml or even C, for example). Also, I'm kinda
> suspicious about Java analyzers performance.

At the moment, the only non-Java language supported for custom
analyzers is Erlang.  You can specify an Erlang analyzer by adding an
"analyzer_factory" entry to your schema, of the form:

   {analyzer_factory, {erlang, my_module, my_function}}

Other formats for the analyzer_factory setting are:

   {erlang, my_module, my_function, Arguments}
   {java, FullyQualifiedClassNameAsString}
   {java, FullyQualifiedClassNameAsString, Arguments}

The last format is demonstrated in the "Defining a Schema" section of the docs:

Unfortunately, we haven't written much documentation about what an
analyzer is expected to do, but hopefully between the comments in
qilr_analyzer, and the default Erlang analyzer,
text_analyzers:default_analyzer_factory/2, you'll be able to work out
some of what you need.
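
To make the shape of such a module concrete, here is a minimal sketch
of an Erlang analyzer. The calling convention is an assumption, not
something confirmed above: it guesses from the /2 arity of
text_analyzers:default_analyzer_factory that the function receives the
text and an argument list, and that it returns {ok, Tokens} with the
tokens as binaries. Verify both against text_analyzers before using it.

```erlang
%% Hypothetical analyzer matching {erlang, my_module, my_function}.
-module(my_module).
-export([my_function/2]).

%% ASSUMPTION: called as my_function(Text, Args) with Text as a UTF-8
%% binary, and expected to return {ok, [Token]} with binary tokens.
my_function(Text, _Args) when is_binary(Text) ->
    %% Lowercase first so that "Riak" and "riak" index identically.
    Lower = string:lowercase(Text),
    %% Split on spaces, tabs and newlines, dropping empty fragments.
    Separators = [<<" ">>, <<"\t">>, <<"\n">>],
    Tokens = binary:split(Lower, Separators, [global, trim_all]),
    {ok, Tokens}.
```

This only splits on whitespace and lowercases. It does no stop-word
removal, stemming, or minimum-length filtering, which a real analyzer
would likely want.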

> 4. Do you have any tips and advice about working with Unicode in Riak Search?

Encode everything in UTF-8.  There may still be a few bugs we need to
work out, but our intended goal is to have everything in that
department "just work" once you're using UTF-8 everywhere.
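
On the Erlang side, one way to make sure data is UTF-8 before storing
it is the standard unicode module. The sketch below is only an
illustration: the module utf8_sketch and its function names are made
up, and it says nothing about how Search handles the data afterwards.

```erlang
%% Hypothetical helper module for preparing UTF-8 data.
-module(utf8_sketch).
-export([to_utf8/1, is_utf8/1]).

%% Convert a list of Unicode code points (or mixed chardata) to a
%% UTF-8 binary. Binaries inside the input are taken to be UTF-8.
to_utf8(Chars) ->
    case unicode:characters_to_binary(Chars) of
        Bin when is_binary(Bin) -> {ok, Bin};
        {error, _Converted, _Rest} -> {error, invalid_input};
        {incomplete, _Converted, _Rest} -> {error, incomplete_input}
    end.

%% Check that a binary is well-formed UTF-8 by round-tripping it.
is_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Bin -> true;
        _ -> false
    end.
```

Note that a Latin-1 binary passed to to_utf8/1 would be misread as
UTF-8. In that case, call unicode:characters_to_binary(Bin, latin1, utf8)
instead.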

