0.14 and search schema

Rusty Klophaus rusty at basho.com
Tue Feb 22 15:37:50 EST 2011


Hi Gary,

Sorry to hear you are having troubles. Hopefully I can help. Please see my
responses inline below.

Best,
Rusty

On Mon, Feb 21, 2011 at 1:56 AM, Gary William Flake <gary at flake.org> wrote:

> I've been pulling my hair out over new behaviors with search-schema,
> and I am wondering if I've been simply doing it wrong.  Here's the
> issue: search-schema seems extremely brittle compared to the rest of
> the system in that one misstep requires that you effectively blow away
> your whole DB and try again.  Some specific examples:
>
> 1. Defaults don't seem to mean anything now in that if you add a new
> field to what you are storing, then riak will throw up on insert
> because the field isn't in the schema.
>

Just to make sure we're on the same page, the "default_field" schema setting
is only used to tell the system which field to search when a query doesn't
specify one, or which field to index into when you index a "Content-Type:
text/plain" document. It's the dynamic fields that allow you to add new
fields to your schema.

If you display the schema by running "./bin/search-cmd show_schema
BUCKETNAME", then you should see the following as the last field:

        {dynamic_field, [
            {name, "*"},
            {type, string},
            {analyzer_factory,
                {erlang, text_analyzers, whitespace_analyzer_factory}}
        ]}

This should catch any unknown fields and index them as a
whitespace-separated string. If you don't see this, or if you *do* see this
but are still getting errors, can you send the results of the show_schema
command plus the error messages to me?
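
To illustrate the idea of a wildcard catch-all, here is a rough Python
sketch. It is not Riak Search's actual code: the field names, the analyzer
labels, and the resolve_analyzer helper are all made up for the example, and
the lookup order (named fields first, then dynamic fields in the order
listed) is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical, simplified stand-in for a search schema.
NAMED_FIELDS = {"title": "standard", "id": "noop"}
DYNAMIC_FIELDS = [("*_text", "standard"), ("*", "whitespace")]


def resolve_analyzer(field_name, named=NAMED_FIELDS, dynamic=DYNAMIC_FIELDS):
    """Return the analyzer label that applies to field_name, or None."""
    # Assumption: exact (named) fields take precedence over wildcards.
    if field_name in named:
        return named[field_name]
    # Assumption: dynamic fields are tried in the order they are listed.
    for pattern, analyzer in dynamic:
        if fnmatchcase(field_name, pattern):
            return analyzer
    # No match at all: the case where an "unknown field" error could arise.
    return None


if __name__ == "__main__":
    for name in ("title", "body_text", "brand_new_field"):
        print(name, "->", resolve_analyzer(name))
```

With the trailing "*" entry in place, every field name resolves to some
analyzer. Without it, a field that matches nothing resolves to None in this
sketch, which corresponds to the situation where an insert would fail.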


>
> 2. But if you add the new field to the schema, then it breaks all of
> your old records (e.g., they can't be deleted until you change the
> schema back or turn off hooks).
>

Hrm... this shouldn't be the case, except for certain edge cases where you
*change* the type of a field from string to integer and then try to view the
results via the Solr interface. New fields should be fine. Can you send me
the error messages? If time permits, a minimal failing test case would be
extremely helpful.


>
> 3. The semantics are non-intuitive (at least to me).  I had thought
> that {skip, true} would mean "this is data that I want but don't index
> over it".  Instead, it appears to mean "riak will not index it, and if
> you do a search, we will not provide this field value in the results
> even though it exists".
>

You're right, and this is a bug. The system should return the original
field. This is now tracked at https://issues.basho.com/show_bug.cgi?id=1014


>
> 4. To make it even more challenging, on ubuntu "search-cmd set-schema
> ..." ignores the last argument.  It seems to expect the schema to
> already exist in /var/lib/riak/scripts and I found that if I
> forcefully move and chmod the schema, then I can get it to stick.
>

Is it only chopping off the last argument on a "set-schema" command, or is
it doing this for all commands? Can you send me the command you are running
and the error message?

The command should be:

# search-cmd set-schema bucket filename


> 5. But, BTW, /var/lib/riak doesn't exist as part of 0.14's deb
> package, so you better know to add it yourself.
>

Riak Search should have a /var/lib/riaksearch directory, not a /var/lib/riak
directory (or are you experiencing this problem in Riak KV?)


>
>
> Anyhow, am I doing something wrong?  Or is this how it is supposed to be?
> Does anyone have any tricks for working around this brittleness?
>
> Thanks,
> -- GWF
>
> PS - here's a trick that may help you.  Define a field called "future"
> and set up the standard analyzer in the search schema.  Then,
> when you are tempted to add a new field, don't.  Instead, force your
> new "field:val" bits into future instead.  This way, you can get away
> with not having to change the schema nor do you need to add a new
> field.  I am hoping that this trick will help me keep a riak instance
> stable long enough to work through migration issues for when I do need
> to change the schema.
>

Hrm... this is not recommended behavior. You should be able to add fields to
your documents using either named fields (if you know the field names in
advance) or dynamic fields (using wildcard-style names). Let's get to the
bottom of this! :)


>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>