Using UUID as keys is problematic for Riak Search

David James davidcjames at gmail.com
Sun Aug 10 20:03:14 EDT 2014


Thanks for the quick responses.

Eric: I don't understand. Why would Solr have a UUIDField (
http://lucene.apache.org/solr/4_7_0/solr-core/org/apache/solr/schema/UUIDField.html)
if UUIDs were not indexable? What is the nature of the limitation?

Jason: Thanks, I will consider Base 64 encoding.
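
For reference, here is a minimal Python sketch of the Base64 idea, not a
tested Riak client recipe. The helper names uuid_to_key and key_to_uuid are
hypothetical, and using the URL-safe alphabet is my assumption (Jason only
said "Base 64"); the sample bytes are the key from the error log quoted
below.

```python
import base64
import uuid


def uuid_to_key(u: uuid.UUID) -> str:
    """Encode the 16 raw UUID bytes as unpadded, URL-safe Base64 (22 chars)."""
    # Assumption: the URL-safe alphabet ('-' and '_') is used so the key
    # contains no '/' or '+'.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def key_to_uuid(key: str) -> uuid.UUID:
    """Restore the dropped '=' padding and decode back to a UUID."""
    padded = key + "=" * (-len(key) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


# Raw key bytes taken from the yz_kv error log in this thread.
raw = bytes([94, 143, 33, 35, 45, 180, 78, 164,
             151, 237, 72, 81, 56, 13, 28, 250])

u = uuid.UUID(bytes=raw)
key = uuid_to_key(u)

# The raw bytes are not valid UTF-8 (143 is a stray continuation byte),
# which matches the bad_utf8_character_code failure in the log.
try:
    raw.decode("utf-8")
    raw_is_utf8 = True
except UnicodeDecodeError:
    raw_is_utf8 = False

print(len(key), key_to_uuid(key) == u, raw_is_utf8)
```

The encoded key is plain ASCII, so it is also valid UTF-8, at a cost of 22
bytes per key instead of 16.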


On Sun, Aug 10, 2014 at 7:19 PM, Jason Campbell <xiaclo at xiaclo.net> wrote:

> I like UUIDs for everything as well, although I expected compatibility
> issues with something. Base 64 encoding the binary value is a nice
> compromise for me, and takes 22 characters (if you drop the padding)
> instead of the usual 36 for the hyphenated hex format.
>
> It would still require re-encoding all the keys, but it's a partial
> solution.
>
> *From: *Eric Redmond
> *Sent: *Monday, 11 August 2014 9:15 AM
> *To: *David James
> *Cc: *riak-users
> *Subject: *Re: Using UUID as keys is problematic for Riak Search
>
> You're correct that yokozuna only supports utf8, because the Solr
> interface only supports utf8 (note that the failure happens when attempting
> to build a non-utf8 JSON add document command). There's not much we can do
> here at the moment, since we don't yet (and may never) support a custom
> interface to Solr that accepts arbitrary binary values. In the meantime,
> to use yokozuna, you'll have to encode your keys to utf8.
>
> Eric Redmond, Engineer @ Basho
>
> On Sun, Aug 10, 2014 at 4:01 PM, David James <davidcjames at gmail.com>
> wrote:
>
> I'm using UUIDs for keys in Riak -- converted to bytes, not UTF-8 strings.
> (I'd rather spend 16 bytes for each key, not 36.)
>
> As I understand it, Yokozuna maps the Riak key to _yz_id.
>
> Here is the suggested schema from the documentation:
>
> <!-- schema.xml -->
> <field name="_yz_id" type="_yz_str" indexed="true" stored="true"
> multiValued="false" required="true"/>
> <fieldType name="_yz_str" class="solr.StrField" sortMissingLast="true"/>
>
>  Would you expect this to work with Riak Search? I would hope so.
>
> (Or must keys be UTF-8 strings?)
>
> I get this error, which does not surprise me, given that the _yz_id is
> defined as a string:
>
> ==> log/error.log <==
>
> 2014-08-10 18:24:16.221 [error] <0.610.0>@yz_kv:index:206 failed to index
> object
> {<<"test-0001">>,<<94,143,33,35,45,180,78,164,151,237,72,81,56,13,28,250>>}
> with error {ucs,{bad_utf8_character_code}} because
> [{xmerl_ucs,from_utf8,1,[{file,"xmerl_ucs.erl"},{line,185}]},{mochijson2,json_encode_string,2,[{file,"src/mochijson2.erl"},{line,186}]},{mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]},{mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]}]
>
> I don't think changing the schema.xml type for _yz_id to "solr.UUIDField"
> is a good idea.
>
> What can I do?
>
> Thanks,
> David
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>