appearance of text in riak different than original xml data

Sean Cribbs sean at basho.com
Sat Apr 7 11:32:07 EDT 2012


Wes,

Also, if you're using curl to load things into Riak, be sure to pass
your payload with --data-binary, which sends it as-is and will not try to
convert multibyte characters or line terminators.
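
For anyone loading data from a script rather than curl, the same two
points (send the bytes untouched, declare the charset) can be sketched in
Python. This is only an illustration under assumptions: the URL is a
placeholder, and build_put is a hypothetical helper name, not part of any
Riak client.

```python
import urllib.request


def build_put(url: str, payload: bytes) -> urllib.request.Request:
    """Prepare a PUT that sends `payload` byte-for-byte and declares UTF-8."""
    return urllib.request.Request(
        url,
        data=payload,  # bytes go out as-is: no re-encoding, no newline stripping
        method="PUT",
        headers={"Content-Type": "text/html;charset=UTF-8"},
    )


# Placeholder URL (assumption); point it at your own node, bucket, and key.
req = build_put(
    "http://localhost:8098/riak/bucket/key",
    "<title>Ekologie lučních porostů</title>\n".encode("utf-8"),
)
# urllib.request.urlopen(req) would actually send the request.
```

The request is only built here, not sent, so the snippet runs without a
Riak node.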

On Sat, Apr 7, 2012 at 11:21 AM, Wes James <comptekki at gmail.com> wrote:

> I found it. I thought if any web site might be able to handle unicode,
> it would be erlang.org, so I went and grabbed some of the header text:
>
> <?xml version='1.0' encoding='utf-8'?>
> <!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN'
>    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>
> <html xmlns='http://www.w3.org/1999/xhtml'>
> <head>
> <title>test</title>
>  <meta http-equiv='Content-Type' content='text/html;charset=utf-8'/>
> </head>
>
> and it works correctly now.
>
> thanks
>
> On Fri, Apr 6, 2012 at 3:18 PM, Kresten Krab Thorup <krab at trifork.com>
> wrote:
> > It looks like you may have missed specifying the charset when importing
> > your data; could that be the case?
> >
> > You need to specify the charset when importing 8-bit text.  It looks
> > like your XML is UTF-8 encoded, so it should be imported using something
> > like this:
> >
> > curl -X PUT -H 'Content-Type: text/html;charset=UTF-8' --data-binary @datafile.xml http://host:port/riak/bucket/key
> >
> > The various language clients have different ways of specifying the
> > charset for a value, so if you imported the XML using some other method
> > you need to find out where to specify it.
> >
> > To verify, you can check the result of curl -v (verbose, prints the
> > headers) for one of your values.  If the Content-Type header does not
> > come back with a charset=XXX parameter, then this is your problem.
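
The header check described above can also be scripted. A minimal sketch,
assuming you already have the Content-Type value as a string (for example
copied from curl -v output); declared_charset is a hypothetical helper name
that uses only the Python standard library:

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased; None when no charset is given


print(declared_charset("text/html;charset=UTF-8"))  # utf-8
print(declared_charset("text/xml"))                 # None
```

A result of None corresponds to the "no charset=XXX" case.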
> >
> > Kresten
> >
> >
> >
> > On Apr 6, 2012, at 4:44 PM, Wes James wrote:
> >
> > I imported many records, one of which looks like this:
> >
> > <add>
> > <doc>
> > <field name='id'>0</field>
> > <field name='title'>Ekologie lučních porostů (A)</field>
> > <field name='author_editor'>Rychnovská, Milena, Emilie Balátová-Tuláčková, Blanka Úlehlová, Jaroslav Pelikán</field>
> > <field name='date_of_publication'>1985</field>
> > <field name='publisher'>Academia</field>
> > <field name='keywords'>-</field>
> > <field name='notes'>amazon 5/22/09 Category: Ecology (Y)</field>
> > <field name='valuation'>8.00</field>
> > <field name='purchase_price'>10.00</field>
> > </doc>
> > </add>
> >
> > with
> >
> > bin/search-cmd solr books books.xml
> >
> > Notice the accented characters above.  After going through
> > riak -> cowboy -> webpage, the record looks like:
> >
> > Id:     0
> > Title:  title: Ekologie luÄ ních porostů (A)
> > Auther Editor:  author_editor: Rychnovská, Milena, Emilie Balátová-TulÃ¡Ä ková, Blanka Úlehlová, Jaroslav Pelikán
> > Date of Publication:    date_of_publication: 1985
> > Notes:  publisher: Academia
> > Notes:  notes: amazon 5/22/09 Category: Ecology (Y)
> > Purchase Price: purchase_price: 10.00
> > Valuation:      valuation: 8.00
> >
> > Is there a way I can fix this?
> >
> > Printed with io:format, it looks like:
> >
> > Rychnovská, Milena, Emilie Balátová-TulÃ¡Ä ková, Blanka Úlehlová, Jaroslav Pelikán
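
The "Ã¡" in the garbled name is what you typically get when UTF-8 bytes
are decoded as Latin-1 somewhere along the way. That this is what happened
here is an assumption (some accented characters in the report display
correctly), but the mechanism itself is easy to reproduce. The function
names below are hypothetical:

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode those bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(garbled: str) -> str:
    """Reverse the misreading: back to bytes via Latin-1, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "Tuláčková"
garbled = misread_as_latin1(original)
# "á" turns into "Ã¡"; "č" turns into "Ä" followed by an invisible control character.
print(garbled)
print(undo_misread(garbled) == original)  # True
```

The round trip only recovers the text if no bytes were lost in between,
so fixing the charset at import time remains the safer route.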
> >
> > Thanks,
> >
> > Wes
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> >
> > Mobile: + 45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
> > Trifork A/S  |  Margrethepladsen 4  |  DK-8000 Aarhus C  |  Phone: +45
> > 8732 8787  |  www.trifork.com
> >
> >
> >
> >
>
>



-- 
Sean Cribbs <sean at basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/