How to best store arbitrarily large Java objects

Alex Moore amoore at basho.com
Thu Jul 21 11:36:10 EDT 2016


Hi Henning,

Responses inline:

...

> However, depending on the size of the `TreeMap`, the serialization
> output can become rather large, and this limits the usefulness of my
> object. In our tests, dealing with Riak-objects >2MB proved to be
> significantly slower than dealing with objects <200kB.


Yes. We usually recommend  keeping objects < 100kB for the best
performance; and Riak can usually withstand objects up to 1MB with the
understanding that everything will be a little slower with the larger
objects going around the system.


> My idea was to use a converter that splits the serialized JSON into
> chunks during _write_, and uses links to point from one chunk to the
> next. During _fetch_ the links would be traversed, the JSON string
> concatenated from chunks, deserialized and the object would be
> returned. Looking at `com.basho.riak.client.api.convert.Converter`, it
> seems this is not going to work.


Linkwalking was deprecated in Riak 2.0 so I wouldn't do it that way.

I'm beginning to think that I'll need to remodel my data and use CRDTs
> for individual fields such as the `TreeMap`. Would that be a better
> way?


This sounds like a plausible idea.  If you do a lot of possibly conflicting
updates to the Tree, then a CRDT map would be the way to go.  You could
reuse the key from the main object, and just put it in the new
buckettype/bucket.

If you don't need to update the tree much, you could also just serialize
the tree into it's own object - split up the static data and the often
updated data, and put them in different buckets that share the same key.

Thanks,
Alex


On Thu, Jul 21, 2016 at 9:36 AM, Henning Verbeek <hankipanky at gmail.com>
wrote:

> I have a Java class, which is being stored in Riak. The class contains
> a `TreeMap` field, amongst other fields. Out of the box, Riak is
> converting the object to/from JSON. Everything works fine.
>
> However, depending on the size of the `TreeMap`, the serialization
> output can become rather large, and this limits the usefulness of my
> object. In our tests, dealing with Riak-objects >2MB proved to be
> significantly slower than dealing with objects <200kB.
>
> So, in order to store/fetch instances of my class with arbitrary
> sizes, but with reliable performance, I believe I need to split the
> output into separate Riak-objects after serialization, and reassemble
> before deserialization.
>
> My idea was to use a converter that splits the serialized JSON into
> chunks during _write_, and uses links to point from one chunk to the
> next. During _fetch_ the links would be traversed, the JSON string
> concatenated from chunks, deserialized and the object would be
> returned. Looking at `com.basho.riak.client.api.convert.Converter`, it
> seems this is not going to work.
>
> I'm beginning to think that I'll need to remodel my data and use CRDTs
> for individual fields such as the `TreeMap`. Would that be a better
> way?
>
> Any other recommendations would be much appreciated.
>
> Thanks,
> Henning
> --
> My other signature is a regular expression.
> http://www.pray4snow.de
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20160721/d13b314e/attachment-0002.html>


More information about the riak-users mailing list