Clarifying withoutFetch() with LevelDB and

Daniel Iwan iwan.daniel at gmail.com
Wed May 13 17:52:13 EDT 2015


Alex,

Thanks for answering this one and pointing me in the right direction.
I did an experiment and wrote 0 bytes instead of the JSON and got the same
effect - the leveldb folder is 80-220MB in size, with write activity around
20MB/s to disk and no reads from disk.
The Java client reports 45 secs for 1000 entries, so an average of 45ms per entry.

Then I changed the code so it writes to unique keys.
An astonishing difference: very little write activity to disk, ~600kB/s per
node, and the db is only 6MB!
The Java client reports 2.5 secs for 1000 entries!

The difference is huge both in speed and storage!

Now, I was always under the impression that writing with a stale clock using
withoutFetch() would be the quickest way to put data into Riak. Looks like I
was wrong.
Would all the overhead basically be vclocks?
I did not know that even when I'm using withoutFetch(), data is still read in
the background (?)
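For my own understanding I sketched a toy model of what I think is happening. This is not the Riak API - just a plain map standing in for one key. With allow_mult=true, a blind write carrying no (or a stale) causal context cannot supersede the stored value, so it piles up as another sibling, and every subsequent store has to rewrite the whole grown object:

```java
import java.util.*;

// Toy model (NOT the Riak client API): each withoutFetch()-style write
// with a stale/empty vclock is kept as an extra sibling under the key.
class SiblingGrowth {
    // key -> list of sibling values currently stored under that key
    static final Map<String, List<byte[]>> store = new HashMap<>();

    // Simulates a blind write: no causal context, so nothing is replaced.
    static void blindWrite(String key, byte[] value) {
        store.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    static int siblingCount(String key) {
        return store.getOrDefault(key, Collections.emptyList()).size();
    }

    static long storedBytes(String key) {
        long total = 0;
        for (byte[] v : store.getOrDefault(key, Collections.emptyList()))
            total += v.length; // every sibling is kept in full
        return total;
    }

    public static void main(String[] args) {
        byte[] event = new byte[100]; // a 100-byte event payload
        for (int i = 0; i < 1000; i++)
            blindWrite("timeline:2015051317", event);
        System.out.println(siblingCount("timeline:2015051317")); // 1000
        System.out.println(storedBytes("timeline:2015051317"));  // 100000
    }
}
```

If each store also rewrites the whole accumulated object, 1000 writes of a v-byte value move on the order of 1000^2/2 * v bytes over disk and network - which could explain the MB/s of write traffic I saw for tiny payloads.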

Regarding the data model:
I'm trying to solve a particular problem.

I'm modeling a timeline in Riak and wanted to group events into batches of
1-hour windows - basically timeboxing.
Data has to go to disk, so delaying writes is not an option for me.
Once 1000 events per key is reached, the next key is selected.
Keys are predictable, so I can calculate them when a read operation happens.
I want to grab as many events in one read operation as possible, hence the
idea of writing in a controlled way to the same key with a stale clock.
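To make the scheme concrete, here is roughly what I mean by predictable keys. The "events:<yyyyMMddHH>:<batch>" format is just an illustration (my actual format differs); the point is that a key is derivable from the event's timestamp plus a batch counter that advances every 1000 events:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Timeboxed key scheme: one 1-hour window per prefix, rolling to the
// next batch suffix every EVENTS_PER_KEY events. Key format is
// illustrative only.
class TimelineKeys {
    static final DateTimeFormatter HOUR =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);
    static final int EVENTS_PER_KEY = 1000;

    // eventSeq is the event's ordinal within its hour window.
    static String keyFor(Instant ts, long eventSeq) {
        long batch = eventSeq / EVENTS_PER_KEY;
        return "events:" + HOUR.format(ts) + ":" + batch;
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2015-05-13T17:52:13Z");
        System.out.println(keyFor(t, 0));    // events:2015051317:0
        System.out.println(keyFor(t, 999));  // events:2015051317:0
        System.out.println(keyFor(t, 1000)); // events:2015051317:1
    }
}
```

A reader can then enumerate all keys for a time range without any index lookup.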

Is there any better way to model that?
Obviously the next thing I will try is resolving siblings during writes, but I
hoped I could avoid/delay that until a read happens.
This vclock/storage/bandwidth explosion really surprised me.
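If I did tolerate siblings until read time, the merge itself seems safe for my data, since events are only ever appended, never mutated: a union keyed by event id resolves any number of siblings. A minimal sketch of that merge logic (plain Java, not the Riak client's conflict-resolution API; names are my own):

```java
import java.util.*;

// Resolve-on-read sketch: each sibling is modeled as a map of
// eventId -> payload; the union by event id is the merged batch.
// Duplicated events across siblings collapse onto one entry.
class SiblingMerge {
    static Map<String, String> merge(List<Map<String, String>> siblings) {
        Map<String, String> merged = new TreeMap<>(); // sorted by event id
        for (Map<String, String> sibling : siblings)
            merged.putAll(sibling);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> a = new HashMap<>();
        a.put("evt-001", "{}");
        a.put("evt-002", "{}");
        Map<String, String> b = new HashMap<>();
        b.put("evt-002", "{}"); // duplicate of a sibling's event
        b.put("evt-003", "{}");
        System.out.println(merge(Arrays.asList(a, b)).keySet());
        // [evt-001, evt-002, evt-003]
    }
}
```

The merge is cheap; what worries me after this experiment is the storage and write amplification of carrying 1000 siblings around until that read happens.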

Regards
Daniel


--
View this message in context: http://riak-users.197444.n3.nabble.com/Clarifying-withoutFetch-with-LevelDB-and-tp4033051p4033057.html
Sent from the Riak Users mailing list archive at Nabble.com.

More information about the riak-users mailing list