Slow performance using linkwalk, help wanted

Kevin Smith ksmith at basho.com
Tue Nov 9 10:40:27 EST 2010


Jan - 

I am hacking on it a bit to more closely match your use case. As soon as I have it done I will send it and the test generation script I'm using to populate test data.

--Kevin
On Nov 9, 2010, at 10:35 AM, Jan Buchholdt wrote:

> Kevin -
> 
> The test client is part of a bigger system and would be a bit too much to send to you. The method that calls Riak looks like this:
> 
>    import com.basho.riak.client.*;
>    .
>    .
>    public List<Document> lookupDocuments(String personId, String url) {
>        RiakClient riak = new RiakClient(url);
> 
>        WalkResponse walkResponse = riak.walk("person", personId, "document,_,_");
>        if (walkResponse.isSuccess()) {
>            List<Document> out = new ArrayList<Document>();
>            List<? extends List<RiakObject>> steps = walkResponse.getSteps();
>            if (steps.size() != 1) {
>                throw new RuntimeException("Expected to walk one link. Walked " + steps.size());
>            }
>            List<RiakObject> step = steps.get(0);
>            for (RiakObject o : step) {
>                try {
>                    String chars = o.getValue();
>                    Builder builder = Protos.Document.newBuilder();
>                    JsonFormat2.merge(chars, builder);
>                    out.add(((Document) builder.build()).getDocument());
>                } catch (ParseException e) {
>                    throw new DocumentServiceException("Error parsing document", e);
>                }
>            }
>            return out;
>        } else {
>            throw new RuntimeException("Walk error: " + walkResponse.getHttpHeaders());
>        }
>    }
> 
> It could be interesting to repeat your test on our cluster, to see if we get the same numbers as you do. Is it possible for you to send the code behind your test?
> 
> --
> Jan Buchholdt
> Software Pilot
> Trifork A/S
> Cell +45 50761121
> 
> 
> 
> On 2010-11-09 15:47, Karsten Thygesen wrote:
>> On Nov 9, 2010, at 14:58 , Kevin Smith wrote:
>> 
>>> On Nov 9, 2010, at 5:01 AM, Karsten Thygesen wrote:
>>> 
>>>> Hi
>>>> 
>>>> OK, we will use a larger ringsize next time and will consider a data reload.
>>>> 
> >>>> Regarding the metrics: the servers are dedicated to Riak and are not used for anything else. They are new HP servers with 8 cores each and 4x146GB 10K RPM SAS disks in a concatenated mirror setup. We use Solaris with ZFS as the filesystem and I have turned off atime updates on the data partition.
>>>> 
>>>> The pool is built as such:
>>>> 
>>>> pool: pool01
>>>> state: ONLINE
>>>> scrub: scrub completed after 0h0m with 0 errors on Tue Oct 26 21:25:05 2010
>>>> config:
>>>> 
>>>>       NAME          STATE     READ WRITE CKSUM
>>>>       pool01        ONLINE       0     0     0
>>>>         mirror-0    ONLINE       0     0     0
>>>>           c0t0d0s7  ONLINE       0     0     0
>>>>           c0t1d0s7  ONLINE       0     0     0
>>>>         mirror-1    ONLINE       0     0     0
>>>>           c0t2d0    ONLINE       0     0     0
>>>>           c0t3d0    ONLINE       0     0     0
>>>> 
>>>> errors: No known data errors
>>>> 
>>>> so it is as fast as possible.
>>>> 
> >>>> However - we use the ZFS default blocksize, which is 128 KB - is that optimal with bitcask as the backend? It is rather large, but what is optimal with bitcask?
>>> I don't have much experience tuning Solaris or ZFS for Riak. This is a question best asked of Ryan and I will make sure he sees this.
>> Thanks!
>> 
> >>>> The cluster is 4 servers with gigabit connections, located in the same datacenter on the same switch. The loadbalancer is a Zeus ZTM, which does quite a few http optimizations, including extended reuse of http connections, and we usually see far better response times using the loadbalancer than using a node directly.
>>> Hmmm. Can you share what the performance times are like for direct cluster access?
> >> In this case, there is no measurable difference whether we ask a cluster node directly or go through the loadbalancer. The largest difference is when we hit it with a lot of small requests, but that is not the case here.
>> 
> >>>> When we run the test, each riak node is only at about 100% cpu load (which on Solaris means that it uses only one of the 8 cores). We have seen spikes in the 160% area, but anything below 800% is not cpu bound. So all in all, the cpu load is between 5 and 10%.
>>> Can you send me the code you're using for the performance test? I'd like to run the exact code on my test hardware and see if that reveals anything.
>> Jan, can you please provide the test client?
>> 
>>> Also, low CPU usage might indicate you are IO bound. Do you know if Riak processes are spending much time waiting for IO to complete?
>>> 
> >> It does not seem so. The servers are not IO bound, there is plenty of network capacity and the disks are only around 10% loaded.
>> 
> >> My largest suspicion is the datamodel - with a 4-node cluster and a linkwalk that needs to combine around 5-600 documents, it will take quite some time, but we still feel that the numbers are very high.
>> 
> >> Perhaps we should consider a datamodel where we collect, say, 100 documents in a basket and then only have to linkwalk 4-5 baskets to return an answer? Tempting, performance-wise, but it makes it a lot harder to maintain the data afterwards, as we cannot just use map-reduce and similar technologies to handle the data...
>> 
>> Karsten
>> 
>>> --Kevin
>>> 
>>> 




