forced timeout in riak_client:get/3

Tuncer Ayaz tuncer.ayaz at gmail.com
Sun Nov 1 10:01:20 EST 2009


I've been testing a cluster of 2 or 3 riak nodes with the following setup:
debug-fresh riak0_config riak0@127.0.0.1
debug-join  riak1_config riak0@127.0.0.1
debug-join  riak2_config riak0@127.0.0.1

All configs use the gb_trees backend.

They all have unique doorbell ports and each runs from its own riak-0.6 tree,
so there are no data dir conflicts.
I've chosen not to use hg tip, as it does not seem to contain any
changes to riak_get_fsm.erl that would fix the issue I'm
running into.

The test:
(1) with all nodes up, put/2 all test data with W=N
(2) run tests that get/3 with R=1; each get/3 responds within ~60ms
(3) stop riak0
(4) re-run the tests. get/3 with R=1 still returns the correct data, but only
     after running into the default timeout of 15 seconds.
(5) debug-restart riak0_config riak0@127.0.0.1
(6) re-run the tests with riak0 back online; get/3 again responds within ~60ms
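
For reference, the response times in steps (2), (4) and (6) can be measured
with a small wrapper around timer:tc/1. This is only a sketch: the module
name time_get and the function timed/1 are made up for illustration and are
not part of riak.

```erlang
-module(time_get).
-export([timed/1]).

%% Run the zero-arity Fun and return {ElapsedMs, Result}.
%% timer:tc/1 reports the elapsed time in microseconds.
timed(Fun) when is_function(Fun, 0) ->
    {Micros, Result} = timer:tc(Fun),
    {Micros div 1000, Result}.
```

The fun passed to timed/1 would wrap whatever get/3 call the test makes.
With all nodes up the elapsed time should be around 60, and with riak0
stopped around 15000.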

The decision which of the 2 or 3 nodes to connect to is made with
a client-side availability check. From the resulting list of online
nodes I pick one at random and call riak_client:connect/1 on it.
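
Roughly, the node selection looks like the sketch below. It assumes the
availability check is a plain net_adm:ping/1 and uses rand:uniform/1 (OTP 18
or later) for the random choice. The module name pick_node and both function
names are made up for illustration.

```erlang
-module(pick_node).
-export([online_nodes/1, pick_random/1]).

%% Keep only the candidates that answer a distributed-Erlang ping.
%% Assumption: a ping is a good enough availability check.
online_nodes(Candidates) ->
    [N || N <- Candidates, net_adm:ping(N) =:= pong].

%% Pick one node uniformly at random, or report that none is online.
pick_random([]) ->
    {error, no_nodes_online};
pick_random(Nodes) ->
    {ok, lists:nth(rand:uniform(length(Nodes)), Nodes)}.
```

The node returned by pick_random/1 is the one handed to
riak_client:connect/1.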

Any idea what's going wrong?



