nodes aren't talking

B. Todd Burruss bburruss at real.com
Mon Nov 9 12:40:35 EST 2009


thx dan.  your explanation of vnodes makes sense, but isn't what i would
expect.  for HA reasons i would expect the riak server to understand the
physical location of 2 vnodes (maybe use IP address) and only replicate
to a vnode that is on another physical server.  if riak works as you
describe it doesn't seem very viable for an HA system.

can a riak developer comment on this?  i'm not an erlang guy :)

thx!
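
A toy model of the placement question, for anyone following along. It assumes what Dan describes in the reply quoted below: vnodes (partitions) are assigned to physical nodes at random. It also assumes that a key's replicas go to n consecutive partitions on the ring. The class and method names, the replica count of 3, and the seed are all made up for illustration. None of this is Riak's actual code, and hinted handoff is ignored. The sketch only counts how many starting partitions would still have at least W replicas on a surviving node, which is one possible reason some W=2 writes succeed with a node down.

```java
import java.util.Random;

// Hypothetical sketch, not Riak's implementation.
final class VnodePlacementSketch {

    /** Randomly assign each partition (vnode) to one of `nodes` physical nodes. */
    static int[] randomOwners(int ringSize, int nodes, long seed) {
        Random rng = new Random(seed);
        int[] owner = new int[ringSize];
        for (int p = 0; p < ringSize; p++) {
            owner[p] = rng.nextInt(nodes);
        }
        return owner;
    }

    /**
     * Count the starting partitions whose n consecutive replica partitions
     * include at least w partitions owned by a node other than downNode.
     */
    static int writablePartitions(int[] owner, int n, int w, int downNode) {
        int writable = 0;
        for (int start = 0; start < owner.length; start++) {
            int alive = 0;
            for (int i = 0; i < n; i++) {
                // Wrap around the end of the ring.
                if (owner[(start + i) % owner.length] != downNode) {
                    alive++;
                }
            }
            if (alive >= w) {
                writable++;
            }
        }
        return writable;
    }

    public static void main(String[] args) {
        // 16 partitions matches ring_creation_size in the config below;
        // n = 3 replicas is an assumption made only for this example.
        int[] owner = randomOwners(16, 2, 42L);
        int ok = writablePartitions(owner, 3, 2, 1);
        System.out.println(ok + " of " + owner.length
                + " starting partitions still accept W=2 writes with node 1 down");
    }
}
```

If placement were aware of physical nodes, as asked for above, the replicas of a key would never all sit on one machine, so with two machines a W=2 write would fail for every key once a machine is down (again leaving hinted handoff aside).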


On Sun, 2009-11-08 at 16:34 -0800, Dan Reverri wrote:
> Hi Todd,
> 
> 
> Regarding the doorbell port, it looks like the latest updates have
> removed this configuration value:
> http://hg.basho.com/riak/src/tip/doc/basic-setup.txt
> 
> 
> 
> 
> Regarding the successful writes with a node down:
> This could be due to the way replicas are stored. Riak breaks a
> cluster into a set number of vnodes. These vnodes are distributed
> randomly between the physical nodes. In your case you have two
> physical nodes each handling half of the vnodes. If the keys you are
> storing are replicated to vnodes located on the working physical node
> then the write will succeed. This is, however, just a guess; the Riak
> developers may have more insight into this issue.
> 
> 
> Hinted handoff may also be playing a role; you can read about hinted
> handoff in the Amazon Dynamo paper:
> http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
> 
> 
> Hinted handoff is explained in section 4.6.
> 
> 
> On Fri, Nov 6, 2009 at 4:41 PM, B. Todd Burruss <bburruss at real.com>
> wrote:
>         i have two nodes and have set W=2 and DW=2 when i "store" a value,
>         and R=2 when i read.  (see below for my server configuration)
>         
>         as long as my nodes stay up (i only have two) things are great, but
>         if i simulate a failure by killing one of the nodes, the "store"
>         calls will start to fail, but not right away.  some of the stores
>         will succeed after the second node is down.  i can verify this by
>         leaving the failed node down and starting the test again with
>         unused keys.  about the first 13 will succeed, but after that i get
>         "internal server" JiakExceptions.
>         
>         thx!
>         
>         
>         On Thu, 2009-11-05 at 16:03 -0800, Dan Reverri wrote:
>         
>         
>         > I think the start-join.sh script only makes use of 2 arguments
>         > (not sure why the documentation has the doorbell port). The second
>         > argument (the node) is passed to riak_startup_cluster:join_cluster,
>         > which expects a node in the form node@host.
>         >
>         >
>         > Can you try the following?
>         > ./start-join.sh config/btoddb.erlenv btb1@riak-btb1
>         >
>         >
>         > Thanks,
>         > Dan
>         >
>         > On Thu, Nov 5, 2009 at 9:09 AM, B. Todd Burruss <bburruss at real.com>
>         > wrote:
>         >         i've set up two riak nodes on separate machines and they
>         >         don't seem to be talking.  i've even used wireshark to
>         >         monitor the activity.  i see the servers listening on their
>         >         web ports, but nothing on the doorbell port.  i've tried
>         >         with version 0.6 and also version 379 from version control
>         >
>         >         i'm using the java_client like this:
>         >
>         >
>         >         JiakObject obj = new JiakObject( "mybucket", key );
>         >         obj.set( "anything", value );
>         >
>         >         JiakClient riakClient = new JiakClient( "riak-btb1", "8001" );
>         >         riakClient.setBucketSchema( "mybucket",
>         >                 Arrays.asList(new String[]{"anything"}), null, null, null );
>         >         riakClient.store( obj, 2, 2 );
>         >
>         >
>         >         i start the first server like this:
>         >
>         >         ./start-fresh.sh config/btoddb.erlenv
>         >
>         >         and i start the second server like this:
>         >
>         >         ./start-join.sh config/btoddb.erlenv riak-btb1 9000
>         >
>         >         here is my riak config (i modify the host names and node
>         >         name for the second node):
>         >
>         >         {cluster_name, "btoddb-cluster"}.
>         >         {ring_state_dir, "btoddb/ringstate"}.
>         >         {ring_creation_size, 16}.
>         >         {gossip_interval, 10000}.
>         >         {doorbell_port, 9000}.
>         >         {storage_backend, riak_dets_backend}.
>         >         {riak_dets_backend_root, "btoddb/dets-store"}.
>         >         {riak_cookie, default_riak_cookie}.
>         >         %% {riak_heart_command, "(cd /btoddb/riak-0.6; ./start-restart.sh config/btoddb.erlenv)"}.
>         >         {riak_nodename, btb1}.
>         >         {riak_hostname, "riak-btb1"}.
>         >
>         >         {jiak_name, "jiak"}.
>         >         {riak_web_ip, "riak-btb1"}.
>         >         {riak_web_port, 8001}.
>         >         {riak_web_logdir, "btoddb/weblogs"}.
>         >
>         >         when i run the servers and use the java client, data is
>         >         only saved on the server that the client connected to.  if
>         >         i stop the client and point it to the second server, it
>         >         will only write to the second server.  i see no evidence
>         >         that the servers are even trying to communicate.  i have
>         >         used wireshark to verify this.  these are linux boxes
>         >         without any firewalls running.  i have tried it on ubuntu
>         >         and centos.
>         >
>         >         any ideas?  thx!
>         >
>         >
>         >
>         >
>         
>         
>         
> 
> 





