Riak 1.4.12 high I/O while almost idle

Doug Rohrer drohrer at basho.com
Fri May 22 09:55:19 EDT 2015


Try running `riak-admin top` on the node to see what's going on:

http://docs.basho.com/riak/1.4.12/ops/running/tools/riak-admin/#top <http://docs.basho.com/riak/1.4.12/ops/running/tools/riak-admin/#top>

This may give you some insight into what that node is doing.

Doug


> On May 22, 2015, at 9:52 AM, Timo Gatsonides <tpjg at me.com> wrote:
> 
> I have a 7 node Riak cluster. Most nodes are on 1.4.10 and I am currently migrating the nodes from RedHat to Ubuntu (and Ext4 -> ZFS). 
> 
> One of the nodes is running Ubuntu 14 with ZFS for the data and has Riak 1.4.12 installed. On this node Riak is continuously reading and writing a lot of data. Even hours after it is started. There are no running transfers (riak-admin transfers). The other nodes are not very busy, almost no CPU, no I/O. For this one node see the output of iotop, uname -a, riak-admin status and some recent logging below.
> 
> There is nothing obvious in the Riak logging, except that there is an occasional reading/writing large object (once a minute or so, objects of around 5Mb-10Mb).
> 
> Is there a way to see what Riak is doing, what it is reading/writing?
> 
> Thanks,
> Timo
> 
> Total DISK READ :      81.99 M/s | Total DISK WRITE :      33.33 M/s
> Actual DISK READ:      39.59 M/s | Actual DISK WRITE:     241.72 M/s
>   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                    
>  8723 be/4 riak       23.82 M/s   12.05 M/s  0.00 % 80.54 % beam.smp -K true -A 64~ -smp enable -- console
>  8725 be/4 riak        4.80 M/s  814.60 B/s  0.00 % 72.67 % beam.smp -K true -A 64~ -smp enable -- console
>  8657 be/4 riak        3.83 M/s  814.60 B/s  0.00 % 68.98 % beam.smp -K true -A 64~ -smp enable -- console
>   698 be/0 root        0.00 B/s    0.00 B/s  0.00 % 47.67 % [txg_sync]
>  8724 be/4 riak       13.07 M/s    7.68 M/s  0.00 % 39.04 % beam.smp -K true -A 64~ -smp enable -- console
>  8726 be/4 riak       50.91 K/s   13.59 M/s  0.00 % 36.77 % beam.smp -K true -A 64~ -smp enable -- console
>  7917 be/4 root       51.11 K/s    0.00 B/s  0.00 %  4.26 % [kworker/u16:5]
>   234 be/3 root        0.00 B/s 1629.19 B/s  0.00 %  0.01 % [jbd2/sda6-8]
>  8302 be/4 riak        0.00 B/s  814.60 B/s  0.00 %  0.00 % run_erl -daemon /tmp/r~ /usr/sbin/riak console
>  9942 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/u16:6]
> 10319 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/u16:0]
> 10320 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/u16:1]
>  8401 be/4 riak        0.00 B/s  814.60 B/s  0.00 %  0.00 % beam.smp -K true -A 
> 
> #uname -a
> Linux tx4 3.13.0-49-generic #83-Ubuntu SMP Fri Apr 10 20:11:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> 
> sys_system_version : <<"Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:8:8] [async-threads:64] [kernel-poll:true]">>
> sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
> sys_threads_enabled : true
> sys_thread_pool_size : 64
> sys_wordsize : 8
> ring_num_partitions : 64
> ring_creation_size : 64
> storage_backend : riak_kv_multi_backend
> erlydtl_version : <<"0.7.0">>
> riak_control_version : <<"1.4.12-0-g964c5db">>
> cluster_info_version : <<"1.2.4">>
> riak_search_version : <<"1.4.12-0-g7fe0e00">>
> merge_index_version : <<"1.3.2-0-gcb38ee7">>
> riak_kv_version : <<"1.4.12-0-gc6bbd66">>
> 
> 1-minute stats for ...
> -------------------------------------------
> riak_kv_stat_ts : 1432301431
> vnode_gets : 748
> vnode_gets_total : 10051
> vnode_puts : 182
> vnode_puts_total : 3690
> vnode_index_refreshes : 0
> vnode_index_refreshes_total : 0
> vnode_index_reads : 10
> vnode_index_reads_total : 100
> vnode_index_writes : 182
> vnode_index_writes_total : 3686
> vnode_index_writes_postings : 2
> vnode_index_writes_postings_total : 139
> vnode_index_deletes : 0
> vnode_index_deletes_total : 0
> vnode_index_deletes_postings : 0
> vnode_index_deletes_postings_total : 13
> node_gets : 0
> node_gets_total : 0
> node_get_fsm_siblings_mean : 0
> node_get_fsm_siblings_median : 0
> node_get_fsm_siblings_95 : 0
> node_get_fsm_siblings_99 : 0
> node_get_fsm_siblings_100 : 0
> node_get_fsm_objsize_mean : 0
> node_get_fsm_objsize_median : 0
> node_get_fsm_objsize_95 : 0
> node_get_fsm_objsize_99 : 0
> node_get_fsm_objsize_100 : 0
> node_get_fsm_time_mean : 0
> node_get_fsm_time_median : 0
> node_get_fsm_time_95 : 0
> node_get_fsm_time_99 : 0
> node_get_fsm_time_100 : 0
> node_puts : 64
> node_puts_total : 1196
> node_put_fsm_time_mean : 691827
> node_put_fsm_time_median : 2168
> node_put_fsm_time_95 : 4415096
> node_put_fsm_time_99 : 8226455
> node_put_fsm_time_100 : 12045311
> read_repairs : 0
> read_repairs_total : 0
> coord_redirs_total : 0
> 
> ==> /var/log/riak/erlang.log.1 <==
> 08:41:19.235 [info] monitor long_gc <0.7.0> [{name,application_controller},{initial_call,{application_controller,start,1}},{almost_current_function,{gen_server,loop,6}},{message_queue_len,0}] [{timeout,293},{old_heap_block_size,0},{heap_block_size,6765},{mbuf_size,0},{stack_size,23},{old_heap_size,0},{heap_size,2142}]
> 08:41:19.437 [warning] Writing very large object (19730102 bytes) to <<...>>
> ==> /var/log/riak/console.log <==
> 2015-05-22 08:41:19.235 [info] <0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.7.0> [{name,application_controller},{initial_call,{application_controller,start,1}},{almost_current_function,{gen_server,loop,6}},{message_queue_len,0}] [{timeout,293},{old_heap_block_size,0},{heap_block_size,6765},{mbuf_size,0},{stack_size,23},{old_heap_size,0},{heap_size,2142}]
> 2015-05-22 08:41:19.437 [warning] <0.710.0>@riak_kv_vnode:encode_and_put_no_sib_check:1900 Writing very large object (19730102 bytes) to <<...>>
> ==> /var/log/riak/erlang.log.1 <==
> 08:41:23.417 [info] monitor long_gc <0.436.0> [{name,proxy_riak_kv_vnode_68507889249886074290797726533575766546371837952},{initial_call,{riak_core_vnode_proxy,init,1}},{almost_current_function,{riak_core_vnode_proxy,loop,2}},{message_queue_len,0}] [{timeout,2093},{old_heap_block_size,0},{heap_block_size,987},{mbuf_size,0},{stack_size,5},{old_heap_size,0},{heap_size,481}]
> ==> /var/log/riak/console.log <==
> 2015-05-22 08:41:23.417 [info] <0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.436.0> [{name,proxy_riak_kv_vnode_68507889249886074290797726533575766546371837952},{initial_call,{riak_core_vnode_proxy,init,1}},{almost_current_function,{riak_core_vnode_proxy,loop,2}},{message_queue_len,0}] [{timeout,2093},{old_heap_block_size,0},{heap_block_size,987},{mbuf_size,0},{stack_size,5},{old_heap_size,0},{heap_size,481}]
> ==> /var/log/riak/erlang.log.1 <==
> 08:41:25.319 [warning] Reading large object of size 12178287 from <<...>>
> ==> /var/log/riak/console.log <==
> 2015-05-22 08:41:25.319 [warning] <0.707.0>@riak_kv_vnode:do_get_object:1365 Reading large object of size 12178287 from <<..">>
> ==> /var/log/riak/erlang.log.1 <==
> 08:42:00.687 [info] monitor long_gc <0.707.0> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,7}] [{timeout,485},{old_heap_block_size,0},{heap_block_size,2584},{mbuf_size,0},{stack_size,46},{old_heap_size,0},{heap_size,909}]
> ==> /var/log/riak/console.log <==
> 2015-05-22 08:42:00.687 [info] <0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.707.0> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,7}] [{timeout,485},{old_heap_block_size,0},{heap_block_size,2584},{mbuf_size,0},{stack_size,46},{old_heap_size,0},{heap_size,909}]
> ==> /var/log/riak/erlang.log.1 <==
> 08:42:40.036 [info] monitor long_gc <0.712.0> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,write,3}},{message_queue_len,0}] [{timeout,759},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,73},{old_heap_size,0},{heap_size,1683}]
> ==> /var/log/riak/console.log <==
> 2015-05-22 08:42:40.036 [info] <0.96.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.712.0> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{eleveldb,write,3}},{message_queue_len,0}] [{timeout,759},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,73},{old_heap_size,0},{heap_size,1683}]
> 
> 
> Some of the non-standard /etc/riak/vm.args:
> 
> ## Increase number of concurrent ports/sockets
> -env ERL_MAX_PORTS 64000
> ## Tweak GC to run more often 
> -env ERL_FULLSWEEP_AFTER 0
> ## Set the location of crash dumps
> -env ERL_CRASH_DUMP /var/log/riak/erl_crash.dump
> ## Raise the ETS table limit
> -env ERL_MAX_ETS_TABLES 256000
> ## Force the erlang VM to use SMP
> -smp enable
> ## For nodes with many busy_dist_port events, Basho recommends
> ## raising the sender-side network distribution buffer size.
> ## 32MB may not be sufficient for some workloads and is a suggested
> ## starting point.
> ## The Erlang/OTP default is 1024 (1 megabyte).
> ## See: http://www.erlang.org/doc/man/erl.html#%2bzdbbl <http://www.erlang.org/doc/man/erl.html#%2bzdbbl>
> +zdbbl 32768
> ## Raise the default erlang process limit 
> +P 256000
> 
> 
> Some of the non-standard /etc/riak/app.config:
> 
> %% -*- tab-width: 4;erlang-indent-level: 4;indent-tabs-mode: nil -*-
> %% ex: ts=4 sw=4 et
> [
>  %% For firewall config limit ports
>  { kernel, [
>             {inet_dist_listen_min, 6000},
>             {inet_dist_listen_max, 7999}
>           ]},
>  %% Riak Client APIs config
>               %% riak_handoff_port is the TCP port that Riak uses for
>               %% intra-cluster data handoff.
>               {handoff_port, 8099 },
>               {handoff_concurrency, 3},             
>              
>  %% Riak KV config
>  {riak_kv, [
>             %% Storage_backend specifies the Erlang module defining the storage
>             %% mechanism that will be used on this node.
> 
>             {storage_backend, riak_kv_multi_backend},
>             {multi_backend_default, <<"eleveldb">>},
>             {multi_backend, [        
>               {<<"eleveldb">>, riak_kv_eleveldb_backend, [
>                 {data_root, "/data/riak/leveldb"}
>               ]},
>               {<<"fs">>, riak_kv_eleveldb_backend, [
>                 {data_root, "/bigdata/riak/bigleveldb"}
>               ]},
>               {<<"cache">>, riak_kv_memory_backend, [
>                 {max_memory, 16}, %% 16Mb
> 				{ttl, 86400} %% 1 Day in seconds
>               ]}
>             ]},
> 
>             {anti_entropy, {off, []}},
>             {vnode_vclocks, true},
>             {legacy_keylisting, false},
>             {listkeys_backpressure, true},
>             {legacy_vnode_routing, false},
>             {fsm_limit, 50000},
>             {object_format, v1}
>            ]},
>  {riak_search, [
>                 %% To enable Search functionality set this 'true'.
>                 {enabled, false}
>                ]},
> 
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20150522/f15d4ab4/attachment-0002.html>


More information about the riak-users mailing list