issue on riak bulk loading---taking huge time

Mathias Meyer meyer at paperplanes.de
Mon May 14 04:00:48 EDT 2012


Hey Sangeetha,

At first sight, what strikes me as odd about your bulk import is that it shells out to curl for every row. Each call starts a new operating system process and opens a new HTTP connection, which has a significant impact on the time it takes to load the data into Riak. As a first step to improve both the script and its performance, I'd recommend using the Riak Erlang client instead [1]. Alternatively, you could run the Erlang code in the context of a locally running Riak node and use riak:local_client() [2].
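To illustrate the first suggestion, here is a minimal sketch of the same loader using the Erlang client [1] over a single persistent connection. It is not a tuned implementation. It assumes the client's ebin directories (and those of its dependencies) are on the code path, for example via -pa or ERL_LIBS, and that the node accepts protocol buffers connections on port 8087, which is the usual default but should be checked against your app.config. The helper names insert/2 and to_json/1 are made up for this example; the host, bucket and JSON layout are taken from your script.

```erlang
#!/usr/bin/env escript
%% Sketch: load CSV rows into Riak over one persistent protocol buffers
%% connection instead of starting a curl process for every row.
%% Assumptions: riakc (riak-erlang-client) is on the code path and the
%% node listens for protocol buffers on port 8087 (check app.config).

main([Filename]) ->
    {ok, Pid} = riakc_pb_socket:start_link("10.232.5.169", 8087),
    {ok, Data} = file:read_file(Filename),
    %% Drop the first line, as the original script does.
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    lists:foreach(fun(L) -> insert(Pid, re:split(L, ",")) end, Lines),
    riakc_pb_socket:stop(Pid).

%% Hypothetical helper: store one row, keyed by its first field.
insert(Pid, [Id | _] = Fields) ->
    Obj = riakc_obj:new(<<"CustomerCalls100k">>, Id, to_json(Fields),
                        "application/json"),
    ok = riakc_pb_socket:put(Pid, Obj).

%% Hypothetical helper: same JSON layout as the original script,
%% expects exactly six fields per row.
to_json(Fields) ->
    iolist_to_binary(io_lib:format(
        "{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
        "\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Fields)).
```

The script is invoked the same way as before, with the CSV file as its only argument. The relevant change is that the connection is opened once and reused for every put, so no process is spawned and no HTTP connection is set up per row.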

Cheers, Mathias
http://riakhandbook.com

[1] https://github.com/basho/riak-erlang-client
[2] https://github.com/basho/riak_kv/blob/master/src/riak_client.erl





On Monday, 14. May 2012 at 07:33, Sangeetha.PattabiRaman2 at cognizant.com wrote:

> From: Pattabi Raman, Sangeetha (Cognizant)
> Sent: Thursday, May 10, 2012 3:25 PM
> To: riak-users at lists.basho.com
> Subject: issue on riak bulk loading---taking huge time
>
> Dear team,
>
> FYI: each server in our 2-node cluster has 4 quad-core Intel processors and more than 1 TB of storage.
>
> I have set up a Riak cluster on 2 physical machines with n_val 2; my app.config and vm.args are attached for your reference.
>
> Please tell me where the bulk-inserted data is stored on the local file system. Loading takes a huge amount of time even for small inputs. How can we tune it to perform at large scale, since we deal with big data of a few hundred GB?
>
> Command used: time ./load_data1m Customercalls1m.csv
>
> Running ./load_data100m CustomerCalls100m gave the following error, so I changed the default setting in app.config from 8 MB to 3072 MB:
>
> escript: exception error: no match of right hand side value {error,enoent}
>
> Size                     | Load time          | No. of mappers (app.config) | js-max-vm-mem (app.config) | js-thread-stack
> 100k (1 lakh) rows, 5 MB | 20m39.625 seconds  | 48                          | 3 GB (3072 MB)             | 3 GB (3072 MB)
> 1 million rows, 54 MB    | 198m42.375 seconds | 48                          | 3 GB (3072 MB)             | 3 GB (3072 MB)
>
> Both memory settings were changed from the default of 8 MB because the input data is large.
>
> ./load_data script used:
>
> #!/usr/local/bin/escript
>
> %% Read the CSV file, drop the first line and insert every remaining line.
> main([Filename]) ->
>     {ok, Data} = file:read_file(Filename),
>     Lines = tl(re:split(Data, "\r?\n", [{return, binary},trim])),
>     lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS) end, Lines).
>
> %% Build a JSON document from the six CSV fields and PUT it by running curl.
> format_and_insert(Line) ->
>     JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Line),
>     Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/CustomerCalls100k/~s -d '~s' -H 'content-type: application/json'", [hd(Line),JSON]),
>     io:format("Inserting: ~s~n", [hd(Line)]),
>     os:cmd(Command).
>
> Thanks in advance! I am waiting for a reply; could anyone please help? I am stuck with bulk loading. Please also explain how Riak splits the data and loads it across the cluster.
>
> Thanks & regards,
> Sangeetha
>
> Attachments:
> - app.config.txt
> - vm.args.txt