issue on riak bulk loading---taking huge time---Can any one help me out with the same pl

Reid Draper reiddraper at gmail.com
Tue May 15 09:02:23 EDT 2012


On May 15, 2012, at 1:11 AM, <Sangeetha.PattabiRaman2 at cognizant.com> wrote:

>  
> Can anyone help me out with the same, please? I am stuck on it.
Please see Mathias' answer here: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-May/008342.html
>  
> Dear team,
>  
> FYI: we have 4 quad-core Intel processors on each server in a 2-node cluster, with more than 1 TB of storage.
> I have built a 2-node Riak cluster on physical machines with n_val 2; my app.config and vm.args are attached for your reference.
>  
> Please tell me where bulk-inserted data is stored on the local file system. Loading even a small data set takes a huge amount of time. How can this be tuned for large-scale loads, since we deal with big data of a few hundred GB?
>  
> Command used: time ./load_data1m Customercalls1m.csv
>  
> ./load_data100m CustomerCalls100m (this gave the error below, so I changed the default in app.config from 8 MB to 3072 MB):
> escript: exception error: no match of right hand side value {error,enoent}
>  
>  
> Size                      | Load time          | No. of mappers (app.config) | Js-max-vm-mem (app.config) | Js-thread-stack
> 100k (10,lakh rows), 5 MB | 20m39.625 seconds  | 48                          | 3072 MB (3 GB) [1]         | 3072 MB (3 GB) [1]
> 1 million rows, 54 MB     | 198m42.375 seconds | 48                          | 3072 MB (3 GB) [1]         | 3072 MB (3 GB) [1]
>  
> [1] Changed from the default 8 MB since the input data is large.
>  
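
The two timings above can be turned into a throughput figure, which makes the scaling easier to judge. The sketch below assumes the first row of the table really is 100,000 rows (the "10,lakh" note is ambiguous) and uses only the numbers quoted above; the function names are made up for illustration. Both runs work out to roughly 81 and 84 rows per second, so the load time grows linearly with the row count, and 100 million rows at that rate would take about two weeks.

```erlang
#!/usr/bin/env escript
%% Throughput implied by the two load times reported in the table above.
%% Assumption: the first run loaded 100,000 rows, the second 1,000,000.

%% Convert a "time" style result (minutes plus seconds) to seconds.
seconds(Min, Sec) -> Min * 60 + Sec.

%% Average number of rows inserted per second for one run.
rows_per_second(Rows, Min, Sec) -> Rows / seconds(Min, Sec).

main(_) ->
    Runs = [{100000, 20, 39.625}, {1000000, 198, 42.375}],
    lists:foreach(
      fun({Rows, Min, Sec}) ->
              io:format("~w rows: ~.1f rows/s~n",
                        [Rows, rows_per_second(Rows, Min, Sec)])
      end, Runs),
    %% Extrapolate the 1-million-row rate to 100 million rows.
    Rate = rows_per_second(1000000, 198, 42.375),
    io:format("100 million rows at that rate: ~.1f days~n",
              [100000000 / Rate / 86400]).
```

A near-constant per-row cost of about 12 ms suggests the time is dominated by fixed overhead per insert rather than by the amount of data.
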
>  
> ./load_data script used:
>  
> #!/usr/local/bin/escript
> main([Filename]) ->
>     {ok, Data} = file:read_file(Filename),
>     Lines = tl(re:split(Data, "\r?\n", [{return, binary},trim])),
>     lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS) end, Lines).
>  
> format_and_insert(Line) ->
>     JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Line),
>     Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/CustomerCalls100k/~s -d '~s' -H 'content-type: application/json'", [hd(Line),JSON]),
>     io:format("Inserting: ~s~n", [hd(Line)]),
>     os:cmd(Command).
>  
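
The script above starts a new curl process, and with it a new HTTP connection, for every row, which is a likely source of the fixed per-row cost. A minimal sketch of one alternative follows, assuming OTP's inets application (httpc) is available to the escript; it keeps the same URL, bucket, and JSON layout as the quoted script. The helper names url/1, json/1, and insert/1 are hypothetical, and this is not necessarily what the linked answer recommends.

```erlang
#!/usr/bin/env escript
%% Sketch only: same CSV layout and target URL as the quoted script, but each
%% row is sent with httpc (OTP inets) instead of spawning curl via os:cmd/1.

-define(BASE, "http://10.232.5.169:8098/riak/CustomerCalls100k/").

main([Filename]) ->
    {ok, _} = application:ensure_all_started(inets),
    {ok, Data} = file:read_file(Filename),
    %% tl/1 drops the CSV header row, as in the original script.
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    lists:foreach(
      fun(L) -> insert(re:split(L, ",", [{return, binary}])) end,
      Lines).

%% Object URL for a key (the first CSV field).
url(Key) -> ?BASE ++ binary_to_list(Key).

%% JSON body built from the six CSV fields, same template as the original.
json(Fields) ->
    iolist_to_binary(
      io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
                    "\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
                    Fields)).

%% PUT one row and report any non-2xx response or transport error.
insert([Key | _] = Fields) ->
    Request = {url(Key), [], "application/json", json(Fields)},
    case httpc:request(put, Request, [], []) of
        {ok, {{_, Code, _}, _, _}} when Code >= 200, Code < 300 -> ok;
        Other -> io:format("Insert of ~s failed: ~p~n", [Key, Other])
    end.
```

It would be run the same way as before, for example `time ./load_data1m Customercalls1m.csv`. The inserts are still sequential; the only change is that no external process is started per row, and httpc can reuse a persistent connection to the node.
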
>  
>  
> Thanks in advance! Waiting for a reply; please, could anyone help? I am stuck on bulk loading. Please also explain how Riak splits the data and loads it across the cluster.
> Thanks & regards
> sangeetha
>  
> <app.config.txt><vm.args.txt>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
