Riak loading is taking a huge amount of time; any suggestions for improvement?

Sangeetha.PattabiRaman2 at cognizant.com
Mon Aug 27 23:26:49 EDT 2012


Dear team,


I am trying to load a 25-million-record dataset (1.3 GB) of sample call data into Riak. The setup is a 2-node Riak cluster (4 quad-core, 1.5 TB storage). The load takes real 5671m12.812s, which is about 94.5 hours and is far too long. Please suggest how to improve this.

We deal with big data, and I need to store and test 165 GB on Riak. At the present rate, I guess the load could take years. I have already loaded 165 GB into MongoDB and got the results there; this is for a comparative performance study of MongoDB and Riak. Please assist me with the same.
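For scale, here is a rough sketch of the arithmetic behind that estimate. It assumes load time grows linearly with data size, which is only an assumption, and the module and function names are hypothetical. With the figures above it works out to roughly 73 records per second, and on the order of 500 days for 165 GB.

```erlang
-module(load_estimate).
-export([seconds/2, rate/2, projected_days/3]).

%% Convert the "real" figure printed by time (minutes, seconds) to seconds.
seconds(Minutes, Seconds) ->
    Minutes * 60 + Seconds.

%% Average number of records inserted per second.
rate(Records, ElapsedSeconds) ->
    Records / ElapsedSeconds.

%% Days needed to load TargetGB, assuming (hypothetically) that load time
%% scales linearly with data size.
projected_days(ElapsedSeconds, LoadedGB, TargetGB) ->
    ElapsedSeconds * (TargetGB / LoadedGB) / 86400.
```

For example, `load_estimate:projected_days(load_estimate:seconds(5671, 12.812), 1.3, 165)` gives the projected number of days for the full dataset.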



I am using the following code for loading:

#!/usr/local/bin/escript
%% Reads the whole CSV into memory, skips the header row, and inserts one record per line.
main([Filename]) ->
    {ok, Data} = file:read_file(Filename),
    Lines = tl(re:split(Data, "\r?\n", [{return, binary},trim])),
    lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS) end, Lines).

%% Builds the JSON document and runs one curl process per record (key = first CSV field).
format_and_insert(Line) ->
    JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Line),
    Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/CustCalls25m/~s -d '~s' -H 'content-type: application/json'", [hd(Line),JSON]),
    io:format("Inserting: ~s~n", [hd(Line)]),
    os:cmd(Command).
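
A likely cost in the script above is that os:cmd starts a separate curl process, with a fresh connection, for every one of the 25 million records. Below is a minimal sketch of one alternative, assuming the HTTP client bundled with Erlang/OTP (httpc, from the inets application) is acceptable. It streams the CSV line by line and sends the same PUT requests from inside the Erlang VM. The module name riak_loader and the helper names are hypothetical; the URL and JSON layout are copied from the script above. It has not been benchmarked against this cluster.

```erlang
-module(riak_loader).
-export([load/1, fields/1, to_json/1]).

%% Bucket URL copied from the original curl command.
-define(BASE_URL, "http://10.232.5.169:8098/riak/CustCalls25m/").

%% Stream the CSV and PUT each row; returns the number of rows inserted.
load(Filename) ->
    {ok, _} = application:ensure_all_started(inets),
    {ok, File} = file:open(Filename, [read, binary, raw, {read_ahead, 1048576}]),
    {ok, _Header} = file:read_line(File),   % skip the header row, like tl/1 in the original
    Count = load_lines(File, 0),
    ok = file:close(File),
    Count.

load_lines(File, Count) ->
    case file:read_line(File) of
        {ok, Line} ->
            case fields(Line) of
                [] ->
                    load_lines(File, Count);    % ignore blank lines
                Fields ->
                    ok = insert(Fields),        % sketch: crash on the first failed PUT
                    load_lines(File, Count + 1)
            end;
        eof ->
            Count
    end.

%% Strip the line ending and split on commas; a blank line gives [].
fields(Line) ->
    case re:replace(Line, "\r?\n$", "", [{return, binary}]) of
        <<>> -> [];
        Stripped -> re:split(Stripped, ",", [{return, binary}])
    end.

%% Same JSON layout as the original script; expects exactly six fields.
to_json(Fields) ->
    io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
                  "\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Fields).

%% PUT one record, keyed by the first CSV field; any 2xx status counts as success.
insert([Id | _] = Fields) ->
    Url = ?BASE_URL ++ binary_to_list(Id),
    Body = iolist_to_binary(to_json(Fields)),
    case httpc:request(put, {Url, [], "application/json", Body}, [], []) of
        {ok, {{_Vsn, Status, _Reason}, _Hdrs, _RespBody}}
          when Status >= 200, Status < 300 ->
            ok;
        Other ->
            {error, Other}
    end.
```

After compiling with `erlc riak_loader.erl`, it could be run as `erl -noshell -eval 'riak_loader:load("CustCalls25m.csv"), halt().'`. Running several loaders in parallel over slices of the file, or using the Riak protocol buffers client instead of HTTP, may help further; neither is shown here.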

[hadoop@CTSINGMRGTO data]$ time ./load_data25m CustCalls25m.csv >> 25m.txt &
[3] 32354


[hadoop@CTSINGMRGTO data]$
real    5671m12.812s
user    1725m31.862s
sys     3074m42.135s
[hadoop@CTSINGMRGTO data]$

[hadoop@CTSINGMRGTO data]$ tail -4 25m.txt
Inserting: 24999997
Inserting: 24999998
Inserting: 24999999
Inserting: 25000000
[hadoop@CTSINGMRGTO data]$



More information about the riak-users mailing list