Using Riak CS with Hadoop
Anthony.Valenti at inmar.com
Wed Aug 31 14:32:18 EDT 2016
I did try it, using the command below; the error it produced follows. I am not sure how to specify that the copy should go to our own S3 cluster, so I put the host after the @ sign, followed by the bucket. The request still goes to Amazon, which seems to be the default. I know this isn't a Riak issue and that technically it should work, but I was hoping to hear from someone who has tried this and gotten it to work. It looks like a Hadoop configuration problem more than anything, so I am asking on the Hadoop side as well.
hadoop distcp -update /user/test/client/part-m-00000 s3a://access_key:email@example.com/test-bucket
16/08/31 09:37:38 INFO s3a.S3AFileSystem: Caught an AmazonServiceException, which means your request made it to Amazon S3, but was rejected with an error response for some reason.
16/08/31 09:37:38 INFO s3a.S3AFileSystem: Error Message: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 94AE01A87CE82C24, AWS Error Code: null, AWS Error Message: Forbidden
16/08/31 09:37:38 INFO s3a.S3AFileSystem: HTTP Status Code: 403
16/08/31 09:37:38 INFO s3a.S3AFileSystem: AWS Error Code: null
16/08/31 09:37:38 INFO s3a.S3AFileSystem: Error Type: Client
16/08/31 09:37:38 INFO s3a.S3AFileSystem: Request ID: 94AE01A87CE82C24
16/08/31 09:37:38 INFO s3a.S3AFileSystem: Class Name: com.amazonaws.services.s3.model.AmazonS3Exception
16/08/31 09:37:38 ERROR tools.DistCp: Invalid arguments:
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 94AE01A87CE82C24, AWS Error Code: null, AWS Error Message: Forbidden, S3 Extended Request ID: qYOsZBJ5O3cFaKpbdKEpbou0Zx5hSgsHJHum/kPvjZu6CAvJ2jqVoUVUUXpquEtw
Invalid arguments: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 94AE01A87CE82C24, AWS Error Code: null, AWS Error Message: Forbidden
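The AWS request IDs in the log above suggest the request reached Amazon rather than the local cluster. One possible approach, offered as a sketch rather than a tested recipe, is to point the S3A connector at the Riak CS endpoint through Hadoop configuration properties instead of the URL. The endpoint host and port, the credentials, and the choice to disable SSL below are hypothetical placeholders, not values from this thread; `fs.s3a.path.style.access` is only honored by newer Hadoop releases, and whether Riak CS needs it depends on how the cluster is set up.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own Riak CS endpoint and credentials.
RIAK_CS_ENDPOINT="riak-cs.example.internal:8080"   # assumed host:port
ACCESS_KEY="YOUR_ACCESS_KEY"                       # assumed credential
SECRET_KEY="YOUR_SECRET_KEY"                       # assumed credential

# Build the command as a string first so it can be reviewed before running.
# fs.s3a.endpoint overrides the default Amazon endpoint.
# fs.s3a.connection.ssl.enabled=false assumes the cluster serves plain HTTP.
# fs.s3a.path.style.access keeps the bucket name out of the hostname
# (supported on newer Hadoop versions only).
distcp_cmd="hadoop distcp \
-D fs.s3a.endpoint=${RIAK_CS_ENDPOINT} \
-D fs.s3a.access.key=${ACCESS_KEY} \
-D fs.s3a.secret.key=${SECRET_KEY} \
-D fs.s3a.connection.ssl.enabled=false \
-D fs.s3a.path.style.access=true \
-update /user/test/client/part-m-00000 s3a://test-bucket/"

# Dry run: print the command. Uncomment the eval line to execute it.
echo "$distcp_cmd"
# eval "$distcp_cmd"
```

The same properties could instead be set once in core-site.xml, which also keeps the secret key off the command line and out of the shell history.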
From: Luke Bakken [mailto:lbakken at basho.com]
Sent: Wednesday, August 31, 2016 1:18 PM
To: Valenti, Anthony
Cc: riak-users at lists.basho.com
Subject: Re: Using Riak CS with Hadoop
Riak CS provides an S3-compatible API, so theoretically it could work.
Have you tried? If so and you're having issues, follow up here.
lbakken at basho.com
On Wed, Aug 31, 2016 at 7:38 AM, Valenti, Anthony <Anthony.Valenti at inmar.com> wrote:
> Has anyone set up Hadoop to be able to use Riak CS as an S3
> source/destination instead of or in addition to Amazon S3? Hadoop
> assumes that it should go to Amazon S3 by default. Specifically, I am
> trying to use Hadoop distcp to copy files to Riak CS.