hadoop distcp with Riak as source throws exception

psterk jazzfan159 at gmail.com
Sat Apr 23 19:33:51 EDT 2016


I am trying to use hadoop distcp to copy objects from Riak into HDFS:

hadoop distcp s3n://<access key>:<secret key>@test/test/setup.ml  /tmp/riak
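(Side note: embedding the keys in the s3n:// URI is known to break when the secret key contains characters like "/" or "+". An alternative, assuming standard Hadoop s3n configuration, is to put the credentials in core-site.xml and drop them from the URI; the values below are placeholders:

```xml
<!-- core-site.xml: placeholder values, substitute your Riak CS keys -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

With those set, the command reduces to: hadoop distcp s3n://test/test/setup.ml /tmp/riak)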

I am getting the following exception:

16/04/23 23:08:14 INFO mapreduce.Job: Task Id : attempt_1461004632127_0033_m_000000_1, Status : FAILED
Error: org.apache.hadoop.security.AccessControlException: Permission denied: s3n://test/test/setup.ml
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:449)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at org.apache.hadoop.fs.s3native.$Proxy17.retrieveMetadata(Unknown Source)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:476)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.jets3t.service.impl.rest.HttpException
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:519)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:281)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:942)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2148)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2075)
	at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1093)
	at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:548)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:174)
	... 18 more


Here are the settings in /etc/hadoop/conf/jets3t.properties

s3service.s3-endpoint=dsriak1.<domain>
s3service.s3-endpoint-http-port=8080
s3service.disable-dns-buckets=true

s3service.max-thread-count=10
threaded-service.max-thread-count=10
s3service.https-only=false
httpclient.proxy-autodetect=false
httpclient.proxy-host=dsriak1.<domain>
httpclient.proxy-port=8080
httpclient.retry-max=11
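(To rule out a credential or endpoint mismatch without going through Hadoop, the failing HEAD request can be reproduced by hand. jets3t signs requests with AWS Signature Version 2; the sketch below rebuilds that signature so the resulting header can be sent with curl. The function name s3_v2_signature and the key values are my own placeholders, not anything from Hadoop or Riak:

```python
import base64
import hmac
from hashlib import sha1
from email.utils import formatdate


def s3_v2_signature(secret_key, method, date, resource):
    """AWS Signature v2, the scheme jets3t uses:
    base64(HMAC-SHA1(secret, string-to-sign))."""
    # For a plain HEAD with no Content-MD5, Content-Type, or x-amz-* headers,
    # the string-to-sign is: method, two blank lines, the Date header, resource.
    string_to_sign = "\n".join([method, "", "", date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), sha1).digest()
    return base64.b64encode(digest).decode()


date = formatdate(usegmt=True)  # RFC 1123 date, as in the HTTP Date header
sig = s3_v2_signature("YOUR_SECRET_KEY", "HEAD", date, "/test/test/setup.ml")
print("Date: " + date)
print("Authorization: AWS YOUR_ACCESS_KEY:" + sig)
# Send both headers with curl against the endpoint from jets3t.properties, e.g.:
#   curl -I -H "Date: ..." -H "Authorization: AWS ...:..." \
#     http://dsriak1.<domain>:8080/test/test/setup.ml
```

If that curl also gets a 403, the keys or the path-style addressing are wrong on the Riak CS side rather than anything in Hadoop.)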

When I enable DEBUG logging, I see this exception:

16/04/23 23:09:48 DEBUG ssl.FileBasedKeyStoresFactory: CLIENT TrustStore: /etc/security/clientKeys/all.jks
16/04/23 23:09:48 DEBUG impl.TimelineClientImpl: Cannot load customized ssl related configuration. Fallback to system-generic settings.
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:653)
        at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)
        at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:225)
        at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)
        at java.security.KeyStore.load(KeyStore.java:1445)
        at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:166)
        at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:81)
        at org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:209)
        at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131)
        at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.newSslConnConfigurator(TimelineClientImpl.java:532)
        at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.newConnConfigurator(TimelineClientImpl.java:507)
        at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(TimelineClientImpl.java:269)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:170)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.serviceInit(ResourceMgrDelegate.java:103)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.<init>(ResourceMgrDelegate.java:97)
        at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:112)
        at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:95)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
        at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:408)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:172)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:153)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:126)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:430)
16/04/23 23:09:49 INFO impl.TimelineClientImpl: Timeline service address: http://dsm03.eng.endgames.local:8188/ws/v1/timeline/
16/04/23 23:09:49 DEBUG security.UserGroupInformation: PrivilegedAction as:hdfs (auth:SIMPLE) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
16/04/23 23:09:49 DEBUG ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
16/04/23 23:09:49 DEBUG ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol

Do I have to export a client key from Riak and store it on the Hadoop gateway
host?
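(The EOFException above is thrown from KeyStore.load while reading /etc/security/clientKeys/all.jks, which usually means that truststore file is empty or truncated rather than that a Riak client key is missing; note it comes from the YARN Timeline client and is separate from the s3n AccessControlException. Every valid JKS file begins with the 4-byte magic number 0xFEEDFEED, so a quick sanity check is possible; looks_like_jks is my own helper name, and the path is just the one from the log:

```python
import struct

JKS_MAGIC = 0xFEEDFEED  # magic number at the start of every JKS keystore


def looks_like_jks(path):
    """Return True if the file exists, is at least 4 bytes long, and
    starts with the JKS magic number."""
    try:
        with open(path, "rb") as f:
            header = f.read(4)
    except OSError:
        return False
    if len(header) < 4:
        # Exactly the case that makes DataInputStream.readInt throw EOFException
        return False
    return struct.unpack(">I", header)[0] == JKS_MAGIC


print(looks_like_jks("/etc/security/clientKeys/all.jks"))
```

If the file is valid, "keytool -list -keystore /etc/security/clientKeys/all.jks" should list its entries; if not, regenerating or re-copying the truststore on the gateway host would be the first thing to try.)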



--
View this message in context: http://riak-users.197444.n3.nabble.com/hadoop-distcp-with-Riak-as-source-throws-exception-tp4034186.html
Sent from the Riak Users mailing list archive at Nabble.com.
