
Google Cloud Dataproc --files is not working

Stack Overflow Asked on November 29, 2021

I want to copy some property files to the master and workers while submitting a Spark job,
so as stated in the docs I am using --files to copy the files to the executors' working directory.
But the command below is not copying anything into the executors' working directory. If anybody has an idea, please share.

gcloud dataproc jobs submit spark --cluster=cluster-name --class=dataproc.codelab.word_count.WordCount --jars=gs://my.jar --region=us-central1 --files=gs://my.properties -- gs://my/input/ gs://my/output3/
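
For reference, a file shipped with --files is placed in the working directory of each executor and can be resolved there with SparkFiles.get. A minimal Scala sketch of how the job could read it on the executors; the WordCount structure shown here is an assumption, not the actual codelab source:

import java.io.FileInputStream
import java.util.Properties

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("word-count").getOrCreate()
    val lines = spark.sparkContext.textFile(args(0))

    val counts = lines
      .flatMap(_.split("\\s+"))
      .mapPartitions { words =>
        // On each executor the file shipped with --files sits in the task's
        // working directory; SparkFiles.get resolves its local path.
        val props = new Properties()
        val in = new FileInputStream(SparkFiles.get("my.properties"))
        try props.load(in) finally in.close()
        words.map(w => (w, 1)) // props could be used to filter or normalize here
      }
      .reduceByKey(_ + _)

    counts.saveAsTextFile(args(1))
    spark.stop()
  }
}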

One Answer

According to the official Spark documentation, when Spark runs on YARN, the Spark executor uses the local directory configured for YARN as its working directory, which by default is /hadoop/yarn/nm-local-dir/usercache/{userName}/appcache/{applicationId}.

So based on your description, if the file does show up there, then it's working as expected.
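
One way to check what actually lands there is to list the task's working directory from inside an executor. A minimal sketch, assuming an existing SparkSession named spark (for example in spark-shell) and Spark running on YARN:

import java.io.File

// Run a single task on an executor and print its working directory contents;
// the file passed with --files should appear among the entries.
val entries = spark.sparkContext
  .parallelize(Seq(1), 1)
  .map { _ =>
    val cwd = new File(".")
    cwd.getAbsolutePath + " : " + cwd.listFiles().map(_.getName).mkString(", ")
  }
  .collect()

entries.foreach(println)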

Answered by Henry Gong on November 29, 2021
