Search code examples
Does Hadoop create InputSplits in parallel...

hadoop mapreduce emr amazon-emr

Download a file from the Internet directly to my S3 bucket...

hadoop amazon-web-services amazon-s3 emr

Spark cannot see hive external table...

hadoop amazon-web-services apache-spark hive emr

Hadoop EMR job runs out of memory before RecordReader initialized...

hadoop out-of-memory heap-memory emr

Run EMR job with output results in another AWS account S3 bucket...

java amazon-web-services amazon-s3 emr

run mrjob on Amazon EMR, t2.micro not supported...

python hadoop amazon-web-services emr mrjob

MRJob - Limit Number of Task Attempts...

emr mrjob

How can I check if a s3path exists or not in Spark [using scala]?...

scala apache-spark emr

fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey are not set for EMR default IAM roles...

emr elastic-map-reduce amazon-emr

Recommended Format for loading data into hadoop, for simple map reduce...

json hadoop amazon-s3 emr

ClusterID vs JobFlowID on AWS EMR...

amazon-web-services boto emr

Amazon EMR + mrjob: bootstrap error, "bootstrap action 1 returned a non-zero return code"...

amazon-ec2 emr bootstrapping mrjob

Duplicate records get written to MongoDB after Hadoop MapReduce (using Mongo Hadoop Connector)...

mongodb hadoop emr

Centralized EMR System...

emr hl7 healthvault

nmap does not show all open ports...

network-programming hadoop-yarn emr nmap

Setting Spark Classpath on Amazon EMR...

hadoop amazon-s3 apache-spark emr

AWS EMR - install HUE using Java SDK...

java amazon-web-services emr elastic-map-reduce hue

Mahout - ParallelALSFactorizationJob running too long?...

hadoop mahout recommendation-engine emr

delete s3 files from a pipeline AWS...

amazon-web-services emr amazon-data-pipeline

What is Apache Spark doing before a job starts...

hadoop amazon-web-services amazon-s3 apache-spark emr

Error while submitting aws emr job from command line...

amazon-web-services amazon-ec2 amazon-s3 emr

Pydoop gets stuck on readline from HDFS files...

python hadoop emr

boto-emr job error: python broken pipeline error and java.lang.OutOfMemoryError...

python amazon-web-services boto emr

How to edit and relaunch a terminated cluster on Amazon EMR?...

java hadoop amazon-web-services emr

How to run a PySpark job (with custom modules) on Amazon EMR?...

python amazon-ec2 apache-spark emr pyspark

How to execute AWS emr and redshift scripts?...

amazon-web-services amazon-ec2 amazon-s3 emr

What is the best practice to monitor AWS EMR job running progress?...

java amazon-web-services emr elastic-map-reduce amazon-emr

AWS EMR: how to get the first element out of describe_jobflows() API call result...

python amazon-web-services boto emr elastic-map-reduce

Can cloudera impala make use of task nodes in EMR?...

hadoop emr impala

"Path is not legal" error when loading data from S3 into external Hive table located in S3...

hadoop amazon-web-services amazon-s3 hive emr
