Search code examples
Using GroupBy while copying from HDFS to S3 to merge files within a folder...


hadoop amazon-s3 amazon-emr distcp s3distcp

Spark job fails when cluster size is large, succeeds when small...


apache-spark emr amazon-emr

Obtain Master public DNS value from AWS EMR Cluster using the Java SDK...


java amazon-web-services dns aws-sdk amazon-emr

Zeppelin in EMR Cluster not listing Catalog tables in AWS glue...


amazon-web-services apache-spark amazon-emr apache-zeppelin aws-glue

Hadoop command not found in bootstrap actions...


hadoop emr amazon-emr

How to select a file from aws s3 by using wild character...


amazon-web-services amazon-s3 amazon-emr

AWS EMR Spark: Error writing to S3 - IllegalArgumentException - Cannot create a path from an empty s...


amazon-web-services apache-spark amazon-s3 amazon-emr

Exception with Table identified via AWS Glue Crawler and stored in Data Catalog...


amazon-web-services apache-spark amazon-s3 amazon-emr aws-glue

Adding spark-csv dependency in Zeppelin is creating a network error...


apache-spark apache-spark-sql emr amazon-emr

parallel query to spark with sqlcontext...


apache-spark parallel-processing pyspark amazon-emr bigdata

Converting hadoop fs paths to hdfs:// paths on EMR...


hadoop amazon-s3 emr amazon-emr

EMR cluster with external MySQL as Hive metastore...


amazon-web-services emr amazon-emr

Persist Spark df in S3 with random hash in file name prefix...


apache-spark amazon-s3 amazon-emr

How to register custom UDF jar in HiveThriftServer2?...


apache-spark hive amazon-emr spark-thriftserver

My python job I run on the master of EMR cluster fails, how do I troubleshoot?...


python hadoop hadoop-yarn emr amazon-emr

adding python packages for use in spark in aws EMR...


apache-spark amazon-emr

AWS configuration for Apache flink using EMR...


apache-flink emr amazon-emr amazon-kinesis flink-streaming

AWS EMR job using PowerShell Cmdlet...


powershell amazon-web-services emr amazon-emr

Mapping in elasticsearch for multiple fields...


elasticsearch amazon-emr

How to choose appropriate analyzer in Elasticsearch...


elasticsearch amazon-emr

Submit spark job to AWS EMR with java code and wait for the execution and get final status...


java apache-spark amazon-emr

EMR Spark working in a java main, but not in a java function...


java amazon-web-services apache-spark emr amazon-emr

How do I scale my AWS EMR cluster with 1 master and 2 core nodes using AWS auto scaling? Is there a ...


hadoop amazon-web-services amazon-emr hadoop2

How to start pig with -t ColumnMapKeyPrune on aws emr...


apache-pig amazon-emr

Best way to process .csv data using AWS...


amazon-web-services amazon-ec2 amazon-redshift amazon-emr amazon-data-pipeline

Does EMRFS make S3 consistent for external clients...


hadoop amazon-s3 amazon-emr

file path in hdfs...


java hadoop amazon-ec2 mapreduce amazon-emr

Why Spark on AWS EMR doesn't load class from application fat jar?...


apache-spark classpath emr amazon-emr

WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs: on EMR...


hadoop hbase amazon-emr mapr

AWS EMR Spark Step args bug...


amazon-web-services apache-spark emr amazon-emr
