Search code examples
Best technology stack for aggregation across various properties...

hadoop, amazon-web-services, amazon-s3, amazon-redshift, elastic-map-reduce

fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey are not set for EMR default IAM roles...

emr, elastic-map-reduce, amazon-emr

Elastic Search Nested Query with Nested Object...

elasticsearch, querydsl, elastic-map-reduce, elasticsearch-plugin

Processing HUGE number of small files independently...

hadoop, amazon-web-services, amazon-ec2, mapreduce, elastic-map-reduce

Reading many small files from S3 very slow...

amazon-web-services, amazon-s3, hive, apache-pig, elastic-map-reduce

No module named simplejson in python UDF on EMR...

amazon-web-services, apache-pig, elastic-map-reduce

AWS EMR - install HUE using Java SDK...

java, amazon-web-services, emr, elastic-map-reduce, hue

Running MapReduce jobs on AWS-EMR from Eclipse...

java, jar, mapreduce, elastic-map-reduce, amazon-emr

Elasticsearch-Hadoop get Non-indexed data...

hadoop, elasticsearch, hadoop-streaming, elastic-map-reduce, elasticsearch-hadoop

Hadoop map-reduce mapper programming...

java, hadoop, elastic-map-reduce

What is the best practice to monitor AWS EMR job running progress?...

java, amazon-web-services, emr, elastic-map-reduce, amazon-emr

AWS EMR: how to get the first element out of describe_jobflows() API call result...

python, amazon-web-services, boto, emr, elastic-map-reduce

BZip2 Native Splitting on Amazon/EMR...

hadoop, amazon-s3, elastic-map-reduce, bzip2

Launching a map reduce job in amazon elastic map reduce...

elastic-map-reduce

How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic ...

amazon-web-services, hadoop-streaming, elastic-map-reduce, hadoop-yarn, hadoop2

Exception in thread "main" org.elasticsearch.client.transport.NoNodeAvailableException: No...

java, search, elasticsearch, search-engine, elastic-map-reduce

Possibility of taking snapshot of AWS EMR cluster or namenode...

amazon-web-services, snapshot, elastic-map-reduce

Spark/Hadoop throws exception for large LZO files...

hadoop, apache-spark, elastic-map-reduce, lzo

parallel generation of random forests using scikit-learn...

python, r, scikit-learn, random-forest, elastic-map-reduce

Is there a way to launch EMR jobs on AWS Virtual Private Cloud....

hadoop, amazon-web-services, elastic-map-reduce, amazon-vpc

LeaseExpiredException with custom UDF in Hive...

hadoop, hive, elastic-map-reduce, emr

How to use Python streaming UDFs in pig on Amazon EMR...

python, numpy, apache-pig, elastic-map-reduce, amazon-ami

File not caching on AWS Elastic Map Reduce...

python, hadoop, amazon-web-services, elastic-map-reduce

Number of concurrently running mappers per node drops precipitously on Elastic MapReduce w/ AMI 3.1....

hadoop, amazon-web-services, amazon-ec2, elastic-map-reduce, hadoop-yarn

The reduce fails due to Task attempt failed to report status for 600 seconds. Killing! Solution?...

java, eclipse, hadoop, mapreduce, elastic-map-reduce

Is there an easy way to dedupe a Hive table?...

hive, apache-pig, elastic-map-reduce

Map Error - Attempt_xxxx_ timed out after 600 seconds...

hadoop, dictionary, mapreduce, timeout, elastic-map-reduce

"Unable to verify integrity of data" while running MR job...

hadoop, amazon-web-services, amazon-s3, mapreduce, elastic-map-reduce

Copying a large file (~6 GB) from S3 to every node of an Elastic MapReduce cluster...

caching, hadoop, amazon-web-services, amazon-s3, elastic-map-reduce

Hive: converting a comma separated string to array for table generating function...

user-defined-functions, hive, elastic-map-reduce
