Tags: hadoop, amazon-s3, hive, minio, apache-tez

hive-on-tez mapper stuck in INITIALIZING with total number of containers being -1 when accessing data on S3/MinIO


I have a Hadoop + Hive + Tez setup built from scratch (meaning I deployed it component by component). Hive is configured to use Tez as its execution engine.

In its current state, Hive can access tables on HDFS, but it cannot access tables stored on MinIO (via the s3a filesystem implementation).

As the attached Tez UI screenshot shows, when executing SELECT COUNT(*) FROM s3_table,

  • Tez execution hangs forever
  • Map 1 stays in the INITIALIZING state
  • Map 1 always shows a total task count of -1 and a pending count of -1 (why -1?)

Things already checked:

  • Hadoop itself can access MinIO/S3 without issue. For example, hdfs dfs -ls s3a://bucketname works fine.
  • Hive-on-Tez can run queries against tables on HDFS, with mappers and reducers generated quickly and successfully.
  • Hive-on-MR can run queries against tables on MinIO/S3 without issue.
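For reference, the s3a settings used for the MinIO checks above live in core-site.xml and look roughly like this. The property names are the standard hadoop-aws ones; the endpoint and credentials are placeholders for my environment:

```xml
<!-- core-site.xml: s3a connector pointed at MinIO.
     Endpoint and credentials are placeholders; adjust to your deployment. -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>http://minio-host:9000</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SECRET_KEY</value>
</property>
<!-- MinIO serves buckets at the path level, not as virtual-host subdomains -->
<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
```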

What could be the possible causes for this problem?

Attaching a Tez UI screenshot.

Version information:

  • Hadoop 3.2.1
  • Hive 3.1.2
  • Tez 0.9.2
  • MinIO RELEASE.2020-01-25T02-50-51Z

Solution

  • It turned out that Tez's S3 support must be enabled explicitly at compile time. For Hadoop 2.8+, to enable S3 support, Tez must be compiled from source with the following command:

    mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Paws -Phadoop28 -P\!hadoop27
    

    After that, upload the generated tez-x.y.z.tar.gz to HDFS and extract tez-x.y.z-minimal.tar.gz to $TEZ_LIB_DIR. That fixed it for me: Hive queries against MinIO/S3 now run smoothly.
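    For completeness, tez-site.xml then has to point at the tarball uploaded to HDFS. The /apps/tez path below is just the directory I chose; only the tez.lib.uris property name is standard:

```xml
<!-- tez-site.xml: tell Tez where the runtime tarball lives on HDFS.
     /apps/tez is an arbitrary directory; adjust to your layout and version. -->
<property>
  <name>tez.lib.uris</name>
  <value>hdfs:///apps/tez/tez-0.9.2.tar.gz</value>
</property>
```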

    However, the Tez installation guide doesn't mention anything about enabling S3 support, and the default Tez binary releases are not built with S3 or Azure support.

    The (hopefully) complete set of build options and pitfalls is actually documented in BUILDING.txt, which says:

    However, to build against hadoop versions higher than 2.7.0, you will need to do the following:

    For Hadoop version X where X >= 2.8.0

    $ mvn package  -Dhadoop.version=${X} -Phadoop28 -P\!hadoop27
    

    For recent versions of Hadoop (which do not bundle aws and azure by default), you can bundle AWS-S3 (2.7.0+) or Azure (2.7.0+) support:

    $ mvn package -Dhadoop.version=${X} -Paws -Pazure