amazon-web-services, amazon-athena

How is “Open port 444” configured for Athena?


Athena’s documentation states that port 444 must be open to support streaming query results.

I encounter an error when querying Athena via JDBC, and the error disappears as soon as I disable query result streaming and use pagination instead.

I am confused by the “keep port 444 open” part. What does that mean for a fully managed, serverless offering like Athena? The docs say nothing more about how to do it, and none of my searching has turned up a satisfactory answer.

Which VPC does Athena use? And which security group? Can I alter the rules to allow outbound traffic on port 444?

What is the missing piece?


Solution

  • Caveat: I haven't used the China Regions you're linking to, and I think they may differ subtly from the rest of the AWS global infrastructure, so take this with a grain of salt.

    The docs outline the following point, which helps to explain when this affects you:

    Open port 444 – Keep port 444, which Athena uses to stream query results, open to outbound traffic. When you use a PrivateLink endpoint to connect to Athena, ensure that the security group attached to the PrivateLink endpoint is open to inbound traffic on port 444. If port 444 is blocked, you may receive the error message [Simba][AthenaJDBC](100123) An error has occurred. Exception during column initialization.

    If you're calling the Athena service from a resource running inside a VPC through an interface VPC endpoint, the security group attached to that endpoint must also allow inbound traffic on port 444, not only on the usual suspects (80, 443).
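For the interface endpoint case, a minimal sketch of adding that inbound rule with boto3 might look like the following. The security group ID and CIDR range are placeholders, the helper names are hypothetical, and the sketch assumes your credentials are allowed to modify the security group attached to the endpoint.

```python
def build_port_444_ingress(client_cidr: str) -> list:
    """Build an IpPermissions payload allowing inbound TCP 444 from client_cidr."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 444,
            "ToPort": 444,
            "IpRanges": [
                {"CidrIp": client_cidr, "Description": "Athena result streaming"}
            ],
        }
    ]


def open_port_444(security_group_id: str, client_cidr: str) -> None:
    """Add the port 444 inbound rule to the endpoint's security group."""
    # Imported lazily so the payload helper above works without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=build_port_444_ingress(client_cidr),
    )


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with the security group of your
    # interface endpoint and the CIDR range your JDBC clients connect from.
    open_port_444("sg-0123456789abcdef0", "10.0.0.0/16")
```

The same rule can be added in the console or with infrastructure-as-code; the point is only that port 444 has to appear alongside 443 on the endpoint's security group.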

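    If you're not using an interface VPC endpoint and instead call the public Athena endpoint (the default), there is nothing to configure on the Athena side, as AWS ensures that the public endpoint can receive traffic on port 444. In that case you only need to make sure that your own network (the client's security group, firewall, or proxy) allows outbound traffic on that port.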
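To find out whether port 444 is actually reachable from the machine running the JDBC client, a plain TCP connection test is usually enough. The sketch below uses only the Python standard library. The hostname is an assumption (a regional Athena endpoint in a standard Region; China Regions use a different domain suffix), so substitute the endpoint your driver connects to.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Assumed endpoint name; replace with the one your driver actually uses.
    host = "athena.us-east-1.amazonaws.com"
    for port in (443, 444):
        status = "reachable" if can_reach(host, port) else "blocked or unreachable"
        print(f"{host}:{port} is {status}")
```

If 443 is reachable but 444 is not, that points to a firewall, proxy, or security group rule between the client and the endpoint rather than anything on the Athena side.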