
Trino load testing returns SERVICE_UNAVAILABLE


I am running a simple load test with concurrent queries on Trino. Most of the requests succeed, but the ones that fail correspond to the following log entries:

2024-05-03 07:13:08 2024-05-03T05:13:08.691Z WARN ContinuousTaskStatusFetcher-20240503_051240_01897_g7z4z.4.0.0-4111 io.trino.server.remotetask.RequestErrorTracker Error getting task status 20240503_051240_01897_g7z4z.4.0.0: http://172.31.0.10:8080/v1/task/20240503_051240_01897_g7z4z.4.0.0
2024-05-03 07:13:08 io.trino.server.remotetask.SimpleHttpResponseHandler$ServiceUnavailableException: Server returned SERVICE_UNAVAILABLE: http://172.31.0.10:8080/v1/task/20240503_051240_01897_g7z4z.4.0.0/status
2024-05-03 07:13:08 at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:52)
2024-05-03 07:13:08 at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:27)
2024-05-03 07:13:08 at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1137)
2024-05-03 07:13:08 at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
2024-05-03 07:13:08 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
2024-05-03 07:13:08 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
2024-05-03 07:13:08 at java.base/java.lang.Thread.run(Thread.java:1570)
2024-05-03 07:13:08
2024-05-03 07:13:08
2024-05-03 07:13:08 2024-05-03T05:13:08.597Z WARN ContinuousTaskStatusFetcher-20240503_051244_01926_g7z4z.5.0.0-1027 io.trino.server.remotetask.RequestErrorTracker Error getting task status 20240503_051244_01926_g7z4z.5.0.0: http://172.31.0.10:8080/v1/task/20240503_051244_01926_g7z4z.5.0.0
2024-05-03 07:13:08 io.trino.server.remotetask.SimpleHttpResponseHandler$ServiceUnavailableException: Server returned SERVICE_UNAVAILABLE: http://172.31.0.10:8080/v1/task/20240503_051244_01926_g7z4z.5.0.0/status
2024-05-03 07:13:08 at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:52)
2024-05-03 07:13:08 at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:27)
2024-05-03 07:13:08 at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1137)
2024-05-03 07:13:08 at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
2024-05-03 07:13:08 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
2024-05-03 07:13:08 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
2024-05-03 07:13:08 at java.base/java.lang.Thread.run(Thread.java:1570)

The error is:

io.trino.server.remotetask.SimpleHttpResponseHandler$ServiceUnavailableException: Server returned SERVICE_UNAVAILABLE:

Which config.properties settings should I fine-tune to increase the HTTP throughput?


Solution

  • You are probably getting the SERVICE_UNAVAILABLE error because your coordinator is running out of resources under load. Try setting node-scheduler.include-coordinator=false in the coordinator's config.properties. With this setting, no query work is scheduled on the coordinator. Please check and let me know if this resolves the issue.
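
As a rough illustration of the suggestion above, a coordinator's config.properties might look something like the sketch below. Only node-scheduler.include-coordinator=false comes from the answer. The other lines are assumptions added for context: the port is taken from the URLs in the logs, and the discovery.uri host is a hypothetical placeholder, so substitute the values from your own deployment.

```properties
# etc/config.properties on the coordinator (sketch, not a complete config)
coordinator=true

# From the answer: keep query processing off the coordinator
node-scheduler.include-coordinator=false

# Assumed from the URLs in the logs; adjust to your setup
http-server.http.port=8080

# Hypothetical placeholder host; use your coordinator's address
discovery.uri=http://coordinator.example:8080
```

The file is read at startup, so the coordinator needs a restart for the change to take effect. Note that this only works if the cluster has at least one separate worker node, since the coordinator will no longer run tasks itself.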