google-cloud-platform  google-cloud-storage  clickhouse

Error while reading Parquet files from Google Cloud Storage into ClickHouse


I have an SQL script that loads data from Parquet files in Google Cloud Storage into a ClickHouse table using the s3 function. It had been running without issues for approximately two years, but today it started failing. This is what I get when I run the SQL query in DBeaver:

SQL Error [23] [07000]: Code: 23. DB::Exception: Cannot read from istream at offset 0: While executing ParquetBlockInputFormat: While executing S3. (CANNOT_READ_FROM_ISTREAM) (version 23.7.3.14 (official build))
, server ClickHouseNode [uri=http://servername.ru:8123/default, options={socket_timeout=30000000,use_server_time_zone=false,use_time_zone=false}]@1364421048
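
For reference, a load of this kind typically looks like the sketch below. It is not the real script: the bucket path, the HMAC key placeholders, and the target table name are all hypothetical. It assumes Google Cloud Storage is accessed through its S3-compatible endpoint with HMAC credentials.

```sql
-- Minimal sketch, not the actual query.
-- 'my-bucket', the events_*.parquet pattern, the HMAC placeholders and
-- default.events are assumptions made for illustration only.
INSERT INTO default.events
SELECT *
FROM s3(
    'https://storage.googleapis.com/my-bucket/events_*.parquet',
    '<HMAC_ACCESS_KEY>',
    '<HMAC_SECRET>',
    'Parquet'
);
```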

This is what I see in ClickHouse's system.text_log for that query_id:

Row 1:
──────
event_date:              2024-06-13
event_time:              2024-06-13 10:53:01
event_time_microseconds: 2024-06-13 10:53:01.291989
microseconds:            291989
thread_name:             VFSRead
thread_id:               45246
level:                   Error
query_id:                48eeab01-34f6-47d2-a452-901af146bb8e
logger_name:             AWSClient
message:                 Failed to make request to: ***path_to_google_storgae*** 000000000060events_20240612_1_of_6.parquet: Poco::Exception. Code: 1000, e.code() = 0, Timeout, Stack trace (when copying this message, always include the lines below):

0. Poco::Net::SecureSocketImpl::mustRetry(int, Poco::Timespan&) @ 0x0000000018250b4c in /usr/bin/clickhouse
1. Poco::Net::SecureSocketImpl::receiveBytes(void*, int, int) @ 0x0000000018251df4 in /usr/bin/clickhouse
2. Poco::Net::HTTPSession::refill() @ 0x000000001827246f in /usr/bin/clickhouse
3. Poco::Net::HTTPHeaderStreamBuf::readFromDevice(char*, long) @ 0x000000001826b680 in /usr/bin/clickhouse
4. ? @ 0x0000000018170668 in /usr/bin/clickhouse
5. std::basic_streambuf<char, std::char_traits<char>>::uflow() @ 0x0000000008f7f3ea in /usr/bin/clickhouse
6. std::basic_istream<char, std::char_traits<char>>::get() @ 0x0000000008f80a19 in /usr/bin/clickhouse
7. Poco::Net::HTTPResponse::read(std::basic_istream<char, std::char_traits<char>>&) @ 0x00000000182707ef in /usr/bin/clickhouse
8. Poco::Net::HTTPClientSession::receiveResponse(Poco::Net::HTTPResponse&) @ 0x000000001825e655 in /usr/bin/clickhouse
9. ? @ 0x0000000012e32685 in /usr/bin/clickhouse
10. DB::S3::PocoHTTPClient::makeRequestInternal(Aws::Http::HttpRequest&, std::shared_ptr<DB::S3::PocoHTTPResponse>&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const @ 0x0000000012e29192 in /usr/bin/clickhouse
11. DB::S3::PocoHTTPClient::MakeRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const @ 0x0000000012e28f7e in /usr/bin/clickhouse
12. Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const @ 0x0000000018483b64 in /usr/bin/clickhouse
13. Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const @ 0x0000000018480902 in /usr/bin/clickhouse
14. Aws::Client::AWSClient::MakeRequestWithUnparsedResponse(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const @ 0x00000000184882e6 in /usr/bin/clickhouse
15. Aws::S3::S3Client::GetObject(Aws::S3::Model::GetObjectRequest const&) const @ 0x000000001855f919 in /usr/bin/clickhouse
16. DB::S3::Client::GetObject(DB::S3::ExtendedRequest<Aws::S3::Model::GetObjectRequest> const&) const @ 0x0000000012e06e7b in /usr/bin/clickhouse
17. DB::ReadBufferFromS3::sendRequest(unsigned long, std::optional<unsigned long>) const @ 0x0000000012e4ff03 in /usr/bin/clickhouse
18. DB::ReadBufferFromS3::initialize() @ 0x0000000012e4de4c in /usr/bin/clickhouse
19. DB::ReadBufferFromS3::nextImpl() @ 0x0000000012e4d579 in /usr/bin/clickhouse
20. DB::ReadBufferFromRemoteFSGather::readImpl() @ 0x0000000012eb6240 in /usr/bin/clickhouse
21. DB::ReadBufferFromRemoteFSGather::readInto(char*, unsigned long, unsigned long, unsigned long) @ 0x0000000012eb59f3 in /usr/bin/clickhouse
22. ? @ 0x0000000012de83e2 in /usr/bin/clickhouse
23. ? @ 0x0000000012de8b23 in /usr/bin/clickhouse
24. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000ea0b70f in /usr/bin/clickhouse
25. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000ea0e45c in /usr/bin/clickhouse
26. ThreadPoolImpl<std::thread>::worker(std::__list_iterator<std::thread, void*>) @ 0x000000000ea0704f in /usr/bin/clickhouse
27. ? @ 0x000000000ea0d041 in /usr/bin/clickhouse
28. start_thread @ 0x0000000000008105 in /usr/lib64/libpthread-2.17.so
29. clone @ 0x00000000000feb2d in /usr/lib64/libc-2.17.so
 (version 23.7.3.14 (official build))
revision:                54476
source_file:             src/Common/Exception.cpp; void DB::tryLogCurrentExceptionImpl(Poco::Logger *, const std::string &)
source_line:             225
message_format_string:   
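
A row like the one above can be pulled with a query along these lines. This is a sketch that assumes text_log is enabled in the server configuration. The column names are the ones visible in the output above, and FORMAT Vertical produces the same "Row 1:" layout.

```sql
-- Sketch: fetch error-level log entries for one query.
-- The query_id is the one from the failing run shown above.
SELECT
    event_time_microseconds,
    thread_name,
    level,
    logger_name,
    message
FROM system.text_log
WHERE query_id = '48eeab01-34f6-47d2-a452-901af146bb8e'
  AND level = 'Error'
ORDER BY event_time_microseconds
FORMAT Vertical;
```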

We have a separate DEV environment with the same configuration, and the same SQL script reading the same Parquet files completes there without issues. To repeat, we had no problems with this query until today.

We tried rebooting ClickHouse on Prod and running the query multiple times on the Prod environment (same error every time). On the Dev environment the query succeeds every time.


Solution

  • The issue resolved itself; the query now works without errors.