I am testing replication over secure channels following this article:
https://clickhouse.com/docs/en/guides/sre/configuring-ssl/
I have not been able to get it working yet. It seems that ClickHouse Keeper has communication issues: it logs “invalid four letter command” and then closes the connections. The error messages repeat continually (more error messages at the bottom).
2023.03.15 08:51:23.210746 [ 56891 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.210842 [ 56892 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.211167 [ 57317 ] {} <Error> virtual bool DB::DDLWorker::initializeMainThread(): Code: 999. Coordination::Exception: Connection loss, path: All connection tries failed while connecting to ZooKeeper. nodes: secure://10.5.106.233:9281, secure://10.5.106.232:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.233:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.232:9281
The four-letter-word commands (4lw) that Keeper receives seem to be corrupted (or encoded). I have attached a picture because they are not visible in a text editor.
I have copied some of my configuration below. Since clickhouse-client is able to connect to the same servers and I am following the article above (which has the comprehensive configuration data), I have not included every setting, but please let me know if more is needed.
I am using 23.1.3. Without SSL, with similar configurations, replication worked fine. For now I am only doing some initial testing, so I am only using 2 servers.
Any insight will be appreciated. Thanks in advance.
Configurations, openssl check, and error messages:
<zookeeper>
    <node>
        <host>MY-232.ch_test.local</host>
        <port>9281</port>
        <secure>1</secure>
    </node>
    <node>
        <host>MY-233.ch_test.local</host>
        <port>9281</port>
        <secure>1</secure>
    </node>
</zookeeper>
<keeper_server>
    <tcp_port>9281</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>MY-232.ch_test.local</hostname>
            <port>9444</port>
            <secure>1</secure>
        </server>
        <server>
            <id>2</id>
            <hostname>MY-233.ch_test.local</hostname>
            <port>9444</port>
            <secure>1</secure>
        </server>
    </raft_configuration>
</keeper_server>
[root@MY-233 ~]# openssl version
OpenSSL 1.1.1g FIPS 21 Apr 2020
[root@MY-233]# openssl s_client -connect MY-232.ch_test.local:9281
CONNECTED(00000003)
write:errno=0
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 320 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
2023.03.15 08:51:23.209573 [ 56892 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.209643 [ 56891 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.210139 [ 56891 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.210297 [ 56891 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.210746 [ 56891 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.210842 [ 56892 ] {} <Warning> KeeperTCPHandler: invalid four letter command
2023.03.15 08:51:23.211167 [ 57317 ] {} <Error> virtual bool DB::DDLWorker::initializeMainThread(): Code: 999. Coordination::Exception: Connection loss, path: All connection tries failed while connecting to ZooKeeper. nodes: secure://10.5.106.233:9281, secure://10.5.106.232:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.233:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.232:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.233:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.232:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.233:9281
Poco::Exception. Code: 1000, e.code() = 0, SSL connection unexpectedly closed (version 23.1.3.5 (official build)), 10.5.106.232:9281
. (KEEPER_EXCEPTION), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0xddb0df5 in /usr/bin/clickhouse
1. Coordination::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, Coordination::Error, int) @ 0x14c55e10 in /usr/bin/clickhouse
2. Coordination::Exception::Exception(Coordination::Error, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) @ 0x14c56537 in /usr/bin/clickhouse
3. Coordination::ZooKeeper::connect(std::__1::vector<Coordination::ZooKeeper::Node, std::__1::allocator<Coordination::ZooKeeper::Node>> const&, Poco::Timespan) @ 0x14ca3798 in /usr/bin/clickhouse
4. Coordination::ZooKeeper::ZooKeeper(std::__1::vector<Coordination::ZooKeeper::Node, std::__1::allocator<Coordination::ZooKeeper::Node>> const&, zkutil::ZooKeeperArgs const&, std::__1::shared_ptr<DB::ZooKeeperLog>) @ 0x14ca204d in /usr/bin/clickhouse
5. zkutil::ZooKeeper::init(zkutil::ZooKeeperArgs) @ 0x14c58fda in /usr/bin/clickhouse
6. zkutil::ZooKeeper::ZooKeeper(Poco::Util::AbstractConfiguration const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::ZooKeeperLog>) @ 0x14c5c5f9 in /usr/bin/clickhouse
7. DB::Context::getZooKeeper() const @ 0x12a01d1b in /usr/bin/clickhouse
8. DB::DDLWorker::getAndSetZooKeeper() @ 0x12a709cd in /usr/bin/clickhouse
9. DB::DDLWorker::initializeMainThread() @ 0x12a82985 in /usr/bin/clickhouse
10. DB::DDLWorker::runMainThread() @ 0x12a6e782 in /usr/bin/clickhouse
11. void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<void (DB::DDLWorker::*)(), DB::DDLWorker*>(void (DB::DDLWorker::*&&)(), DB::DDLWorker*&&)::'lambda'(), void ()>>(std::__1::__function::__policy_storage const*) @ 0x12a837a9 in /usr/bin/clickhouse
12. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xde7de76 in /usr/bin/clickhouse
13. ? @ 0xde831a1 in /usr/bin/clickhouse
14. /usr/src/debug/glibc-2.28/nptl/pthread_create.c:480: start_thread @ 0x814a in /usr/lib/debug/usr/lib64/libpthread-2.28.so.debug
15. /usr/src/debug/glibc-2.28/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97: clone @ 0xfcf23 in /usr/lib/debug/usr/lib64/libc-2.28.so.debug
(version 23.1.3.5 (official build))
There is a small mistake in your config.
When a secure channel is used, you need to define tcp_port_secure instead of tcp_port under keeper_server.
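As a minimal sketch based on the keeper_server block you posted, the change would look roughly like this. Only the port element changes; the rest is meant to stay exactly as in your config. I am assuming the openSSL server section from the article is already in place, since Keeper needs a certificate and key to serve TLS.

```xml
<keeper_server>
    <!-- was: <tcp_port>9281</tcp_port>, which listens in plaintext -->
    <tcp_port_secure>9281</tcp_port_secure>
    <server_id>1</server_id>
    <!-- storage paths, coordination_settings and raft_configuration stay as posted -->
</keeper_server>
```

With tcp_port, Keeper listens in plaintext on 9281, so the TLS handshake bytes sent by the client are most likely what it reports as an "invalid four letter command". That would also fit your openssl s_client output, which shows 0 bytes read and no peer certificate.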
Hope this solves your issue!