Search code examples
http-headersprotocol-buffersclickhousedbeaverclickhouse-client

Clickhouse protobuf output format


I use clickhouse server in docker with just 1 table and several rows in it. I can request all the data in default format with clickhouse client (over TCP) or with some GUI tool like DBeaver (over HTTP).

SELECT * FROM some_table;

Also I can change format to something special:

SELECT * FROM some_table FORMAT Pretty;

I want to request data from clickhouse in protobuf format. Query looks like this:

SELECT * FROM some_table FORMAT Protobuf SETTINGS format_schema = 'proto_file:ProtoStructure';

I have the proto_file.proto in the same directory with clickhouse-client, so I can made TCP request throw it (successful).

But I don't know data structure of TCP request to reproduce it in my program by myself. So I tried to execute the same request in HTTP (through DBeaver) to intercept request and reproduce it. Unfortunately I can't execute script in DBeaver properly, because it complains on proto_file.proto (File not found, I don't know where to place it to make it work). The only thing I known, that format is specified by X-Clickhouse-Format HTTP header, but I don't know and can't find any info about where in HTTP request I should place content of proto file.

So, the main question: Is there any examples of pure HTTP request to clickhouse for protobuf data output format?


Solution

  • SETTINGS format_schema = 'proto_file:ProtoStructure' -- is the feature of clickhouse-client application. It's only possible with clickhouse-client.

    clickhouse-client is the reach client. It queries data from clickhouse-server using TPC/native protocol and forms Protobuf by itself using the schema file.

    clickhouse-server is also able to form Protobuf using .proto files (using HTTP and GRPC protocols). But in this case .proto files should be placed at the clickhouse-server node into /var/lib/clickhouse/format_schemas/ folder.

    https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings/#server_configuration_parameters-format_schema_path


    for example I created .proto file

    cat /var/lib/clickhouse/format_schemas/test.proto
    syntax = "proto3";
    
    message TestMessage {
      int64 id = 1;
      uint32 blockNo = 2;
      string val1 = 3;
      float  val2 = 4;
      uint32 val3 = 5;
    };
    

    made it available chown clickhouse.clickhouse test.proto

    Now I can do this

    curl -o out.protobuf 'localhost:8123/?format_schema=test:TestMessage&query=select+1+id+from+numbers(10)+format+Protobuf'