Correct way to bring up the yb-master and yb-tserver processes in a YugabyteDB cluster

[Question posted by a user on YugabyteDB Community Slack]

We are using YugabyteDB 2.14.0.0 on a 3-node setup, where each node runs one yb-master and one yb-tserver. We first bring up the yb-master on each node using a command of the following form:

nohup yb-master --master_addresses IP1:7100,IP2:7100,IP3:7100 \
    --rpc_bind_addresses LOCAL_NODE_IP:7100 \
    --fs_data_dirs /data/vlst/yugabyte/yugabyte-2.14.0.0/data \
    --max_log_size 100 &> yb-master.out &

and then the yb-tserver on each node:

nohup yb-tserver --tserver_master_addrs IP1:7100,IP2:7100,IP3:7100 \
    --rpc_bind_addresses LOCAL_NODE_IP:9100 \
    --fs_data_dirs /data/vlst/yugabyte/yugabyte-2.14.0.0/data \
    --max_log_size 100 \
    --start_pgsql_proxy \
    --pgsql_proxy_bind_address LOCAL_NODE_IP:5433 \
    --ysql_log_statement all \
    --ysql_timezone LOCAL_TIMEZONE \
    --pg_yb_session_timeout_ms 900000 \
    --cql_proxy_bind_address LOCAL_NODE_IP:9042 \
    --cql_rpc_keepalive_time_ms 0 \
    --ysql_client_read_write_timeout_ms 300000 \
    --yb_client_admin_operation_timeout_sec 300 &> yb-tserver.out &

Is this the correct way to bring up the yb-master and yb-tserver processes, or do we need to change something (for example, use different command line options)? The reason I ask is that we have been having some intermittent issues (a yb-tserver process dies, and we get transaction errors while creating database tables) and would like to make sure we are starting with a proper setup.

Solution

Use a flagfile; that is a good way to keep the configuration consistent, and it is easier to administer. With flags set directly on the command line, it is harder to understand what is set, especially if you use nohup on the CLI directly, because then the only record of the current flags is the command line itself, which you can see while the process is still running, and the history of the shell. That is not the best way. I would go further and say it's mandatory to have a flagfile.
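To make that concrete, here is a minimal sketch that moves the yb-master flags from the question into a flagfile and starts the process with --flagfile. The CONF_DIR location and the master.conf file name are assumptions chosen for illustration, and IP1, IP2, IP3 and LOCAL_NODE_IP are the same placeholders used in the question. The yb-tserver flags can be handled the same way in a second file.

```shell
#!/usr/bin/env bash
# Sketch: write a yb-master flagfile, then start yb-master from it.
set -eu

# Hypothetical location for the flagfile; adjust to your own layout.
CONF_DIR="${CONF_DIR:-./yb-conf}"
mkdir -p "$CONF_DIR"

# One flag per line, in --name=value form.
# IP1..IP3 and LOCAL_NODE_IP are placeholders from the question.
cat > "$CONF_DIR/master.conf" <<'EOF'
--master_addresses=IP1:7100,IP2:7100,IP3:7100
--rpc_bind_addresses=LOCAL_NODE_IP:7100
--fs_data_dirs=/data/vlst/yugabyte/yugabyte-2.14.0.0/data
--max_log_size=100
EOF

# Start the process only if the binary is available on this machine.
if command -v yb-master >/dev/null 2>&1; then
    nohup yb-master --flagfile "$CONF_DIR/master.conf" > yb-master.out 2>&1 &
else
    echo "yb-master not found on PATH; flagfile written to $CONF_DIR/master.conf"
fi
```

With this layout the flagfile can be kept under version control and compared across the three nodes, so only the node-specific values differ.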

I would suggest using a systemd unit file for startup, so that startup is scripted and consistent. I can provide an example if you like.
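As a rough sketch of what such a unit could look like, the script below writes a yb-master.service file into the current directory. The binary path, the flagfile path, the yugabyte user, and the restart and file-descriptor settings are all assumptions for illustration, not recommended values from YugabyteDB; check them against your installation before using anything like this.

```shell
#!/usr/bin/env bash
# Sketch: generate a systemd unit for yb-master that starts it from a flagfile.
set -eu

# Assumed install root, derived from the data directory in the question.
YB_HOME="/data/vlst/yugabyte/yugabyte-2.14.0.0"

cat > ./yb-master.service <<EOF
[Unit]
Description=YugabyteDB yb-master
Wants=network-online.target
After=network-online.target

[Service]
# Assumed service account; use whichever user owns the data directory.
User=yugabyte
# Assumed binary and flagfile locations.
ExecStart=${YB_HOME}/bin/yb-master --flagfile ${YB_HOME}/conf/master.conf
# Illustrative values, not tuned recommendations.
Restart=on-failure
RestartSec=5
LimitNOFILE=1048576

[Install]
WantedBy=multi-user.target
EOF

# To install (requires root):
#   sudo cp yb-master.service /etc/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now yb-master
```

A second unit for yb-tserver would follow the same pattern with its own flagfile. Running under systemd also means a process that dies is restarted and its exit is recorded in the journal, which helps when documenting an issue.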

I think the basis of your question lies in the issues you have faced. I would suggest carefully documenting each issue and asking in this channel for help. The issues might not be related to specific startup options, but to other things.

One of the things that we frequently see is people expecting an experience that is exactly like postgres. Whilst we strive for that, the storage engine is inherently different, which is why YugabyteDB exists, and that means there are, or can be, some differences. The first is the default isolation level: postgres uses read committed, while the default isolation level of YugabyteDB YSQL is repeatable read. That isolation level means that concurrent DML on the same rows can cause a transaction to expire. Another difference is DDL and the catalog: the catalog is stored in the master, and a DDL change needs some time to be added and become consistent across the nodes, which takes longer than on postgres itself. Because of that, concurrent DDL might require a reduction in concurrency.
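If the transaction errors turn out to be transient conflicts of this kind, one common way to cope on the client side is to retry the failed step with a backoff. The sketch below is a generic shell helper, not something YugabyteDB provides; the function name retry, the attempt count, and the create_tables.sql file in the usage comment are assumptions for illustration. It retries on any non-zero exit status, so it is only appropriate for steps that are safe to run again.

```shell
#!/usr/bin/env bash
# Sketch: retry a command with exponential backoff.
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
    local max_attempts="$1"; shift
    local attempt=1
    local delay=1   # seconds; doubled after every failed attempt

    # Run the command until it succeeds or the attempt budget is used up.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}

# Example (hypothetical script name; ON_ERROR_STOP makes ysqlsh exit
# non-zero when a statement fails):
#   retry 5 ysqlsh -h LOCAL_NODE_IP -v ON_ERROR_STOP=1 -f create_tables.sql
```

Running DDL scripts one at a time through a helper like this, rather than from several sessions in parallel, is one way to reduce DDL concurrency.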