Search code examples
linuxoperating-systemnamed-pipesio-redirection

Why does tee make a different in named fifo?


so the cpp code is trying to solve the problem on codeforces using some random numbers and BSGS algorithm.

I had issue when debugging this problem because it talks to another program through STDIN and STDOUT. I tried with named pipe:

preparation: first save the python judger code below into gen.py. then compile the cpp code below into g2.exe. then make 2 named fifo in working directory by makefifo g2in; makefifo g2out;.

open 2 terminals and execute cmds in each terminal: ./g2 < g2in > g2out and ./gen.py < g2out > g2in and both terminals get hung up forever and show no progress at all.

but if I do a tee in one of the commands, things would be different: ./g2 < g2in | tee g2out and ./gen.py < g2out > g2in. the program progress and exit as expected and the judger could tell the program is correctly behaved.

I've tried to take the strace down:

left side is the one with tee, right without

left side is the one with tee, right the one without. I wonder what's going on here and what makes the different?

this is python judger code:

#!/usr/bin/env python
import sys
import random
import subprocess


def main():
    n = random.randint(1, 1e6)
    sys.stderr.write(f"begin with {n=}\n")
    v = [x + 1 for x in range(n)]
    random.shuffle(v)
    idx = 0
    opt = 0
    print(f"{v[idx]}")
    sys.stdout.flush()
    while True:
        #sys.stdin.flush()
        #line = process.stdout.readline().strip()
        line = input()
        #sys.stderr.write(f"read {line}\n")
        if not line:
            break
        if line[0] == '!':
            exp = int(line[1:])
            if exp == n:
                sys.stderr.write(f"get right ans {exp=}\n")
                return 0
            else:
                raise Exception(f"real {n=}, {exp=}\n")
        else:
            op = line[0]
            cnt = int(line[1:])
            sys.stderr.write(f"get a query request at {cnt=}\n")
            if op == '+':
                idx = (idx + cnt) % n
            else:
                assert op == '-', f"{op} not '-'"
                idx = (idx - cnt) % n
            print(f"{v[idx]}")
            sys.stdout.flush()
            sys.stderr.write(f"write and flush ans {v[idx]=} done\n")
        opt += 1
        if opt % 100 == 0:
            sys.stderr.write(f"opt get ${opt}\n")
        if opt >= 1000:
            break

    raise Exception("Too many try")


if __name__ == "__main__":
    main()

this is the correct cpp code which get accepted:

#include <assert.h>
#include <bits/stdc++.h>
using namespace std;
#ifdef __DEBUG__
#include "dbg.h"
#else
#define dbg(...) 42
#endif
template <class T> using mpq = priority_queue<T, vector<T>, greater<T>>;

using ll = long long;
using pii = pair<int, int>;
using pll = pair<ll, ll>;
using vl = vector<ll>;
using vi = vector<int>;

mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

int get_rdn() { return rng() % int(1e6 + 3); }

int main(int argc, char **argv)
{
    int x, m = 0;
    scanf("%d", &m);
    dbg(m);
    for (int i = 0; i < 300; ++i) {
        int roll = get_rdn();
        printf("+ %d\n", roll), fflush(stdout);
        scanf("%d", &x);
        m = max(m, x);
    }
    dbg(m);
    map<int, int> cnt;
    int offset = 0;
    for (int i = 0; i < 340; ++i) {
        printf("+ 1\n"), fflush(stdout);
        scanf("%d", &x);
        if (cnt.count(x)) {
            printf("! %d\n", offset), fflush(stdout);
            return 0;
        }
        cnt[x] = offset++;
    }
    dbg(offset);
    printf("+ %d\n", m - 340), fflush(stdout);
    scanf("%d", &x);
    offset = 0;
    for (int i = 0; i < 340; ++i) {
        printf("+ 340\n"), fflush(stdout);
        scanf("%d", &x);
        if (cnt.count(x)) {
            dbg(offset + m + 339 - cnt[x]);
            printf("! %d\n", offset + m + 339 - cnt[x]), fflush(stdout);
            return 0;
        }
        offset += 340;
    }
    return 0;
};

I am using WSL2 btw: Linux Ubuntu-WSL2 4.19.128-microsoft-standard #1 SMP Tue Jun 23 12:58:10 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


Solution

  • Order matters. When you try to open a FIFO for reading, it blocks until something else opens the other end for writing (And vis versa). Shell redirections are executed in a left to right order. So in your first command ./g2 < g2in > g2out, the shell tries to open g2in for reading and blocks waiting for it to also be opened for writing. The g2 program hasn't even been started yet; that happens after all the redirections are set up. With the second command, ./gen.py < g2out > g2in, first the shell tries to open g2out for reading, which again blocks waiting for something to open the other end of the pipe. The commands become deadlocked, each waiting for the other to do something that it won't do until the other does something else first.

    If you keep the same order of files in the directions, by running ./g2 < g2in > g2out and ./gen.py > g2in < g2out, first both ends of the g2in FIFO are opened, and then both ends of g2out, and both programs are executed and can talk to each other.

    ./g2 < g2in | tee g2out worked because the shell only has one redirection, of g2in, and while it's blocking waiting for the write end of that FIFO to open, the rest of the pipeline is also being executed in parallel; tee opens g2out for writing, which allows the rest of ./gen.py < g2out > g2in to run (After g2out is open on both ends, g2in is opened for writing (Allowing the g2 < g2in command to proceed) and then ./gen.py is executed).