multithreading haskell concurrency deadlock socat

Why doesn't this program return? Or is it a deadlock?

Here's the full program¹:

module Main where

import Control.Concurrent.Async
import Control.Concurrent.MVar
import System.Environment
import System.IO
import System.Process

main :: IO ()
main = do
  args <- getArgs
  (Just i, Just o, Nothing, p) <- createProcess (proc "socat" args)
                                                {std_in = CreatePipe, std_out = CreatePipe}
  sequence_ [hSetBuffering h NoBuffering | h <- [i, o, stdin, stdout]]
  hSetEcho stdin False
  mine <- newEmptyMVar
  res <- concurrently
    (do c <- getChar
        putMVar mine c
        hPutChar i c)
    (do other <- hGetChar o
        mine' <- takeMVar mine
        return (mine', other))
  print $ snd res
  terminateProcess p

I launch two instances of it in different terminals, like this:

$ cabal run myprogram -- TCP-LISTEN:12345,fork -       # in terminal 1
$ cabal run myprogram -- TCP-CONNECT:localhost:12345 - # in terminal 2

(in that order), then I hit one key in each terminal (the order doesn't matter), and each of them prints those two keys as a pair (with the sides swapped).

Sometimes, though, if I hit the key in terminal 1 first, the process in terminal 1 doesn't return (while the one in terminal 2 does).

Not being particularly experienced with concurrent programming, I wouldn't be surprised by a deadlock, but I don't see how this can be one! Here are my observations and reasoning:

Terminal 2 always returns (at least I've never seen it not return);
the behavior of terminal 1 seems to be "fixed" if I add the line putStrLn "hello" right before print $ snd res;
if I swap the lines print $ snd res and terminateProcess p, the "wrong" behavior is much more frequent;
the deadlock could happen between the two threads of each of the two processes I launch, but
- since terminal 2 is always returning (and specifically it prints the pair snd res), surely those 2 threads haven't deadlocked, I think,
- the other one is running an identical program, the only difference being the arguments passed to the spawned socat process, so I don't understand why the behavior should be asymmetrical, in the sense that if there was a deadlock, I'd expect it to happen regardless of what program received the keystroke first;
- and how can deadlock happen between the 2 threads (of each process) if one thread is calling putMVar and the other is calling takeMVar on the same single MVar in that program run? Each of the two calls will block until the other catches up, no?

(¹) In this very stripped-down example, the problem occurs fairly rarely, but not too rarely (a few tens of attempts seem to suffice). I do have a slightly noisier example that seems to be affected a bit more often, but I don't think it differs "structurally" from this one, so I haven't posted it, to keep things simple in case the reason for the observed behavior is apparent to the experts. I can post it if deemed useful.

Solution

Here is one sequence of events that leads to the observed behavior:

you type a char and it is transmitted: terminal₁ → app₁ → socat₁ → socat₂ → app₂
you type a char in terminal₂ → app₂ → socat₂

now both threads in app₂ are done, so:

it prints both chars and then
terminates socat₂ without checking whether socat₂ has finished sending, and thus possibly before it has had a chance to forward the char from terminal₂ to socat₁

finally:

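app₁ waits forever for socat₁ to deliver a char, which never happens because socat₁ never receives one. So this is not a deadlock between putMVar and takeMVar: the second thread of app₁ is simply blocked in hGetChar o.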
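
One way to avoid killing the child too early could look like the sketch below. Instead of calling terminateProcess right away, it closes the child's stdin so the child sees EOF, gives it a grace period to exit on its own, and only then falls back to terminateProcess. The name stopGracefully, the 10 ms polling step and the one-second grace period are my own assumptions, not something from the original program. "cat" stands in for socat so that the sketch runs without a network peer. I believe socat forwards already-read data and exits shortly after EOF on one side, but a listener started with the fork option may keep running, which is why the fallback is still there.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import System.Exit (ExitCode (..))
import System.IO
import System.Process

-- | Hypothetical helper (not part of the original program): close the
-- child's stdin so it sees EOF, poll for roughly 'graceMicros'
-- microseconds for it to exit on its own, and only then fall back to
-- 'terminateProcess'.
stopGracefully :: Int -> Handle -> ProcessHandle -> IO ExitCode
stopGracefully graceMicros childIn p = do
  hClose childIn
  go graceMicros
  where
    step = 10000 -- poll every 10 ms (arbitrary choice)
    go remaining = do
      done <- getProcessExitCode p
      case done of
        Just code -> return code
        Nothing
          | remaining <= 0 -> terminateProcess p >> waitForProcess p
          | otherwise      -> threadDelay step >> go (remaining - step)

main :: IO ()
main = do
  -- "cat" is a stand-in for socat here (assumption for the demo).
  (Just i, Just o, Nothing, p) <-
    createProcess (proc "cat" []) {std_in = CreatePipe, std_out = CreatePipe}
  hSetBuffering i NoBuffering
  hPutChar i 'x'
  c <- hGetChar o
  -- Give the child up to one second to finish before terminating it.
  code <- stopGracefully 1000000 i p
  print (c, code)
```

In the program from the question, this would mean replacing the final terminateProcess p with something like stopGracefully 1000000 i p. It also fits the observation that adding putStrLn "hello" before the end seems to "fix" things: any extra delay gives socat₂ more time to forward the char before it is killed.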