Tags: python, git, subprocess, popen

Python's Popen + communicate only returning the first line of stdout


I'm trying to use my command-line git client and Python's I/O redirection in order to automate some common operations on a lot of git repos. (Yes, this is hack-ish. I might go back and use a Python library to do this later, but for now it seems to be working out ok :) )

I'd like to be able to capture the output of calling git. Hiding the output will look nicer, and capturing it will let me log it in case it's useful.

My problem is that I can't get more than the first line of output when I run a 'git clone' command. Weirdly, the same code with 'git status' seems to work just fine.

I'm running Python 2.7 on Windows 7 and I'm using the cmd.exe command interpreter.

My sleuthing so far:

  1. When I call subprocess.call() with "git clone" it runs fine and I see the output on the console (which confirms that git is producing output, even though I'm not capturing it). This code:

    import os
    import subprocess

    dir = "E:\\Work\\etc\\etc"
    os.chdir(dir)
    git_cmd = "git clone [email protected]:Mike_VonP/bit142_assign_2.git"

    print "SUBPROCESS.CALL" + "="*20
    ret = subprocess.call(git_cmd.split(), shell=True)
    

    will produce this output on the console:

    SUBPROCESS.CALL====================
    Cloning into 'bit142_assign_2'...
    remote: Counting objects: 9, done.
    remote: Compressing objects: 100% (4/4), done.
    remote: Total 9 (delta 0), reused 0 (delta 0)
    Receiving objects: 100% (9/9), done.
    Checking connectivity... done.
    
  2. If I do the same thing with Popen directly, I see the same output on the console (which is also not being captured). This code:

    # (the dir = , os.chdir, and git_cmd= lines are still executed here)
    print "SUBPROCESS.POPEN" + "="*20
    p=subprocess.Popen(git_cmd.split(), shell=True)
    p.wait()
    

    will produce this (effectively identical) output:

    SUBPROCESS.POPEN====================
    Cloning into 'bit142_assign_2'...
    remote: Counting objects: 9, done.
    remote: Compressing objects: 100% (4/4), done.
    remote: Total 9 (delta 0), reused 0 (delta 0)
    Receiving objects: 100% (9/9), done.
    Checking connectivity... done.
    

    (Obviously I'm deleting the cloned repo between runs, otherwise I'd get an 'Everything is up to date' message.)

  3. If I use the communicate() method, I expect to get a string that contains all the output shown above. Instead, I only see the line Cloning into 'bit142_assign_2'....
    This code:

    print "SUBPROCESS.POPEN, COMMUNICATE" + "="*20
    p = subprocess.Popen(git_cmd.split(), shell=True,
                         bufsize=1,
                         stderr=subprocess.PIPE,
                         stdout=subprocess.PIPE)
    out, err = p.communicate()  # communicate() already waits for the process to exit
    print "StdOut:\n" + out
    print "StdErr:\n" + err
    

    will produce this output:

    SUBPROCESS.POPEN, COMMUNICATE====================
    StdOut:
    
    StdErr:
    Cloning into 'bit142_assign_2'...
    

    So the redirection itself is working (the git output no longer appears on the console), but I'm only capturing that first line, and it arrives on stderr rather than stdout.

I've tried lots and lots of stuff (calling check_output instead of Popen, using pipes with subprocess.call, using pipes with subprocess.Popen, and probably other stuff I've forgotten about) but nothing works: I only ever capture that first line of output.

Interestingly, the exact same code does work correctly with 'git status'. Once the repo has been cloned, calling git status produces three lines of output (which collectively say 'everything is up to date'), and that third example (the Popen + communicate code) does capture all three lines.

If anyone has any ideas about what I'm doing wrong or any thoughts on anything I could try in order to better diagnose this problem I would greatly appreciate it.


Solution

  • Try adding the --progress option to your git command. By default, git reports progress on stderr only when stderr is attached to a terminal, which is not the case when git is run through the subprocess functions with stderr redirected to a pipe. The --progress option forces git to emit the progress status to stderr anyway.

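    git_cmd = "git clone --progress [email protected]:Mike_VonP/bit142_assign_2.git"

    print "SUBPROCESS.POPEN, COMMUNICATE" + "="*20
    # No shell=True: the command is already split into an argument list.
    p = subprocess.Popen(git_cmd.split(), stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() reads both pipes to EOF and waits for git to exit,
    # so a separate p.wait() is not needed.
    out, err = p.communicate()
    print "StdOut:\n" + out
    print "StdErr:\n" + err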
    

    N.B. I am unable to test this on Windows, but it is effective on Linux.

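    Also, shell=True should not be necessary here, since the command is already split into an argument list. It can also be a security problem if any part of the command ever comes from untrusted input, so it's best avoided.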
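
To see why the redirected run behaves differently from the console run, here is a minimal sketch that does not involve git at all. It starts a child Python process with stderr connected to a pipe and has the child report whether its stderr looks like a terminal; git makes a comparable check before deciding whether to print progress. The names CHILD_CODE and child_stderr_is_tty_when_piped are made up for this illustration, and the snippet is written to run on both Python 2.7 and Python 3.

```python
import subprocess
import sys

# Hypothetical child program for this demo: it writes "True" or "False"
# to stdout depending on whether its own stderr is attached to a terminal.
CHILD_CODE = "import sys; sys.stdout.write(str(sys.stderr.isatty()))"


def child_stderr_is_tty_when_piped():
    """Run the child with stderr redirected to a pipe and return what it saw."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],  # argument list, so no shell=True
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, _ = proc.communicate()  # reads both pipes and waits for the child
    return out.decode("ascii").strip()


result = child_stderr_is_tty_when_piped()
print("child saw stderr as a terminal: " + result)
```

With stderr=subprocess.PIPE the child reports False, which is the situation git is in when it is launched with pipes and decides to skip its progress lines unless --progress is given.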