git, merge, workflow, integration, upgrade

Is creating git diff patches, then manually editing them before applying, a valid workflow?


OK, there are two projects, A and B. The project I'm working on, B, is downstream of A. That is to say, B has added different features and grown in a different direction from A, but they share a common history.

A little while ago B was maintained with periodic updates with each major release of A. But this is no longer the case.

While tracking down a memory leak in B, I discovered it had been fixed in A over a year ago, and that a whole lot of changes were then made to A to enhance performance.

It was decided that all these changes should be integrated into project B.

So this is what I'm now doing. Obviously there are quite a few difficulties which make the task unmanageable. But none of them really relates to my question.

I started off creating diffs component by component to see what had changed and decide what could be included without disrupting features in B that didn't exist in A.

I created a new branch for each component (to make it more manageable, and to identify what held references to changed code outside a specific component), originally thinking I could do a merge. I seem to remember using merge before, and it creates conflicts rather than overwriting things; but for whatever reason, when I tried to merge a component from A into B, most of the code unique to B was lost.

What could work, though I'm hesitant to spend time on a technique with possible pitfalls I'm unaware of, is to just edit the patches created from the diff.

Then apply the edited patch.

Because in the patch I can see both sides of the code side by side, I can remove bits (-) where I don't want code removed and worry about the conflicts on a case by case basis.

Has anyone tried this? Or does anyone know of a more mainstream technique for handling this sort of problem?

The other issue is: how can one alter a patch without git treating it as corrupt?


Solution

  • Instead of editing git patches, I recommend cherry-picking the interesting commits, and modifying the applied diff in the working directory:

    git cherry-pick --no-commit upstream/commit1
    # edit the applied changes in the working directory, then:
    git commit
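
A minimal sketch of the cherry-pick-then-edit step, run in a throwaway repository. All file, branch and tag names below (engine.txt, tuning.txt, upstream, downstream, base) are hypothetical and only stand in for projects A and B; the upstream commit is assumed to bundle a fix that B wants with a change that B does not.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name in here is made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# History shared by A (upstream) and B (downstream).
printf 'alloc\nwork\n' > engine.txt
git add engine.txt
git commit -q -m "common base"
git tag base

# Upstream: one commit fixes the leak and also adds something B does not want.
git checkout -q -b upstream base
printf 'alloc\nwork\nfree\n' > engine.txt
printf 'upstream-only setting\n' > tuning.txt
git add engine.txt tuning.txt
git commit -q -m "fix leak, add tuning"

# Downstream: B has grown its own feature.
git checkout -q -b downstream base
printf 'feature B\n' > feature_b.txt
git add feature_b.txt
git commit -q -m "add feature B"

# Apply the upstream commit to the index and working tree, without committing.
git cherry-pick --no-commit upstream

# Edit the applied change before committing: keep the fix, drop the rest.
git rm -q -f tuning.txt
git commit -q -m "fix leak (picked from upstream, tuning dropped)"

git log --oneline
```

The point of the sketch is that the unwanted part is removed in the working tree and index, where git tracks it normally, rather than by hand-editing hunk headers in a patch file.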
    

    If you have a lot of potentially interesting commits, you can try interactive rebase.

    git checkout -b upstream-new upstream/master
    git rebase -i master
    

    This way you can review and cherry-pick a lot of commits easily, for example choosing only the commits that affect a single subsystem. You can help the selection process by running git log --oneline -- subsystem1/ >/tmp/subsystem1_commits.txt, which creates a file with a syntax similar to the one accepted by git rebase -i. Note that each line still needs a pick prefix, and that git log lists the newest commits first while the rebase list runs oldest first.
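
One way to produce such a list directly in the shape the rebase list uses is sketched below, assuming a throwaway repository with hypothetical directories subsystem1/ and subsystem2/ and a hypothetical base tag. It uses git log with --reverse and a custom --format instead of --oneline, which is a small variation on the command above rather than the exact command from the answer.

```shell
#!/bin/sh
set -eu

# Throwaway repository; directory, branch and tag names are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

mkdir subsystem1 subsystem2
echo base > subsystem1/a.txt
echo base > subsystem2/b.txt
git add .
git commit -q -m "common base"
git tag base

# Three upstream commits, two of which touch subsystem1/.
git checkout -q -b upstream-new base
echo one >> subsystem1/a.txt
git commit -q -am "subsystem1: first change"
echo two >> subsystem2/b.txt
git commit -q -am "subsystem2: unrelated change"
echo three >> subsystem1/a.txt
git commit -q -am "subsystem1: second change"

# Commits touching subsystem1/ only, oldest first, one "pick <hash> <subject>"
# line per commit.
todo=$(mktemp)
git log --reverse --format='pick %h %s' base..upstream-new -- subsystem1/ > "$todo"
cat "$todo"
```

The resulting lines can be pasted into the editor that git rebase -i opens, replacing the commits you do not want to carry over.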

    About your workflow: I don't think there is an easy solution to your problem. If the upstream repository and your branch keep diverging for a long time, you lose git's main advantage: easy branching and merging. What you have left is an almost-manual patching system, which is always a pain. But if you can update from the upstream repo once or twice a year, it may be worth doing the following:

    • start from version upstream/master, apply your commits to it (=> your master-1 branch)
    • cherry-pick hotfixes from upstream/master when needed
    • when you have time for updating to the upstream branch, rebase your master-1 branch onto upstream/master, but without the already cherry-picked hotfixes (you can do that with git rebase -i). Let's call this branch master-2.

    This way you will have the following history:

    A-A-A-A-A- (upstream/master at the beginning)
              \
               -B-B-A-B (your master-1 branch, with a cherry-pick)
              \
               -A-A-A-A-A-A-A-A -B-B-B (your master-2 branch, after the rebase)
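
The master-1 to master-2 step can be sketched as follows, again in a throwaway repository with hypothetical file names and a hypothetical hotfix tag. The sketch uses a plain git rebase rather than git rebase -i: when the cherry-picked hotfix is textually identical to the upstream commit, git rebase omits it on its own. If the pick was modified when it was applied, dropping it by hand in git rebase -i, as described above, is still needed.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names below are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'alloc\nwork\n' > engine.txt
git add engine.txt
git commit -q -m "common base"
git tag base

# Upstream (A) moves on: a hotfix, then further work.
git checkout -q -b upstream base
printf 'alloc\nwork\nfree\n' > engine.txt
git commit -q -am "hotfix: free memory"
git tag hotfix
echo fast > perf.txt
git add perf.txt
git commit -q -m "performance work"

# master-1 (B): own commits plus the cherry-picked hotfix.
git checkout -q -b master-1 base
echo b1 > feature_b.txt
git add feature_b.txt
git commit -q -m "B: feature"
git cherry-pick hotfix > /dev/null
echo b2 > feature_b2.txt
git add feature_b2.txt
git commit -q -m "B: second feature"

# master-2: B's commits replayed on top of the current upstream.
# The cherry-picked hotfix is already upstream, so rebase leaves it out.
git checkout -q -b master-2 master-1
git rebase -q upstream

git log --oneline master-2
```

master-1 is left untouched by this, so the old line of development stays available until master-2 has been verified.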