git github version-control repository multi-user

Shared GitHub co-development (only 2 users) creates unresolvable push/pull conflicts


We are trying to use GitHub, but we seem to be using it spectacularly wrong. I created a repository with raw data files, source code, processed data files, and results files (such as png, html, and csv files). As long as I was the only person using it, all was well. However, I've granted access to a collaborator, and now she and I are completely unable to pull from or push to the repository. It seems that the creation of processed data files is creating conflicts that make pull/push impossible for both of us. Indeed, we have to delete all new work in order to fetch, which makes GitHub impractical for its intended purpose: co-development of code (and results, figures, etc.).

We are at an impasse. The current workarounds are all supremely suboptimal: a) to treat GitHub as a single-user tool, or b) to use GitHub for passing files around but keep it out of active development, so that it becomes a complicated Dropbox.

I think the best strategy is to delete all but the barest-of-bones source files and maybe the original untouchable raw data source file. Less is more. (Nothing is best?) But even deleting files from an active repository is apparently not supported. That is, I can add files to a .gitignore. I can delete them from my local copy. But I have to pull before I push, and they all show up again. Rebasing isn't the answer.

Is there some way to "push" a file-delete event? Or delete all but 3 or 4 files from the github.com account? Or should I just delete the whole repository and start over? Or should I go back to svn, which had no problem layering png files, etc., and only declared a conflict when there was one in a source file?

Is there some way to teach GitHub to merge only those files that can sensibly be merged, like files with suffixes .R, .h, .c, .cpp, .py, .java, .html, etc., and simply layer over the other files with appropriate version number increments, as in svn?

Efforts to find answers to these questions in documentation and on-line only perpetuate the frustration.

Thank you for your advice.


Solution

  • Ideally, Git (and repositories on GitHub) should be used for source code, not for generated files (your "results files").
    The "original untouchable raw data source file" might be versioned as well, as long as it is immutable.

    If you need to keep generated files (which are likely to change each time you commit new source code, and will therefore generate conflicts), you might consider, in order to keep a record of those files:

    • making an archive (zip, tar, ...) of all those non-source files
    • attaching it to a "release" associated with your commit (a GitHub release is tied to a tag, which points at a commit)

    That seems cumbersome to do for every commit, though.
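
The archiving step can be scripted so that it is less cumbersome. Below is a minimal sketch, not a definitive implementation. It assumes that "generated" files can be recognised by suffix (the `GENERATED_SUFFIXES` set is an assumption based on the png, html, and csv results files mentioned in the question). The function name `archive_generated_files` is hypothetical. It only builds the zip; attaching the archive to a release is still done separately, for example through the repository's releases page.

```python
import zipfile
from pathlib import Path

# Assumption: files with these suffixes are generated results, not source.
# Adjust the set to match your own project.
GENERATED_SUFFIXES = {".png", ".html", ".csv"}


def archive_generated_files(root, archive_path, suffixes=GENERATED_SUFFIXES):
    """Zip every file under `root` whose suffix is in `suffixes`.

    Returns the list of archived paths, relative to `root`.
    """
    root = Path(root)
    archive_path = Path(archive_path)
    archived = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories and anything inside Git's own metadata folder.
            if not path.is_file() or ".git" in path.relative_to(root).parts:
                continue
            # Never try to add the archive to itself.
            if path.resolve() == archive_path.resolve():
                continue
            if path.suffix.lower() in suffixes:
                rel = path.relative_to(root).as_posix()
                zf.write(path, arcname=rel)
                archived.append(rel)
    return archived


if __name__ == "__main__":
    # Hypothetical usage: archive the results found in the current directory.
    for name in archive_generated_files(".", "results.zip"):
        print("archived", name)
```

Source files stay in the repository as usual; only the zip travels with the release, so the generated files never take part in a merge.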