Is there a way to duplicate the following git command with LibGit2Sharp?
git --no-pager log --stat=999999 -l 7630 --first-parent "--pretty=commit %H%nDate: %ad" --date=local --since-as-filter=61.months.ago
I see the git log examples in the wiki and the tests, but I have not found a way to get the file stats.
I'm using the output from this to 1) find the last time each file was touched and 2) get the number of lines added and deleted in the last 30 days. I've got code that parses the output of the above command, but I can adjust that as long as I have a way to retrieve all the commits from the last 5 years on the current branch, along with the full hash, author date, and number of adds and deletes for each file in the commit, including path changes and file renames. To save processing time, I'd rather not get all the commit messages.
My ultimate goal is to run this process on a serverless host where it would be difficult to install and run git from the command line.
After digging deeper into the tests, I've figured this out:
var commits = repo.Commits.QueryBy(new CommitFilter
{
    FirstParentOnly = true,
    SortBy = CommitSortStrategies.Time
}).Where(c => c.Author.When > DateTimeOffset.Now.AddMonths(-61));

foreach (var commit in commits)
{
    Console.WriteLine($"{commit.Author.When}\t{commit.Sha}");

    // A root commit has no parent; ?. passes a null old tree instead of throwing.
    using var patch = repo.Diff.Compare<Patch>(commit.Parents.FirstOrDefault()?.Tree, commit.Tree,
        compareOptions: new CompareOptions
        {
            Similarity = SimilarityOptions.Copies,
            IncludeUnmodified = false
        });

    Console.WriteLine($"\tAdded: {patch.LinesAdded}\tDeleted: {patch.LinesDeleted}\tFiles:");
    foreach (var patchEntryChanges in patch)
    {
        Console.WriteLine($"\t\t{patchEntryChanges.Path}\tAdded: {patchEntryChanges.LinesAdded}\tDeleted: {patchEntryChanges.LinesDeleted}");
    }
}
This is working, but it's a bit slow and a memory hog, so any additional insights would be appreciated.
UPDATE: It turns out we have some very large commits, so the code above is unusable because of the content in the patches. Memory grows to about 16GB and then runs out. I tried changing my code so I could dispose of the patches as soon as possible, and even added a GC.Collect, but it still uses too much memory.
foreach (var commit in commits)
{
    IEnumerable<PreFileStats> preFileStats;
    // ?. avoids an exception on a root commit, which has no parent.
    using (var patch = repo.Diff.Compare<Patch>(commit.Parents.FirstOrDefault()?.Tree,
        commit.Tree,
        CompareOptions))
    {
        preFileStats = patch
            .Where(c => !c.IsBinaryComparison)
            .Select(p => new PreFileStats
            {
                LastCommit = commit.Sha,
                LastAuthorDate = commit.Author.When,
                FullPath = p.Path,
                Added = p.LinesAdded,
                Deleted = p.LinesDeleted
            })
            // Materialize now; a lazy query would be enumerated after the patch is disposed.
            .ToList();
    }
    GC.Collect();
    ...more processing...
}
I'm trying to come up with a way to incrementally process the patches now.
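One direction that might help, sketched here under assumptions rather than tested against a large repository: request `PatchStats` instead of `Patch`, so that only line counts come back and not the patch text. `PatchStats` entries do not appear to carry a path themselves, so the sketch pairs them with a `TreeChanges` comparison and looks up each path through the `PatchStats` indexer. `CommitStatsSketch` and `Print` are hypothetical names, and the assumption that the indexer is keyed by the new path should be checked against the LibGit2Sharp version in use.

```csharp
using System;
using System.Linq;
using LibGit2Sharp;

// Hypothetical helper for illustration; not part of LibGit2Sharp.
static class CommitStatsSketch
{
    public static void Print(Repository repo, DateTimeOffset since)
    {
        var filter = new CommitFilter
        {
            FirstParentOnly = true,
            SortBy = CommitSortStrategies.Time
        };

        foreach (var commit in repo.Commits.QueryBy(filter).Where(c => c.Author.When > since))
        {
            // A root commit has no parent, so the old tree may be null.
            var parentTree = commit.Parents.FirstOrDefault()?.Tree;

            // TreeChanges: paths and statuses only, no file content.
            using var changes = repo.Diff.Compare<TreeChanges>(parentTree, commit.Tree);
            // PatchStats: per-file added/deleted counts without the patch text.
            using var stats = repo.Diff.Compare<PatchStats>(parentTree, commit.Tree);

            Console.WriteLine($"{commit.Author.When}\t{commit.Sha}");
            foreach (var change in changes)
            {
                // Assumption: the indexer is keyed by the new path of the entry.
                var fileStats = stats[change.Path];
                Console.WriteLine($"\t{change.Status}\t{change.OldPath} -> {change.Path}" +
                                  $"\tAdded: {fileStats?.LinesAdded}\tDeleted: {fileStats?.LinesDeleted}");
            }
        }
    }
}
```

It could be called as `CommitStatsSketch.Print(repo, DateTimeOffset.Now.AddMonths(-61))`. Both comparisons use default options here; if rename or copy detection is needed, the same `CompareOptions` would have to be passed to both calls so the paths line up.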