Posted on 2017-05-09
How many lines did a certain user actually contribute to a git repository? How many lines did they delete? This post shows a way to answer these questions with the help of git and the Unix shell.
Here is the full command; the impatient reader can skip the explanation below:
    git log --author="Linus Torvalds" --pretty=format: --shortstat \
      | sed -e 's/\([0-9]*\) insertion.*/\n\1/;s/.*\n//' -e t -e d \
      | tr '\n' '+' | sed 's/.$/\n/' | bc
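The approach follows the filter-map-reduce pattern and thus consists of three main steps: git log --author filters the history down to the commits of one author and prints a --shortstat summary line for each; the first sed maps every summary line to its insertion count and drops all other lines; tr, the second sed, and bc reduce the counts to a single number by joining them with + and evaluating the resulting sum.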
With very small changes to the map step, the command above can be adapted to count deleted instead of inserted lines. However, it does not account for the fact that lines added by the user might later have been deleted or overwritten by another user.
If one is not interested in historical commit data but rather wants to know how many lines in a given revision of the repository were committed by a certain user, the command below can be used. Note that it might take a long time on large repositories.
    find . -type f -exec git blame {} \; 2>/dev/null \
      | grep '^[a-z0-9]* (Linus Torvalds' \
      | wc -l
© 2018 Johannes Tax (johannes@johannes.tax)