Most of the work is done by the 2to3 script. Popen needs
'universal_newlines=True' to be passed; otherwise it returns a byte
stream in python3. A __future__ import is added for backwards
compatibility. Lightly tested with python2.7, python3.{6,7,8}.
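A minimal sketch of the Popen change, assuming a helper named `run` (a hypothetical name, not necessarily the function gitstats uses). With 'universal_newlines=True' the pipe is opened in text mode, so the output comes back as str on python3 instead of bytes:

```python
import subprocess
import sys

def run(cmd):
    """Run cmd (a list of arguments) and return its stdout as text."""
    # universal_newlines=True opens the pipe in text mode, so communicate()
    # returns str on Python 3 rather than bytes.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    out, _ = p.communicate()
    return out.rstrip('\n')

# Usage: run a child Python interpreter so the example needs no other tools.
result = run([sys.executable, '-c', 'print("hello")'])
```

Without the flag, the same call would hand back a bytes object on python3, and any later string handling on the result would need an explicit decode.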
On systems like Arch Linux, `python` defaults to `python3`, which
prevents gitstats from starting. This fixes it.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
[hoxu@users.sf.net: Debian Jessie and CentOS 6.5 have /usr/bin/python2,
but OS X Yosemite only has /usr/bin/python2.7. There does not seem to be
a portable way to refer to python 2.x, so unfortunately on some
platforms the shebang needs to be modified manually]
When generating HTML output with a custom stylesheet specified using
-c style='mystyle.css', the specified CSS file was not copied to
the target directory.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Split limit 5 was off by one, which resulted in incorrect
processing of file paths with spaces embedded in them.
New split limit of 4 gets the required parts, because it
specifies the allowed number of splits, not elements:
    mode type hash size name
        1    2    3    4
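The difference can be seen with Python's str.split on a made-up sample line in the layout above (the hash and path are invented for illustration):

```python
# Sample line in the "mode type hash size<TAB>name" layout; values are made up.
line = '100644 blob 0123abc 42\tdocs/my file.txt'

# maxsplit=4 allows four splits, giving five elements; the name stays whole.
parts_ok = line.split(None, 4)

# maxsplit=5 allows one split too many and cuts the name at its first space.
parts_bad = line.split(None, 5)
```

Here parts_ok ends with the complete name 'docs/my file.txt', while parts_bad has six elements with the name broken into 'docs/my' and 'file.txt'.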
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Implement a way to limit the statistics to commits after a start date.
This is really useful when computing statistics over a set of
repositories, where some repositories are much older than others.
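One way such a limit could be passed on to git is through the --since option of `git log`. The sketch below is an assumption about the approach, and the function name `log_range_args` is hypothetical:

```python
def log_range_args(start_date, commit_range='HEAD'):
    """Build extra arguments for `git log`, limited to commits after start_date."""
    args = []
    if start_date:
        # --since is git's own option for limiting output by commit date.
        args.append('--since=' + start_date)
    args.append(commit_range)
    return args

# Usage: an empty start date leaves the range unrestricted.
limited = log_range_args('2015-01-01')
unlimited = log_range_args('')
```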
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
This fixes a memory / resource leak that manifests when computing
stats over big sets of repositories. It was eating more than 8G of
memory for ~15 git repositories.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Remove backticks from author names passed to gnuplot.
Without this, author names containing `touch /tmp/vulnerable` would cause said
file to appear after generating statistics for the given repository.
This is not an optimal solution. Instead of blacklisting characters we should
either whitelist some, or find a safe escape mechanism for gnuplot.
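A minimal sketch of the blacklist approach described above, assuming a helper named `strip_backticks` (hypothetical). It only removes backticks and is not a general escaping mechanism for gnuplot:

```python
def strip_backticks(name):
    """Remove backticks so gnuplot cannot treat part of the name as a command."""
    return name.replace('`', '')

# Usage: the command text remains, but it is no longer wrapped in backticks.
safe = strip_backticks('Bob `touch /tmp/vulnerable`')
```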
* author.txt was renamed to AUTHOR
* use git shortlog instead of git-shortlog
because the latter is not necessarily in PATH
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Problem: gitstats reads every commit and every file in the repository in one
thread during initial statistics generation (i.e. no cache available). This can
take a long time for huge repositories (100 000+ files).
Solution: Execute all read commands in 24 threads instead of one.
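The idea can be sketched with a worker pool. This is an illustration only: the commit message does not say which mechanism is used, so concurrent.futures (python3) and the names `read_all` and `read_one` are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def read_all(items, read_one, workers=24):
    """Apply read_one to every item using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with items.
        return list(pool.map(read_one, items))

# Usage: len stands in for a function that would run one git read command.
sizes = read_all(['a', 'bb', 'ccc'], len)
```

Threads suit this kind of work when each task mostly waits on an external git process rather than computing in Python.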
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
RHBZ#962168. In response to a user request, added support for invoking with the
-h or --help argument to print usage information. Previously usage was only
printed if the user provided fewer than two arguments AND did not provide any
invalid arguments. Because they were unhandled, the commonly used -h and --help
arguments counted as invalid.
https://bugzilla.redhat.com/show_bug.cgi?id=962168
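A minimal sketch of recognising both spellings with the standard getopt module. The function name `wants_help` is hypothetical, and the 'c:' entry is assumed here only because -c is an existing option that takes a value:

```python
import getopt

def wants_help(argv):
    """Return True if -h or --help is among the options in argv."""
    # 'h' and 'help' take no value; 'c:' declares -c as taking one.
    opts, _args = getopt.getopt(argv, 'hc:', ['help'])
    return any(opt in ('-h', '--help') for opt, _value in opts)

# Usage: argv is the argument list without the program name.
short_form = wants_help(['-h'])
long_form = wants_help(['--help'])
```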
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Make general stats total LOC show the total aggregate from all projects,
instead of the total LOC from the last project.
[hoxu@users.sf.net: rewrote the commit message]
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Avoid corrupt cache by writing it to a temporary file first, and then
overwriting the original one. Should also fix other exceptional cases.
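The write-then-overwrite pattern can be sketched as follows. This is an illustration under assumptions: `save_cache` is a hypothetical name, and os.replace requires python3.3 or later.

```python
import os
import tempfile

def save_cache(path, data):
    """Write data (bytes) to path via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        # os.replace overwrites the destination; it is atomic on POSIX.
        os.replace(tmp_path, path)
    except Exception:
        # Leave the old cache untouched and clean up the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies while writing, only the temporary file is incomplete and the previous cache file is still intact.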
Thanks-to: Alexander Strasser <eclipse7@gmx.net>
There's a new config field, 'merge_authors', which is a dictionary of
source name to target name. Whenever an author name matches a source
name it will be treated as if it was the target name instead.
Use this if authors have committed under multiple names, to squash their
statistics down to a single author. You can also use it to rename an
author for the purposes of the output.
Additionally, the -c option has been extended so that for a dictionary
option you specify -c field=key,value. The key,value pair is then ADDED
to the dictionary.
Putting it all together in an example:
./gitstats -c merge_authors=bob,Bob\ Jones \
-c merge_authors=bob2,Bob\ Jones \
-c merge_authors=erica,Erica\ Smith ....
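How such options might be applied can be sketched as below. The helper names `apply_option` and `canonical_author` are hypothetical, and the small `conf` dictionary is a stand-in for the real configuration:

```python
conf = {'merge_authors': {}, 'style': 'gitstats.css'}

def apply_option(conf, option):
    """Apply one -c argument of the form field=value or field=key,value."""
    field, value = option.split('=', 1)
    if isinstance(conf[field], dict):
        # Dictionary option: the key,value pair is added to the dictionary.
        key, target = value.split(',', 1)
        conf[field][key] = target
    else:
        conf[field] = value

def canonical_author(conf, name):
    """Map an author name to its merge target, if one is configured."""
    return conf['merge_authors'].get(name, name)

# Usage: two source names are squashed into one target author.
apply_option(conf, 'merge_authors=bob,Bob Jones')
apply_option(conf, 'merge_authors=bob2,Bob Jones')
```

Names without an entry in merge_authors pass through unchanged.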
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
If the VERSION variable was not filled in, gitstats would run
'git rev-parse' on the repository being examined instead of the
gitstats repository.
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
This should address issue:
Check for gnuplot before running - ID: 3062202
http://sourceforge.net/tracker/?func=detail&aid=3062202&group_id=203965&atid=987711
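A check of this kind can be sketched with shutil.which (python3.3 or later). The function name `have_program` is hypothetical, and the actual check may work differently, for example by running the program and inspecting the result:

```python
import shutil

def have_program(name):
    """Return True if an executable called name can be found on PATH."""
    return shutil.which(name) is not None

# Usage: a name that should not exist on any system.
missing = have_program('no-such-program-for-this-example')
```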
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
If a custom version of gnuplot was selected by setting the GNUPLOT
environment variable, the default gnuplot was still called to fill
in the gnuplot version number.
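The point is that plotting and the version query should resolve the command the same way. A minimal sketch, assuming a helper named `gnuplot_command` (hypothetical):

```python
import os

def gnuplot_command(environ=None):
    """Return the gnuplot executable to use, honouring $GNUPLOT if set."""
    environ = os.environ if environ is None else environ
    return environ.get('GNUPLOT', 'gnuplot')

# Usage: pass a plain dict to see both cases without touching the real environment.
default_cmd = gnuplot_command({})
custom_cmd = gnuplot_command({'GNUPLOT': '/opt/gnuplot/bin/gnuplot'})
```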
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
When counting authors, do not fall back to 0 if the git call fails. Otherwise,
zero authors causes a ZeroDivisionError later on. This reverts an old change
for msysgit on win32, introduced in a01045f2480a21f28dd90e278b4f560020a97b9e.
It should no longer be necessary.
Git versions with commit 7f814632f5d4d7af9f4225ece6039dbc44e03079 print the
stat summary output slightly differently. There were two changes that affect
GitStats:
a) Singular forms of files/insertions/deletions may be used
b) The number of counts is now variable, ranging from 1 to 3:
   1: only files changed if file count is 0
   2: if either insertions or deletions are 0 (not if both are)
   3: where files, insertions and deletions are >0
      or both insertions and deletions are 0
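One way to cope with both changes is to match each count by its label instead of by position. The sketch below is an illustration, not the parsing code from this commit; `parse_stat_summary` is a hypothetical name and the sample lines are written by hand in the summary format.

```python
import re

def parse_stat_summary(line):
    """Return (files, insertions, deletions); counts that are absent become 0."""
    # The optional "s" covers both singular and plural forms.
    files = re.search(r'(\d+) files? changed', line)
    inserted = re.search(r'(\d+) insertions?\(\+\)', line)
    deleted = re.search(r'(\d+) deletions?\(-\)', line)
    return tuple(int(m.group(1)) if m else 0 for m in (files, inserted, deleted))

# Usage: a summary with two counts, in singular form.
counts = parse_stat_summary(' 1 file changed, 1 insertion(+)')
```

Because each number is tied to its label, a line with only deletions is not mistaken for one with only insertions.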
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Make the units on par with active commit days. Previously a new project
committed to yesterday at 23:00 and today at 1:00 would have an age of 1 day
but 2 active days.
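The day-based unit can be sketched as below, assuming Unix timestamps and UTC day boundaries; the name `age_in_days` is hypothetical and the real code may handle time zones differently.

```python
SECONDS_PER_DAY = 86400

def age_in_days(first_stamp, last_stamp):
    """Count the calendar days (UTC) spanned by two Unix timestamps, inclusive."""
    return (last_stamp // SECONDS_PER_DAY) - (first_stamp // SECONDS_PER_DAY) + 1

# Usage: commits at 23:00 on one day and 01:00 on the next span two days.
midnight = 20000 * SECONDS_PER_DAY
age = age_in_days(midnight - 3600, midnight + 3600)
```

Measured this way, the age can never be smaller than the number of active days.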
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>