Fixed bare except clause in cache loading that could hide critical errors
Added input validation to getpipeoutput() function to detect potential command injection attempts
Improved file handling with proper context managers and error handling
2. Error Handling Patterns ✅
Replaced bare except with specific exception types (zlib.error, pickle.PickleError, EOFError, etc.)
Added proper error messages with context information
Fixed division by zero in execution time calculation
Added main-level exception handling with debug mode support
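The exception-handling pattern listed above could look roughly like the following. This is a minimal sketch, assuming the cache is a zlib-compressed pickle; the name `load_cache` and the choice to return an empty dict on failure are illustrative assumptions, not the project's actual code.

```python
import pickle
import sys
import zlib


def load_cache(path):
    """Return the cached dict, or an empty dict if the cache is unusable."""
    try:
        # The context manager closes the file even if reading fails.
        with open(path, "rb") as f:
            return pickle.loads(zlib.decompress(f.read()))
    except (OSError, zlib.error, pickle.PickleError, EOFError) as e:
        # Only errors that point at a missing or damaged cache are caught;
        # anything else (e.g. KeyboardInterrupt) still propagates.
        print("Cache %s unusable (%s), regenerating" % (path, e), file=sys.stderr)
        return {}
```

Listing the exception types explicitly keeps unrelated failures visible instead of silently discarding them.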
3. Undefined Variables/Functions ✅
Verified all function calls are properly defined
Fixed potential runtime errors in mathematical operations
Added safe division operations where needed
4. JavaScript Code Style Issues ✅
Fixed undeclared variables in sortable.js by adding proper var declarations
Fixed variable shadowing in compare_numeric() function
Improved loop variable declarations in sortables_init() and ts_resortTable()
Added missing variable declarations for ARROW, dt, mtstr, yr, etc.
5. Performance Issues ✅
Optimized file operations by using context managers
Reviewed and confirmed existing code already handles most performance concerns appropriately
Improved cache file handling to be more robust
6. Configuration and Path Validation ✅
Added comprehensive git repository validation (existence, directory check, .git folder verification)
Added output directory validation with permission checks
Enhanced configuration parsing with type validation and range checks
Added better error messages for invalid configurations
Key Improvements Made
Better Error Messages: All error conditions now provide clear, actionable feedback
Input Validation: Added checks for git repositories, output directories, and configuration values
Robust File Handling: Using context managers and proper exception handling
Security Hardening: Basic command injection detection and safer file operations
Code Quality: Fixed JavaScript variable scope issues and improved maintainability
Add styles for team analysis, performance tables, and highlight patterns
- Introduced grid layout for rankings display.
- Added styles for ranking sections, team performance, working patterns, impact analysis, and collaboration tables.
- Implemented highlighting for high performers and concerning patterns with hover effects.
On systems like Arch Linux, `python` defaults to `python3`, which prevents
gitstats from starting. This fixes it.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
[hoxu@users.sf.net: Debian Jessie and CentOS 6.5 have /usr/bin/python2,
but OS X Yosemite only has /usr/bin/python2.7. There does not seem to be
a portable way to refer to python 2.x, so unfortunately on some
platforms the shebang needs to be modified manually]
When generating HTML output with a custom stylesheet specified using
-c style='mystyle.css', the specified CSS file was not being copied to
the target directory.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Split limit 5 was off by one, which resulted in incorrect
processing of file paths with spaces embedded in them.
New split limit of 4 gets the required parts, because it
specifies the allowed number of splits, not elements:
    mode type hash size name
        1    2    3    4
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
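The off-by-one described in the split-limit fix can be reproduced with Python's own `str.split`, whose second argument is likewise the maximum number of splits rather than the number of elements. A small sketch, assuming a line in the `mode type hash size name` layout; the helper name `parse_tree_line` and the sample line are made up for illustration.

```python
def parse_tree_line(line):
    """Split a 'mode type hash size name' line into its five fields."""
    # Four splits yield five elements; the name keeps its embedded spaces.
    mode, kind, sha, size, name = line.split(None, 4)
    return mode, kind, sha, int(size), name


line = "100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad 12\tdocs/read me.txt"
fields = parse_tree_line(line)

# With a limit of 5, the file name itself is split at its first space.
too_many = line.split(None, 5)
```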
implement a way to limit the statistics to commits after a start date
This is really useful when computing statistics over a set of
repositories, where some repositories are much older than others.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
This fixes a memory / resource leak that manifests when computing
stats over big sets of repositories. It was eating more than 8G of
memory for ~15 git repositories.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Remove backticks from author names passed to gnuplot.
Without this, author names containing `touch /tmp/vulnerable` would cause said
file to appear after generating statistics for the given repository.
This is not an optimal solution. Instead of blacklisting characters we should
either whitelist some, or find a safe escape mechanism for gnuplot.
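The whitelist alternative mentioned above could be sketched as follows. The allowed character set is an assumption chosen for illustration, and whether it is sufficient for every gnuplot context has not been verified; `sanitize_author` is a hypothetical helper name.

```python
import re

# Assumed whitelist: word characters (including non-ASCII letters),
# spaces, and a few punctuation marks. Everything else is dropped.
_NOT_ALLOWED = re.compile(r"[^\w .,@-]")


def sanitize_author(name):
    """Remove every character that is not on the whitelist."""
    return _NOT_ALLOWED.sub("", name)
```

Unlike stripping backticks alone, this also drops characters nobody thought to blacklist, at the cost of altering some legitimate names.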
* author.txt was renamed to AUTHOR
* use git shortlog instead of git-shortlog,
  because the latter is not necessarily in PATH
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Problem: gitstats reads every commit and every file in the repository in one
thread during initial statistics generation (i.e. when no cache is available).
This can take a long time for huge repositories (100 000+ files).
Solution: execute all read commands in 24 threads instead of one.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
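One way to run many read commands concurrently is a thread pool. The sketch below uses `multiprocessing.pool.ThreadPool` from the standard library, which is an assumption and not necessarily what the commit uses; `run_all` and its `run` parameter are hypothetical names.

```python
import subprocess
from multiprocessing.pool import ThreadPool


def _run_command(cmd):
    """Run one command (a list of arguments) and return its output as text."""
    return subprocess.check_output(cmd).decode("utf-8", "replace")


def run_all(commands, workers=24, run=_run_command):
    """Run all commands on a pool of worker threads.

    Results come back in the same order as the input commands.
    """
    with ThreadPool(workers) as pool:
        return pool.map(run, commands)
```

Threads are adequate here because the workers spend their time waiting on child processes rather than executing Python code.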
RHBZ#962168. In response to a user request, added support for invoking with the
-h or --help argument to print usage information. Previously, usage was only
printed if the user provided fewer than two arguments AND did not provide any
invalid arguments; because they were unhandled, the commonly used -h and --help
arguments counted as invalid.
https://bugzilla.redhat.com/show_bug.cgi?id=962168
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
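Handling of -h and --help can be sketched with the standard `getopt` module. That option parsing is done with `getopt`, and the helper name `wants_usage`, are assumptions made for illustration.

```python
import getopt


def wants_usage(argv):
    """Return True if usage should be printed for these arguments."""
    try:
        opts, args = getopt.getopt(argv, "hc:", ["help"])
    except getopt.GetoptError:
        # Unknown options also lead to the usage text.
        return True
    if any(name in ("-h", "--help") for name, _ in opts):
        return True
    # At least one repository path and an output path are required.
    return len(args) < 2
```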
Make general stats total LOC show the total aggregate from all projects,
instead of the total LOC from last project.
[hoxu@users.sf.net: rewrote the commit message]
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Avoid corrupt cache by writing it to a temporary file first, and then
overwriting the original one. Should also fix other exceptional cases.
Thanks-to: Alexander Strasser <eclipse7@gmx.net>
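The write-to-temporary-then-overwrite approach can be sketched as below. The zlib-compressed pickle format and the name `save_cache` are assumptions; `os.replace` performs the final rename in a single step.

```python
import os
import pickle
import tempfile
import zlib


def save_cache(path, data):
    """Write the cache to a temporary file, then move it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename never has to cross a filesystem boundary.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".cache-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(zlib.compress(pickle.dumps(data)))
        os.replace(tmp, path)
    except BaseException:
        # Leave the previous cache untouched and clean up the partial file.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

If the process is interrupted while writing, the previous cache file is still complete because only the temporary file was being modified.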
There's a new config field, 'merge_authors', which is a dictionary of
source name to target name. Whenever an author name matches a source
name it will be treated as if it was the target name instead.
Use this if authors have committed under multiple names, to squash their
statistics down to a single author. You can also use it to rename an
author for the purposes of the output.
Additionally, the -c option has been extended so for a dictionary option
you specify -c field=key,value. The key,value pair is then ADDED to the
dictionary.
Putting it all together in an example:
./gitstats -c merge_authors=bob,Bob\ Jones \
-c merge_authors=bob2,Bob\ Jones \
-c merge_authors=erica,Erica\ Smith ....
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
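The extended -c handling described above might be implemented along these lines. This is a sketch under assumptions: `apply_option` is a hypothetical helper, and the `conf` contents other than `merge_authors` are example values.

```python
def apply_option(conf, option):
    """Apply one '-c field=value' or '-c field=key,value' setting to conf."""
    field, value = option.split("=", 1)
    if field not in conf:
        raise KeyError("no such key %r in config" % field)
    if isinstance(conf[field], dict):
        # Dictionary options take key,value and ADD the pair to the dict.
        source, target = value.split(",", 1)
        conf[field][source] = target
    elif isinstance(conf[field], int):
        conf[field] = int(value)
    else:
        conf[field] = value


conf = {"merge_authors": {}, "max_authors": 20}
for opt in ("merge_authors=bob,Bob Jones",
            "merge_authors=bob2,Bob Jones",
            "merge_authors=erica,Erica Smith"):
    apply_option(conf, opt)

# Look up the canonical name, falling back to the name as committed.
author = conf["merge_authors"].get("bob2", "bob2")
```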
If the VERSION variable was not filled in, gitstats would run
'git rev-parse' on the repository being examined instead of the
gitstats repository.
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
This should address issue:
Check for gnuplot before running - ID: 3062202
http://sourceforge.net/tracker/?func=detail&aid=3062202&group_id=203965&atid=987711
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
If a custom version of gnuplot was selected by setting the GNUPLOT
environment variable, the default gnuplot was still called to fill
in the gnuplot version number.
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>
When counting authors, do not fall back to 0 if the git call fails. Otherwise,
zero authors causes a ZeroDivisionError later on. This reverts an old change
for msysgit on win32, introduced in a01045f2480a21f28dd90e278b4f560020a97b9e.
It should no longer be necessary.
Git versions with commit 7f814632f5d4d7af9f4225ece6039dbc44e03079 print the
stat summary output slightly differently. There were two changes that affect
GitStats:
a) Singular forms of files/insertions/deletions may be used
b) The number of counts is now variable, ranging from 1 to 3:
   1: only files changed, if the file count is 0
   2: if either insertions or deletions are 0 (not if both are)
   3: if files, insertions and deletions are all >0,
      or both insertions and deletions are 0
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
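A parser that tolerates both changes (singular forms and a variable number of counts) can look up each count independently. The sketch below is one possible approach, not necessarily the one taken in this commit; `parse_stat_summary` is a hypothetical name.

```python
import re

# "s?" accepts both the singular and the plural wording.
_FILES = re.compile(r"(\d+) files? changed")
_INSERTIONS = re.compile(r"(\d+) insertions?\(\+\)")
_DELETIONS = re.compile(r"(\d+) deletions?\(-\)")


def parse_stat_summary(line):
    """Return (files, insertions, deletions); absent counts become 0."""
    def count(pattern):
        match = pattern.search(line)
        return int(match.group(1)) if match else 0

    return count(_FILES), count(_INSERTIONS), count(_DELETIONS)
```

Searching for each count by its label avoids any dependence on how many counts a given line contains.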
Make the units on par with active commit days. Previously a new project
committed to yesterday at 23:00 and today at 1:00 would have an age of 1 day
but 2 active days.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
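One way to make the age count calendar days the same way active days do is to compare day indices instead of elapsed seconds. This is a sketch under assumptions: timestamps are Unix seconds, day boundaries are taken in UTC, and `age_in_days` is a hypothetical name.

```python
SECONDS_PER_DAY = 86400


def age_in_days(first_commit_stamp, last_commit_stamp):
    """Number of calendar days touched between first and last commit."""
    first_day = first_commit_stamp // SECONDS_PER_DAY
    last_day = last_commit_stamp // SECONDS_PER_DAY
    # Both the first and the last day count, hence the +1.
    return last_day - first_day + 1


# Committed "yesterday" at 23:00 and "today" at 01:00 (days 10 and 11).
yesterday_23 = 10 * SECONDS_PER_DAY + 23 * 3600
today_01 = 11 * SECONDS_PER_DAY + 1 * 3600
age = age_in_days(yesterday_23, today_01)  # 2, matching the 2 active days
```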
Fix a hard-to-spot bug where the total number of commits for a tag was always
set to the number of commits made by one of the committers, not the cumulative
sum over all committers.
[hoxu@users.sf.net: created a commit from diff and description sent by Thomas]
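The intended behaviour, summing over all committers instead of keeping a single committer's count, can be sketched as follows. The input is assumed to be `git shortlog -s` style output (a count, whitespace, then the author name); `parse_tag_authors` is a hypothetical name.

```python
def parse_tag_authors(shortlog_output):
    """Return (commits per author, total commits) for one tag."""
    authors = {}
    for line in shortlog_output.strip().splitlines():
        count, name = line.strip().split(None, 1)
        authors[name] = int(count)
    # The total is the sum over every author, not the last count seen.
    return authors, sum(authors.values())


sample = "    12\tAlice Example\n     3\tBob Example\n"
per_author, total = parse_tag_authors(sample)
```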