Fixed bare except clause in cache loading that could hide critical errors
Added input validation to getpipeoutput() function to detect potential command injection attempts
Improved file handling with proper context managers and error handling
2. Error Handling Patterns ✅
Replaced bare except with specific exception types (zlib.error, pickle.PickleError, EOFError, etc.)
Added proper error messages with context information
Fixed division by zero in execution time calculation
Added main-level exception handling with debug mode support
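A minimal sketch of the cache-loading pattern described above, assuming the cache is a zlib-compressed pickle. The function name `load_cache` and its "return an empty dict on failure" convention are hypothetical and not taken from the gitstats source:

```python
import pickle
import sys
import zlib


def load_cache(path):
    """Return the cached dict, or an empty dict if the cache is missing or unreadable."""
    try:
        with open(path, 'rb') as f:
            return pickle.loads(zlib.decompress(f.read()))
    except FileNotFoundError:
        # No cache yet: this is the normal first-run case, not an error.
        return {}
    except (zlib.error, pickle.PickleError, EOFError, OSError) as e:
        # Report the failure with context instead of silently swallowing it.
        print('Warning: ignoring unreadable cache %s: %s' % (path, e), file=sys.stderr)
        return {}
```

Anything outside the listed exception types (for example KeyboardInterrupt) still propagates, which is the point of dropping the bare except.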
3. Undefined Variables/Functions ✅
Verified all function calls are properly defined
Fixed potential runtime errors in mathematical operations
Added safe division operations where needed
4. JavaScript Code Style Issues ✅
Fixed undeclared variables in sortable.js by adding proper var declarations
Fixed variable shadowing in compare_numeric() function
Improved loop variable declarations in sortables_init() and ts_resortTable()
Added missing variable declarations for ARROW, dt, mtstr, yr, etc.
5. Performance Issues ✅
Optimized file operations by using context managers
Reviewed and confirmed existing code already handles most performance concerns appropriately
Improved cache file handling to be more robust
6. Configuration and Path Validation ✅
Added comprehensive git repository validation (existence, directory check, .git folder verification)
Added output directory validation with permission checks
Enhanced configuration parsing with type validation and range checks
Added better error messages for invalid configurations
Key Improvements Made
Better Error Messages: All error conditions now provide clear, actionable feedback
Input Validation: Added checks for git repositories, output directories, and configuration values
Robust File Handling: Using context managers and proper exception handling
Security Hardening: Basic command injection detection and safer file operations
Code Quality: Fixed JavaScript variable scope issues and improved maintainability
Add styles for team analysis, performance tables, and highlight patterns
- Introduced grid layout for rankings display.
- Added styles for ranking sections, team performance, working patterns, impact analysis, and collaboration tables.
- Implemented highlighting for high performers and concerning patterns with hover effects.
Remove obsolete documentation files and the GPLv3 license
- Deleted the GPLv3 license document as it is no longer needed.
- Removed the INSTALL file, which contained outdated installation instructions.
- Eliminated the LICENSE file that referenced GPLv2/GPLv3 and the MIT license for sortable.js.
- Cleared the README file that provided information about gitstats, including requirements and contribution guidelines.
- Removed the TODO.txt file that listed various feature requests and improvements.
- Deleted the gitstats.pod file, which contained usage documentation for the gitstats tool.
On systems like Arch Linux, `python` defaults to `python3`, preventing
gitstats from starting. This fixes it.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
[hoxu@users.sf.net: Debian Jessie and CentOS 6.5 have /usr/bin/python2,
but OS X Yosemite only has /usr/bin/python2.7. There does not seem to be
a portable way to refer to python 2.x, so unfortunately on some
platforms the shebang needs to be modified manually]
When generating HTML output with a custom stylesheet specified using
-c style='mystyle.css' the CSS file specified was not being copied to
the target directory.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Split limit 5 was off by one, which resulted in incorrect
processing of file paths with spaces embedded in them.
New split limit of 4 gets the required parts, because it
specifies the allowed number of splits, not elements:
    mode type hash size name
        1    2    3    4
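The difference can be illustrated with a short sketch. The sample line below is made up for illustration (it only mimics the five-field layout above) and is not real gitstats input:

```python
# Hypothetical input line: mode, type, hash, size, then a path containing a space.
line = '100644 blob 0123abc 42 docs/release notes.txt'

# maxsplit=4 permits four splits, yielding five fields; the path keeps its space.
mode, kind, sha, size, name = line.split(None, 4)

# maxsplit=5 permits a fifth split, which cuts the path at its first space.
too_many = line.split(None, 5)
```

With a limit of 4, `name` holds the whole path; with a limit of 5, the path is broken into two separate fields.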
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
implement a way to limit the statistics to commits after a start date
This is really useful when computing statistics over a set of
repositories, where some repositories are much older than others.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
This fixes a memory / resource leak that manifests when computing
stats over big sets of repositories. It was eating more than 8G of
memory for ~15 git repositories.
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Remove backticks from author names passed to gnuplot.
Without this, author names containing `touch /tmp/vulnerable` would cause said
file to appear after generating statistics for the given repository.
This is not an optimal solution. Instead of blacklisting characters we should
either whitelist some, or find a safe escape mechanism for gnuplot.
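A rough sketch of the two approaches mentioned above. Both function names are hypothetical, and the whitelist character set is an assumption chosen for illustration; it has not been checked against gnuplot's quoting rules:

```python
import re


def strip_backticks(name):
    """Blacklist approach from this commit: drop backticks only."""
    return name.replace('`', '')


def whitelist_name(name):
    """Stricter alternative: keep only word characters, spaces, dots and hyphens."""
    return re.sub(r'[^\w .-]', '', name)
```

The blacklist leaves every other special character in place, while the whitelist may also remove characters that appear in legitimate author names, so the allowed set would need tuning.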
* author.txt was renamed to AUTHOR
* use git shortlog instead of git-shortlog
because the latter is not necessarily in PATH
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Problem: gitstats reads every commit and every file in the repository in a
single thread during initial statistics generation (i.e. when no cache is
available). This can take a long time for huge repositories (100 000+ files).
Solution: Execute all read commands in 24 threads instead of one.
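The general idea can be sketched with a thread pool. This is an illustration only: the helper name `run_in_threads` is hypothetical and the actual change may use a different pool implementation:

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(func, items, threads=24):
    """Apply func to every item using a pool of worker threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # The with-block shuts the pool down once all results are collected.
        return list(pool.map(func, items))
```

Threads are a reasonable fit here if most of the time is spent waiting on external git commands rather than running Python code.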
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
RHBZ#962168. In response to a user request, added support for invoking
gitstats with the -h or --help argument to print usage information.
Previously, usage was only printed if the user provided fewer than two
arguments AND did not provide any invalid arguments. Because -h and --help
were unhandled, these commonly used arguments counted as invalid.
https://bugzilla.redhat.com/show_bug.cgi?id=962168
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Make the general stats total LOC show the aggregate from all projects,
instead of the total LOC from the last project.
[hoxu@users.sf.net: rewrote the commit message]
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
Avoid corrupt cache by writing it to a temporary file first, and then
overwriting the original one. Should also fix other exceptional cases.
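A minimal sketch of the write-then-replace pattern, assuming a zlib-compressed pickle as the cache format. The function name `save_cache` and the temporary-file naming are hypothetical:

```python
import os
import pickle
import tempfile
import zlib


def save_cache(path, data):
    """Write the cache to a temporary file, then move it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so both are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(zlib.compress(pickle.dumps(data)))
        # os.replace overwrites the destination in a single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the previous cache untouched and clean up the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is interrupted while writing, only the temporary file is affected and the previous cache stays readable.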
Thanks-to: Alexander Strasser <eclipse7@gmx.net>
There's a new config field, 'merge_authors', which is a dictionary of
source name to target name. Whenever an author name matches a source
name it will be treated as if it was the target name instead.
Use this if authors have committed under multiple names, to squash their
statistics down to a single author. You can also use it to rename an
author for the purposes of the output.
Additionally, the -c option has been extended so that, for a dictionary
option, you specify -c field=key,value. The key,value pair is then ADDED to
the dictionary.
Putting it all together in an example:
./gitstats -c merge_authors=bob,Bob\ Jones \
-c merge_authors=bob2,Bob\ Jones \
-c merge_authors=erica,Erica\ Smith ....
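How such options could be parsed and applied is sketched below. This is an illustration under assumptions, not the actual implementation: the `conf` contents, `apply_option` and `canonical_author` are hypothetical names:

```python
# Illustrative defaults; the real configuration has more fields.
conf = {'merge_authors': {}, 'max_authors': 20}


def apply_option(conf, option):
    """Apply one '-c field=value' argument to conf in place."""
    key, value = option.split('=', 1)
    if key not in conf:
        raise KeyError('no such key "%s" in config' % key)
    if isinstance(conf[key], dict):
        # Dictionary option: 'field=key,value' adds one entry.
        source, target = value.split(',', 1)
        conf[key][source] = target
    elif isinstance(conf[key], int):
        conf[key] = int(value)
    else:
        conf[key] = value


def canonical_author(conf, author):
    """Map an author name through merge_authors, if it has an entry."""
    return conf['merge_authors'].get(author, author)
```

Repeating the option with different keys builds up the dictionary one entry at a time, and names without an entry pass through unchanged.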
Signed-off-by: Heikki Hokkanen <hoxu@users.sf.net>
If the VERSION variable was not filled in, gitstats would run
'git rev-parse' on the repository being examined instead of the
gitstats repository.
Signed-off-by: Sven van Haastregt <svhaastr@liacs.nl>