This repository was archived by the owner on Jun 21, 2020. It is now read-only.

Description
Sometimes it's nice to be able to completely remove files from git history once the current tree no longer contains them. This needs to step across all branches automatically!
Here are a few links:
Basically we want to:
- Look through git history for files that are no longer present in the current tree
- Remove those files from history
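The first step above could be sketched roughly as follows. This is a minimal illustration, not the tool this issue asks for: `list_removed_paths` is a hypothetical helper name, and it compares history only against the `HEAD` tree, so a file that still exists on some other branch tip would also be listed.

```shell
#!/bin/bash
# Sketch: print every path that shows up somewhere in history (any ref)
# but is absent from the current HEAD tree.
# list_removed_paths is a hypothetical name, not a git command.
list_removed_paths() {
    local current
    current=$(mktemp) || return 1

    # Paths present in the HEAD tree, sorted for comm.
    git ls-tree -r --name-only HEAD | LC_ALL=C sort -u > "$current"

    # Paths touched by any commit reachable from any ref; the empty
    # format string prints only file names, separated by blank lines.
    # comm -23 keeps the lines that appear only in the historical list.
    git log --all --pretty=format: --name-only \
        | sed '/^$/d' \
        | LC_ALL=C sort -u \
        | comm -23 - "$current"

    rm -f "$current"
}
```

Run inside a repository, `list_removed_paths` prints one candidate path per line; each of those could then be handed to a history-rewriting step.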
Here is a script from the first link above that returns all the sizes of things:
#!/bin/bash
#set -x
# Shows you the largest objects in your repo's pack file.
# Written for osx.
#
# @see http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
# @author Antony Stubbs
# set the internal field separator to line break, so that we can iterate easily over the verify-pack output
IFS=$'\n';
# list all objects including their size, sort by size, take top 10
objects=$(git verify-pack -v .git/objects/pack/pack-*.idx | grep -v chain | sort -k3nr | head)
echo "All sizes are in kB. The pack column is the size of the object, compressed, inside the pack file."
output="size,pack,SHA,location"
for y in $objects
do
    # verify-pack pads the type column, so split on runs of whitespace with awk
    # (cut -d ' ' picks the wrong field for commit and tag objects)
    # extract the size in bytes
    size=$(( $(echo "$y" | awk '{print $3}') / 1024 ))
    # extract the compressed size in bytes
    compressedSize=$(( $(echo "$y" | awk '{print $4}') / 1024 ))
    # extract the SHA
    sha=$(echo "$y" | awk '{print $1}')
    # find the object's location in the repository tree
    other=$(git rev-list --all --objects | grep "$sha")
    #lineBreak=`echo -e "\n"`
    output="${output}\n${size},${compressedSize},${other}"
done
echo -e "$output" | column -t -s ', '
Basically we want a utility that cleans up (removes) a given path like so:
cleanUp() {
    # $1 is the path to drop from every commit on every branch and tag
    git filter-branch --tag-name-filter cat --index-filter "git rm -r --cached --ignore-unmatch $1" --prune-empty -f -- --all
}
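Rewriting history this way does not shrink the repository by itself: `git filter-branch` keeps backups of the old refs under `refs/original/`, and the reflog still points at the old commits. A hedged sketch of the follow-up, along the lines of the shrinking checklist in the `git filter-branch` documentation, might look like this. `purge_rewritten_leftovers` is a hypothetical name, and the commands are destructive, so this assumes you are working in a throwaway clone.

```shell
#!/bin/bash
# Sketch: drop the leftovers that keep rewritten-away objects alive,
# then garbage-collect so the disk space is actually reclaimed.
# purge_rewritten_leftovers is a hypothetical name, not a git command.
purge_rewritten_leftovers() {
    # Delete the backup refs that filter-branch stores under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ \
        | while read -r ref; do
              git update-ref -d "$ref"
          done

    # Expire every reflog entry so old commits become unreachable.
    git reflog expire --expire=now --all

    # Repack and prune unreachable objects immediately.
    git gc --prune=now
}
```

After something like `cleanUp path/to/old-dir`, calling `purge_rewritten_leftovers` in the same repository should let the removed objects be pruned. Other clones keep the old history until they are re-cloned or rewritten too.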