Features¶
Modes of Operation¶
While tartufo started its life with one primary mode of operation, scanning the history of a git repository, it has grown over time to have a number of additional uses and modes of operation. These are all invoked via different sub-commands of tartufo.
Git Repository History Scan¶
This is the “classic” use case for tartufo: scanning the history of a git repository. There are two ways to invoke this functionality, depending on whether you are scanning a repository which you already have cloned locally, or one on a remote system.
Scanning a Local Repository¶
$ tartufo scan-local-repo /path/to/my/repo
To use docker, mount the local clone to the /git folder in the docker image:
$ docker run --rm -v "/path/to/my/repo:/git" godaddy/tartufo scan-local-repo /git
Note
If you are using podman in place of docker, you will need to add the --privileged flag to the run command, in order to avoid a permission denied error.
Scanning a Remote Repository¶
$ tartufo scan-remote-repo https://github.com/godaddy/tartufo.git
To use docker:
$ docker run --rm godaddy/tartufo scan-remote-repo https://github.com/godaddy/tartufo.git
When used this way, tartufo will clone the repository to a temporary directory, scan the local clone, and then delete it.
Scanning a Folder¶
Operating in this mode, tartufo scans the files in a local folder, rather than operating on git commit history. This is ideal for locating secrets in the latest version of source files, or files not in source control.
$ tartufo scan-folder .
$ docker run --rm -v "/path/to/my/repo:/git" godaddy/tartufo scan-folder /git
Note
If you are using podman in place of docker, you will need to add the --privileged flag to the run command, in order to avoid a permission denied error.
This will scan all files and folders in the specified directory, including .git and any other files that may not be in source control. Perform a git clean or use a fresh clone of the repository before scanning a folder, and add .git to the exclude-paths.
When accessing repositories via SSH, the docker runtime needs to have access to your SSH keys for authorization. To allow this, make sure ssh-agent is running on your host machine and has the key added. You can verify this by running ssh-add -L on your host machine. You then need to point Docker at that running SSH agent.
Using Docker for Linux, that will look something like this:
$ docker run --rm -v "/path/to/my/repo:/git" \
-v $SSH_AUTH_SOCK:/agent -e SSH_AUTH_SOCK=/agent \
godaddy/tartufo scan-local-repo /git
When using Docker Desktop for Mac, use /run/host-services/ssh-auth.sock as both source and target, then point the environment variable SSH_AUTH_SOCK to this same location:
$ docker run --rm -v "/path/to/my/repo:/git" \
-v /run/host-services/ssh-auth.sock:/run/host-services/ssh-auth.sock \
-e SSH_AUTH_SOCK="/run/host-services/ssh-auth.sock" godaddy/tartufo
Pre-commit Hook¶
This mode of operation instructs tartufo to scan staged, uncommitted changes in a local repository. This is the flip-side of the primary mode of operation. Instead of checking for secrets you have already checked in, this helps prevent you from committing new secrets!
When running this sub-command, the caller’s current working directory is assumed to be somewhere within the local clone’s tree and the repository root is determined automatically.
Note
It is always possible, although not recommended, to bypass the pre-commit hook by using git commit --no-verify.
Manual Setup¶
To set up a pre-commit hook for tartufo by hand, you can place the following in a .git/hooks/pre-commit file inside your local repository clone:
Executing tartufo Directly¶
#!/bin/sh
# Redirect output to stderr.
exec 1>&2
# Check for suspicious content.
tartufo --regex --entropy pre-commit
Or, Using Docker¶
#!/bin/sh
# Redirect output to stderr.
exec 1>&2
# Check for suspicious content.
docker run -t --rm -v "$PWD:/git" godaddy/tartufo pre-commit
Git will execute tartufo before actually committing any of your changes. If any problems are detected, they are reported by tartufo, and git aborts the commit process. Only when tartufo returns a success status (indicating no potential secrets were discovered) will git commit the staged changes.
Using the “pre-commit” tool¶
New in version 2.0.0.
If you want a slightly more automated approach which can be more easily shared to ensure a unified setup across all developers’ systems, you can use the wonderful pre-commit tool.
Add a .pre-commit-config.yaml file to your repository. You can use the following example to get you started:
- repo: https://github.com/godaddy/tartufo
  rev: main
  hooks:
  - id: tartufo
Warning
You probably don’t actually want to use the main rev. This is the active development branch for this project, and it cannot be guaranteed to be stable. Your best bet would be to choose the latest version, currently 3.0.0-rc.1.
That’s it! Now your contributors only need to install pre-commit, and then run pre-commit install --install-hooks, and tartufo will automatically be run as a pre-commit hook.
Scan Types¶
tartufo offers multiple types of scans, each of which can be optionally enabled or disabled, while looking through its target for secrets.
Regex Checking¶
tartufo can scan for a pre-built list of known signatures for things such as SSH keys, EC2 credentials, etc. These scans are activated by use of the --regex flag on the command line. They will be reported with an issue type of Regular Expression Match, and the issue detail will be the name of the regular expression which was matched.
Customizing¶
Additional rules can be specified in a JSON file, pointed to on the command line with the --rules argument. The file should be in the following format:
{
    "RSA private key": "-----BEGIN RSA PRIVATE KEY-----"
}
Things like subdomain enumeration, s3 bucket detection, and other useful regexes highly custom to the situation can be added.
If you would like to deactivate the default regex rules, using only your custom rule set, you can use the --no-default-regexes flag.
Feel free to also contribute high signal regexes upstream that you think will benefit the community. Things like Azure keys, Twilio keys, Google Compute keys, are welcome, provided a high signal regex can be constructed.
tartufo’s base rule set can be found in the file data/default_regexes.json.
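To illustrate how a rules file of this shape (a JSON object mapping a rule name to a regular expression) can be applied, here is a minimal Python sketch. It is not tartufo's implementation: the helper names `load_rules` and `find_matches`, and the rule `Example internal token`, are hypothetical and exist only for this example.

```python
import json
import re


def load_rules(rules_json):
    """Compile each pattern in a JSON object of the form {name: regex}."""
    raw = json.loads(rules_json)
    return {name: re.compile(pattern) for name, pattern in raw.items()}


def find_matches(rules, text):
    """Return the names of all rules whose pattern occurs somewhere in text."""
    return [name for name, regex in rules.items() if regex.search(text)]


# Hypothetical custom rule, written in the same format as the rules file.
RULES_JSON = '{"Example internal token": "EXAMPLE-[0-9a-f]{8}"}'

rules = load_rules(RULES_JSON)
matches = find_matches(rules, "token = EXAMPLE-deadbeef")
```

Reporting the rule name rather than the raw pattern mirrors the way the issue detail is described above: the name of the regular expression which was matched.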
High Entropy Checking¶
tartufo calculates the Shannon entropy of the strings in each commit, finding strings which appear to be generated from a stochastic source. In short, it looks for pieces of data which look random, as these are likely to be things such as cryptographic keys. These scans are activated by usage of the --entropy command line flag.
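As a rough illustration of the idea, the following Python sketch computes the Shannon entropy of a string in bits per character. The function name `shannon_entropy` is made up for this example, and this is not tartufo's actual implementation, which may use different character sets, minimum string lengths, and thresholds.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of data, in bits per character."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over the relative frequency p of each distinct character.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


repetitive = shannon_entropy("aaaaaaaa")   # a single repeated character
wordlike = shannon_entropy("password")     # one repeated letter ("s")
randomish = shannon_entropy("Zx9qL2vB")    # eight distinct characters
```

Strings whose characters are spread evenly over many distinct symbols score higher, which is why keys and tokens tend to stand out from ordinary words.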
Scan Limiting (Exclusions)¶
By its very nature, especially when it comes to high entropy scans, tartufo can encounter a number of false positives. Whether those are things like links to git commit hashes, tokens/passwords used for tests, or any other variety of thing, there needs to be a way to tell tartufo to ignore those things, and not report them out as issues. For this reason, we provide multiple methods for excluding these items.
Excluding Submodule Paths¶
New in version 2.7.0.
By default, any path in the repository specified as a submodule will be excluded from scans. Since these are upstream repositories over which you may not have direct control, tartufo will not hold you accountable for the secrets in those. If you want to include these in your scans, you can specify the --include-submodules option.
> tartufo ... --include-submodules
Entropy Limiting¶
New in version 2.5.0.
Entropy scans can produce a high number of false positives such as git SHAs or md5 digests. To avoid these false positives, enable exclude-entropy-patterns. Exclusions apply to any strings flagged by entropy checks. This option is not available on the command line, and must be specified in your config file.
For example, if docs/README.md contains a git SHA and .github/workflows/*.yml contains pinned git SHAs, these would be flagged by entropy. To exclude these, add the following entries to exclude-entropy-patterns in the config file.
[tool.tartufo]
exclude-entropy-patterns = [
    {path-pattern = 'docs/.*\.md$', pattern = '^[a-zA-Z0-9]{40}$', reason = 'exclude all git SHAs in the docs'},
    {path-pattern = '\.github/workflows/.*\.yml', pattern = 'uses: .*@[a-zA-Z0-9]{40}', reason = 'GitHub Actions'}
]
Note
match-type is used to select the search or match regex operation. search looks for the regex anywhere in the selected scope, while match requires the regex to match at the beginning of the selected scope. Defaults to search.
scope is used to specify whether the regex operation (search or match) is performed by word or by line. word means exactly the high-entropy string of characters, while line searches the entire input line containing the high-entropy string. Defaults to line.
Thanks to the magic of TOML, you could also split these out into their own tables in the config if you wanted. So the following would be 100% equivalent to what you see above:
[[tool.tartufo.exclude-entropy-patterns]]
path-pattern = 'docs/.*\.md$'
pattern = '^[a-zA-Z0-9]{40}$'
reason = 'exclude all git SHAs in the docs'

[[tool.tartufo.exclude-entropy-patterns]]
path-pattern = '\.github/workflows/.*\.yml'
pattern = 'uses: .*@[a-zA-Z0-9]{40}'
reason = 'GitHub Actions'
Note
In reality, the only key you have to specify is pattern. If you do this, the pattern match will apply to all files that are scanned.
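The interaction between match-type (search or match) and scope (word or line) can be easier to see in code. The Python sketch below is only an illustration built on the standard `re` module. The function `is_excluded` and its parameters are hypothetical and do not reflect tartufo's internals, and the 40-character string is a made-up stand-in for a git SHA.

```python
import re


def is_excluded(line, word, pattern, match_type="search", scope="line"):
    """Decide whether a high-entropy word found in line matches an exclusion.

    scope selects what the regex is tested against: the word itself, or the
    whole line containing it. match_type selects re.search (anywhere in the
    target) or re.match (anchored at the beginning of the target).
    """
    target = word if scope == "word" else line
    regex = re.compile(pattern)
    if match_type == "match":
        return regex.match(target) is not None
    return regex.search(target) is not None


# Made-up 40-character hex string standing in for a pinned git SHA.
sha = "0123456789abcdef0123456789abcdef01234567"
line = "    uses: actions/checkout@" + sha
action_pattern = r"uses: .*@[a-zA-Z0-9]{40}"

by_line_search = is_excluded(line, sha, action_pattern)
by_line_match = is_excluded(line, sha, action_pattern, match_type="match")
by_word_search = is_excluded(line, sha, action_pattern, scope="word")
bare_sha_by_word = is_excluded(line, sha, r"^[a-zA-Z0-9]{40}$", scope="word")
```

In this sketch the defaults (search, line) exclude the pinned action. Switching to match fails because the line starts with indentation, and switching to word fails because the word alone does not contain the `uses:` prefix.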
Limiting by Signature¶
New in version 2.0.0.
Every time an issue is found during a scan, tartufo will generate a “signature” for that issue. This is a stable hash generated from the filename and the actual string that was identified as being an issue.
For example, you might see the following header in the output for an issue:
Looking at this information, it’s clear that this issue was found in a test file, and it’s probably okay. Of course, you will want to look at the actual body of what was found and determine that for yourself. But let’s say that this really is okay, and we want to tell tartufo to ignore this issue in future scans. To do this, you can either specify it on the command line…
> tartufo -e 2a3cb329b81351e357b09f1b97323ff726e72bd5ff8427c9295e6ef68226e1d1
# No output! Success!
>
Or you can add it to your config file, so that this exclusion is always remembered!
[tool.tartufo]
exclude-signatures = [
    "2a3cb329b81351e357b09f1b97323ff726e72bd5ff8427c9295e6ef68226e1d1",
]
Done! This particular issue will no longer show up in your scan results.
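To make the idea of a stable signature concrete, here is a small Python sketch that derives a hash from a filename and a matched string. The hash function (SHA-256), the way the two inputs are combined, and the name `issue_signature` are all assumptions made for illustration. tartufo's real signature scheme may differ, so values produced here should not be expected to match tartufo's output.

```python
import hashlib


def issue_signature(matched_string, file_path):
    """Return a hex digest that depends only on the string and the file path.

    Assumption: SHA-256 over "<string>$$<path>". This layout is hypothetical.
    """
    payload = f"{matched_string}$${file_path}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical finding: the same string reported in two different files.
sig_a = issue_signature("not-a-real-secret-value", "tests/test_example.py")
sig_b = issue_signature("not-a-real-secret-value", "src/example.py")
```

Because the digest depends only on the file path and the flagged string, the same finding produces the same signature on every scan, which is what makes an exclusion list like the one above workable.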
Limiting Scans by Path¶
New in version 2.5.0.
By default tartufo will scan all objects tracked by Git. You can limit scanning by either including fewer paths or excluding some of them using Python Regular Expressions (regex) and the --include-path-patterns and --exclude-path-patterns options.
Warning
Using include patterns is more dangerous, since it’s easy to miss the creation of new secrets if future files don’t match an existing include rule. We recommend only using fine-grained exclude patterns instead.
[tool.tartufo]
include-path-patterns = [
    'src/',
    'gradle/',
    # regexes must match the entire path, but can use python's regex syntax
    # for case-insensitive matching and other advanced options
    '(.*/)?id_[rd]sa$',
    # Single quoted strings in TOML don't require escapes for `\` in regexes
    '(?i).*\.(properties|conf|ini|txt|y(a)?ml)$',
]
exclude-path-patterns = [
    '(.*/)?\.classpath$',
    '.*\.jmx$',
    '(.*/)?test/(.*/)?resources/',
]
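The following Python sketch shows one plausible way include and exclude patterns could combine. It rests on assumptions that are not confirmed by this document: a path is scanned when it matches at least one include pattern (or no include patterns are given) and matches no exclude pattern, and patterns are tested with `re.match`. The function name `should_scan` is hypothetical.

```python
import re


def should_scan(path, include_patterns=(), exclude_patterns=()):
    """Return True if path passes the include filter and no exclude applies."""
    # Assumption: an empty include list means every path is a candidate.
    included = not include_patterns or any(
        re.match(pattern, path) for pattern in include_patterns
    )
    excluded = any(re.match(pattern, path) for pattern in exclude_patterns)
    return included and not excluded


includes = [r"(.*/)?id_[rd]sa$", r"(?i).*\.(properties|conf|ini|txt|y(a)?ml)$"]
excludes = [r"(.*/)?\.classpath$", r".*\.jmx$"]

yaml_included = should_scan("config/App.YAML", includes, excludes)
python_skipped = should_scan("src/main.py", includes, excludes)
classpath_skipped = should_scan("app/.classpath", exclude_patterns=excludes)
```

The second result shows the risk of include patterns: a source file that matches no include rule is silently skipped, whereas an exclude-only setup scans everything that is not explicitly ruled out.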
The filter expressions can also be specified as command line arguments. Patterns specified like this are merged with any patterns specified in the config file:
> tartufo \
--include-path-patterns 'src/' -ip 'gradle/' \
--exclude-path-patterns '(.*/)?\.classpath$' -xp '.*\.jmx$' \
scan-local-repo file://path/to/my/repo.git
Additional usage information is provided when calling tartufo with the -h or --help options.
These features help cut down on noise, and make the tool easier to integrate into a DevOps pipeline.