Siegfried

Siegfried is a signature-based file format identification tool.

It implements:


Install

Windows

Mac Homebrew (or Homebrew on Linux)

brew install richardlehane/digipres/siegfried

(a fork of mistydemeo/digipres/siegfried)

Ubuntu/Debian (64 bit)

curl -sL "http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x20F802FE798E6857" | gpg --dearmor | sudo tee /usr/share/keyrings/siegfried-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/siegfried-archive-keyring.gpg] https://www.itforarchivists.com/ buster main" | sudo tee -a /etc/apt/sources.list.d/siegfried.list
sudo apt-get update && sudo apt-get install siegfried

FreeBSD

pkg install siegfried

Arch Linux

git clone https://aur.archlinux.org/siegfried.git
cd siegfried
makepkg -si

Usage

Identify files and directories

sf file.ext // Identify a file
sf *.ext // Identify groups of files using a glob pattern
sf DIR // Identify all files in a directory and its subdirectories
sf -nr DIR // Identify all files in a directory but don't recurse into subdirectories

Save output

sf PATH > my_results.yaml // Use a redirect (">") to save your results. PATH means file.ext, *.ext or DIR
sf -csv PATH > my_results.csv // Get identification results in CSV format (default is YAML)
sf -json PATH > my_results.json // Get identification results in JSON format
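
Saved JSON results can be summarised with a few lines of Python. The sketch below tallies identifications per format ID. It assumes the JSON output has a top-level "files" list whose entries each carry a "matches" list with an "id" field. That layout, the count_formats name, and the sample data are assumptions for illustration, so check them against your own output.

```python
import json
from collections import Counter


def count_formats(results: dict) -> Counter:
    """Tally how many matches were reported for each format ID."""
    counts = Counter()
    for entry in results.get("files", []):
        for match in entry.get("matches", []):
            # Fall back to "UNKNOWN" if a match carries no id field.
            counts[match.get("id", "UNKNOWN")] += 1
    return counts


# Hypothetical sample mimicking the assumed layout of `sf -json` output.
sample = json.loads("""
{"files": [
  {"filename": "a.pdf", "matches": [{"ns": "pronom", "id": "fmt/19"}]},
  {"filename": "b.pdf", "matches": [{"ns": "pronom", "id": "fmt/19"}]},
  {"filename": "c.bin", "matches": [{"ns": "pronom", "id": "UNKNOWN"}]}
]}
""")

summary = count_formats(sample)
for format_id, n in summary.most_common():
    print(format_id, n)
```

To run it on real output, load my_results.json with json.load in place of the sample.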

Additional commands

sf -z PATH // Scan within zip, tar, gzip, warc or arc files
sf -hash sha1 PATH // Calculate md5, sha1, sha256, sha512, or crc hash
sf -multi 32 PATH // Scan many files at once
sf -setconf -multi 32 -hash md5 -csv // Save your preferred configuration
sf -setconf -csv -conf csv.conf // Save (or load) named configurations with -conf
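
When sf is driven from a script, the flags above can be assembled programmatically. The following is a minimal sketch, not part of siegfried itself: build_sf_command and run_sf are hypothetical helper names, and only flags shown in this section (-csv, -json, -hash, -multi, -z, -nr) are used. Flags are placed before PATH, matching the examples above.

```python
import subprocess


def build_sf_command(path, output="yaml", hash_alg=None, multi=None,
                     scan_archives=False, recurse=True):
    """Assemble an sf argument list from the options shown above."""
    cmd = ["sf"]
    if output in ("csv", "json"):
        cmd.append("-" + output)  # YAML is the default, so it needs no flag
    elif output != "yaml":
        raise ValueError(f"unsupported output format: {output}")
    if hash_alg:
        cmd += ["-hash", hash_alg]
    if multi:
        cmd += ["-multi", str(multi)]
    if scan_archives:
        cmd.append("-z")
    if not recurse:
        cmd.append("-nr")
    cmd.append(path)  # PATH goes last, after all flags
    return cmd


def run_sf(path, **options):
    """Run sf and return its standard output (requires sf on the PATH)."""
    cmd = build_sf_command(path, **options)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


print(build_sf_command("DIR", output="json", hash_alg="sha1", multi=32))
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything is run.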

Update your signature file

sf -update

User guide

Detailed information on installing siegfried, identifying file formats, and more advanced topics is available on the wiki.

Modify your signature file

The roy tool builds siegfried signature files. For help using this tool, see this guide.

Examples


Code, License, Issues

To view the source code and license details, go to the project page on GitHub. Please post bugs or feature requests to the issues page.

Announcements

Join the Google Group for updates, signature releases, and help.

Try siegfried


Drag a file onto Siegfried's anvil!


Chart your results

Upload a siegfried, droid or fido results file for analysis and sharing.

You'll get a page like this: https://www.itforarchivists.com/siegfried/results/ea1zaj


Benchmarks

See how siegfried compares with other format identification tools by viewing these automated benchmarks.

To see how the next release is progressing, check out the develop benchmarks.


Sets tool

Format sets let the sf and roy tools group formats, e.g. roy build -limit @pdf. They can be useful elsewhere too, e.g. to isolate the formats Apache Tika can extract text from. This widget converts format sets to plain text or code snippets.

1. Select one or more PUIDs or sets (e.g. fmt/123 or @pdf). Sets are prefixed with the '@' symbol.

2. Choose your output format: plain text, or a code snippet with a function that matches an input PUID against a set.
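
The second output option, a function that matches an input PUID against a set, might look roughly like this in Python. This is a sketch of the idea and not the widget's actual output: the in_set name is hypothetical, and the three PUIDs are placeholders, not the real contents of @pdf.

```python
# Placeholder members for a hypothetical set; substitute the PUIDs
# produced by the Sets tool for your own selection.
PDF_SET = frozenset({"fmt/14", "fmt/15", "fmt/16"})


def in_set(puid: str, members: frozenset = PDF_SET) -> bool:
    """Return True if the given PUID belongs to the set."""
    # Normalise whitespace and case before the membership test.
    return puid.strip().lower() in members


print(in_set("fmt/15"))
print(in_set("x-fmt/111"))
```

A frozenset keeps the lookup constant-time and guards against accidental modification of the set at runtime.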