Siegfried
Siegfried is a signature-based file format identification tool.
It implements:
- the National Archives UK’s PRONOM file format signatures
- freedesktop.org’s MIME-info file format signatures
- the Library of Congress’s FDD file format signatures (beta).
Install
Windows
- download the latest (v. 1.11.1) 64-bit binary
- copy to a location in your system path
- run the sf -update command to download the latest signatures (having trouble? Try this troubleshooting guide)
- if you want to build your own signatures with roy, copy the latest signature data into a “siegfried” directory within your user home directory (e.g. c:\users\richardl\siegfried): data.zip
Mac Homebrew (or Homebrew on Linux)
brew install richardlehane/digipres/siegfried
(a fork of mistydemeo/digipres/siegfried)
Ubuntu/Debian (64 bit)
curl -sL "http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x20F802FE798E6857" | gpg --dearmor | sudo tee /usr/share/keyrings/siegfried-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/siegfried-archive-keyring.gpg] https://www.itforarchivists.com/ buster main" | sudo tee -a /etc/apt/sources.list.d/siegfried.list
sudo apt-get update && sudo apt-get install siegfried
FreeBSD
pkg install siegfried
Arch Linux
git clone https://aur.archlinux.org/siegfried.git
cd siegfried
makepkg -si
Usage
Identify files and directories
sf file.ext // Identify a file
sf *.ext // Identify groups of files using a glob pattern
sf DIR // Identify all files in a directory and its subdirectories
sf -nr DIR // Identify all files in a directory but don't recurse into subdirectories
Save output
sf PATH > my_results.yaml // Use a redirect (">") to save your results. PATH means file.ext, *.ext or DIR
sf -csv PATH > my_results.csv // Get identification results in CSV format (default is YAML)
sf -json PATH > my_results.json // Get identification results in JSON format
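Once results are saved as JSON, they can be post-processed with a few lines of script. The following is a minimal Python sketch, not part of siegfried itself. It assumes the JSON output has a top-level "files" list whose entries carry a "filename" and a "matches" list with "ns" and "id" fields; the sample data and the function name puids_by_file are hypothetical, so check them against your own output before relying on them.

```python
import json

# Hypothetical, heavily trimmed sample of `sf -json` output.
# The field names ("files", "filename", "matches", "ns", "id") are assumptions.
SAMPLE = """
{"files": [
  {"filename": "a.pdf", "matches": [{"ns": "pronom", "id": "fmt/123"}]},
  {"filename": "b.bin", "matches": [{"ns": "pronom", "id": "UNKNOWN"}]}
]}
"""

def puids_by_file(results_json):
    """Map each filename to the list of identifiers reported for it."""
    data = json.loads(results_json)
    summary = {}
    for entry in data.get("files", []):
        # A file may have one match per identifier namespace.
        summary[entry["filename"]] = [m["id"] for m in entry.get("matches", [])]
    return summary

if __name__ == "__main__":
    for name, ids in puids_by_file(SAMPLE).items():
        print(name, ids)
```

To use it on real results, read my_results.json from disk and pass its contents to puids_by_file in place of SAMPLE.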
Additional commands
sf -z PATH // Scan within zip, tar, gzip, warc or arc files
sf -hash sha1 PATH // Calculate md5, sha1, sha256, sha512, or crc hash
sf -multi 32 PATH // Scan many files at once
sf -setconf -multi 32 -hash md5 -csv // Save your preferred configuration
sf -setconf -csv -conf csv.conf // Save (or load) named configurations with -conf
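If you call sf from a script, it can help to assemble the argument list in one place. This Python sketch builds a command from the flags listed above (-json, -csv, -hash, -z, -nr, -multi) and optionally runs it. The function names build_sf_command and run_sf and their parameters are hypothetical, and the sketch assumes sf is on your system path.

```python
import subprocess

def build_sf_command(path, fmt="json", hash_name=None,
                     scan_archives=False, recurse=True, multi=None):
    """Return an argument list for sf using only flags shown in this guide."""
    cmd = ["sf"]
    if fmt in ("json", "csv"):
        cmd.append("-" + fmt)          # YAML is the default, so it needs no flag
    elif fmt != "yaml":
        raise ValueError("fmt must be 'yaml', 'csv' or 'json'")
    if hash_name:
        cmd += ["-hash", hash_name]    # e.g. md5, sha1, sha256, sha512 or crc
    if scan_archives:
        cmd.append("-z")               # look inside zip, tar, gzip, warc, arc
    if not recurse:
        cmd.append("-nr")              # do not descend into subdirectories
    if multi is not None:
        cmd += ["-multi", str(multi)]  # scan many files at once
    cmd.append(path)
    return cmd

def run_sf(path, **options):
    """Run sf and return its standard output (assumes sf is installed)."""
    completed = subprocess.run(build_sf_command(path, **options),
                               capture_output=True, text=True, check=True)
    return completed.stdout
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.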
Update your signature file
sf -update
User guide
Detailed information about installing siegfried, identifying file formats, as well as more advanced topics, is available on the wiki.
Modify your signature file
The roy tool builds siegfried signature files. For help using this tool, see this guide.
Examples
Code, License, Issues
To view the source code and see the license details, go to the project page on GitHub. Please post any bugs or feature requests to the issues page.
Announcements
Join the Google Group for updates, signature releases, and help.
Try siegfried
Drag a file onto Siegfried's anvil!
Chart your results
Upload a siegfried, droid or fido results file for analysis and sharing.
You'll get a page like this: https://www.itforarchivists.com/siegfried/results/ea1zaj
Benchmarks
See how siegfried compares with other format identification tools by viewing these automated benchmarks.
To see how the next release is progressing, check out the develop benchmarks.
Sets tool
Format sets enable grouping of formats by the sf and roy tools: e.g. roy build -limit @pdf. They can be useful elsewhere, e.g. to isolate the formats Apache Tika can extract text from. This widget converts format sets to plain text or code snippets.
1. Select one or more PUIDs or sets (e.g. fmt/123 or @pdf). Sets are prefixed with the '@' symbol.
2. Choose your output format (plain text or a code snippet with a function that matches an input PUID against a set).
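A generated snippet of the kind described in step 2 might look roughly like the Python sketch below. This is not the widget's actual output. The set contents are placeholders chosen for illustration (they are not the real members of @pdf or any other set), and the names EXAMPLE_SET and in_set are hypothetical.

```python
# Placeholder members only; NOT the real contents of any siegfried format set.
EXAMPLE_SET = frozenset({"fmt/123", "fmt/124"})

def in_set(puid, members=EXAMPLE_SET):
    """Return True if the given PUID belongs to the set."""
    # Strip stray whitespace, e.g. from a CSV cell, before the lookup.
    return puid.strip() in members

if __name__ == "__main__":
    print(in_set("fmt/123"))
    print(in_set("fmt/999"))
```

A frozenset gives constant-time membership checks and cannot be modified by accident, which suits a fixed list of PUIDs.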