
ct-sans

A tool that downloads certificates from CT logs recognized by Google Chrome, storing the encountered Subject Alternative Names (SANs) to disk. The dataset can be assembled so that it is de-duplicated with one SAN per line.

Warning: research prototype. The source code may also be moved.

Quickstart

Install

You will need a Go compiler and GNU sort on the local system:

$ which go >/dev/null || echo "Go compiler not in PATH"
$ which sort >/dev/null || echo "GNU sort not in PATH"

Install ct-sans:

$ go install git.cs.kau.se/rasmoste/ct-sans@latest
$ which ct-sans >/dev/null || echo "ct-sans not in PATH"

List all options:

$ ct-sans -h

Snapshot

Download and verify the signature of Google's list of known logs, then download and verify the signatures of the logs' tree heads:

$ ct-sans snapshot -d $HOME/ct-sans-demo
2023/03/23 12:43:49 cmd_snapshot.go:30: INFO: updating metadata file
2023/03/23 12:43:49 cmd_snapshot.go:47: INFO: updating signed tree heads
2023/03/23 12:43:49 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2023' log at tree size 862104911
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2024' log at tree size 55767940
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2023' log at tree size 990277299
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2024' log at tree size 66655425
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2023' Log at tree size 527018586
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2024' Log at tree size 34050592
2023/03/23 12:43:51 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2024 Log at tree size 38426463
2023/03/23 12:43:53 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2025 Log at tree size 697
2023/03/23 12:43:54 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2023 Log at tree size 200387219
2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2024 Log at tree size 40017666
2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2025 Log at tree size 704
2023/03/23 12:43:56 cmd_snapshot.go:82: INFO: bootstrapped Sectigo 'Sabre' CT log at tree size 229064032
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2023' log at tree size 467618545
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H1' log at tree size 34451205
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H2' log at tree size 14680
2023/03/23 12:43:59 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2023 at tree size 388349
2023/03/23 12:44:01 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2024-2 at tree size 112771

Subsequent uses of the snapshot command will update the signed list of known logs, then update the logs' signed tree heads after verifying consistency.
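
The tree heads above each commit to a log's contents through a tree size and a Merkle tree root hash, as defined in RFC 6962. The following Python sketch shows how such a root hash is computed from a list of entries. It is an illustration only: the function names are hypothetical, the entries are placeholder byte strings rather than real log leaves, and it says nothing about how ct-sans is implemented internally.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962: leaf hashes are prefixed with 0x00.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: interior node hashes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_tree_hash(entries: list[bytes]) -> bytes:
    n = len(entries)
    if n == 0:
        # The hash of an empty tree is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_tree_hash(entries[:k]), merkle_tree_hash(entries[k:]))


if __name__ == "__main__":
    old = [b"entry-0", b"entry-1"]          # placeholder entries
    new = old + [b"entry-2"]
    print(merkle_tree_hash(old).hex())
    print(merkle_tree_hash(new).hex())
```

Because logs are append-only, the root hash of an older tree head must equal the hash over the first entries of any newer tree. A consistency proof lets a client check this without downloading the entries again.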

Collect

Download and verify the logs' Merkle trees up until the current snapshot:

$ ct-sans collect -d $HOME/ct-sans-demo
...
INFO: status update before shutdown

            Google 'Argon2023' log  |   162.5 entries/s  |  Estimated done in 1474.01 hours  |  Working on [11776, 862104911)
            Google 'Argon2024' log  |   157.5 entries/s  |  Estimated done in  98.31 hours  |  Working on [11584, 55767940)
            Google 'Xenon2023' log  |   472.6 entries/s  |  Estimated done in 582.01 hours  |  Working on [33888, 990277299)
            Google 'Xenon2024' log  |   458.5 entries/s  |  Estimated done in  40.37 hours  |  Working on [32896, 66655425)
       Cloudflare 'Nimbus2023' Log  |   276.1 entries/s  |  Estimated done in 530.24 hours  |  Working on [19328, 527018586)
       Cloudflare 'Nimbus2024' Log  |   301.2 entries/s  |  Estimated done in  31.39 hours  |  Working on [20736, 34050592)
             DigiCert Yeti2024 Log  |   379.1 entries/s  |  Estimated done in  28.14 hours  |  Working on [27520, 38426463)
             DigiCert Yeti2025 Log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [697, 697)
           DigiCert Nessie2023 Log  |   331.3 entries/s  |  Estimated done in 168.00 hours  |  Working on [23040, 200387219)
           DigiCert Nessie2024 Log  |   329.8 entries/s  |  Estimated done in  33.68 hours  |  Working on [21120, 40017666)
           DigiCert Nessie2025 Log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [704, 704)
            Sectigo 'Sabre' CT log  |   275.7 entries/s  |  Estimated done in 230.78 hours  |  Working on [19456, 229064032)
       Let's Encrypt 'Oak2023' log  |   462.8 entries/s  |  Estimated done in 280.67 hours  |  Working on [33664, 467618545)
     Let's Encrypt 'Oak2024H1' log  |   121.4 entries/s  |  Estimated done in  78.79 hours  |  Working on [5248, 34451205)
     Let's Encrypt 'Oak2024H2' log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [14680, 14680)
                Trust Asia Log2023  |   215.8 entries/s  |  Estimated done in   0.48 hours  |  Working on [15872, 388349)
              Trust Asia Log2024-2  |   246.2 entries/s  |  Estimated done in   0.11 hours  |  Working on [17664, 112771)

This will take a while. How long depends on the local system, the configuration of the optional collect flags, and how heavily the logs apply rate limits. For reference, we downloaded the logs from scratch in less than 11 days using a machine with a single IP address that respects the logs' rate limits.
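
The estimates in the status output are consistent with dividing the number of remaining entries by the current download rate. A minimal Python sketch of that arithmetic follows. The function name and the handling of a zero rate are assumptions made for illustration, not taken from the ct-sans source.

```python
def eta_hours(next_index: int, tree_size: int, entries_per_second: float) -> float:
    """Estimate hours left for a log that is being read over [next_index, tree_size)."""
    remaining = tree_size - next_index
    if remaining <= 0:
        # Nothing left to download, e.g. "Working on [697, 697)".
        return 0.0
    if entries_per_second <= 0:
        # Assumption: no progress means no finite estimate.
        return float("inf")
    return remaining / entries_per_second / 3600


if __name__ == "__main__":
    # Values from the Trust Asia Log2023 row in the status output above.
    print(f"{eta_hours(15872, 388349, 215.8):.2f} hours")  # prints "0.48 hours"
```

For the larger logs the result differs slightly from the printed estimate, which is expected since the displayed rate is rounded to one decimal.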

Assemble

Once the collect phase is done, assemble the dataset:

$ echo "for demo-purposes, only Nessie2025 and Oak2024H2 are shown below"^C
$ ct-sans assemble -d $HOME/ct-sans-demo
2023/03/23 13:05:12 cmd_assemble.go:54: INFO: merging and de-duplicating 2 input files with GNU sort
2023/03/23 13:05:12 cmd_assemble.go:67: INFO: created /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans/sans.lst (0.3 MiB)
2023/03/23 13:05:12 cmd_assemble.go:69: INFO: adding notice file
2023/03/23 13:05:12 cmd_assemble.go:87: INFO: adding README
2023/03/23 13:05:12 cmd_assemble.go:96: INFO: adding signed metadata file
2023/03/23 13:05:12 cmd_assemble.go:108: INFO: adding signed tree heads
2023/03/23 13:05:12 cmd_assemble.go:117: INFO: uncompressed dataset available in /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans
$ cat $HOME/ct-sans-demo/archive/2023-03-23-ct-sans/README.md
# ct-sans dataset

Dataset assembled at Thu Mar 23 13:05:12 CET 2023.  Contents:

  - README.md
  - metadata.json
  - metadata.sig
  - sths.json
  - notice.txt
  - sans.lst

The signed [metadata file][] and tree heads were downloaded at
Thu Mar 23 12:43:49 CET 2023.

[metadata file]: https://groups.google.com/a/chromium.org/g/ct-policy/c/IdbrdAcDQto

In total, 15377 certificates were downloaded from 2 CT logs;
0 certificates contained SANs that could not be parsed.
For more information about these errors, see notice.txt.

The SANs data set is sorted and de-duplicated, one SAN per line.
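
The assemble log above reports that the per-log input files are merged and de-duplicated with GNU sort. The Python sketch below illustrates the same idea on small in-memory inputs: merge inputs that are already sorted and drop repeated lines. It is not how ct-sans does it (the tool delegates to GNU sort), the function name is hypothetical, and Python's code-point ordering may differ from the ordering GNU sort uses under a given locale.

```python
import heapq
from typing import Iterable, Iterator


def merge_dedup(sorted_inputs: Iterable[Iterable[str]]) -> Iterator[str]:
    """Merge already-sorted inputs into one sorted stream without duplicates."""
    previous = None
    # heapq.merge reads the inputs lazily, so large inputs are never fully in memory.
    for line in heapq.merge(*sorted_inputs):
        if line != previous:
            yield line
            previous = line


if __name__ == "__main__":
    # Placeholder SANs standing in for two per-log input files.
    log_a = ["a.example.org", "b.example.org"]
    log_b = ["b.example.org", "c.example.org"]
    for san in merge_dedup([log_a, log_b]):
        print(san)
```

Since duplicates end up adjacent in a sorted stream, comparing each line with the previous one is enough to keep one SAN per line.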

Good to know

  • It is safe to press Ctrl+C while collecting. Just wait for the collect command to exit on its own so that progress is persisted to disk.
  • The different ct-sans commands must not run at the same time.
  • The dataset can be updated by running the same snapshot, collect and assemble commands again.

Running a measurement

See how we collected the 2023-04-03-ct-sans dataset in docs/operations.md.

Contact

Licence

BSD 2-Clause License