

A tool that downloads certificates from CT logs recognized by Google Chrome, storing the encountered Subject Alternative Names (SANs) to disk. The final data set sans.lst is de-duplicated and contains one SAN per line.

Warning: this is a research prototype, and the source code may be moved.
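To make the term concrete, here is a minimal sketch of reading the SANs of a parsed certificate with Go's standard crypto/x509 package. It is not taken from ct-sans: the helper names sansOf and demoSANs, the example names, and the choice to return only DNS names and IP addresses are assumptions made for illustration. Which SAN types ct-sans actually records is not stated here.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// sansOf returns the DNS names and IP addresses that crypto/x509 parsed from
// the certificate's Subject Alternative Name extension. Other SAN types
// (email addresses, URIs) are ignored in this sketch.
func sansOf(cert *x509.Certificate) []string {
	sans := append([]string{}, cert.DNSNames...)
	for _, ip := range cert.IPAddresses {
		sans = append(sans, ip.String())
	}
	return sans
}

// demoSANs creates a throwaway self-signed certificate with made-up names,
// parses it back from DER, and returns its SANs.
func demoSANs() ([]string, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     []string{"example.org", "www.example.org"},
		IPAddresses:  []net.IP{net.ParseIP("192.0.2.1")},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, err
	}
	return sansOf(cert), nil
}

func main() {
	sans, err := demoSANs()
	if err != nil {
		panic(err)
	}
	for _, san := range sans {
		fmt.Println(san) // one SAN per line, like sans.lst
	}
}
```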

Quick start

You will need a Go compiler and GNU sort on the local system:

$ which go || echo 'Go compiler is not in $PATH'
$ which sort || echo 'GNU sort is not in $PATH'

Install ct-sans:

$ go install git.cs.kau.se/rasmoste/ct-sans@latest
$ which ct-sans || echo 'ct-sans is not in $PATH'

Download and verify the signature of Google's list of known logs, then download and verify the signatures of the logs' tree heads:

$ ct-sans snapshot -d $HOME/ct-sans-demo
2023/03/23 12:43:49 cmd_snapshot.go:30: INFO: updating metadata file
2023/03/23 12:43:49 cmd_snapshot.go:47: INFO: updating signed tree heads
2023/03/23 12:43:49 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2023' log at tree size 862104911
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2024' log at tree size 55767940
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2023' log at tree size 990277299
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2024' log at tree size 66655425
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2023' Log at tree size 527018586
2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2024' Log at tree size 34050592
2023/03/23 12:43:51 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2024 Log at tree size 38426463
2023/03/23 12:43:53 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2025 Log at tree size 697
2023/03/23 12:43:54 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2023 Log at tree size 200387219
2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2024 Log at tree size 40017666
2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2025 Log at tree size 704
2023/03/23 12:43:56 cmd_snapshot.go:82: INFO: bootstrapped Sectigo 'Sabre' CT log at tree size 229064032
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2023' log at tree size 467618545
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H1' log at tree size 34451205
2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H2' log at tree size 14680
2023/03/23 12:43:59 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2023 at tree size 388349
2023/03/23 12:44:01 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2024-2 at tree size 112771

Subsequent uses of the snapshot command will update the signed list of known logs, then update the logs' signed tree heads after verifying consistency.

Download and verify the logs' Merkle trees up until the current snapshot:

$ ct-sans collect -d $HOME/ct-sans-demo
INFO: status update before shutdown

            Google 'Argon2023' log  |   162.5 entries/s  |  Estimated done in 1474.01 hours  |  Working on [11776, 862104911)
            Google 'Argon2024' log  |   157.5 entries/s  |  Estimated done in  98.31 hours  |  Working on [11584, 55767940)
            Google 'Xenon2023' log  |   472.6 entries/s  |  Estimated done in 582.01 hours  |  Working on [33888, 990277299)
            Google 'Xenon2024' log  |   458.5 entries/s  |  Estimated done in  40.37 hours  |  Working on [32896, 66655425)
       Cloudflare 'Nimbus2023' Log  |   276.1 entries/s  |  Estimated done in 530.24 hours  |  Working on [19328, 527018586)
       Cloudflare 'Nimbus2024' Log  |   301.2 entries/s  |  Estimated done in  31.39 hours  |  Working on [20736, 34050592)
             DigiCert Yeti2024 Log  |   379.1 entries/s  |  Estimated done in  28.14 hours  |  Working on [27520, 38426463)
             DigiCert Yeti2025 Log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [697, 697)
           DigiCert Nessie2023 Log  |   331.3 entries/s  |  Estimated done in 168.00 hours  |  Working on [23040, 200387219)
           DigiCert Nessie2024 Log  |   329.8 entries/s  |  Estimated done in  33.68 hours  |  Working on [21120, 40017666)
           DigiCert Nessie2025 Log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [704, 704)
            Sectigo 'Sabre' CT log  |   275.7 entries/s  |  Estimated done in 230.78 hours  |  Working on [19456, 229064032)
       Let's Encrypt 'Oak2023' log  |   462.8 entries/s  |  Estimated done in 280.67 hours  |  Working on [33664, 467618545)
     Let's Encrypt 'Oak2024H1' log  |   121.4 entries/s  |  Estimated done in  78.79 hours  |  Working on [5248, 34451205)
     Let's Encrypt 'Oak2024H2' log  |     0.0 entries/s  |  Estimated done in   0.00 hours  |  Working on [14680, 14680)
                Trust Asia Log2023  |   215.8 entries/s  |  Estimated done in   0.48 hours  |  Working on [15872, 388349)
              Trust Asia Log2024-2  |   246.2 entries/s  |  Estimated done in   0.11 hours  |  Working on [17664, 112771)
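For background on what verifying a Merkle tree involves, the sketch below computes a Merkle tree hash over a list of leaves as defined in RFC 6962 (SHA-256 with 0x00 and 0x01 prefixes for leaves and interior nodes). How ct-sans structures its own verification, for example in chunks or with proofs, is not described here, so treat this as an illustration of the underlying hash only. The function names are made up for this example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash is the RFC 6962 leaf hash: SHA-256(0x00 || leaf).
func leafHash(leaf []byte) [32]byte {
	return sha256.Sum256(append([]byte{0x00}, leaf...))
}

// nodeHash is the RFC 6962 interior node hash: SHA-256(0x01 || left || right).
func nodeHash(left, right [32]byte) [32]byte {
	buf := make([]byte, 0, 65)
	buf = append(buf, 0x01)
	buf = append(buf, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// rootHash computes the Merkle tree hash over the leaves. The tree is split
// at the largest power of two strictly smaller than the number of leaves.
func rootHash(leaves [][]byte) [32]byte {
	switch n := len(leaves); n {
	case 0:
		return sha256.Sum256(nil) // hash of the empty string
	case 1:
		return leafHash(leaves[0])
	default:
		k := 1
		for k*2 < n {
			k *= 2
		}
		return nodeHash(rootHash(leaves[:k]), rootHash(leaves[k:]))
	}
}

// hexRoot is a convenience wrapper that returns the root hash as hex.
func hexRoot(leaves ...string) string {
	data := make([][]byte, 0, len(leaves))
	for _, l := range leaves {
		data = append(data, []byte(l))
	}
	return fmt.Sprintf("%x", rootHash(data))
}

func main() {
	// Placeholder leaves; real leaves are the log's serialized entries.
	fmt.Println(hexRoot("entry-0", "entry-1", "entry-2"))
}
```

A client that recomputes the root over all downloaded entries and compares it with the root hash in the signed tree head knows that it saw exactly the entries the log committed to.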

This will take a while, depending on the local system, the configuration of the optional collect flags, and how heavily the logs apply rate limits. For good performance while respecting rate limits, you may want to try --workers 40 --batch-disk 131072 --batch-req 2048 --metrics 60m. With these flags, we downloaded the logs (March 2023) in approximately 10 days. Our machine was located in the EU with a 2 TiB SSD, 64 GiB of memory, 16 CPU cores, and a 1 Gbps line.
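The "Estimated done in" column of the status output can be reproduced with simple arithmetic: remaining entries divided by the current rate. The sketch below is an assumption about how such an estimate is derived, not the tool's actual code, and etaHours is a made-up name.

```go
package main

import "fmt"

// etaHours estimates how many hours remain to download the entries in
// [next, size) at the given rate in entries per second. It returns 0 when
// nothing remains or when the rate is not positive.
func etaHours(next, size uint64, rate float64) float64 {
	if rate <= 0 || next >= size {
		return 0
	}
	return float64(size-next) / rate / 3600
}

func main() {
	// Numbers from the Sectigo 'Sabre' row: working on [19456, 229064032)
	// at 275.7 entries/s.
	fmt.Printf("%.1f hours\n", etaHours(19456, 229064032, 275.7))
}
```

For the Sabre row this gives roughly 230.8 hours, in line with the 230.78 hours shown in the status output; the small difference comes from the rate being rounded to one decimal in the display.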

It is safe to press Ctrl+C while collecting. Just wait for the collect command to exit on its own so that progress is persisted to disk.

Once the collect phase is done, assemble the data set:

$ echo "for demo-purposes, only Nessie2025 and Oak2024H2 are shown below"^C
$ ct-sans assemble -d $HOME/ct-sans-demo
2023/03/23 13:05:12 cmd_assemble.go:54: INFO: merging and de-duplicating 2 input files with GNU sort
2023/03/23 13:05:12 cmd_assemble.go:67: INFO: created /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans/sans.lst (0.3 MiB)
2023/03/23 13:05:12 cmd_assemble.go:69: INFO: adding notice file
2023/03/23 13:05:12 cmd_assemble.go:87: INFO: adding README
2023/03/23 13:05:12 cmd_assemble.go:96: INFO: adding signed metadata file
2023/03/23 13:05:12 cmd_assemble.go:108: INFO: adding signed tree heads
2023/03/23 13:05:12 cmd_assemble.go:117: INFO: uncompressed dataset available in /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans
$ cat $HOME/ct-sans-demo/archive/2023-03-23-ct-sans/README.md
# ct-sans dataset

Dataset assembled at Thu Mar 23 13:05:12 CET 2023.  Contents:

  - README.md
  - metadata.json
  - metadata.sig
  - sths.json
  - notice.txt
  - sans.lst

The signed [metadata file][] and tree heads were downloaded at
Thu Mar 23 12:43:49 CET 2023.

[metadata file]: https://groups.google.com/a/chromium.org/g/ct-policy/c/IdbrdAcDQto

In total, 15377 certificates were downloaded from 2 CT logs;
0 certificates contained SANs that could not be parsed.
For more information about these errors, see notice.txt.

The SANs data set is sorted and de-duplicated, one SAN per line.
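A consumer of sans.lst may want to check the "sorted and de-duplicated, one SAN per line" property before relying on it, for instance for binary search. The sketch below streams the lines and reports the first one that is not strictly greater than its predecessor. It assumes byte-wise ordering (as GNU sort produces under LC_ALL=C); the collation ct-sans uses when invoking sort is not stated here, so a different locale could make this check report false positives. The function names are made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// checkSortedUnique reads one SAN per line and returns the 1-based number of
// the first line that is not strictly greater (in byte order) than the line
// before it. It returns 0 if all lines are strictly increasing, which means
// the input is both sorted and free of duplicates.
func checkSortedUnique(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	prev, line := "", 0
	for sc.Scan() {
		line++
		cur := sc.Text()
		if line > 1 && cur <= prev {
			return line, nil
		}
		prev = cur
	}
	return 0, sc.Err()
}

// checkText runs checkSortedUnique on an in-memory string.
func checkText(s string) int {
	n, _ := checkSortedUnique(strings.NewReader(s))
	return n
}

func main() {
	// Made-up sample; for a real data set, pass the file returned by os.Open.
	sample := "a.example.org\nb.example.org\nb.example.org\n"
	if n := checkText(sample); n != 0 {
		fmt.Printf("line %d is out of order or a duplicate\n", n)
	} else {
		fmt.Println("sorted and de-duplicated")
	}
}
```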

Note: the different ct-sans commands must not run at the same time.

Updating the data set

Simply run the same snapshot, collect, and assemble commands again.



BSD 2-Clause License