# ct-sans
A tool that downloads certificates from [CT logs][] [recognized by Google
Chrome][], storing the encountered [Subject Alternative Names (SANs)][] to disk.
The dataset can be assembled so that it is de-duplicated with one SAN per line.

[CT logs]: https://certificate.transparency.dev/
[recognized by Google Chrome]: https://groups.google.com/a/chromium.org/g/ct-policy/c/IdbrdAcDQto/
[Subject Alternative Names (SANs)]: https://www.rfc-editor.org/rfc/rfc5280#section-4.2.1.6

**Warning:** research prototype.
## Quickstart
### Install
You will need a [Go compiler][] and [GNU sort][] on the local system:

    $ which go >/dev/null || echo "Go compiler not in PATH"
    $ which sort >/dev/null || echo "GNU sort not in PATH"

[Go compiler]: https://go.dev/doc/install
[GNU sort]: https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html

Install `ct-sans`:

    $ go install git.cs.kau.se/rasmoste/ct-sans@latest
    $ which ct-sans >/dev/null || echo "ct-sans not in PATH"

List all options:

    $ ct-sans -h
### Snapshot
Download and verify the signature of Google's list of known logs,
then download and verify the signatures of the logs' tree heads:

    $ ct-sans snapshot -d $HOME/ct-sans-demo
    2023/03/23 12:43:49 cmd_snapshot.go:30: INFO: updating metadata file
    2023/03/23 12:43:49 cmd_snapshot.go:47: INFO: updating signed tree heads
    2023/03/23 12:43:49 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2023' log at tree size 862104911
    2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Argon2024' log at tree size 55767940
    2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2023' log at tree size 990277299
    2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Google 'Xenon2024' log at tree size 66655425
    2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2023' Log at tree size 527018586
    2023/03/23 12:43:50 cmd_snapshot.go:82: INFO: bootstrapped Cloudflare 'Nimbus2024' Log at tree size 34050592
    2023/03/23 12:43:51 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2024 Log at tree size 38426463
    2023/03/23 12:43:53 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Yeti2025 Log at tree size 697
    2023/03/23 12:43:54 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2023 Log at tree size 200387219
    2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2024 Log at tree size 40017666
    2023/03/23 12:43:55 cmd_snapshot.go:82: INFO: bootstrapped DigiCert Nessie2025 Log at tree size 704
    2023/03/23 12:43:56 cmd_snapshot.go:82: INFO: bootstrapped Sectigo 'Sabre' CT log at tree size 229064032
    2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2023' log at tree size 467618545
    2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H1' log at tree size 34451205
    2023/03/23 12:43:57 cmd_snapshot.go:82: INFO: bootstrapped Let's Encrypt 'Oak2024H2' log at tree size 14680
    2023/03/23 12:43:59 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2023 at tree size 388349
    2023/03/23 12:44:01 cmd_snapshot.go:82: INFO: bootstrapped Trust Asia Log2024-2 at tree size 112771

Subsequent uses of the `snapshot` command will update the signed list of known
logs, then update the logs' signed tree heads after verifying consistency.
### Collect
Download and verify the logs' Merkle trees up until the current snapshot:

    $ ct-sans collect -d $HOME/ct-sans-demo
    ...
    INFO: status update before shutdown
    Google 'Argon2023' log | 162.5 entries/s | Estimated done in 1474.01 hours | Working on [11776, 862104911)
    Google 'Argon2024' log | 157.5 entries/s | Estimated done in 98.31 hours | Working on [11584, 55767940)
    Google 'Xenon2023' log | 472.6 entries/s | Estimated done in 582.01 hours | Working on [33888, 990277299)
    Google 'Xenon2024' log | 458.5 entries/s | Estimated done in 40.37 hours | Working on [32896, 66655425)
    Cloudflare 'Nimbus2023' Log | 276.1 entries/s | Estimated done in 530.24 hours | Working on [19328, 527018586)
    Cloudflare 'Nimbus2024' Log | 301.2 entries/s | Estimated done in 31.39 hours | Working on [20736, 34050592)
    DigiCert Yeti2024 Log | 379.1 entries/s | Estimated done in 28.14 hours | Working on [27520, 38426463)
    DigiCert Yeti2025 Log | 0.0 entries/s | Estimated done in 0.00 hours | Working on [697, 697)
    DigiCert Nessie2023 Log | 331.3 entries/s | Estimated done in 168.00 hours | Working on [23040, 200387219)
    DigiCert Nessie2024 Log | 329.8 entries/s | Estimated done in 33.68 hours | Working on [21120, 40017666)
    DigiCert Nessie2025 Log | 0.0 entries/s | Estimated done in 0.00 hours | Working on [704, 704)
    Sectigo 'Sabre' CT log | 275.7 entries/s | Estimated done in 230.78 hours | Working on [19456, 229064032)
    Let's Encrypt 'Oak2023' log | 462.8 entries/s | Estimated done in 280.67 hours | Working on [33664, 467618545)
    Let's Encrypt 'Oak2024H1' log | 121.4 entries/s | Estimated done in 78.79 hours | Working on [5248, 34451205)
    Let's Encrypt 'Oak2024H2' log | 0.0 entries/s | Estimated done in 0.00 hours | Working on [14680, 14680)
    Trust Asia Log2023 | 215.8 entries/s | Estimated done in 0.48 hours | Working on [15872, 388349)
    Trust Asia Log2024-2 | 246.2 entries/s | Estimated done in 0.11 hours | Working on [17664, 112771)

How long this takes depends on the local system, the configuration of the
optional `collect` flags, and how heavily the logs apply rate limits.
For reference, we [downloaded the logs](./docs/operations.md) from scratch in
less than 11 days using a machine with a single IP address. Note that it would
have taken roughly twice as long if the same measurement had been started
during the fall (because the current year's log shards would then have larger
backlogs).
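The estimates in the status output above are consistent with simple arithmetic: the number of remaining entries divided by the download rate. The following is a minimal sketch of that calculation, not the code `ct-sans` uses; the function name `eta_hours` and its argument order are hypothetical.

```shell
# eta_hours NEXT END RATE
# Print the estimated hours needed to download entries [NEXT, END)
# at RATE entries per second, rounded to two decimals.
# Hypothetical helper for illustration; not part of ct-sans.
eta_hours() {
    awk -v next_index="$1" -v end_index="$2" -v rate="$3" 'BEGIN {
        remaining = end_index - next_index
        if (remaining <= 0) { print "0.00"; exit }   # nothing left to download
        if (rate <= 0) { print "unknown"; exit }     # avoid division by zero
        printf "%.2f\n", remaining / rate / 3600
    }'
}

eta_hours 17664 112771 246.2      # Trust Asia Log2024-2, prints 0.11
eta_hours 11776 862104911 162.5   # Google 'Argon2023' log, prints 1473.66
```

The Argon2023 result is slightly below the 1474.01 hours shown above, presumably because the displayed rate is rounded to one decimal.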
### Assemble
Once the collect phase is done, assemble the dataset:

    $ echo "for demo-purposes, only Nessie2025 and Oak2024H2 are shown below"^C
    $ ct-sans assemble -d $HOME/ct-sans-demo
    2023/03/23 13:05:12 cmd_assemble.go:54: INFO: merging and de-duplicating 2 input files with GNU sort
    2023/03/23 13:05:12 cmd_assemble.go:67: INFO: created /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans/sans.lst (0.3 MiB)
    2023/03/23 13:05:12 cmd_assemble.go:69: INFO: adding notice file
    2023/03/23 13:05:12 cmd_assemble.go:87: INFO: adding README
    2023/03/23 13:05:12 cmd_assemble.go:96: INFO: adding signed metadata file
    2023/03/23 13:05:12 cmd_assemble.go:108: INFO: adding signed tree heads
    2023/03/23 13:05:12 cmd_assemble.go:117: INFO: uncompressed dataset available in /home/rgdd/ct-sans-demo/archive/2023-03-23-ct-sans
    $ cat $HOME/ct-sans-demo/archive/2023-03-23-ct-sans/README.md
    # ct-sans dataset
    Dataset assembled at Thu Mar 23 13:05:12 CET 2023. Contents:
    - README.md
    - metadata.json
    - metadata.sig
    - sths.json
    - notice.txt
    - sans.lst
    The signed [metadata file][] and tree heads were downloaded at
    Thu Mar 23 12:43:49 CET 2023.
    [metadata file]: https://groups.google.com/a/chromium.org/g/ct-policy/c/IdbrdAcDQto
    In total, 15377 certificates were downloaded from 2 CT logs;
    0 certificates contained SANs that could not be parsed.
    For more information about these errors, see notice.txt.
    The SANs data set is sorted and de-duplicated, one SAN per line.
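The `assemble` output above mentions merging and de-duplicating the input files with GNU sort. As a rough illustration of what such a step produces, the sketch below merges two small SAN lists by hand. The exact `sort` options that `ct-sans` passes are not stated here, so the flags are an assumption, and all file names are hypothetical.

```shell
# Work in a scratch directory; all file names below are hypothetical.
workdir=$(mktemp -d)
printf 'example.org\nwww.example.org\n' > "$workdir/log-a.lst"
printf 'example.org\nexample.com\n' > "$workdir/log-b.lst"

# -u drops duplicate lines; LC_ALL=C compares by byte value so that the
# result does not depend on the local locale settings.
LC_ALL=C sort -u -o "$workdir/sans.lst" "$workdir/log-a.lst" "$workdir/log-b.lst"

cat "$workdir/sans.lst"
```

This prints `example.com`, `example.org` and `www.example.org`, one per line; the duplicate `example.org` appears only once.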
### Good to know
- It is safe to press Ctrl+C while collecting. Just wait for the `collect`
  command to exit on its own so that progress is persisted to disk.
- The different `ct-sans` commands must not run at the same time.
- The dataset can be updated by running the same `snapshot`, `collect` and
  `assemble` commands again.
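One way to follow the last two points is a small wrapper that runs the three commands back to back and refuses to start while another run is in progress. This is only a sketch: the function name `update_dataset`, the `CT_SANS` override and the lock directory next to the data directory are hypothetical conventions of the example, not features of `ct-sans`.

```shell
# update_dataset DIR: run snapshot, collect and assemble in sequence.
# The body runs in a subshell so that the EXIT trap only affects this call.
update_dataset() (
    dir=$1
    ct_sans=${CT_SANS:-ct-sans}   # hypothetical override, e.g. for a stub
    lock="${dir%/}.lock"          # hypothetical lock, kept outside DIR

    # mkdir is atomic: it fails if another run already holds the lock.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "another update seems to be running (found $lock)" >&2
        exit 1
    fi
    trap 'rmdir "$lock"' EXIT

    # Stop at the first failing step.
    "$ct_sans" snapshot -d "$dir" &&
        "$ct_sans" collect -d "$dir" &&
        "$ct_sans" assemble -d "$dir"
)
```

It would be called as `update_dataset "$HOME/ct-sans-demo"`. If a run is killed before the trap fires, the stale lock directory has to be removed by hand.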
## Running a measurement
See how we collected the 2023-04-03-ct-sans dataset in
[docs/operations.md](./docs/operations.md).
## Contact
- IRC: room `#certificate-transparency` at [OFTC.net][]
- Matrix: room `#certificate-transparency` at [matrix.org][]
- Email: rasmus (at) rgdd (dot) se

[OFTC.net]: https://www.oftc.net/
[matrix.org]: https://matrix.to/#/#certificate-transparency:matrix.org
## Licence
BSD 2-Clause License