# Operations

This document describes our ct-sans data collection, including information about
the local system and a timeline leading up to assembling the 2023-04-03 dataset.

## Summary

The initial download of the then-current CT logs took 11 days (March 2023).
Assembling the final dataset of 0.91B unique SANs (25.2GiB) took 6 hours.

The assembled data set is available here:

  - https://dart.cse.kau.se/ct-sans/2023-04-03-ct-sans.zip

## Local system

We're running Ubuntu in a VM:

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 22.04.2 LTS
    Release:        22.04
    Codename:       jammy

Our VM is configured with 62.9GiB RAM, 32 CPU threads (each reported as a
single-core processor), and a ~2TiB SSD:

    $ grep MemTotal /proc/meminfo
    MemTotal:       65948412 kB
    $ grep -c processor /proc/cpuinfo
    32
    $ grep 'cpu cores' /proc/cpuinfo | uniq
    cpu cores       : 1
    $ df -BG /home
    Filesystem                        1G-blocks  Used Available Use% Mounted on
    /dev/mapper/ubuntu--vg-ubuntu--lv     2077G  220G     1772G  12% /
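
As a cross-check, the 62.9GiB figure follows from converting the MemTotal
value (reported in kB) to GiB:

```shell
# Convert MemTotal from kB (as reported by /proc/meminfo) to GiB.
awk 'BEGIN{printf "%.1f GiB\n", 65948412/1024/1024}'   # 62.9 GiB
```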

This VM shares a 1x10Gbps link with other VMs that we have no control over.  We
installed `vnstat` to track our own bandwidth usage over time:

    # apt install vnstat
    # systemctl enable vnstat.service
    # systemctl start vnstat.service

We also installed Go version 1.20, see [install instructions][]:

    $ go version
    go version go1.20.2 linux/amd64

[install instructions]: https://go.dev/doc/install

The exact versions of `git.cs.kau.se/rasmoste/ct-sans@VERSION` that were used
are listed in the timeline below.

## Timeline

| date       | time (UTC) | event                       | notes                                 |
| ---------- | ---------- | --------------------------- | ------------------------------------- |
| 2023/03/18 | 20:05:30   | snapshot and start collect  | running v0.0.1, see command notes [1] |
| 2023/03/27 | 14:53:59   | stop collect, bump version  | install v0.0.2, see migrate notes [2] |
| 2023/03/27 | 15:03:12   | start collect again         | mainly waiting for Argon2023 now [3]  |
| 2023/03/29 | 10:22:24   | collect completed           |                                       |
| 2023/03/29 | 15:46:44   | snapshot and collect again  | download backlog from last 10 days    |
| 2023/03/30 | 05:52:38   | collect completed           |                                       |
| 2023/03/30 | 08:58:50   | snapshot and collect again  | download backlog from last ~16 hours  |
| 2023/03/30 | 09:53:34   | collect completed           | bandwidth usage statistics [4]        |
| 2023/03/30 | 10:05:40   | start assemble              | still running v0.0.2 [5]              |
| 2023/03/30 | 16:06:39   | assemble done               | 0.9B SANs (25GiB, 7GiB zipped in 15m) |
| 2023/04/02 | 23:31:37   | snapshot and collect again  | download backlog, again               |
| 2023/04/03 | 03:54:18   | collect completed           |                                       |
| 2023/04/03 | 08:52:28   | snapshot and collect again  | final before assembling for real use  |
| 2023/04/03 | 09:22:22   | collect completed           |                                       |
| 2023/04/03 | 09:30:00   | start assemble              | [5]                                   |
| 2023/04/03 | 16:12:38   | assemble done               | 0.91B SANs (25.2GiB) from 3.74B certs |
| 2024/02/10 | 09:10:20   | snapshot and start collect  | still running v0.0.2 [6]              |

## Notes

### 1

    $ ct-sans snapshot >snapshot.stdout
    $ ct-sans collect --workers 40 --batch-disk 131072 --batch-req 2048 --metrics 60m >collect.stdout 2>collect.stderr

### 2

In addition to adding the assemble command, `v0.0.2` automatically stores
notice.txt files in each log's directory.  This means the collect output can be
discarded instead of being stored and managed manually in the long run (e.g.,
grepped for NOTICE prints when assembling datasets).

Commit `ad9fb49670e28414637761bac4b8e8940e2d6770` includes a Go program that
transforms an existing `collect.stderr` file to `notice.txt` files.

Steps to migrate:

  - [x] Stop (ctrl+c, wait)
  - [x] Move collect.{stdout,stderr} to data/notes/
  - [x] `grep NOTICE data/notes/collect.stdout | wc -l` gives 6919 lines
  - [x] run the program in the above commit with the appropriate `directory` and
    `noticeFile` paths.  See output below.
  - [x] `wc -l $(find . -name notice.txt)` -> total says 6919 lines
  - [x] `go install git.cs.kau.se/rasmoste/ct-sans@latest`, which downloaded v0.0.2
  - [x] run the same collect command as in note (1); this will not overwrite the
    previous collect files because they have been moved to data/notes/.  In the
    future we will not need to store any of this, but we do it now just in case
    something goes wrong.
  - [x] The only two logs that still had entries left to download resumed correctly
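
The count comparison in the steps above can be illustrated with a small
stand-alone sketch; the directory layout and NOTICE lines below are fabricated
fixtures, not real data:

```shell
# Sanity check sketch: the number of NOTICE lines in the saved collect
# output must equal the total number of lines across all generated
# notice.txt files.  All paths and contents here are made-up fixtures.
tmp=$(mktemp -d)
mkdir -p "$tmp/notes" "$tmp/logs/aa" "$tmp/logs/bb"
printf 'NOTICE one\nINFO skip\nNOTICE two\nNOTICE three\n' >"$tmp/notes/collect.stdout"
printf 'NOTICE one\n' >"$tmp/logs/aa/notice.txt"
printf 'NOTICE two\nNOTICE three\n' >"$tmp/logs/bb/notice.txt"
before=$(grep -c NOTICE "$tmp/notes/collect.stdout")
after=$(find "$tmp" -name notice.txt -exec cat {} + | wc -l)
test "$before" -eq "$after" && echo "OK: $before notices"
rm -rf "$tmp"
```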

Output from the migrate program and sanity check:

    $ go run .
    2023/03/27 14:57:41 Google 'Argon2023' log: 608 notices
    2023/03/27 14:57:41 Google 'Argon2024' log: 101 notices
    2023/03/27 14:57:41 Google 'Xenon2023' log: 2119 notices
    2023/03/27 14:57:41 Google 'Xenon2024' log: 170 notices
    2023/03/27 14:57:41 Cloudflare 'Nimbus2023' Log: 2194 notices
    2023/03/27 14:57:41 Cloudflare 'Nimbus2024' Log: 164 notices
    2023/03/27 14:57:41 DigiCert Yeti2024 Log: 17 notices
    2023/03/27 14:57:41 DigiCert Yeti2025 Log: no notices
    2023/03/27 14:57:41 DigiCert Nessie2023 Log: 155 notices
    2023/03/27 14:57:41 DigiCert Nessie2024 Log: 19 notices
    2023/03/27 14:57:41 DigiCert Nessie2025 Log: no notices
    2023/03/27 14:57:41 Sectigo 'Sabre' CT log: 1140 notices
    2023/03/27 14:57:41 Let's Encrypt 'Oak2023' log: 156 notices
    2023/03/27 14:57:41 Let's Encrypt 'Oak2024H1' log: 14 notices
    2023/03/27 14:57:41 Let's Encrypt 'Oak2024H2' log: no notices
    2023/03/27 14:57:41 Trust Asia Log2023: 62 notices
    2023/03/27 14:57:41 Trust Asia Log2024-2: no notices
    $ wc -l $(find . -name notice.txt)
    101 ./data/logs/eecdd064d5db1acec55cb79db4cd13a23287467cbcecdec351485946711fb59b/notice.txt
     14 ./data/logs/3b5377753e2db9804e8b305b06fe403b67d84fc3f4c7bd000d2d726fe1fad417/notice.txt
    155 ./data/logs/b3737707e18450f86386d605a9dc11094a792db1670c0b87dcf0030e7936a59a/notice.txt
     62 ./data/logs/e87ea7660bc26cf6002ef5725d3fe0e331b9393bb92fbf58eb3b9049daf5435a/notice.txt
    164 ./data/logs/dab6bf6b3fb5b6229f9bc2bb5c6be87091716cbb51848534bda43d3048d7fbab/notice.txt
    608 ./data/logs/e83ed0da3ef5063532e75728bc896bc903d3cbd1116beceb69e1777d6d06bd6e/notice.txt
    156 ./data/logs/b73efb24df9c4dba75f239c5ba58f46c5dfc42cf7a9f35c49e1d098125edb499/notice.txt
   1140 ./data/logs/5581d4c2169036014aea0b9b573c53f0c0e43878702508172fa3aa1d0713d30c/notice.txt
     19 ./data/logs/73d99e891b4c9678a0207d479de6b2c61cd0515e71192a8c6b80107ac17772b5/notice.txt
    170 ./data/logs/76ff883f0ab6fb9551c261ccf587ba34b4a4cdbb29dc68420a9fe6674c5a3a74/notice.txt
   2119 ./data/logs/adf7befa7cff10c88b9d3d9c1e3e186ab467295dcfb10c24ca858634ebdc828a/notice.txt
   2194 ./data/logs/7a328c54d8b72db620ea38e0521ee98416703213854d3bd22bc13a57a352eb52/notice.txt
     17 ./data/logs/48b0e36bdaa647340fe56a02fa9d30eb1c5201cb56dd2c81d9bbbfab39d88473/notice.txt
   6919 total

### 3

For some reason Nimbus2023 is stuck at

    {"tree_size":512926523,"RootHash":[41,19,83,107,69,253,233,106,68,143,173,151,177,196,60,228,22,57,246,105,184,51,24,50,230,153,233,189,214,93,132,186]}

while trying to fetch until

     {"sth_version":0,"tree_size":513025681,"timestamp":1679169572616,"sha256_root_hash":"0SzzS0M2RP5BHC6M9bvOPySYJadPi9nnk2Dsav4NKKs=","tree_head_signature":"BAMARjBEAiBXrmT+W2Ct+32DX/XL+YwS9Ut4rnOG6Y+A4Lxbf/6TogIgYEM32vweDC0QStwMq1PzIvm97cQhj6bUSdZWq/wMkNw=","log_id":"ejKMVNi3LbYg6jjgUh7phBZwMhOFTTvSK8E6V6NS61I="}

These tree heads are not inconsistent, and a restart should resolve the problem.
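
For scale, the stuck position is only a small number of entries behind the
signed tree head, using the two tree sizes quoted above:

```shell
# Entries remaining between the stuck tree size and the STH tree size.
echo $((513025681 - 512926523))   # 99158 entries left
```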

(There is likely a corner case somewhere that made the fetcher exit or halt.  We
should debug this further at some point, but it has not happened more than once.)

### 4

Quick overview:

    $ vnstat -d
     ens160  /  daily
    
              day        rx      |     tx      |    total    |   avg. rate
         ------------------------+-------------+-------------+---------------
         2023-03-18     1.49 TiB |   17.07 GiB |    1.51 TiB |  153.44 Mbit/s
         2023-03-19     3.77 TiB |   41.21 GiB |    3.81 TiB |  387.83 Mbit/s
         2023-03-20     3.09 TiB |   36.67 GiB |    3.13 TiB |  318.26 Mbit/s
         2023-03-21     3.11 TiB |   32.24 GiB |    3.14 TiB |  319.61 Mbit/s
         2023-03-22     2.08 TiB |   25.98 GiB |    2.10 TiB |  213.89 Mbit/s
         2023-03-23     1.16 TiB |   15.59 GiB |    1.18 TiB |  119.97 Mbit/s
         2023-03-24     1.17 TiB |   15.44 GiB |    1.18 TiB |  120.44 Mbit/s
         2023-03-25     1.18 TiB |   15.72 GiB |    1.19 TiB |  121.55 Mbit/s
         2023-03-26   707.47 GiB |    9.64 GiB |  717.11 GiB |   71.30 Mbit/s
         2023-03-27   448.80 GiB |    6.43 GiB |  455.23 GiB |   45.26 Mbit/s
         2023-03-28   451.49 GiB |    6.49 GiB |  457.98 GiB |   45.53 Mbit/s
         2023-03-29     1.01 TiB |   12.73 GiB |    1.03 TiB |  104.45 Mbit/s
         2023-03-30   256.75 GiB |    3.40 GiB |  260.15 GiB |   59.59 Mbit/s
         ------------------------+-------------+-------------+---------------
         estimated    591.55 GiB |    7.84 GiB |  599.39 GiB |

### 5

Use at most 58GiB of RAM for sorting and 8 parallel sort workers; more
parallelism than that does not improve performance according to the [GNU sort
manual][].  We also set the `LC_ALL=C` environment variable to ensure a
consistent sort order (see the sort(1) manual).

    $ export LC_ALL=C
    $ ct-sans assemble -b 58 -p 8 >assemble.stdout

[GNU sort manual]: https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html

(We don't need to change the default directories, because the collected data is
stored in ./data and /tmp is a fine place to put things on our system.)
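
To illustrate why `LC_ALL=C` matters: byte-order collation sorts all uppercase
ASCII before lowercase, whereas locale-aware collation may interleave them.
The example names below are made up:

```shell
# C-locale sort is plain byte order: 'A' (0x41) < 'a' (0x61) < 'b' (0x62).
# Prints A.example, a.example, b.example.
printf 'b.example\nA.example\na.example\n' | LC_ALL=C sort
```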

### 6

There are 0.91B unique SANs in the 25.2GiB dataset (6.1GiB compressed):

    $ du -shb data/archive/2023-04-03-ct-sans
    27050799992     data/archive/2023-04-03-ct-sans
    $ python3 -c "print(f'{27050799992 / 1024**3:.1f}GiB')"
    25.2GiB
    $ du -shb data/archive/2023-04-03-ct-sans.zip
    6526876407      data/archive/2023-04-03-ct-sans.zip
    $ python3 -c "print(f'{6526876407 / 1024**3:.1f}GiB')"
    6.1GiB
    $ wc -l data/archive/2023-04-03-ct-sans/sans.lst
    907332515 data/archive/2023-04-03-ct-sans/sans.lst
    $ python3 -c "print(f'{907332515 / 1000**3:.2f}B')"
    0.91B

These SANs were found in 3.74B certificates from 17 CT logs:

    $ grep "In total," data/archive/2023-04-03-ct-sans/README.md
    In total, 3743244652 certificates were downloaded from 17 CT logs;
    $ python3 -c "print(f'{3743244652 / 1000**3:.2f}B')"
    3.74B
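
For reference, the byte counts above give the compression ratio of the zipped
dataset:

```shell
# Zipped size as a percentage of the uncompressed dataset size,
# using the byte counts measured with du above.
awk 'BEGIN{printf "%.0f%%\n", 100*6526876407/27050799992}'   # 24%
```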

The snapshot and collect commands are the same as in note (1):

    $ ct-sans snapshot >snapshot.stdout
    $ ct-sans collect --workers 40 --batch-disk 131072 --batch-req 2048 --metrics 60m >collect.stdout 2>collect.stderr