7 files changed, 241 insertions, 132 deletions
diff --git a/docs/design.md b/docs/design.md
new file mode 100644
index 0000000..2e21f12
--- /dev/null
+++ b/docs/design.md
@@ -0,0 +1,116 @@
+# silentct
+
+This document introduces a silent Certificate Transparency monitor design.
+
+## Setting
+
+We consider a setting where one or more trusted systems request certificates for
+a list of domains.  The domains that a system requests certificates for may
+overlap with the domains of other systems.  For example, there may be two
+distinct systems that host and request certificates for `www.example.org`.
+Other examples of "systems" that request certificates could include
+`jitsi.example.org`, `etherpad.example.org` and `gitlab.example.org`.
+
+The threat we are worried about is certificate mis-issuance.  Because systems
+may manage overlapping domains, no single system can be aware of all
+legitimately issued certificates for the domains that are being managed.
+
+A certificate is considered mis-issued if it contains:
+
+  1. at least one domain that any of the trusted systems manage _but without any
+     of the trusted systems requesting that certificate to be issued_, or
+  2. at least one subdomain of the domains that any of the trusted systems
+     manage _unless that subdomain is explicitly specified as out of scope_.
+
+The cause of certificate mis-issuance can vary, ranging from BGP and DNS hijacks
+to certificate authorities that are coerced, compromised, or actively malicious.
+
+## Goals and non-scope
+
+The goal is to detect certificate mis-issuance.  It is however out of scope to
+detect certificate mis-issuance that happened in the past.  In other words, if
+the design described herein is put into operation at time `T`, then any
+certificate mis-issuance that happened before time `T` is out of scope.  This is
+an important constraint that makes it _a lot less costly_ to bootstrap the
+monitor.  For example, old certificate backlogs can simply be ignored.
+
+It is also out of scope to detect certificate mis-issuance that targets web
+browsers without Certificate Transparency enforcement.  This is because we
+cannot get a concise view of all certificates without Certificate Transparency.
+
+To detect certificate mis-issuance, we want to construct a monitor that:
+
+  1. _is easy to self-host_, because you trust yourself or can then (more
+     easily) find someone you trust to do the monitoring on your behalf, and
+  2. _is silent_, so that there is little or no noise unless certificate
+     mis-issuance is actually suspected.  In other words, there should not be a
+     notification every time a legitimate certificate is issued or renewed.
+
+The "silent" property helps system administrators who manage more than a few
+certificates.  It also helps in the third-party monitoring setting, because
+subscribing to notifications from more than one monitor adds little noise.
+
+## Assumptions
+
+  - The attacker is unable to control two independent logs that count towards
+    the SCT checks in web browsers.  So, we need not worry about split-views and
+    can just download the logs while verifying that they are locally consistent.
+  - The systems that request certificates start in good states but may be
+    compromised sometime in the future.  Detection of certificate mis-issuance
+    is then out of scope for all domains that the compromised systems managed.
+  - A mis-issued certificate will only be used to target connections from a
+    fixed set of IP addresses.  A party that can distinguish between
+    certificates that are legitimate and mis-issued will never be targeted.
+  - A domain owner notices alerts about suspected certificate mis-issuance.  The
+    monitor that generates these alerts is trusted and never compromised.
+
+## Architecture
+
+A monitor downloads all certificates that are issued by certificate authorities
+from Certificate Transparency logs.  The set of logs to download is updated
+automatically using a list that Google publishes in signed form.  All historical
+updates to the list of logs are stored locally in case any issues are suspected.
+
+(It is possible to get INFO output whenever logs are added and removed.  The
+default verbosity is however NOTICE, which aims to be as silent as possible.)
+
+To filter out certificates that are not relevant, the monitor is configured with
+a list of domains to match on.  Only matching certificates will be stored, which
+means there are nearly no storage requirements to run this type of monitor.
+
+To get the "silent" property, the monitor polls the trusted systems for
+legitimately issued certificates via HTTP GET.  Alternatively, the monitor can
+read a local file in case it is co-located with a single trusted system.  The
+monitor uses this as [feedback](./feedback.md) to filter the downloaded
+certificates that matched.  If a certificate is found that none of the trusted
+systems made available, only then is an alert emitted (NOTICE level output).
+
+The communication channel between the trusted systems and the monitor can be
+tampered with.  For example, it may be plain HTTP or an HTTPS connection that
+the attacker trivially hijacks by obtaining yet another mis-issued certificate.
+Stating up front that the channel is insecure helps avoid misconfiguration.
+
+A shared secret is used for each system to authenticate with the monitor.  This
+secret is never shown on the wire: an HMAC key is derived from it, which is used
+to produce message authentication codes.  All a machine-in-the-middle attacker
+can do is replay or block integrity-protected files that a system generated.
+
+"Replays" can happen either way because the monitor polls periodically, i.e.,
+the monitor needs to account for the fact that it may poll the same file twice.
+Blocking cannot be solved by cryptography and would simply result in alerts.
+
+## Related work
+
+The commercial version of `certspotter` supports a push-based method for
+[authorizing][] legitimately issued certificates.  The monitor does its
+authentication using HTTP tokens.  In contrast, the silentct design is:
+
+  1. Safe against attackers that MitM the communication to the monitor, i.e.,
+     message authentication codes are used instead of HTTP access tokens.
+  2. Applicable in asynchronous workflows, i.e., the monitor does not need to
+     always be online and listen for allowlist requests on a public address.
+
+The initial authors of silentct were not aware of Andrew Ayer's related work
+before [this thread](https://follow.agwa.name/notice/AmyLDdYcAqF2p5sG24).
+
+[authorizing]: https://sslmate.com/help/reference/certspotter_authorization_api
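Note on `docs/design.md`: the mis-issuance definition in "Setting" and the filtering step in "Architecture" could be sketched roughly as below. This is only an illustration under stated assumptions, not the `silentct-mon` implementation. The `cert` type, the `inScope` and `unexpected` helpers, the use of a fingerprint as the lookup key, and the rule that an exclusion also covers names below the excluded subdomain are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// cert is a hypothetical, simplified view of a logged certificate.
type cert struct {
	Fingerprint string   // assumed lookup key, not taken from the design
	SANs        []string // subject alternative names
}

// covers reports whether name equals domain or is a subdomain of it.
func covers(domain, name string) bool {
	return name == domain || strings.HasSuffix(name, "."+domain)
}

// inScope reports whether a SAN falls under a managed domain without being
// excluded. Assumption: an exclusion also covers names below the excluded one.
func inScope(san string, managed, excluded []string) bool {
	san = strings.ToLower(san)
	for _, e := range excluded {
		if covers(e, san) {
			return false
		}
	}
	for _, d := range managed {
		if covers(d, san) {
			return true
		}
	}
	return false
}

// unexpected returns the in-scope certificates that no trusted system has
// made available; only these would lead to an alert.
func unexpected(logged []cert, legit map[string]bool, managed, excluded []string) []cert {
	var out []cert
	for _, c := range logged {
		if legit[c.Fingerprint] {
			continue // reported as legitimate: stay silent
		}
		for _, san := range c.SANs {
			if inScope(san, managed, excluded) {
				out = append(out, c)
				break
			}
		}
	}
	return out
}

func main() {
	managed := []string{"example.org"}
	excluded := []string{"dev.example.org"}
	logged := []cert{
		{Fingerprint: "a", SANs: []string{"www.example.org"}},
		{Fingerprint: "b", SANs: []string{"example.org"}},
	}
	legit := map[string]bool{"a": true}
	for _, c := range unexpected(logged, legit, managed, excluded) {
		fmt.Println("unexpected certificate:", c.Fingerprint, c.SANs)
	}
}
```

Keeping the scope check separate from the allowlist lookup reflects the two conditions in the definition: a name must be managed (or a non-excluded subdomain), and the certificate must not have been reported by any trusted system.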
diff --git a/docs/feedback.md b/docs/feedback.md
new file mode 100644
index 0000000..d79d57f
--- /dev/null
+++ b/docs/feedback.md
@@ -0,0 +1,23 @@
+# Feedback
+
+This document describes the integrity-protected file format that a trusted
+system uses when making legitimately issued certificates available to a monitor.
+
+## Format
+
+    NAME MAC
+    <CERTIFICATE CHAIN>
+    ...
+    <CERTIFICATE CHAIN>
+
+`NAME`: identifier that the monitor uses to locate the shared secret.
+
+`MAC`: HMAC with SHA256 as the hash function, computed over line two onward.
+The shared HMAC key is derived as follows by the trusted system and the monitor:
+
+    hkdf := hkdf.New(sha256.New, SECRET, []byte("silentct"), NAME)
+    key := make([]byte, 16)
+    io.ReadFull(hkdf, key)
+
+`<CERTIFICATE CHAIN>`: certificate chain in PEM format that the trusted system
+considers legitimate.  If repeated, chains are delimited by "silentct:separator".
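Note on `docs/feedback.md`: a minimal sketch of producing and verifying a file in this format, using only the Go standard library. HKDF is written out by hand following RFC 5869 (extract, then a single expand block), which yields the same 16 bytes as reading from the `hkdf.New` reader shown in the format description. The hex encoding of `MAC`, the exact byte range covered by "line two onward" (everything after the first newline), and all function names are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// deriveKey performs HKDF-SHA256 with the shared secret as input keying
// material, "silentct" as salt, and NAME as info. One expand block suffices
// because only 16 bytes of output are needed.
func deriveKey(secret, name string) []byte {
	extract := hmac.New(sha256.New, []byte("silentct"))
	extract.Write([]byte(secret))
	prk := extract.Sum(nil)

	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte(name))
	expand.Write([]byte{0x01}) // block counter for T(1)
	return expand.Sum(nil)[:16]
}

// computeMAC returns HMAC-SHA256 over the body (line two onward).
func computeMAC(key, body []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(body)
	return m.Sum(nil)
}

// formatFeedback prepends the "NAME MAC" line. Hex encoding is an assumption.
func formatFeedback(name, secret string, body []byte) []byte {
	mac := computeMAC(deriveKey(secret, name), body)
	header := fmt.Sprintf("%s %s\n", name, hex.EncodeToString(mac))
	return append([]byte(header), body...)
}

// verifyFeedback looks up the secret by NAME, recomputes the MAC, and returns
// the body only if the MAC matches (compared in constant time).
func verifyFeedback(file []byte, secrets map[string]string) ([]byte, error) {
	header, body, ok := bytes.Cut(file, []byte("\n"))
	if !ok {
		return nil, errors.New("missing header line")
	}
	fields := strings.Fields(string(header))
	if len(fields) != 2 {
		return nil, errors.New("header must be NAME MAC")
	}
	secret, ok := secrets[fields[0]]
	if !ok {
		return nil, errors.New("unknown name")
	}
	got, err := hex.DecodeString(fields[1])
	if err != nil {
		return nil, err
	}
	if !hmac.Equal(got, computeMAC(deriveKey(secret, fields[0]), body)) {
		return nil, errors.New("invalid MAC")
	}
	return body, nil
}

func main() {
	body := []byte("<CERTIFICATE CHAIN>\n") // placeholder for PEM data
	file := formatFeedback("example", "placeholder-secret", body)
	_, err := verifyFeedback(file, map[string]string{"example": "placeholder-secret"})
	fmt.Println("verified:", err == nil)
}
```

Because the MAC covers only the body, a machine-in-the-middle can replay or block a file but cannot alter the certificate chains without the check failing.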
diff --git a/docs/help2man/reporting-bugs.help2man b/docs/help2man/reporting-bugs.help2man
index 81a4147..cfe3036 100644
--- a/docs/help2man/reporting-bugs.help2man
+++ b/docs/help2man/reporting-bugs.help2man
@@ -1,12 +1,14 @@
 [REPORTING BUGS]
 Use
 .B https://git.glasklar.is/rgdd/silentct/-/issues
-for filing issues.
-.br
-Reach out to
+for filing issues.  To file issues without a GitLab account, email
+.B rgdd-silentct-issues@incoming.glasklar.is
+and wait for a maintainer to make the issue public.
+You can also reach out to
 .B rgdd
 in room
 .B #certificate-transparency
 at
 .B OFTC.net
-and Matrix.
+and
+.B matrix.org.
diff --git a/docs/introduction.md b/docs/introduction.md
deleted file mode 100644
index 0aab2cc..0000000
--- a/docs/introduction.md
+++ /dev/null
@@ -1,103 +0,0 @@
-# Silent Certificate Transparency
-
-This document introduces a silent Certificate Transparency monitor design.
-
-## Setting
-
-We consider a setting where one or more trusted _nodes_ request certificates for
-a specified list of domain names.  The domain names that a node requests
-certificates for may overlap with the domain names of other nodes.  For example,
-there may be two distinct nodes that request certificates for a given domain.
-
-The threat we are worried about is certificate mis-issuance.  Due to considering
-a multi-node setting with overlapping domain names, no single node can be aware
-of all legitimately issued certificates for the domain names that it manages.
-
-A certificate is considered mis-issued if it contains:
-
-  1. at least one domain name that any of the trusted nodes manage _but without
-     any of the trusted nodes requesting that certificate to be issued_, or
-  2. at least one subdomain of the domain names that any of the trusted nodes
-     manage _unless that subdomain is explicitly specified as out of scope_.
-
-The cause of certificate mis-issuance can vary, ranging from BGP and DNS hijacks
-to certificate authorities that are coerced, compromised, or actively malicious.
-
-## Goals and non-scope
-
-The goal is to detect certificate mis-issuance, not to prevent it.  It is out of
-scope to detect certificate mis-issuance that happened in the past.  In other
-words, if the architecture described herein is put into operation at time `T`,
-then any certificate mis-issuance that happened before time `T` is out of scope.
-
-It is also out of scope to detect certificate mis-issuance that targets web
-browsers without Certificate Transparency enforcement.  This is because we
-cannot get a concise view of all certificates without Certificate Transparency.
-
-To achieve the goal of certificate mis-issuance, we want a _monitor_ that:
-
-  1. _is easy to self-host_, because you trust yourself or can then find someone
-     else that is appropriate and willing to host your infrastructure, and
-  2. _is silent_, so that there is little or no noise unless certificate
-     mis-issuance is suspected or other noteworthy log events are happening.
-
-## Assumptions
-
-  - The attacker is unable to control two independent logs that count towards
-    the SCT checks in web browsers.  So, we need not worry about split-views and
-    can just download the logs while verifying that they are locally consistent.
-  - The nodes that request certificates start in good states but may be
-    compromised sometime in the future.  Detection of certificate mis-issuance
-    is then out of scope for all domains that the compromised nodes managed.
-  - A mis-issued certificate will only be used to target connections from a
-    fixed set of IP addresses.  Any party that can distinguish between
-    certificates that are legitimate and mis-issued will never be targeted.
-  - A domain owner notices alerts about suspected certificate mis-issuance.  The
-    monitor that generates these alerts is trusted and never compromised.
-
-## Architecture
-
-A monitor downloads all certificates that are issued by certificate authorities
-from Certificate Transparency logs.  The exact logs to download is automatically
-updated using a list that Google publishes in signed form.  All historical
-updates to the list of logs is stored locally in case any issues are suspected.
-
-(It is possible to get INFO output whenever logs are added and removed.  The
-default verbosity is however NOTICE, which aims to be as silent as possible.)
-
-To filter out certificates that are not relevant, the monitor is configured with
-a list of domains to match on.  Only matching certificates will be stored, which
-means there are nearly no storage requirements to run this type of monitor.
-
-To get the property of _silence_, the monitor pulls the trusted nodes via HTTP
-GET for legitimately issued certificates (periodic job).  The monitor will use
-this feedback to filter the downloaded certificates that matched.  If any
-certificates are found that no node pushed to the monitor, an alert is printed.
-
-The communication channel between the trusted nodes and the monitor can be
-tampered with.  For example, it may be plain HTTP or an HTTPS connection that
-the attacker trivially hijacks by obtaining yet another mis-issued certificate.
-Owning that the communication channel is insecure helps avoid misconfiguration.
-
-A shared secret is used for each node to authenticate with the monitor.  This
-secret is never shown on the wire: an HMAC key is derived from it, which is used
-to produce message authentication codes.  All a machine-in-the-middle attacker
-can do is replay or block integrity-protected submissions that a node generated.
-
-"Replays" can happen either way because the monitor polls periodically, i.e.,
-the monitor needs to account for the fact that it may poll the same thing twice.
-Blocking can not be solved by cryptography and would simply result in alerts.
-
-## Further reading
-
-docdoc
-
-## Future ideas
-
-  - Reduce the amount of bandwidth that the monitor spends downloading
-    certificates that are either way discarded (non-matches).  This can be
-    achieved by introducing a _verifiable proxy_ supporting wildcard
-    (non-)membership proofs, see [verifiable light-weight monitoring][].  Ignore
-    the parts about changing the logs; that is easily solved by the proxy alone.
-
-[verifiable light-weight monitoring]: https://arxiv.org/pdf/1711.03952.pdf
diff --git a/docs/metrics.md b/docs/metrics.md
new file mode 100644
index 0000000..627776a
--- /dev/null
+++ b/docs/metrics.md
@@ -0,0 +1,96 @@
+# Metrics
+
+`silentct-mon` can output Prometheus metrics -- enable using the `-m` option.
+
+## Examples of useful alerts
+
+  - **The monitor is falling behind on downloading a particular log**, e.g.,
+    `silentct_log_size - silentct_log_index > 65536`.
+  - **The monitor hasn't seen a fresh timestamp from a particular log**, e.g.,
+    `time() - silentct_log_timestamp / 1000 > 24*60*60`.
+  - **The monitor needs restarting**, e.g., `silentct_need_restart != 0`.
+  - **Unexpected certificates have been found**, e.g.,
+    `silentct_unexpected_certificate_count > 0`.
+
+## `"silentct_error_counter"`
+
+```
+# HELP silentct_error_counter The number of errors propagated to the main loop.
+# TYPE silentct_error_counter counter
+silentct_error_counter 0
+```
+
+Do not use this metric for alerting: it is too noisy and meant for debugging.
+
+## `"silentct_log_index"`
+
+```
+# HELP silentct_log_index The next log entry to be downloaded.
+# TYPE silentct_log_index gauge
+silentct_log_index{log_id="4e75a3275c9a10c3385b6cd4df3f52eb1df0e08e1b8d69c0b1fa64b1629a39df",log_name="Google 'Argon2025h1'"} 7.30980064e+08
+```
+
+`log_id` is a unique log identifier in hex, computed as in RFC 6962 §3.2.
+
+`log_name` is a human-meaningful name of the log.
+
+## `"silentct_log_size"`
+
+```
+# HELP silentct_log_size The number of entries in the log.
+# TYPE silentct_log_size gauge
+silentct_log_size{log_id="4e75a3275c9a10c3385b6cd4df3f52eb1df0e08e1b8d69c0b1fa64b1629a39df",log_name="Google 'Argon2025h1'"} 7.31044085e+08
+```
+
+`log_id` is a unique log identifier in hex, computed as in RFC 6962 §3.2.
+
+`log_name` is a human-meaningful name of the log.
+
+## `"silentct_log_timestamp"`
+
+```
+# HELP silentct_log_timestamp The log's UNIX timestamp in ms.
+# TYPE silentct_log_timestamp gauge
+silentct_log_timestamp{log_id="4e75a3275c9a10c3385b6cd4df3f52eb1df0e08e1b8d69c0b1fa64b1629a39df",log_name="Google 'Argon2025h1'"} 1.737202578179e+12
+```
+
+`log_id` is a unique log identifier in hex, computed as in RFC 6962 §3.2.
+
+`log_name` is a human-meaningful name of the log.
+
+## `"silentct_need_restart"`
+
+```
+# HELP silentct_need_restart A non-zero value if the monitor needs restarting.
+# TYPE silentct_need_restart gauge
+silentct_need_restart 0
+```
+
+Restarts are normally not needed, but this metric exists until `silentct-mon`
+can ensure that all corner cases are handled without needing a restart.
+
+## `"silentct_unexpected_certificate_count"`
+
+```
+# HELP silentct_unexpected_certificate_count Number of certificates without any allowlisting
+# TYPE silentct_unexpected_certificate_count gauge
+silentct_unexpected_certificate_count{crt_sans="example.org www.example.org",log_id="4e75a3275c9a10c3385b6cd4df3f52eb1df0e08e1b8d69c0b1fa64b1629a39df",log_index="1234",log_name="Google 'Argon2025h1'"} 1
+```
+
+`crt_sans` are the subject alternative names in the unexpected certificate,
+space separated.
+
+`log_id` is a unique log identifier in hex, computed as in RFC 6962 §3.2.
+
+`log_index` specifies the log entry that contains the unexpected certificate.
+
+`log_name` is a human-meaningful name of the log.
+
+See `STATE_DIRECTORY/crt_found/<log_id>-<log_index>.*` for further details.  The
+`.json` file contains the downloaded log entry.  The `.ascii` file contains the
+parsed leaf certificate in a human-readable format to make debugging easier.
+
+Allowlist an unexpected certificate by ingesting it from a trusted certificate
+requester.  Alternatively: stop the monitor, manually move the unexpected
+certificate from the "alerting" dictionary to the "legitimate" dictionary in
+`STATE_DIRECTORY/crt_index.json`, save, and then start the monitor again.
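Note on `docs/metrics.md`: the two per-log example alerts can be restated as plain arithmetic over one scrape, which makes the units explicit. The sketch below is hypothetical (the `logSample` type and `checkLog` function are not part of `silentct-mon`) and assumes `silentct_log_timestamp` is in milliseconds, as its HELP text states, so it is converted to seconds before comparing against a 24-hour threshold.

```go
package main

import (
	"fmt"
	"time"
)

// logSample holds one scrape of the per-log gauges (hypothetical type).
type logSample struct {
	Index       float64 // silentct_log_index
	Size        float64 // silentct_log_size
	TimestampMs float64 // silentct_log_timestamp, UNIX time in milliseconds
}

// checkLog returns the names of the example alerts that would fire for s,
// given the current UNIX time in milliseconds.
func checkLog(s logSample, nowMs float64) []string {
	var alerts []string
	// More than 65536 entries left to download.
	if s.Size-s.Index > 65536 {
		alerts = append(alerts, "falling behind")
	}
	// No fresh log timestamp within the last 24 hours (in seconds).
	if (nowMs-s.TimestampMs)/1000 > 24*60*60 {
		alerts = append(alerts, "stale timestamp")
	}
	return alerts
}

func main() {
	// Values copied from the example metric output in this document.
	s := logSample{Index: 7.30980064e+08, Size: 7.31044085e+08, TimestampMs: 1.737202578179e+12}
	now := float64(time.Now().UnixMilli())
	fmt.Println(checkLog(s, now))
}
```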
diff --git a/docs/storage.md b/docs/storage.md
deleted file mode 100644
index a0616ed..0000000
--- a/docs/storage.md
+++ /dev/null
@@ -1,3 +0,0 @@
-# Storage
-
-docdoc
diff --git a/docs/submission.md b/docs/submission.md
deleted file mode 100644
index 1d9c189..0000000
--- a/docs/submission.md
+++ /dev/null
@@ -1,22 +0,0 @@
-# Submission
-
-docdoc
-
-## Format
-
-    NAME MAC
-    <PEM CHAIN>
-    silentct:separator
-    ...
-    <PEM CHAIN>
-
-`NAME`: identifier that the monitor uses to locate the right secret.
-
-`MAC`: HMAC with SHA256 as the hash function, computed for line two and forward.
-The HMAC key is derived by the node and the monitor from their shared secret:
-
-    hkdf := hkdf.New(sha256.New, SECRET, []byte("silentct"), NAME)
-    key := make([]byte, 16)
-    io.ReadFull(hkdf, key)
-
-`<PEM CHAIN>`: certificate chain in PEM format the node considers legitimate.