diff options
Diffstat (limited to 'summary/src/ctor/src/background.tex')
-rw-r--r-- | summary/src/ctor/src/background.tex | 150 |
1 files changed, 150 insertions, 0 deletions
diff --git a/summary/src/ctor/src/background.tex b/summary/src/ctor/src/background.tex new file mode 100644 index 0000000..85d972f --- /dev/null +++ b/summary/src/ctor/src/background.tex @@ -0,0 +1,150 @@ +\section{Background} \label{ctor:sec:background} +The theory and current practise of CT is introduced first, then Tor +and its privacy-preserving Tor Browser. + +\subsection{Certificate Transparency} \label{ctor:sec:background:ct} +The idea to transparently log TLS certificates emerged at Google in response to +a lack of proposals that could be deployed without drastic ecosystem changes +and/or significant downsides~\cite{ct/a}. By making the set of issued +certificate chains\footnote{% + A domain owner's certificate is signed by an intermediate CA, whose + certificate is in turned signed by a root CA that acts as a trust + anchor~\cite{ca-ecosystem}. Such a \emph{certificate chain} is valid if it + ends in a trusted anchor that is shipped in the user's system software. +} transparent, anyone that inspect the logs can detect certificate +mis-issuance \emph{after the fact}. It would be somewhat circular to solve +issues in the CA ecosystem by adding trusted CT logs. Therefore, the +cryptographic foundation of CT is engineered to avoid any such reliance. +Google's \emph{gradual} CT roll-out started in 2015, and evolved from +downgrading user-interface indicators in Chrome to the current state of hard +failures unless a certificate is accompanied by a signed \emph{promise} that it +will appear in two CT logs~\cite{does-ct-break-the-web}. Unlike Apple's +Safari~\cite{apple-log-policy}, these two logs must additionally be operated by +Google and not-Google to ensure independence~\cite{google-log-policy}. + +The lack of mainstream verification, i.e., beyond checking signatures, allows an +attacker to side-step the current CT enforcement with minimal risk of exposure +\emph{if the required logs are controlled by the attacker}. +CTor integrates into the gradual CT roll-out by starting on the +premise of pairwise-independently trusted CT logs, which +avoids the risk of bad user experience~\cite{does-ct-break-the-web} +and significant system complexity. For example, web pages are unlikely to +break, TLS handshake latency stays about the same, and no robust management of +suspected log misbehavior is needed. Retaining the latter property as part of +our incremental designs simplifies deployment. + +\subsubsection{Cryptographic Foundation} +The operator of a CT log maintains a tamper-evident append-only Merkle +tree~\cite{ct,ct/bis}. At any time, a Signed Tree Head (STH) can be produced +which fixes the log's structure and content. Important attributes of an STH +include + the tree head (a cryptographic hash), + the tree size (a number of entries), and + the current time. +Given two tree sizes, a log can produce a \emph{consistency proof} that proves +the newer tree head entails everything that the older tree head does. As such, +anyone can verify that the log is append-only without downloading all entries +and recomputing the tree head. Membership of an entry can also be proven +by producing an \emph{inclusion proof} for an STH. These proof techniques are +formally verified~\cite{secure-logging-and-ct}. + +Upon a valid request, a log must add an entry and produce a new STH that covers +it within a time known as the Maximum Merge Delay (MMD), e.g., 24~hours. This +policy aspect can be verified because in response, a Signed Certificate +Timestamp (SCT) is returned. An SCT is a signed promise that an entry will +appear in the log within an MMD. A log that violates its MMD is said to perform +an \emph{omission attack}. It can be detected by challenging the log to prove +inclusion. A log that forks, presenting one append-only version +to some entities and another to others, is said to perform a \emph{split-view +attack}. Split-views can be detected by STH +gossip~\cite{chuat,dahlberg,nordberg,syta}. + +\subsubsection{Standardization and Verification} +The standardized CT protocol defines public HTTP(S) endpoints that allow anyone +to check the log's accepted trust anchors and added certificates, as well as +to obtain the most recent STH and to fetch proofs~\cite{ct,ct/bis}. For +example, the \texttt{add-chain} endpoint returns an SCT if the added certificate +chain ends in a trust anchor returned by the \texttt{get-roots} endpoint. We +use \texttt{add-chain} in Section~\ref{ctor:sec:incremental}, as well as several +other endpoints in Section~\ref{ctor:sec:base} to fetch proofs and STHs. It might be +helpful to know that an inclusion proof is fetched based on two parameters: a +certificate hash and the tree size of an STH. The former specifies the log entry +of interest, and the latter with regards to which view inclusion should be +proven. The returned proof is valid if it can be used in combination with the +certificate to reconstruct the STH's tree head. + +The CT landscape provides a limited value unless it is verified that the logs +play by the rules. What the rules are changed over time, but they are largely +influenced by the major browser vendors that define \emph{CT policies}. For +example, what is required to become a recognized CT log in terms of uptime and +trust anchors, and which criteria should pass to consider a certificate CT +compliant~\cite{apple-log-policy,google-log-policy}. While there are several ways that +a log can misbehave with regards to these policy aspects, the most fundamental +forms of cheating are omission and split-view attacks. A party that follows-up +on inclusion and consistency proofs is said to \emph{audit} the logs. + +Widespread client-side auditing is a premise for CT logs to be untrusted, but +none of the web browsers that enforce CT engage in such activities yet. For +example, requesting an inclusion proof is privacy-invasive because it leaks +browsing patterns to the logs, and reporting suspected log misbehavior comes +with privacy~\cite{ct-with-privacy} as well as operational challenges. +Found log incidents are mostly reported manually to the CT policy +list~\cite{ct-policy-mailing-list}. This is in contrast to automated +\emph{CT monitors}, which notify domain owners +of newly issued certificates based on what actually appeared in the public +logs~\cite{lwm,ct-monitors}. + +\subsection{Tor} \label{ctor:sec:background:tor} + +Most of the activity of Tor's millions of daily users starts with Tor Browser +and connects to some ordinary website via a circuit comprised of three +randomly-selected Tor relays. In this way no identifying information from +Internet protocols (such as IP address) are automatically provided to the +destination, and no single entity can observe both the source and destination of +a connection. Tor Browser is also configured and performs some filtering to resist +browser fingerprinting, and first party isolation to resist sharing state or +linking of identifiers across origins. More generally it avoids storing +identifying configuration and behavioral information to disk. + +Tor relays in a circuit are selected at random, but not uniformly. A typical +circuit is comprised of a \emph{guard}, a \emph{middle}, and an \emph{exit}. A +guard is selected by a client and used for several months as the entrance to all +Tor circuits. If the guard is not controlled by an adversary, that adversary +will not find itself selected to be on a Tor circuit adjacent to (thus +identifying) the client. And because some relay operators do not wish to act as +the apparent Internet source for connections to arbitrary destinations, relay +operators can configure the ports (if any) on which they will permit connections +besides to other Tor relays. Finally, to facilitate load balancing, relays are +assigned a weight based on their apparent capacity to carry traffic. In keeping +with avoiding storing of linkable state, even circuits that share an origin will +only permit new connections over that circuit for ten minutes. After that, if +all connections are closed, all state associated with the circuit is cleared. + +Tor clients use this information when choosing relays with which to build a +circuit. They receive the information via an hourly updated \emph{consensus}. +The consensus assigns weights as well as flags such as \texttt{guard} or +\texttt{exit}. It also assigns auxiliary flags such as +\texttt{stable}, which, e.g., +is necessary to obtain the \texttt{guard} flag since guards must have good +availability. Self-reported information by relays in their \emph{extra-info +document}, such as statistics on their read and written bytes, are also part of +the consensus and uploaded to \emph{directory authorities}. Directory +authorities determine the consensus by voting on various components making up +the shared view of the state of the Tor network. Making sure that all clients +have a consistent view of the network prevents epistemic attacks wherein clients +can be separated based on the routes that are consistent with their +understanding~\cite{danezis:pets2008}. This is only a very rough sketch of Tor's +design and operation. More details can be found by following links at Tor's +documentation site~\cite{tor-documentation}. + +Tor does not aim to prevent end-to-end correlation attacks. An adversary +controlling the guard and exit, or controlling the destination and observing the +client ISP, etc., is assumed able to confirm who is connected to whom on that +particular circuit. The Tor threat model assumes an adversary able to control +and/or observe a small to moderate fraction of Tor relays measured by both +number of relays and by consensus weight, and it assumes a large +number of Tor clients +able to, for example, flood individual relays to detect traffic signatures of +honest traffic on a given circuit~\cite{long-paths}. Also, the adversary can +knock any small number of relays offline via either attacks from clients or +direct Internet DDoS. |