Diffstat (limited to 'summary/src/sauteed/src/sauteed.tex')
-rw-r--r-- summary/src/sauteed/src/sauteed.tex 260
1 files changed, 260 insertions, 0 deletions
diff --git a/summary/src/sauteed/src/sauteed.tex b/summary/src/sauteed/src/sauteed.tex
new file mode 100644
index 0000000..06a581c
--- /dev/null
+++ b/summary/src/sauteed/src/sauteed.tex
@@ -0,0 +1,260 @@
+\section{Saut\'{e} Onions Until Discovery is Transparent and Confection is Firm} \label{sauteed:sec:trans}
+
+\subsection{System Goals} \label{sauteed:sec:system-goals}
+Let an onion association be unidirectional from a traditional domain name to an
+onion address. Three main system goals are as follows:
+
+\begin{description}
+ \item[Privacy-Preserving Onion Associations] All users should discover the
+ same onion associations; otherwise, the possibility of an
+ inconsistency must become public knowledge.
+ \item[Forward Censorship Resistance] Unavailability of a TLS
+ site must not impede discovery of past onion associations.
+ \item[Automated Verifiable Discovery] Onion association search should be
+ possible without requiring blind trust in third parties. It must be hard to
+ fabricate non-empty answers, and easy to automate the setup for scalability
+ and robustness.
+\end{description}
+
+For comparison, today's onion location~\cite{onion-location} does not assure a
+user that the same HTTP header is set for them as for everyone else. Classes of
+users that connect to a domain at different times or via different
+links can be given targeted redirects to distinct onion addresses
+without detection~\cite{onion-discovery-attacks}. Onion location also
+does not work if a regular site becomes unavailable due to censorship.
+Furthermore, a \emph{search engine approach} is frequently requested by Tor
+users~\cite{winter}. The solutions that exist in practice rely on
+manually curated
+lists~\cite{muffet-onions,onion-service-overview,h-e-securedrop}, notably with
+little or no retroactive accountability. As specified above, we aim for
+similar utility but with a setup that can be automated for all onion
+associations and in which non-empty answers cannot easily be fabricated
+without being trivially detected. We sketch out how these security properties are
+achieved in Section~\ref{sauteed:sec:security-sketch}.
+
+\subsection{Threat Model and Scope} \label{sauteed:sec:threat-model}
+We consider an attacker that wants to trick a user into visiting a targeted
+onionsite without anyone noticing the possibility of such behavior. Users are
+assumed to know the right traditional domain name that is easy to remember (such
+as \texttt{torproject.org}), but not its corresponding onion address. We
+further assume that the attacker either controls a trusted CA sufficiently to
+issue certificates or is able to deceive it sufficiently during certificate
+issuance to obtain a valid certificate
+from that CA\@. Any misbehavior is however assumed to be detectable in CT. So,
+the certificate ecosystem is treated as a \emph{building block} that we make no
+attempt to improve.
+
+We permit the attacker to make TLS sites unavailable after setup, but
+we assume it is difficult to censor the CT log ecosystem because it can
+be mirrored by anyone. Also, because CT is part of the Internet authentication
+infrastructure, adversaries may face conflicting equities in blocking CT logs,
+and those concerned at all about appearances would have a
+harder time justifying such a block than blocking, e.g., a political,
+journalism, or social media site.
+Similar to CT, we do not attempt to solve certificate revocation and
+especially not in relation to certificates that are connected to
+discovery of onion associations. This is consistent with Tor Browser's existing
+model for revocation with onion location, which similarly depends on the
+certificate for the redirecting domain. There is no formal counterpart to revoke
+a result in a search engine, but we outline future work related to this.
+
+Our threat model includes countries that block direct access to HTTPS
+sites~\cite{russia-blocks}.
+This is arguably a capable attacker, as no country is currently known to
+completely block indirect access via the Tor network (though in some places
+Tor bridges and/or obfuscated transports are needed). Our threat model also
+considers the plethora of blindly trusted parties that help users discover onion
+addresses with little or no retroactive
+accountability~\cite{ahmia.fi,muffet-onions,onion-service-overview,h-e-securedrop}.
+In other words, it is in scope to pave the path towards more accountability.
+
+\subsection{Description of Sauteed Onions} \label{sauteed:sec:sauteed-onions}
+An observation that inspired work on sauteed onions is that onion
+location requires HTTPS~\cite{onion-location}. This means that
+discovery of onion associations \emph{already} relies on the CA ecosystem. By
+incorporating the use of CT, it is possible to add accountability to CAs and
+other parties that help with onion address discovery while also raising the bar
+for censoring sites and for reducing anonymity. The name sauteed onions is a cooking pun;
+the association of an onion address with a domain name becomes transparent for
+everyone to see in CT logs.
+
+For background, a CA-issued certificate can contain both a traditional domain
+name and a \texttt{.onion} address~\cite{cab-ballot144,cab-onion-dv}. This can
+be viewed as a mutual association because the issuing CA must verify the
+traditional domain name \emph{and} the specified onion address. An immediate
+problem is that this would be ambiguous if there are multiple domain names:
+which one (if any) should be associated with the onion address in such a
+coalesced certificate? A more appropriate path forward would therefore be to
+define an X.509v3 extension for sauteed onions which clearly \emph{declares that
+a domain-validated name wants to be associated with an onion address}.
+
+We describe two uses of sauteed onions that achieve our goals; first assuming it
+is easy to get CA-issued certificates that contain associated onion addresses
+for domain-validated names, and then a short-term roll-out approach that
+could make it a reality now. A sauteed onion is simply a CT-logged certificate
+that claims \texttt{example.com} wants to be associated with
+\texttt{<addr>.onion} but not necessarily the other way around, i.e., a
+unidirectional association.
+
+\subsubsection{Onion Location} \label{sauteed:sec:onion-location}
+Figure~\ref{sauteed:fig:onion-location} illustrates onion location that uses
+certificates. A user establishes a TLS connection to a site as usual. Upon
+encountering a certificate that is CT-logged with an associated onion address
+for the visited site \texttt{example.com}, an onion-location prompt becomes
+available in Tor Browser or the onionsite is visited automatically. This is the same type
+of redirect behavior as today's onion location~\cite{onion-location}, except
+that the possibility of such a redirect is disclosed in public CT logs.
+Attempts at targeted redirects would thus be visible to site owners and
+independent third parties. A redirect to someone else's onion address would
+also be visible to the respective site owners. Notably, the ability to detect
+inappropriate redirects acts as a deterrent while also being the first step
+towards remediation, e.g., if users bookmarked onion addresses~\cite{winter}
+to achieve trust on first use or to avoid visiting a regular site \emph{and} an
+onionsite in a way that might reduce a user's anonymity set.
+
+\begin{figure}[!t]
+ \centering
+ \includegraphics[width=.6\columnwidth]{src/sauteed/img/onion-location}
+ \caption{Onion location based on a CT-logged certificate.}
+ \label{sauteed:fig:onion-location}
+\end{figure}
+
+A key observation is that onion location has always been a feature
+facilitated by TLS. By implementing it in certificates rather than HTTP
+headers that are delivered via HTTPS connections, TLS applications that are ``not
+web'' can use it too without rolling their own mechanisms. Requiring CT
+logging before following onion-location redirects is also an improvement compared
+to today, although one that could be achieved with an HTTP-based approach as
+well (or more ambitiously, for all Tor Browser certificate
+validations~\cite{ctor-popets}).
+
+We prototyped the above in a web extension that is free and open
+source~\cite{sauteed-onion-artifacts}. The criterion for CT logging is at least
+one embedded SCT from a log in the policy used by Google
+Chrome~\cite{chrome-logs}. If an onion-location redirect is followed, the
+path of the current webpage is preserved, similar to a typical configuration of
+today's HTTP-based onion location header that instead lists a complete
+URL~\cite{onion-location}.
+
+\subsubsection{Search Engine} \label{sauteed:sec:search-engine}
+A significant challenge for third parties that help users discover TLS sites
+that are available as onion services is to gain confidence in the underlying
+dataset at scale. For example, SecureDrop onion names are scoped to news
+sites~\cite{h-e-securedrop}; the list by Muffett is scoped as ``no sites for tech
+with less than (arbitrary) 10,000 users''~\cite{muffet-onions}; and
+\texttt{ahmia.fi} does not even attempt to give onion addresses human-meaningful
+names~\cite{nurmi}. To make matters worse, solutions based on manually curated
+lists and third-party search are currently implemented with little or no
+accountability.
+
+Figure~\ref{sauteed:fig:search-engine} shows what our approach brings to the table.
+All CT logs can be monitored by a third-party to discover sauteed onions.
+A search API can then be presented to users for the resulting dataset, similar
+to existing monitoring services but scoped specifically for discovery of onion
+associations. Such a search API answers the question
+``\emph{what onion addresses are available for \texttt{www.example.com}?}''
+
+\begin{figure}[!t]
+ \centering
+ \includegraphics[width=.6\columnwidth]{src/sauteed/img/onion-search}
+ \caption{Verifiable domain name to onion address search.}
+ \label{sauteed:fig:search-engine}
+\end{figure}
+
+The expected behavior of the search API is that an answer cannot be fabricated
+without controlling a CA or hijacking certificate issuance, and any CA
+malfeasance should further be caught by CT\@. This
+means that no party can fabricate inappropriate answers without detection.
+This is a major improvement compared to the alternative of no verifiability at
+all, although one that in and of itself does not prevent \emph{false negatives}.
+In other words, available answers could trivially be omitted. This is a
+limitation of the authenticated data structure in CT that can be fixed; see
+the security sketch in Section~\ref{sauteed:sec:security-sketch} for an intuition of how to
+work around it.
+
+We specified an HTTP REST API that facilitates search using a domain name; the
+API also makes available additional information like the actual certificate and
+its exact index in a CT log. In total there are two endpoints: \texttt{search}
+(list of matches with identifiers to more info) and \texttt{get} (more info). The
+complete API specification is available online together with our implementation,
+which is free and open source~\cite{sauteed-onion-artifacts}. An independent
+implementation from Tor's hack week is also available by Rhatto~\cite{rhatto}.
+Our prototype runs against all CT logs in Google Chrome for certificates
+logged after July 16, 2022. A few query examples are available in
+Appendix~\ref{sauteed:app:search}.
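+
+As a rough illustration of how the two endpoints divide the work, the
+following Python sketch models them over an in-memory dataset. It is a
+hypothetical example and not the specified API: the class name, the field
+names, and the use of list positions as identifiers are assumptions.
+
+\begin{verbatim}
+class OnionIndex:
+    """Toy model of a sauteed onion dataset with search and get."""
+
+    def __init__(self):
+        self.entries = []  # an entry's identifier is its list position
+
+    def add(self, domain, onion, log_id, log_index, cert_der):
+        self.entries.append({
+            "domain": domain, "onion": onion,
+            "log_id": log_id, "log_index": log_index, "cert": cert_der,
+        })
+
+    def search(self, domain):
+        # Matches only: an identifier per hit that points to more info.
+        return [{"id": i, "onion": e["onion"]}
+                for i, e in enumerate(self.entries)
+                if e["domain"] == domain]
+
+    def get(self, identifier):
+        # More info: the certificate and its exact index in a CT log.
+        return self.entries[identifier]
+
+index = OnionIndex()
+# Placeholder values; "a" * 56 is not a real onion address.
+index.add("www.example.com", "a" * 56 + ".onion", "log-1", 42, b"...")
+for match in index.search("www.example.com"):
+    print(index.get(match["id"])["log_index"])
+\end{verbatim}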
+
+\subsubsection{Certificate Format} \label{sauteed:sec:cert-format}
+Until now we assumed that a sauteed onion is easily set up, e.g., using an
+X.509v3 extension. The bad news is that such an extension does not exist, and
+it would likely be a long journey to standardize and see deployment by CAs.
+Therefore, our prototypes rely on a backwards-compatible approach that encodes
+onion addresses as subdomains~\cite{once-and-future}. To declare that
+\texttt{example.com} wants to be associated with \texttt{<addr>.onion}, one can
+request a domain-validated certificate that contains both \texttt{example.com}
+and \texttt{<addr>onion.example.com}~\cite{secdev19}. The inclusion of
+\texttt{example.com} ensures that such a setup does not result in a dangerous
+label~\cite{dangerous-labels}. The \emph{hack to encode an onion address as a
+subdomain} makes it part of the certificate without requiring changes to CAs.
+Appendix~\ref{sauteed:app:setup} details the necessary setup steps further. The gist
+is the addition of a subdomain DNS record and using the \texttt{-d} option in
+\texttt{certbot}~\cite{certbot}.
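+
+As an illustration only, the subdomain encoding could be decoded from a
+certificate's list of DNS names along the following lines. This Python sketch
+is not taken from the prototypes: the function name, the 56-character base32
+check for \texttt{<addr>} (the length of a v3 onion address), and the
+requirement that the parent name appears in the same list are assumptions
+made for the example.
+
+\begin{verbatim}
+import re
+
+# Assumption: <addr> is a v3 onion address, i.e., 56 base32 characters.
+LABEL = re.compile(r"([a-z2-7]{56})onion")
+
+def sauteed_onions(dns_names):
+    """Return (domain, onion address) pairs declared by DNS names."""
+    names = {n.lower().rstrip(".") for n in dns_names}
+    pairs = []
+    for name in sorted(names):
+        label, _, parent = name.partition(".")
+        match = LABEL.fullmatch(label)
+        # Accept <addr>onion.example.com only if example.com is listed.
+        if match and parent in names:
+            pairs.append((parent, match.group(1) + ".onion"))
+    return pairs
+
+addr = "a" * 56  # placeholder, not a real onion address
+print(sauteed_onions(["example.com", addr + "onion.example.com"]))
+\end{verbatim}
+
+Requiring the parent name in the same list mirrors the role that the
+inclusion of \texttt{example.com} plays in the certificate.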
+
+Although the subdomain approach is easy to deploy right now, it is by
+no means a perfect solution. An X.509v3 extension would not require
+the configuration of an
+additional DNS record. After all, the unidirectional sauteed onions
+property works just as well if the subdomain is not domain-validated. The
+important part is that the CA validates \texttt{example.com}, and that the
+associated onion address can be declared somewhere in the issued certificate
+without an ambiguous intent.
+Another imperfection that goes hand-in-hand with backwards-compatibility is that
+CAs would have to \emph{opt out} of sauteed onions, unlike site owners
+who instead have to \emph{opt in}.
+
+To avoid recommending a pattern that is discouraged by CAs, the Tor Project
+should at least have a dialog with Let's Encrypt, which issues the most
+certificates~\cite{le}. Somewhat similar subdomain hacks related to CAs exist,
+but then with explicit negotiations~\cite{plex}.
+Subdomain hacks without a relation to CAs and TLS were discouraged in the
+past~\cite{trans-laurie}. We argue that sauteed onions do have such a relation because
+CA-validated names are at the heart of our approach. For example, this is
+unlike Mozilla's binary transparency idea that just wanted to reuse a public
+log~\cite{mozilla-bt}. Sauteed onions also do not result in more issued
+certificates; it is just the number of domain-validated names that increases by
+one for TLS sites that do the setup.
+
+\subsubsection{Security Sketch} \label{sauteed:sec:security-sketch}
+Our threat model does not allow the attacker to tamper with CT or to make the log
+ecosystem unavailable. Onion location as described in
+Section~\ref{sauteed:sec:onion-location} therefore ensures that a redirect becomes
+public, achieving detectability as defined in our privacy-preserving onion
+association goal. The search engine in Section~\ref{sauteed:sec:search-engine}
+trivially achieves the same goal because onion associations are \emph{found}
+via CT. Blocking a TLS site is additionally \emph{too late} if an association
+is already in a CT log, thus achieving forward censorship resistance.
+Our search engine approach further makes it hard to forge non-empty answers without
+detection because it requires control of a CA and defeating the tamper-evidence
+of CT logs. While it is possible to omit available answers, this can be
+mitigated by having multiple search APIs, by having domains check the integrity of
+their own onion associations similar to the proposed verification pattern in
+CONIKS~\cite{coniks}, or by representing the sauteed onion dataset as a sparse
+Merkle tree to get a verifiable log-backed map that additionally supports
+efficient non-membership proofs that CT lacks~\cite{smt,vds}.
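+
+The first of these mitigations can be sketched as follows. This Python
+fragment is an illustration under stated assumptions and not part of the
+prototype: the function name and the input format (a mapping from a search
+service name to the set of onion addresses it returned for one domain) are
+hypothetical.
+
+\begin{verbatim}
+def cross_check(answers):
+    """Return all reported addresses and what each service left out."""
+    union = set().union(*answers.values())
+    omitted = {name: union - got
+               for name, got in answers.items() if union - got}
+    return union, omitted
+
+answers = {
+    "search-a": {"addr1.onion", "addr2.onion"},  # placeholder addresses
+    "search-b": {"addr1.onion"},
+}
+union, omitted = cross_check(answers)
+print(omitted)  # search-b did not return addr2.onion
+\end{verbatim}
+
+Every address in the union would still have to be verified against CT before
+it is shown to a user; the comparison only reveals omissions.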
+
+\subsection{Future Work}
+It would be valuable to implement proofs of no omissions as well as native
+lookups in a web extension or Tor Browser to verify everything before showing
+the user a result (certificates, proofs of logging, etc.). All or
+selected parts of the sauteed onion dataset may further be delivered to Tor
+Browser similar to SecureDrop onion names~\cite{h-e-securedrop}. The difference
+would be that the list is automated using a selection criterion from CT logs
+rather than doing it manually on a case-by-case basis. A major benefit is that
+the sauteed onion dataset can then be queried locally, completely avoiding
+third-party queries and visits to the regular site. Another approach to explore
+is potential integration of the sauteed onion dataset into Tor's DHT: a
+cryptographic source of truth for available onion associations is likely a
+helpful starting point so that there is \emph{something to distribute}. It
+would also be interesting to consider search-engine policies other than
+\emph{show everything} as in our work, e.g., only the first association or the last
+association. (These policies can be verified with \emph{full
+audits}~\cite{vds}.)