\section{Saut\'{e} Onions Until Discovery is Transparent and Confection is Firm} \label{sauteed:sec:trans}
\subsection{System Goals} \label{sauteed:sec:system-goals}
Let an onion association be unidirectional from a traditional domain name to an
onion address. Three main system goals are as follows:
\begin{description}
\item[Privacy-Preserving Onion Associations] Users should discover the same
onion associations, and otherwise the possibility of an
inconsistency must become public knowledge.
\item[Forward Censorship Resistance] Unavailability of a TLS
site must not impede discovery of past onion associations.
\item[Automated Verifiable Discovery] Onion association search should be
possible without requiring blind trust in third-parties. It must be hard to
fabricate non-empty answers, and easy to automate the setup for scalability
and robustness.
\end{description}
For comparison, today's onion location~\cite{onion-location} does not assure a
user that the same HTTP header is set for them as for everyone else. Classes of
users that connect to a domain at different times or via different
links can be given targeted redirects to distinct onion addresses
without detection~\cite{onion-discovery-attacks}. Onion location also
does not work if a regular site becomes unavailable due to censorship.
The \emph{search engine approach} is further a frequent ask by Tor
users~\cite{winter}. The solutions that exist in practice rely on
manually curated
lists~\cite{muffet-onions,onion-service-overview,h-e-securedrop}, notably with
little or no retroactive accountability. As specified above, we aim for a
similar utility but with a setup that can be automated for all onion
associations, and in which fabricating a non-empty answer is hard to do without
being trivially detected. We sketch out how these security properties are
achieved in Section~\ref{sauteed:sec:security-sketch}.
\subsection{Threat Model and Scope} \label{sauteed:sec:threat-model}
We consider an attacker that wants to trick a user into visiting a targeted
onionsite without anyone noticing the possibility of such behavior. Users are
assumed to know the right traditional domain name that is easy to remember (such
as \texttt{torproject.org}), but not its corresponding onion address. We
further assume that the attacker either controls a trusted CA sufficiently to
issue certificates or is able to deceive that CA during certificate issuance
into issuing a valid certificate. Any misbehavior is however assumed to be
detectable in CT\@. So,
the certificate ecosystem is treated as a \emph{building block} that we make no
attempt to improve.
We permit the attacker to make TLS sites unavailable after setup, but
we assume it is difficult to censor the CT log ecosystem because it can
be mirrored by anyone. Also, as part of the Internet authentication
infrastructure, adversaries may have equities conflicts in blocking CT logs,
and if concerned at all about appearance would have a
harder time justifying such a block versus, e.g., a political,
journalism, or social media site.
Similar to CT, we do not attempt to solve certificate revocation and
especially not in relation to certificates that are connected to
discovery of onion associations. This is consistent with Tor Browser's existing
model for revocation with onion location, which similarly depends on the
certificate for the redirecting domain. There is no formal counterpart to revoke
a result in a search engine, but we outline future work related to this.
Our threat model includes countries that block direct access to HTTPS
sites~\cite{russia-blocks}.
This is arguably a capable attacker, as no country is currently known to
completely block indirect access via the Tor network (though in some places
Tor bridges and/or obfuscated transport is needed). Our threat model also
considers the plethora of blindly trusted parties that help users discover onion
addresses with little or no retroactive
accountability~\cite{ahmia.fi,muffet-onions,onion-service-overview,h-e-securedrop}.
In other words, it is in-scope to pave the path towards more accountability.
\subsection{Description of Sauteed Onions} \label{sauteed:sec:sauteed-onions}
An observation that inspired work on sauteed onions is that onion
location requires HTTPS~\cite{onion-location}. This means that
discovery of onion associations \emph{already} relies on the CA ecosystem. By
incorporating the use of CT, it is possible to add accountability to CAs and
other parties that help with onion address discovery while also raising the bar
for attempts to censor sites or reduce user anonymity. The name sauteed onions is a cooking pun;
the association of an onion address with a domain name becomes transparent for
everyone to see in CT logs.
For background, a CA-issued certificate can contain both a traditional domain
name and a \texttt{.onion} address~\cite{cab-ballot144,cab-onion-dv}. This can
be viewed as a mutual association because the issuing CA must verify the
traditional domain name \emph{and} the specified onion address. An immediate
problem is that this would be ambiguous if there are multiple domain names;
which one (if any) should be associated with an onion address with such
certificate coalescence? A more appropriate path forward would therefore be to
define an X.509v3 extension for sauteed onions which clearly \emph{declares that
a domain-validated name wants to be associated with an onion address}.
We describe two uses of sauteed onions that achieve our goals; first assuming it
is easy to get CA-issued certificates that contain associated onion addresses
for domain-validated names, and then a short-term roll-out approach that
could make it a reality now. A sauteed onion is simply a CT-logged certificate
that claims \texttt{example.com} wants to be associated with
\texttt{<addr>.onion} but not necessarily the other way around, i.e., a
unidirectional association.
\subsubsection{Onion Location} \label{sauteed:sec:onion-location}
Figure~\ref{sauteed:fig:onion-location} illustrates onion location that uses
certificates. A user establishes a TLS connection to a site as usual. Upon
encountering a certificate that is CT-logged with an associated onion address
for the visited site \texttt{example.com}, an onion-location prompt becomes
available in Tor Browser or the onionsite is visited automatically. This is the same type
of redirect behavior as today's onion location~\cite{onion-location}, except
that the possibility of such a redirect is disclosed in public CT logs.
Attempts at targeted redirects would thus be visible to site owners and
independent third-parties. A redirect to someone else's onion address would
also be visible to the respective site owners. Notably the ability to detect
inappropriate redirects acts as a deterrent while also being the first step
towards remediation, e.g., if users bookmarked onion addresses~\cite{winter}
to achieve trust on first use or to avoid visiting a regular site \emph{and} an
onionsite in a way that might reduce a user's anonymity set.
\begin{figure}[!t]
\centering
\includegraphics[width=.6\columnwidth]{src/sauteed/img/onion-location}
\caption{Onion location based on a CT-logged certificate.}
\label{sauteed:fig:onion-location}
\end{figure}
A key observation is that onion location has always been a feature
facilitated by TLS. By implementing it in certificates rather than HTTP
headers that are delivered via HTTPS connections, TLS applications that are ``not
web'' can use it too without rolling their own mechanisms. The addition of
requiring CT to follow onion-location redirects is also an improvement compared
to today, although one that could be achieved with an HTTP-based approach as
well (or more ambitiously, for all Tor Browser certificate
validations~\cite{ctor-popets}).
We prototyped the above in a web extension that is free and open
source~\cite{sauteed-onion-artifacts}. The criterion for CT logging is at least
one embedded SCT from a log in the policy used by Google
Chrome~\cite{chrome-logs}. If an onion-location redirect is followed, the
path of the current webpage is preserved, similar to a typical configuration of
today's HTTP-based onion location header that instead lists a complete
URL~\cite{onion-location}.
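The two checks described above can be sketched as follows. This is a minimal
illustration and not the code of the prototype: the function names
\texttt{has\_trusted\_sct} and \texttt{onion\_redirect\_url} are hypothetical,
log identifiers are modelled as plain strings, and both the default scheme and
the decision to keep the query and fragment along with the path are
assumptions, since the text only states that the path is preserved.

```python
from urllib.parse import urlsplit, urlunsplit


def has_trusted_sct(sct_log_ids, policy_log_ids):
    """Return True if at least one embedded SCT is from a log in the policy.

    Both arguments are collections of log identifiers (hypothetical strings).
    """
    return any(log_id in policy_log_ids for log_id in sct_log_ids)


def onion_redirect_url(current_url, onion_address, scheme="https"):
    """Replace the host of current_url with onion_address, keeping the path.

    Assumption: the query and fragment are kept as well, and the scheme is a
    caller-supplied parameter because the text does not fix it.
    """
    parts = urlsplit(current_url)
    return urlunsplit(
        (scheme, onion_address, parts.path, parts.query, parts.fragment)
    )
```

A redirect would only be offered when \texttt{has\_trusted\_sct} holds for the
certificate of the visited site.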
\subsubsection{Search Engine} \label{sauteed:sec:search-engine}
A significant challenge for third-parties that help users discover TLS sites
that are available as onion services is to gain confidence in the underlying
dataset at scale. For example, SecureDrop onion names are scoped to news
sites~\cite{h-e-securedrop}; the list by Muffett is scoped as ``no sites for tech
with less than (arbitrary) 10,000 users''~\cite{muffet-onions}; and
\texttt{ahmia.fi} does not even attempt to give onion addresses human-meaningful
names~\cite{nurmi}. To make matters worse, solutions based on manually curated
lists and third-party search are currently implemented with little or no
accountability.
Figure~\ref{sauteed:fig:search-engine} shows what our approach brings to the table.
All CT logs can be monitored by a third-party to discover sauteed onions.
A search API can then be presented to users for the resulting dataset, similar
to existing monitoring services but scoped specifically for discovery of onion
associations. The utility of such a search API is:
``\emph{what onion addresses are available for \texttt{www.example.com}}''.
\begin{figure}[!t]
\centering
\includegraphics[width=.6\columnwidth]{src/sauteed/img/onion-search}
\caption{Verifiable domain name to onion address search.}
\label{sauteed:fig:search-engine}
\end{figure}
The expected behavior of the search API is that an answer cannot be fabricated
without controlling a CA or hijacking certificate issuance, and any CA
malfeasance should further be caught by CT\@. This
means that no party can fabricate inappropriate answers without detection.
This is a major improvement compared to the alternative of no verifiability at
all, although one that in and of itself does not prevent \emph{false negatives}.
In other words, available answers could trivially be omitted. This is a
limitation of the authenticated data structure in CT that can be fixed; see
the security sketch in Section~\ref{sauteed:sec:security-sketch} for an
intuition of how to work around it.
We specified an HTTP REST API that facilitates search using a domain name; the
API also makes available additional information like the actual certificate and
its exact index in a CT log. In total there are two endpoints: \texttt{search}
(list of matches with identifiers to more info) and \texttt{get} (more info). The
complete API specification is available online together with our implementation,
which is free and open source~\cite{sauteed-onion-artifacts}. An independent
implementation from Tor's hack week is also available by Rhatto~\cite{rhatto}.
Our prototype runs against all CT logs in Google Chrome for certificates
logged after July 16, 2022. A few query examples are available in
Appendix~\ref{sauteed:app:search}.
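To make the split between the two endpoints concrete, the following is a
minimal in-memory sketch of a dataset that answers \texttt{search} and
\texttt{get}. It is not the specified REST API: the class
\texttt{SauteedIndex}, the field names, and the use of a list position as
identifier are all hypothetical, and the certificate itself is left out. The
subdomain pattern it looks for is the backwards-compatible encoding described
in Section~\ref{sauteed:sec:cert-format}, checked syntactically only.

```python
import re
from dataclasses import dataclass

# Syntactic check only: 56 base32 characters followed by "onion" in one label.
_LABEL = re.compile(r"^([a-z2-7]{56})onion$")


@dataclass(frozen=True)
class Entry:
    log_id: str       # hypothetical identifier of the CT log
    log_index: int    # index of the certificate in that log
    domain_name: str
    onion_addr: str


class SauteedIndex:
    def __init__(self):
        # The position of an entry in this list serves as its identifier.
        self._entries = []

    def add_certificate(self, log_id, log_index, san_names):
        """Record every association declared by one logged certificate."""
        names = {name.lower() for name in san_names}
        for name in sorted(names):
            label, _, parent = name.partition(".")
            match = _LABEL.match(label)
            if match and parent and parent in names:
                onion = match.group(1) + ".onion"
                self._entries.append(Entry(log_id, log_index, parent, onion))

    def search(self, domain_name):
        """Return matches for a domain name, each with an identifier."""
        wanted = domain_name.lower()
        return [
            {"id": i, "onion_addr": entry.onion_addr}
            for i, entry in enumerate(self._entries)
            if entry.domain_name == wanted
        ]

    def get(self, identifier):
        """Return the full entry, including where it was logged."""
        return self._entries[identifier]
```

Keeping the log identifier and index with each entry is what lets a client
verify an answer against CT instead of trusting the search service.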
\subsubsection{Certificate Format} \label{sauteed:sec:cert-format}
Until now we assumed that a sauteed onion is easily set up, e.g., using an
X.509v3 extension. The bad news is that such an extension does not exist, and
it would likely be a long journey to standardize and see deployment by CAs.
Therefore, our prototypes rely on a backwards-compatible approach that encodes
onion addresses as subdomains~\cite{once-and-future}. To declare that
\texttt{example.com} wants to be associated with \texttt{<addr>.onion}, one can
request a domain-validated certificate that contains both \texttt{example.com}
and \texttt{<addr>onion.example.com}~\cite{secdev19}. The inclusion of
\texttt{example.com} ensures that such a setup does not result in a dangerous
label~\cite{dangerous-labels}. The \emph{hack to encode an onion address as a
subdomain} makes it part of the certificate without requiring changes to CAs.
Appendix~\ref{sauteed:app:setup} details the necessary setup-steps further. The gist
is the addition of a subdomain DNS record and using the \texttt{-d} option in
\texttt{certbot}~\cite{certbot}.
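The encoding can be illustrated with a short parser that recovers onion
associations from the DNS names of a certificate. This is a sketch under
stated assumptions rather than the parsing code of the prototypes: the name
\texttt{onion\_associations} is hypothetical, the address is assumed to be a
56-character base32 string, and only its syntax is checked (no checksum or key
validation).

```python
import re

# Assumption: an onion address is 56 base32 characters, and the encoded label
# is that address directly followed by the literal string "onion".
_LABEL = re.compile(r"^([a-z2-7]{56})onion$")


def onion_associations(san_names):
    """Return sorted (domain name, onion address) pairs found in san_names.

    A pair is reported only if both <addr>onion.<domain> and <domain> itself
    are present, mirroring the requirement that the certificate contains both.
    """
    names = {name.lower().rstrip(".") for name in san_names}
    found = set()
    for name in names:
        label, _, parent = name.partition(".")
        match = _LABEL.match(label)
        if match and parent and parent in names:
            found.add((parent, match.group(1) + ".onion"))
    return sorted(found)
```

Requiring the parent name to be present means a certificate that only lists
the encoded subdomain yields no association in this sketch.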
Although the subdomain approach is easy to deploy right now, it is by
no means a perfect solution. An X.509v3 extension would not require
the configuration of an
additional DNS record. In other words, the unidirectional sauteed onions
property works just as well if the subdomain is not domain-validated. The
important part is that the CA validates \texttt{example.com}, and that the
associated onion address can be declared somewhere in the issued certificate
without an ambiguous intent.
Another imperfection that goes hand-in-hand with backwards-compatibility is that
CAs would have to \emph{opt-out} from sauteed onions, unlike site owners
that instead have to \emph{opt-in}.
To avoid recommending a pattern that is discouraged by CAs, the Tor Project
should at least have a dialog with Let's Encrypt which issues the most
certificates~\cite{le}. Somewhat similar subdomain hacks related to CAs exist,
but then with explicit negotiations~\cite{plex}.
Subdomain hacks without a relation to CAs and TLS were discouraged in the
past~\cite{trans-laurie}. We argue that sauteed onions are related because
CA-validated names are at the heart of our approach. For example, this is
unlike Mozilla's binary transparency idea that just wanted to reuse a public
log~\cite{mozilla-bt}. Sauteed onions also do not result in more issued
certificates; it is just the number of domain-validated names that increases by
one for TLS sites that do the setup.
\subsubsection{Security Sketch} \label{sauteed:sec:security-sketch}
Our threat model does not allow the attacker to tamper with CT or to make the
log ecosystem unavailable. Onion location as described in
Section~\ref{sauteed:sec:onion-location} therefore ensures that a redirect becomes
public, achieving detectability as defined in our privacy-preserving onion
association goal. The search engine in Section~\ref{sauteed:sec:search-engine}
trivially achieves the same goal because onion associations are \emph{found}
via CT. Blocking a TLS site is additionally \emph{too late} if an association
is already in a CT log, thus achieving forward censorship resistance.
Our search engine approach further makes it hard to forge non-empty answers without
detection because it requires control of a CA and defeating the tamper-evidence
of CT logs. While it is possible to omit available answers, this can be
mitigated by having multiple search APIs, by having domains check the integrity
of their own onion associations similar to the proposed verification pattern in
CONIKS~\cite{coniks}, or by representing the sauteed onion dataset as a sparse
Merkle tree to get a verifiable log-backed map that additionally supports
efficient non-membership proofs that CT lacks~\cite{smt,vds}.
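The first mitigation, querying multiple search APIs, can be sketched as a
comparison of their answers for one domain name. The function
\texttt{cross\_check} and its input shape are hypothetical. The sketch only
reveals an omission if at least one queried API returned the missing
association, and it assumes that each returned answer has already been
verified against CT.

```python
def cross_check(answers_by_api):
    """Compare the answers of several search APIs for the same domain name.

    answers_by_api maps an API name to the collection of onion addresses that
    the API returned. The result is the union of all answers and, for each API
    that omitted something, the set of addresses it did not return.
    """
    union = set().union(*answers_by_api.values())
    omitted = {
        api: union - set(answers)
        for api, answers in answers_by_api.items()
        if union - set(answers)
    }
    return union, omitted
```

An empty \texttt{omitted} mapping means that the queried APIs agree with each
other, not that the answer is complete.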
\subsection{Future Work}
It would be valuable to implement proofs of no omissions as well as native
lookups in a web extension or Tor Browser to verify everything before showing
the user a result (certificates, proofs of logging, etc.). The entire sauteed
onion dataset, or selected parts of it, may further be delivered to Tor
Browser similar to SecureDrop onion names~\cite{h-e-securedrop}. The difference
would be that the list is automated using a selection criterion from CT logs
rather than doing it manually on a case-by-case basis. A major benefit is that
the sauteed onion dataset can then be queried locally, completely avoiding
third-party queries and visits to the regular site. Another approach to explore
is potential integration of the sauteed onion dataset into Tor's DHT: a
cryptographic source of truth for available onion associations is likely a
helpful starting point so that there is \emph{something to distribute}. It
would also be interesting to consider other search-engine policies than
\emph{show everything} as in our work, e.g., only first association or last
association. (These policies can be verified with \emph{full
audits}~\cite{vds}.)
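The three policies mentioned above can be sketched as a small selection
function. The name \texttt{apply\_policy}, the policy labels, and the notion
of order are assumptions: each association is paired with a caller-supplied
sort key, such as a logging timestamp, because the text does not define what
makes an association first or last.

```python
def apply_policy(associations, policy="all"):
    """Select onion addresses for one domain name according to a policy.

    associations is a list of (order, onion_addr) pairs, where order is a
    caller-supplied sort key (assumed here to be a logging timestamp).
    """
    ordered = sorted(associations)
    if policy == "all":
        return [addr for _, addr in ordered]
    if policy not in ("first", "last"):
        raise ValueError("unknown policy: " + policy)
    if not ordered:
        return []
    chosen = ordered[0] if policy == "first" else ordered[-1]
    return [chosen[1]]
```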