onion-grab

A tool that visits a list of domains over HTTPS to see if they have Onion-Location configured.

Warning: research prototype.

Quickstart

Install

You will need a Go compiler on the local system:

$ which go >/dev/null || echo "Go compiler is not in PATH"

Install onion-grab:

$ go install gitlab.torproject.org/tpo/onion-services/onion-grab@latest

List all options:

$ onion-grab -h

Basic usage

Store one domain per line in a file:

$ cat domains.lst
www.eff.org
www.qubes-os.org
www.torproject.org

Run onion-grab with default parameters:

$ onion-grab -i domains.lst
2023/04/07 20:29:45 INFO: ctrl+C to exit prematurely
2023/04/07 20:29:45 INFO: starting 128 workers with limit 64/s
2023/04/07 20:29:45 INFO: starting work receiver
2023/04/07 20:29:45 INFO: starting work generator
www.qubes-os.org header= attribute=http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
www.torproject.org header=http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html attribute=
2023/04/07 20:29:50 INFO: metrics@receiver:

  Processed: 3
    Success: 3 (Onion-Location:2)
    Failure: 0 (See breakdown below)
        Req: 0 (Before sending request)
        DNS: 0 (NotFound:0 Timeout:0 Other:0)
        TCP: 0 (Timeout:0 Syscall:0)
        TLS: 0 (Cert:0 Other:0)
        3xx: 0 (Too many redirects)
        EOF: 0 (Unclear meaning)
        CTX: 0 (Deadline exceeded)
        ???: 0 (Other errors)

2023/04/07 20:29:51 INFO: about to exit in at most 11s, reading remaining answers
2023/04/07 20:29:57 INFO: metrics@receiver: summary:

  Processed: 3
    Success: 3 (Onion-Location:2)
    Failure: 0 (See breakdown below)
        Req: 0 (Before sending request)
        DNS: 0 (NotFound:0 Timeout:0 Other:0)
        TCP: 0 (Timeout:0 Syscall:0)
        TLS: 0 (Cert:0 Other:0)
        3xx: 0 (Too many redirects)
        EOF: 0 (Unclear meaning)
        CTX: 0 (Deadline exceeded)
        ???: 0 (Other errors)

2023/04/07 20:29:57 INFO: measurement duration was 12s

Sites with Onion-Location are printed to stdout. Here, www.torproject.org configures it with an HTTP header, while www.qubes-os.org does it with an HTML attribute. All three sites were reached successfully; www.eff.org is not printed because no Onion-Location was found for it.
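
The two mechanisms can be told apart with a few lines of code. The following is a minimal Python sketch, not onion-grab's actual Go implementation. It assumes the HTTP header is named Onion-Location and that the HTML form is a <meta http-equiv="onion-location" content="..."> tag. The names MetaOnionLocation and find_onion_location are hypothetical and used for illustration only.

```python
from html.parser import HTMLParser


class MetaOnionLocation(HTMLParser):
    """Collects the content of the first <meta http-equiv="onion-location"> tag."""

    def __init__(self):
        super().__init__()
        self.value = ""

    def handle_starttag(self, tag, attrs):
        # Only the first matching <meta> tag is kept.
        if tag != "meta" or self.value:
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("http-equiv", "").lower() == "onion-location":
            self.value = attr.get("content", "")


def find_onion_location(headers, body):
    """Return (header_value, attribute_value); either may be empty."""
    header = ""
    # HTTP header names are case-insensitive.
    for name, value in headers.items():
        if name.lower() == "onion-location":
            header = value
            break
    parser = MetaOnionLocation()
    parser.feed(body)
    return header, parser.value
```

Given a response's headers as a dict and its body as a string, the returned pair corresponds to the header= and attribute= fields in the output above. This sketch does not check that the value is a well-formed onion URL.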

In case of errors, the metrics break the failures down by type. Relatively few errors end up in the catch-all ??? category.

Scripts

Digest the results, here stored as onion-grab.stdout:

$ cat onion-grab.stdout
www.qubes-os.org header= attribute=http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
www.torproject.org header=http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html attribute=
$ ./scripts/digest.py -i onion-grab.stdout
digest.py:25 INFO: found 1 HTTP headers with Onion-Location
digest.py:26 INFO: found 1 HTML meta attributes with Onion-Location
digest.py:27 INFO: found 2 unqiue domain names that set Onion-Location
digest.py:28 INFO: found 2 unique two-label onion addresses in the process
digest.py:30 INFO: storing domains with valid Onion-Location configurations in domains.txt
digest.py:35 INFO: storing two-label onion addresses that domains referenced in onions.txt
$ cat domains.txt
www.qubes-os.org http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
www.torproject.org http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html
$ cat onions.txt
qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion www.qubes-os.org
2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion www.torproject.org

In other words, the digest script prints some information and writes two files:

  • domains.txt: domains that configured a valid Onion-Location, via HTTP header or HTML attribute. The Onion-Location values listed for each domain are de-duplicated and space-separated.
  • onions.txt: two-label .onion addresses that were discovered. Each listed domain referenced the address in its Onion-Location configuration, possibly with subdomains, paths, etc., which were removed. Pruning the Onion-Location values this way is useful for estimating the number of distinct onions.
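
The pruning step can be illustrated as follows. This is a rough Python sketch under stated assumptions, not the code in scripts/digest.py: it assumes each result line has the form "domain header=VALUE attribute=VALUE" without spaces inside the values, and it takes the last two labels of the URL's hostname as the two-label onion address. The function names are hypothetical, and no validation of the onion address itself is attempted.

```python
from urllib.parse import urlsplit


def parse_line(line):
    """Split 'domain header=VALUE attribute=VALUE' into its three parts."""
    domain, header, attribute = line.split()
    if not header.startswith("header=") or not attribute.startswith("attribute="):
        raise ValueError(f"unexpected line format: {line!r}")
    return domain, header[len("header="):], attribute[len("attribute="):]


def two_label_onion(url):
    """Return '<label>.onion' for a URL, or '' if the host is not an onion."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    if len(labels) < 2 or labels[-1] != "onion" or not labels[-2]:
        return ""
    # Drop subdomains, port, path, and query; keep only the last two labels.
    return ".".join(labels[-2:])


def digest(lines):
    """Map each two-label onion address to the set of domains referencing it."""
    onions = {}
    for line in lines:
        domain, header, attribute = parse_line(line)
        for value in (header, attribute):
            onion = two_label_onion(value)
            if onion:
                onions.setdefault(onion, set()).add(domain)
    return onions
```

Counting the keys of the returned dictionary gives an estimate of the number of distinct onions, since several domains (or several paths on one domain) may point at the same address.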

See scripts/test.sh if you are looking to test different onion-grab configurations. You may also find scripts/measure.sh to be a useful measurement script.

Running a larger measurement

See docs/operations.md for measurements of Tranco top-1M and ct-sans.

Contact

  • rasmus (at) rgdd (dot) se

Licence

BSD 2-Clause License