# onion-grab

A tool that visits a list of domains over HTTPS to see if they have
[Onion-Location][] configured.

[Onion-Location]: https://community.torproject.org/onion-services/advanced/onion-location/

**Warning:** research prototype.

## Quickstart

### Install

You will need a [Go compiler][] on the local system:

    $ which go >/dev/null || echo "Go compiler is not in PATH"

[Go compiler]: https://go.dev/doc/install

Install `onion-grab`:

    $ go install gitlab.torproject.org/tpo/onion-services/onion-grab@latest

List all options:

    $ onion-grab -h

### Basic usage

Store one domain per line in a file:

    $ cat domains.lst
    www.eff.org
    www.qubes-os.org
    www.torproject.org

Run `onion-grab` with default parameters:

    $ onion-grab -i domains.lst
    2023/04/07 20:29:45 INFO: ctrl+C to exit prematurely
    2023/04/07 20:29:45 INFO: starting 128 workers with limit 64/s
    2023/04/07 20:29:45 INFO: starting work receiver
    2023/04/07 20:29:45 INFO: starting work generator
    www.qubes-os.org header= attribute=http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
    www.torproject.org header=http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html attribute=
    2023/04/07 20:29:50 INFO: metrics@receiver:
      Processed: 3
        Success: 3 (Onion-Location:2)
        Failure: 0 (See breakdown below)
          Req: 0 (Before sending request)
          DNS: 0 (NotFound:0 Timeout:0 Other:0)
          TCP: 0 (Timeout:0 Syscall:0)
          TLS: 0 (Cert:0 Other:0)
          3xx: 0 (Too many redirects)
          EOF: 0 (Unclear meaning)
          CTX: 0 (Deadline exceeded)
          ???: 0 (Other errors)
    2023/04/07 20:29:51 INFO: about to exit in at most 11s, reading remaining answers
    2023/04/07 20:29:57 INFO: metrics@receiver: summary:
      Processed: 3
        Success: 3 (Onion-Location:2)
        Failure: 0 (See breakdown below)
          Req: 0 (Before sending request)
          DNS: 0 (NotFound:0 Timeout:0 Other:0)
          TCP: 0 (Timeout:0 Syscall:0)
          TLS: 0 (Cert:0 Other:0)
          3xx: 0 (Too many redirects)
          EOF: 0 (Unclear meaning)
          CTX: 0 (Deadline exceeded)
          ???: 0 (Other errors)
    2023/04/07 20:29:57 INFO: measurement duration was 12s

Sites with Onion-Location are printed to stdout, here showing that
`www.torproject.org` configures it with an HTTP header while
`www.qubes-os.org` does it with an HTML attribute. All three sites connected
successfully. When errors occur, the metrics break them down by type, and
relatively few end up in the catch-all `???` category.

### Scripts

Digest the results, here stored as `onion-grab.stdout`:

    $ cat onion-grab.stdout
    www.qubes-os.org header= attribute=http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
    www.torproject.org header=http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html attribute=
    $ ./scripts/digest.py -i onion-grab.stdout
    digest.py:25 INFO: found 1 HTTP headers with Onion-Location
    digest.py:26 INFO: found 1 HTML meta attributes with Onion-Location
    digest.py:27 INFO: found 2 unqiue domain names that set Onion-Location
    digest.py:28 INFO: found 2 unique two-label onion addresses in the process
    digest.py:30 INFO: storing domains with valid Onion-Location configurations in domains.txt
    digest.py:35 INFO: storing two-label onion addresses that domains referenced in onions.txt
    $ cat domains.txt
    www.qubes-os.org http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/
    www.torproject.org http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html
    $ cat onions.txt
    qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion www.qubes-os.org
    2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion www.torproject.org

In other words, the digest script prints some information and writes two files:

  - `domains.txt`: domains that have a valid Onion-Location configuration,
    whether set with an HTTP header or an HTML attribute. The listed
    Onion-Location values are de-duplicated and space-separated.
  - `onions.txt`: two-label `.onion` addresses that were discovered. The listed
    domains referenced this address in their Onion-Location configuration,
    possibly with subdomains, paths, etc., that were removed.
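
The reduction to two-label addresses can be illustrated with a minimal Python sketch. This is not the code of `scripts/digest.py`: the function names `two_label_onion`, `parse_line`, and `onions_by_domain` are hypothetical, and the sketch assumes every stdout line has the `domain header=URL attribute=URL` shape shown above with no spaces inside the values. It also does not check that an onion address is well-formed.

```python
from urllib.parse import urlsplit


def two_label_onion(url):
    """Return the last two host labels if the URL points at a .onion host, else None."""
    host = urlsplit(url).hostname  # lowercased host without port; None if absent
    if host is None:
        return None
    labels = host.split(".")
    if len(labels) < 2 or labels[-1] != "onion":
        return None
    # Drop subdomains; scheme, port, path, and query are already gone.
    return ".".join(labels[-2:])


def parse_line(line):
    """Split 'domain header=URL attribute=URL' into (domain, [non-empty URLs])."""
    fields = line.split()
    domain = fields[0]
    values = []
    for field in fields[1:]:
        key, _, value = field.partition("=")
        if key in ("header", "attribute") and value:
            values.append(value)
    return domain, values


def onions_by_domain(lines):
    """Map each two-label onion address to the sorted domains that referenced it."""
    found = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        domain, values = parse_line(line)
        for value in values:
            onion = two_label_onion(value)
            if onion is not None:
                found.setdefault(onion, set()).add(domain)
    return {onion: sorted(domains) for onion, domains in found.items()}


if __name__ == "__main__":
    sample = [
        "www.qubes-os.org header= attribute=http://qubesosfasa4zl44o4tws22di6kepyzfeqv3tg4e3ztknltfxqrymdad.onion/",
        "www.torproject.org header=http://2gzyxa5ihm7nsggfxnu52rck2vv4rvmdlkiu3zzui5du4xyclen53wid.onion/index.html attribute=",
    ]
    for onion, domains in onions_by_domain(sample).items():
        print(onion, " ".join(domains))
```

Keying the result on the onion address rather than on the domain means that several domains pointing at the same onion service collapse into a single entry.
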
Such pruning of the configured Onion-Location values is useful to estimate the
number of onions.

See [scripts/test.sh](./scripts/test.sh) if you are looking to test different
`onion-grab` configurations. You may also find
[scripts/measure.sh](scripts/measure.sh) to be a useful measurement script.

## Running a larger measurement

See [docs/operations.md](TODO) for measurements of [Tranco top-1M][] and
[ct-sans][].

[Tranco top-1M]: https://tranco-list.eu/latest_list
[ct-sans]: https://git.cs.kau.se/rasmoste/ct-sans/-/blob/main/docs/operations.md

## Contact

  - rasmus (at) rgdd (dot) se

## Licence

BSD 2-Clause License