ooni-probe-cli/internal/experiment/webconnectivity
Simone Basso 800217d15b
chore: bump web_connectivity@v0.5 version to 0.5.5 (#945)
chore: web_connectivity v0.5.5

We're bumping the version number to reflect recent improvements in the
data format implemented in these pull requests:

- https://github.com/ooni/probe-cli/pull/942

- https://github.com/ooni/probe-cli/pull/943

- https://github.com/ooni/probe-cli/pull/944

Reference issue: https://github.com/ooni/probe/issues/2238
2022-09-08 11:22:42 +02:00
analysiscore.go feat(webconnectivity@v0.5): use TLS info from TH (#933) 2022-09-05 11:35:48 +02:00
analysisdns.go refactor: spin geoipx off geolocate (#893) 2022-08-28 20:00:25 +02:00
analysishttpcore.go feat(webconnectivity@v0.5): use TLS info from TH (#933) 2022-09-05 11:35:48 +02:00
analysishttpdiff.go refactor: move WebGetTitle inside measurexlite (#895) 2022-08-28 20:26:40 +02:00
analysistcpip.go feat(webconnectivity@v0.5): use TLS info from TH (#933) 2022-09-05 11:35:48 +02:00
analysistls.go feat(webconnectivity@v0.5): use TLS info from TH (#933) 2022-09-05 11:35:48 +02:00
cleartextflow.go fix(webconnectivity@v0.5): include http transaction start/done (#943) 2022-09-08 10:37:08 +02:00
config.go feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
control.go fix(webconnectivity@v0.5): fetch HTTP only using system-resolver addrs (#935) 2022-09-05 13:33:59 +02:00
dnscache.go fix(webconnectivity@v0.5): fetch HTTP only using system-resolver addrs (#935) 2022-09-05 13:33:59 +02:00
dnsresolvers.go fix(webconnectivity@v0.5): fetch HTTP only using system-resolver addrs (#935) 2022-09-05 13:33:59 +02:00
dnswhoami.go feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
doc.go feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
inputparser.go feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
measurer.go chore: bump web_connectivity@v0.5 version to 0.5.5 (#945) 2022-09-08 11:22:42 +02:00
README.md feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
secureflow.go fix(webconnectivity@v0.5): include http transaction start/done (#943) 2022-09-08 10:37:08 +02:00
summary.go feat(webconnectivity): long-term-evolution prototype (#882) 2022-08-26 16:42:48 +02:00
testkeys.go fix(webconnectivity@v0.5): include http transaction start/done (#943) 2022-09-08 10:37:08 +02:00

webconnectivity

This directory contains a new implementation of Web Connectivity.

As of 2022-08-26, this code is experimental and is not selected by default when you run the websites group. To select this implementation, run miniooni web_connectivity@v0.5 from the command line.

Issue #2237 explains the rationale behind writing this new implementation.

Implementation overview

The experiment measures a single URL at a time. The OONI Engine invokes the Run method inside the measurer.go file.

This code starts a number of background tasks, waits for them to complete, and finally calls TestKeys.finalize to finalize the content of the JSON measurement.
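The start/wait/finalize sequence can be sketched as follows. This is a minimal illustration, not the code in measurer.go: the testKeys type, its appendEvent method, and the run function are hypothetical stand-ins, and each task here only records that it ran.

```go
package main

import (
	"fmt"
	"sync"
)

// testKeys is a hypothetical stand-in for the experiment's test keys.
type testKeys struct {
	mu        sync.Mutex
	events    []string
	finalized bool
}

// appendEvent records the outcome of a background task.
func (tk *testKeys) appendEvent(ev string) {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.events = append(tk.events, ev)
}

// finalize marks the point where the collected data would be analyzed.
func (tk *testKeys) finalize() {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.finalized = true
}

// run starts one goroutine per task, waits for all of them to
// complete, and only then finalizes the test keys.
func run(tasks []string) *testKeys {
	tk := &testKeys{}
	wg := &sync.WaitGroup{}
	for _, name := range tasks {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			tk.appendEvent(name + ": done")
		}(name)
	}
	wg.Wait()
	tk.finalize()
	return tk
}

func main() {
	tk := run([]string{"dns", "control", "endpoint"})
	fmt.Println(len(tk.events), tk.finalized) // prints "3 true"
}
```

Because finalize runs only after wg.Wait returns, the analysis sees everything the tasks have collected.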

The first task that is started deals with DNS and lives in the dnsresolvers.go file. This task is responsible for resolving the domain inside the URL into 0..N IP addresses.

The domain resolution includes the system resolver and a DNS-over-UDP resolver. The implementation may do more than that, but this is the bare minimum we are willing to document right now. (We need to experiment a bit more to understand what else we can do there, hence the code is probably doing more than just that.)
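As an illustration of how the results of several resolvers could be combined into 0..N addresses, here is a minimal sketch. The lookupFunc type and the resolveAll, fakeLookup, failingLookup and exampleResolve names are assumptions made for this example and do not come from dnsresolvers.go. Fake resolvers are used so the sketch runs without network access; a real system-resolver lookup could be net.DefaultResolver.LookupHost, which has the same signature.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// lookupFunc abstracts a single resolver (hypothetical type).
type lookupFunc func(ctx context.Context, domain string) ([]string, error)

// resolveAll runs every resolver concurrently and returns the sorted
// set of unique addresses. A failing resolver contributes nothing, so
// the result may be empty.
func resolveAll(ctx context.Context, domain string, resolvers ...lookupFunc) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen = map[string]bool{}
	)
	for _, lookup := range resolvers {
		wg.Add(1)
		go func(lookup lookupFunc) {
			defer wg.Done()
			addrs, err := lookup(ctx, domain)
			if err != nil {
				return
			}
			mu.Lock()
			defer mu.Unlock()
			for _, a := range addrs {
				seen[a] = true
			}
		}(lookup)
	}
	wg.Wait()
	out := make([]string, 0, len(seen))
	for a := range seen {
		out = append(out, a)
	}
	sort.Strings(out)
	return out
}

// fakeLookup returns a resolver that always yields the given addresses.
func fakeLookup(addrs ...string) lookupFunc {
	return func(ctx context.Context, domain string) ([]string, error) {
		return addrs, nil
	}
}

// failingLookup simulates a resolver that cannot resolve the domain.
func failingLookup(ctx context.Context, domain string) ([]string, error) {
	return nil, errors.New("lookup failed")
}

// exampleResolve combines two working fake resolvers and a failing one.
func exampleResolve() []string {
	return resolveAll(context.Background(), "example.com",
		fakeLookup("192.0.2.1", "192.0.2.2"), // stands in for the system resolver
		fakeLookup("192.0.2.2"),              // stands in for DNS-over-UDP
		failingLookup,
	)
}

func main() {
	fmt.Println(exampleResolve()) // prints "[192.0.2.1 192.0.2.2]"
}
```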

Once we know the 0..N IP addresses for the domain we do the following:

  1. start a background task to communicate with the Web Connectivity test helper, using code inside control.go;

  2. start an endpoint measurement task for each IP address (which of course only happens when we know at least one addr).

Regarding starting endpoint measurements, we follow this policy:

  1. if the original URL is http://... then we start a cleartext task and an encrypted task for each address using ports 80 and 443 respectively.

  2. if it's https://..., then we only start encrypted tasks.
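The policy above can be sketched as a small pure function. The endpoint type and the planEndpoints name are hypothetical, and the sketch assumes the default ports 80 and 443 mentioned in the text (it does not consider URLs carrying an explicit port).

```go
package main

import (
	"fmt"
	"net"
)

// endpoint describes one endpoint measurement task (hypothetical type).
type endpoint struct {
	Address string // IP address and port, e.g. "192.0.2.1:443"
	TLS     bool   // true for encrypted tasks, false for cleartext tasks
}

// planEndpoints maps the URL scheme and the resolved addresses to the
// list of tasks to start: cleartext plus encrypted for http, encrypted
// only for https.
func planEndpoints(scheme string, addrs []string) []endpoint {
	var out []endpoint
	for _, addr := range addrs {
		switch scheme {
		case "http":
			out = append(out,
				endpoint{Address: net.JoinHostPort(addr, "80"), TLS: false},
				endpoint{Address: net.JoinHostPort(addr, "443"), TLS: true},
			)
		case "https":
			out = append(out,
				endpoint{Address: net.JoinHostPort(addr, "443"), TLS: true},
			)
		}
	}
	return out
}

func main() {
	for _, ep := range planEndpoints("http", []string{"192.0.2.1", "2001:db8::1"}) {
		fmt.Println(ep.Address, ep.TLS)
	}
}
```

net.JoinHostPort takes care of wrapping IPv6 addresses in square brackets.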

Cleartext tasks are implemented by cleartextflow.go while the encrypted tasks live in secureflow.go.

A cleartext task does the following:

  1. TCP connect;

  2. additionally, the first task to establish a connection also performs a GET request to fetch a webpage (we cannot GET for all connections, because that would be websteps and would require a different data format).

An encrypted task does the following:

  1. TCP connect;

  2. TLS handshake;

  3. additionally, the first task to handshake also performs a GET request to fetch a webpage iff the input URL was https://... (we cannot GET for all connections, because that would be websteps and would require a different data format).
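Both kinds of task let only the first successful connection fetch the webpage. One possible way to express "the first task wins" is an atomic counter shared by all tasks, as in the sketch below. The fetchGate type and the countFetches helper are hypothetical and the actual code may coordinate tasks differently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fetchGate lets exactly one task perform the GET request.
type fetchGate struct {
	count int64
}

// tryAcquire returns true only for the first caller.
func (g *fetchGate) tryAcquire() bool {
	return atomic.AddInt64(&g.count, 1) == 1
}

// countFetches starts numTasks concurrent tasks sharing one gate and
// returns how many of them were allowed to fetch the webpage.
func countFetches(numTasks int) int64 {
	gate := &fetchGate{}
	var fetches int64
	var wg sync.WaitGroup
	for i := 0; i < numTasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The TCP connect (and TLS handshake) would happen here.
			if gate.tryAcquire() {
				// Only this task would issue the GET request.
				atomic.AddInt64(&fetches, 1)
			}
		}()
	}
	wg.Wait()
	return fetches
}

func main() {
	fmt.Println(countFetches(8)) // prints "1"
}
```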

If fetching the webpage returns a redirect, we start a new DNS task passing it the redirect URL as the new URL to measure. We do not call the test helper again when this happens, though. The Web Connectivity test helper already follows the whole redirect chain, so we would need to change the test helper to get information on each flow. When this happens, this experiment will probably not be Web Connectivity anymore, but rather some form of websteps.
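To illustrate how a redirect could yield the next URL to measure, the sketch below resolves a Location header value against the current URL using the standard net/url package. The nextURL function is a hypothetical helper; the actual code may derive the redirect URL differently.

```go
package main

import (
	"fmt"
	"net/url"
)

// nextURL resolves the redirect location relative to the current URL
// and returns the absolute URL that a new DNS task would measure.
func nextURL(current, location string) (string, error) {
	base, err := url.Parse(current)
	if err != nil {
		return "", err
	}
	next, err := base.Parse(location) // handles relative and absolute locations
	if err != nil {
		return "", err
	}
	return next.String(), nil
}

func main() {
	relative, _ := nextURL("http://example.com/a/b", "/login")
	fmt.Println(relative) // prints "http://example.com/login"

	absolute, _ := nextURL("http://example.com/", "https://www.example.com/")
	fmt.Println(absolute) // prints "https://www.example.com/"
}
```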

Additionally, when the test helper terminates, we run TCP connect and TLS handshake (when applicable) for new IP addresses discovered using the test helper that were previously unknown to the probe, thus collecting extra information. This logic lives inside the control.go file.
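Selecting the addresses that only the test helper knows about amounts to a set difference. A minimal sketch, where newAddrs is a hypothetical name and addresses are compared as plain strings:

```go
package main

import "fmt"

// newAddrs returns the addresses discovered by the test helper (thAddrs)
// that the probe did not already know (probeAddrs), without duplicates
// and in the order in which the test helper returned them.
func newAddrs(probeAddrs, thAddrs []string) []string {
	known := make(map[string]bool, len(probeAddrs))
	for _, a := range probeAddrs {
		known[a] = true
	}
	var out []string
	for _, a := range thAddrs {
		if !known[a] {
			known[a] = true // also skips repeated test helper entries
			out = append(out, a)
		}
	}
	return out
}

func main() {
	probe := []string{"192.0.2.1"}
	th := []string{"192.0.2.1", "192.0.2.2", "192.0.2.2"}
	fmt.Println(newAddrs(probe, th)) // prints "[192.0.2.2]"
}
```

Only the returned addresses would then get the extra TCP connect and, when applicable, TLS handshake.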

As previously mentioned, when all tasks complete, we call TestKeys.finalize.

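In turn, this function analyzes the collected data by calling code implemented inside the analysis*.go files in this directory (analysiscore.go, analysisdns.go, analysishttpcore.go, analysishttpdiff.go, analysistcpip.go, and analysistls.go).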

We emit the same blocking and accessible keys as before, plus new keys prefixed by x_ to indicate that they are experimental.
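The shape of such output can be sketched with a Go struct and JSON tags. Only the blocking and accessible key names come from the text above; the summary type, the encode helper, and the x_example_flags key are hypothetical and serve only to show the x_ prefix convention.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary is a hypothetical subset of the test keys.
type summary struct {
	// Blocking and Accessible are the pre-existing keys.
	Blocking   interface{} `json:"blocking"`
	Accessible interface{} `json:"accessible"`

	// XExampleFlags is a made-up experimental key: the x_ prefix
	// marks it as experimental.
	XExampleFlags int64 `json:"x_example_flags"`
}

// encode serializes the summary to JSON.
func encode(s summary) string {
	data, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	fmt.Println(encode(summary{Blocking: "dns", Accessible: false, XExampleFlags: 1}))
	// prints {"blocking":"dns","accessible":false,"x_example_flags":1}
}
```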

Limitations and next steps

We need to extend the Web Connectivity test helper to return information about TLS handshakes with IP addresses discovered by the probe. This information would allow us to make more precise statements about TLS blocking.

Further changes are probably possible. Departing too radically from the Web Connectivity model, though, would effectively turn this into a websteps implementation (but then the data model would most likely be different).