webconnectivity
This directory contains a new implementation of Web Connectivity.
As of 2022-08-26, this code is experimental and is not selected by default when you run the `websites` group. You can select this implementation with `miniooni` using `miniooni web_connectivity@v0.5` from the command line.
Issue #2237 explains the rationale behind writing this new implementation.
Implementation overview
The experiment measures a single URL at a time. The OONI Engine invokes the `Run` method inside the measurer.go file.
This code starts a number of background tasks, waits for them to complete, and finally calls `TestKeys.finalize` to finalize the content of the JSON measurement.
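The start, wait, finalize sequence described above can be sketched as follows. This is a minimal illustration, not the code in measurer.go: the `testKeys` type, its `record` method, and the `run` function are hypothetical stand-ins, and the sketch does not model tasks that spawn further tasks.

```go
package main

import (
	"fmt"
	"sync"
)

// testKeys is a hypothetical stand-in for the experiment's test keys.
type testKeys struct {
	mu        sync.Mutex
	events    []string
	finalized bool
}

// record appends an observation; it is safe to call from concurrent tasks.
func (tk *testKeys) record(ev string) {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.events = append(tk.events, ev)
}

// finalize marks the point where the analysis of collected data would run.
func (tk *testKeys) finalize() {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.finalized = true
}

// run starts every task in the background, waits for all of them to
// complete, and only then finalizes the test keys.
func run(tk *testKeys, tasks []func(*testKeys)) {
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(t func(*testKeys)) {
			defer wg.Done()
			t(tk)
		}(task)
	}
	wg.Wait()
	tk.finalize()
}

func main() {
	tk := &testKeys{}
	run(tk, []func(*testKeys){
		func(tk *testKeys) { tk.record("dns") },
		func(tk *testKeys) { tk.record("control") },
	})
	fmt.Println(len(tk.events), tk.finalized)
}
```

Finalizing only after `wg.Wait()` returns ensures the analysis sees every observation the tasks produced.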
The first task that is started deals with DNS and lives in the dnsresolvers.go file. This task is responsible for resolving the domain inside the URL into `0..N` IP addresses.
The domain resolution includes the system resolver and a DNS-over-UDP resolver. The implementation may do more than that, but this is the bare minimum we are willing to document right now. (We need to experiment a bit more to understand what else we can do there, so the code probably does more than just that.)
Once we know the `0..N` IP addresses for the domain we do the following:
- start a background task to communicate with the Web Connectivity test helper, using code inside control.go;
- start an endpoint measurement task for each IP address (which of course only happens when we know at least one address).
Regarding starting endpoint measurements, we follow this policy:
- if the original URL is `http://...` then we start a cleartext task and an encrypted task for each address using ports `80` and `443` respectively.
- if it's `https://...`, then we only start encrypted tasks.
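The policy above can be sketched as a small planning function. This is an illustration under stated assumptions: `endpointTask` and `planEndpointTasks` are hypothetical names, and the sketch always uses ports 80 and 443, ignoring any explicit port in the URL.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// endpointTask is a hypothetical description of one endpoint measurement.
type endpointTask struct {
	Endpoint  string // address:port to connect to
	Encrypted bool   // true for a TLS task, false for a cleartext task
}

// planEndpointTasks maps the input URL scheme and the resolved addresses
// to the list of tasks to start: cleartext plus encrypted for http,
// encrypted only for https.
func planEndpointTasks(rawURL string, addrs []string) ([]endpointTask, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme: %q", u.Scheme)
	}
	var out []endpointTask
	for _, addr := range addrs {
		if u.Scheme == "http" {
			out = append(out, endpointTask{
				Endpoint:  net.JoinHostPort(addr, "80"),
				Encrypted: false,
			})
		}
		out = append(out, endpointTask{
			Endpoint:  net.JoinHostPort(addr, "443"),
			Encrypted: true,
		})
	}
	return out, nil
}

func main() {
	// The addresses below are documentation-only examples.
	tasks, err := planEndpointTasks("http://example.com/", []string{"192.0.2.1", "2001:db8::1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tasks {
		fmt.Printf("%s encrypted=%v\n", t.Endpoint, t.Encrypted)
	}
}
```

With zero resolved addresses the function returns no tasks, which matches the note that endpoint measurements only happen when at least one address is known.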
Cleartext tasks are implemented by cleartextflow.go while the encrypted tasks live in secureflow.go.
A cleartext task does the following:
- TCP connect;
- additionally, the first task to establish a connection also performs a GET request to fetch a webpage (we cannot GET for all connections, because that would be `websteps` and would require a different data format).
An encrypted task does the following:
- TCP connect;
- TLS handshake;
- additionally, the first task to handshake also performs a GET request to fetch a webpage iff the input URL was `https://...` (we cannot GET for all connections, because that would be `websteps` and would require a different data format).
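The rule that only the first successful task performs the GET amounts to electing a single winner among concurrent tasks. Below is a minimal sketch of one way to do that, assuming an atomic compare-and-swap flag; `fetchPermit` and `acquire` are hypothetical names, and the actual code may select the winner differently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fetchPermit is a hypothetical one-shot permit: exactly one caller of
// acquire gets true, every other caller gets false.
type fetchPermit struct {
	taken int32
}

// acquire atomically flips the flag from 0 to 1 and reports whether this
// caller was the one that flipped it.
func (p *fetchPermit) acquire() bool {
	return atomic.CompareAndSwapInt32(&p.taken, 0, 1)
}

func main() {
	permit := &fetchPermit{}
	var fetches int32
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A real task would TCP connect (and TLS handshake) here,
			// and only try to acquire the permit on success.
			if permit.acquire() {
				atomic.AddInt32(&fetches, 1) // stands in for the GET request
			}
		}()
	}
	wg.Wait()
	fmt.Println("GET requests performed:", atomic.LoadInt32(&fetches))
}
```

Whichever task wins the race, the program performs a single GET, so the measurement keeps the one-webpage-fetch shape of the Web Connectivity data format.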
If fetching the webpage returns a redirect, we start a new DNS task, passing it the redirect URL as the new URL to measure. We do not call the test helper again when this happens, though. The Web Connectivity test helper already follows the whole redirect chain, so we would need to change the test helper to get information on each flow. Once that change is made, this experiment will probably not be Web Connectivity anymore, but rather some form of websteps.
Additionally, when the test helper terminates, we run TCP connect and TLS handshake (when applicable) for new IP addresses discovered using the test helper that were previously unknown to the probe, thus collecting extra information. This logic lives inside the control.go file.
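Finding the addresses that only the test helper knows about is a set difference. A minimal sketch follows; `newAddresses` is a hypothetical helper, not the function in control.go, and it compares addresses as plain strings, so two different textual forms of the same IPv6 address would count as distinct.

```go
package main

import "fmt"

// newAddresses returns the addresses reported by the test helper that the
// probe did not already know, preserving order and dropping duplicates.
func newAddresses(probeAddrs, thAddrs []string) []string {
	known := make(map[string]bool, len(probeAddrs))
	for _, addr := range probeAddrs {
		known[addr] = true
	}
	var out []string
	for _, addr := range thAddrs {
		if known[addr] {
			continue
		}
		known[addr] = true // avoid returning the same new address twice
		out = append(out, addr)
	}
	return out
}

func main() {
	// The addresses below are documentation-only examples.
	probe := []string{"192.0.2.1", "192.0.2.2"}
	th := []string{"192.0.2.2", "198.51.100.7", "198.51.100.7"}
	fmt.Println(newAddresses(probe, th))
}
```

Each returned address would then get its own TCP connect and, when applicable, TLS handshake.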
As previously mentioned, when all tasks complete, we call `TestKeys.finalize`.
In turn, this function analyzes the collected data by calling code implemented inside the following files:
- analysiscore.go contains the core analysis algorithm;
- analysisdns.go contains DNS specific analysis;
- analysishttpcore.go contains the bulk of the HTTP analysis, where we mainly determine TLS blocking;
- analysishttpdiff.go contains the HTTP diff algorithm;
- analysistcpip.go checks for TCP/IP blocking.
We emit the `blocking` and `accessible` keys we emitted before as well as new keys, prefixed by `x_` to indicate that they're experimental.
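To illustrate how legacy and experimental keys can sit side by side in the JSON, here is a minimal sketch. The `summaryKeys` type and the `x_example_flags` key are hypothetical and chosen only to show the `x_` prefix; the real field names and types live in testkeys.go and summary.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summaryKeys is a hypothetical subset of the test keys. The x_example_flags
// key is invented for illustration and is not an actual key name.
type summaryKeys struct {
	// Blocking uses interface{} so it can hold a string, false, or nil.
	Blocking interface{} `json:"blocking"`
	// Accessible is a pointer so that nil serializes as JSON null.
	Accessible *bool `json:"accessible"`
	// XExampleFlags shows the x_ prefix used for experimental keys.
	XExampleFlags int64 `json:"x_example_flags"`
}

// encode serializes the keys to a JSON string, returning "" on error.
func encode(sk summaryKeys) string {
	data, err := json.Marshal(sk)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	accessible := false
	fmt.Println(encode(summaryKeys{
		Blocking:      "dns",
		Accessible:    &accessible,
		XExampleFlags: 1,
	}))
}
```

Consumers that only understand the legacy keys can keep reading `blocking` and `accessible` and ignore anything starting with `x_`.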
Limitations and next steps
We need to extend the Web Connectivity test helper to return information about TLS handshakes with IP addresses discovered by the probe. This information would allow us to make more precise TLS blocking statements.
Further changes are probably possible. Departing too radically from the Web Connectivity model, though, will lead us to have a `websteps` implementation (but then the data model would most likely be different).