webconnectivity
This directory contains a new implementation of Web Connectivity.
As of 2022-08-26, this code is experimental and is not selected
by default when you run the websites group. You can select this
implementation by running miniooni web_connectivity@v0.5 from the
command line.
Issue #2237 explains the rationale behind writing this new implementation.
Implementation overview
The experiment measures a single URL at a time. The OONI Engine invokes the
Run method inside the measurer.go file. This code starts a number of
background tasks, waits for them to complete, and finally calls
TestKeys.finalize to finalize the content of the JSON measurement.
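The start, wait, finalize sequence described above can be sketched as follows. This is a minimal illustration and not the code in measurer.go: the testKeys type, its appendEvent method, and the run function are hypothetical names, and the real experiment lets tasks spawn further tasks rather than using a flat list.

```go
package main

import (
	"fmt"
	"sync"
)

// testKeys is a hypothetical stand-in for the experiment's result container.
type testKeys struct {
	mu        sync.Mutex
	events    []string
	finalized bool
}

// appendEvent records an observation; tasks may call it concurrently.
func (tk *testKeys) appendEvent(ev string) {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.events = append(tk.events, ev)
}

// finalize runs once all tasks are done. In this sketch it only sets a flag,
// whereas the real function analyzes the collected data.
func (tk *testKeys) finalize() {
	tk.mu.Lock()
	defer tk.mu.Unlock()
	tk.finalized = true
}

// run starts each task in its own goroutine, waits for all of them to
// complete, and only then finalizes the test keys.
func run(tasks []func(tk *testKeys)) *testKeys {
	tk := &testKeys{}
	wg := &sync.WaitGroup{}
	for _, task := range tasks {
		wg.Add(1)
		go func(t func(tk *testKeys)) {
			defer wg.Done()
			t(tk)
		}(task)
	}
	wg.Wait()
	tk.finalize()
	return tk
}

func main() {
	tk := run([]func(tk *testKeys){
		func(tk *testKeys) { tk.appendEvent("dns") },
		func(tk *testKeys) { tk.appendEvent("control") },
	})
	fmt.Println(len(tk.events), tk.finalized) // prints "2 true"
}
```

Finalizing only after wg.Wait returns means the analysis never observes a partially filled result.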
The first task that is started deals with DNS and lives in the
dnsresolvers.go file. This task is responsible for resolving the domain
inside the URL into 0..N IP addresses.
The domain resolution includes the system resolver and a DNS-over-UDP resolver. The implementation may do more than that, but this is the minimum we are documenting for now: we need to experiment further to understand what else we can do there, so the code probably does more than just this.
Once we know the 0..N IP addresses for the domain, we do the following:
- start a background task to communicate with the Web Connectivity test helper, using code inside control.go;
- start an endpoint measurement task for each IP address (which of course only happens when we know at least one address).
Regarding starting endpoint measurements, we follow this policy:
- if the original URL is http://... then we start a cleartext task and an encrypted task for each address using ports 80 and 443 respectively;
- if it's https://..., then we only start encrypted tasks.
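The policy above can be expressed as a small planning function. This is a sketch under stated assumptions: endpointTask and planEndpointTasks are hypothetical names that do not appear in the actual code, and the sketch only computes which endpoints would be measured without performing any network I/O.

```go
package main

import (
	"fmt"
	"net"
)

// endpointTask is a hypothetical description of one endpoint measurement.
type endpointTask struct {
	Address   string // IP address and port joined as host:port
	Encrypted bool   // true for TLS tasks, false for cleartext tasks
}

// planEndpointTasks applies the policy: http URLs get a cleartext task on
// port 80 and an encrypted task on port 443 for each address, while https
// URLs only get the encrypted task.
func planEndpointTasks(scheme string, addrs []string) []endpointTask {
	var out []endpointTask
	for _, addr := range addrs {
		if scheme == "http" {
			out = append(out, endpointTask{
				Address:   net.JoinHostPort(addr, "80"),
				Encrypted: false,
			})
		}
		if scheme == "http" || scheme == "https" {
			out = append(out, endpointTask{
				Address:   net.JoinHostPort(addr, "443"),
				Encrypted: true,
			})
		}
	}
	return out
}

func main() {
	for _, t := range planEndpointTasks("http", []string{"93.184.216.34", "2001:db8::1"}) {
		fmt.Println(t.Address, t.Encrypted)
	}
}
```

Using net.JoinHostPort keeps IPv6 addresses correctly bracketed, for example [2001:db8::1]:443.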
Cleartext tasks are implemented by cleartextflow.go while the encrypted tasks live in secureflow.go.
A cleartext task does the following:
- TCP connect;
- additionally, the first task to establish a connection also performs a GET request to fetch a webpage (we cannot GET for all connections, because that would be websteps and would require a different data format).
An encrypted task does the following:
- TCP connect;
- TLS handshake;
- additionally, the first task to handshake also performs a GET request to fetch a webpage iff the input URL was https://... (we cannot GET for all connections, because that would be websteps and would require a different data format).
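Both kinds of task share the rule that only the first task to get far enough performs the GET. One way to implement such a rule is an atomic compare-and-swap gate, sketched below. This is an assumption about a possible mechanism rather than a description of what cleartextflow.go and secureflow.go actually do; fetchGate and runEndpointTasks are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fetchGate lets exactly one task proceed to the HTTP GET.
type fetchGate struct {
	taken int32
}

// tryAcquire returns true only for the first caller and false afterwards.
func (g *fetchGate) tryAcquire() bool {
	return atomic.CompareAndSwapInt32(&g.taken, 0, 1)
}

// runEndpointTasks simulates n concurrent endpoint tasks that all reach the
// point where a GET would be possible, and returns how many performed it.
func runEndpointTasks(n int) int32 {
	gate := &fetchGate{}
	var fetches int32
	wg := &sync.WaitGroup{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The TCP connect (and TLS handshake) would happen here.
			if gate.tryAcquire() {
				// Stands in for the single GET request.
				atomic.AddInt32(&fetches, 1)
			}
		}()
	}
	wg.Wait()
	return fetches
}

func main() {
	fmt.Println(runEndpointTasks(8)) // prints 1
}
```

All tasks still record their connect and handshake results; the gate only limits the webpage fetch to one of them.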
If fetching the webpage returns a redirect, we start a new DNS task, passing it the redirect URL as the new URL to measure. We do not call the test helper again when this happens, though. The Web Connectivity test helper already follows the whole redirect chain, so we would need to change the test helper to get information on each flow. If we make that change, this experiment will probably not be Web Connectivity anymore, but rather some form of websteps.
Additionally, when the test helper terminates, we run TCP connect and TLS handshake (when applicable) for new IP addresses discovered using the test helper that were previously unknown to the probe, thus collecting extra information. This logic lives inside the control.go file.
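Selecting the addresses that only the test helper knows about amounts to a set difference. The sketch below illustrates that step alone. The function name newAddrs is hypothetical, and the real logic in control.go may track known addresses differently.

```go
package main

import "fmt"

// newAddrs returns the addresses reported by the test helper that the probe
// did not already know, preserving order and dropping duplicates.
func newAddrs(probeAddrs, thAddrs []string) []string {
	known := make(map[string]bool)
	for _, addr := range probeAddrs {
		known[addr] = true
	}
	var out []string
	for _, addr := range thAddrs {
		if !known[addr] {
			known[addr] = true // avoid measuring the same new address twice
			out = append(out, addr)
		}
	}
	return out
}

func main() {
	probe := []string{"1.1.1.1", "8.8.8.8"}
	th := []string{"8.8.8.8", "9.9.9.9", "9.9.9.9"}
	fmt.Println(newAddrs(probe, th)) // prints [9.9.9.9]
}
```

Each returned address would then get a TCP connect and, when applicable, a TLS handshake.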
As previously mentioned, when all tasks complete, we call TestKeys.finalize.
In turn, this function analyzes the collected data by calling code implemented inside the following files:
- analysiscore.go contains the core analysis algorithm;
- analysisdns.go contains DNS specific analysis;
- analysishttpcore.go contains the bulk of the HTTP analysis, where we mainly determine TLS blocking;
- analysishttpdiff.go contains the HTTP diff algorithm;
- analysistcpip.go checks for TCP/IP blocking.
We emit the blocking and accessible keys we emitted before, as well as new
keys prefixed by x_ to indicate that they're experimental.
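The shape of the emitted keys can be sketched with a small JSON example. Only the blocking and accessible key names come from the text above. The x_example_flags key is a hypothetical placeholder for an experimental key, and the field types are assumptions made for illustration (blocking is modeled as an arbitrary JSON value and accessible as a nullable boolean).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summaryKeys is a hypothetical subset of the test keys.
type summaryKeys struct {
	// Blocking is assumed to hold either a string naming the blocking
	// type or another JSON value such as false or null.
	Blocking interface{} `json:"blocking"`

	// Accessible is assumed to be a nullable boolean.
	Accessible *bool `json:"accessible"`

	// XExampleFlags is a placeholder name, not a real key.
	XExampleFlags int64 `json:"x_example_flags"`
}

// encodeSummary serializes the keys to JSON.
func encodeSummary(blocking interface{}, accessible bool, flags int64) string {
	data, err := json.Marshal(summaryKeys{
		Blocking:      blocking,
		Accessible:    &accessible,
		XExampleFlags: flags,
	})
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	fmt.Println(encodeSummary("dns", false, 1))
}
```

Consumers that only understand the legacy keys can ignore anything starting with x_.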
Limitations and next steps
We need to extend the Web Connectivity test helper so that it returns information about TLS handshakes with the IP addresses discovered by the probe. This information would allow us to make more precise statements about TLS blocking.
Further changes are probably possible. Departing too radically from the Web
Connectivity model, though, will lead us to have a websteps implementation
(but then the data model would most likely be different).