We introduce a fork of internal/httpx, named internal/httpapi, where there is a clear split between the concept of an API endpoint (such as https://0.th.ooni.org/) and of an API descriptor (such as using `GET` to access /api/v1/test-list/url).
Additionally, httpapi allows to create a SequenceCaller that tries to call a given API descriptor using multiple API endpoints. The SequenceCaller will stop once an endpoint works or when all the available endpoints have been tried unsuccessfully.
The definition of "success" is the following: we consider "failure" any error that occurs during the HTTP round trip or when reading the response body. We DO NOT consider "failure" errors (1) when parsing the input URL; (2) when the server returns >= 400; (3) when the server returns a string that does not parse as valid JSON. The idea of this classification of failures is that we ONLY want to retry when we see what looks like a network error that may be caused by (collateral or targeted) censorship.
We take advantage of the availability of this new package and we refactor web_connectivity@v0.4 and web_connectivity@v0.5 to use a SequenceCaller for calling the web connectivity TH API. This means that we will now try all the available THs advertised by the backend rather than just selecting and using the first one provided by the backend.
Because this diff is designed to be backported to the `release/3.16` branch, we have omitted additional changes to always use httpapi where we are currently using httpx. Yet, to remind ourselves about the need to do that, we have deprecated the httpx package. We will rewrite all the code currently using httpx to use httpapi as part of future work.
It is also worth noting that httpapi will allow us to refactor the backend code such that (1) we remove code to select a backend URL endpoint at the beginning and (2) we try several endpoints. The design of the code is such that we can add to the mix some endpoints using as `http.Client` a special client using a tunnel. This will allow us to automatically fallback backend queries.
Closes https://github.com/ooni/probe/issues/2353.
Related to https://github.com/ooni/probe/issues/1519.
All measurements collected since 2022-10-19 with previous versions
of OONI Probe will wrongly report sfu.voip.signal.org as blocked
as it switched to using a different root CA
This fixes: https://github.com/ooni/probe/issues/2344
This change ensures that, in turn, we're able to "remote" all the traffic generated by the `geolocate` package, rather than missing some bits of it that were still using the standard library and caused _some_ geolocations to geolocate as the local host rather than as the remote host.
Extracted from https://github.com/ooni/probe-cli/pull/969, where we tested this functionality.
Closes https://github.com/ooni/probe/issues/1383 (which was long overdue).
Part of https://github.com/ooni/probe/issues/2340, because it allows us to make progress with that.
This diff re-enables `E2E/miniooni.bash`. To make it working properly, we
needed to figure out which were the right cloudfronts to use.
I looked into the configuration and determined that both cloudfronts
should be used because they basically map to the same host.
I also determined it was backwards to test a mixture of prod and testing
APIs, and probably also flaky. So, I choose to only test the prod.
Additionally, I added support for testing all supported tunnels.
Closes https://github.com/ooni/probe/issues/2336
This diff adds to miniooni support for using the torsf tunnel. Such a
tunnel consists of a snowflake pluggable transport in front of a custom
instance of tor and requires tor to be installed.
The usage is like:
```
./miniooni --tunnel=torsf [...]
```
The default snowflake rendezvous method is "domain_fronting". You can
select the AMP cache instead using "amp":
```
./miniooni --snowflake-rendezvous=amp --tunnel=torsf [...]
```
Part of https://github.com/ooni/probe/issues/1955
Closes https://github.com/ooni/probe/issues/2334.
While there, reinstate integration tests, which were also lost in a previous refactoring. However, only run those tests for linux/amd64 because we can be confident that the Go compiler is WAI for all archs we support.
While there, always use bash for running end-to-end tests.
H/T @ainghazal for discovering and reporting this bug.
We introduce the -f, --input-file FILE option with which we
are able to run an OONI Run v2 descriptor stored locally.
In this running mode, there are no checks related to whether the
descriptor has changed, since we're dealing with a local file.
Closes https://github.com/ooni/probe/issues/2328
We're bumping the experiment's version number because we changed the name of the field used to contain late/duplicate DNS responses. We have also changed the algorithm to determine `#dnsDiff`. However, the change should only impact how we log this information. Overall, here the idea is to provide users with a reasonably clear explanation of how the probe maps observations to blocking and accessible using expected/unexpected as the conceptual framework.
Part of https://github.com/ooni/probe/issues/2237
This diff includes a rule to recover from the "measurement failed" state that kicks in when we have a chain of successful redirects from the client side leading to a webpage _and_ any URL in the chain uses HTTPS. See https://github.com/ooni/probe/issues/2307.
While there, fix `i/e/w/iox.go` to avoid triggering the `./script/nocopyreadall.bash` script.
This diff introduces a special rule to avoid emitting null, null when all the connects failed in both the probe and the TH.
While there, recognize that the subset of null, null we're hunting actually deals with websites that are down, so change the internal naming to reflect that and make the code easier to read/understand.
See https://github.com/ooni/probe/issues/2299
It's confusing to see
```
measuring additional addrs from TH: []
```
when actually nothing is going to be measured.
So, instead, let us log about the additional addrs
discovered by the TH instead, which is less confusing.
Part of https://github.com/ooni/probe/issues/2237
See https://github.com/ooni/probe/issues/2290
While there, notice that in such a case the priority selector would hang because of the WaitGroup, so get rid of the WaitGroup and accept that the priority selector is going to hang around for the whole duration of the measurement in some cases. The cancellable `measurer.go`'s context will cause the priority selector to eventually exit when we return from `measurer.go`'s `Run` method.
This diff changes the data format to prefer "udp" to "quic" everywhere we were previously using "quic".
Previously, the code inconsistently used "quic" for operations where we knew we were using "quic" and "udp" otherwise (e.g., for generic operations like ReadFrom).
While it would be more correct to say that a specific HTTP request used "quic" rather than "udp", using "udp" consistently allows one to see how distinct events such as ReadFrom and an handshake all refer to the same address, port, and protocol triple. Therefore, this change makes it easier to programmatically unpack a single measurement and create endpoint stats.
Before implementing this change, I discussed the problem with @hellais who mentioned that ooni/data is not currently using the "quic" string anywhere. I know that ooni/pipeline also doesn't rely on this string. The only users of this feature have been research-oriented experiments such as urlgetter, for which such a change would actually be acceptable.
See https://github.com/ooni/probe/issues/2238 and https://github.com/ooni/spec/pull/262.
Code based on urlgetter had this event and we would like to have this
event with step-by-step code as well.
Because there's no tracing for HTTP when using step-by-step, we will
need to include emitting these events inside the boilerplate.
By doing that, we emit events out of order, so make sure we sort
them by T, which is "the moment when the event was collected".
Part of https://github.com/ooni/probe/issues/2238
* fix(model/archival.go): more optional keys
Basically, `t0` and `transaction_id` should be optional. Version 0.4.x
of web_connectivity should not include them, version 0.5.x should.
There is a technical reason why v0.4.x should not include them. The code
it is based on, tracex, does not record these two fields.
Whereas, v0.5.x, uses measurexlite, which records these two fields.
Part of https://github.com/ooni/probe/issues/2238
* fix(webconnectivity@v0.5): add more fields
This diff adds the following fields to webconnectivity@v0.5:
1. agent, always set to "redirect" (legacy field);
2. client_resolver, properly initialized w/ the resolver's IPv4 address;
3. retries, legacy field always set to null;
4. socksproxy, legacy field always set to null.
Part of https://github.com/ooni/probe/issues/2238
* fix(webconnectivity@v0.5): register extensions
The general idea behind this field is that we would be able
in the future to tweak the data model for some fields, by declaring
we're using a later version, so it seems useful to add it.
See https://github.com/ooni/probe/issues/2238
* fix(measurexlite): use tcp or quic for tls handshake network
This diff fixes a bug where measurexlite was using "tls" as the
protocol for the TLS handshake when using TCP.
While this choice _could_ make sense, the rest of the code we have
written so far uses "tcp" instead.
Using "tcp" makes more sense because it allows you to search for
the same endpoint across different events by checking for the same
network and for the same endpoint rather than special casing TLS
handshakes for using "tls" when the endpoint is "tcp".
See https://github.com/ooni/probe/issues/2238
* chore: run alltests.yml for "alltestsbuild" branches
Part of https://github.com/ooni/probe/issues/2238
I've just branched off the `release/3.16` branch since we're
really looking good for release modulo minor changes.
Hence, it's time to update `master`'s version.
This diff modifies webconnectivity@v0.5 to take decisions regarding
TLS blocking by using the response from the TH rather than using
questionable heuristics based on inspecting the TLSHandshake list
alone. This change should improve correctness _when_ we're using
the improved TH, which is currently used for 50% of the probes.
See https://github.com/ooni/probe/issues/2257
While there, modify `control.go` to specify which control is being used.
As silly as it seems, emojis help _a lot_ when eyeballing logs
to quickly identify unexpected lines.
I'm doing this work as part of https://github.com/ooni/probe/issues/2257