Commit Graph

549 Commits

Author SHA1 Message Date
Simone Basso
5904e6988d
fix(netxlite): map servfail error (#728)
This error occurred for example when querying kazemjalali.com
in websteps measurements run from Iran.

This error is relatively uncommon, but it still makes sense to
create a specific mapping rule for it.

Originally: 4269e82fbd

See https://github.com/ooni/probe/issues/2096
2022-05-13 19:25:22 +02:00
Simone Basso
b872dd0e1e
fix(netxlite): HTTPSSvc: better no_answer checks (#727)
I've seen some measurements returning some IP addresses for HTTPSSvc
queries but not returning any ALPN value.

For example:

```
% d4
decoding DNS round trip 0:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57768
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;psiphon.ca.                    IN      HTTPS

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57768
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;psiphon.ca.                    IN      HTTPS
;; ANSWER SECTION:
psiphon.ca.             121     IN      A       31.13.85.53
```

Now, the response is clearly bogus. At the time of this writing that
IP address belongs to Facebook. This measurement has been collected in
China, so it's expected for the GFW to behave like this.

Yet, I don't feel like it's accurate to report this measurement as a
"no answer" response. Rather, this response is a valid one containing
a clearly invalid IP address and should be flagged as such.

Originally: 57a023bcf4

See https://github.com/ooni/probe/issues/2096
2022-05-13 19:00:51 +02:00
Simone Basso
192dfd49b4
fix(netxlite): consolidate IPv4/IPv6 checking code (#726)
Originally 966e7f7cdd

See https://github.com/ooni/probe/issues/2096
2022-05-13 18:49:18 +02:00
Simone Basso
e126e73de7
fix(netxlite): LookupHTTPS: short circuit IP addr (#725)
This diff fixes the short-circuit-IP-addr resolver to
correctly handle IP addrs during LookupHTTPS.

The original diff was: 2b51d144bf

See https://github.com/ooni/probe/issues/2096

While there, add unit tests for IPv6.
2022-05-13 18:26:15 +02:00
Simone Basso
ec0561ea8c
feat(netxlite): implement parallel resolver (#724)
This diff imports the parallel resolver from websteps winter 2022
edition, which was originally implemented here:

55231d73cd

See https://github.com/ooni/probe/issues/2096
2022-05-13 17:36:58 +02:00
Simone Basso
0efd4ff130
chore: import improved bogons handling code (#723)
This diff imports improved bogons handling code from websteps
winter 2022 edition's repository.

See https://github.com/ooni/probe/issues/2095

See a65f3e8579/internal/netxlite/bogon.go
2022-05-13 15:32:47 +02:00
Simone Basso
1776ea1288
cleanup: remove websteps summer 2021 implementation (#722)
See https://github.com/ooni/probe/issues/2094
2022-05-13 15:06:03 +02:00
Yeganathan S
ded4b08113
fix(ndt7): discards all incoming websockets messages during upload (#719)
See https://github.com/ooni/probe/issues/2084
2022-05-12 08:18:05 +02:00
Simone Basso
b7cc309901
feat: re-implement the vanilla_tor experiment (#718)
This diff re-implements the vanilla_tor experiment. This experiment was
part of the ooni/probe-legacy implementation.

The reference issue is https://github.com/ooni/probe/issues/803. We didn't
consider the possible improvements mentioned by the
https://github.com/ooni/probe/issues/803#issuecomment-598715694 comment,
which means we'll need to create a follow-up issue for them. We will
then decide whether, when, and how to implement those follow-up measurements
either into `vanilla_tor` or into the existing `tor` experiment.

This novel `vanilla_tor` implementation emits test_keys that are mostly
compatible with the original implementation, however:

1. the `timeout` is a `float64` rather than integer (but the default
timeout is an integer, so there are no JSON-visible changes);

2. the `tor_log` string is gone and replaced by the `tor_logs` list
of strings, which contains the same information;

3. the definition of `error` has been augmented to include the
case in which there is an unknown error;

4. the implementation of vanilla_tor mirrors closely the one of torsf
and we have taken steps to make the two implementations as comparable
as possible in terms of the generated JSON measurement.

The main reason why we replaced `tor_log` with `tor_logs` are:

1. that `torsf` already used that;

2. that reading the JSON is easier with this implementation compared to
an implementation where all logs are into the same string.

If one is processing the new data format using Python, then it will
not be difficult convert `tor_log` to `tor_logs`. In any case, because
we extract the most interesting fields (e.g., the percentage of the
bootstrap where tor fails), it seems that logs are probably more useful
as something you want to read in edge cases (I guess).

Also, because we want `torsf` and `vanilla_tor` to have similar JSONs,
we renamed `torsf`'s `default_timeout` to `timeout`. This change has little
to none real-world impact, because no stable version of OONI Probe has
ever shipped a `torsf` producing the `default_timeout` field.

Regarding the structure of this diff, we have:

1. factored code to parse tor logs into a separate package;

2. implemented `vanilla_tor` as a stripped down `torsf` and added further
changes to ensure compatibility with the previous `vanilla_tor`'s data format;

3. improved `torsf` to merge back the changes in `vanilla_tor`, so the two
data formats of the two experiments are as similar as possible.

We believe producing as similar as possible data formats helps anyone who's
reading measurements generated by both experiments.

We have retained/introduced `vanilla_tor`'s `error` field, which is not very
useful when one has a more precise failure but is still what `vanilla_tor`
used to emit, so it makes sense to also have this field.

In addition to changing the implementation, we also updated the specs.

As part of our future work, we may want to consider factoring the common code
of these two experiments into the same underlying support library.
2022-05-10 15:43:28 +02:00
Yeganathan S
3d81845614
fix(httpx): correctly combine paths (#706)
See https://github.com/ooni/probe/issues/2010

Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-05-09 21:32:49 +02:00
Simone Basso
36ca28d673
feat: add a simple dnsping experiment (#674)
See https://github.com/ooni/probe/issues/1987 (issue).

See https://github.com/ooni/spec/pull/238 (impl).

While there, fix the build for go1.18 by adding go1.18 specific tests. I was
increasingly bothered by the build being red.
2022-05-09 15:28:18 +02:00
Simone Basso
a7a6d7df7f
feat: introduce the simplequicping experiment (#717)
See https://github.com/ooni/probe/issues/2091 (issue) and https://github.com/ooni/spec/pull/237 (spec).
2022-05-09 11:22:44 +02:00
Simone Basso
2917dd6c76
feat: introduce the tlsping experiment (#716)
See https://github.com/ooni/probe/issues/2088 (issue) and https://github.com/ooni/spec/pull/236 (spec).
2022-05-09 10:25:50 +02:00
Simone Basso
e983a5cffb
feat: introduce the tcpping experiment (#696)
See https://github.com/ooni/probe/issues/2030 (reference issue) and https://github.com/ooni/spec/pull/235 (spec).
2022-05-09 09:33:18 +02:00
DecFox
5d2afaade4
cli: upgrade to lucas-clemente/quic-go@v0.27.0 (#715)
* quic-go upgrade: replaced Session/EarlySession with Connection/EarlyConnection

* quic-go upgrade: added context to RoundTripper.Dial

* quic-go upgrade: made corresponding changes to tutorial

* quic-go upgrade: changed sess variable instances to qconn

* quic-go upgrade: made corresponding changes to tutorial

* cleanup: remove unnecessary comments

Those comments made sense in terms of illustrating the changes
but they're going to be less useful once we merge.

* fix(go.mod): apparently we needed `go1.18.1 mod tidy`

VSCode just warned me about this. It seems fine to apply this
change as part of the pull request at hand.

* cleanup(netxlite): http3dialer can be removed

We used to use http3dialer to glue a QUIC dialer, which had a
context as its first argument, to the Dial function used by the
HTTP3 transport, which did not have a context as its first
argument.

Now that HTTP3 transport has a Dial function taking a context as
its first argument, we don't need http3dialer
anymore, since we can use the QUIC dialer directly.

Cc: @DecFox

* Revert "cleanup(netxlite): http3dialer can be removed"

This reverts commit c62244c620cee5fadcc2ca89d8228c8db0b96add
to investigate the build failure mentioned at
https://github.com/ooni/probe-cli/pull/715#issuecomment-1119450484

* chore(netx): show that test was already broken

We didn't see the breakage before because we were not using
the created transport, but the issue of using a nil dialer was
already present before, we just didn't see it.

Now we understand why removing the http3transport in
c62244c620cee5fadcc2ca89d8228c8db0b96add did cause the
breakage mentioned at
https://github.com/ooni/probe-cli/pull/715#issuecomment-1119450484

* fix(netx): convert broken integration test to working unit test

There's no point in using the network here. Add a fake dialer that
breaks and ensure we're getting the expected error.

We've now improved upon the original test because the original test was
not doing anything while now we're testing whether we get back a QUIC
dialer that _can be used_.

After this commit, I can then readd the cleanup commit
c62244c620cee5fadcc2ca89d8228c8db0b96add and it won't be
broken anymore (at least, this is what I expected to happen).

* Revert "Revert "cleanup(netxlite): http3dialer can be removed""

This reverts commit 0e254bfc6ba3bfd65365ce3d8de2c8ec51b925ff
because now we should have fixed the broken test.

Co-authored-by: decfox <decfox>
Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-05-06 12:24:03 +02:00
DecFox
a72cc7151c
tls_handshakes: add endpoint addresses to handshake list (#711)
* tls_handshakes: add IP addresses

* tls_handshakes: extract ip from tcp-connect

* tls_handshake: switched to trace event

* saver.go: get remoteAddr before handshake

Not sure whether this is strictly necessary, but I'd rather take the
remoteAddr before calling Handshake, just in case a future version
of the handshake closes the `conn`. In such a case, `conn.RemoteAddr`
would return `nil` and we would crash here.

This occurred to me while reading once again the diff before merging.

Co-authored-by: decfox <decfox>
Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-05-06 11:09:54 +02:00
DecFox
b81af5b058
feat(torsf): add default_timeout test keys (#709)
See https://github.com/ooni/probe/issues/2061
2022-05-06 10:47:26 +02:00
ParitoshKabra
4c55102789
fix(torsf): ensure tor-logs-filtering regexp is correct (#707)
* Fix Regex in TorProgressRegex

* fix: update regexp link

As suggested by @hellais

Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-05-06 10:36:26 +02:00
Simone Basso
d3c5196474
fix(ooniprobe): use ooniprobe-cli-unattended for unattended runs (#714)
This diff changes the software name used by unattended runs for which
we did not override the default software name (`ooniprobe-cli`).

It will become `ooniprobe-cli-unattended`. This software name is in line
with the one we use for Android, iOS, and desktop unattended runs.

While working in this diff, I introduced string constants for the run
types and a string constant for the default software name.

See https://github.com/ooni/probe/issues/2081.
2022-04-29 13:41:09 +02:00
Simone Basso
306d18e466
chore: support go1.18 and update dependencies (#708)
Here's the squash of the following patches that enable support
for go1.18 and update our dependencies.

This diff WILL need to be backported to the release/3.14 branch.

* chore: use go1.17.8

See https://github.com/ooni/probe/issues/2067

* chore: upgrade to probe-assets@v0.8.0

See https://github.com/ooni/probe/issues/2067.

* chore: update dependencies and enable go1.18

As mentioned in 7a0d17ea91,
the tree won't build with `go1.18` unless we say it does.

So, not only here we need to update dependencies but also we
need to explicitly say `go1.18` in the `go.mod`.

This work is part of https://github.com/ooni/probe/issues/2067.

* chore(coverage.yml): run with go1.18

This change will give us a bare minimum confidence that we're
going to build our tree using version 1.18 of golang.

See https://github.com/ooni/probe/issues/2067.

* chore: update user agent used for measuring

See https://github.com/ooni/probe/issues/2067

* chore: run `go generate ./...`

See https://github.com/ooni/probe/issues/2067

* fix(dialer_test.go): make test work with go1.17 and go1.18

1. the original test wanted the dial to fail, so ensure we're not
passing any domain name to exercise dialing not resolving;

2. match the end of the error rather than the whole error string.

Tested locally with both go1.17 and go1.18.

See https://github.com/ooni/probe-cli/pull/708#issuecomment-1096447186
2022-04-12 11:43:12 +02:00
Dionysis Grigoropoulos
07f8db9dc2
feat: add support for OpenBSD (#703)
Closes https://github.com/ooni/probe/issues/2052
2022-03-08 12:25:33 +01:00
Yeganathan S
74e31d5cc1
cleanup: use ErrorToStringOrOK func in other tests that returns nil (#701)
Reference issue: https://github.com/ooni/probe/issues/2040
2022-03-08 11:59:44 +01:00
Simone Basso
024eb42334
fix(ndt7): force our bundled CA pool (#700)
This change should prevent old clients (e.g., Android 6) from
failing to perform a ndt7 experiment because their internal CA
bundle is now too old.

Reference issue: https://github.com/ooni/probe/issues/2031

While there, run `go mod tidy` to fix a minor inconsistence in
the current `go.mod` file.

This diff WILL require a backport to release/3.14.
2022-02-23 12:59:03 +01:00
Yeganathan S
6e78cc2d71
chore: import DoH servers from DNSCrypt/dnscrypt-resolvers (#693)
See https://github.com/ooni/probe/issues/1969
2022-02-17 17:52:16 +01:00
Yeganathan S
6a63f1b044
fix(dnscheck): log "ok" rather than "<nil>" on success (#695)
See https://github.com/ooni/probe/issues/2020
2022-02-16 20:47:44 +01:00
kelmenhorst
88236a4352
feat: add an experimental quicping experiment (#677)
This experiment pings a QUIC-able host. It can be used to measure QUIC availability independently from TLS.
This is the reference issue: https://github.com/ooni/probe/issues/1994

### A QUIC PING is:
- a QUIC Initial packet with a size of 1200 bytes (minimum datagram size defined in the [RFC 9000](https://www.rfc-editor.org/rfc/rfc9000.html#initial-size)),
- with a random payload (i.e. no TLS ClientHello),
- with the version string 0xbabababa which forces Version Negotiation at the server.

QUIC-able hosts respond to the QUIC PING with a Version Negotiation packet.

The input is a domain name or an IP address. The default port used by quicping is 443, as this is the port used by HTTP/3. The port can be modified with the `-O Port=` option.
The default number of repetitions is 10, it can be changed with `-O Repetitions=`.

### Usage:
```
./miniooni -i google.com quicping
./miniooni -i 142.250.181.206 quicping
./miniooni -i 142.250.181.206 -OPort=443 quicping
./miniooni -i 142.250.181.206 -ORepetitions=2 quicping

```
2022-02-14 19:21:16 +01:00
kelmenhorst
0735e2018f
feat: add oonireport client (#682)
The oonireport client (re-)uploads a measurement report file. This can be helpful when the measurement was not uploaded at runtime.

Usage: `./oonireport upload <file>`, where `<file>` is a json(l) file containing one OONI measurement per line.

This pull request refers to https://github.com/ooni/probe/issues/2003 and https://github.com/ooni/probe/issues/950.

Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-02-14 15:24:36 +01:00
Ain Ghazal
00b5c73c3a
jafar(README.md): fix typo (#692)
Co-authored-by: Ain Ghazal <ainghazal@riseup.net>
2022-02-10 17:38:51 +01:00
Simone Basso
7bbd36a434
[forwardport] fix(jafar/iptables/test): force using pure Go resolver (#690)
This commit forward ports 8f2d7945f806579af4d0495f4b8f5a6a01eefb0c, whose
commit message is as follows:

- - -

The discrepancy I was seeing between my local tests and tests run
in the CI is that my systemd is configured to use DoT.

Hence, it was bypassing iptables rules because the query was sent
over an encrypted tunnel. Using a pure Go resolver fixes since
that always uses UDP, so the filter works.

Also, reason that we want as minimal as possible tests, so refactor
a test so that we use just a resolver rather than an HTTP client, and,
while there, also enforce this resolver to be a pure Go resolver.

Reference issue: https://github.com/ooni/probe/issues/2016

This diff WILL need to be forward ported to master.
2022-02-09 15:32:45 +01:00
Simone Basso
bf3c8bcdc3
[forwardport] fix(netx): stop collecting HTTP performance metrics (#689)
This diff forward ports b6db4f64dc83a2a27ee3ce6bba5ac93db922832d, whose
original log message is the following:

- - -

We're now using ooni/oohttp as our HTTP library in most cases.

A limitation of this library is that net/http/httptrace does not
work very well and reliably because (1) we need to use oohttp's
version of that code and (2) we cannot observe net events.

I noticed this fact because an integration test for collecting
HTTP performance metrics was broken.

The best solution here is to remove this functionality, since
it was basically unused in the repository. Only some integration
tests inside urlgetter bothered with these metrics.

A more clinical fix would have been to use ooni/oohttp/httptrace
instead of net/http/httptrace in the stdlib, but it does not
seem to be a good idea, given that those metrics were not used.

With this diff applied, we'll further reduce the number of locally
failing integration tests to just jafar-specific tests.

This diff WILL need to be forwardported to `master`.
2022-02-09 15:08:19 +01:00
Simone Basso
eed007a5d0
chore: start hacking on 3.15.0-alpha (#687)
We've just branched off the release/3.14 branch for finalizing
the release of 3.14.0, hence let's declare that from now on we're
3.15.0-alpha to avoid any confusion.
2022-02-09 14:15:50 +01:00
Simone Basso
024de0e498
fix(geolocate): enforce 7s timeout for each lookupper (#678)
This issue aims at making life slighly better for users impacted by
sanctions whose iplookup may be quite slow in case there are timeouts
as documented in https://github.com/ooni/probe/issues/1988.
2022-02-09 13:22:01 +01:00
Srijan Srivastava
f7fd29b246
geolocate: add cloudflare-based IP lookup (#676)
Cloudflare hosted services provide a certain service of `/cdn-cgi/trace` with their base url (for example, `www.cloudflare.com` or `www.nginx.com`), which can be used to obtain `ip` in the probe's `geolocate` feature.

The same feature was added in this pr, hence, increasing the number of `baseURL`s in `geolocate`.

Co-authored-by: Simone Basso <bassosimone@gmail.com>
2022-02-09 11:54:19 +01:00
Simone Basso
85664f1e31
feat(torsf): collect tor logs, select rendezvous method, count bytes (#683)
This diff contains significant improvements over the previous
implementation of the torsf experiment.

We add support for configuring different rendezvous methods after
the convo at https://github.com/ooni/probe/issues/2004. In doing
that, I've tried to use a terminology that is consistent with the
names being actually used by tor developers.

In terms of what to do next, this diff basically instruments
torsf to always rendezvous using domain fronting. Yet, it's also
possible to change the rendezvous method from the command line,
when using miniooni, which allows to experiment a bit more. In the
same vein, by default we use a persistent tor datadir, but it's
also possible to use a temporary datadir using the cmdline.

Here's how a generic invocation of `torsf` looks like:

```bash
./miniooni -O DisablePersistentDatadir=true \
           -O RendezvousMethod=amp \
           -O DisableProgress=true \
           torsf
```

(The default is `DisablePersistentDatadir=false` and
`RendezvousMethod=domain_fronting`.)

With this implementation, we can start measuring whether snowflake
and tor together can boostrap, which seems the most important thing
to focus on at the beginning. Understanding why the bootstrap most
often does not converge with a temporary datadir on Android devices
remains instead an open problem for now. (I'll also update the
relevant issues or create new issues after commit this.)

We also address some methodology improvements that were proposed
in https://github.com/ooni/probe/issues/1686. Namely:

1. we record the tor version;

2. we include the bootstrap percentage by reading the logs;

3. we set the anomaly key correctly;

4. we measure the bytes send and received (by `tor` not by `snowflake`, since
doing it for snowflake seems more complex at this stage).

What remains to be done is the possibility of including Snowflake
events into the measurement, which is not possible until the new
improvements at common/event in snowflake.git are included into a
tagged version of snowflake itself. (I'll make sure to mention
this aspect to @cohosh in https://github.com/ooni/probe/issues/2004.)
2022-02-07 17:05:36 +01:00
Simone Basso
d2fb7f8e6c
fix(jafar): re-enable previously broken integration test (#681)
I have tested this integration test locally and it's now WAI.

It may be that it will fail again when run on GitHub Actions, which will
indicate we cannot fully trust Actions for running _some_ tests.

Closes https://github.com/ooni/probe/issues/1913.
2022-02-01 14:47:22 +01:00
Yeganathan S
502ce1267a
fix(resolvermake): re-enable dns.google DoH HTTP3 resolutions (#680) 2022-01-31 19:12:43 +01:00
Simone Basso
ce8ec5b391
fix(reduceErrors): return error when given an empty list (#675)
See https://github.com/ooni/probe/issues/1985 for context.

While there, ensure nextlite has 100% of coverage.
2022-01-26 12:18:36 +01:00
Simone Basso
4d50dd6d54
fix(i/t/tor.go): show correct command line (#673)
While there, ensure that we print a warning if we cannot find
the correct tor binary.

Work part of https://github.com/ooni/probe/issues/1917.
2022-01-25 20:43:27 +01:00
Simone Basso
2a566f2046
feat: start preparing for a cli release (#672)
This diff includes some final changes to be ready for blessing
a cli release. These changes are:

1. run `go generate ./...` to update the bundled CA

2. update the header we use for measuring

3. ensure `mk` uses the latest version of several tools

Reference issue: https://github.com/ooni/probe/issues/1845
2022-01-24 14:56:51 +01:00
Simone Basso
d92c1641ac
feat: start adding torsf to desktop and mobile (#671)
This commit message is the same across probe-cli, probe-desktop,
and probe-android. With the changes contained in the enclosed
diff, I'm starting to add support for torsf for android and for
desktop.

When smoke testing that torsf was WAI, I also noticed that its
progress messages in output are too frequent. We may want to do
better in a future version when we'll be able to read `tor`'s
output. In the meanwhile, make the progress messages less
frequent and indicated the maximum runtime inside of the messages
themselves. This improved message, albeit not so nice from the
UX PoV, should at least provide a clue that we're not stuck.

Reference issue: https://github.com/ooni/probe/issues/1917
2022-01-24 12:39:27 +01:00
Simone Basso
a01f901e13
feat(ooniprobe): add torsf to experimental group (#670)
Reference issue: https://github.com/ooni/probe/issues/1917.

I needed to change the summary key type returned by `torsf` to be a value. It seems the DB layer assumes that. If we pass it a pointer, it panics because it's experiment a value rather than a pointer 🤷.
2022-01-21 12:32:08 +01:00
Simone Basso
97d2b5a0e3
chore: upgrade psiphon and go-cmp (#669)
I have experimented with a new approach for embedding psiphon in
7fc0bcd97c.

It seems the build is still building and the experiment is still
running. With the new approach, we're now vendoring less dependencies,
which hopefully puts us in the right track to, one day, import
Psiphon as a normal Go dependency.

I'll make sure to report to the Psiphon team what is currently
preventing us from importing their ClientLibrary directly.

This work is part of https://github.com/ooni/probe/issues/1894.

As part of running the update, I run `go get -u -v ./...`, which
led to go-cmp also being updated in the process.
2022-01-21 11:54:48 +01:00
Simone Basso
cfb054efd4
feat(snowflake): upgrade to v2 (+ small tweaks) (#667)
This diff contains the following changes and enhancements:

1. upgrade snowflake to v2

2. observe that we were not changing defaults from outside of snowflake.go, so remove code allowing to do that;

3. bump the timeout to 600 seconds (it seems 300 was not always enough based on my testing);

4. add useful knob to disable `torsf` progress (it's really annoying on console, we should do something about this);

5. ptx.go: avoid printing an error when the connection has just been closed;

6. snowflake: test AMP cache, see that it's not working currently, so leave it disabled.

Related issues: https://github.com/ooni/probe/issues/1845, https://github.com/ooni/probe/issues/1894, and https://github.com/ooni/probe/issues/1917.
2022-01-19 17:23:27 +01:00
Simone Basso
e904b90006
feature: merge measurex and netx archival layer (1/N) (#663)
This diff introduces a new package called `./internal/archival`. This package collects data from `./internal/model` network interfaces (e.g., `Dialer`, `QUICDialer`, `HTTPTransport`), saves such data into an internal tabular data format suitable for on-line processing and analysis, and allows exporting data into the OONI data format.

The code for collecting and the internal tabular data formats are adapted from `measurex`. The code for formatting and exporting OONI data-format-compliant structures is adapted from `netx/archival`.

My original objective was to _also_ (1) fully replace `netx/archival` with this package and (2) adapt `measurex` to use this package rather than its own code. Both operations seem easily feasible because: (a) this code is `measurex` code without extensions that are `measurex` related, which will need to be added back as part of the process; (b) the API provided by this code allows for trivially converting from using `netx/archival` to using this code.

Yet, both changes should not be taken lightly. After implementing them, there's need to spend some time doing QA and ensuring all nettests work as intended. However, I am planning a release in the next two weeks, and this QA task is likely going to defer the release. For this reason, I have chosen to commit the work done so far into the tree and defer the second part of this refactoring for a later moment in time. (This explains why the title mentions "1/N").

On a more high-level perspective, it would also be beneficial, I guess, to explain _why_ I am doing these changes. There are two intertwined reasons. The first reason is that `netx/archival` has shortcomings deriving from its original https://github.com/ooni/netx legacy. The most relevant shortcoming is that it saves all kind of data into the same tabular structure named `Event`. This design choice is unfortunate because it does not allow one to apply data-type specific logic when processing the results. In turn, this choice results in complex processing code. Therefore, I believe that replacing the code with event-specific data structures is clearly an improvement in terms of code maintainability and would quite likely lead us to more confidently change and evolve the codebase.

The second reason why I would like to move forward these changes is to unify the codepaths used for measuring. At this point in time, we basically have two codepaths: `./internal/engine/netx` and `./internal/measurex`. They both have pros and cons and I don't think we want to rewrite whole experiments using `netx`. Rather, what we probably want is to gradually merge these two codepaths such that `netx` is a set of abstractions on top of `measurex` (which is more low-level and has a more-easily-testable design). Because saving events and generating an archival data format out of them consists of at least 50% of the complexity of both `netx` and `measurex`, it seems reasonable to unify this archival-related part of the two codebases as the first step.

At the highest level of abstraction, these changes are part of the train of changes which will eventually lead us to bless `websteps` as a first class citizen in OONI land. Because `websteps` requires different underlying primitives, I chose to develop these primitives from scratch rather than wrestling with `netx`, which used another model. The model used by `websteps` is that we perform each operation in isolation and immediately we save the results, while `netx` creates whole data structures and collects all the events happening via tracing. We believe the model used by `websteps` to be better because it does not require your code to figure out everything that happened after the measurement, which is a source of subtle bugs in the current implementation. So, when I started implementing websteps I extracted the bits of `netx` that could also be beneficial to `websteps` into a separate library, thus `netxlite` was born.

The reference issue describing merging the archival of `netx` and `measurex` is https://github.com/ooni/probe/issues/1957. As of this writing the issue still references the original plan, which I could not complete by the end of this Sprint, so I am going to adapt the text of the issue to only refer to what was done in here next. Of course, I also need follow-up issues.
2022-01-14 12:13:10 +01:00
Simone Basso
b5da8be183
fix(netxlite): robust {ReadAll,Copy}Context with wrapped io.EOF (#661)
* chore(netxlite): add currently failing test case

This diff introduces a test cases that will fail because of the reason
explained in https://github.com/ooni/probe/issues/1965.

* chore(netxlite/iox_test.go): add failing unit tests

These tests directly show how the Go implementation of ReadAll
and Copy has the issue of checking for io.EOF equality.

* fix(netxlite): make {ReadAll,Copy}Context robust to wrapped io.EOF

The fix is simple: we just need to check for `errors.Is(err, io.EOF)`
after either io.ReadAll or io.Copy has returned. When this condition is
true, we need to convert the error back to `nil` as it ought to be.

While there, observe that the unit tests I committed in the previous
commit are wrongly asserting that the error must be wrapped. This
assertion is not correct, because in both cases we have just ensured
that the returned error is `nil` (i.e., success).

See https://github.com/ooni/probe/issues/1965.

* cleanup: remove previous workaround for wrapped io.EOF

These workarounds were partial, meaning that they would cover some
cases in which the issue occurred but not all of them.

Handling the problem in `netxlite.{ReadAll,Copy}Context` is the
right thing to do _as long as_ we always use these functions instead
of `io.{ReadAll,Copy}`.

This is why it's now important to ensure we clearly mention that
inside of the `CONTRIBUTING.md` guide and to also ensure that we're
not using these functions in the code base.

* fix(urlgetter): repair tests who assumed to see EOF error

Now that we have established that we should normalize EOF when
reading bodies like the stdlib does and now that it's clear why
our behavior diverged from the stdlib, we also need to repair
all the tests that assumed this incorrect behavior.

* fix(all): don't use io{,util}.{Copy,ReadAll}

* feat: add checks to ensure we don't use io.{Copy,ReadAll}

* doc(netxlite): document we know how to deal w/ wrapped io.EOF

* fix(nocopyreadall.bash): add exception for i/n/iox.go
2022-01-12 14:26:10 +01:00
Simone Basso
d3c6c11e48
cleanup(netx): remove the DNSClient type (#660)
The DNSClient type existed because the Resolver type did not
include CloseIdleConnections in its signature.

Now that Resolver includes CloseIdleConnections, the DNSClient
type has become unnecessary and can be safely removed.

See https://github.com/ooni/probe/issues/1956.
2022-01-10 11:53:06 +01:00
Simone Basso
730373cc75
refactor: move i/netx/archival structs to i/model (#659)
We recently started moving core data structures inside of the
internal/model package as detailed in https://github.com/ooni/probe/issues/1885.

The chief reason to do that is to have a set of fundamental
shared data types to help us rationalize the codebase.

This specific diff moves internal/netx/archival's core data types
inside the internal/model package. While there, it also refactors the
existing tests to improve their quality. Additionally, we also added
an extra test to ensure `ArchivalHTTPBody` is an alias for
`ArchivalMaybeBinaryData`, which is required to ensure the
custom JSON serialization process works for it.

We're doing that because both internal/netx/archival and
internal/measurex define their own archival data structures.

We developed measurex using its own structures because it
allowed to iterate more quickly. Now that we have sketched
out measurex, the time has come to consolidate.

My overall aim is to spend a few more hours this week on
engineering measurex. This work is preliminary work before
we finish up both measurex and websteps.

We described this cleanup in https://github.com/ooni/probe/issues/1957.
2022-01-10 11:25:52 +01:00
Simone Basso
554ae47c5a
cleanup(netx): remove more legacy names and functions (#658)
This diff addresses two items of https://github.com/ooni/probe/issues/1956:

> - [ ] we can remove legacy names from `./internal/engine/netx/resolver/legacy.go`
>
> - [ ] we can remove `DialTLSContext` from `./internal/engine/netx/resolver/tls_test.go`

More cleanups may follow.
2022-01-07 20:02:19 +01:00
Simone Basso
423a3feacc
cleanup(netx): remove unused ChainResolver (#657)
This is another cleanup point mentioned by https://github.com/ooni/probe/issues/1956.

While there, fix a bunch of comments in jafar that were incorrectly
referring to the netx package name.
2022-01-07 19:18:33 +01:00
Simone Basso
566c6b246a
cleanup: remove unnecessary legacy interfaces (#656)
This diff addresses another point of https://github.com/ooni/probe/issues/1956:

> - [ ] observe that we're still using a bunch of private interfaces for common interfaces such as the `Dialer`, so we can get rid of these private interfaces and always use the ones in `model`, which allows us to remove a bunch of legacy wrappers

Additional cleanups may still be possible. The more I cleanup, the more I see
there's extra legacy code we can dispose of (which seems good?).
2022-01-07 18:33:37 +01:00