We introduce a fork of internal/httpx, named internal/httpapi, where there is a clear split between the concept of an API endpoint (such as https://0.th.ooni.org/) and of an API descriptor (such as using `GET` to access /api/v1/test-list/url).
Additionally, httpapi allows to create a SequenceCaller that tries to call a given API descriptor using multiple API endpoints. The SequenceCaller will stop once an endpoint works or when all the available endpoints have been tried unsuccessfully.
The definition of "success" is the following: we consider "failure" any error that occurs during the HTTP round trip or when reading the response body. We DO NOT consider "failure" errors (1) when parsing the input URL; (2) when the server returns >= 400; (3) when the server returns a string that does not parse as valid JSON. The idea of this classification of failures is that we ONLY want to retry when we see what looks like a network error that may be caused by (collateral or targeted) censorship.
We take advantage of the availability of this new package and we refactor web_connectivity@v0.4 and web_connectivity@v0.5 to use a SequenceCaller for calling the web connectivity TH API. This means that we will now try all the available THs advertised by the backend rather than just selecting and using the first one provided by the backend.
Because this diff is designed to be backported to the `release/3.16` branch, we have omitted additional changes to always use httpapi where we are currently using httpx. Yet, to remind ourselves about the need to do that, we have deprecated the httpx package. We will rewrite all the code currently using httpx to use httpapi as part of future work.
It is also worth noting that httpapi will allow us to refactor the backend code such that (1) we remove code to select a backend URL endpoint at the beginning and (2) we try several endpoints. The design of the code is such that we can add to the mix some endpoints using as `http.Client` a special client using a tunnel. This will allow us to automatically fallback backend queries.
Closes https://github.com/ooni/probe/issues/2353.
Related to https://github.com/ooni/probe/issues/1519.
This commit moves the TH structs and definitions to model. We don't want
oohelperd to depend on web_connectivity@v0.4.
Part of https://github.com/ooni/probe/issues/2240
A bunch of packages (including oohelperd) just need the ability to
use MaxMind-like databases. They don't need the additional functionality
implemented by the geolocate package. Such a package, in fact, is
mostly (if not only) needed by the engine package.
Therefore, move code to query MaxMind-like databases to a separate
package, and avoid depending on geolocate in all the packages for
which it's sufficient to use geoipx.
Part of https://github.com/ooni/probe/issues/2240
This diff introduces the following `oohelperd` enhancements:
1. measure both IP addresses resolved by the TH and IP addresses resolved by the probe;
2. when the URL scheme is http and there's no explicit port, measure both 80 and 443 (which will pay off big once we introduce support for optionally performing TLS handshakes);
3. include information about the probe and TH IP addresses into the results: who resolved each IP address, whether an address is a bogon, the ASN associated to an address.
This diff is part of https://github.com/ooni/probe/issues/2237
This diff implements the first two cleanups defined at https://github.com/ooni/probe/issues/1956:
> - [ ] observe that `netxlite` and `netx` differ in error wrapping only in the way in which we set `ErrWrapper.Operation`. Observe that the code using `netxlite` does not care about such a field. Therefore, we can modify `netxlite` to set such a field using the code of `netx` and we can remove `netx` specific code for errors (which currently lives inside of the `./internal/engine/legacy/errorsx` package
>
> - [ ] after we've done the previous cleanup, we can make all the classifiers code private, since there's no code outside `netxlite` that needs them
A subsequent diff will address the remaining cleanup.
While there, notice that there are failing, unrelated obfs4 tests, so disable them in short mode. (I am confident these tests are unrelated because they fail for me when running test locally from the `master` branch.)
1. we want optionally to log the body (we don't want to log the body
when we're fetching psiphon secrets or tor targets)
2. we want body logging to _also_ happen on error since this is quite
useful to debug possible errors when accessing the API
This diff adds the above functionality, which were previously
described in https://github.com/ooni/probe/issues/1951.
This diff also adds comprehensive testing.
As mentioned in https://github.com/ooni/probe/issues/1951, one of
the main issues I did see with httpx.APIClient is that in some cases
it's used in a very fragile way by probeservices.Client.
This happens in psiphon.go and tor.go, where we create a copy of
the APIClient and then modify it's Authorization field.
If we ever refactor probeservices.Client to take a pointer to
httpx.Client, we are now mutating the httpx.Client.
Of course, we don't want that to happen.
This diff attempts to address such a problem as follows:
1. we create a new APIClientTemplate type that holds the same
fields of an APIClient and allows to build an APIClient
2. we modify every user of APIClient to use APIClientTemplate
3. when we need an APIClient, we build it from the corresponding
template and, when we need to use a specific Authorization, we
use a build factory that sets APIClient.Authorization
4. we hide APIClient by renaming it apiClient and by defining
an interface called APIClient that allows to use it
So, now the codebase always uses the opaque APIClient interface to
issue API calls and always uses the APIClientTemplate to build an
opaque APIClient.
Boom! We have separated construction from usage and we are not
mutating in weird ways the APIClient anymore.
This PR starts to implement the refactoring described at https://github.com/ooni/probe/issues/1951. I originally wrote more patches than the ones in this PR, but overall they were not readable. Since I want to squash and merge, here's a reasonable subset of the original patches that will still be readable and understandable in the future.
## Checklist
- [x] I have read the [contribution guidelines](https://github.com/ooni/probe-cli/blob/master/CONTRIBUTING.md)
- [x] reference issue for this pull request: https://github.com/ooni/probe/issues/1885
- [x] related ooni/spec pull request: N/A
Location of the issue tracker: https://github.com/ooni/probe
## Description
This PR contains a set of changes to move important interfaces and data types into the `./internal/model` package.
The criteria for including an interface or data type in here is roughly that the type should be important and used by several packages. We are especially interested to move more interfaces here to increase modularity.
An additional side effect is that, by reading this package, one should be able to understand more quickly how different parts of the codebase interact with each other.
This is what I want to move in `internal/model`:
- [x] most important interfaces from `internal/netxlite`
- [x] everything that was previously part of `internal/engine/model`
- [x] mocks from `internal/netxlite/mocks` should also be moved in here as a subpackage
* [forwardport] fix(webconnectivity): send specific user agent (#615)
This forward ports b8c530388e66b2cc86abad26d077202782e4a823 to `master`.
See https://github.com/ooni/probe/issues/1902
* fix(websteps): send the correct user agent
Also related to https://github.com/ooni/probe/issues/1902: let's just
ensure that also websteps behaves in the correct way.
When preparing a tutorial for netxlite, I figured it is easier
to tell people "hey, this is the package you should use for all
low-level networking stuff" rather than introducing people to
a set of packages working together where some piece of functionality
is here and some other piece is there.
Part of https://github.com/ooni/probe/issues/1591
We need still to add similar wrappers to internal/netxlite but we
will adopt a saner approach to error wrapping this time.
See https://github.com/ooni/probe/issues/1591
The legacy part for now is internal/errorsx. It will stay there until
I figure out whether it also needs some extra bug fixing.
The good part is now in internal/netxlite/errorsx and contains all the
logic for mapping errors. We need to further improve upon this logic
by writing more thorough integration tests for QUIC.
We also need to copy the various dialer, conn, etc adapters that set
errors. We will put them inside netxlite and we will generate errors in
a way that is less crazy with respect to the major operation. (The
idea is to always wrap, given that now we measure in an incremental way
and we don't measure every operation together.)
Part of https://github.com/ooni/probe/issues/1591
* fix(pkg.go.dev): import a subpackage containing the assets
We're trying to fix this issue that pkg.go.dev does not build.
Thanks to @hellais for this very neat idea! Let's keep our
fingers crossed and see whether it fixes!
* feat: use embedded geoip databases
Closes https://github.com/ooni/probe/issues/1372.
Work done as part of https://github.com/ooni/probe/issues/1369.
* fix(assetsx): add tests
* feat: simplify and just vendor uncompressed DBs
* remove tests that seems not necessary anymore
* fix: run go mod tidy
* Address https://github.com/ooni/probe-cli/pull/260/files#r605181364
* rewrite a test in a better way
* fix: gently cleanup the legacy assetsdir
Do not remove the whole directory with brute force. Just zap the
files whose name we know. Then attempt to delete the legacy directory
as well. If not empty, just fail. This is fine because it means the
user has stored other files inside the directory.
* fix: create .miniooni if missing
* fix(webconnectivity): allow measuring https://1.1.1.1
There were two issues preventing us from doing so:
1. in netx, the address resolver was too later in the resolver
chain. Therefore, its result wasn't added to the events.
2. when building the DNSCache (in httpget.go), we didn't consider
the case where the input is an address. We need to treat this
case specially to make sure there is no DNSCache.
See https://github.com/ooni/probe/issues/1376.
* fix: add unit tests for code making the dnscache
* fix(netx): make sure all tests pass
* chore: bump webconnectivity version
* refactor: move more commands to internal/cmd
Part of https://github.com/ooni/probe/issues/1335.
We would like all commands to be at the same level of engine
rather than inside engine (now that we can do it).
* fix: update .gitignore
* refactor: also move jafar outside engine
* We should be good now?
This is how I did it:
1. `git clone https://github.com/ooni/probe-engine internal/engine`
2. ```
(cd internal/engine && git describe --tags)
v0.23.0
```
3. `nvim go.mod` (merging `go.mod` with `internal/engine/go.mod`
4. `rm -rf internal/.git internal/engine/go.{mod,sum}`
5. `git add internal/engine`
6. `find . -type f -name \*.go -exec sed -i 's@/ooni/probe-engine@/ooni/probe-cli/v3/internal/engine@g' {} \;`
7. `go build ./...` (passes)
8. `go test -race ./...` (temporary failure on RiseupVPN)
9. `go mod tidy`
10. this commit message
Once this piece of work is done, we can build a new version of `ooniprobe` that
is using `internal/engine` directly. We need to do more work to ensure all the
other functionality in `probe-engine` (e.g. making mobile packages) are still WAI.
Part of https://github.com/ooni/probe/issues/1335