## Checklist
- [x] I have read the [contribution guidelines](https://github.com/ooni/probe-cli/blob/master/CONTRIBUTING.md)
- [x] reference issue for this pull request: https://github.com/ooni/probe/issues/2158
- [x] if you changed anything related how experiments work and you need to reflect these changes in the ooni/spec repository, please link to the related ooni/spec pull request: https://github.com/ooni/spec/pull/250
## Description
This diff refactors the codebase to reimplement tlsping and tcpping
to use the step-by-step measurements style.
See docs/design/dd-003-step-by-step.md for more information on the
step-by-step measurement style.
* refactor: move tracex outside of engine/netx
Consistently with https://github.com/ooni/probe/issues/2121 and
https://github.com/ooni/probe/issues/2115, we can now move tracex
outside of engine/netx. The main reason why this makes sense now
is that the package is now changed significantly from the one
that we imported from ooni/probe-engine.
We have improved its implementation, which had not been touched
significantly for quite some time, and converted it to unit
testing. I will document tomorrow some extra work I'd like to
do with this package but likely could not do $soon.
* go fmt
* regen tutorials
The exercise already allowed me to notice issues such as fields not
being properly initialized by savers.
This is one of the last steps before moving tracex away from the
internal/netx package and into the internal package.
See https://github.com/ooni/probe/issues/2121
Tracex contained some fragile code that assembled HTTP measurements
from scattered events, which worked because we were sure we were
performing a single measurement at any given time.
This diff restructures the code to emit a transaction-start and a
transaction-done events only. We have basically removed all the other
events (which we were not using). We kept the transaction-start
though, because it may be useful to see it when reading events. In
any case, what matters here is that we're now using the transaction-done
event aline to generate the archival HTTP measurement.
Hence, the original issue has been addressed. We will possibly
do more refactoring in the future, but for now this seems sufficient.
Part of https://github.com/ooni/probe/issues/2121
The code that is now into the tracex package was written a long
time ago, so let's start to make it more in line with the coding
style of packages that were written more recently.
I didn't apply all the changes I'd like to apply in a single diff
and for now I am committing just this diff.
Broadly, what we need to do is:
1. improve documentation
2. ~always use pointer receivers (object receives have the issue
that they are not mutable by accident meaning that you can mutate
them but their state do not change after the call returns, which
is potentially a source of bugs in case you later refactor to use
a pointer receiver, so always use pointer receivers)
3. ~always avoid embedding (let's say we want to avoid embedding
for types we define and it's instead fine to embed types that are
defined in the stdlib: if later we add a new method, we will not
see a broken build and we'll probably forget to add the new method
to all wrappers -- conversely, if we're wrapping rather than
embedding, we'll see a broken build and act accordingly)
4. prefer unit tests and group tests by type being tested rather
than using a flat structure for tests
There's a coverage slippage that I'll compensate in a follow-up diff where I'll focus on unit testing.
Reference issue: https://github.com/ooni/probe/issues/2121
This diff creates a new package under netx called tracex that
contains everything we need to perform measurements using events
tracing and postprocessing (which is the technique with which
we implement most network experiments).
The general idea here is to (1) create a unique package out of
all of these packages; (2) clean up the code a bit (improve tests,
docs, apply more recent code patterns); (3) move the resulting
code as a toplevel package inside of internal.
Once this is done, netx can be further refactored to avoid
subpackages and we can search for more code to salvage/refactor.
See https://github.com/ooni/probe/issues/2121
These two small packages could easily be merged into the model
package, since they're clearly model-like packages.
Part of https://github.com/ooni/probe/issues/2115
This diff forward ports b6db4f64dc83a2a27ee3ce6bba5ac93db922832d, whose
original log message is the following:
- - -
We're now using ooni/oohttp as our HTTP library in most cases.
A limitation of this library is that net/http/httptrace does not
work very well and reliably because (1) we need to use oohttp's
version of that code and (2) we cannot observe net events.
I noticed this fact because an integration test for collecting
HTTP performance metrics was broken.
The best solution here is to remove this functionality, since
it was basically unused in the repository. Only some integration
tests inside urlgetter bothered with these metrics.
A more clinical fix would have been to use ooni/oohttp/httptrace
instead of net/http/httptrace in the stdlib, but it does not
seem to be a good idea, given that those metrics were not used.
With this diff applied, we'll further reduce the number of locally
failing integration tests to just jafar-specific tests.
This diff WILL need to be forwardported to `master`.
This diff contains significant improvements over the previous
implementation of the torsf experiment.
We add support for configuring different rendezvous methods after
the convo at https://github.com/ooni/probe/issues/2004. In doing
that, I've tried to use a terminology that is consistent with the
names being actually used by tor developers.
In terms of what to do next, this diff basically instruments
torsf to always rendezvous using domain fronting. Yet, it's also
possible to change the rendezvous method from the command line,
when using miniooni, which allows to experiment a bit more. In the
same vein, by default we use a persistent tor datadir, but it's
also possible to use a temporary datadir using the cmdline.
Here's how a generic invocation of `torsf` looks like:
```bash
./miniooni -O DisablePersistentDatadir=true \
-O RendezvousMethod=amp \
-O DisableProgress=true \
torsf
```
(The default is `DisablePersistentDatadir=false` and
`RendezvousMethod=domain_fronting`.)
With this implementation, we can start measuring whether snowflake
and tor together can boostrap, which seems the most important thing
to focus on at the beginning. Understanding why the bootstrap most
often does not converge with a temporary datadir on Android devices
remains instead an open problem for now. (I'll also update the
relevant issues or create new issues after commit this.)
We also address some methodology improvements that were proposed
in https://github.com/ooni/probe/issues/1686. Namely:
1. we record the tor version;
2. we include the bootstrap percentage by reading the logs;
3. we set the anomaly key correctly;
4. we measure the bytes send and received (by `tor` not by `snowflake`, since
doing it for snowflake seems more complex at this stage).
What remains to be done is the possibility of including Snowflake
events into the measurement, which is not possible until the new
improvements at common/event in snowflake.git are included into a
tagged version of snowflake itself. (I'll make sure to mention
this aspect to @cohosh in https://github.com/ooni/probe/issues/2004.)
* chore(netxlite): add currently failing test case
This diff introduces a test cases that will fail because of the reason
explained in https://github.com/ooni/probe/issues/1965.
* chore(netxlite/iox_test.go): add failing unit tests
These tests directly show how the Go implementation of ReadAll
and Copy has the issue of checking for io.EOF equality.
* fix(netxlite): make {ReadAll,Copy}Context robust to wrapped io.EOF
The fix is simple: we just need to check for `errors.Is(err, io.EOF)`
after either io.ReadAll or io.Copy has returned. When this condition is
true, we need to convert the error back to `nil` as it ought to be.
While there, observe that the unit tests I committed in the previous
commit are wrongly asserting that the error must be wrapped. This
assertion is not correct, because in both cases we have just ensured
that the returned error is `nil` (i.e., success).
See https://github.com/ooni/probe/issues/1965.
* cleanup: remove previous workaround for wrapped io.EOF
These workarounds were partial, meaning that they would cover some
cases in which the issue occurred but not all of them.
Handling the problem in `netxlite.{ReadAll,Copy}Context` is the
right thing to do _as long as_ we always use these functions instead
of `io.{ReadAll,Copy}`.
This is why it's now important to ensure we clearly mention that
inside of the `CONTRIBUTING.md` guide and to also ensure that we're
not using these functions in the code base.
* fix(urlgetter): repair tests who assumed to see EOF error
Now that we have established that we should normalize EOF when
reading bodies like the stdlib does and now that it's clear why
our behavior diverged from the stdlib, we also need to repair
all the tests that assumed this incorrect behavior.
* fix(all): don't use io{,util}.{Copy,ReadAll}
* feat: add checks to ensure we don't use io.{Copy,ReadAll}
* doc(netxlite): document we know how to deal w/ wrapped io.EOF
* fix(nocopyreadall.bash): add exception for i/n/iox.go
The DNSClient type existed because the Resolver type did not
include CloseIdleConnections in its signature.
Now that Resolver includes CloseIdleConnections, the DNSClient
type has become unnecessary and can be safely removed.
See https://github.com/ooni/probe/issues/1956.
This diff addresses two items of https://github.com/ooni/probe/issues/1956:
> - [ ] we can remove legacy names from `./internal/engine/netx/resolver/legacy.go`
>
> - [ ] we can remove `DialTLSContext` from `./internal/engine/netx/resolver/tls_test.go`
More cleanups may follow.
This diff implements the first two cleanups defined at https://github.com/ooni/probe/issues/1956:
> - [ ] observe that `netxlite` and `netx` differ in error wrapping only in the way in which we set `ErrWrapper.Operation`. Observe that the code using `netxlite` does not care about such a field. Therefore, we can modify `netxlite` to set such a field using the code of `netx` and we can remove `netx` specific code for errors (which currently lives inside of the `./internal/engine/legacy/errorsx` package
>
> - [ ] after we've done the previous cleanup, we can make all the classifiers code private, since there's no code outside `netxlite` that needs them
A subsequent diff will address the remaining cleanup.
While there, notice that there are failing, unrelated obfs4 tests, so disable them in short mode. (I am confident these tests are unrelated because they fail for me when running test locally from the `master` branch.)
Since https://github.com/ooni/probe-cli/pull/527, if an experiment
returns an error, the corresponding measurement is not submitted since
the semantics of returning an error is that something fundamental
went wrong (e.g., we could not parse the input URL).
This diff ensures that all experiments only return and error when
something fundamental was wrong and return nil otherwise.
Reference issue: https://github.com/ooni/probe/issues/1808.
## Checklist
- [x] I have read the [contribution guidelines](https://github.com/ooni/probe-cli/blob/master/CONTRIBUTING.md)
- [x] reference issue for this pull request: https://github.com/ooni/probe/issues/1885
- [x] related ooni/spec pull request: N/A
Location of the issue tracker: https://github.com/ooni/probe
## Description
This PR contains a set of changes to move important interfaces and data types into the `./internal/model` package.
The criteria for including an interface or data type in here is roughly that the type should be important and used by several packages. We are especially interested to move more interfaces here to increase modularity.
An additional side effect is that, by reading this package, one should be able to understand more quickly how different parts of the codebase interact with each other.
This is what I want to move in `internal/model`:
- [x] most important interfaces from `internal/netxlite`
- [x] everything that was previously part of `internal/engine/model`
- [x] mocks from `internal/netxlite/mocks` should also be moved in here as a subpackage
When preparing a tutorial for netxlite, I figured it is easier
to tell people "hey, this is the package you should use for all
low-level networking stuff" rather than introducing people to
a set of packages working together where some piece of functionality
is here and some other piece is there.
Part of https://github.com/ooni/probe/issues/1591
We need still to add similar wrappers to internal/netxlite but we
will adopt a saner approach to error wrapping this time.
See https://github.com/ooni/probe/issues/1591
The legacy part for now is internal/errorsx. It will stay there until
I figure out whether it also needs some extra bug fixing.
The good part is now in internal/netxlite/errorsx and contains all the
logic for mapping errors. We need to further improve upon this logic
by writing more thorough integration tests for QUIC.
We also need to copy the various dialer, conn, etc adapters that set
errors. We will put them inside netxlite and we will generate errors in
a way that is less crazy with respect to the major operation. (The
idea is to always wrap, given that now we measure in an incremental way
and we don't measure every operation together.)
Part of https://github.com/ooni/probe/issues/1591
## Description
This PR continues the refactoring of `netx` under the following principles:
1. do not break the rest of the tree and do not engage in extensive tree-wide refactoring yet
2. move under `netxlite` clearly related subpackages (e.g., `iox`, `netxmocks`)
3. move into `internal/netxlite/internal` stuff that is clearly private of `netxlite`
4. hide implementation details in `netxlite` pending new factories
5. refactor `tls` code in `netxlite` to clearly separate `crypto/tls` code from `utls` code
After each commit, I run `go test -short -race ./...` locally. Each individual commit explains what it does. I will squash, but this operation will preserve the original commit titles, so this will give further insight on each step.
## Commits
* refactor: rename netxmocks -> netxlite/mocks
Part of https://github.com/ooni/probe/issues/1591
* refactor: rename quicx -> netxlite/quicx
See https://github.com/ooni/probe/issues/1591
* refactor: rename iox -> netxlite/iox
Regenerate sources and make sure the tests pass.
See https://github.com/ooni/probe/issues/1591.
* refactor(iox): move MockableReader to netxlite/mocks
See https://github.com/ooni/probe/issues/1591
* refactor(netxlite): generator is an implementation detail
See https://github.com/ooni/probe/issues/1591
* refactor(netxlite): separate tls and utls code
See https://github.com/ooni/probe/issues/1591
* refactor(netxlite): hide most types but keep old names as legacy
With this change we avoid breaking the rest of the tree, but we start
hiding some implementation details a bit. Factories will follow.
See https://github.com/ooni/probe/issues/1591
* fix(all): introduce and use iox.CopyContext
This PR is part of https://github.com/ooni/probe/issues/1417.
In https://github.com/ooni/probe-cli/pull/379 we introduced a context
aware wrapper for io.ReadAll (formerly ioutil.ReadAll).
Here we introduce a context aware wrapper for io.Copy.
* fix(humanize): more significant digits
* fix: rename humanize files to follow the common pattern
* fix aligment
* fix test
* refactor(atomicx): move outside the engine package
After merging probe-engine into probe-cli, my impression is that we have
too much unnecessary nesting of packages in this repository.
The idea of this commit and of a bunch of following commits will instead
be to reduce the nesting and simplify the structure.
While there, improve the documentation.
* fix: always use the atomicx package
For consistency, never use sync/atomic and always use ./internal/atomicx
so we can just grep and make sure we're not risking to crash if we make
a subtle mistake on a 32 bit platform.
While there, mention in the contributing guidelines that we want to
always prefer the ./internal/atomicx package over sync/atomic.
* fix(atomicx): remove unnecessary constructor
We don't need a constructor here. The default constructed `&Int64{}`
instance is already usable and the constructor does not add anything to
what we are doing, rather it just creates extra confusion.
* cleanup(atomicx): we are not using Float64
Because atomicx.Float64 is unused, we can safely zap it.
* cleanup(atomicx): simplify impl and improve tests
We can simplify the implementation by using defer and by letting
the Load() method call Add(0).
We can improve tests by making many goroutines updated the
atomic int64 value concurrently.
* refactor(fsx): can live in the ./internal pkg
Let us reduce the amount of nesting. While there, ensure that the
package only exports the bare minimum, and improve the documentation
of the tests, to ease reading the code.
* refactor: move runtimex to ./internal
* refactor: move shellx into the ./internal package
While there, remove unnecessary dependency between packages.
While there, specify in the contributing guidelines that
one should use x/sys/execabs instead of os/exec.
* refactor: move ooapi into the ./internal pkg
* refactor(humanize): move to ./internal and better docs
* refactor: move platform to ./internal
* refactor(randx): move to ./internal
* refactor(multierror): move into the ./internal pkg
* refactor(kvstore): all kvstores in ./internal
Rather than having part of the kvstore inside ./internal/engine/kvstore
and part in ./internal/engine/kvstore.go, let us put every piece of code
that is kvstore related into the ./internal/kvstore package.
* fix(kvstore): always return ErrNoSuchKey on Get() error
It should help to use the kvstore everywhere removing all the
copies that are lingering around the tree.
* sessionresolver: make KVStore mandatory
Simplifies implementation. While there, use the ./internal/kvstore
package rather than having our private implementation.
* fix(ooapi): use the ./internal/kvstore package
* fix(platform): better documentation
* urlgetter: fix tunnel test
This diff fixes the urlgetter test suite to make sure we
are correctly testing for tunnel creation.
While there, improve the way in which we create a testing
directory and add a test for that.
Part of https://github.com/ooni/probe/issues/985.
* fix comment
* fix comment
* feat: create tunnel inside NewSession
We want to create the tunnel when we create the session. This change
allows us to nicely ignore the problem of creating a tunnel when we
already have a proxy, as well as the problem of locking. Everything is
happening, in fact, inside of the NewSession factory.
Modify miniooni such that --tunnel is just syntactic sugar for
--proxy, at least for now. We want, in the future, to teach the
tunnel to possibly use a socks5 proxy.
Because starting a tunnel is a slow operation, we need a context in
NewSession. This causes a bunch of places to change. Not really a big
deal except we need to propagate the changes.
Make sure that the mobile code can create a new session using a
proxy for all the APIs we support.
Make sure all tests are still green and we don't loose coverage of
the various ways in which this code could be used.
This change is part of https://github.com/ooni/probe/issues/985.
* changes after merge
* fix: only keep tests that can hopefully work
While there, identify other places where we should add more
tests or fix integration tests.
Part of https://github.com/ooni/probe/issues/985
* feat(tunnel): introduce persistent tunnel state dir
This diff introduces a persistent state directory for tunnels, so that
we can bootstrap them more quickly after the first time.
Part of https://github.com/ooni/probe/issues/985
* fix: make tunnel dir optional
We have many tests where it does not make sense to explicitly
provide a tunnel dir because we're not using tunnels.
This should simplify setting up a session.
* fix(tunnel): repair tests
* final changes
* more cleanups
We're trying to remove a circular dependency between the measurement
Session and the tunnel package. To this end, continue to reduce the
dependency scope by providing TorArgs and TorBinary directly.
Part of https://github.com/ooni/probe/issues/985
* fix(pkg.go.dev): import a subpackage containing the assets
We're trying to fix this issue that pkg.go.dev does not build.
Thanks to @hellais for this very neat idea! Let's keep our
fingers crossed and see whether it fixes!
* feat: use embedded geoip databases
Closes https://github.com/ooni/probe/issues/1372.
Work done as part of https://github.com/ooni/probe/issues/1369.
* fix(assetsx): add tests
* feat: simplify and just vendor uncompressed DBs
* remove tests that seems not necessary anymore
* fix: run go mod tidy
* Address https://github.com/ooni/probe-cli/pull/260/files#r605181364
* rewrite a test in a better way
* fix: gently cleanup the legacy assetsdir
Do not remove the whole directory with brute force. Just zap the
files whose name we know. Then attempt to delete the legacy directory
as well. If not empty, just fail. This is fine because it means the
user has stored other files inside the directory.
* fix: create .miniooni if missing
* refactor: move more commands to internal/cmd
Part of https://github.com/ooni/probe/issues/1335.
We would like all commands to be at the same level of engine
rather than inside engine (now that we can do it).
* fix: update .gitignore
* refactor: also move jafar outside engine
* We should be good now?
* refactor: start building an Android package
Part of https://github.com/ooni/probe/issues/1335.
This seems also a good moment to move some packages out of the
engine, e.g., oonimkall. This package, for example, is a consumer
of the engine, so it makes sense it's not _inside_ it.
* fix: committed some stuff I didn't need to commit
* fix: oonimkall needs to be public to build
The side effect is that we will probably need to bump the major
version number every time we change one of these APIs.
(We can also of course choose to violate the basic guidelines of Go
software, but I believe this is bad form.)
I have no problem in bumping the major quite frequently and in
any case this monorepo solution is convinving me more than continuing
to keep a split between engine and cli. The need to embed assets to
make the probe more reliable trumps the negative effects of having to
~frequently bump major because we expose a public API.
* fix: let's not forget about libooniffi
Honestly, I don't know what to do with this library. I added it
to provide a drop in replacement for MK but I have no idea whether
it's used and useful. I would not feel comfortable exposing it,
unlike oonimkall, since we're not using it.
It may be that the right thing to do here is just to delete the
package and reduce the amount of code we're maintaining?
* woops, we're still missing the publish android script
* fix(publish-android.bash): add proper API key
* ouch fix another place where the name changed
This is how I did it:
1. `git clone https://github.com/ooni/probe-engine internal/engine`
2. ```
(cd internal/engine && git describe --tags)
v0.23.0
```
3. `nvim go.mod` (merging `go.mod` with `internal/engine/go.mod`
4. `rm -rf internal/.git internal/engine/go.{mod,sum}`
5. `git add internal/engine`
6. `find . -type f -name \*.go -exec sed -i 's@/ooni/probe-engine@/ooni/probe-cli/v3/internal/engine@g' {} \;`
7. `go build ./...` (passes)
8. `go test -race ./...` (temporary failure on RiseupVPN)
9. `go mod tidy`
10. this commit message
Once this piece of work is done, we can build a new version of `ooniprobe` that
is using `internal/engine` directly. We need to do more work to ensure all the
other functionality in `probe-engine` (e.g. making mobile packages) are still WAI.
Part of https://github.com/ooni/probe/issues/1335