This diff forward ports 90bf0629b957c912a5a6f3bb6c98ad3abb5a2ff6 to `master`.
If we close the channel to signal the end of a task we may panic
when some background goroutine tries to post on the channel.
This bug is rare but may happen.
See for example https://github.com/ooni/probe/issues/1438.
How can we improve?
First, let us add a timeout when sending to the channel. Given that
the channel is buffered and we have a generous timeout (1/4 of a
second), it's unlikely we will really block. But, in the event in
which a late message appears, we'll eventually _unblock_ when
sending with a timeout. So, now we don't have to worry anymore
about leaking forever a goroutine.
Then, let us change the protocol with which we signal that a task
is done. We used to close the channel. Now, instead we just
synchronously post a nil on the channel when done.
In turn, we interpret this nil to mean that the task is done when
we receive messages.
The _main_ different with respect to before is that now we are
asking the consumer of our API to drain the channel. Because
before we had a blocking channel, it seems to me we were already
requiring the consumer of the API to do that. Which means, I think
in practical terms it did not change much.
Finally, acknowledge that we don't need a specific state variable
to tell us we're done and simplify a little bit the API by
just making isRunning private and using the "we're done" signal
to determine whether we've stopped running the task.
All these changes should be enough to close https://github.com/ooni/probe/issues/1438.
This is required to implement websteps, which is currently tracked
by https://github.com/ooni/probe/issues/1733.
We introduce the concept of async runner. An async runner will
post measurements on a channel until it is done. When it is done,
it will close the channel to notify the reader about that.
This change causes sync experiments now to strictly return either
a non-nil measurement or a non-nil error.
While this is a pretty much obvious situation in golang, we had
some parts of the codebase that were not robust to this assumption
and attempted to submit a measurement after the measure call
returned an error.
Luckily, we had enough tests to catch this change in our assumption
and this is why there are extra docs and tests changes.
* feat(oonimkall): instrument code to understand CI issue
It seems ~difficult to reproduce the problem locally and I could not
see it after five runs of
```
go test -race -count 1 ./pkg/oonimkall/...
```
So, here's some diagnostic code that could help understanding the
reference issue https://github.com/ooni/probe/issues/1785.
Also, it seems the issue pops up much more frequently when running
CI anyway. So, I am going to leave this diff around and when it
appears again I have more context to fix the issue.
* fix(oonimkall): skip flaky test in short mode
See https://github.com/ooni/probe/issues/1785
Reference issue: https://github.com/ooni/probe/issues/1769
Motivation: The CI is failing. Those are integration tests. Let us figure out the issue when we approach release. Until we approach release, do not let those tests distracting us. Normal merges should only pass the `-short` tests.
* refactor(atomicx): move outside the engine package
After merging probe-engine into probe-cli, my impression is that we have
too much unnecessary nesting of packages in this repository.
The idea of this commit and of a bunch of following commits will instead
be to reduce the nesting and simplify the structure.
While there, improve the documentation.
* fix: always use the atomicx package
For consistency, never use sync/atomic and always use ./internal/atomicx
so we can just grep and make sure we're not risking to crash if we make
a subtle mistake on a 32 bit platform.
While there, mention in the contributing guidelines that we want to
always prefer the ./internal/atomicx package over sync/atomic.
* fix(atomicx): remove unnecessary constructor
We don't need a constructor here. The default constructed `&Int64{}`
instance is already usable and the constructor does not add anything to
what we are doing, rather it just creates extra confusion.
* cleanup(atomicx): we are not using Float64
Because atomicx.Float64 is unused, we can safely zap it.
* cleanup(atomicx): simplify impl and improve tests
We can simplify the implementation by using defer and by letting
the Load() method call Add(0).
We can improve tests by making many goroutines updated the
atomic int64 value concurrently.
* refactor(fsx): can live in the ./internal pkg
Let us reduce the amount of nesting. While there, ensure that the
package only exports the bare minimum, and improve the documentation
of the tests, to ease reading the code.
* refactor: move runtimex to ./internal
* refactor: move shellx into the ./internal package
While there, remove unnecessary dependency between packages.
While there, specify in the contributing guidelines that
one should use x/sys/execabs instead of os/exec.
* refactor: move ooapi into the ./internal pkg
* refactor(humanize): move to ./internal and better docs
* refactor: move platform to ./internal
* refactor(randx): move to ./internal
* refactor(multierror): move into the ./internal pkg
* refactor(kvstore): all kvstores in ./internal
Rather than having part of the kvstore inside ./internal/engine/kvstore
and part in ./internal/engine/kvstore.go, let us put every piece of code
that is kvstore related into the ./internal/kvstore package.
* fix(kvstore): always return ErrNoSuchKey on Get() error
It should help to use the kvstore everywhere removing all the
copies that are lingering around the tree.
* sessionresolver: make KVStore mandatory
Simplifies implementation. While there, use the ./internal/kvstore
package rather than having our private implementation.
* fix(ooapi): use the ./internal/kvstore package
* fix(platform): better documentation
* feat: create tunnel inside NewSession
We want to create the tunnel when we create the session. This change
allows us to nicely ignore the problem of creating a tunnel when we
already have a proxy, as well as the problem of locking. Everything is
happening, in fact, inside of the NewSession factory.
Modify miniooni such that --tunnel is just syntactic sugar for
--proxy, at least for now. We want, in the future, to teach the
tunnel to possibly use a socks5 proxy.
Because starting a tunnel is a slow operation, we need a context in
NewSession. This causes a bunch of places to change. Not really a big
deal except we need to propagate the changes.
Make sure that the mobile code can create a new session using a
proxy for all the APIs we support.
Make sure all tests are still green and we don't loose coverage of
the various ways in which this code could be used.
This change is part of https://github.com/ooni/probe/issues/985.
* changes after merge
* fix: only keep tests that can hopefully work
While there, identify other places where we should add more
tests or fix integration tests.
Part of https://github.com/ooni/probe/issues/985
* feat(tunnel): introduce persistent tunnel state dir
This diff introduces a persistent state directory for tunnels, so that
we can bootstrap them more quickly after the first time.
Part of https://github.com/ooni/probe/issues/985
* fix: make tunnel dir optional
We have many tests where it does not make sense to explicitly
provide a tunnel dir because we're not using tunnels.
This should simplify setting up a session.
* fix(tunnel): repair tests
* final changes
* more cleanups
* fix(pkg.go.dev): import a subpackage containing the assets
We're trying to fix this issue that pkg.go.dev does not build.
Thanks to @hellais for this very neat idea! Let's keep our
fingers crossed and see whether it fixes!
* feat: use embedded geoip databases
Closes https://github.com/ooni/probe/issues/1372.
Work done as part of https://github.com/ooni/probe/issues/1369.
* fix(assetsx): add tests
* feat: simplify and just vendor uncompressed DBs
* remove tests that seems not necessary anymore
* fix: run go mod tidy
* Address https://github.com/ooni/probe-cli/pull/260/files#r605181364
* rewrite a test in a better way
* fix: gently cleanup the legacy assetsdir
Do not remove the whole directory with brute force. Just zap the
files whose name we know. Then attempt to delete the legacy directory
as well. If not empty, just fail. This is fine because it means the
user has stored other files inside the directory.
* fix: create .miniooni if missing
* feat(session): expose CheckIn method
It seems to me the right thing to do is to query the CheckIn API
from the Session rather than querying it from InputLoader.
Then, InputLoader could just take a reference to a Session-like
interface that allows this functionality.
So, this diff exposes the Session.CheckIn method.
Doing that, in turn, required some refactoring to allow for
more and better unit tests.
While doing that, I also noticed that Session required a mutex
to be a well-behaving type, so I did that.
While doing that, I also tried to cover all the lines in session.go
and, as part of that, I have removed unused code.
Reference issue: https://github.com/ooni/probe/issues/1299.
* fix: reinstate comment I shan't have removed
* fix: repair broken test
* fix: a bit more coverage, annotations, etc.
* Update internal/engine/session.go
* Update internal/engine/session_integration_test.go
* Update internal/engine/session_internal_test.go
We use ExperimentOrchestraClient in several places to help us
calling probe-services APIs. We need to call CheckIn because we
want to use CheckIn in InputLoader.
(We also want to remove the URLs API, but that is not something
doable now, since the mobile app is still using this API via
the wrappers at pkg/oonimkall.)
Work part of https://github.com/ooni/probe/issues/1299.
* poc: mobile api for running WebConnectivity
Yesterday we had a team meeting where we discussed the importance
of stopping using MK wrappers for running experiments.
Here's a PoC showing how we can do that for WebConnectivity. It
seems to me we probably want to add more tests and merge this code
such that we can experiment with it quite soon.
There seems to be opportunities for auto-generating code, BTW.
While working on this PoC, I've also took a chance to revamp the
external documentation of pkg/oonimkall.
* feat: start making the code more abstract
* chore: write unit tests for new code
* fix(oonimkall): improve naming
* refactor: cosmetic changes
* fix: explain
* chore: update dependencies
* chore: update user agent for measurements
* chore: we're now at v3.6.0
* chore: update assets
* chore: update bundled CA
* fix: address some goreportcard.com warnings
* fix(debian/changelog): zap release that breaks out build scripts
We're forcing the content of changelog with `dch`, so it's fine to
not have any specific new release in there.
* fix: make sure tests are passing locally
Notably, I removed a chunk of code where we were checking for network
activity. Now we don't fetch the databases and it's not important. Before,
it was important because the databases are ~large.
* fix: temporarily comment out riseupvn integration tests
See https://github.com/ooni/probe/issues/1354 for work aimed at
reducing the rate of false positives (thanks @cyBerta!)
* MVP of a signal messenger test
* Add minimal signal test unit tests
* Add Signal test to the im nettest group
* Add test for https://sfu.voip.signal.org/
* Fix bug in client-side determination of blocking status
* Add uptime.signal.org to the test targets
* Add more tests
* Check for invalid CA being passed
* Check that the update function works as expected
* Update internal/engine/experiment/signal/signal_test.go
Co-authored-by: Simone Basso <bassosimone@gmail.com>
* fix: back out URL we shouldn't have changed
When merging probe-engine into probe-cli, we changed too many URLs
and some of them should not have been changed.
I noticed this during the review of Signal and I choose to add
this commit to revert such changes.
While there, make sure the URL of the experiment is OK.
* fix(signal): reach 100% of coverage
Just so that we can focus on areas of the codebase where we need
more coverage, let us avoid missing an easy line to test.
Co-authored-by: Simone Basso <bassosimone@gmail.com>
* doc: ensure all top dirs have an explanatory README
This makes the repository a lil bit nicer to newcomers.
Part of https://github.com/ooni/probe/issues/1335
* fix: re-run bindata to embed the README
The readme is small, so we can pay the price of adding it.
On a related note, I am very pleased the Go team implemented the
`//go:embed` feature, so we can get rid of this bindata thing.
* refactor: move more commands to internal/cmd
Part of https://github.com/ooni/probe/issues/1335.
We would like all commands to be at the same level of engine
rather than inside engine (now that we can do it).
* fix: update .gitignore
* refactor: also move jafar outside engine
* We should be good now?
* refactor: start building an Android package
Part of https://github.com/ooni/probe/issues/1335.
This seems also a good moment to move some packages out of the
engine, e.g., oonimkall. This package, for example, is a consumer
of the engine, so it makes sense it's not _inside_ it.
* fix: committed some stuff I didn't need to commit
* fix: oonimkall needs to be public to build
The side effect is that we will probably need to bump the major
version number every time we change one of these APIs.
(We can also of course choose to violate the basic guidelines of Go
software, but I believe this is bad form.)
I have no problem in bumping the major quite frequently and in
any case this monorepo solution is convinving me more than continuing
to keep a split between engine and cli. The need to embed assets to
make the probe more reliable trumps the negative effects of having to
~frequently bump major because we expose a public API.
* fix: let's not forget about libooniffi
Honestly, I don't know what to do with this library. I added it
to provide a drop in replacement for MK but I have no idea whether
it's used and useful. I would not feel comfortable exposing it,
unlike oonimkall, since we're not using it.
It may be that the right thing to do here is just to delete the
package and reduce the amount of code we're maintaining?
* woops, we're still missing the publish android script
* fix(publish-android.bash): add proper API key
* ouch fix another place where the name changed