doc: publish the step-by-step design document (#814)

This pull request publishes the step-by-step design document that I have been discussing with @hellais and @DecFox recently. Compared to the document that was approved, this one has been edited for readability.

While there, I figured it would also be beneficial to publish the few ooni/probe-cli related design documents we produced in the past, because they will probably help someone get acquainted with the codebase.

Reference issue for this pull request: https://github.com/ooni/probe/issues/2148
Simone Basso 2022-06-14 14:38:29 +02:00 committed by GitHub
parent 1685ef75b5
commit be2da83b1b
8 changed files with 2250 additions and 0 deletions

4
DESIGN.md Normal file

@ -0,0 +1,4 @@
# Design documents
Please see the [docs/design](docs/design) directory for a collection
of important design documents related to this repo.

17
docs/design/README.md Normal file

@ -0,0 +1,17 @@
# Design documents
This directory contains ooni/probe-cli design documents. We generally
use issues as short design documents, but some more-complex discussions
require proper design documents. We started collecting existing design
documents into this repository on 2022-06-13, to provide additional
clarity regarding our more-complex choices.
Implementation aspects _may_ change freely over time and there's no need
to rush back and update the related design document. We want to keep a
design document around to document the _why_ of specific choices.
Most design documents are, therefore, informational and historic by
nature, given that we do not aim to continuously update them.
Rather than radically revamp a design document in case of radical direction
changes, it is advisable to create a new design document referencing the
previous design document and explaining what changed.


@ -0,0 +1,149 @@
# OONIMKAll: Replacing Measurement Kit C API
| | |
|:-------------|:-------------|
| Author | [@bassosimone](https://github.com/bassosimone) |
| Last-Updated | 2020-07-09 |
| Status | historic |
*Abstract.* We describe our plan for replacing Measurement Kit in OONI
Probe Android and iOS (in particular) and also in the CLI.
## Introduction
We want to write experiments in Go. This reduces our burden
compared to writing them using C/C++ code.
Go consumers of probe-engine shall directly use its Go API. We
will discuss the Go API in a future revision of this spec.
For mobile apps, we want to replace these MK APIs:
- [measurement-kit/android-libs](https://github.com/measurement-kit/android-libs)
- [measurement-kit/mkall-ios](https://github.com/measurement-kit/mkall-ios)
We also want consumers of [measurement-kit's FFI API](https://git.io/Jv4Rv)
to be able to replace measurement-kit with probe-engine.
## APIs to replace
### Mobile APIs
We define a Go API that `gomobile` binds to a Java/ObjectiveC
API that is close enough to MK's mobile APIs.
### FFI API
We define a CGO API such that `go build -buildmode=c-shared`
yields an API reasonably close to MK's FFI API.
## API for running experiments
A generic API that allows running experiments both on
mobile devices and for FFI consumers looks like this:
```Go
type Task struct{ ... }
func StartTask(input string) (*Task, error)
func (t *Task) Interrupt()
func (t *Task) IsDone() bool
func (t *Task) WaitForNextEvent() string
```
This should be enough to generate a suitable mobile API when
using the `gomobile` Go subcommand.
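As a rough illustration of how a consumer might drive this API, the sketch below drains events until the task is done. The `Task` here is a hypothetical in-memory stand-in, not the real oonimkall implementation, and the event strings and the `drain` helper are made up for the example.

```Go
package main

import (
	"errors"
	"fmt"
)

// Task is a hypothetical stand-in for the Task type above: it replays
// a fixed list of events and then reports that it is done.
type Task struct {
	events []string
	next   int
}

// StartTask pretends to start the experiment described by input.
func StartTask(input string) (*Task, error) {
	if input == "" {
		return nil, errors.New("empty settings")
	}
	return &Task{events: []string{
		`{"key":"status.started"}`, // illustrative event
		`{"key":"status.end"}`,     // illustrative event
	}}, nil
}

// Interrupt discards any event that was not consumed yet.
func (t *Task) Interrupt() { t.next = len(t.events) }

// IsDone returns true once every event has been consumed.
func (t *Task) IsDone() bool { return t.next >= len(t.events) }

// WaitForNextEvent returns the next event, or an empty string when done.
func (t *Task) WaitForNextEvent() string {
	if t.IsDone() {
		return ""
	}
	ev := t.events[t.next]
	t.next++
	return ev
}

// drain shows the consumer loop: keep reading events until IsDone.
func drain(t *Task) []string {
	var out []string
	for !t.IsDone() {
		out = append(out, t.WaitForNextEvent())
	}
	return out
}

func main() {
	task, err := StartTask(`{"name":"Example"}`)
	if err != nil {
		fmt.Println("cannot start task:", err)
		return
	}
	for _, ev := range drain(task) {
		fmt.Println(ev)
	}
}
```

The same loop shape should carry over to the Java/ObjectiveC bindings, since they expose the same four operations.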
We can likewise generate an FFI API as follows:
```Go
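package main

import (
	"C"
	"sync"

	"github.com/ooni/probe-engine/oonimkall"
)

var (
	// idx is the next handle to hand out; zero is reserved for failure.
	idx int64 = 1
	// m maps each handle to its task.
	m = make(map[int64]*oonimkall.Task)
	// mu protects idx and m.
	mu sync.Mutex
)

// ooni_task_start starts a task and returns its handle, or zero on failure.
//
//export ooni_task_start
func ooni_task_start(settings string) int64 {
	tp, err := oonimkall.StartTask(settings)
	if err != nil {
		return 0
	}
	mu.Lock()
	handle := idx
	idx++
	m[handle] = tp
	mu.Unlock()
	return handle
}

// ooni_task_interrupt interrupts the task, if the handle is known.
//
//export ooni_task_interrupt
func ooni_task_interrupt(handle int64) {
	mu.Lock()
	if tp := m[handle]; tp != nil {
		tp.Interrupt()
	}
	mu.Unlock()
}

// ooni_task_is_done reports whether the task is done; unknown
// handles are reported as done.
//
//export ooni_task_is_done
func ooni_task_is_done(handle int64) bool {
	isdone := true
	mu.Lock()
	if tp := m[handle]; tp != nil {
		isdone = tp.IsDone()
	}
	mu.Unlock()
	return isdone
}

// ooni_task_wait_for_next_event blocks until the next event is available.
// The lock is released before waiting so that other calls can proceed.
//
//export ooni_task_wait_for_next_event
func ooni_task_wait_for_next_event(handle int64) (event string) {
	mu.Lock()
	tp := m[handle]
	mu.Unlock()
	if tp != nil {
		event = tp.WaitForNextEvent()
	}
	return
}

func main() {}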
```
This is close enough to [measurement-kit's FFI API](https://git.io/Jv4Rv) that
a few lines of C suffice to implement an ABI-compatible replacement.
## Other APIs of interest
We currently don't have plans for replacing other MK APIs. We will introduce
new APIs specifically tailored for our OONI needs, but they will be out of
scope with respect to the main goal of this design document.
## measurement_kit command line client
We replace the `measurement_kit` CLI tool with `miniooni`,
which will use the ~same set of CLI flags for QA reasons.
## History
[The initial version of this design document](
https://github.com/measurement-kit/engine/blob/master/DESIGN.md)
lived in the measurement-kit namespace at GitHub. It discussed
a bunch of broad, extra topics (e.g., code bloat) that are not
discussed in this document. More details regarding the migration
from MK to probe-engine are at [measurement-kit/measurement-kit#1913](
https://github.com/measurement-kit/measurement-kit/issues/1913).

409
docs/design/dd-002-netx.md Normal file

@ -0,0 +1,409 @@
# OONI Network Extensions
| | |
|:-------------|:-------------|
| Author | [@bassosimone](https://github.com/bassosimone) |
| Last-Updated | 2020-04-02 |
| Status | historic |
| Obsoleted-by | [ooni/probe-engine#359](https://github.com/ooni/probe-engine/issues/359) |
| Obsoleted-by | [ooni/probe-engine#522](https://github.com/ooni/probe-engine/pull/522) |
| Obsoleted-by | [dd-003-step-by-step.md](dd-003-step-by-step.md) |
*Abstract.* Rationale and design of [ooni/netx](https://github.com/ooni/netx),
which was later merged into [ooni/probe-engine](
https://github.com/ooni/probe-engine) and
[ooni/probe-cli](https://github.com/ooni/probe-cli).
## Introduction
OONI experiments send and/or receive network traffic to
determine if there is blocking. We want the implementation
of OONI experiments to be as simple as possible. We also
_want to attribute errors to the major network or protocol
operation that caused them_.
At the same time, _we want an experiment to collect as much
low-level data as possible_. For example, we want to know
whether and when the TLS handshake completed; what certificates
were provided by the server; what TLS version was selected;
and so forth. These bits of information are very useful
to analyze a measurement and better classify it.
We also want to _automatically or manually run follow-up
measurements where we change some configuration properties
and repeat the measurement_. For example, we may want to
configure DNS over HTTPS (DoH) and then attempt to
fetch a URL again. Or we may want to detect whether
there is SNI-based blocking. This package allows us to
do that in other parts of probe-engine.
## Rationale
As we observed in [ooni/probe-engine#13](
https://github.com/ooni/probe-engine/issues/13), every
experiment consists of two separate phases:
1. measurement gathering
2. measurement analysis
During measurement gathering, we perform specific actions
that cause network data to be sent and/or received. During
measurement analysis, we process the measurement on the
device. For some experiments (e.g., Web Connectivity), this
second phase also entails contacting OONI backend services
that provide data useful to complete the analysis.
This package implements measurement gathering. The analysis
is performed by other packages in probe-engine. The core
design idea is to provide OONI-measurements-aware replacements
for Go standard library interfaces, e.g., the
`http.RoundTripper`. On top of that, we'll create all the
required interfaces to achieve the measurement goals mentioned above.
We are of course writing test templates in `probe-engine`
anyway, because we need additional abstraction, but we can
take advantage of the fact that the API exposed by this package
is stable by definition, because it mimics the stdlib. Also,
for many experiments we can collect information pertaining
to TCP, DNS, TLS, and HTTP with a single call to `netx`.
This code used to live at `github.com/ooni/netx`. On 2020-03-02
we merged github.com/ooni/netx@4f8d645bce6466bb into `probe-engine`
because it was more practical and enabled easier refactoring.
## Definitions
Consistently with Go's terminology, we define an
_HTTP round trip_ as the process where we get a request
to send; we find a suitable connection for sending
it, or we create one; we send headers and
possibly body; and we receive response headers.
We also define an _HTTP transaction_ as the process starting
with an HTTP round trip and terminating by reading
the full response body.
We define a _netx replacement_ as a Go struct or interface that
has the same interface as a Go standard library object
but additionally performs measurements.
## Enhanced error handling
This library MUST wrap `error` such that:
1. we can classify all errors we care about; and
2. we can map them to major operations.
The `github.com/ooni/netx/modelx` package MUST contain a wrapper for
Go `error` named `ErrWrapper` that is at least like:
```Go
type ErrWrapper struct {
	Failure    string // error classification
	Operation  string // operation that caused error
	WrappedErr error  // the original error
}

func (e *ErrWrapper) Error() string {
	return e.Failure
}
```
Where `Failure` is one of the errors we care about, i.e.:
- `connection_refused`: ECONNREFUSED
- `connection_reset`: ECONNRESET
- `dns_bogon_error`: detected bogon in DNS reply
- `dns_nxdomain_error`: NXDOMAIN in DNS reply
- `eof_error`: unexpected EOF on connection
- `generic_timeout_error`: some timer has expired
- `ssl_invalid_hostname`: certificate not valid for SNI
- `ssl_unknown_authority`: cannot find CA validating certificate
- `ssl_invalid_certificate`: e.g. certificate expired
- `unknown_failure <string>`: any other error
Note that we care about bogons in DNS replies because they are
often used to censor specific websites.
And where `Operation` is one of:
- `resolve`: domain name resolution
- `connect`: TCP connect
- `tls_handshake`: TLS handshake
- `http_round_trip`: reading/writing HTTP
The code in this library MUST wrap returned errors such
that we can cast back to `ErrWrapper` during the analysis
phase, using Go 1.13 `errors` library as follows:
```Go
var wrapper *modelx.ErrWrapper
if errors.As(err, &wrapper) {
	// Do something with the error
}
```
## Netx replacements
We want to provide netx replacements for the following
interfaces in the Go standard library:
1. `http.RoundTripper`
2. `http.Client`
3. `net.Dialer`
4. `net.Resolver`
Accordingly, we'll define the following interfaces in
the `github.com/ooni/probe-engine/netx/modelx` package:
```Go
type DNSResolver interface {
	LookupHost(ctx context.Context, hostname string) ([]string, error)
}

type Dialer interface {
	Dial(network, address string) (net.Conn, error)
	DialContext(ctx context.Context, network, address string) (net.Conn, error)
}

type TLSDialer interface {
	DialTLS(network, address string) (net.Conn, error)
	DialTLSContext(ctx context.Context, network, address string) (net.Conn, error)
}
```
We won't need an interface for `http.RoundTripper`
because it is already an interface, so we'll just use it.
Our replacements will implement these interfaces.
Using an API compatible with Go's standard library makes
it possible to use, say, our `net.Dialer` replacement with
other libraries. Both `http.Transport` and
`gorilla/websocket`'s `websocket.Dialer` have
functions like `Dial` and `DialContext` that can be
overridden. By overriding such function pointers,
we could use our replacements instead of the standard
library, thus we could collect measurements while
using third party code to implement specific protocols.
Also, using interfaces allows us to combine code
quite easily. For example, a resolver that detects
bogons is easily implemented as a wrapper around
another resolver that performs the real resolution.
## Dispatching events
The `github.com/ooni/netx/modelx` package will define
a handler for low level events as:
```Go
type Handler interface {
	OnMeasurement(Measurement)
}
```
We will provide a mechanism to bind a specific
handler to a `context.Context` such that the handler
will receive all the measurements caused by code
using such context. This mechanism is like:
```Go
type MeasurementRoot struct {
	Beginning time.Time // the "zero" time
	Handler   Handler   // the handler to use
}
```
You will be able to assign a `MeasurementRoot` to
a context by using the following function:
```Go
func WithMeasurementRoot(
	ctx context.Context, root *MeasurementRoot) context.Context
```
which will return a clone of the original context
that uses the `MeasurementRoot`. Pass this context to
any method of our replacements to get measurements.
Given such context, or a subcontext, you can get
back the original `MeasurementRoot` using:
```Go
func ContextMeasurementRoot(ctx context.Context) *MeasurementRoot
```
which will return the context `MeasurementRoot` or
`nil` if none is set into the context. This is how our
internal code gets access to the `MeasurementRoot`.
## Constructing and configuring replacements
The `github.com/ooni/probe-engine/netx` package MUST provide an API such
that you can construct and configure a `net.Resolver` replacement
as follows:
```Go
r, err := netx.NewResolverWithoutHandler(dnsNetwork, dnsAddress)
if err != nil {
	log.Fatal("cannot configure specific resolver")
}
var resolver modelx.DNSResolver = r
// now use resolver ...
```
where `DNSNetwork` and `DNSAddress` configure the type
of the resolver as follows:
- when `DNSNetwork` is `""` or `"system"`, `DNSAddress` does
not matter and we use the system resolver
- when `DNSNetwork` is `"udp"`, `DNSAddress` is the address
or domain name, with optional port, of the DNS server
(e.g., `8.8.8.8:53`)
- when `DNSNetwork` is `"tcp"`, `DNSAddress` is the address
or domain name, with optional port, of the DNS server
(e.g., `8.8.8.8:53`)
- when `DNSNetwork` is `"dot"`, `DNSAddress` is the address
or domain name, with optional port, of the DNS server
(e.g., `8.8.8.8:853`)
- when `DNSNetwork` is `"doh"`, `DNSAddress` is the URL
of the DNS server (e.g. `https://cloudflare-dns.com/dns-query`)
When the resolver is not the system one, we'll also be able
to emit events when performing resolution. Otherwise, we'll
just emit the `DNSResolveDone` event defined below.
Any resolver returned by this function may be configured to return the
`dns_bogon_error` if any `LookupHost` lookup returns a bogon IP.
The package will also contain this function:
```Go
func ChainResolvers(
	primary, secondary modelx.DNSResolver) modelx.DNSResolver
```
where you can create a new resolver where `secondary` will be
invoked whenever `primary` fails. This functionality allows
us to be more resilient and bypass automatically certain types
of censorship, e.g., a resolver returning a bogon.
The `github.com/ooni/probe-engine/netx` package MUST also provide an API such
that you can construct and configure a `net.Dialer` replacement
as follows:
```Go
d := netx.NewDialerWithoutHandler()
d.SetResolver(resolver)
d.ForceSpecificSNI("www.kernel.org")
d.SetCABundle("/etc/ssl/cert.pem")
d.ForceSkipVerify()
var dialer modelx.Dialer = d
// now use dialer
```
where `SetResolver` allows you to change the resolver,
`ForceSpecificSNI` forces the TLS dials to use such SNI
instead of using the provided domain, `SetCABundle`
allows you to set a specific CA bundle, and `ForceSkipVerify`
allows you to disable certificate verification. All these funcs
MUST NOT be invoked once you're using the dialer.
The `github.com/ooni/probe-engine/netx` package MUST contain
code so that we can do:
```Go
t := netx.NewHTTPTransportWithProxyFunc(
	http.ProxyFromEnvironment,
)
t.SetResolver(resolver)
t.ForceSpecificSNI("www.kernel.org")
t.SetCABundle("/etc/ssl/cert.pem")
t.ForceSkipVerify()
var transport http.RoundTripper = t
// now use transport
```
where the functions have the same semantics as the
namesake functions described before and the same caveats.
We also have syntactic sugar on top of that and legacy
methods, but this fully describes the design.
## Structure of events
The `github.com/ooni/probe-engine/netx/modelx` package will contain the
definition of low-level events. We are interested in
knowing the following:
1. the timing and result of each I/O operation.
2. the timing of HTTP events occurring during the
lifecycle of an HTTP request.
3. the timing and result of the TLS handshake including
the negotiated TLS version and other details such as
what certificates the server has provided.
4. DNS events, e.g. queries and replies, generated
as part of using DoT and DoH.
We will represent time as a `time.Duration` since the
beginning configured either in the context or when
constructing an object. The `modelx` package will also
define the `Measurement` event as follows:
```Go
type Measurement struct {
	Connect             *ConnectEvent
	HTTPConnectionReady *HTTPConnectionReadyEvent
	HTTPRoundTripDone   *HTTPRoundTripDoneEvent
	ResolveDone         *ResolveDoneEvent
	TLSHandshakeDone    *TLSHandshakeDoneEvent
}
```
The events above MUST always be present, but more
events will likely be available. The structure
will contain a pointer for every event that
we support. The event-processing code will check
which pointer or pointers are not `nil` to know
which event or events have occurred.
To simplify joining events together the following holds:
1. when we're establishing a new connection there is a nonzero
`DialID` shared by `Connect` and `ResolveDone`
2. a new connection has a nonzero `ConnID` that is emitted
as part of a successful `Connect` event
3. during an HTTP transaction there is a nonzero `TransactionID`
shared by `HTTPConnectionReady` and `HTTPRoundTripDone`
4. if the TLS handshake is invoked by HTTP code it will have a
nonzero `TransactionID`, otherwise a nonzero `ConnID`
5. the `HTTPConnectionReady` will also see the `ConnID`
6. when a transaction starts dialing, it will pass its
`TransactionID` to `ResolveDone` and `Connect`
7. when we're dialing a connection for DoH, we pass the `DialID`
to the `HTTPConnectionReady` event as well
Because of the preceding rules, it should always be possible
to bind together events. Also, we define more events than the
above, but they are ancillary to the above events. Also, the
main reason why `HTTPConnectionReady` is here is because it is
the event that allows us to bind `ConnID` and `TransactionID`.
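As an illustration of how event-processing code might check for non-`nil` pointers and join events, the sketch below groups `ResolveDone` and `Connect` by their shared `DialID`. Apart from the ID names, every event field here is an assumption, as are the `joiner` and `dialSummary` types; the sketch is also not safe for concurrent use.

```Go
package main

import "fmt"

// The fields of these events are assumptions made for this sketch;
// only DialID and ConnID come from the rules above.
type ResolveDoneEvent struct {
	DialID    int64
	Addresses []string
}

type ConnectEvent struct {
	DialID        int64
	ConnID        int64
	RemoteAddress string
}

// Measurement is reduced to the two events used in this example.
type Measurement struct {
	Connect     *ConnectEvent
	ResolveDone *ResolveDoneEvent
}

// dialSummary is a hypothetical record joining the events of one dial.
type dialSummary struct {
	Addresses     []string
	ConnID        int64
	RemoteAddress string
}

// joiner is a handler that groups events by DialID.
type joiner struct {
	dials map[int64]*dialSummary
}

// get returns the summary for a given DialID, creating it if needed.
func (j *joiner) get(dialID int64) *dialSummary {
	if j.dials == nil {
		j.dials = make(map[int64]*dialSummary)
	}
	if j.dials[dialID] == nil {
		j.dials[dialID] = &dialSummary{}
	}
	return j.dials[dialID]
}

// OnMeasurement checks which pointers are set to learn which events occurred.
func (j *joiner) OnMeasurement(m Measurement) {
	if m.ResolveDone != nil {
		j.get(m.ResolveDone.DialID).Addresses = m.ResolveDone.Addresses
	}
	if m.Connect != nil {
		s := j.get(m.Connect.DialID)
		s.ConnID = m.Connect.ConnID
		s.RemoteAddress = m.Connect.RemoteAddress
	}
}

func main() {
	j := &joiner{}
	j.OnMeasurement(Measurement{ResolveDone: &ResolveDoneEvent{
		DialID: 1, Addresses: []string{"8.8.8.8"}}})
	j.OnMeasurement(Measurement{Connect: &ConnectEvent{
		DialID: 1, ConnID: 7, RemoteAddress: "8.8.8.8:443"}})
	fmt.Printf("%+v\n", *j.dials[1])
}
```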

File diff suppressed because it is too large


@ -43,4 +43,7 @@
//
// Additionally, new code that does not need to perform measurements
// SHOULD NOT use netx and SHOULD instead use netxlite.
//
// See docs/design/dd-002-netx.md in the probe-cli repository for
// the design document describing this package.
package netx


@ -34,6 +34,9 @@
//
// We want to have reasonable watchdog timeouts for each operation.
//
// See also the design document at docs/design/dd-003-step-by-step.md,
// which provides an overview of netxlite's main concerns.
//
// Operations
//
// This package implements the following operations:


@ -9,6 +9,10 @@
// reimplementation of the mkall API implemented by Measurement Kit
// in, e.g., https://github.com/measurement-kit/mkall-ios.
//
// Design
//
// See docs/design/dd-001-oonimkall.md in the probe-cli repository.
//
// Semantic versioning policy
//
// This package is public for technical reasons. We cannot use `go