ooni-probe-cli/internal/tutorial/measurex/chapter13/README.md
Simone Basso d45e58c14f
doc(measurex): explain how to write experiments (#529)
Part of https://github.com/ooni/ooni.org/issues/361

Co-authored-by: Arturo Filastò <arturo@openobservatory.org>
2021-09-30 01:36:03 +02:00

98 lines
2.9 KiB
Markdown

# Chapter XIII: Rewriting Web Connectivity
This chapter contains an exercise. We are going to
use the `measurex` API to rewrite part of the
Web Connectivity network experiment.
(This is probably the right place to prod you
to go to the [ooni/spec](https://github.com/ooni/spec)
repository, locate the ts-017-web-connectivity.md
spec, and read it.)
Read the spec? Good, so
what we are more precisely going to do here
is implement the network measurement part of
Web Connectivity where we:
1. enumerate all the IP addresses of the target
URL using the system resolver;
2. build endpoints with such IPs with a suitable
port, thus obtaining a list of HTTP endpoints;
3. TCP connect each of the endpoints and save the
results into a measurement object compatible
with Web Connectivity's data format;
4. TLS handshake each endpoint (only if this
makes sense, of course);
5. HTTP GET the URL and follow redirects until
we reach a webpage, fetch the body, and store it
for later analysis (which we'll not implement
as part of this exercise).
Let us now provide extra context that should
help you figure out how to solve this exercise.
## Regarding points 3-4
You already know all the primitives.
## Regarding point 5
Historically this point has always been
performed by a separate HTTP client. This
means that any implementation:
- will not include any TCP or TLS event
generated during point 5 in the measurement;
- most likely will resolve the URL's domain
again (even though the probe-cli implementation
uses a fake Resolver to avoid that);
- tries every available IP address and stops
at the first one to which it can connect to (which
is what a naive HTTP client does, whereas a more
advanced one likely tries a couple of addrs in
parallel, especially when both IPv4 and IPv6
are supported - this is also known as happy eyeballs).
In terms of `measurex`, the best API to do what
you're required to do in point 5 is probably
`NewTracingHTTPTransportWithDefaultSettings`, which
allows you to trace only the HTTP round trip and
ignores any other event.
Once you have such a transport, the best `Measurer`
API for the task is probably `HTTPClientGET`.
## Other remarks
You also need to learn about how to measure
events at low level, which entails creating an
instance of `MeasurementDB`, passing it to
the relevant networking code, and then calling
its `AsMeasurement` method to get back a
measurement. (You can probably get an idea
of how this is done in general by checking the
implementation of `Measurer.TCPConnect`.)
Hopefully, this should be enough information
to help you tackle this task. As you see
below, the main function is there empty waiting
for your implementation. We will provide our
own solution to this problem in the next chapter.
(This file is auto-generated. Do not edit it directly! To apply
changes you need to modify `./internal/tutorial/measurex/chapter13/main.go`.)
## The main.go file
```Go
package main
func main() {
}