// -=-=- StartHere -=-=-
//
// # Chapter XIV: A possible rewrite of Web Connectivity
//
// In this chapter we try to solve the exercise laid out in
// the previous chapter, using `measurex` primitives.
//
// (This file is auto-generated. Do not edit it directly! To apply
// changes you need to modify `./internal/tutorial/measurex/chapter14/main.go`.)
//
// ## main.go
//
// The beginning of the file is always pretty much the same.
//
// ```Go
package main

import (
	"context"
	"crypto/tls"
	"encoding/json"
	"flag"
	"fmt"
	"net/http"
	"net/url"
	"time"

	"github.com/ooni/probe-cli/v3/internal/measurex"
	"github.com/ooni/probe-cli/v3/internal/netxlite"
	"github.com/ooni/probe-cli/v3/internal/runtimex"
)

func print(v interface{}) {
	data, err := json.Marshal(v)
	runtimex.PanicOnError(err, "json.Marshal failed")
	fmt.Printf("%s\n", string(data))
}
// ```
//
// ## measurement type
//
// We define a measurement type with the fields
// that a Web Connectivity measurement should have.
//
// ```Go
type measurement struct {
	Queries       []*measurex.DNSLookupEvent     `json:"queries"`
	TCPConnect    []*measurex.NetworkEvent       `json:"tcp_connect"`
	TLSHandshakes []*measurex.TLSHandshakeEvent  `json:"tls_handshakes"`
	Requests      []*measurex.HTTPRoundTripEvent `json:"requests"`
}
// ```
//
// ## WebConnectivity implementation
//
// We define a function that takes a context and a URL to
// measure as input and returns a measurement or an error.
//
// We will only error out in case the input does not allow us to
// proceed (i.e., invalid input URL).
//
// ```Go
func webConnectivity(ctx context.Context, URL string) (*measurement, error) {
// ```
//
// We start by parsing the input URL. If we cannot parse it, of
// course this is a hard error and we cannot continue.
//
// ```Go
	parsedURL, err := url.Parse(URL)
	if err != nil {
		return nil, err
	}
// ```
//
// We create an empty measurement and a measurer with
// default settings like we did in the previous chapters.
//
// ```Go
	m := &measurement{}
	mx := measurex.NewMeasurerWithDefaultSettings()
// ```
//
// Now it's time to start measuring. We will address all
// the points laid out in the previous chapter.
//
// ### 1. Enumerating IP addrs
//
// Let us enumerate all the IP addresses for
// the input URL's domain using the system resolver.
//
// ```Go
	dns := mx.LookupHostSystem(ctx, parsedURL.Hostname())
	m.Queries = append(m.Queries, dns.LookupHost...)
// ```
//
// This is code we have already seen in the previous chapters.
//
//
// ### 2. Building a list of endpoints
//
// ```Go
	epnts, err := measurex.AllHTTPEndpointsForURL(parsedURL, http.Header{}, dns)
	if err != nil {
		return nil, err
	}
// ```
//
// This is also code we have seen in previous chapters. The only
// difference is that we supply empty headers since we're not going
// to actually use the headers inside the endpoints.
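//
// Since we pass `epnt.Address` to the TCP and TLS primitives below,
// we assume that each endpoint pairs a resolved IP address with the
// port implied by the URL. As a rough illustration of that idea only
// (this is not how `measurex` implements `AllHTTPEndpointsForURL`,
// and the `endpointsFor` helper is hypothetical), a standalone sketch
// using just the standard library could look like this:
//
// ```Go
// package main
//
// import (
// 	"fmt"
// 	"net"
// 	"net/url"
// )
//
// // endpointsFor is a hypothetical helper: it joins each IP address
// // with the URL's explicit port or with the scheme's default port.
// func endpointsFor(u *url.URL, addrs []string) ([]string, error) {
// 	port := u.Port()
// 	if port == "" {
// 		switch u.Scheme {
// 		case "http":
// 			port = "80"
// 		case "https":
// 			port = "443"
// 		default:
// 			return nil, fmt.Errorf("unsupported scheme: %q", u.Scheme)
// 		}
// 	}
// 	var out []string
// 	for _, addr := range addrs {
// 		// JoinHostPort adds the brackets that IPv6 literals require.
// 		out = append(out, net.JoinHostPort(addr, port))
// 	}
// 	return out, nil
// }
//
// func main() {
// 	u, err := url.Parse("https://example.com/")
// 	if err != nil {
// 		panic(err)
// 	}
// 	epnts, err := endpointsFor(u, []string{"192.0.2.1", "2001:db8::1"})
// 	fmt.Println(epnts, err)
// }
// ```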
//
// ### 3 and 4. Measure each endpoint
//
// We will loop through the endpoints built in the previous
// step and issue the correct TCP or TLS primitive depending
// on whether the input URL is HTTP or HTTPS.
//
// ```Go
	for _, epnt := range epnts {
		switch parsedURL.Scheme {
		case "http":
			tcp := mx.TCPConnect(ctx, epnt.Address)
			m.TCPConnect = append(m.TCPConnect, tcp.Connect...)
		case "https":
			config := &tls.Config{
				ServerName: parsedURL.Hostname(),
				NextProtos: []string{"h2", "http/1.1"},
				RootCAs:    netxlite.NewDefaultCertPool(),
			}
			// We call this variable th to avoid shadowing the tls package.
			th := mx.TLSConnectAndHandshake(ctx, epnt.Address, config)
			m.TCPConnect = append(m.TCPConnect, th.Connect...)
			m.TLSHandshakes = append(m.TLSHandshakes, th.TLSHandshake...)
		}
	}
// ```
//
// At this point we've addressed points 1-4. So let's
// now focus on the last point:
//
// ### 5. HTTP measurement
//
// We need to manually build a `MeasurementDB`. This is a
// "database" where the networking code will store events.
//
// ```Go
	db := &measurex.MeasurementDB{}
// ```
//
// Following the hint from the previous chapter we use the
// `NewTracingHTTPTransportWithDefaultSettings` factory
// to create an `http.Transport`-like object that will trace
// HTTP round trip events writing them into `db`.
//
//
// ```Go
	txp := measurex.NewTracingHTTPTransportWithDefaultSettings(mx.Begin, mx.Logger, db)
// ```
//
// We now build an `http.Client` using the transport
// we've just created and a cookie jar (which we
// use because otherwise some redirects will lead
// to a redirect loop, as mentioned in previous chapters).
//
// ```Go
	clnt := &http.Client{
		Transport: txp,
		Jar:       measurex.NewCookieJar(),
	}
// ```
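//
// To see why the cookie jar matters, here is a minimal standalone
// sketch of a redirect loop. It is an illustration under our own
// assumptions: the `newCookieGate` and `fetch` helpers are
// hypothetical, and we use the standard library's `net/http/cookiejar`
// rather than `measurex.NewCookieJar`. The test server keeps
// redirecting until the client sends back the cookie it has set.
//
// ```Go
// package main
//
// import (
// 	"fmt"
// 	"net/http"
// 	"net/http/cookiejar"
// 	"net/http/httptest"
// )
//
// // newCookieGate returns a test server that redirects clients to the
// // same URL until they present the cookie set by the first response.
// func newCookieGate() *httptest.Server {
// 	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// 		if _, err := r.Cookie("seen"); err != nil {
// 			http.SetCookie(w, &http.Cookie{Name: "seen", Value: "1", Path: "/"})
// 			http.Redirect(w, r, "/", http.StatusFound)
// 			return
// 		}
// 		fmt.Fprintln(w, "ok")
// 	}))
// }
//
// // fetch GETs URL with or without a cookie jar. It returns the final
// // status code, or an error if the client gives up on the redirects.
// func fetch(URL string, withJar bool) (int, error) {
// 	clnt := &http.Client{}
// 	if withJar {
// 		jar, err := cookiejar.New(nil)
// 		if err != nil {
// 			return 0, err
// 		}
// 		clnt.Jar = jar
// 	}
// 	resp, err := clnt.Get(URL)
// 	if err != nil {
// 		return 0, err
// 	}
// 	defer resp.Body.Close()
// 	return resp.StatusCode, nil
// }
//
// func main() {
// 	srv := newCookieGate()
// 	defer srv.Close()
// 	_, err := fetch(srv.URL, false)
// 	fmt.Println("without jar:", err)
// 	status, err := fetch(srv.URL, true)
// 	fmt.Println("with jar:", status, err)
// }
// ```
//
// Without a jar, the client never replays the cookie, so it follows
// redirects until it hits its redirect limit and returns an error.
// With a jar, the second request carries the cookie and succeeds.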
//
// Now we use a method of the measurer that allows us to
// perform an HTTP GET with an existing HTTP client
// and a URL. This method will set a timeout and perform
// the round trip. Reading a snapshot of the response
// body is not implemented by this method; rather, it
// is a property of the "tracing" HTTP transport we
// created above (this type of transport is the one we
// have been using internally in all the examples
// presented so far).
//
// ```Go
	resp, _ := mx.HTTPClientGET(ctx, clnt, parsedURL)
// ```
//
// To be tidy, we also close the response body in case
// we have a response. We don't really need to read
// the body here. As mentioned previously, we're already
// using an HTTP transport reading a body snapshot.
//
// ```Go
	if resp != nil {
		resp.Body.Close() // tidy
	}
// ```
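//
// By "body snapshot" we mean, as far as this sketch is concerned, a
// bounded prefix of the response body. The tracing transport's actual
// implementation and size limit are not shown in this chapter, so the
// `readSnapshot` helper below is a hypothetical, standard-library-only
// illustration of the general idea:
//
// ```Go
// package main
//
// import (
// 	"fmt"
// 	"io"
// 	"strings"
// )
//
// // readSnapshot is a hypothetical helper that reads at most maxSize
// // bytes from r, leaving any remaining bytes unread.
// func readSnapshot(r io.Reader, maxSize int64) ([]byte, error) {
// 	return io.ReadAll(io.LimitReader(r, maxSize))
// }
//
// func main() {
// 	body := strings.NewReader("<html>a long page</html>")
// 	snap, err := readSnapshot(body, 6)
// 	fmt.Printf("%q %v\n", snap, err)
// }
// ```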
//
// Finally, we append the round trips we performed to
// the right field and return the measurement.
//
// To this end, we're using the `db.AsMeasurement` method that
// takes the current set of events in `db` and assembles
// them into the `Measurement` struct we've been using in all
// the chapters we have seen so far.
//
// ```Go
	m.Requests = append(m.Requests, db.AsMeasurement().HTTPRoundTrip...)
	return m, nil
}
// ```
//
// The rest of the program is pretty straightforward.
//
// ```Go
func main() {
	URL := flag.String("url", "https://www.google.com/", "URL to fetch")
	timeout := flag.Duration("timeout", 60*time.Second, "timeout to use")
	flag.Parse()
	ctx, cancel := context.WithTimeout(context.Background(), *timeout)
	defer cancel()
	m, err := webConnectivity(ctx, *URL)
	runtimex.PanicOnError(err, "invalid arguments to webConnectivity (wrong URL?)")
	print(m)
}
// ```
//
// ## Running the example program
//
// Let us perform a vanilla run first:
//
// ```bash
// go run -race ./internal/tutorial/measurex/chapter14
// ```
//
// Take a look at the JSON.
//
// Now try running the program with `http://gmail.com` as
// input. Take note of the redirect chain. See how the
// domain changes during the redirect. Take note of the
// fact that we are not measuring any TLS handshake. See
// how we're not trying QUIC endpoints. These are, in
// fact, some of the limitations of Web Connectivity that
// we were trying to address when we wrote `measurex`.
//
// Also, build the miniooni research client:
//
// ```bash
// go build -v ./internal/cmd/miniooni
// ```
//
// Run Web Connectivity with:
//
// ```bash
// ./miniooni -ni http://gmail.com web_connectivity
// ```
//
// This writes the report to a file named `report.jsonl`.
//
// Check the content of the file and match it with the
// output of this chapter. Are there other notable
// differences between the two outputs?
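//
// To make the comparison easier, you could list which fields each
// measurement in the report contains. The following standalone sketch
// assumes that every line of `report.jsonl` is a JSON object holding
// the experiment results under a `test_keys` field; the
// `testKeysNames` helper is hypothetical and uses only the standard
// library.
//
// ```Go
// package main
//
// import (
// 	"bufio"
// 	"encoding/json"
// 	"fmt"
// 	"io"
// 	"os"
// 	"sort"
// )
//
// // testKeysNames returns, for each non-empty JSON line read from r,
// // the sorted names of the fields inside its "test_keys" object.
// func testKeysNames(r io.Reader) ([][]string, error) {
// 	var out [][]string
// 	scanner := bufio.NewScanner(r)
// 	// Measurements can be large, so raise the scanner's line limit.
// 	scanner.Buffer(make([]byte, 0, 64*1024), 1<<26)
// 	for scanner.Scan() {
// 		if len(scanner.Bytes()) == 0 {
// 			continue
// 		}
// 		var entry struct {
// 			TestKeys map[string]json.RawMessage `json:"test_keys"`
// 		}
// 		if err := json.Unmarshal(scanner.Bytes(), &entry); err != nil {
// 			return nil, err
// 		}
// 		names := make([]string, 0, len(entry.TestKeys))
// 		for name := range entry.TestKeys {
// 			names = append(names, name)
// 		}
// 		sort.Strings(names)
// 		out = append(out, names)
// 	}
// 	return out, scanner.Err()
// }
//
// func main() {
// 	fp, err := os.Open("report.jsonl")
// 	if err != nil {
// 		panic(err)
// 	}
// 	defer fp.Close()
// 	names, err := testKeysNames(fp)
// 	fmt.Println(names, err)
// }
// ```
//
// Comparing these names with the JSON keys of our `measurement`
// struct is a quick way to spot fields that only one output has.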
//
// ### Bonus question
//
// The solution we presented is true to the original
// spirit of Web Connectivity, where we first perform
// separate DNS and TCP/TLS steps, and then we also
// perform a separate HTTP step. Does `measurex` provide
// an API that lets you invert the order of the
// operations, that is:
//
// 1. build a full-fledged HTTP client where we can
// trace _any_ operation;
//
// 2. use such a client to measure the URL;
//
// 3. figure out what TCP endpoints we did not
// test for TCP/TLS during this process and run
// TCP/TLS testing only for them?
//
// If such an API exists, can you write a simple
// main.go client that implements points 1-3 above?
//
// ## Conclusion
//
// We have presented the solution to the exercise
// proposed in the previous chapter, i.e., how
// to rewrite Web Connectivity using the `measurex` API.
//
// You have now been exposed to some of the complexity and
// the APIs involved in OONI measurements. So you should now
// be ready to help us write new and maintain existing
// network experiments.
//
// If you have further questions, please [contact us](
// https://ooni.org/about/).
//
// -=-=- StopHere -=-=-