ooni-probe-cli/internal/tutorial/measurex/chapter12/main.go
Simone Basso aa27bbe33f
fix(measurex): use same keys of the OONI data format (#572)
This change should simplify the pipeline's job.

Reference issue: https://github.com/ooni/probe/issues/1817.

I previously dismissed this possibility, but now it seems clear it
is simpler to have a very tabular data format internally and to
convert such a format to OONI's data format when serializing.

The OONI data format is what the pipeline expects, but processing
is easier with a more linear/tabular format.
2021-11-05 10:46:45 +01:00


// -=-=- StartHere -=-=-
//
// # Chapter XII: Following redirections.
//
// This program shows how to combine the URL measurement
// "step" introduced in the previous chapter with
// following redirections. If we say that the previous
// chapter performed a "web step", then we can say
// that here we're performing multiple "web steps".
//
// (This file is auto-generated. Do not edit it directly! To apply
// changes you need to modify `./internal/tutorial/measurex/chapter12/main.go`.)
//
// ## main.go
//
// The beginning of the program is pretty much the same as
// in the previous chapter, except that here we need to
// define a `measurement` container type that will hold
// the result of each "web step".
//
// ```Go
package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"time"

	"github.com/ooni/probe-cli/v3/internal/measurex"
	"github.com/ooni/probe-cli/v3/internal/runtimex"
)

type measurement struct {
	URLs []*measurex.ArchivalURLMeasurement
}

func print(v interface{}) {
	data, err := json.Marshal(v)
	runtimex.PanicOnError(err, "json.Marshal failed")
	fmt.Printf("%s\n", string(data))
}

func main() {
	URL := flag.String("url", "http://facebook.com/", "URL to fetch")
	timeout := flag.Duration("timeout", 60*time.Second, "timeout to use")
	flag.Parse()
	ctx, cancel := context.WithTimeout(context.Background(), *timeout)
	defer cancel()
	all := &measurement{}
	mx := measurex.NewMeasurerWithDefaultSettings()
	cookies := measurex.NewCookieJar()
	headers := measurex.NewHTTPRequestHeaderForMeasuring()
// ```
//
// Everything above this line is the same as in chapter11. What
// changes now is that we call `MeasureURLAndFollowRedirections`
// instead of `MeasureURL`.
//
// Rather than returning a single measurement, this function
// returns a channel where it posts the result of measuring
// the original URL along with all its redirections. Internally,
// `MeasureURLAndFollowRedirections` calls `MeasureURL`.
//
// We accumulate the results in `all.URLs` and print `all` once the
// loop is over. `MeasureURLAndFollowRedirections` closes the channel
// when it is done, which is what terminates the `range` loop.
//
// ```Go
	for m := range mx.MeasureURLAndFollowRedirections(ctx, *URL, headers, cookies) {
		all.URLs = append(all.URLs, measurex.NewArchivalURLMeasurement(m))
	}
	print(all)
}
// ```
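//
// The channel pattern described above can be sketched in isolation.
// What follows is a minimal, standalone example that uses only the
// standard library: `step`, `measureAll`, `collect` and `run` are
// hypothetical names and are not part of `measurex`. The sketch is kept
// inside a comment so that this chapter's program is unchanged.
//
/*
```go
package main

import (
	"context"
	"fmt"
	"time"
)

// step is a hypothetical stand-in for the result of one "web step".
type step struct {
	URL string
}

// measureAll is a hypothetical producer. It posts one step per URL on the
// returned channel and closes the channel when it has finished or when the
// context is done.
func measureAll(ctx context.Context, urls []string) <-chan *step {
	out := make(chan *step)
	go func() {
		defer close(out) // closing the channel ends the consumer's range loop
		for _, u := range urls {
			select {
			case out <- &step{URL: u}:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// collect drains the channel with a range loop, accumulating every step.
func collect(in <-chan *step) []*step {
	var all []*step
	for s := range in {
		all = append(all, s)
	}
	return all
}

// run wires producer and consumer together using an illustrative timeout.
func run(urls []string) []string {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	var visited []string
	for _, s := range collect(measureAll(ctx, urls)) {
		visited = append(visited, s.URL)
	}
	return visited
}

func main() {
	for _, u := range run([]string{"http://example.com/", "https://example.com/"}) {
		fmt.Println(u)
	}
}
```
*/
//
// In this sketch the `defer close(out)` in the producer is what allows
// the consumer to use a plain `range` loop without any extra signalling.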
//
// ## Running the example program
//
// Let us perform a vanilla run first:
//
// ```bash
// go run -race ./internal/tutorial/measurex/chapter12 | jq
// ```
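//
// If `jq` is not available, the output can also be indented in Go. The
// following is a minimal sketch, assuming a hypothetical `indentJSON`
// helper built on `json.MarshalIndent`; it is not part of this chapter's
// program and is kept inside a comment for that reason.
//
/*
```go
package main

import (
	"encoding/json"
	"fmt"
)

// indentJSON is a hypothetical variant of this chapter's print helper that
// returns two-space indented JSON instead of printing a single line.
func indentJSON(v interface{}) (string, error) {
	data, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	// The value below is sample data, not real measurement output.
	out, err := indentJSON(map[string][]string{"URLs": {"http://example.com/"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```
*/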
//
// Take a look at the JSON. You should see several redirects
// and that we measure each endpoint of each redirect, including
// QUIC endpoints that we discover on the way.
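//
// To make the idea of several "web steps" concrete, here is a minimal
// sketch that follows redirects one request at a time using only the
// standard library. It is not how `measurex` works internally, and it
// ignores DNS, endpoint and QUIC measurements entirely. The names
// `followRedirects` and `demo` are hypothetical, and the example talks to
// a local test server rather than to a real website. It is kept inside a
// comment so that this chapter's program is unchanged.
//
/*
```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// followRedirects is a hypothetical helper. It fetches start, then each
// redirect target in turn, and returns every URL it requested. It stops at
// the first non-3xx response or after maxSteps requests.
func followRedirects(client *http.Client, start string, maxSteps int) ([]string, error) {
	var visited []string
	next := start
	for i := 0; i < maxSteps; i++ {
		resp, err := client.Get(next)
		if err != nil {
			return visited, err
		}
		resp.Body.Close()
		visited = append(visited, next)
		if resp.StatusCode < 300 || resp.StatusCode > 399 {
			return visited, nil // not a redirect: this was the last step
		}
		loc, err := resp.Location() // resolves relative Location headers
		if err != nil {
			return visited, err
		}
		next = loc.String()
	}
	return visited, fmt.Errorf("stopped after %d steps", maxSteps)
}

// demo starts a local server where /a redirects to /b and /b to /c,
// then follows the chain starting from /a.
func demo() ([]string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/b", http.StatusFound)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/c", http.StatusFound)
	})
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "done")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()
	client := &http.Client{
		// Return each redirect response to the caller instead of
		// following it automatically.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	return followRedirects(client, srv.URL+"/a", 10)
}

func main() {
	visited, err := demo()
	if err != nil {
		panic(err)
	}
	for i, u := range visited {
		fmt.Printf("step %d: %s\n", i+1, u)
	}
}
```
*/
//
// Each entry returned by `followRedirects` plays the role of one
// "web step" in this simplified picture.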
//
// Exercise: remove the code that converts to the OONI data format
// (the call to `measurex.NewArchivalURLMeasurement`) and compare the
// output with the previous chapter's. Do you see any difference?
//
// ## Conclusion
//
// We have introduced `MeasureURLAndFollowRedirections`, the
// top-level API for fully measuring a URL and all the URLs
// that derive from that URL via redirection.
//
// -=-=- StopHere -=-=-