
Chapter XII: Following redirections.

This program shows how to combine the URL measurement "step" introduced in the previous chapter with following redirections. If we say that the previous chapter performed a "web step", then we can say that here we're performing multiple "web steps".

(This file is auto-generated. Do not edit it directly! To apply changes you need to modify ./internal/tutorial/measurex/chapter12/main.go.)

main.go

The beginning of the program is pretty much the same as in the previous chapter, except that here we need to define a measurement container type that will hold the result of each "web step".

package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"time"

	"github.com/ooni/probe-cli/v3/internal/measurex"
	"github.com/ooni/probe-cli/v3/internal/runtimex"
)

type measurement struct {
	URLs []*measurex.ArchivalURLMeasurement
}

func print(v interface{}) {
	data, err := json.Marshal(v)
	runtimex.PanicOnError(err, "json.Marshal failed")
	fmt.Printf("%s\n", string(data))
}

func main() {
	URL := flag.String("url", "http://facebook.com/", "URL to fetch")
	timeout := flag.Duration("timeout", 60*time.Second, "timeout to use")
	flag.Parse()
	ctx, cancel := context.WithTimeout(context.Background(), *timeout)
	defer cancel()
	all := &measurement{}
	mx := measurex.NewMeasurerWithDefaultSettings()
	cookies := measurex.NewCookieJar()
	headers := measurex.NewHTTPRequestHeaderForMeasuring()

Everything above this line is the same as in chapter11. What changes now is that we're calling MeasureURLAndFollowRedirections instead of MeasureURL.

Rather than returning a single measurement, this function returns a channel where it posts the result of measuring the original URL along with all its redirections. Internally, MeasureURLAndFollowRedirections calls MeasureURL.

The parallelism argument dictates how many parallel goroutines to use for parallelizable operations. (A zero or negative value implies that the code should use a sensible default value.)
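To make the "zero or negative means default" convention concrete, here is a minimal, self-contained sketch of a bounded worker pool. It is not the measurex implementation: `defaultParallelism`, `effectiveParallelism` and `squareAll` are hypothetical names, and the fallback value of 4 is an assumption chosen only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// defaultParallelism is a hypothetical fallback value. The value that
// measurex actually uses is not stated in this chapter.
const defaultParallelism = 4

// effectiveParallelism maps zero or negative requests to the fallback.
func effectiveParallelism(requested int) int {
	if requested <= 0 {
		return defaultParallelism
	}
	return requested
}

// squareAll squares each input using a bounded number of goroutines.
func squareAll(parallelism int, inputs []int) []int {
	jobs := make(chan int)
	out := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i := 0; i < effectiveParallelism(parallelism); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is delivered to exactly one worker, so
			// no two goroutines write the same slice element.
			for idx := range jobs {
				out[idx] = inputs[idx] * inputs[idx]
			}
		}()
	}
	for idx := range inputs {
		jobs <- idx
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	// Passing 0 asks for the default number of workers.
	fmt.Println(squareAll(0, []int{1, 2, 3}))
}
```

The caller only picks an upper bound on concurrency; the result does not depend on how many workers run.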

We accumulate the results in all.URLs and, once the loop ends, print all. MeasureURLAndFollowRedirections closes the channel when it is done, which is what terminates the loop.

	const parallelism = 3
	for m := range mx.MeasureURLAndFollowRedirections(ctx, parallelism, *URL, headers, cookies) {
		all.URLs = append(all.URLs, measurex.NewArchivalURLMeasurement(m))
	}
	print(all)
}
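The pattern used above, where a function returns a channel, posts results to it, and closes it when done, can be sketched in isolation as follows. This is only an illustration under stated assumptions: `step`, `followAll` and `collect` are hypothetical names and no network measurement takes place.

```go
package main

import "fmt"

// step is a hypothetical stand-in for the result of one "web step".
type step struct {
	URL string
}

// followAll posts one step per URL on the returned channel and
// closes the channel when there is nothing left to post.
func followAll(urls []string) <-chan *step {
	out := make(chan *step)
	go func() {
		defer close(out) // closing the channel ends the consumer's range loop
		for _, u := range urls {
			out <- &step{URL: u}
		}
	}()
	return out
}

// collect drains the channel, mirroring the loop in main.go.
func collect(urls []string) []string {
	var seen []string
	for s := range followAll(urls) {
		seen = append(seen, s.URL)
	}
	return seen
}

func main() {
	fmt.Println(collect([]string{"http://example.com/", "https://example.com/"}))
}
```

Because the producer owns the channel and closes it, the consumer needs no extra signal to know when to stop.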

Running the example program

Let us perform a vanilla run first:

go run -race ./internal/tutorial/measurex/chapter12 | jq

Take a look at the JSON. You should see several redirects and that we measure each endpoint of each redirect, including QUIC endpoints that we discover on the way.
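To get a feel for what a chain of redirects looks like without touching the network, the following sketch follows redirects by hand against a local test server using only the standard library. It does not use measurex and is not how measurex follows redirects; `noFollowClient`, `redirectChain` and `newTestServer` are hypothetical helpers written for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// noFollowClient returns a client that hands back 3xx responses
// instead of following them automatically.
func noFollowClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// redirectChain fetches start and follows Location headers by hand,
// returning every URL it visited. It gives up after maxHops requests.
func redirectChain(client *http.Client, start string, maxHops int) ([]string, error) {
	var chain []string
	next := start
	for i := 0; i < maxHops; i++ {
		chain = append(chain, next)
		resp, err := client.Get(next)
		if err != nil {
			return chain, err
		}
		resp.Body.Close()
		if resp.StatusCode < 300 || resp.StatusCode >= 400 {
			return chain, nil // not a redirect: the chain ends here
		}
		loc, err := resp.Location() // resolves relative Location values
		if err != nil {
			return chain, err
		}
		next = loc.String()
	}
	return chain, errors.New("too many redirects")
}

// newTestServer redirects /a to /b and /b to /c; /c answers 200.
func newTestServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/b", http.StatusFound)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/c", http.StatusFound)
	})
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "done")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newTestServer()
	defer srv.Close()
	chain, err := redirectChain(noFollowClient(), srv.URL+"/a", 10)
	fmt.Println(chain, err)
}
```

Each element of the returned slice corresponds roughly to one "web step" in the sense used in this chapter, although the real API also measures every endpoint of every URL.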

Exercise: remove the code that converts to the OONI data format and compare the output with that of the previous chapter. Do you see any difference?

Conclusion

We have introduced MeasureURLAndFollowRedirections, the top-level API for fully measuring a URL and all the URLs that derive from such a URL via redirection.