ooni-probe-cli/nettests/nettests.go
Simone Basso b9b555ba68 Integrate further with ooni/probe-engine: episode two (#46)
* utils/geoip.go: use github.com/ooni/probe-engine

Let's start using the engine by rewriting utils/geoip.go to
be just a thin wrapper around the engine functionality.

* Ready for review

* Checkpoint: the im tests are converted

Still have some doubts with respect to the variables that
are passed to MK via probe-engine. Will double check.

* fix(i/c/r/run.go): write the correct logic

* nettests: one more comment and also fix a format string

* Tweak previous

* progress

* Fix doofus

* better comment

* XXX => actionable comment

* Add glue to simplify test keys management

Making the concept of measurement more abstract in the engine is
not feasible because, when submitting a measurement, we need to
modify it to update the report ID and the measurement ID. Therefore,
returning a serialized measurement is not a good idea. We will
keep using a model.Measurement in the engine.

Changing model.Measurement.TestKeys's type from a `interface{}`
pointing to a well defined data structure to `map[string]interface{}`
is a regression because means that we are moving from code that
has a clear and defined structure to code that is more complicated
to parse and validate. Since we're already suffering havily from
the lack of a good schema, I'm not going to make the situation
worst by worsening the engine. At least for ndt7 and psiphon, we
now have a good schema and I don't want to lose that.

However, the current code in this repository is expecting the
test keys to be a `map[string]interface{}`. This choice was
dictated by the fact that we receive a JSON from Measurement Kit
and by the fact that there's not a clear schema.

To solve this tension, in this commit I am going to write glue
adapter code that makes sure that the TestKeys of a Measurement
are converted to `map[string]interface{}`. This will be done
using a type cast where possible and JSON serialization and parsing
otherwise. In a perfect world, glue is not a good idea, but in a
real world it may actually be useful.

When all tests in the engine will have a clear Go data structure,
we'll then remove the glue and just cast to the proper data
structure from `interface{}` where required.

* nettests/performance: use probe-engine

* go.{mod,sum}: upgrade to latest probe-engine

* nettests/middlebox: use ooni/probe-engine

* Update to the latest probe-engine

* web_connectivity: rewrite to use probe-engine

* Cosmetic change suggested by @hellais

* nettests/nettests.go: remove unused code

* nettests/nettests.go: fix progress

* nettests/nettests.go: remove go-measurement-kit code

* We don't depend on go-measurement-kit anymore

* Improve non-verbose output where possible

See also: https://github.com/measurement-kit/measurement-kit/issues/1856

* Make web_connectivity output pleasant

* Update to the latest probe-engine

* nettests/nettests.go: honour sharing settings

* Update to the latest probe-engine

* Use log.WithFields for probe-engine

* Update go.mod go.sum

* Revert "Update go.mod go.sum"

This reverts commit 5ecd38d8236f4a4e9b77ddb8e8a0d1e3cdd4b818.

* Revert "Revert "Update go.mod go.sum""

This reverts commit 6114b31eca98826112032776bd0feff02d763ecd.

* Upgrade ooni/probe-engine

* Unset GOPATH before running go build commands

* Dockefile: fix linux build by using latest

* Update to the latest ooni/probe-engine

```
go get -u github.com/ooni/probe-engine
go mod tidy
```

* Repair build
2019-08-15 18:08:43 +02:00

222 lines
7.2 KiB
Go

package nettests
import (
"context"
"database/sql"
"fmt"
"path/filepath"
"time"
"github.com/apex/log"
"github.com/fatih/color"
ooni "github.com/ooni/probe-cli"
"github.com/ooni/probe-cli/internal/database"
"github.com/ooni/probe-cli/internal/enginex"
"github.com/ooni/probe-cli/internal/output"
"github.com/ooni/probe-cli/utils"
"github.com/ooni/probe-engine/experiment"
"github.com/ooni/probe-engine/experiment/handler"
"github.com/ooni/probe-engine/model"
"github.com/pkg/errors"
)
// Nettest interface. Every Nettest should implement this.
type Nettest interface {
Run(*Controller) error
GetTestKeys(map[string]interface{}) (interface{}, error)
LogSummary(string) error
}
// NewController creates a nettest controller
func NewController(nt Nettest, ctx *ooni.Context, res *database.Result) *Controller {
msmtPath := filepath.Join(ctx.TempDir,
fmt.Sprintf("msmt-%T-%s.jsonl", nt,
time.Now().UTC().Format(utils.ResultTimestamp)))
return &Controller{
Ctx: ctx,
nt: nt,
res: res,
msmtPath: msmtPath,
}
}
// Controller is passed to the run method of every Nettest
// each nettest instance has one controller
type Controller struct {
Ctx *ooni.Context
res *database.Result
nt Nettest
ntCount int
ntIndex int
msmts map[int64]*database.Measurement
msmtPath string // XXX maybe we can drop this and just use a temporary file
inputIdxMap map[int64]int64 // Used to map mk idx to database id
// numInputs is the total number of inputs
numInputs int
// curInputIdx is the current input index
curInputIdx int
}
// SetInputIdxMap is used to set the mapping of index into input. This mapping
// is used to reference, for example, a particular URL based on the index inside
// of the input list and the index of it in the database.
func (c *Controller) SetInputIdxMap(inputIdxMap map[int64]int64) error {
c.inputIdxMap = inputIdxMap
return nil
}
// SetNettestIndex is used to set the current nettest index and total nettest
// count to compute a different progress percentage.
func (c *Controller) SetNettestIndex(i, n int) {
c.ntCount = n
c.ntIndex = i
}
// Run runs the selected nettest using the related experiment
// with the specified inputs.
//
// This function will continue to run in most cases but will
// immediately halt if something's wrong with the file system.
func (c *Controller) Run(exp *experiment.Experiment, inputs []string) error {
ctx := context.Background()
// This will configure the controller as handler for the callbacks
// called by ooni/probe-engine/experiment.Experiment.
exp.Callbacks = handler.Callbacks(c)
c.numInputs = len(inputs)
c.msmts = make(map[int64]*database.Measurement)
// These values are shared by every measurement
var reportID sql.NullString
resultID := c.res.ID
log.Debug(color.RedString("status.queued"))
log.Debug(color.RedString("status.started"))
log.Debugf("OutputPath: %s", c.msmtPath)
if c.Ctx.Config.Sharing.UploadResults {
if err := exp.OpenReport(ctx); err != nil {
log.Debugf(
"%s: %s", color.RedString("failure.report_create"), err.Error(),
)
} else {
defer exp.CloseReport(ctx)
log.Debugf(color.RedString("status.report_create"))
reportID = sql.NullString{String: exp.ReportID(), Valid: true}
}
}
for idx, input := range inputs {
c.curInputIdx = idx // allow for precise progress
idx64 := int64(idx)
log.Debug(color.RedString("status.measurement_start"))
var urlID sql.NullInt64
if c.inputIdxMap != nil {
urlID = sql.NullInt64{Int64: c.inputIdxMap[idx64], Valid: true}
}
msmt, err := database.CreateMeasurement(
c.Ctx.DB, reportID, exp.TestName, resultID, c.msmtPath, urlID,
)
if err != nil {
return errors.Wrap(err, "failed to create measurement")
}
c.msmts[idx64] = msmt
measurement, err := exp.Measure(ctx, input)
if err != nil {
log.WithError(err).Debug(color.RedString("failure.measurement"))
if err := c.msmts[idx64].Failed(c.Ctx.DB, err.Error()); err != nil {
return errors.Wrap(err, "failed to mark measurement as failed")
}
continue
}
// Make sure we share what the user wants us to share.
if c.Ctx.Config.Sharing.IncludeIP == false {
measurement.ProbeIP = model.DefaultProbeIP
}
if c.Ctx.Config.Sharing.IncludeASN == false {
measurement.ProbeASN = fmt.Sprintf("AS%d", model.DefaultProbeASN)
}
if c.Ctx.Config.Sharing.IncludeCountry == false {
measurement.ProbeCC = model.DefaultProbeCC
}
if c.Ctx.Config.Sharing.UploadResults {
// Implementation note: SubmitMeasurement will fail here if we did fail
// to open the report but we still want to continue. There will be a
// bit of a spew in the logs, perhaps, but stopping seems less efficient.
if err := exp.SubmitMeasurement(ctx, &measurement); err != nil {
log.Debug(color.RedString("failure.measurement_submission"))
if err := c.msmts[idx64].UploadFailed(c.Ctx.DB, err.Error()); err != nil {
return errors.Wrap(err, "failed to mark upload as failed")
}
} else if err := c.msmts[idx64].UploadSucceeded(c.Ctx.DB); err != nil {
return errors.Wrap(err, "failed to mark upload as succeeded")
}
}
if err := exp.SaveMeasurement(measurement, c.msmtPath); err != nil {
return errors.Wrap(err, "failed to save measurement on disk")
}
if err := c.msmts[idx64].Done(c.Ctx.DB); err != nil {
return errors.Wrap(err, "failed to mark measurement as done")
}
// We're not sure whether it's enough to log the error or we should
// instead also mark the measurement as failed. Strictly speaking this
// is an inconsistency between the code that generate the measurement
// and the code that process the measurement. We do have some data
// but we're not gonna have a summary. To be reconsidered.
genericTk, err := enginex.MakeGenericTestKeys(measurement)
if err != nil {
log.WithError(err).Error("failed to cast the test keys")
continue
}
tk, err := c.nt.GetTestKeys(genericTk)
if err != nil {
log.WithError(err).Error("failed to obtain testKeys")
continue
}
log.Debugf("Fetching: %d %v", idx, c.msmts[idx64])
if err := database.AddTestKeys(c.Ctx.DB, c.msmts[idx64], tk); err != nil {
return errors.Wrap(err, "failed to add test keys to summary")
}
}
log.Debugf("status.end")
for idx, msmt := range c.msmts {
log.Debugf("adding msmt#%d to result", idx)
if err := msmt.AddToResult(c.Ctx.DB, c.res); err != nil {
return errors.Wrap(err, "failed to add to result")
}
}
return nil
}
// OnProgress should be called when a new progress event is available.
func (c *Controller) OnProgress(perc float64, msg string) {
log.Debugf("OnProgress: %f - %s", perc, msg)
if c.numInputs >= 1 {
// make the percentage relative to the current input over all inputs
floor := (float64(c.curInputIdx) / float64(c.numInputs))
step := 1.0 / float64(c.numInputs)
perc = floor + perc * step
}
if c.ntCount > 0 {
// make the percentage relative to the current nettest over all nettests
perc = float64(c.ntIndex)/float64(c.ntCount) + perc/float64(c.ntCount)
}
key := fmt.Sprintf("%T", c.nt)
output.Progress(key, perc, msg)
}
// OnDataUsage should be called when we have a data usage update.
func (c *Controller) OnDataUsage(dloadKiB, uploadKiB float64) {
c.res.DataUsageDown += dloadKiB
c.res.DataUsageUp += uploadKiB
}