feat(measurex): refactored measurement library (#528)
This commit introduces a measurement library that consists of
refactored code from earlier websteps experiments.
I am not going to add tests for the time being, because this library
is still a bit in flux, as we finalize websteps.
I will soon, though, commit documentation explaining in detail how
to use it, which is currently at https://github.com/ooni/probe-cli/pull/506
and adds a new directory to internal/tutorial.
The core idea of this measurement library is to allow two
measurement modes:
1. tracing, which is what we're currently doing, and the
tutorial shows how we can rewrite the measurement part of web
connectivity with measurex using less code. Under a tracing
approach, we construct a normal http.Client that however has
tracing configured, we gather events for resolve, connect, TLS
handshake, QUIC handshake, HTTP round trip, etc. and then we
try to make sense of what happened from the events stream;
2. step-by-step, which is what websteps does, and basically
means that after each operation you immediately write into
a Measurement structure its results and immediately draw the
conclusions on what seems odd (which later may become an
anomaly if we see what the test helper measured).
This library is also such that it produces a data format
compatible with the current OONI spec.
This work is part of https://github.com/ooni/probe/issues/1733.
2021-09-30 01:24:08 +02:00
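The two measurement modes described in the commit message above can be sketched roughly as follows. This is a minimal, self-contained illustration and not measurex code: the names Event, Measurement, op, sampleOps, trace, analyzeTrace, and stepByStep are all hypothetical, the operations are simulated rather than performed on the network, and the sketch assumes that step-by-step mode stops at the first failing operation.

```go
package main

import (
	"errors"
	"fmt"
)

// errReset is a simulated failure used by the sample operations.
var errReset = errors.New("connection_reset")

// Event is a hypothetical record of one operation and its outcome.
type Event struct {
	Operation string
	Failure   error
}

// op is a hypothetical named operation (e.g. resolve, connect).
type op struct {
	name string
	run  func() error
}

// sampleOps returns simulated operations: the TLS handshake fails.
func sampleOps() []op {
	return []op{
		{"resolve", func() error { return nil }},
		{"connect", func() error { return nil }},
		{"tls_handshake", func() error { return errReset }},
	}
}

// trace models the tracing mode: run everything and only gather events.
func trace(ops []op) []Event {
	var events []Event
	for _, o := range ops {
		events = append(events, Event{Operation: o.name, Failure: o.run()})
	}
	return events
}

// analyzeTrace makes sense of the event stream after the fact.
func analyzeTrace(events []Event) []string {
	var odd []string
	for _, ev := range events {
		if ev.Failure != nil {
			odd = append(odd, ev.Operation+": "+ev.Failure.Error())
		}
	}
	return odd
}

// Measurement is a hypothetical structure filled in step by step.
type Measurement struct {
	Steps    []Event
	Oddities []string
}

// stepByStep models the step-by-step mode: after each operation we
// immediately store its result and note what seems odd. Stopping at
// the first failure is an assumption of this sketch.
func stepByStep(ops []op) *Measurement {
	m := &Measurement{}
	for _, o := range ops {
		err := o.run()
		m.Steps = append(m.Steps, Event{Operation: o.name, Failure: err})
		if err != nil {
			m.Oddities = append(m.Oddities, o.name+": "+err.Error())
			break
		}
	}
	return m
}

func main() {
	fmt.Println(analyzeTrace(trace(sampleOps())))
	m := stepByStep(sampleOps())
	fmt.Println(len(m.Steps), m.Oddities)
}
```

In the tracing variant the conclusions come from a separate pass over the events, while in the step-by-step variant they are drawn as soon as each result is written into the Measurement.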
package measurex

//
// HTTP
//
// This file contains basic networking code. We provide:
//
// - a wrapper for model.HTTPTransport that stores
// round trip events into a WritableDB
//
// - an interface that is http.Client like and one internal
// implementation of such an interface that helps us to
// store HTTP redirections info into a WritableDB
//

import (
	"bytes"
	"context"
	"crypto/tls"
	"errors"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/url"
	"time"
	"unicode/utf8"

	"github.com/lucas-clemente/quic-go"
	"github.com/ooni/probe-cli/v3/internal/engine/httpheader"
	"github.com/ooni/probe-cli/v3/internal/model"
	"github.com/ooni/probe-cli/v3/internal/netxlite"
	"github.com/ooni/probe-cli/v3/internal/runtimex"
	"golang.org/x/net/publicsuffix"
)

// WrapHTTPTransport creates a new transport that saves
// HTTP events into the WritableDB.
func (mx *Measurer) WrapHTTPTransport(
	db WritableDB, txp model.HTTPTransport) *HTTPTransportDB {
	return WrapHTTPTransport(mx.Begin, db, txp, mx.httpMaxBodySnapshotSize())
}

// DefaultHTTPMaxBodySnapshotSize is the default size used when
// saving HTTP body snapshots. We only save a small snapshot of the
// body to keep measurements lean, since we're mostly interested
// in TLS interference nowadays and much less in full bodies.
const DefaultHTTPMaxBodySnapshotSize = 1 << 11

// httpMaxBodySnapshotSize selects the maximum body snapshot size.
func (mx *Measurer) httpMaxBodySnapshotSize() int64 {
	if mx.HTTPMaxBodySnapshotSize > 0 {
		return mx.HTTPMaxBodySnapshotSize
	}
	return DefaultHTTPMaxBodySnapshotSize
}

// WrapHTTPTransport creates a new model.HTTPTransport instance
// using the following configuration:
//
// - begin is the conventional "zero time" indicating the
// moment when the measurement began;
//
// - db is the writable DB into which to write the measurement;
//
// - txp is the underlying transport to use;
//
// - maxBodySnapshotSize is the max size of the response body snapshot
// to save: we'll truncate bodies larger than that.
func WrapHTTPTransport(
	begin time.Time, db WritableDB, txp model.HTTPTransport,
	maxBodySnapshotSize int64) *HTTPTransportDB {
	return &HTTPTransportDB{
		HTTPTransport:       txp,
		Begin:               begin,
		DB:                  db,
		MaxBodySnapshotSize: maxBodySnapshotSize,
	}
}

// NewHTTPTransportWithConn creates and wraps an HTTPTransport that
// does not dial and only uses the given conn.
func (mx *Measurer) NewHTTPTransportWithConn(
	logger model.Logger, db WritableDB, conn Conn) *HTTPTransportDB {
	return mx.WrapHTTPTransport(db, netxlite.NewHTTPTransport(
		logger, netxlite.NewSingleUseDialer(conn), netxlite.NewNullTLSDialer()))
}

// NewHTTPTransportWithTLSConn creates and wraps an HTTPTransport that
// does not dial and only uses the given conn.
func (mx *Measurer) NewHTTPTransportWithTLSConn(
	logger model.Logger, db WritableDB, conn netxlite.TLSConn) *HTTPTransportDB {
	return mx.WrapHTTPTransport(db, netxlite.NewHTTPTransport(
		logger, netxlite.NewNullDialer(), netxlite.NewSingleUseTLSDialer(conn)))
}

// NewHTTPTransportWithQUICConn creates and wraps an HTTPTransport that
// does not dial and only uses the given QUIC connection.
func (mx *Measurer) NewHTTPTransportWithQUICConn(
	logger model.Logger, db WritableDB, qconn quic.EarlyConnection) *HTTPTransportDB {
	return mx.WrapHTTPTransport(db, netxlite.NewHTTP3Transport(
		logger, netxlite.NewSingleUseQUICDialer(qconn), &tls.Config{}))
}

// HTTPTransportDB is an implementation of HTTPTransport that
// writes measurement events into a WritableDB.
//
// There are many factories to construct this data type. Otherwise,
// you can construct it manually. In which case, do not modify
// public fields during usage, since this may cause a data race.
type HTTPTransportDB struct {
	model.HTTPTransport

	// Begin is when we started measuring.
	Begin time.Time

	// DB is where to write events.
	DB WritableDB

	// MaxBodySnapshotSize is the maximum size of the body
	// snapshot that we take during a round trip.
	MaxBodySnapshotSize int64
}

// HTTPRequest is the HTTP request.
type HTTPRequest struct {
	// Names consistent with df-001-http.md
	Method  string          `json:"method"`
	URL     string          `json:"url"`
	Headers ArchivalHeaders `json:"headers"`
}

// HTTPResponse is the HTTP response.
type HTTPResponse struct {
	// Names consistent with df-001-http.md
	Code            int64               `json:"code"`
	Headers         ArchivalHeaders     `json:"headers"`
	Body            *ArchivalBinaryData `json:"body"`
	BodyIsTruncated bool                `json:"body_is_truncated"`

	// Fields not part of the spec
	BodyLength int64 `json:"x_body_length"`
	BodyIsUTF8 bool  `json:"x_body_is_utf8"`
}

// HTTPRoundTripEvent contains information about an HTTP round trip.
type HTTPRoundTripEvent struct {
	Failure                 *string
	Method                  string
	URL                     string
	RequestHeaders          http.Header
	StatusCode              int64
	ResponseHeaders         http.Header
	ResponseBody            []byte
	ResponseBodyLength      int64
	ResponseBodyIsTruncated bool
	ResponseBodyIsUTF8      bool
	Finished                float64
	Started                 float64
	Oddity                  Oddity
}

func (txp *HTTPTransportDB) RoundTrip(req *http.Request) (*http.Response, error) {
|
|
|
|
started := time.Since(txp.Begin).Seconds()
|
|
|
|
resp, err := txp.HTTPTransport.RoundTrip(req)
|
|
|
|
rt := &HTTPRoundTripEvent{
|
2021-11-05 10:46:45 +01:00
|
|
|
Method: req.Method,
|
|
|
|
URL: req.URL.String(),
|
|
|
|
RequestHeaders: req.Header,
|
|
|
|
Started: started,
|
feat(measurex): refactored measurement library (#528)
This commit introduce a measurement library that consists of
refactored code from earlier websteps experiments.
I am not going to add tests for the time being, because this library
is still a bit in flux, as we finalize websteps.
I will soon though commit documentation explaining in detail how
to use it, which currrently is at https://github.com/ooni/probe-cli/pull/506
and adds a new directory to internal/tutorial.
The core idea of this measurement library is to allow two
measurement modes:
1. tracing, which is what we're currently doing now, and the
tutorial shows how we can rewrite the measurement part of web
connectivity with measurex using less code. Under a tracing
approach, we construct a normal http.Client that however has
tracing configured, we gather events for resolve, connect, TLS
handshake, QUIC handshake, HTTP round trip, etc. and then we
try to make sense of what happened from the events stream;
2. step-by-step, which is what websteps does, and basically
means that after each operation you immediately write into
a Measurement structure its results and immediately draw the
conclusions on what seems odd (which later may become an
anomaly if we see what the test helper measured).
This library is also such that it produces a data format
compatible with the current OONI spec.
This work is part of https://github.com/ooni/probe/issues/1733.
2021-09-30 01:24:08 +02:00
|
|
|
}
|
|
|
|
if err != nil {
|
|
|
|
rt.Finished = time.Since(txp.Begin).Seconds()
|
2021-11-05 10:46:45 +01:00
|
|
|
rt.Failure = NewFailure(err)
|
feat(measurex): refactored measurement library (#528)
This commit introduce a measurement library that consists of
refactored code from earlier websteps experiments.
I am not going to add tests for the time being, because this library
is still a bit in flux, as we finalize websteps.
I will soon though commit documentation explaining in detail how
to use it, which currrently is at https://github.com/ooni/probe-cli/pull/506
and adds a new directory to internal/tutorial.
The core idea of this measurement library is to allow two
measurement modes:
1. tracing, which is what we're currently doing now, and the
tutorial shows how we can rewrite the measurement part of web
connectivity with measurex using less code. Under a tracing
approach, we construct a normal http.Client that however has
tracing configured, we gather events for resolve, connect, TLS
handshake, QUIC handshake, HTTP round trip, etc. and then we
try to make sense of what happened from the events stream;
2. step-by-step, which is what websteps does, and basically
means that after each operation you immediately write into
a Measurement structure its results and immediately draw the
conclusions on what seems odd (which later may become an
anomaly if we see what the test helper measured).
This library is also such that it produces a data format
compatible with the current OONI spec.
This work is part of https://github.com/ooni/probe/issues/1733.
2021-09-30 01:24:08 +02:00
|
|
|
txp.DB.InsertIntoHTTPRoundTrip(rt)
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
switch {
|
|
|
|
case resp.StatusCode == 403:
|
|
|
|
rt.Oddity = OddityStatus403
|
|
|
|
case resp.StatusCode == 404:
|
|
|
|
rt.Oddity = OddityStatus404
|
|
|
|
case resp.StatusCode == 503:
|
|
|
|
rt.Oddity = OddityStatus503
|
|
|
|
case resp.StatusCode >= 400:
|
|
|
|
rt.Oddity = OddityStatusOther
|
|
|
|
}
|
2021-11-05 10:46:45 +01:00
|
|
|
rt.StatusCode = int64(resp.StatusCode)
|
|
|
|
rt.ResponseHeaders = resp.Header
|
feat(measurex): refactored measurement library (#528)
This commit introduce a measurement library that consists of
refactored code from earlier websteps experiments.
I am not going to add tests for the time being, because this library
is still a bit in flux, as we finalize websteps.
I will soon though commit documentation explaining in detail how
to use it, which currrently is at https://github.com/ooni/probe-cli/pull/506
and adds a new directory to internal/tutorial.
The core idea of this measurement library is to allow two
measurement modes:
1. tracing, which is what we're currently doing now, and the
tutorial shows how we can rewrite the measurement part of web
	r := io.LimitReader(resp.Body, txp.MaxBodySnapshotSize)
	body, err := netxlite.ReadAllContext(req.Context(), r)
	if err != nil {
		rt.Finished = time.Since(txp.Begin).Seconds()
		rt.Failure = NewFailure(err)
		txp.DB.InsertIntoHTTPRoundTrip(rt)
		return nil, err
	}
	resp.Body = &httpTransportBody{ // allow for reading more if needed
		Reader: io.MultiReader(bytes.NewReader(body), resp.Body),
		Closer: resp.Body,
	}
	rt.ResponseBody = body
	rt.ResponseBodyLength = int64(len(body))
	rt.ResponseBodyIsTruncated = int64(len(body)) >= txp.MaxBodySnapshotSize
	rt.ResponseBodyIsUTF8 = utf8.Valid(body)
	rt.Finished = time.Since(txp.Begin).Seconds()
	txp.DB.InsertIntoHTTPRoundTrip(rt)
	return resp, nil
}

type httpTransportBody struct {
	io.Reader
	io.Closer
}

// NewHTTPClientWithoutRedirects creates a new HTTPClient instance that
// does not automatically perform redirects.
func NewHTTPClientWithoutRedirects(
	db WritableDB, jar http.CookieJar, txp model.HTTPTransport) model.HTTPClient {
	return newHTTPClient(db, jar, txp, http.ErrUseLastResponse)
}

// NewHTTPClientWithRedirects creates a new HTTPClient
// instance that automatically performs redirects.
func NewHTTPClientWithRedirects(
	db WritableDB, jar http.CookieJar, txp model.HTTPTransport) model.HTTPClient {
	return newHTTPClient(db, jar, txp, nil)
}

// HTTPRedirectEvent records an HTTP redirect.
type HTTPRedirectEvent struct {
	// URL is the URL triggering the redirect.
	URL *url.URL

	// Location is the URL to which we're redirected.
	Location *url.URL

	// Cookies contains the cookies for Location.
	Cookies []*http.Cookie

	// The Error field can have three values:
	//
	// - nil if the redirect occurred;
	//
	// - ErrHTTPTooManyRedirects when we see too many redirections;
	//
	// - http.ErrUseLastResponse if redirections are disabled.
	Error error
}

// ErrHTTPTooManyRedirects is the unexported error that the standard library
// would return when hitting too many redirects.
var ErrHTTPTooManyRedirects = errors.New("stopped after 10 redirects")

func newHTTPClient(db WritableDB, cookiejar http.CookieJar,
	txp model.HTTPTransport, defaultErr error) model.HTTPClient {
	return netxlite.WrapHTTPClient(&http.Client{
		Transport: txp,
		Jar:       cookiejar,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			err := defaultErr
			if len(via) >= 10 {
				err = ErrHTTPTooManyRedirects
			}
			db.InsertIntoHTTPRedirect(&HTTPRedirectEvent{
				URL:      via[0].URL, // bug in Go stdlib if we crash here
				Location: req.URL,
				Cookies:  cookiejar.Cookies(req.URL),
				Error:    err,
			})
			return err
		},
	})
}

// NewCookieJar is a convenience factory for creating an http.CookieJar
// that is aware of the effective TLD / public suffix list. This
// means that the jar won't allow a domain to set cookies for another
// unrelated domain (in the public-suffix-list sense).
func NewCookieJar() http.CookieJar {
	jar, err := cookiejar.New(&cookiejar.Options{
		PublicSuffixList: publicsuffix.List,
	})
	// Safe to PanicOnError here: cookiejar.New _always_ returns a nil error.
	runtimex.PanicOnError(err, "cookiejar.New failed")
	return jar
}

// NewHTTPRequestHeaderForMeasuring returns an http.Header where
// the headers are the ones we use for measuring.
func NewHTTPRequestHeaderForMeasuring() http.Header {
	h := http.Header{}
	h.Set("Accept", httpheader.Accept())
	h.Set("Accept-Language", httpheader.AcceptLanguage())
	h.Set("User-Agent", httpheader.UserAgent())
	return h
}

// NewHTTPRequestWithContext is a convenience factory for creating
// a new HTTP request with the typical headers we use when performing
// measurements already set inside of req.Header.
func NewHTTPRequestWithContext(ctx context.Context,
	method, URL string, body io.Reader) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, method, URL, body)
	if err != nil {
		return nil, err
	}
	req.Header = NewHTTPRequestHeaderForMeasuring()
	return req, nil
}

// NewHTTPGetRequest is a convenience factory for creating a new
// http.Request using the GET method and the given URL.
func NewHTTPGetRequest(ctx context.Context, URL string) (*http.Request, error) {
	return NewHTTPRequestWithContext(ctx, "GET", URL, nil)
}