Description
When asynchronous instruments (Float64ObservableCounter, Int64ObservableCounter, Float64ObservableGauge, etc.) are registered with WithFloat64Callback / WithInt64Callback and one callback returns an error, all metric data collected in that collection cycle is silently dropped, including data successfully observed by other callbacks.
Environment
- OS: Linux
- Architecture: x86_64
- Go Version: 1.26.1
- opentelemetry-go version: v1.42.0
Steps To Reproduce
package main

import (
	"context"
	"errors"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	exporter, err := stdoutmetric.New()
	if err != nil {
		panic(err)
	}
	mp := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exporter)),
	)
	// Shutdown flushes pending telemetry before the program exits.
	defer mp.Shutdown(ctx)
	otel.SetMeterProvider(mp)

	meter := otel.Meter("example")

	// The callback options and observer types live in the API package
	// (otel/metric), not in the SDK package (otel/sdk/metric).
	meter.Float64ObservableCounter(
		"metric.one",
		metric.WithFloat64Callback(func(_ context.Context, o metric.Float64Observer) error {
			return errors.New("something went wrong") // this callback fails
		}),
	)

	meter.Float64ObservableCounter(
		"metric.two",
		metric.WithFloat64Callback(func(_ context.Context, o metric.Float64Observer) error {
			o.Observe(42.0) // this callback succeeds
			return nil
		}),
	)
}
- Run the above program.
- Observe that metric.two is not exported, even though its callback succeeded.
Expected behavior
When one callback returns an error, the error should be reported (e.g. logged via the global error handler), but metric data collected by other callbacks that succeeded should still be exported.
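Until the SDK behaves this way, a user-side approximation is to wrap each callback so that its error is reported and then swallowed. The following is a minimal sketch using only the standard library; `guard`, `recorder`, and `runDemo` are hypothetical names introduced here for illustration, and `report` stands in for whatever error sink is appropriate (in a real program this could presumably be `otel.Handle`).

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// guard wraps a callback so that a failure is passed to report and then
// swallowed, which keeps it from affecting the rest of the collection cycle.
// (Hypothetical helper, not part of opentelemetry-go.)
func guard[O any](cb func(context.Context, O) error, report func(error)) func(context.Context, O) error {
	return func(ctx context.Context, o O) error {
		if err := cb(ctx, o); err != nil {
			report(err)
		}
		return nil
	}
}

// recorder is a hypothetical observer that simply stores observed values.
type recorder struct{ values []float64 }

func (r *recorder) Observe(v float64) { r.values = append(r.values, v) }

// runDemo runs one failing and one succeeding guarded callback and returns
// the observed values, the number of reported errors, and the joined result.
func runDemo() (values []float64, reported int, err error) {
	rec := &recorder{}
	report := func(error) { reported++ }

	failing := guard(func(context.Context, *recorder) error {
		return errors.New("something went wrong")
	}, report)
	succeeding := guard(func(_ context.Context, o *recorder) error {
		o.Observe(42.0)
		return nil
	}, report)

	ctx := context.Background()
	err = errors.Join(failing(ctx, rec), succeeding(ctx, rec))
	return rec.values, reported, err
}

func main() {
	values, reported, err := runDemo()
	fmt.Println("values:", values, "reported:", reported, "err:", err)
}
```

The trade-off is that the SDK never sees the error, so this only makes sense when the error is reported through another channel.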
Additional information
The root cause is a mismatch in how errors are handled across the call chain collectAndExport → collect → produce.
pipeline.produce is intentionally designed to run all callbacks even if some return errors, accumulating errors with errors.Join, and writes collected data into rm unconditionally:
// sdk/metric/pipeline.go
var err error
for _, c := range p.callbacks {
	if e := c(ctx); e != nil {
		err = errors.Join(err, e) // loop continues
	}
}
// ... aggregated data is written into rm regardless of err ...
return err
However, collect propagates this error immediately without giving callers a chance to use the already-populated rm:
// sdk/metric/periodic_reader.go
err = ph.produce(ctx, rm)
if err != nil {
	return err // rm contains valid data, but caller won't see it
}
And collectAndExport skips export entirely when Collect returns a non-nil error:
// sdk/metric/periodic_reader.go
err := r.Collect(ctx, rm)
if err == nil { // export is skipped when any callback failed
	err = r.export(ctx, rm)
}
As a result, produce's best-effort design — collecting as much data as possible — is made pointless by the layers above it.
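The mismatch can be modelled in isolation. The sketch below is a simplified stand-in for the call chain described above, not the SDK's code: `callback`, `produce`, `collectAndExport`, and `runDemo` are hypothetical, and a plain map plays the role of rm.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// callback is a hypothetical stand-in for a registered observable callback.
type callback func(ctx context.Context, rm map[string]float64) error

// produce runs every callback, joining errors instead of stopping early, so
// rm holds all successful observations even when an error is returned.
func produce(ctx context.Context, cbs []callback, rm map[string]float64) error {
	var err error
	for _, cb := range cbs {
		if e := cb(ctx, rm); e != nil {
			err = errors.Join(err, e) // loop continues
		}
	}
	return err
}

// collectAndExport mirrors the control flow described above: export runs only
// when produce reports no error, so a populated rm is discarded on failure.
func collectAndExport(ctx context.Context, cbs []callback, export func(map[string]float64)) error {
	rm := make(map[string]float64)
	err := produce(ctx, cbs, rm)
	if err == nil {
		export(rm)
	}
	return err
}

// runDemo reports how many points produce collected and how many of them
// reached the exporter when one of two callbacks fails.
func runDemo() (collected, exported int, err error) {
	cbs := []callback{
		func(context.Context, map[string]float64) error {
			return errors.New("something went wrong")
		},
		func(_ context.Context, rm map[string]float64) error {
			rm["metric.two"] = 42.0
			return nil
		},
	}
	ctx := context.Background()

	rm := make(map[string]float64)
	_ = produce(ctx, cbs, rm) // error ignored here only to count the data
	collected = len(rm)

	err = collectAndExport(ctx, cbs, func(rm map[string]float64) {
		exported = len(rm)
	})
	return collected, exported, err
}

func main() {
	collected, exported, err := runDemo()
	fmt.Println("collected:", collected, "exported:", exported, "err:", err)
}
```

In this model one point is collected but none is exported, which matches the behaviour reported for metric.two.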
A possible fix would be to always call export regardless of the error from Collect:
// sdk/metric/periodic_reader.go (proposed)
err := r.Collect(ctx, rm)
exportErr := r.export(ctx, rm)
return errors.Join(err, exportErr)
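To check that this control flow behaves as intended, here is a self-contained sketch with the collect and export steps injected as functions. The `reader` and `resourceMetrics` types and `runDemo` are hypothetical stand-ins, not the SDK's types; only the three-line pattern above is taken from the proposal.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// resourceMetrics is a hypothetical stand-in for the SDK's collected data.
type resourceMetrics struct {
	points map[string]float64
}

// reader models only the two steps discussed above; collect and export are
// injected so the control flow can be exercised in isolation.
type reader struct {
	collect func(ctx context.Context, rm *resourceMetrics) error
	export  func(ctx context.Context, rm *resourceMetrics) error
}

// collectAndExport exports whatever was collected even when collect reports
// an error, then returns both errors joined (nil if both are nil).
func (r reader) collectAndExport(ctx context.Context) error {
	rm := &resourceMetrics{points: make(map[string]float64)}
	err := r.collect(ctx, rm)
	exportErr := r.export(ctx, rm)
	return errors.Join(err, exportErr)
}

// runDemo simulates a cycle in which collection partially fails and returns
// the points that reached the exporter together with the joined error.
func runDemo() (map[string]float64, error) {
	exported := make(map[string]float64)
	r := reader{
		collect: func(_ context.Context, rm *resourceMetrics) error {
			rm.points["metric.two"] = 42.0 // the successful observation
			return errors.New("something went wrong")
		},
		export: func(_ context.Context, rm *resourceMetrics) error {
			for name, v := range rm.points {
				exported[name] = v
			}
			return nil
		},
	}
	err := r.collectAndExport(context.Background())
	return exported, err
}

func main() {
	exported, err := runDemo()
	fmt.Println("exported:", exported, "err:", err)
}
```

One question the sketch leaves open is whether export should still be skipped when Collect fails for reasons unrelated to callbacks, in which case rm may hold nothing worth exporting.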