-
Notifications
You must be signed in to change notification settings - Fork 130
Instrument http add on operator #1328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
name: operator | ||
spec: | ||
replicas: 1 | ||
template: | ||
spec: | ||
containers: | ||
- name: operator | ||
env: | ||
- name: OTEL_PROM_EXPORTER_ENABLED | ||
value: "true" | ||
- name: OTEL_PROM_EXPORTER_PORT | ||
value: "2223" | ||
- name: OTEL_EXPORTER_OTLP_METRICS_ENABLED | ||
value: "true" | ||
- name: OTEL_EXPORTER_OTLP_ENDPOINT | ||
value: "http://opentelemetry-collector.open-telemetry-system:4318" | ||
- name: OTEL_METRIC_EXPORT_INTERVAL | ||
value: "1" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
apiVersion: kustomize.config.k8s.io/v1beta1 | ||
kind: Kustomization | ||
resources: | ||
- deployment.yaml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: operator-metrics | ||
spec: | ||
type: ClusterIP | ||
ports: | ||
- name: metrics | ||
protocol: TCP | ||
port: 2223 | ||
targetPort: metrics |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -60,3 +60,22 @@ Optional variables | |||||
`OTEL_EXPORTER_OTLP_TRACES_TIMEOUT` - The batcher timeout in seconds to send batch of data points (`5` by default) | ||||||
|
||||||
### Configuring Service Failover | ||||||
|
||||||
# Configuring metrics for the KEDA HTTP Add-on Operator | ||||||
|
||||||
### Exportable metrics: | ||||||
* **operator_http_scaled_object_count** - the number of http_scaled_object | ||||||
|
||||||
There are currently 2 supported methods for exposing metrics from the operator - via a Prometheus compatible metrics endpoint or by pushing metrics to a OTEL HTTP collector. | ||||||
|
||||||
### Configuring the Prometheus compatible metrics endpoint | ||||||
When configured, the operator can expose metrics on a Prometheus compatible endpoint. | ||||||
|
||||||
This endpoint can be enabled by setting the `OTEL_PROM_EXPORTER_ENABLED` environment variable to `true` on the operator deployment (`true` by default) and by setting `OTEL_PROM_EXPORTER_PORT` to an unused port for the endpoint to be made avaialble on (`2223` by default). | ||||||
|
||||||
### Configuring the OTEL HTTP exporter | ||||||
When configured, the ioperator can export metrics to a OTEL HTTP collector. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
The OTEL exporter can be enabled by setting the `OTEL_EXPORTER_OTLP_METRICS_ENABLED` environment variable to `true` on the operator deployment (`false` by default). When enabled the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable must also be configured so the exporter knows what collector to send the metrics to (e.g. http://opentelemetry-collector.open-telemetry-system:4318). | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
If you need to provide any headers such as authentication details in order to utilise your OTEL collector you can add them into the `OTEL_EXPORTER_OTLP_HEADERS` environment variable. The frequency at which the metrics are exported can be configured by setting `OTEL_METRIC_EXPORT_INTERVAL` to the number of seconds you require between each export interval (`30` by default). |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -34,6 +34,7 @@ | |
httpv1alpha1 "github.com/kedacore/http-add-on/operator/apis/http/v1alpha1" | ||
"github.com/kedacore/http-add-on/operator/controllers/http/config" | ||
"github.com/kedacore/http-add-on/operator/controllers/util" | ||
"github.com/kedacore/http-add-on/operator/metrics" | ||
) | ||
|
||
// HTTPScaledObjectReconciler reconciles a HTTPScaledObject object | ||
|
@@ -98,6 +99,7 @@ | |
"DeploymentName", | ||
httpso.Name, | ||
) | ||
r.updatePromMetrics(ctx, httpso, req.Namespace) | ||
|
||
// check if ScalingMetric is correct | ||
if httpso.Spec.ScalingMetric != nil && | ||
|
@@ -162,3 +164,15 @@ | |
))). | ||
Complete(r) | ||
} | ||
|
||
func (r *HTTPScaledObjectReconciler) updatePromMetrics(ctx context.Context, scaledObject *httpv1alpha1.HTTPScaledObject, namespacedName string) { | ||
logger := log.FromContext(ctx, "updatePromMetrics", namespacedName) | ||
logger.Info("updatePromMetrics") | ||
metrics.RecordHTTPScaledObjectCount(namespacedName) | ||
} | ||
|
||
func (r *HTTPScaledObjectReconciler) updatePromMetricsOnDelete(ctx context.Context, scaledObject *httpv1alpha1.HTTPScaledObject, namespacedName string) { | ||
logger := log.FromContext(ctx, "updatePromMetricsOnDelete", namespacedName) | ||
logger.Info("updatePromMetricsOnDelete") | ||
metrics.RecordDeleteHTTPScaledObjectCount(namespacedName) | ||
} | ||
Comment on lines
+174
to
+178
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. looks like this is never called, wouldn't this result in always growing metric? |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,10 +17,14 @@ limitations under the License. | |
package main | ||
|
||
import ( | ||
"context" | ||
"flag" | ||
"fmt" | ||
"os" | ||
|
||
"github.com/go-logr/logr" | ||
kedav1alpha1 "github.com/kedacore/keda/v2/apis/keda/v1alpha1" | ||
"github.com/prometheus/client_golang/prometheus/promhttp" | ||
"k8s.io/apimachinery/pkg/runtime" | ||
utilruntime "k8s.io/apimachinery/pkg/util/runtime" | ||
clientgoscheme "k8s.io/client-go/kubernetes/scheme" | ||
|
@@ -34,6 +38,10 @@ import ( | |
httpv1alpha1 "github.com/kedacore/http-add-on/operator/apis/http/v1alpha1" | ||
httpcontrollers "github.com/kedacore/http-add-on/operator/controllers/http" | ||
"github.com/kedacore/http-add-on/operator/controllers/http/config" | ||
"github.com/kedacore/http-add-on/operator/metrics" | ||
kedahttp "github.com/kedacore/http-add-on/pkg/http" | ||
"github.com/kedacore/http-add-on/pkg/util" | ||
"golang.org/x/sync/errgroup" | ||
// +kubebuilder:scaffold:imports | ||
) | ||
|
||
|
@@ -69,7 +77,9 @@ func main() { | |
} | ||
opts.BindFlags(flag.CommandLine) | ||
flag.Parse() | ||
|
||
metricsCfg := config.MustParseMetrics() | ||
// setup the configured metrics collectors | ||
metrics.NewMetricsCollectors(metricsCfg) | ||
ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts))) | ||
|
||
externalScalerCfg, err := config.NewExternalScalerFromEnv() | ||
|
@@ -112,6 +122,22 @@ func main() { | |
os.Exit(1) | ||
} | ||
|
||
ctx := ctrl.SetupSignalHandler() | ||
ctx = util.ContextWithLogger(ctx, ctrl.Log) | ||
eg, ctx := errgroup.WithContext(ctx) | ||
|
||
if metricsCfg.OtelPrometheusExporterEnabled { | ||
// start the prometheus compatible metrics server | ||
// serves a prometheus compatible metrics endpoint on the configured port | ||
eg.Go(func() error { | ||
if err := runMetricsServer(ctx, ctrl.Log, metricsCfg); !util.IsIgnoredErr(err) { | ||
setupLog.Error(err, "could not start the Prometheus metrics server") | ||
return err | ||
} | ||
|
||
return nil | ||
}) | ||
} | ||
Comment on lines
+129
to
+140
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The operator is already exposing a metrics server in prom format. Let's use it instead of starting another one, you can register more metrics on that server, for example, that's what we do for KEDA -> https://github.com/kedacore/keda/blob/main/pkg/metricscollector/prommetrics.go#L161 |
||
if err = (&httpcontrollers.HTTPScaledObjectReconciler{ | ||
Client: mgr.GetClient(), | ||
Scheme: mgr.GetScheme(), | ||
|
@@ -134,8 +160,18 @@ func main() { | |
} | ||
|
||
setupLog.Info("starting manager") | ||
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil { | ||
if err := mgr.Start(ctx); err != nil { | ||
setupLog.Error(err, "problem running manager") | ||
os.Exit(1) | ||
} | ||
} | ||
|
||
func runMetricsServer( | ||
ctx context.Context, | ||
lggr logr.Logger, | ||
metricsCfg *config.Metrics, | ||
) error { | ||
lggr.Info("starting the prometheus metrics server", "port", metricsCfg.OtelPrometheusExporterPort, "path", "/metrics") | ||
addr := fmt.Sprintf("0.0.0.0:%d", metricsCfg.OtelPrometheusExporterPort) | ||
return kedahttp.ServeContext(ctx, addr, promhttp.Handler(), nil) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
package metrics | ||
|
||
import ( | ||
"github.com/kedacore/http-add-on/operator/controllers/http/config" | ||
) | ||
|
||
var ( | ||
collectors []Collector | ||
) | ||
|
||
const meterName = "keda-http-add-on-operator" | ||
|
||
type Collector interface { | ||
RecordHTTPScaledObjectCount(namespace string) | ||
RecordDeleteHTTPScaledObjectCount(namespace string) | ||
} | ||
|
||
func NewMetricsCollectors(metricsConfig *config.Metrics) { | ||
if metricsConfig.OtelPrometheusExporterEnabled { | ||
promometrics := NewPrometheusMetrics() | ||
collectors = append(collectors, promometrics) | ||
} | ||
|
||
if metricsConfig.OtelHTTPExporterEnabled { | ||
otelhttpmetrics := NewOtelMetrics() | ||
collectors = append(collectors, otelhttpmetrics) | ||
} | ||
} | ||
|
||
func RecordHTTPScaledObjectCount(namespace string) { | ||
for _, collector := range collectors { | ||
collector.RecordHTTPScaledObjectCount(namespace) | ||
} | ||
} | ||
|
||
func RecordDeleteHTTPScaledObjectCount(namespace string) { | ||
for _, collector := range collectors { | ||
collector.RecordDeleteHTTPScaledObjectCount(namespace) | ||
} | ||
} |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,75 @@ | ||||||
package metrics | ||||||
|
||||||
import ( | ||||||
"context" | ||||||
"log" | ||||||
|
||||||
"go.opentelemetry.io/otel/attribute" | ||||||
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp" | ||||||
api "go.opentelemetry.io/otel/metric" | ||||||
"go.opentelemetry.io/otel/sdk/metric" | ||||||
"go.opentelemetry.io/otel/sdk/resource" | ||||||
semconv "go.opentelemetry.io/otel/semconv/v1.4.0" | ||||||
|
||||||
"github.com/kedacore/http-add-on/pkg/build" | ||||||
) | ||||||
|
||||||
type OtelMetrics struct { | ||||||
meter api.Meter | ||||||
httpScaledObjectCounter api.Int64UpDownCounter | ||||||
} | ||||||
|
||||||
func NewOtelMetrics(options ...metric.Option) *OtelMetrics { | ||||||
ctx := context.Background() | ||||||
|
||||||
exporter, err := otlpmetrichttp.New(ctx) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Opentelemetry supports HTTP and gRCP protocols, I prefer if we support both tbh. This is how we do it in KEDA, you can use it as example https://github.com/kedacore/keda/blob/main/pkg/metricscollector/opentelemetry.go#L67-L86 |
||||||
if err != nil { | ||||||
log.Fatalf("could not create otelmetrichttp exporter: %v", err) | ||||||
} | ||||||
|
||||||
if options == nil { | ||||||
res := resource.NewWithAttributes( | ||||||
semconv.SchemaURL, | ||||||
semconv.ServiceNameKey.String("http-add-on-operator"), | ||||||
semconv.ServiceVersionKey.String(build.Version()), | ||||||
) | ||||||
|
||||||
options = []metric.Option{ | ||||||
metric.WithReader(metric.NewPeriodicReader(exporter)), | ||||||
metric.WithResource(res), | ||||||
} | ||||||
} | ||||||
|
||||||
provider := metric.NewMeterProvider(options...) | ||||||
meter := provider.Meter(meterName) | ||||||
|
||||||
httpScaledObjectCounter, err := meter.Int64UpDownCounter("operator_http_scaled_object_count", api.WithDescription("a counter of http_scaled_objects processed by the operator")) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in opentelemetry context, dots are used instead of underscores
Suggested change
|
||||||
if err != nil { | ||||||
log.Fatalf("could not create new otelhttpmetric request counter: %v", err) | ||||||
} | ||||||
|
||||||
return &OtelMetrics{ | ||||||
meter: meter, | ||||||
httpScaledObjectCounter: httpScaledObjectCounter, | ||||||
} | ||||||
} | ||||||
|
||||||
func (om *OtelMetrics) RecordHTTPScaledObjectCount(namespace string) { | ||||||
ctx := context.Background() | ||||||
opt := api.WithAttributeSet( | ||||||
attribute.NewSet( | ||||||
attribute.Key("namespace").String(namespace), | ||||||
), | ||||||
) | ||||||
om.httpScaledObjectCounter.Add(ctx, 1, opt) | ||||||
} | ||||||
|
||||||
func (om *OtelMetrics) RecordDeleteHTTPScaledObjectCount(namespace string) { | ||||||
ctx := context.Background() | ||||||
opt := api.WithAttributeSet( | ||||||
attribute.NewSet( | ||||||
attribute.Key("namespace").String(namespace), | ||||||
), | ||||||
) | ||||||
om.httpScaledObjectCounter.Add(ctx, -1, opt) | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
package metrics | ||
|
||
import ( | ||
"context" | ||
"testing" | ||
|
||
"github.com/stretchr/testify/assert" | ||
"go.opentelemetry.io/otel/sdk/metric" | ||
"go.opentelemetry.io/otel/sdk/metric/metricdata" | ||
) | ||
|
||
var ( | ||
testOtel *OtelMetrics | ||
testReader metric.Reader | ||
) | ||
|
||
func init() { | ||
testReader = metric.NewManualReader() | ||
options := metric.WithReader(testReader) | ||
testOtel = NewOtelMetrics(options) | ||
} | ||
|
||
func TestHTTPScaledObjectCount(t *testing.T) { | ||
testOtel.RecordHTTPScaledObjectCount("test-namespace") | ||
got := metricdata.ResourceMetrics{} | ||
err := testReader.Collect(context.Background(), &got) | ||
|
||
assert.Nil(t, err) | ||
scopeMetrics := got.ScopeMetrics[0] | ||
assert.NotEqual(t, len(scopeMetrics.Metrics), 0) | ||
|
||
metricInfo := retrieveMetric(scopeMetrics.Metrics, "operator_http_scaled_object_count") | ||
data := metricInfo.Data.(metricdata.Sum[int64]).DataPoints[0] | ||
assert.Equal(t, data.Value, int64(1)) | ||
} | ||
|
||
func retrieveMetric(metrics []metricdata.Metrics, metricname string) *metricdata.Metrics { | ||
for _, m := range metrics { | ||
if m.Name == metricname { | ||
return &m | ||
} | ||
} | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this? Personally, I'd prefer to not change an already existing port