Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ This changelog keeps track of work items that have been completed and are ready
### Improvements

- **Interceptor**: Support HTTPScaledObject scoped timeout ([#813](https://github.com/kedacore/http-add-on/issues/813))
- **General**: Add prometehus and otel instrumentation for the operator ([#965](https://github.com/kedacore/http-add-on/issues/965))

### Fixes

Expand Down
3 changes: 3 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,9 @@ deploy: manifests kustomize ## Deploy to the K8s cluster specified in ~/.kube/co
cd config/operator && \
$(KUSTOMIZE) edit set image ghcr.io/kedacore/http-add-on-operator=${IMAGE_OPERATOR_VERSIONED_TAG}

cd config/operator && \
$(KUSTOMIZE) edit add patch --path e2e-test/otel/deployment.yaml --group apps --kind Deployment --name operator --version v1

$(KUSTOMIZE) build config/default | kubectl apply -f -

undeploy:
Expand Down
2 changes: 1 addition & 1 deletion config/operator/deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ spec:
value: ""
ports:
- name: metrics
containerPort: 8080
containerPort: 2223
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this? Personally, I'd prefer to not change an already existing port

- name: probes
containerPort: 8081
livenessProbe:
Expand Down
21 changes: 21 additions & 0 deletions config/operator/e2e-test/otel/deployment.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: operator
spec:
replicas: 1
template:
spec:
containers:
- name: operator
env:
- name: OTEL_PROM_EXPORTER_ENABLED
value: "true"
- name: OTEL_PROM_EXPORTER_PORT
value: "2223"
- name: OTEL_EXPORTER_OTLP_METRICS_ENABLED
value: "true"
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://opentelemetry-collector.open-telemetry-system:4318"
- name: OTEL_METRIC_EXPORT_INTERVAL
value: "1"
4 changes: 4 additions & 0 deletions config/operator/e2e-test/otel/kustomization.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
1 change: 1 addition & 0 deletions config/operator/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ resources:
- role.yaml
- role_binding.yaml
- service_account.yaml
- metrics.service.yaml
labels:
- includeSelectors: true
includeTemplates: true
Expand Down
11 changes: 11 additions & 0 deletions config/operator/metrics.service.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
apiVersion: v1
kind: Service
metadata:
name: operator-metrics
spec:
type: ClusterIP
ports:
- name: metrics
protocol: TCP
port: 2223
targetPort: metrics
19 changes: 19 additions & 0 deletions docs/operate.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,3 +60,22 @@ Optional variables
`OTEL_EXPORTER_OTLP_TRACES_TIMEOUT` - The batcher timeout in seconds to send batch of data points (`5` by default)

### Configuring Service Failover

# Configuring metrics for the KEDA HTTP Add-on Operator

### Exportable metrics:
* **operator_http_scaled_object_count** - the number of http_scaled_object

There are currently 2 supported methods for exposing metrics from the operator - via a Prometheus compatible metrics endpoint or by pushing metrics to a OTEL HTTP collector.

### Configuring the Prometheus compatible metrics endpoint
When configured, the operator can expose metrics on a Prometheus compatible endpoint.

This endpoint can be enabled by setting the `OTEL_PROM_EXPORTER_ENABLED` environment variable to `true` on the operator deployment (`true` by default) and by setting `OTEL_PROM_EXPORTER_PORT` to an unused port for the endpoint to be made avaialble on (`2223` by default).

### Configuring the OTEL HTTP exporter
When configured, the ioperator can export metrics to a OTEL HTTP collector.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When configured, the ioperator can export metrics to a OTEL HTTP collector.
When configured, the operator can export metrics to a OTEL HTTP collector.


The OTEL exporter can be enabled by setting the `OTEL_EXPORTER_OTLP_METRICS_ENABLED` environment variable to `true` on the operator deployment (`false` by default). When enabled the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable must also be configured so the exporter knows what collector to send the metrics to (e.g. http://opentelemetry-collector.open-telemetry-system:4318).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The OTEL exporter can be enabled by setting the `OTEL_EXPORTER_OTLP_METRICS_ENABLED` environment variable to `true` on the operator deployment (`false` by default). When enabled the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable must also be configured so the exporter knows what collector to send the metrics to (e.g. http://opentelemetry-collector.open-telemetry-system:4318).
The OTEL exporter can be enabled by setting the `OTEL_EXPORTER_OTLP_METRICS_ENABLED` environment variable to `true` on the operator deployment (`false` by default). When enabled, the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable must also be configured so the exporter knows what collector to send the metrics to (e.g. http://opentelemetry-collector.open-telemetry-system:4318).


If you need to provide any headers such as authentication details in order to utilise your OTEL collector you can add them into the `OTEL_EXPORTER_OTLP_HEADERS` environment variable. The frequency at which the metrics are exported can be configured by setting `OTEL_METRIC_EXPORT_INTERVAL` to the number of seconds you require between each export interval (`30` by default).
18 changes: 18 additions & 0 deletions operator/controllers/http/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,3 +90,21 @@ func NewExternalScalerFromEnv() (*ExternalScaler, error) {
Port: port,
}, nil
}

// Metrics is the configuration for configuring metrics in the interceptor.
type Metrics struct {
// Sets whether or not to enable the Prometheus metrics exporter
OtelPrometheusExporterEnabled bool `envconfig:"OTEL_PROM_EXPORTER_ENABLED" default:"true"`
// Sets the port which the Prometheus compatible metrics endpoint should be served on
OtelPrometheusExporterPort int `envconfig:"OTEL_PROM_EXPORTER_PORT" default:"2223"`
// Sets whether or not to enable the OTEL metrics exporter
OtelHTTPExporterEnabled bool `envconfig:"OTEL_EXPORTER_OTLP_METRICS_ENABLED" default:"false"`
}

// Parse parses standard configs using envconfig and returns a pointer to the
// newly created config. Returns nil and a non-nil error if parsing failed
func MustParseMetrics() *Metrics {
ret := new(Metrics)
envconfig.MustProcess("", ret)
return ret
}
14 changes: 14 additions & 0 deletions operator/controllers/http/httpscaledobject_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
httpv1alpha1 "github.com/kedacore/http-add-on/operator/apis/http/v1alpha1"
"github.com/kedacore/http-add-on/operator/controllers/http/config"
"github.com/kedacore/http-add-on/operator/controllers/util"
"github.com/kedacore/http-add-on/operator/metrics"
)

// HTTPScaledObjectReconciler reconciles a HTTPScaledObject object
Expand Down Expand Up @@ -98,6 +99,7 @@
"DeploymentName",
httpso.Name,
)
r.updatePromMetrics(ctx, httpso, req.Namespace)

// check if ScalingMetric is correct
if httpso.Spec.ScalingMetric != nil &&
Expand Down Expand Up @@ -162,3 +164,15 @@
))).
Complete(r)
}

func (r *HTTPScaledObjectReconciler) updatePromMetrics(ctx context.Context, scaledObject *httpv1alpha1.HTTPScaledObject, namespacedName string) {

Check failure on line 168 in operator/controllers/http/httpscaledobject_controller.go

View workflow job for this annotation

GitHub Actions / Static Checks

unused-parameter: parameter 'scaledObject' seems to be unused, consider removing or renaming it as _ (revive)
logger := log.FromContext(ctx, "updatePromMetrics", namespacedName)
logger.Info("updatePromMetrics")
metrics.RecordHTTPScaledObjectCount(namespacedName)
}

func (r *HTTPScaledObjectReconciler) updatePromMetricsOnDelete(ctx context.Context, scaledObject *httpv1alpha1.HTTPScaledObject, namespacedName string) {

Check failure on line 174 in operator/controllers/http/httpscaledobject_controller.go

View workflow job for this annotation

GitHub Actions / Static Checks

unused-parameter: parameter 'scaledObject' seems to be unused, consider removing or renaming it as _ (revive)
logger := log.FromContext(ctx, "updatePromMetricsOnDelete", namespacedName)
logger.Info("updatePromMetricsOnDelete")
metrics.RecordDeleteHTTPScaledObjectCount(namespacedName)
}
Comment on lines +174 to +178
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is never called, wouldn't this result in always growing metric?

40 changes: 38 additions & 2 deletions operator/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,14 @@ limitations under the License.
package main

import (
"context"
"flag"
"fmt"
"os"

"github.com/go-logr/logr"
kedav1alpha1 "github.com/kedacore/keda/v2/apis/keda/v1alpha1"
"github.com/prometheus/client_golang/prometheus/promhttp"
"k8s.io/apimachinery/pkg/runtime"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
Expand All @@ -34,6 +38,10 @@ import (
httpv1alpha1 "github.com/kedacore/http-add-on/operator/apis/http/v1alpha1"
httpcontrollers "github.com/kedacore/http-add-on/operator/controllers/http"
"github.com/kedacore/http-add-on/operator/controllers/http/config"
"github.com/kedacore/http-add-on/operator/metrics"
kedahttp "github.com/kedacore/http-add-on/pkg/http"
"github.com/kedacore/http-add-on/pkg/util"
"golang.org/x/sync/errgroup"
// +kubebuilder:scaffold:imports
)

Expand Down Expand Up @@ -69,7 +77,9 @@ func main() {
}
opts.BindFlags(flag.CommandLine)
flag.Parse()

metricsCfg := config.MustParseMetrics()
// setup the configured metrics collectors
metrics.NewMetricsCollectors(metricsCfg)
ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

externalScalerCfg, err := config.NewExternalScalerFromEnv()
Expand Down Expand Up @@ -112,6 +122,22 @@ func main() {
os.Exit(1)
}

ctx := ctrl.SetupSignalHandler()
ctx = util.ContextWithLogger(ctx, ctrl.Log)
eg, ctx := errgroup.WithContext(ctx)

if metricsCfg.OtelPrometheusExporterEnabled {
// start the prometheus compatible metrics server
// serves a prometheus compatible metrics endpoint on the configured port
eg.Go(func() error {
if err := runMetricsServer(ctx, ctrl.Log, metricsCfg); !util.IsIgnoredErr(err) {
setupLog.Error(err, "could not start the Prometheus metrics server")
return err
}

return nil
})
}
Comment on lines +129 to +140
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator is already exposing a metrics server in prom format. Let's use it instead of starting another one, you can register more metrics on that server, for example, that's what we do for KEDA -> https://github.com/kedacore/keda/blob/main/pkg/metricscollector/prommetrics.go#L161

if err = (&httpcontrollers.HTTPScaledObjectReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Expand All @@ -134,8 +160,18 @@ func main() {
}

setupLog.Info("starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
if err := mgr.Start(ctx); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}

func runMetricsServer(
ctx context.Context,
lggr logr.Logger,
metricsCfg *config.Metrics,
) error {
lggr.Info("starting the prometheus metrics server", "port", metricsCfg.OtelPrometheusExporterPort, "path", "/metrics")
addr := fmt.Sprintf("0.0.0.0:%d", metricsCfg.OtelPrometheusExporterPort)
return kedahttp.ServeContext(ctx, addr, promhttp.Handler(), nil)
}
40 changes: 40 additions & 0 deletions operator/metrics/metricscollector.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
package metrics

import (
"github.com/kedacore/http-add-on/operator/controllers/http/config"
)

var (
collectors []Collector
)

const meterName = "keda-http-add-on-operator"

type Collector interface {
RecordHTTPScaledObjectCount(namespace string)
RecordDeleteHTTPScaledObjectCount(namespace string)
}

func NewMetricsCollectors(metricsConfig *config.Metrics) {
if metricsConfig.OtelPrometheusExporterEnabled {
promometrics := NewPrometheusMetrics()
collectors = append(collectors, promometrics)
}

if metricsConfig.OtelHTTPExporterEnabled {
otelhttpmetrics := NewOtelMetrics()
collectors = append(collectors, otelhttpmetrics)
}
}

func RecordHTTPScaledObjectCount(namespace string) {
for _, collector := range collectors {
collector.RecordHTTPScaledObjectCount(namespace)
}
}

func RecordDeleteHTTPScaledObjectCount(namespace string) {
for _, collector := range collectors {
collector.RecordDeleteHTTPScaledObjectCount(namespace)
}
}
75 changes: 75 additions & 0 deletions operator/metrics/otelmetrics.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
package metrics

import (
"context"
"log"

"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
api "go.opentelemetry.io/otel/metric"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/resource"
semconv "go.opentelemetry.io/otel/semconv/v1.4.0"

"github.com/kedacore/http-add-on/pkg/build"
)

type OtelMetrics struct {
meter api.Meter
httpScaledObjectCounter api.Int64UpDownCounter
}

func NewOtelMetrics(options ...metric.Option) *OtelMetrics {
ctx := context.Background()

exporter, err := otlpmetrichttp.New(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opentelemetry supports HTTP and gRCP protocols, I prefer if we support both tbh. This is how we do it in KEDA, you can use it as example https://github.com/kedacore/keda/blob/main/pkg/metricscollector/opentelemetry.go#L67-L86

if err != nil {
log.Fatalf("could not create otelmetrichttp exporter: %v", err)
}

if options == nil {
res := resource.NewWithAttributes(
semconv.SchemaURL,
semconv.ServiceNameKey.String("http-add-on-operator"),
semconv.ServiceVersionKey.String(build.Version()),
)

options = []metric.Option{
metric.WithReader(metric.NewPeriodicReader(exporter)),
metric.WithResource(res),
}
}

provider := metric.NewMeterProvider(options...)
meter := provider.Meter(meterName)

httpScaledObjectCounter, err := meter.Int64UpDownCounter("operator_http_scaled_object_count", api.WithDescription("a counter of http_scaled_objects processed by the operator"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in opentelemetry context, dots are used instead of underscores

Suggested change
httpScaledObjectCounter, err := meter.Int64UpDownCounter("operator_http_scaled_object_count", api.WithDescription("a counter of http_scaled_objects processed by the operator"))
httpScaledObjectCounter, err := meter.Int64UpDownCounter("keda.http.scaled.object.count", api.WithDescription("a counter of HttpScaledObjects processed by the operator"))

if err != nil {
log.Fatalf("could not create new otelhttpmetric request counter: %v", err)
}

return &OtelMetrics{
meter: meter,
httpScaledObjectCounter: httpScaledObjectCounter,
}
}

func (om *OtelMetrics) RecordHTTPScaledObjectCount(namespace string) {
ctx := context.Background()
opt := api.WithAttributeSet(
attribute.NewSet(
attribute.Key("namespace").String(namespace),
),
)
om.httpScaledObjectCounter.Add(ctx, 1, opt)
}

func (om *OtelMetrics) RecordDeleteHTTPScaledObjectCount(namespace string) {
ctx := context.Background()
opt := api.WithAttributeSet(
attribute.NewSet(
attribute.Key("namespace").String(namespace),
),
)
om.httpScaledObjectCounter.Add(ctx, -1, opt)
}
44 changes: 44 additions & 0 deletions operator/metrics/otelmetrics_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
package metrics

import (
"context"
"testing"

"github.com/stretchr/testify/assert"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

var (
testOtel *OtelMetrics
testReader metric.Reader
)

func init() {
testReader = metric.NewManualReader()
options := metric.WithReader(testReader)
testOtel = NewOtelMetrics(options)
}

func TestHTTPScaledObjectCount(t *testing.T) {
testOtel.RecordHTTPScaledObjectCount("test-namespace")
got := metricdata.ResourceMetrics{}
err := testReader.Collect(context.Background(), &got)

assert.Nil(t, err)
scopeMetrics := got.ScopeMetrics[0]
assert.NotEqual(t, len(scopeMetrics.Metrics), 0)

metricInfo := retrieveMetric(scopeMetrics.Metrics, "operator_http_scaled_object_count")
data := metricInfo.Data.(metricdata.Sum[int64]).DataPoints[0]
assert.Equal(t, data.Value, int64(1))
}

func retrieveMetric(metrics []metricdata.Metrics, metricname string) *metricdata.Metrics {
for _, m := range metrics {
if m.Name == metricname {
return &m
}
}
return nil
}
Loading
Loading