Analysis

How to implement Keptn Analyses

The Keptn Metrics Operator Analysis feature allows you to validate a deployment or release using data from the observability data provider(s) that are configured for Keptn Metrics. You define the quality criteria for the analysis with SLIs and SLOs:

  • A Service Level Indicator (SLI) identifies the data to be analysed as a query to a data provider.
  • A Service Level Objective (SLO) defines the quality criteria for each SLI.

You can specify multiple Service Level Objectives (SLOs) that are evaluated in your Analysis, and you can weight the different objectives appropriately. At the end of the analysis, the status reports whether the Analysis failed, passed, or passed with a warning. This is similar to the functionality provided by the Keptn v1 Quality Gates feature.

Converters are provided to migrate most Keptn v1 SLIs and SLOs to Keptn Analysis SLIs and SLOs.

The Analysis result is exposed as an OpenTelemetry metric and can be displayed on dashboard tools, such as Grafana.

Note: A preliminary release of the Keptn Analysis feature is included in Keptn v0.8.3 and v0.9.0 but is hidden behind a feature flag. See the Analysis reference page for instructions to activate the preview of this feature.

Keptn Analysis basics

A Keptn Analysis is implemented with three resources:

  • AnalysisValueTemplate defines the SLI with the KeptnMetricsProvider (data source) and the query to perform for each SLI

    Each AnalysisValueTemplate resource identifies the data source and the query for the analysis of the SLI. One Analysis can use data from multiple instances of multiple types of data provider; you must define a KeptnMetricsProvider resource for each instance of each data provider you are using. The template refers to that provider and queries it.

  • AnalysisDefinition defines the list of SLOs for an Analysis

    An AnalysisDefinition resource contains a list of objectives to satisfy. Each of these objectives must specify:

    • The AnalysisValueTemplate resource that contains the SLIs, defining the data provider from which to gather the data and how to compute the Analysis
    • Failure or warning target criteria
    • Whether the objective is a key objective, meaning that its failure fails the Analysis
    • Weight of the objective on the overall Analysis

  • Analysis defines the specific configurations and the Analysis to report.

    An Analysis resource customizes the templates defined inside an AnalysisDefinition resource by adding configuration information such as:

    • Timeframe that specifies the range to use for the corresponding query in the AnalysisValueTemplate
    • Map of key/value pairs that can be used to substitute placeholders in the AnalysisValueTemplate
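Each AnalysisValueTemplate refers to a KeptnMetricsProvider by name, so that provider must exist before the Analysis runs. The following is a minimal sketch of such a provider for a Prometheus instance. The targetServer URL and the namespace are assumptions for illustration; adjust them to your cluster, and check the KeptnMetricsProvider reference page for the fields supported by your Keptn version.

```yaml
apiVersion: metrics.keptn.sh/v1beta1
kind: KeptnMetricsProvider
metadata:
  # Must match the spec.provider.name used in the AnalysisValueTemplate
  name: prometheus
  # Assumption: same namespace as the templates in this guide
  namespace: keptn-lifecycle-toolkit-system
spec:
  # Type of the observability platform behind this provider
  type: prometheus
  # Hypothetical URL; replace with the address of your Prometheus server
  targetServer: "http://prometheus-k8s.monitoring.svc.cluster.local:9090"
```

If you query several instances or several types of data provider, define one such resource per instance and give each a distinct name.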

Example Analysis

Consider the following Analysis resource:

apiVersion: metrics.keptn.sh/v1beta1
kind: Analysis
metadata:
  labels:
    app.kubernetes.io/name: analysis
    app.kubernetes.io/instance: analysis-sample
    app.kubernetes.io/part-of: metrics-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: metrics-operator
  name: analysis-sample
spec:
  timeframe:
    recent: 5m
  args:
    project: my-project
    stage: dev
    service: svc1
    nodename: test # can be any key/value pair; NOT only project/stage/service
  analysisDefinition:
    name: ad-my-proj-dev-svc1
    namespace: keptn-lifecycle-toolkit-system

This Analysis resource:

  • Defines the timeframe for which the analysis is done as the most recent five minutes (recent: 5m)
  • Adds a few specific key-value pairs that will be substituted in the query. For instance, the query could contain the {{.nodename}} variable. The value of the args.nodename field (test) will be substituted for this string.
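If you need a fixed window rather than a relative one, the timeframe can presumably be given as explicit start and end timestamps instead. The sketch below assumes that the timeframe field accepts from and to values in RFC 3339 format and that they are used in place of recent, not together with it. The resource name is hypothetical; see the Analysis reference page to confirm the exact rules.

```yaml
apiVersion: metrics.keptn.sh/v1beta1
kind: Analysis
metadata:
  name: analysis-sample-fixed-window   # hypothetical name
spec:
  timeframe:
    # Assumption: from/to replace "recent"; do not set both forms
    from: 2023-05-05T05:05:05Z
    to: 2023-05-05T10:10:10Z
  args:
    nodename: test   # substituted for {{.nodename}} in the query
  analysisDefinition:
    name: ad-my-proj-dev-svc1
    namespace: keptn-lifecycle-toolkit-system
```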

The Analysis resource references the AnalysisDefinition resource by its name and namespace. That definition can be seen here:

apiVersion: metrics.keptn.sh/v1beta1
kind: AnalysisDefinition
metadata:
  name: ad-my-proj-dev-svc1
  namespace: keptn-lifecycle-toolkit-system
spec:
  objectives:
    - analysisValueTemplateRef:
        name: response-time-p95
        namespace: keptn-lifecycle-toolkit-system
      target:
        failure:
          lessThan:
            fixedValue: 600
        warning:
          inRange:
            lowBound: 300
            highBound: 500
      weight: 1
      keyObjective: false
  totalScore:
    passPercentage: 90
    warningPercentage: 75

This simple definition contains a single objective, response-time-p95. For this objective, both failure and warning criteria are defined:

  • The objective fails if the 95th percentile value is less than 600
  • A warning is issued when the value is between 300 and 500

Use a Kubernetes quantity value for the value fields rather than a float. For example, use the 3m quantity rather than the equivalent 0.003 float; the float value causes Invalid value errors.
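As an illustration of the quantity notation, the fragment below shows the target section of a single objective with small fractional thresholds. The thresholds themselves are made up for this sketch; only the notation matters.

```yaml
# Fragment of one entry under spec.objectives (not a complete resource)
target:
  failure:
    greaterThan:
      fixedValue: 3m     # quantity for 0.003 (m = milli)
  warning:
    inRange:
      lowBound: 1m       # quantity for 0.001
      highBound: 3m      # quantity for 0.003
```

Whole numbers such as 600 are already valid quantities and need no suffix.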

The total score shows that this Analysis should have an overall score of 90% to pass or 75% to get a warning. Since only one objective is defined, this means that the analysis either passes with 100% (the failure criterion is not met) or fails with 0% (the value is less than 600, which meets the failure criterion).
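To see how weights and key objectives interact, consider a sketch with two objectives. The definition name, the error-rate template, and all thresholds are hypothetical. The arithmetic in the comments assumes that the overall score is the weighted share of objectives that pass, which is how the weight and totalScore fields are described above; check the AnalysisDefinition reference page for the exact scoring rules.

```yaml
apiVersion: metrics.keptn.sh/v1beta1
kind: AnalysisDefinition
metadata:
  name: ad-two-objectives   # hypothetical name
  namespace: keptn-lifecycle-toolkit-system
spec:
  objectives:
    - analysisValueTemplateRef:
        name: response-time-p95
        namespace: keptn-lifecycle-toolkit-system
      target:
        failure:
          greaterThan:
            fixedValue: 600
      weight: 2            # counts twice as much as the objective below
      keyObjective: true   # if this objective fails, the Analysis fails
    - analysisValueTemplateRef:
        name: error-rate   # hypothetical template
        namespace: keptn-lifecycle-toolkit-system
      target:
        failure:
          greaterThan:
            fixedValue: 5
      weight: 1
      keyObjective: false
  totalScore:
    # Total weight is 2 + 1 = 3.
    # Only error-rate fails: 2 / 3 = 66.7%, which is at least 60%, so pass.
    # response-time-p95 fails: key objective, so fail regardless of score.
    passPercentage: 60
    warningPercentage: 50
```

The rest of this section continues with the single-objective definition shown earlier.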

The objective points to the corresponding AnalysisValueTemplate resource:

apiVersion: metrics.keptn.sh/v1beta1
kind: AnalysisValueTemplate
metadata:
  labels:
    app.kubernetes.io/name: analysisvaluetemplate
    app.kubernetes.io/instance: analysisvaluetemplate-sample
    app.kubernetes.io/part-of: metrics-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: metrics-operator
  name: response-time-p95
  namespace: keptn-lifecycle-toolkit-system
spec:
  provider:
    name: prometheus
  query: "sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})"

This template defines a query to a provider called prometheus:

 sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})

At runtime, the metrics operator tries to substitute everything in {{.variableName}} format with the value of the matching key-value pair specified in the Analysis resource. In this case, the query becomes:

 sum(kube_pod_container_resource_limits{node='test'}) - sum(kube_node_status_capacity{node='test'})

The other key-value pairs, such as ‘project’ and ‘stage’, are just examples of how one could pass information to the provider, similar to Keptn v1 objectives. For a working example you can check here.

Accessing Analysis

Retrieve Analysis resources with kubectl

Use the kubectl get command to retrieve all the Analysis resources in your cluster:

kubectl get analyses.metrics.keptn.sh -A

This returns something like:

NAMESPACE   NAME              ANALYSISDEFINITION    STATE   WARNING   PASS
default     analysis-sample   ad-my-proj-dev-svc1

You can then describe the Analysis with:

kubectl describe analyses.metrics.keptn.sh analysis-sample -n=default