Analysis
The Keptn Metrics Operator Analysis feature allows you to validate a deployment or release using data from the observability data provider(s) that are configured for Keptn Metrics. You define the quality criteria for the analysis with SLIs and SLOs:
- A Service Level Indicator (SLI) identifies the data to be analysed as a query to a data provider
- A Service Level Objective (SLO) specifies the quality criteria for each SLI.
You can specify multiple SLOs that are evaluated in your Analysis, and you can weight each objective appropriately. At the end of the analysis, the status reports whether your objectives failed, passed, or passed with a warning. This is similar to the functionality provided by the Keptn v1 Quality Gates feature.
Converters are provided to migrate most Keptn v1 SLIs and SLOs to Keptn Analysis SLIs and SLOs. For more information, see:
The Analysis result is exposed as an OpenTelemetry metric and can be displayed on dashboard tools, such as Grafana.
Note: A preliminary release of the Keptn Analysis feature is included in Keptn v0.8.3 and v0.9.0 but is hidden behind a feature flag. See the Analysis reference page for instructions to activate the preview of this feature.
Keptn Analysis basics
A Keptn Analysis is implemented with three resources:
- AnalysisValueTemplate defines the SLI with the KeptnMetricsProvider (data source) and the query to perform for each SLI.
  Each AnalysisValueTemplate resource identifies the data source and the query for the analysis of the SLI. One Analysis can use data from multiple instances of multiple types of data provider; you must define a KeptnMetricsProvider resource for each instance of each data provider you are using. The template refers to that provider and queries it.
- AnalysisDefinition defines the list of SLOs for an Analysis.
  An AnalysisDefinition resource contains a list of objectives to satisfy. Each of these objectives must specify:
  - The AnalysisValueTemplate resource that contains the SLIs, defining the data provider from which to gather the data and how to compute the Analysis
  - Failure or warning target criteria
  - Whether the objective is a key objective, meaning that its failure fails the Analysis
  - Weight of the objective on the overall Analysis
- Analysis defines the specific configurations and the Analysis to report.
  An Analysis resource customizes the templates defined inside an AnalysisDefinition resource by adding configuration information such as:
  - Timeframe that specifies the range to use for the corresponding query in the AnalysisValueTemplate
  - Map of key/value pairs that can be used to substitute placeholders in the AnalysisValueTemplate
Example Analysis
Consider the following Analysis resource:
apiVersion: metrics.keptn.sh/v1beta1
kind: Analysis
metadata:
  labels:
    app.kubernetes.io/name: analysis
    app.kubernetes.io/instance: analysis-sample
    app.kubernetes.io/part-of: metrics-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: metrics-operator
  name: analysis-sample
spec:
  timeframe:
    recent: 5m
  args:
    project: my-project
    stage: dev
    service: svc1
    nodename: test # can be any key/value pair; NOT only project/stage/service
  analysisDefinition:
    name: ad-my-proj-dev-svc1
    namespace: keptn-lifecycle-toolkit-system
This Analysis resource:
- Defines the timeframe for which the analysis is done as the most recent 5 minutes (recent: 5m)
- Adds a few specific key-value pairs that will be substituted in the query. For instance, the query could contain the {{.nodename}} variable. The value of the args.nodename field (test) will be substituted for this string.
The Analysis resource references its AnalysisDefinition resource by name and namespace. The AnalysisDefinition can be seen here:
apiVersion: metrics.keptn.sh/v1beta1
kind: AnalysisDefinition
metadata:
  name: ad-my-proj-dev-svc1
  namespace: keptn-lifecycle-toolkit-system
spec:
  objectives:
    - analysisValueTemplateRef:
        name: response-time-p95
        namespace: keptn-lifecycle-toolkit-system
      target:
        failure:
          lessThan:
            fixedValue: 600
        warning:
          inRange:
            lowBound: 300
            highBound: 500
      weight: 1
      keyObjective: false
  totalScore:
    passPercentage: 90
    warningPercentage: 75
This simple definition contains a single objective, response-time-p95.
For this objective, both failure and warning criteria are defined:
- The objective fails if the 95th percentile is less than 600
- A warning is issued when the value is between 300 and 500
Use a Kubernetes quantity value for the value fields rather than a float.
For example, use the 3m quantity rather than the equivalent 0.003 float;
the float value causes Invalid value errors.
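To illustrate why 3m and 0.003 denote the same number, the following is a minimal Python sketch that converts a small subset of Kubernetes quantity suffixes to decimal values. The function name parse_quantity and the suffix table are illustrative assumptions; this is not the parser that Kubernetes or Keptn uses, and it ignores many details of the real quantity format.

```python
from decimal import Decimal

# Hypothetical helper for illustration only: handles a subset of the
# decimal (n, u, m, k, M, G) and binary (Ki, Mi, Gi) quantity suffixes.
SUFFIXES = {
    "n": Decimal("1e-9"),
    "u": Decimal("1e-6"),
    "m": Decimal("1e-3"),
    "k": Decimal("1e3"),
    "M": Decimal("1e6"),
    "G": Decimal("1e9"),
    "Ki": Decimal(1024),
    "Mi": Decimal(1024) ** 2,
    "Gi": Decimal(1024) ** 3,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '3m' or '600' to a Decimal."""
    # Try two-character suffixes before one-character suffixes.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * SUFFIXES[suffix]
    # No known suffix: treat the text as a plain number.
    return Decimal(text)


if __name__ == "__main__":
    print(parse_quantity("3m"))   # three thousandths
    print(parse_quantity("600"))  # plain integer quantity
```

In a manifest, this means a threshold of three thousandths would be written as fixedValue: 3m.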
The total score shows that this Analysis must reach an overall score of at least 90% to pass, or at least 75% to pass with a warning.
Since only one objective is defined,
this means that the analysis either passes with 100%
(the failure criterion is not met)
or fails with 0% (the value is less than 600, so the failure criterion is met).
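The relationship between objective weights and the totalScore percentages can be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's implementation: the names ObjectiveResult and total_score are hypothetical, a warning is assumed to earn half of the objective's weight, the thresholds are assumed to be inclusive, and a failed key objective is assumed to fail the Analysis regardless of the score.

```python
from dataclasses import dataclass


@dataclass
class ObjectiveResult:
    """Outcome of one objective (hypothetical type for this sketch)."""
    name: str
    weight: int
    outcome: str  # "pass", "warning", or "fail"
    key_objective: bool = False


def total_score(results, pass_percentage, warning_percentage):
    """Return (achieved percentage, overall status) for a list of results."""
    max_score = sum(r.weight for r in results)
    # Assumption: full weight for a pass, half for a warning, none for a fail.
    factor = {"pass": 1.0, "warning": 0.5, "fail": 0.0}
    achieved = sum(r.weight * factor[r.outcome] for r in results)
    percentage = 100.0 * achieved / max_score if max_score else 0.0

    # Assumption: a failed key objective fails the whole Analysis.
    if any(r.key_objective and r.outcome == "fail" for r in results):
        return percentage, "fail"
    if percentage >= pass_percentage:
        return percentage, "pass"
    if percentage >= warning_percentage:
        return percentage, "warning"
    return percentage, "fail"


if __name__ == "__main__":
    # Single objective with weight 1, as in the AnalysisDefinition above.
    passed = [ObjectiveResult("response-time-p95", 1, "pass")]
    failed = [ObjectiveResult("response-time-p95", 1, "fail")]
    print(total_score(passed, 90, 75))
    print(total_score(failed, 90, 75))
```

With a single objective of weight 1, the achieved percentage can only be 100 or 0 when the objective passes or fails, which is why the thresholds of 90 and 75 behave identically in this example. The thresholds start to matter once several weighted objectives are defined.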
The objective points to the corresponding AnalysisValueTemplate resource:
apiVersion: metrics.keptn.sh/v1beta1
kind: AnalysisValueTemplate
metadata:
  labels:
    app.kubernetes.io/name: analysisvaluetemplate
    app.kubernetes.io/instance: analysisvaluetemplate-sample
    app.kubernetes.io/part-of: metrics-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: metrics-operator
  name: response-time-p95
  namespace: keptn-lifecycle-toolkit-system
spec:
  provider:
    name: prometheus
  query: "sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})"
This template defines a query to a provider called prometheus:
sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - sum(kube_node_status_capacity{node='{{.nodename}}'})
At runtime, the metrics operator tries to substitute everything in {{.variableName}} format with a key-value pair specified in the Analysis resource, so, in this case, the query becomes:
sum(kube_pod_container_resource_limits{node='test'}) - sum(kube_node_status_capacity{node='test'})
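The substitution step can be imitated with a short Python sketch. It is a stand-in built on a regular expression, not the mechanism the metrics operator uses; the function name render_query is hypothetical, and leaving unknown placeholders untouched is an assumption of this sketch rather than documented operator behavior.

```python
import re

# Matches placeholders of the form {{.name}}, allowing optional spaces.
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def render_query(template: str, args: dict) -> str:
    """Replace each {{.key}} placeholder with the matching value from args."""
    def replace(match):
        key = match.group(1)
        # Assumption: placeholders without a matching key are left as-is.
        return args.get(key, match.group(0))

    return PLACEHOLDER.sub(replace, template)


if __name__ == "__main__":
    query = (
        "sum(kube_pod_container_resource_limits{node='{{.nodename}}'}) - "
        "sum(kube_node_status_capacity{node='{{.nodename}}'})"
    )
    # The same key/value pairs as spec.args in the Analysis resource.
    args = {
        "project": "my-project",
        "stage": "dev",
        "service": "svc1",
        "nodename": "test",
    }
    print(render_query(query, args))
```

Keys in args that the query does not reference, such as project and stage here, have no effect on the rendered query.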
The other key-value pairs, such as project and stage, are examples of how to pass information to the provider in a way similar to Keptn v1 objectives. For a working example you can check here.
Accessing Analysis
Retrieve Analysis resources with kubectl
Use the kubectl get command to retrieve all the Analysis resources in your cluster:
kubectl get analyses.metrics.keptn.sh -A
This returns something like:
NAMESPACE   NAME              ANALYSISDEFINITION    STATE   WARNING   PASS
default     analysis-sample   ad-my-proj-dev-svc1
You can then describe the Analysis with:
kubectl describe analyses.metrics.keptn.sh analysis-sample -n=default