Syntax
Basic Syntax
At Keep, we view alerts as workflows, which consist of a series of steps executed in sequence, each with its own specific input and output. To keep our approach simple, Keepβs syntax is designed to closely resemble the syntax used in GitHub Actions. We believe that GitHub Actions has a well-established syntax, and there is no need to reinvent the wheel.
Full Example
title=examples/raw_sql_query_datetime.yml
# Notify if a result queried from the DB is above a certain thershold.
workflow:
id: raw-sql-query
description: Monitor that time difference is no more than 1 hour
steps:
- name: get-max-datetime
provider:
type: mysql
config: "{{ providers.mysql-prod }}"
with:
# Get max(datetime) from the random table
query: "SELECT MAX(datetime) FROM demo_table LIMIT 1"
actions:
- name: trigger-slack
condition:
- name: threshold-condition
type: threshold
# datetime_compare(t1, t2) compares t1-t2 and returns the diff in hours
# utcnow() returns the local machine datetime in UTC
# to_utc() converts a datetime to UTC
value: keep.datetime_compare(keep.utcnow(), keep.to_utc("{{ steps.this.results[0][0] }}"))
compare_to: 1 # hours
compare_type: gt # greater than
provider:
type: slack
config: " {{ providers.slack-demo }} "
with:
message: "DB datetime value ({{ actions.trigger-slack.conditions.threshold.0.compare_value }}) is greater than 1! π¨"
Breakdown π¨
Workflow
workflow:
id: raw-sql-query
description: Monitor that time difference is no more than 1 hour
steps:
-
actions:
-
Workflow
is built of:
- Metadata (id, description. owners and tags will be added soon)
steps
- list of stepsactions
- list of actionson-failure
- a conditionless action used in case of an alert failure
Provider
provider:
type: mysql
config: "{{ providers.mysql-prod }}"
with:
query: "SELECT MAX(datetime) FROM demo_table LIMIT 1"
on-failure:
retry:
count: 4
interval: 10
Provider
is built of:
type
- the type of the provider (see supported providers)config
- the provider configuration. you can either supply it explicitly or using"{{ providers.mysql-prod }}"
with
- all type-specific provider configuration. for example, formysql
we will provide the SQL query.on-failure
- handling the error when provider execution fails, it is built of:retry
- specifies the retry parameters which include:count
: maximum number of retries.interval
: duration in seconds between each retry.
Condition
- name: threshold-condition
type: threshold
value: keep.datetime_compare(keep.utcnow(), keep.to_utc("{{ steps.this.results[0][0] }}"))
compare_to: 1
compare_type: gt
Condition
is built of:
name
- a unique identifier to the conditiontype
- the type of the conditionvalue
- the value that will be supplied to the condition during the alert executioncompare_to
- whatsvalue
will be compared againstcompare_type
- all type-specific condition configuration
Steps/Actions
steps/actions:
- name: trigger-slack
condition:
- name: threshold-condition
type: threshold
value: keep.datetime_compare(keep.utcnow(), keep.to_utc("{{ steps.this.results[0][0] }}"))
compare_to: 1
compare_type: gt
provider:
type: slack
config: " {{ providers.slack-demo }} "
with:
message: "DB datetime value ({{ actions.trigger-slack.conditions.threshold.0.compare_value }}) is greater than 1! π¨"
Step/Action
is built of:
name
- the name of the action.condition
- a list of conditions thatprovider
- the provider that will trigger the action.throttle
- you can throttle the action.if
- action can be limited to when certain conditions are met.foreach
- whenforeach
block supplied, Keep will evaluate it as a list, and evaluates theaction
for every item in the list.
The provider
configuration is already covered in Providers
On-failure
on-failure:
# Just need a provider we can use to send the failure reason
provider:
type: slack
config: " {{ providers.slack-demo }} "
On-failure is actually a condtionless Action
used to notify in case the alert failed with an exception.
The provider is passed a message
(string) to itβs notify
function.