Mirror of https://github.com/ccfos/nightingale.git (synced 2026-03-03 14:38:55 +00:00)
Compare commits: v5.14.5-fi...v5 (28 commits)
| Author | SHA1 | Date |
|---|---|---|
| | 11152b447d | |
| | 9ffaefead2 | |
| | 307306ca8a | |
| | 5b5ef8346b | |
| | b7cc7d6065 | |
| | c77c9de5a7 | |
| | 52e0ba2eae | |
| | 9957c4e42a | |
| | 68ea466f59 | |
| | 41adc6f586 | |
| | aa6daffe7b | |
| | f0c28bb271 | |
| | 707a35bc06 | |
| | 88f1645d3a | |
| | d8255d0cd3 | |
| | fcc26f6410 | |
| | 6ff112f5da | |
| | 87899cbedb | |
| | d4257d11f2 | |
| | 00b9c31f29 | |
| | 1c8c6b92a9 | |
| | 9c9fe800e4 | |
| | 9aeeaa191e | |
| | e69112958b | |
| | 6d8317927e | |
| | 072f1bd51f | |
| | 25dbc62ff4 | |
| | b233067789 | |
52
README.md
@@ -1,7 +1,6 @@
<p align="center">
<a href="https://github.com/ccfos/nightingale">
<img src="doc/img/ccf-n9e.png" alt="nightingale - cloud native monitoring" width="240" /></a>
<p align="center">Nightingale is an open-source, cloud-native monitoring system with an all-in-one design, enterprise-grade features, and an out-of-the-box product experience. We recommend upgrading your Prometheus + AlertManager + Grafana stack to Nightingale.</p>
<img src="doc/img/nightingale_logo_h.png" alt="nightingale - cloud native monitoring" width="240" /></a>
</p>

<p align="center">
@@ -11,14 +10,23 @@
<a href="https://hub.docker.com/u/flashcatcloud">
<img alt="Docker pulls" src="https://img.shields.io/docker/pulls/flashcatcloud/nightingale"/></a>
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/ccfos/nightingale">
<img alt="GitHub Repo issues" src="https://img.shields.io/github/issues/ccfos/nightingale">
<img alt="GitHub Repo issues closed" src="https://img.shields.io/github/issues-closed/ccfos/nightingale">
<img alt="GitHub forks" src="https://img.shields.io/github/forks/ccfos/nightingale">
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img alt="GitHub contributors" src="https://img.shields.io/github/contributors-anon/ccfos/nightingale"/></a>
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-blue"/>
</p>
<p align="center">
An <b>all-in-one</b>, open-source, cloud-native monitoring system <br/>
<b>Works out of the box</b>, combining data collection, visualization, and alerting <br/>
We recommend upgrading your <b>Prometheus + AlertManager + Grafana</b> stack to Nightingale!
</p>

[English](./README_EN.md) | [中文](./README.md)



## Highlighted Features

- **Out of the box**
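@@ -26,59 +34,59 @@
- **Professional alerting**
- Visual alert configuration and management, with a rich set of alert rules, configurable mute and subscription rules, multiple notification channels, alert self-healing, and alert event management;
- **Cloud native**
- Quickly build an enterprise-grade, cloud-native monitoring system in a turnkey fashion. Supports collectors such as [**Categraf**](https://github.com/flashcatcloud/categraf), Telegraf, and Grafana-agent; supports databases such as Prometheus, VictoriaMetrics, M3DB, and ElasticSearch; supports importing Grafana dashboards; and **integrates seamlessly with the cloud-native ecosystem**;
- **High performance, high availability**
- Quickly build an enterprise-grade, cloud-native monitoring system in a turnkey fashion. Supports collectors such as [Categraf](https://github.com/flashcatcloud/categraf), Telegraf, and Grafana-agent; supports databases such as Prometheus, VictoriaMetrics, M3DB, and ElasticSearch; supports importing Grafana dashboards; and **integrates seamlessly with the cloud-native ecosystem**;
- **High performance High availability**
- Thanks to Nightingale's multi-data-source management engine, the sound architecture of the Nightingale engine, and high-performance time-series databases, it can handle collection, storage, and alert analysis for hundreds of millions of time series, saving significant cost;
- All Nightingale components scale horizontally with no single point of failure. Nightingale has been deployed at more than a thousand companies and proven in demanding production environments. Many leading internet companies run Nightingale clusters of around a hundred machines, handle hundreds of millions of time series, and rely on it heavily;
- **Flexible scaling, centralized management**
- **Flexible scaling Centralized management**
- Nightingale can run on a cloud host with 1 core and 1 GB of memory, be deployed as a cluster across hundreds of machines, or run in K8s. Components such as the time-series database and the alert engine can also be pushed down to individual data centers and regions, combining edge deployment with centralized management and **solving the problem of fragmented data and the lack of a unified view**;
- **Open community**
- Hosted by the [China Computer Federation (CCF) Open Source Development Committee](https://www.ccf.org.cn/kyfzwyh/). Sustained investment from [**快猫星云**](https://flashcat.cloud) and many other companies, the active participation of thousands of community users, and the project's clear positioning all ensure the healthy, long-term development of the Nightingale open-source community. Active, professional community users also keep iterating and distilling more best practices into the product;
- Hosted by the [China Computer Federation (CCF) Open Source Development Committee](https://www.ccf.org.cn/kyfzwyh/). Sustained investment from [快猫星云](https://flashcat.cloud) and many other companies, the active participation of thousands of community users, and the project's clear positioning all ensure the healthy, long-term development of the Nightingale open-source community. Active, professional community users also keep iterating and distilling more best practices into the product;

> If you use Prometheus and have one or more of the following needs, we recommend a seamless upgrade to Nightingale:
**If you use Prometheus and have one or more of the following needs, we recommend a seamless upgrade to Nightingale**:

- Prometheus, Alertmanager, Grafana, and other systems are fragmented, lack a unified view, and do not work out of the box;
- Managing Prometheus and Alertmanager by editing configuration files has a steep learning curve and makes collaboration difficult;
- Your data volume is too large to scale your Prometheus cluster;
- You run multiple Prometheus clusters in production and face high management and usage costs;

> If you use Zabbix and face the following situations, we recommend upgrading to Nightingale:
**If you use Zabbix and face the following situations, we recommend upgrading to Nightingale**:

- Your monitoring data volume is too large and you want a better scaling solution;
- The learning curve is steep, and you want more efficient collaboration across multiple people and teams;
- Under microservice and cloud-native architectures, monitoring data has a variable lifecycle and high dimensional cardinality, which the Zabbix data model does not fit well;

> If you use [Open-Falcon](https://github.com/open-falcon/falcon-plus), we recommend upgrading to Nightingale even more strongly:
> To learn more about how Zabbix and Nightingale compare, see ["Choosing between Zabbix and Nightingale"](https://flashcat.cloud/blog/zabbx-vs-nightingale/)

- For a detailed introduction to Open-Falcon and Nightingale, see [Ten Characteristics and Trends of Cloud-Native Monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd).
**If you use [Open-Falcon](https://github.com/open-falcon/falcon-plus), we recommend upgrading to Nightingale:**

> We recommend [Categraf](https://github.com/flashcatcloud/categraf) as your preferred monitoring data collector:
- For a detailed introduction to Open-Falcon and Nightingale, see ["Ten Characteristics and Trends of Cloud-Native Monitoring"](http://flashcat.cloud/blog/10-trends-of-cloudnative-monitoring/)

- [Categraf](https://github.com/flashcatcloud/categraf) is Nightingale's default collector. It uses an open plugin mechanism and an all-in-one design, and collects metrics, logs, traces, and events. Besides system-level metrics such as CPU, memory, and network, Categraf integrates collection for many open-source components and supports the K8s ecosystem. It ships with matching dashboards and alert rules and works out of the box.
**We recommend [Categraf](https://github.com/flashcatcloud/categraf) as your preferred monitoring data collector**:

- [Categraf](https://github.com/flashcatcloud/categraf) is Nightingale's default collector. It uses an open plugin mechanism and an All-in-one design philosophy, and collects metrics, logs, traces, and events. Besides system-level metrics such as CPU, memory, and network, Categraf integrates collection for many open-source components and supports the K8s ecosystem. It ships with matching dashboards and alert rules and works out of the box.


## Getting Started

- [Documentation (international)](https://n9e.github.io/)
- [Documentation (mainland China)](http://n9e.flashcat.cloud/)
[Documentation (international)](https://n9e.github.io/) | [Documentation (mainland China)](http://n9e.flashcat.cloud/)

## Screenshots

<img src="doc/img/intro.gif" width="480">

https://user-images.githubusercontent.com/792850/216888712-2565fcea-9df5-47bd-a49e-d60af9bd76e8.mp4

## Architecture

<img src="doc/img/arch-product.png" width="480">
<img src="doc/img/arch-product.png" width="600">

Nightingale receives monitoring data reported by various collectors (such as [Categraf](https://github.com/flashcatcloud/categraf), telegraf, grafana-agent, and Prometheus) and writes it to popular time-series databases (Prometheus, M3DB, VictoriaMetrics, Thanos, TDEngine, and others). It provides configuration of alert rules, mute rules, and subscription rules; viewing of monitoring data; an alert self-healing mechanism (automatically calling a webhook or running a script after an alert fires); and storage, management, and grouped viewing of historical alert events.

<img src="doc/img/arch-system.png" width="480">
<img src="doc/img/arch-system.png" width="600">

The Nightingale v5 design is very simple. Its core is two modules, server and webapi. webapi is stateless, runs centrally, handles front-end requests, and writes user configuration to the database. server is the alert engine and data-forwarding module. It is usually deployed alongside a time-series database, with one set of servers per database; each set can be a single instance or a cluster of several. server receives data reported by Categraf, Telegraf, Grafana-Agent, Datadog-Agent, and Falcon-Plugins, writes it to the backend time-series database, periodically syncs alert rules from the database, and then queries the time-series database to evaluate alerts. Each set of servers depends on one redis.


<img src="doc/img/install-vm.png" width="480">
<img src="doc/img/install-vm.png" width="600">

If a single-node time-series database (such as Prometheus) becomes a performance bottleneck or offers poor disaster recovery, we recommend [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics). VictoriaMetrics has a relatively simple architecture, performs well, and is easy to deploy and operate; see the architecture diagram above. For more detailed documentation, see its [official website](https://victoriametrics.com/).
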
@@ -96,14 +104,14 @@
**Respecting, recognizing, and recording the work of every contributor** is the first guiding principle of the Nightingale open-source community. We encourage **asking questions effectively**, which both respects developers' time and adds to the community's shared knowledge:
- Before asking, please check the [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
- We use [GitHub Discussions](https://github.com/ccfos/nightingale/discussions) as our forum; search or ask your questions there
- We also recommend joining the WeChat group to share experience with other Nightingale users (first add [UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) as a friend, with the note: Nightingale group + name + company)
- We also recommend joining the WeChat group to share experience with other Nightingale users (first add [picobyte](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) as a friend, with the note: Nightingale group + name + company)


## Who is using
## Who is using Nightingale

You can register your usage and share your experience at **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)**.

## Stargazers
## Stargazers over time
[](https://starchart.cc/ccfos/nightingale)

## Contributors
BIN
doc/img/ccf-logo.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 118 KiB |
BIN
doc/img/nightingale_logo_h.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 131 KiB |
BIN
doc/img/nightingale_logo_v.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 146 KiB |
@@ -36,7 +36,7 @@ services:
       - nightingale

   prometheus:
-    image: prom/prometheus
+    image: prom/prometheus:v2.37.5
     container_name: prometheus
     hostname: prometheus
     restart: always
@@ -53,7 +53,7 @@ services:
       - "--storage.tsdb.path=/prometheus"
       - "--web.console.libraries=/usr/share/prometheus/console_libraries"
       - "--web.console.templates=/usr/share/prometheus/consoles"
-      - "--enable-feature=remote-write-receiver"
+      - "--web.enable-remote-write-receiver"
       - "--query.lookback-delta=2m"

   ibex:
@@ -246,9 +246,9 @@ CREATE TABLE `alert_rule` (
     `prom_for_duration` int not null comment 'prometheus for, unit:s',
     `prom_ql` text not null comment 'promql',
     `prom_eval_interval` int not null comment 'evaluate interval',
-    `enable_stime` char(5) not null default '00:00',
-    `enable_etime` char(5) not null default '23:59',
-    `enable_days_of_week` varchar(32) not null default '' comment 'split by space: 0 1 2 3 4 5 6',
+    `enable_stime` char(255) not null default '00:00',
+    `enable_etime` char(255) not null default '23:59',
+    `enable_days_of_week` varchar(255) not null default '' comment 'eg: "0 1 2 3 4 5 6 ; 0 1 2"',
     `enable_in_bg` tinyint(1) not null default 0 comment '1: only this bg 0: global',
     `notify_recovered` tinyint(1) not null comment 'whether notify when recovery',
     `notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
@@ -78,7 +78,7 @@ Timeout=30000
 TemplatesDir = "./etc/template"
 NotifyConcurrency = 10
 # use builtin go code notify
-NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "mm", "telegram"]
+NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "feishucard","mm", "telegram"]

 [Alerting.CallScript]
 # built in sending capability in go code
15
docker/n9eetc/template/feishucard.tpl
Normal file
@@ -0,0 +1,15 @@
+{{ if .IsRecovered }}
+**告警集群:** {{.Cluster}}
+**级别状态:** S{{.Severity}} Recovered
+**告警名称:** {{.RuleName}}
+**恢复时间:** {{timeformat .LastEvalTime}}
+**告警描述:** **服务已恢复**
+{{- else }}
+**告警集群:** {{.Cluster}}
+**级别状态:** S{{.Severity}} Triggered
+**告警名称:** {{.RuleName}}
+**触发时间:** {{timeformat .TriggerTime}}
+**发送时间:** {{timestamp}}
+**触发时值:** {{.TriggerValue}}
+{{if .RuleNote }}**告警描述:** **{{.RuleNote}}**{{end}}
+{{- end -}}
@@ -39,6 +39,11 @@ Label = "飞书机器人"
 # do not change Key
 Key = "feishu"

+[[NotifyChannels]]
+Label = "飞书机器人消息卡片"
+# do not change Key
+Key = "feishucard"
+
 [[NotifyChannels]]
 Label = "mm bot"
 # do not change Key
@@ -1,4 +1,6 @@
 zh:
+  ip_conntrack_count: 连接跟踪表条目总数(单位:int, count)
+  ip_conntrack_max: 连接跟踪表最大容量(单位:int, size)
   cpu_usage_idle: CPU空闲率(单位:%)
   cpu_usage_active: CPU使用率(单位:%)
   cpu_usage_system: CPU内核态时间占比(单位:%)
@@ -250,6 +252,8 @@ zh:
   cloudwatch_aws_rds_write_throughput_sum: rds 写入吞吐量总和

 en:
+  ip_conntrack_count: the number of entries in the conntrack table(unit:int, count)
+  ip_conntrack_max: the max capacity of the conntrack table(unit:int, size)
   cpu_usage_idle: "CPU idle rate(unit:%)"
   cpu_usage_active: "CPU usage rate(unit:%)"
   cpu_usage_system: "CPU kernel state time proportion(unit:%)"
92
etc/script/notify_feishu.py
Executable file
@@ -0,0 +1,92 @@
+#!/usr/bin/env python
+# -*- coding: UTF-8 -*-
+import sys
+import json
+import requests
+
+class Sender(object):
+    @classmethod
+    def send_email(cls, payload):
+        # already done in go code
+        pass
+
+    @classmethod
+    def send_wecom(cls, payload):
+        # already done in go code
+        pass
+
+    @classmethod
+    def send_dingtalk(cls, payload):
+        # already done in go code
+        pass
+
+    @classmethod
+    def send_ifeishu(cls, payload):
+        users = payload.get('event').get("notify_users_obj")
+        tokens = {}
+        phones = {}
+
+        for u in users:
+            if u.get("phone"):
+                phones[u.get("phone")] = 1
+
+            contacts = u.get("contacts")
+            if contacts.get("feishu_robot_token", ""):
+                tokens[contacts.get("feishu_robot_token", "")] = 1
+
+        headers = {
+            "Content-Type": "application/json;charset=utf-8",
+            "Host": "open.feishu.cn"
+        }
+
+        for t in tokens:
+            url = "https://open.feishu.cn/open-apis/bot/v2/hook/{}".format(t)
+            body = {
+                "msg_type": "text",
+                "content": {
+                    "text": payload.get('tpls').get("feishu.tpl", "feishu.tpl not found")
+                },
+                "at": {
+                    "atMobiles": phones.keys(),
+                    "isAtAll": False
+                }
+            }
+
+            response = requests.post(url, headers=headers, data=json.dumps(body))
+            print(f"notify_feishu: token={t} status_code={response.status_code} response_text={response.text}")
+
+    @classmethod
+    def send_mm(cls, payload):
+        # already done in go code
+        pass
+
+    @classmethod
+    def send_sms(cls, payload):
+        pass
+
+    @classmethod
+    def send_voice(cls, payload):
+        pass
+
+def main():
+    payload = json.load(sys.stdin)
+    with open(".payload", 'w') as f:
+        f.write(json.dumps(payload, indent=4))
+    for ch in payload.get('event').get('notify_channels'):
+        send_func_name = "send_{}".format(ch.strip())
+        if not hasattr(Sender, send_func_name):
+            print("function: {} not found", send_func_name)
+            continue
+        send_func = getattr(Sender, send_func_name)
+        send_func(payload)
+
+def hello():
+    print("hello nightingale")
+
+if __name__ == "__main__":
+    if len(sys.argv) == 1:
+        main()
+    elif sys.argv[1] == "hello":
+        hello()
+    else:
+        print("I am confused")
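Review note on `etc/script/notify_feishu.py`: the script dispatches each notify channel to a `send_<channel>` classmethod by name and posts a text body to the Feishu bot hook. Below is a minimal, network-free sketch of those two steps. The helper names `dispatch` and `build_feishu_body` are hypothetical and do not appear in the commit; the body layout is copied from the script. One detail that may be worth checking in the script itself: it uses an f-string, so it targets Python 3, where `phones.keys()` is a view object that `json.dumps` cannot serialize; the sketch converts it with `list(...)`.

```python
import json


class Sender:
    """Stand-in for the script's Sender class; the handler returns a tag instead of sending."""

    @classmethod
    def send_email(cls, payload):
        return "email handled"


def dispatch(payload, sender=Sender):
    """Call send_<channel> for every channel listed in the event (hypothetical helper)."""
    results = []
    for ch in payload.get("event", {}).get("notify_channels", []):
        func = getattr(sender, "send_{}".format(ch.strip()), None)
        if func is None:
            # Mirrors the script's "function not found" branch.
            results.append((ch, "not found"))
            continue
        results.append((ch, func(payload)))
    return results


def build_feishu_body(text, phones):
    """Build the text-message body; phones is the dict the script uses as a set."""
    return {
        "msg_type": "text",
        "content": {"text": text},
        # list(...) because a dict_keys view is not JSON serializable in Python 3.
        "at": {"atMobiles": list(phones), "isAtAll": False},
    }
```

With this shape, `json.dumps(build_feishu_body(...))` succeeds, and unknown channels are reported rather than raising.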
@@ -81,7 +81,7 @@ Timeout=30000
 TemplatesDir = "./etc/template"
 NotifyConcurrency = 10
 # use builtin go code notify
-NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "mm", "telegram"]
+NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "feishucard","mm", "telegram"]

 [Alerting.CallScript]
 # built in sending capability in go code
15
etc/template/feishucard.tpl
Normal file
@@ -0,0 +1,15 @@
+{{ if .IsRecovered }}
+**告警集群:** {{.Cluster}}
+**级别状态:** S{{.Severity}} Recovered
+**告警名称:** {{.RuleName}}
+**恢复时间:** {{timeformat .LastEvalTime}}
+**告警描述:** **服务已恢复**
+{{- else }}
+**告警集群:** {{.Cluster}}
+**级别状态:** S{{.Severity}} Triggered
+**告警名称:** {{.RuleName}}
+**触发时间:** {{timeformat .TriggerTime}}
+**发送时间:** {{timestamp}}
+**触发时值:** {{.TriggerValue}}
+{{if .RuleNote }}**告警描述:** **{{.RuleNote}}**{{end}}
+{{- end -}}
@@ -39,6 +39,11 @@ Label = "飞书机器人"
 # do not change Key
 Key = "feishu"

+[[NotifyChannels]]
+Label = "飞书机器人消息卡片"
+# do not change Key
+Key = "feishucard"
+
 [[NotifyChannels]]
 Label = "mm bot"
 # do not change Key
12
go.mod
@@ -6,7 +6,7 @@ require (
 	github.com/coreos/go-oidc v2.2.1+incompatible
 	github.com/dgrijalva/jwt-go v3.2.0+incompatible
 	github.com/gin-contrib/pprof v1.3.0
-	github.com/gin-gonic/gin v1.7.4
+	github.com/gin-gonic/gin v1.7.7
 	github.com/go-ldap/ldap/v3 v3.4.1
 	github.com/go-redis/redis/v9 v9.0.0-rc.1
 	github.com/gogo/protobuf v1.3.2
@@ -24,7 +24,7 @@ require (
 	github.com/prometheus/common v0.32.1
 	github.com/prometheus/prometheus v2.5.0+incompatible
 	github.com/tidwall/gjson v1.14.0
-	github.com/toolkits/pkg v1.3.1-0.20220824084030-9f9f830a05d5
+	github.com/toolkits/pkg v1.3.2
 	github.com/urfave/cli/v2 v2.3.0
 	golang.org/x/oauth2 v0.0.0-20210514164344-f6687ab2804c
 	gopkg.in/gomail.v2 v2.0.0-20160411212932-81ebce5c23df
@@ -74,10 +74,10 @@ require (
 	github.com/tidwall/pretty v1.2.0 // indirect
 	github.com/ugorji/go/codec v1.1.7 // indirect
 	go.uber.org/automaxprocs v1.4.0 // indirect
-	golang.org/x/crypto v0.0.0-20210817164053-32db794688a5 // indirect
-	golang.org/x/net v0.0.0-20220722155237-a158d28d115b // indirect
-	golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect
-	golang.org/x/text v0.3.7 // indirect
+	golang.org/x/crypto v0.1.0 // indirect
+	golang.org/x/net v0.7.0 // indirect
+	golang.org/x/sys v0.5.0 // indirect
+	golang.org/x/text v0.7.0 // indirect
 	google.golang.org/appengine v1.6.6 // indirect
 	google.golang.org/genproto v0.0.0-20211007155348-82e027067bd4 // indirect
 	google.golang.org/grpc v1.41.0 // indirect
21
go.sum
@@ -97,8 +97,9 @@ github.com/gin-contrib/pprof v1.3.0/go.mod h1:waMjT1H9b179t3CxuG1cV3DHpga6ybizwf
 github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE=
 github.com/gin-contrib/sse v0.1.0/go.mod h1:RHrZQHXnP2xjPF+u1gW/2HnVO7nvIa9PG3Gm+fLHvGI=
 github.com/gin-gonic/gin v1.6.2/go.mod h1:75u5sXoLsGZoRN5Sgbi1eraJ4GU3++wFwWzhwvtwp4M=
-github.com/gin-gonic/gin v1.7.4 h1:QmUZXrvJ9qZ3GfWvQ+2wnW/1ePrTEJqPKMYEU3lD/DM=
 github.com/gin-gonic/gin v1.7.4/go.mod h1:jD2toBW3GZUr5UMcdrwQA10I7RuaFOl/SGeDjXkfUtY=
+github.com/gin-gonic/gin v1.7.7 h1:3DoBmSbJbZAWqXJC3SLjAPfutPJJRN1U5pALB7EeTTs=
+github.com/gin-gonic/gin v1.7.7/go.mod h1:axIBovoeJpVj8S3BwE0uPMTeReE4+AfFtqpqaZ1qq1U=
 github.com/go-asn1-ber/asn1-ber v1.5.1 h1:pDbRAunXzIUXfx4CB2QJFv5IuPiuoW+sWvr/Us009o8=
 github.com/go-asn1-ber/asn1-ber v1.5.1/go.mod h1:hEBeB/ic+5LoWskz+yKT7vGhhPYkProFKoKdwZRWMe0=
 github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1/go.mod h1:vR7hzQXu2zJy9AVAgeJqvqgH9Q5CA+iKCZ2gyEVpxRU=
@@ -372,8 +373,8 @@ github.com/tidwall/match v1.1.1 h1:+Ho715JplO36QYgwN9PGYNhgZvoUSc9X2c80KVTi+GA=
 github.com/tidwall/match v1.1.1/go.mod h1:eRSPERbgtNPcGhD8UCthc6PmLEQXEWd3PRB5JTxsfmM=
 github.com/tidwall/pretty v1.2.0 h1:RWIZEg2iJ8/g6fDDYzMpobmaoGh5OLl4AXtGUGPcqCs=
 github.com/tidwall/pretty v1.2.0/go.mod h1:ITEVvHYasfjBbM0u2Pg8T2nJnzm8xPwvNhhsoaGGjNU=
-github.com/toolkits/pkg v1.3.1-0.20220824084030-9f9f830a05d5 h1:kMCwr2gNHjHEVgw+uNVdiPbGadj4TekbIfrTXElZeI0=
-github.com/toolkits/pkg v1.3.1-0.20220824084030-9f9f830a05d5/go.mod h1:PvTBg/UxazPgBz6VaCM7FM7kJldjfVrsuN6k4HT/VuY=
+github.com/toolkits/pkg v1.3.2 h1:elEW//SWOO956RQymAwcxBHGBKhrvCUAfXDo8wAkmJs=
+github.com/toolkits/pkg v1.3.2/go.mod h1:PvTBg/UxazPgBz6VaCM7FM7kJldjfVrsuN6k4HT/VuY=
 github.com/ugorji/go v1.1.7/go.mod h1:kZn38zHttfInRq0xu/PH0az30d+z6vm202qpg1oXVMw=
 github.com/ugorji/go/codec v1.1.7 h1:2SvQaVZ1ouYrrKKwoSk2pzd4A9evlKJb9oTL+OaLUSs=
 github.com/ugorji/go/codec v1.1.7/go.mod h1:Ax+UKWsSmolVDwsd+7N3ZtXu+yMGCf907BLYF3GoBXY=
@@ -415,8 +416,9 @@ golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPh
 golang.org/x/crypto v0.0.0-20201203163018-be400aefbc4c/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I=
 golang.org/x/crypto v0.0.0-20210616213533-5ff15b29337e/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
 golang.org/x/crypto v0.0.0-20210711020723-a769d52b0f97/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
-golang.org/x/crypto v0.0.0-20210817164053-32db794688a5 h1:HWj/xjIHfjYU5nVXpTM0s39J9CbLn7Cc5a7IC5rwsMQ=
 golang.org/x/crypto v0.0.0-20210817164053-32db794688a5/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
+golang.org/x/crypto v0.1.0 h1:MDRAIl0xIo9Io2xV565hzXHw3zVseKrJKodhohM5CjU=
+golang.org/x/crypto v0.1.0/go.mod h1:RecgLatLF4+eUMCP1PoPZQb+cVrJcOPbHkTkbkB9sbw=
 golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
 golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
 golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8=
@@ -480,8 +482,8 @@ golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwY
 golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
 golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
 golang.org/x/net v0.0.0-20210525063256-abc453219eb5/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y=
-golang.org/x/net v0.0.0-20220722155237-a158d28d115b h1:PxfKdU9lEEDYjdIzOtC4qFWgkU2rGHdKlKowJSMN9h0=
-golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
+golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g=
+golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
 golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
 golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
 golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
@@ -544,8 +546,8 @@ golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBc
 golang.org/x/sys v0.0.0-20210603081109-ebe580a85c40/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.0.0-20220114195835-da31bd327af9/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f h1:v4INt8xihDGvnrfjMDVXGxw9wrfxYyCjk0KbXjhR55s=
-golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
+golang.org/x/sys v0.5.0 h1:MUK/U/4lj1t1oPg0HfuXDN/Z1wv31ZJ/YcPiGccS4DU=
+golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/term v0.0.0-20201117132131-f5c789dd3221/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw=
 golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
 golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
@@ -556,8 +558,9 @@ golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.4/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.5/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
-golang.org/x/text v0.3.7 h1:olpwvP2KacW1ZWvsR7uQhoyTYvKAupfQrRGBFM352Gk=
 golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
+golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo=
+golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
 golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
 golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
 golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
@@ -14,45 +14,50 @@ import (
 )

 type AlertRule struct {
-	Id int64 `json:"id" gorm:"primaryKey"`
-	GroupId int64 `json:"group_id"` // busi group id
-	Cate string `json:"cate"` // alert rule cate (prometheus|elasticsearch)
-	Cluster string `json:"cluster"` // take effect by clusters, seperated by space
-	Name string `json:"name"` // rule name
-	Note string `json:"note"` // will sent in notify
-	Prod string `json:"prod"` // product empty means n9e
-	Algorithm string `json:"algorithm"` // algorithm (''|holtwinters), empty means threshold
-	AlgoParams string `json:"-" gorm:"algo_params"` // params algorithm need
-	AlgoParamsJson interface{} `json:"algo_params" gorm:"-"` //
-	Delay int `json:"delay"` // Time (in seconds) to delay evaluation
-	Severity int `json:"severity"` // 1: Emergency 2: Warning 3: Notice
-	Disabled int `json:"disabled"` // 0: enabled, 1: disabled
-	PromForDuration int `json:"prom_for_duration"` // prometheus for, unit:s
-	PromQl string `json:"prom_ql"` // just one ql
-	PromEvalInterval int `json:"prom_eval_interval"` // unit:s
-	EnableStime string `json:"enable_stime"` // e.g. 00:00
-	EnableEtime string `json:"enable_etime"` // e.g. 23:59
-	EnableDaysOfWeek string `json:"-"` // split by space: 0 1 2 3 4 5 6
-	EnableDaysOfWeekJSON []string `json:"enable_days_of_week" gorm:"-"` // for fe
-	EnableInBG int `json:"enable_in_bg"` // 0: global 1: enable one busi-group
-	NotifyRecovered int `json:"notify_recovered"` // whether notify when recovery
-	NotifyChannels string `json:"-"` // split by space: sms voice email dingtalk wecom
-	NotifyChannelsJSON []string `json:"notify_channels" gorm:"-"` // for fe
-	NotifyGroups string `json:"-"` // split by space: 233 43
-	NotifyGroupsObj []UserGroup `json:"notify_groups_obj" gorm:"-"` // for fe
-	NotifyGroupsJSON []string `json:"notify_groups" gorm:"-"` // for fe
-	NotifyRepeatStep int `json:"notify_repeat_step"` // notify repeat interval, unit: min
-	NotifyMaxNumber int `json:"notify_max_number"` // notify: max number
-	RecoverDuration int64 `json:"recover_duration"` // unit: s
-	Callbacks string `json:"-"` // split by space: http://a.com/api/x http://a.com/api/y'
-	CallbacksJSON []string `json:"callbacks" gorm:"-"` // for fe
-	RunbookUrl string `json:"runbook_url"` // sop url
-	AppendTags string `json:"-"` // split by space: service=n9e mod=api
-	AppendTagsJSON []string `json:"append_tags" gorm:"-"` // for fe
-	CreateAt int64 `json:"create_at"`
-	CreateBy string `json:"create_by"`
-	UpdateAt int64 `json:"update_at"`
-	UpdateBy string `json:"update_by"`
+	Id int64 `json:"id" gorm:"primaryKey"`
+	GroupId int64 `json:"group_id"` // busi group id
+	Cate string `json:"cate"` // alert rule cate (prometheus|elasticsearch)
+	Cluster string `json:"cluster"` // take effect by clusters, seperated by space
+	Name string `json:"name"` // rule name
+	Note string `json:"note"` // will sent in notify
+	Prod string `json:"prod"` // product empty means n9e
+	Algorithm string `json:"algorithm"` // algorithm (''|holtwinters), empty means threshold
+	AlgoParams string `json:"-" gorm:"algo_params"` // params algorithm need
+	AlgoParamsJson interface{} `json:"algo_params" gorm:"-"` //
+	Delay int `json:"delay"` // Time (in seconds) to delay evaluation
+	Severity int `json:"severity"` // 1: Emergency 2: Warning 3: Notice
+	Disabled int `json:"disabled"` // 0: enabled, 1: disabled
+	PromForDuration int `json:"prom_for_duration"` // prometheus for, unit:s
+	PromQl string `json:"prom_ql"` // just one ql
+	PromEvalInterval int `json:"prom_eval_interval"` // unit:s
+	EnableStime string `json:"-"` // split by space: "00:00 10:00 12:00"
+	EnableStimeJSON string `json:"enable_stime" gorm:"-"` // for fe
+	EnableStimesJSON []string `json:"enable_stimes" gorm:"-"` // for fe
+	EnableEtime string `json:"-"` // split by space: "00:00 10:00 12:00"
+	EnableEtimeJSON string `json:"enable_etime" gorm:"-"` // for fe
+	EnableEtimesJSON []string `json:"enable_etimes" gorm:"-"` // for fe
+	EnableDaysOfWeek string `json:"-"` // eg: "0 1 2 3 4 5 6 ; 0 1 2"
+	EnableDaysOfWeekJSON []string `json:"enable_days_of_week" gorm:"-"` // for fe
+	EnableDaysOfWeeksJSON [][]string `json:"enable_days_of_weeks" gorm:"-"` // for fe
+	EnableInBG int `json:"enable_in_bg"` // 0: global 1: enable one busi-group
+	NotifyRecovered int `json:"notify_recovered"` // whether notify when recovery
+	NotifyChannels string `json:"-"` // split by space: sms voice email dingtalk wecom
+	NotifyChannelsJSON []string `json:"notify_channels" gorm:"-"` // for fe
+	NotifyGroups string `json:"-"` // split by space: 233 43
+	NotifyGroupsObj []UserGroup `json:"notify_groups_obj" gorm:"-"` // for fe
+	NotifyGroupsJSON []string `json:"notify_groups" gorm:"-"` // for fe
+	NotifyRepeatStep int `json:"notify_repeat_step"` // notify repeat interval, unit: min
+	NotifyMaxNumber int `json:"notify_max_number"` // notify: max number
+	RecoverDuration int64 `json:"recover_duration"` // unit: s
+	Callbacks string `json:"-"` // split by space: http://a.com/api/x http://a.com/api/y'
+	CallbacksJSON []string `json:"callbacks" gorm:"-"` // for fe
+	RunbookUrl string `json:"runbook_url"` // sop url
+	AppendTags string `json:"-"` // split by space: service=n9e mod=api
+	AppendTagsJSON []string `json:"append_tags" gorm:"-"` // for fe
+	CreateAt int64 `json:"create_at"`
+	CreateBy string `json:"create_by"`
+	UpdateAt int64 `json:"update_at"`
+	UpdateBy string `json:"update_by"`
 }

 func (ar *AlertRule) TableName() string {
@@ -224,7 +229,30 @@ func (ar *AlertRule) FillNotifyGroups(cache map[int64]*UserGroup) error {
 }

 func (ar *AlertRule) FE2DB() error {
-	ar.EnableDaysOfWeek = strings.Join(ar.EnableDaysOfWeekJSON, " ")
+	if len(ar.EnableStimesJSON) > 0 {
+		ar.EnableStime = strings.Join(ar.EnableStimesJSON, " ")
+		ar.EnableEtime = strings.Join(ar.EnableEtimesJSON, " ")
+	} else {
+		ar.EnableStime = ar.EnableStimeJSON
+		ar.EnableEtime = ar.EnableEtimeJSON
+	}
+
+	if len(ar.EnableDaysOfWeeksJSON) > 0 {
+		for i := 0; i < len(ar.EnableDaysOfWeeksJSON); i++ {
+			if len(ar.EnableDaysOfWeeksJSON) == 1 {
+				ar.EnableDaysOfWeek = strings.Join(ar.EnableDaysOfWeeksJSON[i], " ")
+			} else {
+				if i == len(ar.EnableDaysOfWeeksJSON)-1 {
+					ar.EnableDaysOfWeek += strings.Join(ar.EnableDaysOfWeeksJSON[i], " ")
+				} else {
+					ar.EnableDaysOfWeek += strings.Join(ar.EnableDaysOfWeeksJSON[i], " ") + ";"
+				}
+			}
+		}
+	} else {
+		ar.EnableDaysOfWeek = strings.Join(ar.EnableDaysOfWeekJSON, " ")
+	}
+
 	ar.NotifyChannels = strings.Join(ar.NotifyChannelsJSON, " ")
 	ar.NotifyGroups = strings.Join(ar.NotifyGroupsJSON, " ")
 	ar.Callbacks = strings.Join(ar.CallbacksJSON, " ")
@@ -239,7 +267,21 @@ func (ar *AlertRule) FE2DB() error {
|
||||
}
|
||||
|
||||
func (ar *AlertRule) DB2FE() {
|
||||
ar.EnableDaysOfWeekJSON = strings.Fields(ar.EnableDaysOfWeek)
|
||||
ar.EnableStimesJSON = strings.Fields(ar.EnableStime)
|
||||
ar.EnableEtimesJSON = strings.Fields(ar.EnableEtime)
|
||||
if len(ar.EnableEtimesJSON) > 0 {
|
||||
ar.EnableStimeJSON = ar.EnableStimesJSON[0]
|
||||
ar.EnableEtimeJSON = ar.EnableEtimesJSON[0]
|
||||
}
|
||||
|
||||
cache := strings.Split(ar.EnableDaysOfWeek, ";")
|
||||
for i := 0; i < len(cache); i++ {
|
||||
ar.EnableDaysOfWeeksJSON = append(ar.EnableDaysOfWeeksJSON, strings.Fields(cache[i]))
|
||||
}
|
||||
if len(ar.EnableDaysOfWeeksJSON) > 0 {
|
||||
ar.EnableDaysOfWeekJSON = ar.EnableDaysOfWeeksJSON[0]
|
||||
}
|
||||
|
||||
ar.NotifyChannelsJSON = strings.Fields(ar.NotifyChannels)
|
||||
ar.NotifyGroupsJSON = strings.Fields(ar.NotifyGroups)
|
||||
ar.CallbacksJSON = strings.Fields(ar.Callbacks)
|
||||
|
||||
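The `FE2DB`/`DB2FE` hunks above store several day-of-week groups in one string: days inside a group are separated by spaces, and groups are separated by `;`. A minimal sketch of that round trip follows. The helper names `encodeDaysOfWeek` and `decodeDaysOfWeek` are hypothetical and are not part of this diff; the sketch uses `strings.Join` for the outer level instead of the index-based loop shown in the hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeDaysOfWeek joins the days inside a group with spaces and the
// groups themselves with ";" (hypothetical helper, not part of the diff).
func encodeDaysOfWeek(groups [][]string) string {
	parts := make([]string, 0, len(groups))
	for _, g := range groups {
		parts = append(parts, strings.Join(g, " "))
	}
	return strings.Join(parts, ";")
}

// decodeDaysOfWeek reverses encodeDaysOfWeek: split on ";" first,
// then split each chunk on whitespace.
func decodeDaysOfWeek(s string) [][]string {
	chunks := strings.Split(s, ";")
	groups := make([][]string, 0, len(chunks))
	for _, c := range chunks {
		groups = append(groups, strings.Fields(c))
	}
	return groups
}

func main() {
	stored := encodeDaysOfWeek([][]string{{"1", "2", "3"}, {"6", "0"}})
	fmt.Println(stored)                   // 1 2 3;6 0
	fmt.Println(decodeDaysOfWeek(stored)) // [[1 2 3] [6 0]]
}
```

Note that decoding an empty string yields one empty group, because `strings.Split` always returns at least one element. The `len(...) > 0` check in `DB2FE` is therefore always true.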
@@ -250,3 +250,30 @@ func AlertSubscribeGetsByCluster(cluster string) ([]*AlertSubscribe, error) {
|
||||
}
|
||||
return slst, err
|
||||
}
|
||||
|
||||
func (s *AlertSubscribe) MatchCluster(cluster string) bool {
|
||||
if s.Cluster == ClusterAll {
|
||||
return true
|
||||
}
|
||||
clusters := strings.Fields(s.Cluster)
|
||||
for _, c := range clusters {
|
||||
if c == cluster {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func (s *AlertSubscribe) ModifyEvent(event *AlertCurEvent) {
|
||||
if s.RedefineSeverity == 1 {
|
||||
event.Severity = s.NewSeverity
|
||||
}
|
||||
|
||||
if s.RedefineChannels == 1 {
|
||||
event.NotifyChannels = s.NewChannels
|
||||
event.NotifyChannelsJSON = strings.Fields(s.NewChannels)
|
||||
}
|
||||
|
||||
event.NotifyGroups = s.UserGroupIds
|
||||
event.NotifyGroupsJSON = strings.Fields(s.UserGroupIds)
|
||||
}
|
||||
|
||||
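The cluster matching added in `MatchCluster` can be exercised on its own. The sketch below mirrors the logic with a free function. The value `"$all"` for the wildcard is an assumption, since `ClusterAll` is defined outside this diff, and `matchCluster` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterAll is assumed to be the wildcard value; the real constant
// (models.ClusterAll) is defined outside this diff.
const clusterAll = "$all"

// matchCluster reports whether cluster is covered by the
// space-separated list in subscribed, or by the wildcard.
func matchCluster(subscribed, cluster string) bool {
	if subscribed == clusterAll {
		return true
	}
	for _, c := range strings.Fields(subscribed) {
		if c == cluster {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchCluster("prom-a prom-b", "prom-b")) // true
	fmt.Println(matchCluster("prom-a prom-b", "prom-c")) // false
	fmt.Println(matchCluster(clusterAll, "anything"))    // true
}
```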
@@ -7,6 +7,8 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/pkg/errors"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/slice"
|
||||
"github.com/toolkits/pkg/str"
|
||||
"gorm.io/gorm"
|
||||
@@ -16,6 +18,22 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/webapi/config"
|
||||
)
|
||||
|
||||
const (
|
||||
Dingtalk = "dingtalk"
|
||||
Wecom = "wecom"
|
||||
Feishu = "feishu"
|
||||
FeishuCard = "feishucard"
|
||||
Mm = "mm"
|
||||
Telegram = "telegram"
|
||||
Email = "email"
|
||||
|
||||
DingtalkKey = "dingtalk_robot_token"
|
||||
WecomKey = "wecom_robot_token"
|
||||
FeishuKey = "feishu_robot_token"
|
||||
MmKey = "mm_webhook_url"
|
||||
TelegramKey = "telegram_robot_token"
|
||||
)
|
||||
|
||||
type User struct {
|
||||
Id int64 `json:"id" gorm:"primaryKey"`
|
||||
Username string `json:"username"`
|
||||
@@ -546,3 +564,57 @@ func (u *User) UserGroups(limit int, query string) ([]UserGroup, error) {
|
||||
err = session.Where("name like ?", "%"+query+"%").Find(&lst).Error
|
||||
return lst, err
|
||||
}
|
||||
|
||||
func (u *User) ExtractToken(key string) (string, bool) {
|
||||
bs, err := u.Contacts.MarshalJSON()
|
||||
if err != nil {
|
||||
logger.Errorf("handle_notice: failed to marshal contacts: %v", err)
|
||||
return "", false
|
||||
}
|
||||
|
||||
switch key {
|
||||
case Dingtalk:
|
||||
ret := gjson.GetBytes(bs, DingtalkKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case Wecom:
|
||||
ret := gjson.GetBytes(bs, WecomKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case Feishu:
|
||||
ret := gjson.GetBytes(bs, FeishuKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case FeishuCard:
|
||||
ret := gjson.GetBytes(bs, FeishuKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case Mm:
|
||||
ret := gjson.GetBytes(bs, MmKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case Telegram:
|
||||
ret := gjson.GetBytes(bs, TelegramKey)
|
||||
return ret.String(), ret.Exists()
|
||||
case Email:
|
||||
return u.Email, u.Email != ""
|
||||
default:
|
||||
return "", false
|
||||
}
|
||||
}
|
||||
|
||||
func (u *User) ExtractAllToken() map[string]string {
|
||||
ret := make(map[string]string)
|
||||
if u.Email != "" {
|
||||
ret[Email] = u.Email
|
||||
}
|
||||
|
||||
bs, err := u.Contacts.MarshalJSON()
|
||||
if err != nil {
|
||||
logger.Errorf("handle_notice: failed to marshal contacts: %v", err)
|
||||
return ret
|
||||
}
|
||||
|
||||
ret[Dingtalk] = gjson.GetBytes(bs, DingtalkKey).String()
|
||||
ret[Wecom] = gjson.GetBytes(bs, WecomKey).String()
|
||||
ret[Feishu] = gjson.GetBytes(bs, FeishuKey).String()
|
||||
ret[FeishuCard] = gjson.GetBytes(bs, FeishuKey).String()
|
||||
ret[Mm] = gjson.GetBytes(bs, MmKey).String()
|
||||
ret[Telegram] = gjson.GetBytes(bs, TelegramKey).String()
|
||||
return ret
|
||||
}
|
||||
|
||||
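`ExtractToken` maps a channel name to a key in the user's contacts JSON, and `feishu` and `feishucard` share one key. The sketch below shows the same lookup as a table, assuming the contacts are a flat JSON object. It uses the standard `encoding/json` package rather than `gjson`, so it only accepts string values, whereas `gjson`'s `String()` also stringifies other types. The names `channelKeys` and `extractToken` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// channelKeys mirrors the constants added in the diff: channel name -> contact key.
var channelKeys = map[string]string{
	"dingtalk":   "dingtalk_robot_token",
	"wecom":      "wecom_robot_token",
	"feishu":     "feishu_robot_token",
	"feishucard": "feishu_robot_token", // shares the Feishu token
	"mm":         "mm_webhook_url",
	"telegram":   "telegram_robot_token",
}

// extractToken returns the token stored for channel and whether it exists.
func extractToken(contacts []byte, channel string) (string, bool) {
	key, ok := channelKeys[channel]
	if !ok {
		return "", false
	}
	var m map[string]interface{}
	if err := json.Unmarshal(contacts, &m); err != nil {
		return "", false
	}
	v, ok := m[key]
	if !ok {
		return "", false
	}
	s, ok := v.(string)
	return s, ok
}

func main() {
	contacts := []byte(`{"feishu_robot_token":"abc123"}`)
	fmt.Println(extractToken(contacts, "feishucard"))
	fmt.Println(extractToken(contacts, "wecom"))
}
```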
@@ -16,6 +16,7 @@ import (
|
||||
type Config struct {
|
||||
Enable bool
|
||||
SsoAddr string
|
||||
LoginPath string
|
||||
RedirectURL string
|
||||
DisplayName string
|
||||
CoverAttributes bool
|
||||
@@ -47,6 +48,7 @@ func Init(cf Config) {
|
||||
if !cf.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
cli = ssoClient{}
|
||||
cli.config = cf
|
||||
cli.ssoAddr = cf.SsoAddr
|
||||
@@ -86,7 +88,24 @@ func wrapStateKey(key string) string {
|
||||
|
||||
func (cli *ssoClient) genRedirectURL(state string) string {
|
||||
var buf bytes.Buffer
|
||||
buf.WriteString(cli.ssoAddr + "login")
|
||||
|
||||
ssoAddr, err := url.Parse(cli.config.SsoAddr)
if err != nil {
	logger.Error(err)
	return buf.String()
}

if cli.config.LoginPath == "" {
	if strings.Contains(cli.config.SsoAddr, "p3") {
		ssoAddr.Path = "login"
	} else {
		ssoAddr.Path = "cas/login"
	}
} else {
	ssoAddr.Path = cli.config.LoginPath
}
|
||||
|
||||
buf.WriteString(ssoAddr.String())
|
||||
v := url.Values{
|
||||
"service": {cli.callbackAddr},
|
||||
}
|
||||
|
||||
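The login URL in `genRedirectURL` is assembled by parsing `SsoAddr` and then overriding its path. The sketch below isolates that step under the same rules (an explicit `LoginPath` wins, otherwise `login` for addresses containing `p3`, otherwise `cas/login`) and checks the parse error before touching the result. `loginURL` is a hypothetical helper and the host names are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// loginURL builds the SSO login URL from the configured address and an
// optional login path override.
func loginURL(ssoAddr, loginPath string) (string, error) {
	u, err := url.Parse(ssoAddr)
	if err != nil {
		return "", err
	}
	switch {
	case loginPath != "":
		u.Path = loginPath
	case strings.Contains(ssoAddr, "p3"):
		u.Path = "login"
	default:
		u.Path = "cas/login"
	}
	// URL.String inserts the "/" between host and a relative path.
	return u.String(), nil
}

func main() {
	fmt.Println(loginURL("https://sso.example.com", ""))
	fmt.Println(loginURL("https://p3.example.com", ""))
	fmt.Println(loginURL("https://sso.example.com", "/custom/login"))
}
```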
@@ -1,4 +1,4 @@
|
||||
package engine
|
||||
package sender
|
||||
|
||||
import (
|
||||
"strconv"
|
||||
@@ -14,8 +14,7 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
)
|
||||
|
||||
func callback(event *models.AlertCurEvent) {
|
||||
urls := strings.Fields(event.Callbacks)
|
||||
func SendCallbacks(urls []string, event *models.AlertCurEvent) {
|
||||
for _, url := range urls {
|
||||
if url == "" {
|
||||
continue
|
||||
@@ -1,20 +1,15 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"net/url"
|
||||
"html/template"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
type DingtalkMessage struct {
|
||||
Title string
|
||||
Text string
|
||||
AtMobiles []string
|
||||
Tokens []string
|
||||
}
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type dingtalkMarkdown struct {
|
||||
Title string `json:"title"`
|
||||
@@ -32,48 +27,91 @@ type dingtalk struct {
|
||||
At dingtalkAt `json:"at"`
|
||||
}
|
||||
|
||||
func SendDingtalk(message DingtalkMessage) {
|
||||
ats := make([]string, len(message.AtMobiles))
|
||||
for i := 0; i < len(message.AtMobiles); i++ {
|
||||
ats[i] = "@" + message.AtMobiles[i]
|
||||
type DingtalkSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (ds *DingtalkSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
|
||||
for i := 0; i < len(message.Tokens); i++ {
|
||||
u, err := url.Parse(message.Tokens[i])
|
||||
if err != nil {
|
||||
logger.Errorf("dingtalk_sender: failed to parse error=%v", err)
|
||||
}
|
||||
urls, ats := ds.extract(ctx.Users)
|
||||
if len(urls) == 0 {
|
||||
return
|
||||
}
|
||||
message := BuildTplMessage(ds.tpl, ctx.Event)
|
||||
|
||||
v, err := url.ParseQuery(u.RawQuery)
|
||||
if err != nil {
|
||||
logger.Errorf("dingtalk_sender: failed to parse query error=%v", err)
|
||||
}
|
||||
|
||||
ur := "https://oapi.dingtalk.com/robot/send?access_token=" + u.Path
|
||||
if strings.HasPrefix(message.Tokens[i], "https://") {
|
||||
ur = message.Tokens[i]
|
||||
}
|
||||
body := dingtalk{
|
||||
Msgtype: "markdown",
|
||||
Markdown: dingtalkMarkdown{
|
||||
Title: message.Title,
|
||||
Text: message.Text,
|
||||
},
|
||||
}
|
||||
|
||||
if v.Get("noat") != "1" {
|
||||
body.Markdown.Text = message.Text + " " + strings.Join(ats, " ")
|
||||
body.At = dingtalkAt{
|
||||
AtMobiles: message.AtMobiles,
|
||||
IsAtAll: false,
|
||||
for _, url := range urls {
|
||||
var body dingtalk
|
||||
// NoAt in url
|
||||
if strings.Contains(url, "noat=1") {
|
||||
body = dingtalk{
|
||||
Msgtype: "markdown",
|
||||
Markdown: dingtalkMarkdown{
|
||||
Title: ctx.Rule.Name,
|
||||
Text: message,
|
||||
},
|
||||
}
|
||||
} else {
|
||||
body = dingtalk{
|
||||
Msgtype: "markdown",
|
||||
Markdown: dingtalkMarkdown{
|
||||
Title: ctx.Rule.Name,
|
||||
Text: message + " " + strings.Join(ats, " "),
|
||||
},
|
||||
At: dingtalkAt{
|
||||
AtMobiles: ats,
|
||||
IsAtAll: false,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
res, code, err := poster.PostJSON(ur, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("dingtalk_sender: result=fail url=%s code=%d error=%v response=%s", ur, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("dingtalk_sender: result=succ url=%s code=%d response=%s", ur, code, string(res))
|
||||
}
|
||||
ds.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (ds *DingtalkSender) SendRaw(users []*models.User, title, message string) {
|
||||
if len(users) == 0 {
|
||||
return
|
||||
}
|
||||
urls, _ := ds.extract(users)
|
||||
body := dingtalk{
|
||||
Msgtype: "markdown",
|
||||
Markdown: dingtalkMarkdown{
|
||||
Title: title,
|
||||
Text: message,
|
||||
},
|
||||
}
|
||||
for _, url := range urls {
|
||||
ds.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
// extract urls and ats from Users
|
||||
func (ds *DingtalkSender) extract(users []*models.User) ([]string, []string) {
|
||||
urls := make([]string, 0, len(users))
|
||||
ats := make([]string, 0, len(users))
|
||||
|
||||
for _, user := range users {
|
||||
if user.Phone != "" {
|
||||
ats = append(ats, "@"+user.Phone)
|
||||
}
|
||||
if token, has := user.ExtractToken(models.Dingtalk); has {
|
||||
url := token
|
||||
if !strings.HasPrefix(token, "https://") {
|
||||
url = "https://oapi.dingtalk.com/robot/send?access_token=" + token
|
||||
}
|
||||
urls = append(urls, url)
|
||||
}
|
||||
}
|
||||
return urls, ats
|
||||
}
|
||||
|
||||
func (ds *DingtalkSender) doSend(url string, body dingtalk) {
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("dingtalk_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("dingtalk_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
}
|
||||
|
||||
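The refactored DingTalk sender accepts either a bare robot token or a full `https://` webhook URL, and skips @-mentions when the URL contains `noat=1`. A small sketch of those two decisions, with hypothetical helper names `webhookURL` and `wantsAt` and placeholder tokens:

```go
package main

import (
	"fmt"
	"strings"
)

const dingtalkBase = "https://oapi.dingtalk.com/robot/send?access_token="

// webhookURL keeps full https:// URLs as they are and prefixes bare tokens.
func webhookURL(token, base string) string {
	if strings.HasPrefix(token, "https://") {
		return token
	}
	return base + token
}

// wantsAt reports whether @-mentions should be attached for this URL.
func wantsAt(url string) bool {
	return !strings.Contains(url, "noat=1")
}

func main() {
	u := webhookURL("tok123", dingtalkBase)
	fmt.Println(u, wantsAt(u))

	full := webhookURL("https://proxy.example.com/send?noat=1", dingtalkBase)
	fmt.Println(full, wantsAt(full))
}
```

One consequence of the substring check: a bare token stored as `tok123&noat=1` also disables mentions, since the flag ends up in the query string of the generated URL.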
@@ -2,15 +2,54 @@ package sender
|
||||
|
||||
import (
|
||||
"crypto/tls"
|
||||
"html/template"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"gopkg.in/gomail.v2"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
)
|
||||
|
||||
var mailch chan *gomail.Message
|
||||
|
||||
type EmailSender struct {
|
||||
subjectTpl *template.Template
|
||||
contentTpl *template.Template
|
||||
}
|
||||
|
||||
func (es *EmailSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
tos := es.extract(ctx.Users)
|
||||
var subject string
|
||||
|
||||
if es.subjectTpl != nil {
|
||||
subject = BuildTplMessage(es.subjectTpl, ctx.Event)
|
||||
} else {
|
||||
subject = ctx.Rule.Name
|
||||
}
|
||||
content := BuildTplMessage(es.contentTpl, ctx.Event)
|
||||
WriteEmail(subject, content, tos)
|
||||
}
|
||||
|
||||
func (es *EmailSender) SendRaw(users []*models.User, title, message string) {
|
||||
tos := es.extract(users)
|
||||
WriteEmail(title, message, tos)
|
||||
}
|
||||
|
||||
func (es *EmailSender) extract(users []*models.User) []string {
|
||||
tos := make([]string, 0, len(users))
|
||||
for _, u := range users {
|
||||
if u.Email != "" {
|
||||
tos = append(tos, u.Email)
|
||||
}
|
||||
}
|
||||
return tos
|
||||
}
|
||||
|
||||
func SendEmail(subject, content string, tos []string) {
|
||||
conf := config.C.SMTP
|
||||
|
||||
|
||||
@@ -1,18 +1,15 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"html/template"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
type FeishuMessage struct {
|
||||
Text string
|
||||
AtMobiles []string
|
||||
Tokens []string
|
||||
}
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type feishuContent struct {
|
||||
Text string `json:"text"`
|
||||
@@ -29,28 +26,73 @@ type feishu struct {
|
||||
At feishuAt `json:"at"`
|
||||
}
|
||||
|
||||
func SendFeishu(message FeishuMessage) {
|
||||
for i := 0; i < len(message.Tokens); i++ {
|
||||
url := "https://open.feishu.cn/open-apis/bot/v2/hook/" + message.Tokens[i]
|
||||
if strings.HasPrefix(message.Tokens[i], "https://") {
|
||||
url = message.Tokens[i]
|
||||
}
|
||||
type FeishuSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (fs *FeishuSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
urls, ats := fs.extract(ctx.Users)
|
||||
message := BuildTplMessage(fs.tpl, ctx.Event)
|
||||
for _, url := range urls {
|
||||
body := feishu{
|
||||
Msgtype: "text",
|
||||
Content: feishuContent{
|
||||
Text: message.Text,
|
||||
Text: message,
|
||||
},
|
||||
At: feishuAt{
|
||||
AtMobiles: message.AtMobiles,
|
||||
}
|
||||
if !strings.Contains(url, "noat=1") {
|
||||
body.At = feishuAt{
|
||||
AtMobiles: ats,
|
||||
IsAtAll: false,
|
||||
},
|
||||
}
|
||||
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("feishu_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("feishu_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
}
|
||||
fs.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (fs *FeishuSender) SendRaw(users []*models.User, title, message string) {
|
||||
if len(users) == 0 {
|
||||
return
|
||||
}
|
||||
urls, _ := fs.extract(users)
|
||||
body := feishu{
|
||||
Msgtype: "text",
|
||||
Content: feishuContent{
|
||||
Text: message,
|
||||
},
|
||||
}
|
||||
for _, url := range urls {
|
||||
fs.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (fs *FeishuSender) extract(users []*models.User) ([]string, []string) {
|
||||
urls := make([]string, 0, len(users))
|
||||
ats := make([]string, 0, len(users))
|
||||
|
||||
for _, user := range users {
|
||||
if user.Phone != "" {
|
||||
ats = append(ats, user.Phone)
|
||||
}
|
||||
if token, has := user.ExtractToken(models.Feishu); has {
|
||||
url := token
|
||||
if !strings.HasPrefix(token, "https://") {
|
||||
url = "https://open.feishu.cn/open-apis/bot/v2/hook/" + token
|
||||
}
|
||||
urls = append(urls, url)
|
||||
}
|
||||
}
|
||||
return urls, ats
|
||||
}
|
||||
|
||||
func (fs *FeishuSender) doSend(url string, body feishu) {
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("feishu_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("feishu_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
}
|
||||
|
||||
161
src/server/common/sender/feishu_card.go
Normal file
@@ -0,0 +1,161 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"html/template"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type feishuCardContent struct {
|
||||
Text string `json:"text"`
|
||||
}
|
||||
|
||||
type Conf struct {
|
||||
WideScreenMode bool `json:"wide_screen_mode"`
|
||||
EnableForward bool `json:"enable_forward"`
|
||||
}
|
||||
|
||||
type Te struct {
|
||||
Content string `json:"content"`
|
||||
Tag string `json:"tag"`
|
||||
}
|
||||
|
||||
type Element struct {
|
||||
Tag string `json:"tag"`
|
||||
Text Te `json:"text"`
|
||||
Content string `json:"content"`
|
||||
Elements []Element `json:"elements"`
|
||||
}
|
||||
|
||||
type Titles struct {
|
||||
Content string `json:"content"`
|
||||
Tag string `json:"tag"`
|
||||
}
|
||||
|
||||
type Headers struct {
|
||||
Title Titles `json:"title"`
|
||||
Template string `json:"template"`
|
||||
}
|
||||
|
||||
type Cards struct {
|
||||
Config Conf `json:"config"`
|
||||
Elements []Element `json:"elements"`
|
||||
Header Headers `json:"header"`
|
||||
}
|
||||
|
||||
type feishuCard struct {
|
||||
Msgtype string `json:"msg_type"`
|
||||
Content feishuCardContent `json:"content"`
|
||||
Card Cards `json:"card"`
|
||||
}
|
||||
|
||||
type FeishuCardSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (fs *FeishuCardSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
urls, _ := fs.extract(ctx.Users)
|
||||
message := BuildTplMessage(fs.tpl, ctx.Event)
|
||||
for _, url := range urls {
|
||||
var color string
|
||||
if strings.Count(message, "Recovered") > 0 && strings.Count(message, "Triggered") > 0 {
|
||||
color = "orange"
|
||||
} else if strings.Count(message, "Recovered") > 0 {
|
||||
color = "green"
|
||||
} else {
|
||||
color = "red"
|
||||
}
|
||||
SendTitle := fmt.Sprintf("🔔 [告警提醒] - %s", ctx.Rule.Name)
|
||||
body := feishuCard{
|
||||
Msgtype: "interactive",
|
||||
Card: Cards{
|
||||
Config: Conf{
|
||||
WideScreenMode: true,
|
||||
EnableForward: true,
|
||||
},
|
||||
Header: Headers{
|
||||
Title: Titles{
|
||||
Content: SendTitle,
|
||||
Tag: "plain_text",
|
||||
},
|
||||
Template: color,
|
||||
},
|
||||
Elements: []Element{
|
||||
Element{
|
||||
Tag: "div",
|
||||
Text: Te{
|
||||
Content: message,
|
||||
Tag: "lark_md",
|
||||
},
|
||||
},
|
||||
{
|
||||
Tag: "hr",
|
||||
},
|
||||
{
|
||||
Tag: "note",
|
||||
Elements: []Element{
|
||||
{
|
||||
Content: SendTitle,
|
||||
Tag: "lark_md",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
fs.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (fs *FeishuCardSender) SendRaw(users []*models.User, title, message string) {
|
||||
if len(users) == 0 {
|
||||
return
|
||||
}
|
||||
urls, _ := fs.extract(users)
|
||||
body := feishuCard{
|
||||
Msgtype: "text",
|
||||
Content: feishuCardContent{
|
||||
Text: message,
|
||||
},
|
||||
}
|
||||
for _, url := range urls {
|
||||
fs.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (fs *FeishuCardSender) extract(users []*models.User) ([]string, []string) {
|
||||
urls := make([]string, 0, len(users))
|
||||
ats := make([]string, 0, len(users))
|
||||
|
||||
for _, user := range users {
|
||||
if user.Phone != "" {
|
||||
ats = append(ats, user.Phone)
|
||||
}
|
||||
if token, has := user.ExtractToken(models.FeishuCard); has {
|
||||
url := token
|
||||
if !strings.HasPrefix(token, "https://") {
|
||||
url = "https://open.feishu.cn/open-apis/bot/v2/hook/" + token
|
||||
}
|
||||
urls = append(urls, url)
|
||||
}
|
||||
}
|
||||
return urls, ats
|
||||
}
|
||||
|
||||
func (fs *FeishuCardSender) doSend(url string, body feishuCard) {
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("feishu_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("feishu_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
}
|
||||
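`FeishuCardSender.Send` picks the card header color from the rendered message text. The sketch below restates that rule as a pure function so it can be checked in isolation. `cardColor` is a hypothetical name, and `strings.Contains` stands in for the `strings.Count(...) > 0` test used in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// cardColor returns "orange" when the message mentions both states,
// "green" when it only mentions a recovery, and "red" otherwise.
func cardColor(message string) string {
	recovered := strings.Contains(message, "Recovered")
	triggered := strings.Contains(message, "Triggered")
	switch {
	case recovered && triggered:
		return "orange"
	case recovered:
		return "green"
	default:
		return "red"
	}
}

func main() {
	fmt.Println(cardColor("status: Triggered"))            // red
	fmt.Println(cardColor("status: Recovered"))            // green
	fmt.Println(cardColor("Triggered earlier, Recovered")) // orange
}
```

Because the rule depends on literal words in the rendered text, a template that does not emit `Recovered` or `Triggered` always produces a red card.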
@@ -1,12 +1,15 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"html/template"
|
||||
"net/url"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type MatterMostMessage struct {
|
||||
@@ -20,16 +23,49 @@ type mm struct {
|
||||
Text string `json:"text"`
|
||||
}
|
||||
|
||||
func MapStrToStr(arr []string, fn func(s string) string) []string {
|
||||
var newArray = []string{}
|
||||
for _, it := range arr {
|
||||
newArray = append(newArray, fn(it))
|
||||
type MmSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (ms *MmSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
return newArray
|
||||
|
||||
urls := ms.extract(ctx.Users)
|
||||
if len(urls) == 0 {
|
||||
return
|
||||
}
|
||||
message := BuildTplMessage(ms.tpl, ctx.Event)
|
||||
|
||||
SendMM(MatterMostMessage{
|
||||
Text: message,
|
||||
Tokens: urls,
|
||||
})
|
||||
}
|
||||
|
||||
func (ms *MmSender) SendRaw(users []*models.User, title, message string) {
|
||||
urls := ms.extract(users)
|
||||
if len(urls) == 0 {
|
||||
return
|
||||
}
|
||||
SendMM(MatterMostMessage{
|
||||
Text: message,
|
||||
Tokens: urls,
|
||||
})
|
||||
}
|
||||
|
||||
func (ms *MmSender) extract(users []*models.User) []string {
|
||||
tokens := make([]string, 0, len(users))
|
||||
for _, user := range users {
|
||||
if token, has := user.ExtractToken(models.Mm); has {
|
||||
tokens = append(tokens, token)
|
||||
}
|
||||
}
|
||||
return tokens
|
||||
}
|
||||
|
||||
func SendMM(message MatterMostMessage) {
|
||||
|
||||
for i := 0; i < len(message.Tokens); i++ {
|
||||
u, err := url.Parse(message.Tokens[i])
|
||||
if err != nil {
|
||||
@@ -71,3 +107,11 @@ func SendMM(message MatterMostMessage) {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func MapStrToStr(arr []string, fn func(s string) string) []string {
|
||||
var newArray = []string{}
|
||||
for _, it := range arr {
|
||||
newArray = append(newArray, fn(it))
|
||||
}
|
||||
return newArray
|
||||
}
|
||||
|
||||
77
src/server/common/sender/plugin.go
Normal file
@@ -0,0 +1,77 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"os/exec"
|
||||
"time"
|
||||
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/notifier"
|
||||
"github.com/didi/nightingale/v5/src/pkg/sys"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
)
|
||||
|
||||
func MayPluginNotify(noticeBytes []byte) {
|
||||
if len(noticeBytes) == 0 {
|
||||
return
|
||||
}
|
||||
alertingCallPlugin(noticeBytes)
|
||||
alertingCallScript(noticeBytes)
|
||||
}
|
||||
|
||||
func alertingCallScript(stdinBytes []byte) {
|
||||
// not enabled or no script path configured? do nothing
|
||||
if !config.C.Alerting.CallScript.Enable || config.C.Alerting.CallScript.ScriptPath == "" {
|
||||
return
|
||||
}
|
||||
|
||||
fpath := config.C.Alerting.CallScript.ScriptPath
|
||||
cmd := exec.Command(fpath)
|
||||
cmd.Stdin = bytes.NewReader(stdinBytes)
|
||||
|
||||
// combine stdout and stderr
|
||||
var buf bytes.Buffer
|
||||
cmd.Stdout = &buf
|
||||
cmd.Stderr = &buf
|
||||
|
||||
err := startCmd(cmd)
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: run cmd err: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
err, isTimeout := sys.WrapTimeout(cmd, time.Duration(config.C.Alerting.Timeout)*time.Millisecond)
|
||||
|
||||
if isTimeout {
|
||||
if err == nil {
|
||||
logger.Errorf("event_notify: timeout and killed process %s", fpath)
|
||||
}
|
||||
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: kill process %s occur error %v", fpath, err)
|
||||
}
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: exec script %s occur error: %v, output: %s", fpath, err, buf.String())
|
||||
return
|
||||
}
|
||||
|
||||
logger.Infof("event_notify: exec %s output: %s", fpath, buf.String())
|
||||
}
|
||||
|
||||
// call notify.so via golang plugin build
|
||||
// e.g. etc/script/notify/notify.so
|
||||
func alertingCallPlugin(stdinBytes []byte) {
|
||||
if !config.C.Alerting.CallPlugin.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
logger.Debugf("alertingCallPlugin begin")
|
||||
logger.Debugf("payload: %s", string(stdinBytes))
|
||||
notifier.Instance.Notify(stdinBytes)
|
||||
logger.Debugf("alertingCallPlugin done")
|
||||
}
|
||||
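`alertingCallScript` feeds the notice bytes to a script on stdin, merges stdout and stderr into one buffer, and kills the process on timeout. The sketch below shows the same pattern using only the standard library. It swaps the repository's `startCmd` and `sys.WrapTimeout` helpers for `exec.CommandContext`, so it is an approximation rather than the code in this diff. `runScript` and the script path are hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runScript runs the executable at path with stdin as its standard input
// and returns the combined stdout/stderr output.
func runScript(path string, stdin []byte, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the process once ctx expires.
	cmd := exec.CommandContext(ctx, path)
	cmd.Stdin = bytes.NewReader(stdin)

	// combine stdout and stderr
	var buf bytes.Buffer
	cmd.Stdout = &buf
	cmd.Stderr = &buf

	err := cmd.Run()
	if ctx.Err() == context.DeadlineExceeded {
		return buf.String(), fmt.Errorf("script %s timed out after %s", path, timeout)
	}
	return buf.String(), err
}

func main() {
	// Placeholder path; point this at a real executable to try it out.
	out, err := runScript("/path/to/notify.py", []byte(`{"event":{}}`), 2*time.Second)
	fmt.Printf("output=%q err=%v\n", out, err)
}
```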
@@ -1,7 +1,7 @@
|
||||
//go:build !windows
|
||||
// +build !windows
|
||||
|
||||
package engine
|
||||
package sender
|
||||
|
||||
import (
|
||||
"os/exec"
|
||||
@@ -1,4 +1,4 @@
|
||||
package engine
|
||||
package sender
|
||||
|
||||
import "os/exec"
|
||||
|
||||
26
src/server/common/sender/redis_pub.go
Normal file
@@ -0,0 +1,26 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"context"
|
||||
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/didi/nightingale/v5/src/storage"
|
||||
)
|
||||
|
||||
func PublishToRedis(clusterName string, bs []byte) {
|
||||
if len(bs) == 0 {
|
||||
return
|
||||
}
|
||||
if !config.C.Alerting.RedisPub.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
// pub all alerts to redis
|
||||
channelKey := config.C.Alerting.RedisPub.ChannelPrefix + clusterName
|
||||
err := storage.Redis.Publish(context.Background(), channelKey, bs).Err()
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: redis publish %s err: %v", channelKey, err)
|
||||
}
|
||||
}
|
||||
73
src/server/common/sender/sender.go
Normal file
@@ -0,0 +1,73 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"html/template"
|
||||
|
||||
"github.com/toolkits/pkg/slice"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
)
|
||||
|
||||
type (
|
||||
// Sender is the interface for sending notification messages
|
||||
Sender interface {
|
||||
Send(ctx MessageContext)
|
||||
|
||||
// SendRaw sends a raw message; currently used by notifyMaintainer
|
||||
SendRaw(users []*models.User, title, message string)
|
||||
}
|
||||
|
||||
// MessageContext is the context of the alert notification generated by a single event
|
||||
MessageContext struct {
|
||||
Users []*models.User
|
||||
Rule *models.AlertRule
|
||||
Event *models.AlertCurEvent
|
||||
}
|
||||
)
|
||||
|
||||
func NewSender(key string, tpls map[string]*template.Template) Sender {
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, key) {
|
||||
return nil
|
||||
}
|
||||
|
||||
switch key {
|
||||
case models.Dingtalk:
|
||||
return &DingtalkSender{tpl: tpls["dingtalk.tpl"]}
|
||||
case models.Wecom:
|
||||
return &WecomSender{tpl: tpls["wecom.tpl"]}
|
||||
case models.Feishu:
|
||||
return &FeishuSender{tpl: tpls["feishu.tpl"]}
|
||||
case models.FeishuCard:
|
||||
return &FeishuCardSender{tpl: tpls["feishucard.tpl"]}
|
||||
case models.Email:
|
||||
return &EmailSender{subjectTpl: tpls["subject.tpl"], contentTpl: tpls["mailbody.tpl"]}
|
||||
case models.Mm:
|
||||
return &MmSender{tpl: tpls["mm.tpl"]}
|
||||
case models.Telegram:
|
||||
return &TelegramSender{tpl: tpls["telegram.tpl"]}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func BuildMessageContext(rule *models.AlertRule, event *models.AlertCurEvent, uids []int64) MessageContext {
|
||||
users := memsto.UserCache.GetByUserIds(uids)
|
||||
return MessageContext{
|
||||
Rule: rule,
|
||||
Event: event,
|
||||
Users: users,
|
||||
}
|
||||
}
|
||||
|
||||
func BuildTplMessage(tpl *template.Template, event *models.AlertCurEvent) string {
|
||||
if tpl == nil {
|
||||
return "tpl for current sender not found, please check configuration"
|
||||
}
|
||||
var body bytes.Buffer
|
||||
if err := tpl.Execute(&body, event); err != nil {
|
||||
return err.Error()
|
||||
}
|
||||
return body.String()
|
||||
}
|
||||
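`BuildTplMessage` is the shared rendering step for every sender: it returns a fixed hint when the template is missing and the error text when execution fails. The sketch below reproduces that behavior against a stand-in `event` struct, since `models.AlertCurEvent` is defined outside this diff. The names `event`, `render`, and `mustParse` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// event is a stand-in for models.AlertCurEvent with two illustrative fields.
type event struct {
	RuleName string
	Severity int
}

// mustParse compiles a template from text and panics on a syntax error.
func mustParse(text string) *template.Template {
	return template.Must(template.New("demo.tpl").Parse(text))
}

// render executes tpl against ev, falling back to a hint or the error text.
func render(tpl *template.Template, ev event) string {
	if tpl == nil {
		return "tpl for current sender not found, please check configuration"
	}
	var body bytes.Buffer
	if err := tpl.Execute(&body, ev); err != nil {
		return err.Error()
	}
	return body.String()
}

func main() {
	tpl := mustParse("{{.RuleName}} severity={{.Severity}}")
	fmt.Println(render(tpl, event{RuleName: "cpu_high", Severity: 2}))
	fmt.Println(render(nil, event{}))
}
```

Because the senders use `html/template`, values containing characters such as `<` or `&` are HTML-escaped in the rendered message, including for plain-text channels.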
@@ -1,11 +1,14 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"html/template"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type TelegramMessage struct {
|
||||
@@ -18,6 +21,41 @@ type telegram struct {
|
||||
Text string `json:"text"`
|
||||
}
|
||||
|
||||
type TelegramSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (ts *TelegramSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
tokens := ts.extract(ctx.Users)
|
||||
message := BuildTplMessage(ts.tpl, ctx.Event)
|
||||
|
||||
SendTelegram(TelegramMessage{
|
||||
Text: message,
|
||||
Tokens: tokens,
|
||||
})
|
||||
}
|
||||
|
||||
func (ts *TelegramSender) SendRaw(users []*models.User, title, message string) {
|
||||
tokens := ts.extract(users)
|
||||
SendTelegram(TelegramMessage{
|
||||
Text: message,
|
||||
Tokens: tokens,
|
||||
})
|
||||
}
|
||||
|
||||
func (ts *TelegramSender) extract(users []*models.User) []string {
|
||||
tokens := make([]string, 0, len(users))
|
||||
for _, user := range users {
|
||||
if token, has := user.ExtractToken(models.Telegram); has {
|
||||
tokens = append(tokens, token)
|
||||
}
|
||||
}
|
||||
return tokens
|
||||
}
|
||||
|
||||
func SendTelegram(message TelegramMessage) {
|
||||
for i := 0; i < len(message.Tokens); i++ {
|
||||
if !strings.Contains(message.Tokens[i], "/") && !strings.HasPrefix(message.Tokens[i], "https://") {
|
||||
|
||||
66
src/server/common/sender/webhook.go
Normal file
@@ -0,0 +1,66 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"io/ioutil"
|
||||
"net/http"
|
||||
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
)
|
||||
|
||||
func SendWebhooks(webhooks []config.Webhook, event *models.AlertCurEvent) {
|
||||
for _, conf := range webhooks {
|
||||
if conf.Url == "" || !conf.Enable {
|
||||
continue
|
||||
}
|
||||
bs, err := json.Marshal(event)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
|
||||
bf := bytes.NewBuffer(bs)
|
||||
|
||||
req, err := http.NewRequest("POST", conf.Url, bf)
|
||||
if err != nil {
|
||||
logger.Warning("alertingWebhook failed to new request", err)
|
||||
continue
|
||||
}
|
||||
|
||||
if conf.BasicAuthUser != "" && conf.BasicAuthPass != "" {
|
||||
req.SetBasicAuth(conf.BasicAuthUser, conf.BasicAuthPass)
|
||||
}
|
||||
|
||||
if len(conf.Headers) > 0 && len(conf.Headers)%2 == 0 {
|
||||
for i := 0; i < len(conf.Headers); i += 2 {
|
||||
if conf.Headers[i] == "host" {
|
||||
req.Host = conf.Headers[i+1]
|
||||
continue
|
||||
}
|
||||
req.Header.Set(conf.Headers[i], conf.Headers[i+1])
|
||||
}
|
||||
}
|
||||
|
||||
client := http.Client{
|
||||
Timeout: conf.TimeoutDuration,
|
||||
}
|
||||
|
||||
var resp *http.Response
|
||||
resp, err = client.Do(req)
|
||||
if err != nil {
|
||||
logger.Warningf("WebhookCallError, ruleId: [%d], eventId: [%d], url: [%s], error: [%s]", event.RuleId, event.Id, conf.Url, err)
|
||||
continue
|
||||
}
|
||||
|
||||
var body []byte
if resp.Body != nil {
	body, _ = ioutil.ReadAll(resp.Body)
	resp.Body.Close()
}
|
||||
|
||||
logger.Debugf("alertingWebhook done, url: %s, response code: %d, body: %s", conf.Url, resp.StatusCode, string(body))
|
||||
}
|
||||
}
|
||||
@@ -1,17 +1,15 @@
|
||||
package sender
|
||||
|
||||
import (
|
||||
"html/template"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
type WecomMessage struct {
|
||||
Text string
|
||||
Tokens []string
|
||||
}
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/pkg/poster"
|
||||
)
|
||||
|
||||
type wecomMarkdown struct {
|
||||
Content string `json:"content"`
|
||||
@@ -22,24 +20,59 @@ type wecom struct {
|
||||
Markdown wecomMarkdown `json:"markdown"`
|
||||
}
|
||||
|
||||
func SendWecom(message WecomMessage) {
|
||||
for i := 0; i < len(message.Tokens); i++ {
|
||||
url := "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=" + message.Tokens[i]
|
||||
if strings.HasPrefix(message.Tokens[i], "https://") {
|
||||
url = message.Tokens[i]
|
||||
}
|
||||
type WecomSender struct {
|
||||
tpl *template.Template
|
||||
}
|
||||
|
||||
func (ws *WecomSender) Send(ctx MessageContext) {
|
||||
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
|
||||
return
|
||||
}
|
||||
urls := ws.extract(ctx.Users)
|
||||
message := BuildTplMessage(ws.tpl, ctx.Event)
|
||||
for _, url := range urls {
|
||||
body := wecom{
|
||||
Msgtype: "markdown",
|
||||
Markdown: wecomMarkdown{
|
||||
Content: message.Text,
|
||||
Content: message,
|
||||
},
|
||||
}
|
||||
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("wecom_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("wecom_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
ws.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (ws *WecomSender) SendRaw(users []*models.User, title, message string) {
|
||||
urls := ws.extract(users)
|
||||
for _, url := range urls {
|
||||
body := wecom{
|
||||
Msgtype: "markdown",
|
||||
Markdown: wecomMarkdown{
|
||||
Content: message,
|
||||
},
|
||||
}
|
||||
ws.doSend(url, body)
|
||||
}
|
||||
}
|
||||
|
||||
func (ws *WecomSender) extract(users []*models.User) []string {
|
||||
urls := make([]string, 0, len(users))
|
||||
for _, user := range users {
|
||||
if token, has := user.ExtractToken(models.Wecom); has {
|
||||
url := token
|
||||
if !strings.HasPrefix(token, "https://") {
|
||||
url = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=" + token
|
||||
}
|
||||
urls = append(urls, url)
|
||||
}
|
||||
}
|
||||
return urls
|
||||
}
|
||||
|
||||
func (ws *WecomSender) doSend(url string, body wecom) {
|
||||
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
|
||||
if err != nil {
|
||||
logger.Errorf("wecom_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
|
||||
} else {
|
||||
logger.Infof("wecom_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
|
||||
}
|
||||
}
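The token handling in `extract` boils down to one rule: a value that already starts with `https://` is used as the full URL, and anything else is treated as a bare key appended to the WeCom webhook endpoint. A minimal sketch of that rule follows; `webhookURL` is a hypothetical helper name and the sample tokens are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint prefix as it appears in the diff above.
const wecomWebhookPrefix = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key="

// webhookURL is a hypothetical helper: a full https URL is kept as-is,
// a bare token is appended to the default endpoint.
func webhookURL(token string) string {
	if strings.HasPrefix(token, "https://") {
		return token
	}
	return wecomWebhookPrefix + token
}

func main() {
	for _, t := range []string{"abc123", "https://proxy.example.com/send?key=abc123"} {
		fmt.Println(webhookURL(t))
	}
}
```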
|
||||
|
||||
@@ -2,9 +2,11 @@ package config
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"html/template"
|
||||
"log"
|
||||
"net"
|
||||
"os"
|
||||
"path"
|
||||
"plugin"
|
||||
"runtime"
|
||||
"strings"
|
||||
@@ -13,6 +15,9 @@ import (
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/koding/multiconfig"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/toolkits/pkg/file"
|
||||
"github.com/toolkits/pkg/runner"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/notifier"
|
||||
@@ -20,6 +25,7 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/pkg/logx"
|
||||
"github.com/didi/nightingale/v5/src/pkg/ormx"
|
||||
"github.com/didi/nightingale/v5/src/pkg/secu"
|
||||
"github.com/didi/nightingale/v5/src/pkg/tplx"
|
||||
"github.com/didi/nightingale/v5/src/storage"
|
||||
)
|
||||
|
||||
@@ -168,45 +174,7 @@ func MustLoad(key string, fpaths ...string) {
|
||||
|
||||
C.Heartbeat.Endpoint = fmt.Sprintf("%s:%d", C.Heartbeat.IP, C.HTTP.Port)
|
||||
|
||||
if C.Alerting.Webhook.Enable {
|
||||
if C.Alerting.Webhook.Timeout == "" {
|
||||
C.Alerting.Webhook.TimeoutDuration = time.Second * 5
|
||||
} else {
|
||||
dur, err := time.ParseDuration(C.Alerting.Webhook.Timeout)
|
||||
if err != nil {
|
||||
fmt.Println("failed to parse Alerting.Webhook.Timeout")
|
||||
os.Exit(1)
|
||||
}
|
||||
C.Alerting.Webhook.TimeoutDuration = dur
|
||||
}
|
||||
}
|
||||
|
||||
if C.Alerting.CallPlugin.Enable {
|
||||
if runtime.GOOS == "windows" {
|
||||
fmt.Println("notify plugin on unsupported os:", runtime.GOOS)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
p, err := plugin.Open(C.Alerting.CallPlugin.PluginPath)
|
||||
if err != nil {
|
||||
fmt.Println("failed to load plugin:", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
caller, err := p.Lookup(C.Alerting.CallPlugin.Caller)
|
||||
if err != nil {
|
||||
fmt.Println("failed to lookup plugin Caller:", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
ins, ok := caller.(notifier.Notifier)
|
||||
if !ok {
|
||||
log.Println("notifier interface not implemented")
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
notifier.Instance = ins
|
||||
}
|
||||
C.Alerting.check()
|
||||
|
||||
if C.WriterOpt.QueueMaxSize <= 0 {
|
||||
C.WriterOpt.QueueMaxSize = 10000000
|
||||
@@ -267,7 +235,7 @@ type Config struct {
|
||||
EngineDelay int64
|
||||
DisableUsageReport bool
|
||||
ReaderFrom string
|
||||
LabelRewrite bool
|
||||
LabelRewrite bool
|
||||
ForceUseServerTS bool
|
||||
Log logx.Config
|
||||
HTTP httpx.Config
|
||||
@@ -341,6 +309,89 @@ type Alerting struct {
|
||||
Webhook Webhook
|
||||
}
|
||||
|
||||
func (a *Alerting) check() {
|
||||
if a.Webhook.Enable {
|
||||
if a.Webhook.Timeout == "" {
|
||||
a.Webhook.TimeoutDuration = time.Second * 5
|
||||
} else {
|
||||
dur, err := time.ParseDuration(C.Alerting.Webhook.Timeout)
|
||||
if err != nil {
|
||||
fmt.Println("failed to parse Alerting.Webhook.Timeout")
|
||||
os.Exit(1)
|
||||
}
|
||||
a.Webhook.TimeoutDuration = dur
|
||||
}
|
||||
}
|
||||
|
||||
if a.CallPlugin.Enable {
|
||||
if runtime.GOOS == "windows" {
|
||||
fmt.Println("notify plugin on unsupported os:", runtime.GOOS)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
p, err := plugin.Open(a.CallPlugin.PluginPath)
|
||||
if err != nil {
|
||||
fmt.Println("failed to load plugin:", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
caller, err := p.Lookup(a.CallPlugin.Caller)
|
||||
if err != nil {
|
||||
fmt.Println("failed to lookup plugin Caller:", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
ins, ok := caller.(notifier.Notifier)
|
||||
if !ok {
|
||||
log.Println("notifier interface not implemented")
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
notifier.Instance = ins
|
||||
}
|
||||
|
||||
if a.TemplatesDir == "" {
|
||||
a.TemplatesDir = path.Join(runner.Cwd, "etc", "template")
|
||||
}
|
||||
|
||||
if a.Timeout == 0 {
|
||||
a.Timeout = 30000
|
||||
}
|
||||
}
|
||||
|
||||
func (a *Alerting) ListTpls() (map[string]*template.Template, error) {
|
||||
filenames, err := file.FilesUnder(a.TemplatesDir)
|
||||
if err != nil {
|
||||
return nil, errors.WithMessage(err, "failed to exec FilesUnder")
|
||||
}
|
||||
|
||||
if len(filenames) == 0 {
|
||||
return nil, errors.New("no tpl files under " + a.TemplatesDir)
|
||||
}
|
||||
|
||||
tplFiles := make([]string, 0, len(filenames))
|
||||
for i := 0; i < len(filenames); i++ {
|
||||
if strings.HasSuffix(filenames[i], ".tpl") {
|
||||
tplFiles = append(tplFiles, filenames[i])
|
||||
}
|
||||
}
|
||||
|
||||
if len(tplFiles) == 0 {
|
||||
return nil, errors.New("no tpl files under " + a.TemplatesDir)
|
||||
}
|
||||
|
||||
tpls := make(map[string]*template.Template)
|
||||
for _, tplFile := range tplFiles {
|
||||
tplpath := path.Join(a.TemplatesDir, tplFile)
|
||||
tpl, err := template.New(tplFile).Funcs(tplx.TemplateFuncMap).ParseFiles(tplpath)
|
||||
if err != nil {
|
||||
return nil, errors.WithMessage(err, "failed to parse tpl: "+tplpath)
|
||||
}
|
||||
tpls[tplFile] = tpl
|
||||
}
|
||||
return tpls, nil
|
||||
}
|
||||
|
||||
type CallScript struct {
|
||||
Enable bool
|
||||
ScriptPath string
|
||||
|
||||
@@ -1,6 +1,10 @@
|
||||
package config
|
||||
|
||||
import "sync"
|
||||
import (
|
||||
"sync"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/pkg/tls"
|
||||
)
|
||||
|
||||
type PromOption struct {
|
||||
ClusterName string
|
||||
@@ -11,6 +15,9 @@ type PromOption struct {
|
||||
Timeout int64
|
||||
DialTimeout int64
|
||||
|
||||
UseTLS bool
|
||||
tls.ClientConfig
|
||||
|
||||
MaxIdleConnsPerHost int
|
||||
|
||||
Headers []string
|
||||
|
||||
@@ -119,17 +119,27 @@ func loadFromDatabase() {
|
||||
}
|
||||
|
||||
func newClientFromPromOption(po PromOption) (api.Client, error) {
|
||||
transport := &http.Transport{
|
||||
Proxy: http.ProxyFromEnvironment,
|
||||
DialContext: (&net.Dialer{
|
||||
Timeout: time.Duration(po.DialTimeout) * time.Millisecond,
|
||||
}).DialContext,
|
||||
ResponseHeaderTimeout: time.Duration(po.Timeout) * time.Millisecond,
|
||||
MaxIdleConnsPerHost: po.MaxIdleConnsPerHost,
|
||||
}
|
||||
|
||||
if po.UseTLS {
|
||||
tlsConfig, err := po.TLSConfig()
|
||||
if err != nil {
|
||||
logger.Errorf("new cluster %s fail: %v", po.Url, err)
|
||||
return nil, err
|
||||
}
|
||||
transport.TLSClientConfig = tlsConfig
|
||||
}
|
||||
|
||||
return api.NewClient(api.Config{
|
||||
Address: po.Url,
|
||||
RoundTripper: &http.Transport{
|
||||
// TLSClientConfig: tlsConfig,
|
||||
Proxy: http.ProxyFromEnvironment,
|
||||
DialContext: (&net.Dialer{
|
||||
Timeout: time.Duration(po.DialTimeout) * time.Millisecond,
|
||||
}).DialContext,
|
||||
ResponseHeaderTimeout: time.Duration(po.Timeout) * time.Millisecond,
|
||||
MaxIdleConnsPerHost: po.MaxIdleConnsPerHost,
|
||||
},
|
||||
Address: po.Url,
|
||||
RoundTripper: transport,
|
||||
})
|
||||
}
|
||||
|
||||
@@ -142,6 +152,11 @@ func setClientFromPromOption(clusterName string, po PromOption) error {
|
||||
return fmt.Errorf("prometheus url is blank")
|
||||
}
|
||||
|
||||
if strings.HasPrefix(po.Url, "https") {
|
||||
po.UseTLS = true
|
||||
po.InsecureSkipVerify = true
|
||||
}
|
||||
|
||||
cli, err := newClientFromPromOption(po)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to newClientFromPromOption: %v", err)
|
||||
|
||||
@@ -59,9 +59,7 @@ func consumeOne(event *models.AlertCurEvent) {
|
||||
return
|
||||
}
|
||||
|
||||
fillUsers(event)
|
||||
callback(event)
|
||||
notify(event)
|
||||
HandleEventNotify(event, false)
|
||||
}
|
||||
|
||||
func persist(event *models.AlertCurEvent) {
|
||||
@@ -149,7 +147,6 @@ func fillUsers(e *models.AlertCurEvent) {
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
|
||||
gids = append(gids, gid)
|
||||
}
|
||||
|
||||
@@ -173,11 +170,3 @@ func mapKeys(m map[int64]struct{}) []int64 {
|
||||
}
|
||||
return lst
|
||||
}
|
||||
|
||||
func StringSetKeys(m map[string]struct{}) []string {
|
||||
lst := make([]string, 0, len(m))
|
||||
for k := range m {
|
||||
lst = append(lst, k)
|
||||
}
|
||||
return lst
|
||||
}
|
||||
|
||||
@@ -23,9 +23,6 @@ func Start(ctx context.Context) error {
|
||||
// start loop consumer
|
||||
go loopConsume(ctx)
|
||||
|
||||
// filter my rules and start worker
|
||||
//go loopFilterRules(ctx)
|
||||
|
||||
go ruleHolder.LoopSyncRules(ctx)
|
||||
|
||||
go reportQueueSize()
|
||||
|
||||
@@ -1,8 +1,9 @@
|
||||
package engine
|
||||
|
||||
import (
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
)
|
||||
|
||||
func LogEvent(event *models.AlertCurEvent, location string, err ...error) {
|
||||
|
||||
@@ -11,33 +11,26 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
)
|
||||
|
||||
var AlertMuteStrategies = AlertMuteStrategiesType{
|
||||
&TimeNonEffectiveMuteStrategy{},
|
||||
&IdentNotExistsMuteStrategy{},
|
||||
&BgNotMatchMuteStrategy{},
|
||||
&EventMuteStrategy{},
|
||||
type MuteStrategyFunc func(rule *models.AlertRule, event *models.AlertCurEvent) bool
|
||||
|
||||
var AlertMuteStrategies = []MuteStrategyFunc{
|
||||
TimeNonEffectiveMuteStrategy,
|
||||
IdentNotExistsMuteStrategy,
|
||||
BgNotMatchMuteStrategy,
|
||||
EventMuteStrategy,
|
||||
}
|
||||
|
||||
type AlertMuteStrategiesType []AlertMuteStrategy
|
||||
|
||||
func (ss AlertMuteStrategiesType) IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
for _, s := range ss {
|
||||
if s.IsMuted(rule, event) {
|
||||
func IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
for _, strategyFunc := range AlertMuteStrategies {
|
||||
if strategyFunc(rule, event) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
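The change above replaces an interface with one method by a plain function type. A minimal sketch of that pattern follows, assuming simplified stand-in types: `Rule`, `Event`, `disabledRule` and `missingIdent` are hypothetical and are not the project's models or strategies.

```go
package main

import "fmt"

// Rule and Event are simplified stand-ins for the real model types.
type Rule struct{ Disabled bool }
type Event struct{ Tags map[string]string }

// MuteFunc reports whether an event should be muted for a rule.
type MuteFunc func(rule *Rule, event *Event) bool

// Two made-up strategies, for illustration only.
func disabledRule(rule *Rule, _ *Event) bool { return rule.Disabled }

func missingIdent(_ *Rule, event *Event) bool {
	_, ok := event.Tags["ident"]
	return !ok
}

// strategies is evaluated in order; the first match wins.
var strategies = []MuteFunc{disabledRule, missingIdent}

// isMuted returns true as soon as any strategy mutes the event.
func isMuted(rule *Rule, event *Event) bool {
	for _, strategy := range strategies {
		if strategy(rule, event) {
			return true
		}
	}
	return false
}

func main() {
	ev := &Event{Tags: map[string]string{"ident": "host-1"}}
	fmt.Println(isMuted(&Rule{Disabled: true}, ev), isMuted(&Rule{}, ev))
}
```

With function values there is no receiver, which is presumably why the log calls in the diff switch from `%T` with `s` to a literal strategy name.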
|
||||
|
||||
// AlertMuteStrategy abstracts event filtering; returning true means the alert event should not be notified for some reason
|
||||
type AlertMuteStrategy interface {
|
||||
IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool
|
||||
}
|
||||
|
||||
// TimeNonEffectiveMuteStrategy filters by the alerting time window configured on the rule: an alert raised outside that window is muted
|
||||
type TimeNonEffectiveMuteStrategy struct{}
|
||||
|
||||
func (s *TimeNonEffectiveMuteStrategy) IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
func TimeNonEffectiveMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
if rule.Disabled == 1 {
|
||||
return true
|
||||
}
|
||||
@@ -46,24 +39,33 @@ func (s *TimeNonEffectiveMuteStrategy) IsMuted(rule *models.AlertRule, event *mo
|
||||
triggerTime := tm.Format("15:04")
|
||||
triggerWeek := strconv.Itoa(int(tm.Weekday()))
|
||||
|
||||
if rule.EnableStime <= rule.EnableEtime {
|
||||
if triggerTime < rule.EnableStime || triggerTime > rule.EnableEtime {
|
||||
return true
|
||||
enableStime := strings.Fields(rule.EnableStime)
|
||||
enableEtime := strings.Fields(rule.EnableEtime)
|
||||
enableDaysOfWeek := strings.Split(rule.EnableDaysOfWeek, ";")
|
||||
length := len(enableDaysOfWeek)
|
||||
// enableStime, enableEtime and enableDaysOfWeek are expected to have the same length, so looping over one of them is enough
|
||||
for i := 0; i < length; i++ {
|
||||
enableDaysOfWeek[i] = strings.Replace(enableDaysOfWeek[i], "7", "0", 1)
|
||||
if !strings.Contains(enableDaysOfWeek[i], triggerWeek) {
|
||||
continue
|
||||
}
|
||||
} else {
|
||||
if triggerTime < rule.EnableStime && triggerTime > rule.EnableEtime {
|
||||
return true
|
||||
if enableStime[i] <= enableEtime[i] {
|
||||
if triggerTime < enableStime[i] || triggerTime > enableEtime[i] {
|
||||
continue
|
||||
}
|
||||
} else {
|
||||
if triggerTime < enableStime[i] && triggerTime > enableEtime[i] {
|
||||
continue
|
||||
}
|
||||
}
|
||||
// reaching this point means the current time falls inside one of the rule's effective time ranges, so return false directly
|
||||
return false
|
||||
}
|
||||
|
||||
rule.EnableDaysOfWeek = strings.Replace(rule.EnableDaysOfWeek, "7", "0", 1)
|
||||
return !strings.Contains(rule.EnableDaysOfWeek, triggerWeek)
|
||||
return true
|
||||
}
|
||||
|
||||
// IdentNotExistsMuteStrategy filters by whether the ident exists: if it does not, target_up alerts are dropped
|
||||
type IdentNotExistsMuteStrategy struct{}
|
||||
|
||||
func (s *IdentNotExistsMuteStrategy) IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
func IdentNotExistsMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
ident, has := event.TagsMap["ident"]
|
||||
if !has {
|
||||
return false
|
||||
@@ -72,16 +74,14 @@ func (s *IdentNotExistsMuteStrategy) IsMuted(rule *models.AlertRule, event *mode
|
||||
// if this is a target_up alert and the ident no longer exists, drop it
|
||||
// this check is rather crude, but there is no better approach for now
|
||||
if !exists && strings.Contains(rule.PromQl, "target_up") {
|
||||
logger.Debugf("[%T] mute: rule_eval:%d cluster:%s ident:%s", s, rule.Id, event.Cluster, ident)
|
||||
logger.Debugf("[%s] mute: rule_eval:%d cluster:%s ident:%s", "IdentNotExistsMuteStrategy", rule.Id, event.Cluster, ident)
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// BgNotMatchMuteStrategy: when the rule is set to alert only inside its own BG, machines outside that BG are filtered out
|
||||
type BgNotMatchMuteStrategy struct{}
|
||||
|
||||
func (s *BgNotMatchMuteStrategy) IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
func BgNotMatchMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
// BG-only alerting is not enabled, so do not filter
|
||||
if rule.EnableInBG == 0 {
|
||||
return false
|
||||
@@ -96,17 +96,13 @@ func (s *BgNotMatchMuteStrategy) IsMuted(rule *models.AlertRule, event *models.A
|
||||
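// for alert events that carry an ident, check whether the ident's BG is the same as the rule's BG
|
||||
// if the rule is set to take effect only in its own BG, machines in other BGs must not raise alerts from this rule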
|
||||
if exists && target.GroupId != rule.GroupId {
|
||||
logger.Debugf("[%T] mute: rule_eval:%d cluster:%s", s, rule.Id, event.Cluster)
|
||||
logger.Debugf("[%s] mute: rule_eval:%d cluster:%s", "BgNotMatchMuteStrategy", rule.Id, event.Cluster)
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
type EventMuteStrategy struct{}
|
||||
|
||||
var EventMuteStra = new(EventMuteStrategy)
|
||||
|
||||
func (s *EventMuteStrategy) IsMuted(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
func EventMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
|
||||
mutes, has := memsto.AlertMuteCache.Gets(event.GroupId)
|
||||
if !has || len(mutes) == 0 {
|
||||
return false
|
||||
|
||||
@@ -2,88 +2,164 @@ package engine
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"html/template"
|
||||
"io/ioutil"
|
||||
"net/http"
|
||||
"os/exec"
|
||||
"path"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/pkg/errors"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/toolkits/pkg/file"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/runner"
|
||||
"github.com/toolkits/pkg/slice"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/notifier"
|
||||
"github.com/didi/nightingale/v5/src/pkg/sys"
|
||||
"github.com/didi/nightingale/v5/src/pkg/tplx"
|
||||
"github.com/didi/nightingale/v5/src/server/common/sender"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
"github.com/didi/nightingale/v5/src/storage"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
var (
|
||||
tpls map[string]*template.Template
|
||||
rwLock sync.RWMutex
|
||||
rwLock sync.RWMutex
|
||||
tpls map[string]*template.Template
|
||||
Senders map[string]sender.Sender
|
||||
|
||||
// routers map an event to subscriptions; the resulting subscriptions are combined with OrMerge
|
||||
routers = []Router{GroupRouter, GlobalWebhookRouter, EventCallbacksRouter}
|
||||
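// interceptors remove some subscriptions; their results are combined with AndMerge, e.g. with channel=false nothing is sent through that channel after merging
|
||||
// if you implement such a Router, add it to interceptors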
|
||||
interceptors []Router
|
||||
|
||||
// extra handling for events produced by subscriptions
|
||||
subscribeRouters = []Router{GroupRouter}
|
||||
subscribeInterceptors []Router
|
||||
)
|
||||
|
||||
func reloadTpls() error {
|
||||
if config.C.Alerting.TemplatesDir == "" {
|
||||
config.C.Alerting.TemplatesDir = path.Join(runner.Cwd, "etc", "template")
|
||||
}
|
||||
|
||||
filenames, err := file.FilesUnder(config.C.Alerting.TemplatesDir)
|
||||
tmpTpls, err := config.C.Alerting.ListTpls()
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec FilesUnder")
|
||||
return err
|
||||
}
|
||||
|
||||
if len(filenames) == 0 {
|
||||
return errors.New("no tpl files under " + config.C.Alerting.TemplatesDir)
|
||||
}
|
||||
|
||||
tplFiles := make([]string, 0, len(filenames))
|
||||
for i := 0; i < len(filenames); i++ {
|
||||
if strings.HasSuffix(filenames[i], ".tpl") {
|
||||
tplFiles = append(tplFiles, filenames[i])
|
||||
}
|
||||
}
|
||||
|
||||
if len(tplFiles) == 0 {
|
||||
return errors.New("no tpl files under " + config.C.Alerting.TemplatesDir)
|
||||
}
|
||||
|
||||
tmpTpls := make(map[string]*template.Template)
|
||||
for i := 0; i < len(tplFiles); i++ {
|
||||
tplpath := path.Join(config.C.Alerting.TemplatesDir, tplFiles[i])
|
||||
|
||||
tpl, err := template.New(tplFiles[i]).Funcs(tplx.TemplateFuncMap).ParseFiles(tplpath)
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to parse tpl: "+tplpath)
|
||||
}
|
||||
|
||||
tmpTpls[tplFiles[i]] = tpl
|
||||
senders := map[string]sender.Sender{
|
||||
models.Email: sender.NewSender(models.Email, tmpTpls),
|
||||
models.Dingtalk: sender.NewSender(models.Dingtalk, tmpTpls),
|
||||
models.Wecom: sender.NewSender(models.Wecom, tmpTpls),
|
||||
models.Feishu: sender.NewSender(models.Feishu, tmpTpls),
|
||||
models.FeishuCard: sender.NewSender(models.FeishuCard, tmpTpls),
|
||||
models.Mm: sender.NewSender(models.Mm, tmpTpls),
|
||||
models.Telegram: sender.NewSender(models.Telegram, tmpTpls),
|
||||
}
|
||||
|
||||
rwLock.Lock()
|
||||
tpls = tmpTpls
|
||||
Senders = senders
|
||||
rwLock.Unlock()
|
||||
return nil
|
||||
}
|
||||
|
||||
// HandleEventNotify is the main logic for handling an event
|
||||
// event: an alert or recovery event
|
||||
// isSubscribe: whether the alert event was produced by a subscribe configuration
|
||||
func HandleEventNotify(event *models.AlertCurEvent, isSubscribe bool) {
|
||||
rule := memsto.AlertRuleCache.Get(event.RuleId)
|
||||
if rule == nil {
|
||||
return
|
||||
}
|
||||
fillUsers(event)
|
||||
|
||||
var (
|
||||
handlers []Router
|
||||
interceptorHandlers []Router
|
||||
)
|
||||
if isSubscribe {
|
||||
handlers = subscribeRouters
|
||||
interceptorHandlers = subscribeInterceptors
|
||||
} else {
|
||||
handlers = routers
|
||||
interceptorHandlers = interceptors
|
||||
}
|
||||
|
||||
subscription := NewSubscription()
|
||||
// build the subscriptions using OrMerge
|
||||
for _, handler := range handlers {
|
||||
subscription.OrMerge(handler(rule, event, subscription))
|
||||
}
|
||||
|
||||
// apply subscription-removal logic, e.g. an employee has left, or a channel is temporarily silenced
|
||||
for _, handler := range interceptorHandlers {
|
||||
subscription.AndMerge(handler(rule, event, subscription))
|
||||
}
|
||||
|
||||
// send the event; one goroutine handles all sends for a single event
|
||||
go Send(rule, event, subscription, isSubscribe)
|
||||
|
||||
// if the event did not come from a subscribe rule, the subscribe rules still need to be processed for it
|
||||
if !isSubscribe {
|
||||
handleSubs(event)
|
||||
}
|
||||
}
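HandleEventNotify first widens the subscription with every router (OrMerge) and then narrows it with every interceptor (AndMerge). The real `Subscription` type is not shown in this diff, so the following is only a sketch under loud assumptions: here a subscription is a hypothetical map from channel name to a send flag, and channels an interceptor does not mention are left untouched.

```go
package main

import "fmt"

// Subscription is a hypothetical, simplified stand-in:
// channel name -> whether to send through it.
type Subscription map[string]bool

// OrMerge enables every channel that other enables.
func (s Subscription) OrMerge(other Subscription) {
	for ch, on := range other {
		s[ch] = s[ch] || on
	}
}

// AndMerge lets other veto channels: a channel stays enabled only if
// other does not set it to false. Channels absent from other are kept
// as they are (an assumption of this sketch).
func (s Subscription) AndMerge(other Subscription) {
	for ch, on := range other {
		s[ch] = s[ch] && on
	}
}

func main() {
	sub := Subscription{}
	// Routers add channels.
	sub.OrMerge(Subscription{"email": true})
	sub.OrMerge(Subscription{"wecom": true})
	// An interceptor silences one channel.
	sub.AndMerge(Subscription{"wecom": false})
	fmt.Println(sub["email"], sub["wecom"])
}
```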
|
||||
|
||||
func handleSubs(event *models.AlertCurEvent) {
|
||||
// handle alert subscribes
|
||||
subscribes := make([]*models.AlertSubscribe, 0)
|
||||
// rule specific subscribes
|
||||
if subs, has := memsto.AlertSubscribeCache.Get(event.RuleId); has {
|
||||
subscribes = append(subscribes, subs...)
|
||||
}
|
||||
// global subscribes
|
||||
if subs, has := memsto.AlertSubscribeCache.Get(0); has {
|
||||
subscribes = append(subscribes, subs...)
|
||||
}
|
||||
for _, sub := range subscribes {
|
||||
handleSub(sub, *event)
|
||||
}
|
||||
}
|
||||
|
||||
// handleSub handles an event for a subscribe rule; note that event is passed by value because its state is modified later
|
||||
func handleSub(sub *models.AlertSubscribe, event models.AlertCurEvent) {
|
||||
if sub.IsDisabled() || !sub.MatchCluster(event.Cluster) {
|
||||
return
|
||||
}
|
||||
if !matchTags(event.TagsMap, sub.ITags) {
|
||||
return
|
||||
}
|
||||
|
||||
sub.ModifyEvent(&event)
|
||||
LogEvent(&event, "subscribe")
|
||||
HandleEventNotify(&event, true)
|
||||
}
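handleSub takes the event by value so that changes made for one subscription do not leak into the original event. A minimal sketch of what a value copy does and does not protect follows; `Event` and `redefine` are hypothetical stand-ins, not the project's types.

```go
package main

import "fmt"

// Event is a simplified stand-in with one scalar field and one map field.
type Event struct {
	Severity int
	Tags     map[string]string
}

// redefine receives a copy of the event, so assigning to a field changes
// only that copy.
func redefine(e Event, severity int) Event {
	e.Severity = severity
	return e
}

func main() {
	orig := Event{Severity: 2, Tags: map[string]string{"ident": "host-1"}}
	changed := redefine(orig, 1)
	fmt.Println(orig.Severity, changed.Severity)

	// The copy is shallow: both values still point at the same map.
	changed.Tags["note"] = "from-sub"
	fmt.Println(orig.Tags["note"])
}
```

Reassigning fields on the copy is safe, but writing into a shared map or slice element would still be visible through the original.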
|
||||
|
||||
func Send(rule *models.AlertRule, event *models.AlertCurEvent, subscription *Subscription, isSubscribe bool) {
|
||||
for channel, uids := range subscription.ToChannelUserMap() {
|
||||
ctx := sender.BuildMessageContext(rule, event, uids)
|
||||
rwLock.RLock()
|
||||
s := Senders[channel]
|
||||
rwLock.RUnlock()
|
||||
if s == nil {
|
||||
logger.Warningf("no sender for channel: %s", channel)
|
||||
continue
|
||||
}
|
||||
s.Send(ctx)
|
||||
}
|
||||
|
||||
// handle event callbacks
|
||||
sender.SendCallbacks(subscription.ToCallbackList(), event)
|
||||
|
||||
// handle global webhooks
|
||||
sender.SendWebhooks(subscription.ToWebhookList(), event)
|
||||
|
||||
noticeBytes := genNoticeBytes(event)
|
||||
|
||||
// handle plugin call
|
||||
go sender.MayPluginNotify(noticeBytes)
|
||||
|
||||
if !isSubscribe {
|
||||
// handle redis pub
|
||||
sender.PublishToRedis(event.Cluster, noticeBytes)
|
||||
}
|
||||
}
|
||||
|
||||
type Notice struct {
|
||||
Event *models.AlertCurEvent `json:"event"`
|
||||
Tpls map[string]string `json:"tpls"`
|
||||
}
|
||||
|
||||
func genNotice(event *models.AlertCurEvent) Notice {
|
||||
func genNoticeBytes(event *models.AlertCurEvent) []byte {
|
||||
// build notice body with templates
|
||||
ntpls := make(map[string]string)
|
||||
|
||||
@@ -98,393 +174,12 @@ func genNotice(event *models.AlertCurEvent) Notice {
|
||||
}
|
||||
}
|
||||
|
||||
return Notice{Event: event, Tpls: ntpls}
|
||||
}
|
||||
|
||||
func alertingRedisPub(clusterName string, bs []byte) {
|
||||
channelKey := config.C.Alerting.RedisPub.ChannelPrefix + clusterName
|
||||
// pub all alerts to redis
|
||||
if config.C.Alerting.RedisPub.Enable {
|
||||
err := storage.Redis.Publish(context.Background(), channelKey, bs).Err()
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: redis publish %s err: %v", channelKey, err)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func handleNotice(notice Notice, bs []byte) {
|
||||
alertingCallScript(bs)
|
||||
alertingCallPlugin(bs)
|
||||
|
||||
if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
emailset := make(map[string]struct{})
|
||||
phoneset := make(map[string]struct{})
|
||||
wecomset := make(map[string]struct{})
|
||||
dingtalkset := make(map[string]struct{})
|
||||
feishuset := make(map[string]struct{})
|
||||
mmset := make(map[string]struct{})
|
||||
telegramset := make(map[string]struct{})
|
||||
|
||||
for _, user := range notice.Event.NotifyUsersObj {
|
||||
if user.Email != "" {
|
||||
emailset[user.Email] = struct{}{}
|
||||
}
|
||||
|
||||
if user.Phone != "" {
|
||||
phoneset[user.Phone] = struct{}{}
|
||||
}
|
||||
|
||||
bs, err := user.Contacts.MarshalJSON()
|
||||
if err != nil {
|
||||
logger.Errorf("handle_notice: failed to marshal contacts: %v", err)
|
||||
continue
|
||||
}
|
||||
|
||||
ret := gjson.GetBytes(bs, "dingtalk_robot_token")
|
||||
if ret.Exists() {
|
||||
dingtalkset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "wecom_robot_token")
|
||||
if ret.Exists() {
|
||||
wecomset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "feishu_robot_token")
|
||||
if ret.Exists() {
|
||||
feishuset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "mm_webhook_url")
|
||||
if ret.Exists() {
|
||||
mmset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "telegram_robot_token")
|
||||
if ret.Exists() {
|
||||
telegramset[ret.String()] = struct{}{}
|
||||
}
|
||||
}
|
||||
|
||||
phones := StringSetKeys(phoneset)
|
||||
|
||||
for _, ch := range notice.Event.NotifyChannelsJSON {
|
||||
switch ch {
|
||||
case "email":
|
||||
if len(emailset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "email") {
|
||||
continue
|
||||
}
|
||||
|
||||
subject, has := notice.Tpls["subject.tpl"]
|
||||
if !has {
|
||||
subject = "subject.tpl not found"
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["mailbody.tpl"]
|
||||
if !has {
|
||||
content = "mailbody.tpl not found"
|
||||
}
|
||||
|
||||
sender.WriteEmail(subject, content, StringSetKeys(emailset))
|
||||
case "dingtalk":
|
||||
if len(dingtalkset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "dingtalk") {
|
||||
continue
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["dingtalk.tpl"]
|
||||
if !has {
|
||||
content = "dingtalk.tpl not found"
|
||||
}
|
||||
|
||||
sender.SendDingtalk(sender.DingtalkMessage{
|
||||
Title: notice.Event.RuleName,
|
||||
Text: content,
|
||||
AtMobiles: phones,
|
||||
Tokens: StringSetKeys(dingtalkset),
|
||||
})
|
||||
case "wecom":
|
||||
if len(wecomset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "wecom") {
|
||||
continue
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["wecom.tpl"]
|
||||
if !has {
|
||||
content = "wecom.tpl not found"
|
||||
}
|
||||
sender.SendWecom(sender.WecomMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(wecomset),
|
||||
})
|
||||
case "feishu":
|
||||
if len(feishuset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "feishu") {
|
||||
continue
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["feishu.tpl"]
|
||||
if !has {
|
||||
content = "feishu.tpl not found"
|
||||
}
|
||||
sender.SendFeishu(sender.FeishuMessage{
|
||||
Text: content,
|
||||
AtMobiles: phones,
|
||||
Tokens: StringSetKeys(feishuset),
|
||||
})
|
||||
case "mm":
|
||||
if len(mmset) == 0 {
|
||||
continue
|
||||
}
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "mm") {
|
||||
continue
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["mm.tpl"]
|
||||
if !has {
|
||||
content = "mm.tpl not found"
|
||||
}
|
||||
|
||||
sender.SendMM(sender.MatterMostMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(mmset),
|
||||
})
|
||||
case "telegram":
|
||||
if len(telegramset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if !slice.ContainsString(config.C.Alerting.NotifyBuiltinChannels, "telegram") {
|
||||
continue
|
||||
}
|
||||
|
||||
content, has := notice.Tpls["telegram.tpl"]
|
||||
if !has {
|
||||
content = "telegram.tpl not found"
|
||||
}
|
||||
sender.SendTelegram(sender.TelegramMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(telegramset),
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func notify(event *models.AlertCurEvent) {
|
||||
LogEvent(event, "notify")
|
||||
|
||||
notice := genNotice(event)
|
||||
notice := Notice{Event: event, Tpls: ntpls}
|
||||
stdinBytes, err := json.Marshal(notice)
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: failed to marshal notice: %v", err)
|
||||
return
|
||||
return nil
|
||||
}
|
||||
|
||||
alertingRedisPub(event.Cluster, stdinBytes)
|
||||
alertingWebhook(event)
|
||||
|
||||
handleNotice(notice, stdinBytes)
|
||||
|
||||
// handle alert subscribes
|
||||
subs, has := memsto.AlertSubscribeCache.Get(event.RuleId)
|
||||
if has {
|
||||
handleSubscribes(*event, subs)
|
||||
}
|
||||
|
||||
subs, has = memsto.AlertSubscribeCache.Get(0)
|
||||
if has {
|
||||
handleSubscribes(*event, subs)
|
||||
}
|
||||
}
|
||||
|
||||
func alertingWebhook(event *models.AlertCurEvent) {
|
||||
conf := config.C.Alerting.Webhook
|
||||
|
||||
if !conf.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
if conf.Url == "" {
|
||||
return
|
||||
}
|
||||
|
||||
bs, err := json.Marshal(event)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
|
||||
bf := bytes.NewBuffer(bs)
|
||||
|
||||
req, err := http.NewRequest("POST", conf.Url, bf)
|
||||
if err != nil {
|
||||
logger.Warning("alertingWebhook failed to new request", err)
|
||||
return
|
||||
}
|
||||
|
||||
if conf.BasicAuthUser != "" && conf.BasicAuthPass != "" {
|
||||
req.SetBasicAuth(conf.BasicAuthUser, conf.BasicAuthPass)
|
||||
}
|
||||
|
||||
if len(conf.Headers) > 0 && len(conf.Headers)%2 == 0 {
|
||||
for i := 0; i < len(conf.Headers); i += 2 {
|
||||
req.Header.Set(conf.Headers[i], conf.Headers[i+1])
|
||||
}
|
||||
}
|
||||
|
||||
client := http.Client{
|
||||
Timeout: conf.TimeoutDuration,
|
||||
}
|
||||
|
||||
var resp *http.Response
|
||||
resp, err = client.Do(req)
|
||||
if err != nil {
|
||||
logger.Warning("alertingWebhook failed to call url, error: ", err)
|
||||
return
|
||||
}
|
||||
|
||||
var body []byte
|
||||
if resp.Body != nil {
|
||||
defer resp.Body.Close()
|
||||
body, _ = ioutil.ReadAll(resp.Body)
|
||||
}
|
||||
|
||||
logger.Debugf("alertingWebhook done, url: %s, response code: %d, body: %s", conf.Url, resp.StatusCode, string(body))
|
||||
}
|
||||
|
||||
func handleSubscribes(event models.AlertCurEvent, subs []*models.AlertSubscribe) {
|
||||
for i := 0; i < len(subs); i++ {
|
||||
handleSubscribe(event, subs[i])
|
||||
}
|
||||
}
|
||||
|
||||
func handleSubscribe(event models.AlertCurEvent, sub *models.AlertSubscribe) {
|
||||
if sub.IsDisabled() {
|
||||
return
|
||||
}
|
||||
|
||||
// If the subscription is not global, check the cluster
|
||||
if sub.Cluster != models.ClusterAll {
|
||||
// sub.Cluster is a string that may hold several space-separated clusters, e.g. "cluster1 cluster2"
|
||||
clusters := strings.Fields(sub.Cluster)
|
||||
cm := make(map[string]struct{}, len(clusters))
|
||||
for i := 0; i < len(clusters); i++ {
|
||||
cm[clusters[i]] = struct{}{}
|
||||
}
|
||||
|
||||
if _, has := cm[event.Cluster]; !has {
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
if !matchTags(event.TagsMap, sub.ITags) {
|
||||
return
|
||||
}
|
||||
|
||||
if sub.RedefineSeverity == 1 {
|
||||
event.Severity = sub.NewSeverity
|
||||
}
|
||||
|
||||
if sub.RedefineChannels == 1 {
|
||||
event.NotifyChannels = sub.NewChannels
|
||||
event.NotifyChannelsJSON = strings.Fields(sub.NewChannels)
|
||||
}
|
||||
|
||||
event.NotifyGroups = sub.UserGroupIds
|
||||
event.NotifyGroupsJSON = strings.Fields(sub.UserGroupIds)
|
||||
if len(event.NotifyGroupsJSON) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
LogEvent(&event, "subscribe")
|
||||
|
||||
fillUsers(&event)
|
||||
|
||||
notice := genNotice(&event)
|
||||
stdinBytes, err := json.Marshal(notice)
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: failed to marshal notice: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
handleNotice(notice, stdinBytes)
|
||||
}
|
||||
|
||||
func alertingCallScript(stdinBytes []byte) {
|
||||
if !config.C.Alerting.CallScript.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
// no notify.py? do nothing
|
||||
if config.C.Alerting.CallScript.ScriptPath == "" {
|
||||
return
|
||||
}
|
||||
|
||||
if config.C.Alerting.Timeout == 0 {
|
||||
config.C.Alerting.Timeout = 30000
|
||||
}
|
||||
|
||||
fpath := config.C.Alerting.CallScript.ScriptPath
|
||||
cmd := exec.Command(fpath)
|
||||
cmd.Stdin = bytes.NewReader(stdinBytes)
|
||||
|
||||
// combine stdout and stderr
|
||||
var buf bytes.Buffer
|
||||
cmd.Stdout = &buf
|
||||
cmd.Stderr = &buf
|
||||
|
||||
err := startCmd(cmd)
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: run cmd err: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
err, isTimeout := sys.WrapTimeout(cmd, time.Duration(config.C.Alerting.Timeout)*time.Millisecond)
|
||||
|
||||
if isTimeout {
|
||||
if err == nil {
|
||||
logger.Errorf("event_notify: timeout and killed process %s", fpath)
|
||||
}
|
||||
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: kill process %s occur error %v", fpath, err)
|
||||
}
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
if err != nil {
|
||||
logger.Errorf("event_notify: exec script %s occur error: %v, output: %s", fpath, err, buf.String())
|
||||
return
|
||||
}
|
||||
|
||||
logger.Infof("event_notify: exec %s output: %s", fpath, buf.String())
|
||||
}
|
||||
|
||||
// call notify.so via golang plugin build
|
||||
// e.g. etc/script/notify/notify.so
|
||||
func alertingCallPlugin(stdinBytes []byte) {
|
||||
if !config.C.Alerting.CallPlugin.Enable {
|
||||
return
|
||||
}
|
||||
|
||||
logger.Debugf("alertingCallPlugin begin")
|
||||
logger.Debugf("payload: %s", string(stdinBytes))
|
||||
notifier.Instance.Notify(stdinBytes)
|
||||
logger.Debugf("alertingCallPlugin done")
|
||||
}
|
||||
|
||||
@@ -4,13 +4,12 @@ import (
|
||||
"encoding/json"
|
||||
"time"
|
||||
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/notifier"
|
||||
"github.com/didi/nightingale/v5/src/server/common/sender"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
"github.com/tidwall/gjson"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
type MaintainMessage struct {
|
||||
@@ -29,12 +28,13 @@ func notifyToMaintainer(title, msg string) {
|
||||
}
|
||||
|
||||
triggerTime := time.Now().Format("2006/01/02 - 15:04:05")
|
||||
msg = "Title: " + title + "\nContent: " + msg + "\nTime: " + triggerTime
|
||||
|
||||
notifyMaintainerWithPlugin(title, msg, triggerTime, users)
|
||||
notifyMaintainerWithBuiltin(title, msg, triggerTime, users)
|
||||
notifyMaintainerWithPlugin(title, msg, users)
|
||||
notifyMaintainerWithBuiltin(title, msg, users)
|
||||
}
|
||||
|
||||
func notifyMaintainerWithPlugin(title, msg, triggerTime string, users []*models.User) {
|
||||
func notifyMaintainerWithPlugin(title, msg string, users []*models.User) {
|
||||
if !config.C.Alerting.CallPlugin.Enable {
|
||||
return
|
||||
}
|
||||
@@ -42,7 +42,7 @@ func notifyMaintainerWithPlugin(title, msg, triggerTime string, users []*models.
|
||||
stdinBytes, err := json.Marshal(MaintainMessage{
|
||||
Tos: users,
|
||||
Title: title,
|
||||
Content: "Title: " + title + "\nContent: " + msg + "\nTime: " + triggerTime,
|
||||
Content: msg,
|
||||
})
|
||||
|
||||
if err != nil {
|
||||
@@ -54,119 +54,17 @@ func notifyMaintainerWithPlugin(title, msg, triggerTime string, users []*models.
|
||||
logger.Debugf("notify maintainer with plugin done")
|
||||
}
|
||||
|
||||
func notifyMaintainerWithBuiltin(title, msg, triggerTime string, users []*models.User) {
|
||||
if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
emailset := make(map[string]struct{})
|
||||
phoneset := make(map[string]struct{})
|
||||
wecomset := make(map[string]struct{})
|
||||
dingtalkset := make(map[string]struct{})
|
||||
feishuset := make(map[string]struct{})
|
||||
mmset := make(map[string]struct{})
|
||||
telegramset := make(map[string]struct{})
|
||||
|
||||
for _, user := range users {
|
||||
if user.Email != "" {
|
||||
emailset[user.Email] = struct{}{}
|
||||
}
|
||||
|
||||
if user.Phone != "" {
|
||||
phoneset[user.Phone] = struct{}{}
|
||||
}
|
||||
|
||||
bs, err := user.Contacts.MarshalJSON()
|
||||
if err != nil {
|
||||
logger.Errorf("handle_notice: failed to marshal contacts: %v", err)
|
||||
func notifyMaintainerWithBuiltin(title, msg string, users []*models.User) {
|
||||
subscription := NewSubscriptionFromUsers(users)
|
||||
for channel, uids := range subscription.ToChannelUserMap() {
|
||||
currentUsers := memsto.UserCache.GetByUserIds(uids)
|
||||
rwLock.RLock()
|
||||
s := Senders[channel]
|
||||
rwLock.RUnlock()
|
||||
if s == nil {
|
||||
logger.Warningf("no sender for channel: %s", channel)
|
||||
continue
|
||||
}
|
||||
|
||||
ret := gjson.GetBytes(bs, "dingtalk_robot_token")
|
||||
if ret.Exists() {
|
||||
dingtalkset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "wecom_robot_token")
|
||||
if ret.Exists() {
|
||||
wecomset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "feishu_robot_token")
|
||||
if ret.Exists() {
|
||||
feishuset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "mm_webhook_url")
|
||||
if ret.Exists() {
|
||||
mmset[ret.String()] = struct{}{}
|
||||
}
|
||||
|
||||
ret = gjson.GetBytes(bs, "telegram_robot_token")
|
||||
if ret.Exists() {
|
||||
telegramset[ret.String()] = struct{}{}
|
||||
}
|
||||
}
|
||||
|
||||
phones := StringSetKeys(phoneset)
|
||||
|
||||
for _, ch := range config.C.Alerting.NotifyBuiltinChannels {
|
||||
switch ch {
|
||||
case "email":
|
||||
if len(emailset) == 0 {
|
||||
continue
|
||||
}
|
||||
content := "Title: " + title + "\nContent: " + msg + "\nTime: " + triggerTime
|
||||
sender.WriteEmail(title, content, StringSetKeys(emailset))
|
||||
case "dingtalk":
|
||||
if len(dingtalkset) == 0 {
|
||||
continue
|
||||
}
|
||||
content := "**Title: **" + title + "\n**Content: **" + msg + "\n**Time: **" + triggerTime
|
||||
sender.SendDingtalk(sender.DingtalkMessage{
|
||||
Title: title,
|
||||
Text: content,
|
||||
AtMobiles: phones,
|
||||
Tokens: StringSetKeys(dingtalkset),
|
||||
})
|
||||
case "wecom":
|
||||
if len(wecomset) == 0 {
|
||||
continue
|
||||
}
|
||||
content := "**Title: **" + title + "\n**Content: **" + msg + "\n**Time: **" + triggerTime
|
||||
sender.SendWecom(sender.WecomMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(wecomset),
|
||||
})
|
||||
case "feishu":
|
||||
if len(feishuset) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
content := "Title: " + title + "\nContent: " + msg + "\nTime: " + triggerTime
|
||||
sender.SendFeishu(sender.FeishuMessage{
|
||||
Text: content,
|
||||
AtMobiles: phones,
|
||||
Tokens: StringSetKeys(feishuset),
|
||||
})
|
||||
case "mm":
|
||||
if len(mmset) == 0 {
|
||||
continue
|
||||
}
|
||||
content := "**Title: **" + title + "\n**Content: **" + msg + "\n**Time: **" + triggerTime
|
||||
sender.SendMM(sender.MatterMostMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(mmset),
|
||||
})
|
||||
case "telegram":
|
||||
if len(telegramset) == 0 {
|
||||
continue
|
||||
}
|
||||
content := "**Title: **" + title + "\n**Content: **" + msg + "\n**Time: **" + triggerTime
|
||||
sender.SendTelegram(sender.TelegramMessage{
|
||||
Text: content,
|
||||
Tokens: StringSetKeys(telegramset),
|
||||
})
|
||||
}
|
||||
go s.SendRaw(currentUsers, title, msg)
|
||||
}
|
||||
}
|
||||
|
||||
55
src/server/engine/route_strategy.go
Normal file
@@ -0,0 +1,55 @@
|
||||
package engine
|
||||
|
||||
import (
|
||||
"strconv"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
"github.com/didi/nightingale/v5/src/server/memsto"
|
||||
)
|
||||
|
||||
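// Router abstracts the strategy for routing an alert event to its subscribers
|
||||
// rule: the alert rule
|
||||
// event: the alert event
|
||||
// prev: the result of the previous routing step; a Router implementation may modify prev directly, or return a new Subscription to be combined via AndMerge/OrMerge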
|
||||
type Router func(rule *models.AlertRule, event *models.AlertCurEvent, prev *Subscription) *Subscription
|
||||
|
||||
// GroupRouter resolves the user-group subscriptions of an alert rule
|
||||
func GroupRouter(rule *models.AlertRule, event *models.AlertCurEvent, prev *Subscription) *Subscription {
|
||||
groupIds := make([]int64, 0, len(event.NotifyGroupsJSON))
|
||||
for _, groupId := range event.NotifyGroupsJSON {
|
||||
gid, err := strconv.ParseInt(groupId, 10, 64)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
groupIds = append(groupIds, gid)
|
||||
}
|
||||
groups := memsto.UserGroupCache.GetByUserGroupIds(groupIds)
|
||||
subscription := NewSubscription()
|
||||
for _, group := range groups {
|
||||
for _, userId := range group.UserIds {
|
||||
subscription.userMap[userId] = NewNotifyChannels(event.NotifyChannelsJSON)
|
||||
}
|
||||
}
|
||||
return subscription
|
||||
}
|
||||
|
||||
func GlobalWebhookRouter(rule *models.AlertRule, event *models.AlertCurEvent, prev *Subscription) *Subscription {
|
||||
conf := config.C.Alerting.Webhook
|
||||
if !conf.Enable {
|
||||
return nil
|
||||
}
|
||||
subscription := NewSubscription()
|
||||
subscription.webhooks[conf.Url] = conf
|
||||
return subscription
|
||||
}
|
||||
|
||||
func EventCallbacksRouter(rule *models.AlertRule, event *models.AlertCurEvent, prev *Subscription) *Subscription {
|
||||
for _, c := range event.CallbacksJSON {
|
||||
if c == "" {
|
||||
continue
|
||||
}
|
||||
prev.callbacks[c] = struct{}{}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
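The Router contract above allows two styles: return a fresh Subscription to be merged (as GroupRouter does), or mutate prev in place and return nil (as EventCallbacksRouter does). The following is a minimal sketch of how a dispatcher might chain both styles. The `route` loop, the trimmed-down `Event` type, and the simplified `Subscription` fields are assumptions made for illustration; the real dispatcher is not part of this diff.

```go
package main

import "fmt"

// Subscription is a simplified stand-in for the type in the diff (assumption).
type Subscription struct {
	users     map[int64]bool
	callbacks map[string]struct{}
}

func NewSubscription() *Subscription {
	return &Subscription{users: map[int64]bool{}, callbacks: map[string]struct{}{}}
}

// OrMerge copies everything from other into s; a nil other is a no-op.
func (s *Subscription) OrMerge(other *Subscription) {
	if other == nil {
		return
	}
	for uid, send := range other.users {
		s.users[uid] = s.users[uid] || send
	}
	for cb := range other.callbacks {
		s.callbacks[cb] = struct{}{}
	}
}

// Event is a hypothetical, trimmed-down alert event.
type Event struct {
	UserIds   []int64
	Callbacks []string
}

// Router mirrors the shape of the Router type, minus the rule argument.
type Router func(event *Event, prev *Subscription) *Subscription

// userRouter returns a fresh Subscription, in the style of GroupRouter.
func userRouter(event *Event, prev *Subscription) *Subscription {
	s := NewSubscription()
	for _, uid := range event.UserIds {
		s.users[uid] = true
	}
	return s
}

// callbackRouter mutates prev and returns nil, in the style of EventCallbacksRouter.
func callbackRouter(event *Event, prev *Subscription) *Subscription {
	for _, c := range event.Callbacks {
		if c != "" {
			prev.callbacks[c] = struct{}{}
		}
	}
	return nil
}

// route runs each router in order and OrMerges whatever it returns.
func route(event *Event, routers ...Router) *Subscription {
	acc := NewSubscription()
	for _, r := range routers {
		acc.OrMerge(r(event, acc))
	}
	return acc
}

func main() {
	ev := &Event{UserIds: []int64{1, 2}, Callbacks: []string{"", "cb"}}
	s := route(ev, userRouter, callbackRouter)
	fmt.Println(len(s.users), len(s.callbacks))
}
```

Because OrMerge ignores a nil argument, routers that only mutate prev need no special handling in the loop.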
@@ -12,6 +12,7 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/server/naming"
|
||||
)
|
||||
|
||||
// RuleContext is the interface for alert rule and record rule
|
||||
type RuleContext interface {
|
||||
Key() string
|
||||
Hash() string
|
||||
@@ -27,15 +28,14 @@ var ruleHolder = &RuleHolder{
|
||||
externalAlertRules: make(map[string]*AlertRuleContext),
|
||||
}
|
||||
|
||||
// RuleHolder is the global rule holder
|
||||
type RuleHolder struct {
|
||||
externalLock sync.RWMutex
|
||||
|
||||
// key: hash
|
||||
alertRules map[string]RuleContext
|
||||
// key: hash
|
||||
alertRules map[string]RuleContext
|
||||
recordRules map[string]RuleContext
|
||||
|
||||
// key: key
|
||||
// key: key of rule
|
||||
externalLock sync.RWMutex
|
||||
externalAlertRules map[string]*AlertRuleContext
|
||||
}
|
||||
|
||||
@@ -63,30 +63,27 @@ func (rh *RuleHolder) SyncAlertRules() {
|
||||
continue
|
||||
}
|
||||
|
||||
// If the rule is not evaluated by the Prometheus engine, create it as an externalRule
|
||||
ruleClusters := config.ReaderClients.Hit(rule.Cluster)
|
||||
if !rule.IsPrometheusRule() {
|
||||
ruleClusters := strings.Fields(rule.Cluster)
|
||||
for _, cluster := range ruleClusters {
|
||||
// hash ring not hit
|
||||
if !naming.ClusterHashRing.IsHit(cluster, fmt.Sprintf("%d", rule.Id), config.C.Heartbeat.Endpoint) {
|
||||
continue
|
||||
}
|
||||
|
||||
externalRule := NewAlertRuleContext(rule, cluster)
|
||||
externalAllRules[externalRule.Key()] = externalRule
|
||||
}
|
||||
continue
|
||||
// Non-Prometheus rules do not support $all; parse the clusters directly from rule.Cluster
|
||||
ruleClusters = strings.Fields(rule.Cluster)
|
||||
}
|
||||
|
||||
ruleClusters := config.ReaderClients.Hit(rule.Cluster)
|
||||
for _, cluster := range ruleClusters {
|
||||
// hash ring not hit
|
||||
if !naming.ClusterHashRing.IsHit(cluster, fmt.Sprintf("%d", rule.Id), config.C.Heartbeat.Endpoint) {
|
||||
continue
|
||||
}
|
||||
|
||||
alertRule := NewAlertRuleContext(rule, cluster)
|
||||
alertRules[alertRule.Hash()] = alertRule
|
||||
if rule.IsPrometheusRule() {
|
||||
// A regular alert rule
|
||||
alertRule := NewAlertRuleContext(rule, cluster)
|
||||
alertRules[alertRule.Hash()] = alertRule
|
||||
} else {
|
||||
// If the rule is not evaluated by the Prometheus engine, create it as an externalRule
|
||||
externalRule := NewAlertRuleContext(rule, cluster)
|
||||
externalAllRules[externalRule.Key()] = externalRule
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -105,19 +102,22 @@ func (rh *RuleHolder) SyncAlertRules() {
|
||||
}
|
||||
}
|
||||
|
||||
for hash, rule := range externalAllRules {
|
||||
rh.externalLock.Lock()
|
||||
if _, has := rh.externalAlertRules[hash]; !has {
|
||||
rule.Prepare()
|
||||
rh.externalAlertRules[hash] = rule
|
||||
rh.externalLock.Lock()
|
||||
for key, rule := range externalAllRules {
|
||||
if curRule, has := rh.externalAlertRules[key]; has {
|
||||
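// The rule exists and its hash matches, so treat it as unchanged; if needed, a separate hash function covering more associated data can be implemented here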
|
||||
if rule.Hash() == curRule.Hash() {
|
||||
continue
|
||||
}
|
||||
}
|
||||
rh.externalLock.Unlock()
|
||||
// The rule is either missing from the current rules or present with a different hash, so trigger an update of the rule
|
||||
rule.Prepare()
|
||||
rh.externalAlertRules[key] = rule
|
||||
}
|
||||
|
||||
rh.externalLock.Lock()
|
||||
for hash := range rh.externalAlertRules {
|
||||
if _, has := externalAllRules[hash]; !has {
|
||||
delete(rh.externalAlertRules, hash)
|
||||
for key := range rh.externalAlertRules {
|
||||
if _, has := externalAllRules[key]; !has {
|
||||
delete(rh.externalAlertRules, key)
|
||||
}
|
||||
}
|
||||
rh.externalLock.Unlock()
|
||||
@@ -158,9 +158,12 @@ func (rh *RuleHolder) SyncRecordRules() {
|
||||
}
|
||||
|
||||
func GetExternalAlertRule(cluster string, id int64) (*AlertRuleContext, bool) {
|
||||
key := fmt.Sprintf("alert-%s-%d", cluster, id)
|
||||
ruleHolder.externalLock.RLock()
|
||||
defer ruleHolder.externalLock.RUnlock()
|
||||
rule, has := ruleHolder.externalAlertRules[key]
|
||||
rule, has := ruleHolder.externalAlertRules[ruleKey(cluster, id)]
|
||||
return rule, has
|
||||
}
|
||||
|
||||
func ruleKey(cluster string, id int64) string {
|
||||
return fmt.Sprintf("alert-%s-%d", cluster, id)
|
||||
}
|
||||
|
||||
@@ -40,7 +40,7 @@ func (arc *AlertRuleContext) RuleFromCache() *models.AlertRule {
|
||||
}
|
||||
|
||||
func (arc *AlertRuleContext) Key() string {
|
||||
return fmt.Sprintf("alert-%s-%d", arc.cluster, arc.rule.Id)
|
||||
return ruleKey(arc.cluster, arc.rule.Id)
|
||||
}
|
||||
|
||||
func (arc *AlertRuleContext) Hash() string {
|
||||
@@ -138,7 +138,7 @@ func (arc *AlertRuleContext) HandleVectors(vectors []conv.Vector, from string) {
|
||||
event := alertVector.BuildEvent(now)
|
||||
// A muted event is still firing, so always add it to alertingKeys; otherwise the firing event would be auto-recovered
|
||||
alertingKeys[alertVector.Hash()] = struct{}{}
|
||||
if AlertMuteStrategies.IsMuted(cachedRule, event) {
|
||||
if IsMuted(cachedRule, event) {
|
||||
continue
|
||||
}
|
||||
arc.handleEvent(event)
|
||||
|
||||
@@ -78,12 +78,12 @@ func (rrc *RecordRuleContext) Eval() {
|
||||
|
||||
value, warnings, err := config.ReaderClients.GetCli(rrc.cluster).Query(context.Background(), promql, time.Now())
|
||||
if err != nil {
|
||||
logger.Errorf("eval:%d promql:%s, error:%v", rrc.Key(), promql, err)
|
||||
logger.Errorf("eval:%s promql:%s, error:%v", rrc.Key(), promql, err)
|
||||
return
|
||||
}
|
||||
|
||||
if len(warnings) > 0 {
|
||||
logger.Errorf("eval:%d promql:%s, warnings:%v", rrc.Key(), promql, warnings)
|
||||
logger.Errorf("eval:%s promql:%s, warnings:%v", rrc.Key(), promql, warnings)
|
||||
return
|
||||
}
|
||||
ts := conv.ConvertToTimeSeries(value, rrc.rule)
|
||||
|
||||
139
src/server/engine/subscription.go
Normal file
@@ -0,0 +1,139 @@
|
||||
package engine
|
||||
|
||||
import (
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
)
|
||||
|
||||
// NotifyChannels channelKey -> bool
|
||||
type NotifyChannels map[string]bool
|
||||
|
||||
func NewNotifyChannels(channels []string) NotifyChannels {
|
||||
nc := make(NotifyChannels)
|
||||
for _, ch := range channels {
|
||||
nc[ch] = true
|
||||
}
|
||||
return nc
|
||||
}
|
||||
|
||||
func (nc NotifyChannels) OrMerge(other NotifyChannels) {
|
||||
nc.merge(other, func(a, b bool) bool { return a || b })
|
||||
}
|
||||
|
||||
func (nc NotifyChannels) AndMerge(other NotifyChannels) {
|
||||
nc.merge(other, func(a, b bool) bool { return a && b })
|
||||
}
|
||||
|
||||
func (nc NotifyChannels) merge(other NotifyChannels, f func(bool, bool) bool) {
|
||||
if other == nil {
|
||||
return
|
||||
}
|
||||
for k, v := range other {
|
||||
if curV, has := nc[k]; has {
|
||||
nc[k] = f(curV, v)
|
||||
} else {
|
||||
nc[k] = v
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Subscription holds every user-channel pair, callback, and webhook that must be notified; storing them in maps deduplicates them
|
||||
type Subscription struct {
|
||||
userMap map[int64]NotifyChannels
|
||||
webhooks map[string]config.Webhook
|
||||
callbacks map[string]struct{}
|
||||
}
|
||||
|
||||
func NewSubscription() *Subscription {
|
||||
return &Subscription{
|
||||
userMap: make(map[int64]NotifyChannels),
|
||||
webhooks: make(map[string]config.Webhook),
|
||||
callbacks: make(map[string]struct{}),
|
||||
}
|
||||
}
|
||||
|
||||
// NewSubscriptionFromUsers builds a subscription from the users' token configuration; used by notifyMaintainer
|
||||
func NewSubscriptionFromUsers(users []*models.User) *Subscription {
|
||||
s := NewSubscription()
|
||||
for _, u := range users {
|
||||
if u == nil {
|
||||
continue
|
||||
}
|
||||
for channel, token := range u.ExtractAllToken() {
|
||||
if token == "" {
|
||||
continue
|
||||
}
|
||||
if channelMap, has := s.userMap[u.Id]; has {
|
||||
channelMap[channel] = true
|
||||
} else {
|
||||
s.userMap[u.Id] = map[string]bool{
|
||||
channel: true,
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return s
|
||||
}
|
||||
|
||||
// OrMerge merges the channel maps with OR semantics, which makes it easy to combine strategies, e.g. routing by a tag
|
||||
func (s *Subscription) OrMerge(other *Subscription) {
|
||||
s.merge(other, NotifyChannels.OrMerge)
|
||||
}
|
||||
|
||||
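// AndMerge merges the bool values in the channel maps with AND semantics, so notifications can be removed per user or per channel
|
||||
// Common scenarios:
|
||||
// 1. A person has left the company and should no longer receive alerts
|
||||
// 2. A notification channel is under maintenance and should temporarily not send alerts
|
||||
// 3. On-call redirection logic, e.g. additionally sending high-severity alerts to emergency responders
|
||||
// Routers can be implemented to fit your own business needs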
|
||||
func (s *Subscription) AndMerge(other *Subscription) {
|
||||
s.merge(other, NotifyChannels.AndMerge)
|
||||
}
|
||||
|
||||
func (s *Subscription) merge(other *Subscription, f func(NotifyChannels, NotifyChannels)) {
|
||||
if other == nil {
|
||||
return
|
||||
}
|
||||
for k, v := range other.userMap {
|
||||
if curV, has := s.userMap[k]; has {
|
||||
f(curV, v)
|
||||
} else {
|
||||
s.userMap[k] = v
|
||||
}
|
||||
}
|
||||
for k, v := range other.webhooks {
|
||||
s.webhooks[k] = v
|
||||
}
|
||||
for k, v := range other.callbacks {
|
||||
s.callbacks[k] = v
|
||||
}
|
||||
}
|
||||
|
||||
// ToChannelUserMap converts userMap (map[uid][channel]bool) into a map[channel][]uid structure
|
||||
func (s *Subscription) ToChannelUserMap() map[string][]int64 {
|
||||
m := make(map[string][]int64)
|
||||
for uid, nc := range s.userMap {
|
||||
for ch, send := range nc {
|
||||
if send {
|
||||
m[ch] = append(m[ch], uid)
|
||||
}
|
||||
}
|
||||
}
|
||||
return m
|
||||
}
|
||||
|
||||
func (s *Subscription) ToCallbackList() []string {
|
||||
callbacks := make([]string, 0, len(s.callbacks))
|
||||
for cb := range s.callbacks {
|
||||
callbacks = append(callbacks, cb)
|
||||
}
|
||||
return callbacks
|
||||
}
|
||||
|
||||
func (s *Subscription) ToWebhookList() []config.Webhook {
|
||||
webhooks := make([]config.Webhook, 0, len(s.webhooks))
|
||||
for _, wh := range s.webhooks {
|
||||
webhooks = append(webhooks, wh)
|
||||
}
|
||||
return webhooks
|
||||
}
|
||||
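The OrMerge and AndMerge semantics of NotifyChannels are easiest to see on a small example. The sketch below copies the merge logic from the file above into a standalone program; the `demo` function and the channel names used in it are illustrative assumptions, not part of the change.

```go
package main

import "fmt"

// NotifyChannels mirrors the type in the diff: channel key -> whether to send.
type NotifyChannels map[string]bool

// merge folds other into nc. Keys present on both sides are resolved by f;
// keys present only in other are copied as-is.
func (nc NotifyChannels) merge(other NotifyChannels, f func(a, b bool) bool) {
	for k, v := range other {
		if cur, has := nc[k]; has {
			nc[k] = f(cur, v)
		} else {
			nc[k] = v
		}
	}
}

// OrMerge enables a channel if either side enables it.
func (nc NotifyChannels) OrMerge(other NotifyChannels) {
	nc.merge(other, func(a, b bool) bool { return a || b })
}

// AndMerge keeps a channel enabled only if both sides enable it.
func (nc NotifyChannels) AndMerge(other NotifyChannels) {
	nc.merge(other, func(a, b bool) bool { return a && b })
}

// demo is a hypothetical scenario: a router selects email and dingtalk,
// a second router adds wecom, and a maintenance mask switches email off.
func demo() NotifyChannels {
	nc := NotifyChannels{"email": true, "dingtalk": true}
	nc.OrMerge(NotifyChannels{"wecom": true})
	nc.AndMerge(NotifyChannels{"email": false})
	return nc
}

func main() {
	fmt.Println(demo())
}
```

Note that a masked channel stays in the map with the value false instead of being deleted, which is why ToChannelUserMap checks the bool before appending a user ID.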
@@ -9,7 +9,6 @@ import (
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
promstat "github.com/didi/nightingale/v5/src/server/stat"
|
||||
)
|
||||
|
||||
@@ -99,13 +98,6 @@ func loopSyncAlertMutes() {
|
||||
func syncAlertMutes() error {
|
||||
start := time.Now()
|
||||
|
||||
clusterNames := config.ReaderClients.GetClusterNames()
|
||||
if len(clusterNames) == 0 {
|
||||
AlertMuteCache.Reset()
|
||||
logger.Warning("cluster is blank")
|
||||
return nil
|
||||
}
|
||||
|
||||
stat, err := models.AlertMuteStatistics("")
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec AlertMuteStatistics")
|
||||
|
||||
@@ -9,7 +9,6 @@ import (
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
promstat "github.com/didi/nightingale/v5/src/server/stat"
|
||||
)
|
||||
|
||||
@@ -95,14 +94,6 @@ func loopSyncAlertRules() {
|
||||
|
||||
func syncAlertRules() error {
|
||||
start := time.Now()
|
||||
|
||||
clusterNames := config.ReaderClients.GetClusterNames()
|
||||
if len(clusterNames) == 0 {
|
||||
AlertRuleCache.Reset()
|
||||
logger.Warning("cluster is blank")
|
||||
return nil
|
||||
}
|
||||
|
||||
stat, err := models.AlertRuleStatistics("")
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec AlertRuleStatistics")
|
||||
|
||||
@@ -9,7 +9,6 @@ import (
|
||||
"github.com/toolkits/pkg/logger"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
promstat "github.com/didi/nightingale/v5/src/server/stat"
|
||||
)
|
||||
|
||||
@@ -101,14 +100,6 @@ func loopSyncAlertSubscribes() {
|
||||
|
||||
func syncAlertSubscribes() error {
|
||||
start := time.Now()
|
||||
|
||||
clusterNames := config.ReaderClients.GetClusterNames()
|
||||
if len(clusterNames) == 0 {
|
||||
AlertSubscribeCache.Reset()
|
||||
logger.Warning("cluster is blank")
|
||||
return nil
|
||||
}
|
||||
|
||||
stat, err := models.AlertSubscribeStatistics("")
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec AlertSubscribeStatistics")
|
||||
|
||||
@@ -6,7 +6,6 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/didi/nightingale/v5/src/models"
|
||||
"github.com/didi/nightingale/v5/src/server/config"
|
||||
promstat "github.com/didi/nightingale/v5/src/server/stat"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
@@ -95,20 +94,7 @@ func loopSyncRecordingRules() {
|
||||
func syncRecordingRules() error {
|
||||
start := time.Now()
|
||||
|
||||
clusterNames := config.ReaderClients.GetClusterNames()
|
||||
if len(clusterNames) == 0 {
|
||||
RecordingRuleCache.Reset()
|
||||
logger.Warning("cluster is blank")
|
||||
return nil
|
||||
}
|
||||
|
||||
var clusterName string
|
||||
// With exactly one cluster, use single-cluster mode; with more than one cluster, fetch all rules
|
||||
if len(clusterNames) == 1 {
|
||||
clusterName = clusterNames[0]
|
||||
}
|
||||
|
||||
stat, err := models.RecordingRuleStatistics(clusterName)
|
||||
stat, err := models.RecordingRuleStatistics("")
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec RecordingRuleStatistics")
|
||||
}
|
||||
@@ -120,7 +106,7 @@ func syncRecordingRules() error {
|
||||
return nil
|
||||
}
|
||||
|
||||
lst, err := models.RecordingRuleGetsByCluster(clusterName)
|
||||
lst, err := models.RecordingRuleGetsByCluster("")
|
||||
if err != nil {
|
||||
return errors.WithMessage(err, "failed to exec RecordingRuleGetsByCluster")
|
||||
}
|
||||
|
||||
@@ -57,14 +57,12 @@ func heartbeat() error {
|
||||
return err
|
||||
}
|
||||
if len(clusters) == 0 {
|
||||
// The instance was just deployed and no cluster has been configured in the UI yet, so report heartbeats with the clusters from the config file
|
||||
for i := 0; i < len(config.C.Readers); i++ {
|
||||
err := models.AlertingEngineHeartbeatWithCluster(config.C.Heartbeat.Endpoint, config.C.Readers[i].ClusterName)
|
||||
if err != nil {
|
||||
logger.Warningf("heartbeat with cluster %s err:%v", config.C.Readers[i].ClusterName, err)
|
||||
continue
|
||||
}
|
||||
// No cluster has been configured on the alerting engine page yet; report an empty cluster so that the n9e-server instance shows up in the UI
|
||||
err := models.AlertingEngineHeartbeatWithCluster(config.C.Heartbeat.Endpoint, "")
|
||||
if err != nil {
|
||||
logger.Warningf("heartbeat with cluster %s err:%v", "", err)
|
||||
}
|
||||
logger.Warningf("heartbeat %s no cluster", config.C.Heartbeat.Endpoint)
|
||||
}
|
||||
|
||||
err := models.AlertingEngineHeartbeat(config.C.Heartbeat.Endpoint)
|
||||
|
||||
@@ -40,7 +40,7 @@ func pushEventToQueue(c *gin.Context) {
|
||||
event.TagsMap[arr[0]] = arr[1]
|
||||
}
|
||||
|
||||
if engine.EventMuteStra.IsMuted(nil, event) {
|
||||
if engine.EventMuteStrategy(nil, event) {
|
||||
logger.Infof("event_muted: rule_id=%d %s", event.RuleId, event.Hash)
|
||||
ginx.NewRender(c).Message(nil)
|
||||
return
|
||||
|
||||
@@ -21,14 +21,6 @@ import (
|
||||
"github.com/didi/nightingale/v5/src/server/writer"
|
||||
)
|
||||
|
||||
var promMetricFilter map[string]bool = map[string]bool{
|
||||
"up": true,
|
||||
"scrape_series_added": true,
|
||||
"scrape_samples_post_metric_relabeling": true,
|
||||
"scrape_samples_scraped": true,
|
||||
"scrape_duration_seconds": true,
|
||||
}
|
||||
|
||||
type promqlForm struct {
|
||||
PromQL string `json:"promql"`
|
||||
}
|
||||
@@ -74,6 +66,32 @@ func duplicateLabelKey(series *prompb.TimeSeries) bool {
|
||||
return false
|
||||
}
|
||||
|
||||
func extractIdentFromTimeSeries(s *prompb.TimeSeries) string {
|
||||
for i := 0; i < len(s.Labels); i++ {
|
||||
if s.Labels[i].Name == "ident" {
|
||||
return s.Labels[i].Value
|
||||
}
|
||||
}
|
||||
|
||||
// agent_hostname for grafana-agent and categraf
|
||||
for i := 0; i < len(s.Labels); i++ {
|
||||
if s.Labels[i].Name == "agent_hostname" {
|
||||
s.Labels[i].Name = "ident"
|
||||
return s.Labels[i].Value
|
||||
}
|
||||
}
|
||||
|
||||
// telegraf, output plugin: http, format: prometheusremotewrite
|
||||
for i := 0; i < len(s.Labels); i++ {
|
||||
if s.Labels[i].Name == "host" {
|
||||
s.Labels[i].Name = "ident"
|
||||
return s.Labels[i].Value
|
||||
}
|
||||
}
|
||||
|
||||
return ""
|
||||
}
|
||||
|
||||
func remoteWrite(c *gin.Context) {
|
||||
req, err := DecodeWriteRequest(c.Request.Body)
|
||||
if err != nil {
|
||||
@@ -100,39 +118,16 @@ func remoteWrite(c *gin.Context) {
|
||||
continue
|
||||
}
|
||||
|
||||
ident = ""
|
||||
ident = extractIdentFromTimeSeries(req.Timeseries[i])
|
||||
|
||||
// find ident label
|
||||
for j := 0; j < len(req.Timeseries[i].Labels); j++ {
|
||||
if req.Timeseries[i].Labels[j].Name == "host" {
|
||||
req.Timeseries[i].Labels[j].Name = "ident"
|
||||
}
|
||||
|
||||
if req.Timeseries[i].Labels[j].Name == "ident" {
|
||||
ident = req.Timeseries[i].Labels[j].Value
|
||||
}
|
||||
|
||||
if req.Timeseries[i].Labels[j].Name == "__name__" {
|
||||
metric = req.Timeseries[i].Labels[j].Value
|
||||
}
|
||||
}
|
||||
|
||||
if ident == "" {
|
||||
// not found, try agent_hostname
|
||||
for j := 0; j < len(req.Timeseries[i].Labels); j++ {
|
||||
// agent_hostname for grafana-agent
|
||||
if req.Timeseries[i].Labels[j].Name == "agent_hostname" {
|
||||
req.Timeseries[i].Labels[j].Name = "ident"
|
||||
ident = req.Timeseries[i].Labels[j].Value
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
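// When data is scraped by Prometheus (and perhaps remote-written directly to Nightingale), Prometheus automatically generates some system metrics
|
||||
// The most typical one is the up metric, which Prometheus generates for each exporter; up=0 is sent even when the exporter is down
|
||||
// Such metrics must be dropped; otherwise the timestamp in Redis is updated unexpectedly, and the target_up metric for the ident carried by these metrics can never drop to the actual value 0
|
||||
// More details: https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series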
|
||||
if _, has := promMetricFilter[metric]; has {
|
||||
// For data reported by telegraf, only the metric system_load1 indicates that the series comes from a machine, in which case host is renamed to ident; all other cases are ignored
|
||||
if metric != "system_load1" {
|
||||
ident = ""
|
||||
}
|
||||
|
||||
|
||||