mirror of
https://github.com/ccfos/nightingale.git
synced 2026-03-18 13:50:52 +00:00
Compare commits
128 Commits
release-14
...
optimize-h
| Author | SHA1 | Date | |
|---|---|---|---|
| | f8ddce8149 | | |
| | 45685947dd | | |
| | cddf5e7d37 | | |
| | f07baa276e | | |
| | 2c2d5004f4 | | |
| | 9982666e44 | | |
| | 2b448f738c | | |
| | e4c258de8e | | |
| | 4f128a9b44 | | |
| | deb85b9c68 | | |
| | 1b84324147 | | |
| | c73b66848e | | |
| | cd74442819 | | |
| | 252a8284f9 | | |
| | 7d2e998078 | | |
| | 69582bacdf | | |
| | 1bede4eeb8 | | |
| | 16ed81020a | | |
| | 7b020ae238 | | |
| | 05eabcf00d | | |
| | e316842022 | | |
| | 8b3c4749aa | | |
| | 16be04c3e9 | | |
| | ccbadba9ff | | |
| | ce5bf2e473 | | |
| | 80cdf9d0bb | | |
| | 7514086ae6 | | |
| | 116f8b1590 | | |
| | 0fb4e4b723 | | |
| | 07fb427eea | | |
| | d8f8fed95f | | |
| | f2e0ec10f7 | | |
| | db467a8811 | | |
| | b839bd3e16 | | |
| | 8033ca590b | | |
| | 0974f33d16 | | |
| | d52a19b1f7 | | |
| | f11c4dc87d | | |
| | d7f3bc8841 | | |
| | 2ae8c35a50 | | |
| | da0697c5ce | | |
| | 2eff1159e5 | | |
| | 6c19c0adf4 | | |
| | 5e5525ef57 | | |
| | 58c2a3cc71 | | |
| | cef6d5fe49 | | |
| | 49cda8b58a | | |
| | d6a585ccbd | | |
| | 764c254833 | | |
| | c427abdfa3 | | |
| | 3749f62adc | | |
| | f932f93a94 | | |
| | 5bbc432db0 | | |
| | 0712baa6e1 | | |
| | b4d595d5f5 | | |
| | 95090055e0 | | |
| | 880b92bf36 | | |
| | 744eb44f19 | | |
| | 6ddc78ea11 | | |
| | 823568081b | | |
| | 2f8e63f821 | | |
| | bdc9fa4638 | | |
| | 9e1d69c8b0 | | |
| | 85d8607be8 | | |
| | ec6a4f134a | | |
| | 798f9e5536 | | |
| | 92095ea89c | | |
| | eb85c9c78b | | |
| | bd8bf1cf9e | | |
| | b27ddf45cf | | |
| | c8e004ba51 | | |
| | eb330f00b2 | | |
| | 49d61bbd5d | | |
| | 407a1b61a5 | | |
| | bc8a6f61be | | |
| | 94cd9796bf | | |
| | c3ee0143b2 | | |
| | 10d4faae4e | | |
| | ffac81a2ef | | |
| | d8d1a454b3 | | |
| | 94f9818fd2 | | |
| | a5d820ddb3 | | |
| | da0224d010 | | |
| | 4a399a23c0 | | |
| | 95ecc61834 | | |
| | f72e29677f | | |
| | f876eb02e2 | | |
| | cdcadefb03 | | |
| | 582a3981fb | | |
| | 8081c48450 | | |
| | 5e7541215a | | |
| | e95b5428b2 | | |
| | 8a47088d97 | | |
| | 05ba5caf8a | | |
| | dc7752c2af | | |
| | a828603406 | | |
| | c5c4e00ab8 | | |
| | 770e15db39 | | |
| | 5096117b45 | | |
| | dd3b68e4ab | | |
| | 85947c08a8 | | |
| | 3f3c815171 | | |
| | 08f82e899a | | |
| | 043628d4eb | | |
| | ba33512d22 | | |
| | a7cf658c1d | | |
| | b62e6fda04 | | |
| | 6243f9a05c | | |
| | e8962b5646 | | |
| | 97a4ee2764 | | |
| | 2fdb80f314 | | |
| | c0ab672cf7 | | |
| | 7664c15121 | | |
| | 4059a2022c | | |
| | e7263680a8 | | |
| | 4a67f7a108 | | |
| | 04ca6c5fd5 | | |
| | 747211c78f | | |
| | bf54fac1e8 | | |
| | 76117ae440 | | |
| | 9ad02075c6 | | |
| | 6d27ff673f | | |
| | ee4e2b3f7d | | |
| | e6de301c65 | | |
| | d4f5871fba | | |
| | c2e61f3741 | | |
| | d26df3b331 | | |
| | 391c674d21 | | |
121
README.md
@@ -3,7 +3,7 @@
<img src="doc/img/Nightingale_L_V.png" alt="nightingale - cloud native monitoring" width="100" /></a>
</p>
<p align="center">
<b>Open-source alert management expert, an all-in-one observability platform</b>
<b>Open-source alert management expert</b>
</p>

<p align="center">
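@@ -27,77 +27,86 @@

[English](./README_en.md) | [中文](./README.md)

## What is Nightingale (夜莺)
## What is Nightingale

> What is Nightingale, and what problem does it solve? Take Grafana, which everyone knows well, as an analogy: Grafana excels at connecting to all kinds of data sources and providing flexible, powerful, good-looking visualization panels. Nightingale excels at connecting to all kinds of data sources and providing flexible, powerful, efficient monitoring and alert management. In development path and positioning, Nightingale is much like Grafana, which can be summed up in one sentence: use Grafana for visualization, and Nightingale for monitoring and alerting.
>
> In visualization, Grafana is the undisputed leader. In influence, installed base, user community, number of developers, and every other metric, Grafana is the example Nightingale is chasing. Giants usually open up a market from a single entry point. With Grafana as its trump card in visualization, Grafana Labs gradually expanded across the whole observability space: Loki for logging, Tempo for tracing, the acquired Pyroscope for profiling, the likewise acquired Grafana-OnCall project for on-call, plus the time-series database Mimir, the eBPF collector Beyla, the OpenTelemetry collector Alloy, and the frontend monitoring SDK Faro. Together these form a complete observability tool matrix, but the whole flywheel started turning with the Grafana project.
>
> Nightingale opened up from the monitoring and alerting entry point and has also gradually expanded sideways. For example, Nightingale has built its own visualization dashboards, so if you want an all-in-one monitoring, alerting, and visualization tool, Nightingale is also a sound choice. For on-call, Nightingale integrates seamlessly with the [Flashduty SaaS](https://flashcat.cloud/product/flashcat-duty/) service. For collectors, Nightingale has the companion [Categraf](https://flashcat.cloud/product/categraf), which manages all exporters within a single collector and supports collecting both metrics and logs, greatly reducing the number of collectors engineers have to maintain and the work involved (this is a real pain point; you may have heard business teams complain that there are more collectors than application processes).
Nightingale (夜莺监控) is an open-source monitoring project that focuses on alerting. Like Grafana, it integrates with a variety of existing data sources; but where Grafana focuses on visualization, Nightingale focuses on the alerting engine and on processing and dispatching alert events.

Nightingale, an open-source cloud-native monitoring tool, was originally developed and open-sourced by DiDi. On May 11, 2022 it was donated to the Open Source Development Committee of the China Computer Federation (CCF ODC), becoming the first open-source project accepted by CCF ODC after its founding. With more than 10,000 stars on GitHub, it is a widely followed and widely used open-source monitoring tool. Nightingale's core development team was also the original core team of the Open-Falcon project; counting from 2014 (when Open-Falcon was open-sourced), that is 10 years spent on one goal: doing monitoring as well as it can be done.
The Nightingale project was originally developed and open-sourced by DiDi. On May 11, 2022 it was donated to the Open Source Development Committee of the China Computer Federation (CCF ODC), becoming the first open-source project accepted by CCF ODC after its founding.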
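
## How Nightingale works

## Quick start
- 👉 [Documentation](https://flashcat.cloud/docs/) | [Downloads](https://flashcat.cloud/download/nightingale/)
- ❤️ [Report a bug](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=&projects=&template=question.yml)
- ℹ️ For faster access, the documentation and download sites above are hosted on [FlashcatCloud](https://flashcat.cloud)
- 💡 Frontend and backend code are separated; the frontend repository is [https://github.com/n9e/fe](https://github.com/n9e/fe)
Many users already collect their own metrics and logs. In that case, connect the storage backend (VictoriaMetrics, ElasticSearch, etc.) to Nightingale as a data source, then configure alert rules and notification rules in Nightingale to generate and dispatch alert events.

## Features


- Connects to multiple time-series databases for unified monitoring and alert management: supported databases include Prometheus, VictoriaMetrics, Thanos, Mimir, M3DB, and TDengine.
- Connects to log stores for log-based monitoring and alerting: supported stores include ElasticSearch and Loki.
- Professional alerting: built-in support for many kinds of alert rules, extensible support for common notification media, alert muting/inhibition/subscription/self-healing, and alert event management.
- High-performance visualization engine: many chart styles, numerous built-in dashboard templates, and Grafana template import; works out of the box under a business-friendly open-source license.
- Supports common collectors: [Categraf](https://flashcat.cloud/product/categraf), Telegraf, Grafana-agent, Datadog-agent, and all kinds of exporters can serve as collectors; there is no data that cannot be monitored.
- 👀 Pairs seamlessly with [Flashduty](https://flashcat.cloud/product/flashcat-duty/): alert aggregation and noise reduction, acknowledgement, escalation, scheduling, and IM integration, so that no alert is missed, interruptions are reduced, and teams collaborate efficiently.
Nightingale itself does not collect monitoring data. We recommend [Categraf](https://github.com/flashcatcloud/categraf) as the collector; it integrates smoothly with Nightingale.

[Categraf](https://github.com/flashcatcloud/categraf) can collect monitoring data from operating systems, network devices, middleware, and databases and push it to Nightingale via the Remote Write protocol. Nightingale forwards the data to a time-series database (such as Prometheus or VictoriaMetrics) and provides alerting and visualization.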
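
## Screenshots


You can switch the language and theme in the top-right corner of the page. English, Simplified Chinese, and Traditional Chinese are currently supported.



Instant query works like Prometheus's built-in query page for ad-hoc queries. Nightingale adds some UI improvements and a set of built-in PromQL metrics, so users who are less familiar with PromQL can also query quickly.



You can also browse data directly through the metric view. With the metric view, instant query is mostly unnecessary; advanced users may still use instant query, while ordinary users can simply query through the metric view.



Nightingale ships with commonly used dashboards that can be imported directly. Grafana dashboards can also be imported, although only basic Grafana charts are compatible. If you are already used to Grafana, we suggest continuing to view charts in Grafana and using Nightingale as the alerting engine.



Besides the built-in dashboards, many alert rules are also built in and work out of the box.





## Product architecture

The most common way the community uses Nightingale is as an alerting engine that connects to multiple time-series databases and manages alert rules in one place; charts are still mostly drawn with Grafana. As an alerting engine, Nightingale's architecture is as follows:



For edge data centers with a poor network link to the central Nightingale server, we also offer a deployment mode that pushes the alerting engine down to the edge to improve alerting availability. In this mode, alerting keeps working even if the network is partitioned.
For edge data centers with a poor network link to the central Nightingale server, Nightingale also offers a deployment mode that pushes the alerting engine down to the edge to improve alerting availability. In this mode, alerting keeps working even if the network between the edge and the center is partitioned.



> In the figure above, data center A has a good network link to the central data center, so the central Nightingale process serves as its alerting engine. Data center B has a poor link, so `n9e-edge` is deployed in data center B as the alerting engine and evaluates alerts against data center B's data sources.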
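
## Community channels
- To report a bug, we recommend first filing a [Nightingale GitHub issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&projects=&template=bug_report.yml)
- We recommend reading through the [Nightingale documentation site](https://flashcat.cloud/docs/content/flashcat-monitor/nightingale-v7/introduction/) to learn more
- Add me on WeChat: `picobyte` (friend verification is turned off) to be added to the WeChat group; include the note `夜莺互助群` (Nightingale mutual-help group)
## Alert noise reduction, escalation, and collaboration

Nightingale focuses on being an alerting engine: it generates alert events and dispatches them flexibly according to rules, with built-in support for 20 notification media (phone, SMS, email, DingTalk, Feishu, WeCom, Slack, and others).

If you have more advanced needs, such as:

- Gathering events from your company's multiple monitoring systems into one platform for unified noise reduction, response handling, and data analysis
- Supporting on-call schedules and an on-call culture, with alert acknowledgement, escalation (so nothing is missed), and collaborative handling

then Nightingale is not the right fit. What you need is an on-call product such as [PagerDuty](https://www.pagerduty.com/) or [FlashDuty](https://flashcat.cloud/product/flashcat-duty/) (easy to use, with a free tier).


## Resources & community channels
- 📚 The [Nightingale introduction slides](https://mp.weixin.qq.com/s/Mkwx_46xrltSq8NLqAIYow) will help you understand Nightingale's key features (the link to the slides is at the end of that article)
- 👉 [Documentation](https://flashcat.cloud/docs/): for faster access, the site is hosted on [FlashcatCloud](https://flashcat.cloud)
- ❤️ [Report a bug](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=&projects=&template=question.yml): a clear problem description, reproduction steps, and screenshots make it easier to get an answer
- 💡 Frontend and backend code are separated; the frontend repository is [https://github.com/n9e/fe](https://github.com/n9e/fe)
- 🎯 Follow [this official account](https://gitlink.org.cn/UlricQin) for more Nightingale news and know-how
- 🌟 Add me on WeChat: `picobyte` (friend verification is turned off) to be added to the WeChat group; include the note `夜莺互助群` (Nightingale mutual-help group). If you already run Nightingale in production, contact me to join the senior monitoring users group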
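

## Key features at a glance



- Nightingale supports alert rules, mute rules, subscription rules, and notification rules, with built-in support for 20 notification media and customizable message templates
- Supports event pipelines that process alert events, which makes automated integration with in-house systems easy, for example attaching metadata to alert events or relabeling events
- Supports business groups and a permission system for managing rules by category
- Alert rules for many databases and middleware are built in and can be imported directly; Prometheus alert rules can be imported directly as well
- Supports alert self-healing: after an alert fires, a script is triggered automatically to run predefined logic, such as cleaning up a disk or capturing the current state



- Nightingale archives historical alert events and supports multi-dimensional queries and statistics
- Supports flexible aggregation and grouping, so the distribution of alert events across the company is visible at a glance



- Nightingale ships with metric descriptions, dashboards, and alert rules for common operating systems, middleware, and databases; these are community contributions, so quality varies
- Nightingale directly accepts data over Remote Write, OpenTSDB, Datadog, Falcon, and other protocols, so it can work with all kinds of agents
- Nightingale supports Prometheus, ElasticSearch, Loki, TDengine, and other data sources and can alert on their data
- Nightingale can easily embed internal enterprise systems such as Grafana and a CMDB, and the menu visibility of those embedded systems is configurable




- Nightingale supports dashboards with common chart types and ships with some built-in dashboards; the image above is a screenshot of one of them.
- If you are already used to Grafana, we suggest continuing to use Grafana for charts; Grafana goes deeper in that area.
- For machine-level monitoring data collected by Categraf, we suggest using Nightingale's built-in dashboards, because Categraf's metric naming follows Telegraf's conventions, which differ from Node Exporter's
- Because Nightingale has the concept of business groups and machines can belong to different business groups, you sometimes want a dashboard to show only the machines in the current business group, so Nightingale dashboards can be linked to business groups

## Widely followed
[](https://star-history.com/#ccfos/nightingale&Date)

## Trusted by many companies



## Building the community together
- ❇️ Please read the [draft governance structure for the Nightingale open-source project and community](./doc/community-governance.md). We sincerely welcome every user, developer, company, and organization to use Nightingale, report bugs, submit feature requests, and share best practices, and to build a professional, active Nightingale open-source community together.
- ❤️ Nightingale contributors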

@@ -115,7 +115,9 @@ func Start(alertc aconf.Alert, pushgwc pconf.Pushgw, syncStats *memsto.Stats, al
	eval.NewScheduler(alertc, externalProcessors, alertRuleCache, targetCache, targetsOfAlertRulesCache,
		busiGroupCache, alertMuteCache, datasourceCache, promClients, naming, ctx, alertStats)

	dp := dispatch.NewDispatch(alertRuleCache, userCache, userGroupCache, alertSubscribeCache, targetCache, notifyConfigCache, taskTplsCache, notifyRuleCache, notifyChannelCache, messageTemplateCache, alertc.Alerting, ctx, alertStats)
	eventProcessorCache := memsto.NewEventProcessorCache(ctx, syncStats)

	dp := dispatch.NewDispatch(alertRuleCache, userCache, userGroupCache, alertSubscribeCache, targetCache, notifyConfigCache, taskTplsCache, notifyRuleCache, notifyChannelCache, messageTemplateCache, eventProcessorCache, alertc.Alerting, ctx, alertStats)
	consumer := dispatch.NewConsumer(alertc.Alerting, ctx, dp, promClients)

	notifyRecordComsumer := sender.NewNotifyRecordConsumer(ctx)

@@ -17,6 +17,7 @@ type Stats struct {
	CounterRuleEval *prometheus.CounterVec
	CounterQueryDataErrorTotal *prometheus.CounterVec
	CounterQueryDataTotal *prometheus.CounterVec
	CounterVarFillingQuery *prometheus.CounterVec
	CounterRecordEval *prometheus.CounterVec
	CounterRecordEvalErrorTotal *prometheus.CounterVec
	CounterMuteTotal *prometheus.CounterVec
@@ -24,6 +25,7 @@ type Stats struct {
	CounterHeartbeatErrorTotal *prometheus.CounterVec
	CounterSubEventTotal *prometheus.CounterVec
	GaugeQuerySeriesCount *prometheus.GaugeVec
	GaugeRuleEvalDuration *prometheus.GaugeVec
	GaugeNotifyRecordQueueSize prometheus.Gauge
}

@@ -54,7 +56,7 @@ func NewSyncStats() *Stats {
		Subsystem: subsystem,
		Name: "query_data_total",
		Help: "Number of rule eval query data.",
	}, []string{"datasource"})
	}, []string{"datasource", "rule_id"})

	CounterRecordEval := prometheus.NewCounterVec(prometheus.CounterOpts{
		Namespace: namespace,
@@ -135,6 +137,20 @@ func NewSyncStats() *Stats {
		Help: "The size of notify record queue.",
	})

	GaugeRuleEvalDuration := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Namespace: namespace,
		Subsystem: subsystem,
		Name: "rule_eval_duration_ms",
		Help: "Duration of rule eval in milliseconds.",
	}, []string{"rule_id", "datasource_id"})

	CounterVarFillingQuery := prometheus.NewCounterVec(prometheus.CounterOpts{
		Namespace: namespace,
		Subsystem: subsystem,
		Name: "var_filling_query_total",
		Help: "Number of var filling query.",
	}, []string{"rule_id", "datasource_id", "ref", "typ"})

	prometheus.MustRegister(
		CounterAlertsTotal,
		GaugeAlertQueueSize,
@@ -150,7 +166,9 @@ func NewSyncStats() *Stats {
		CounterHeartbeatErrorTotal,
		CounterSubEventTotal,
		GaugeQuerySeriesCount,
		GaugeRuleEvalDuration,
		GaugeNotifyRecordQueueSize,
		CounterVarFillingQuery,
	)

	return &Stats{
@@ -168,6 +186,8 @@ func NewSyncStats() *Stats {
		CounterHeartbeatErrorTotal: CounterHeartbeatErrorTotal,
		CounterSubEventTotal: CounterSubEventTotal,
		GaugeQuerySeriesCount: GaugeQuerySeriesCount,
		GaugeRuleEvalDuration: GaugeRuleEvalDuration,
		GaugeNotifyRecordQueueSize: GaugeNotifyRecordQueueSize,
		CounterVarFillingQuery: CounterVarFillingQuery,
	}
}

@@ -15,6 +15,7 @@ import (
	"github.com/ccfos/nightingale/v6/alert/aconf"
	"github.com/ccfos/nightingale/v6/alert/astats"
	"github.com/ccfos/nightingale/v6/alert/common"
	"github.com/ccfos/nightingale/v6/alert/pipeline"
	"github.com/ccfos/nightingale/v6/alert/sender"
	"github.com/ccfos/nightingale/v6/memsto"
	"github.com/ccfos/nightingale/v6/models"
@@ -35,6 +36,7 @@ type Dispatch struct {
	notifyRuleCache *memsto.NotifyRuleCacheType
	notifyChannelCache *memsto.NotifyChannelCacheType
	messageTemplateCache *memsto.MessageTemplateCacheType
	eventProcessorCache *memsto.EventProcessorCacheType

	alerting aconf.Alerting

@@ -54,7 +56,7 @@ type Dispatch struct {
func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.UserCacheType, userGroupCache *memsto.UserGroupCacheType,
	alertSubscribeCache *memsto.AlertSubscribeCacheType, targetCache *memsto.TargetCacheType, notifyConfigCache *memsto.NotifyConfigCacheType,
	taskTplsCache *memsto.TaskTplCache, notifyRuleCache *memsto.NotifyRuleCacheType, notifyChannelCache *memsto.NotifyChannelCacheType,
	messageTemplateCache *memsto.MessageTemplateCacheType, alerting aconf.Alerting, ctx *ctx.Context, astats *astats.Stats) *Dispatch {
	messageTemplateCache *memsto.MessageTemplateCacheType, eventProcessorCache *memsto.EventProcessorCacheType, alerting aconf.Alerting, ctx *ctx.Context, astats *astats.Stats) *Dispatch {
	notify := &Dispatch{
		alertRuleCache: alertRuleCache,
		userCache: userCache,
@@ -66,6 +68,7 @@ func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.Us
		notifyRuleCache: notifyRuleCache,
		notifyChannelCache: notifyChannelCache,
		messageTemplateCache: messageTemplateCache,
		eventProcessorCache: eventProcessorCache,

		alerting: alerting,

@@ -77,6 +80,12 @@ func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.Us
		ctx: ctx,
		Astats: astats,
	}

	pipeline.Init()

	// Set the callback function for notification records
	notifyChannelCache.SetNotifyRecordFunc(sender.NotifyRecord)

	return notify
}

@@ -141,11 +150,14 @@ func (e *Dispatch) reloadTpls() error {
	return nil
}

func (e *Dispatch) HandleEventWithNotifyRule(event *models.AlertCurEvent, isSubscribe bool) {
func (e *Dispatch) HandleEventWithNotifyRule(eventOrigin *models.AlertCurEvent) {

	if len(event.NotifyRuleIDs) > 0 {
		for _, notifyRuleId := range event.NotifyRuleIDs {
			logger.Infof("notify rule ids: %v, event: %+v", notifyRuleId, event)
	if len(eventOrigin.NotifyRuleIds) > 0 {
		for _, notifyRuleId := range eventOrigin.NotifyRuleIds {
			// Deep-copy a new event to avoid conflicts from concurrent modification of the event
			eventCopy := eventOrigin.DeepCopy()

			logger.Infof("notify rule ids: %v, event: %+v", notifyRuleId, eventCopy)
			notifyRule := e.notifyRuleCache.Get(notifyRuleId)
			if notifyRule == nil {
				continue
@@ -155,33 +167,108 @@ func (e *Dispatch) HandleEventWithNotifyRule(event *models.AlertCurEvent, isSubs
				continue
			}

			var processors []models.Processor
			for _, pipelineConfig := range notifyRule.PipelineConfigs {
				if !pipelineConfig.Enable {
					continue
				}

				eventPipeline := e.eventProcessorCache.Get(pipelineConfig.PipelineId)
				if eventPipeline == nil {
					logger.Warningf("notify_id: %d, event:%+v, processor not found", notifyRuleId, eventCopy)
					continue
				}

				if !pipelineApplicable(eventPipeline, eventCopy) {
					logger.Debugf("notify_id: %d, event:%+v, pipeline_id: %d, not applicable", notifyRuleId, eventCopy, pipelineConfig.PipelineId)
					continue
				}

				processors = append(processors, e.eventProcessorCache.GetProcessorsById(pipelineConfig.PipelineId)...)
			}

			for _, processor := range processors {
				logger.Infof("before processor notify_id: %d, event:%+v, processor:%+v", notifyRuleId, eventCopy, processor)
				eventCopy = processor.Process(e.ctx, eventCopy)
				logger.Infof("after processor notify_id: %d, event:%+v, processor:%+v", notifyRuleId, eventCopy, processor)
				if eventCopy == nil {
					logger.Warningf("notify_id: %d, event:%+v, processor:%+v, event is nil", notifyRuleId, eventCopy, processor)
					break
				}
			}

			if eventCopy == nil {
				// If eventCopy is nil, a processor dropped it, so no notification is sent
				continue
			}

			// notify
			for i := range notifyRule.NotifyConfigs {
				if !NotifyRuleApplicable(&notifyRule.NotifyConfigs[i], event) {
				if !NotifyRuleApplicable(&notifyRule.NotifyConfigs[i], eventCopy) {
					continue
				}
				notifyChannel := e.notifyChannelCache.Get(notifyRule.NotifyConfigs[i].ChannelID)
				messageTemplate := e.messageTemplateCache.Get(notifyRule.NotifyConfigs[i].TemplateID)
				if notifyChannel == nil {
					sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{event}, notifyRuleId, fmt.Sprintf("notify_channel_id:%d", notifyRule.NotifyConfigs[i].ChannelID), "", "", errors.New("notify_channel not found"))
					logger.Warningf("notify_id: %d, event:%+v, channel_id:%d, template_id: %d, notify_channel not found", notifyRuleId, event, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID)
					sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{eventCopy}, notifyRuleId, fmt.Sprintf("notify_channel_id:%d", notifyRule.NotifyConfigs[i].ChannelID), "", "", errors.New("notify_channel not found"))
					logger.Warningf("notify_id: %d, event:%+v, channel_id:%d, template_id: %d, notify_channel not found", notifyRuleId, eventCopy, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID)
					continue
				}

				if notifyChannel.RequestType != "flashduty" && messageTemplate == nil {
					logger.Warningf("notify_id: %d, channel_name: %v, event:%+v, template_id: %d, message_template not found", notifyRuleId, notifyChannel.Ident, event, notifyRule.NotifyConfigs[i].TemplateID)
					sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{event}, notifyRuleId, notifyChannel.Name, "", "", errors.New("message_template not found"))
					logger.Warningf("notify_id: %d, channel_name: %v, event:%+v, template_id: %d, message_template not found", notifyRuleId, notifyChannel.Ident, eventCopy, notifyRule.NotifyConfigs[i].TemplateID)
					sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{eventCopy}, notifyRuleId, notifyChannel.Name, "", "", errors.New("message_template not found"))

					continue
				}

				// todo go send
				// todo aggregate events
				go e.sendV2([]*models.AlertCurEvent{event}, notifyRuleId, &notifyRule.NotifyConfigs[i], notifyChannel, messageTemplate)
				go e.sendV2([]*models.AlertCurEvent{eventCopy}, notifyRuleId, &notifyRule.NotifyConfigs[i], notifyChannel, messageTemplate)
			}
		}
	}
}

func pipelineApplicable(pipeline *models.EventPipeline, event *models.AlertCurEvent) bool {
	if pipeline == nil {
		return true
	}

	if !pipeline.FilterEnable {
		return true
	}

	tagMatch := true
	if len(pipeline.LabelFilters) > 0 {
		for i := range pipeline.LabelFilters {
			if pipeline.LabelFilters[i].Func == "" {
				pipeline.LabelFilters[i].Func = pipeline.LabelFilters[i].Op
			}
		}

		tagFilters, err := models.ParseTagFilter(pipeline.LabelFilters)
		if err != nil {
			logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%+v pipeline:%+v", err, event, pipeline)
			return false
		}
		tagMatch = common.MatchTags(event.TagsMap, tagFilters)
	}

	attributesMatch := true
	if len(pipeline.AttrFilters) > 0 {
		tagFilters, err := models.ParseTagFilter(pipeline.AttrFilters)
		if err != nil {
			logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%+v pipeline:%+v err:%v", tagFilters, event, pipeline, err)
			return false
		}

		attributesMatch = common.MatchTags(event.JsonTagsAndValue(), tagFilters)
	}

	return tagMatch && attributesMatch
}

func NotifyRuleApplicable(notifyConfig *models.NotifyConfig, event *models.AlertCurEvent) bool {
	tm := time.Unix(event.TriggerTime, 0)
	triggerTime := tm.Format("15:04")
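
The hunk above gives each notify rule its own deep copy of the event, collects the processors of every enabled and applicable pipeline, and then runs them in order until one returns nil (the event is dropped). A minimal standalone sketch of that control flow follows. `Event`, `Pipeline`, `Processor`, `runPipelines`, and the exact-match `LabelEquals` filter are hypothetical simplifications for illustration, not the types in `models` or `memsto`.

```go
package main

import "fmt"

// Event is a hypothetical, simplified stand-in for models.AlertCurEvent.
type Event struct {
	RuleName string
	Tags     map[string]string
}

// DeepCopy returns a copy that shares no map with the original.
func (e *Event) DeepCopy() *Event {
	tags := make(map[string]string, len(e.Tags))
	for k, v := range e.Tags {
		tags[k] = v
	}
	return &Event{RuleName: e.RuleName, Tags: tags}
}

// Processor transforms an event; returning nil drops it.
type Processor func(*Event) *Event

// Pipeline pairs an optional label filter with an ordered processor list.
type Pipeline struct {
	FilterEnable bool
	LabelEquals  map[string]string // assumption: exact-match filters only
	Processors   []Processor
}

// applicable reports whether the pipeline should run for this event.
// A pipeline without an enabled filter applies to every event.
func (p *Pipeline) applicable(e *Event) bool {
	if !p.FilterEnable {
		return true
	}
	for k, want := range p.LabelEquals {
		if e.Tags[k] != want {
			return false
		}
	}
	return true
}

// runPipelines copies the event, gathers processors from applicable
// pipelines, and runs them in order. A nil result means the event was dropped.
func runPipelines(origin *Event, pipelines []*Pipeline) *Event {
	ev := origin.DeepCopy()
	var procs []Processor
	for _, p := range pipelines {
		if p == nil || !p.applicable(ev) {
			continue
		}
		procs = append(procs, p.Processors...)
	}
	for _, proc := range procs {
		if ev = proc(ev); ev == nil {
			return nil
		}
	}
	return ev
}

// addTeam attaches extra metadata to the event.
func addTeam(e *Event) *Event { e.Tags["team"] = "sre"; return e }

// dropAll drops every event it sees.
func dropAll(e *Event) *Event { return nil }

func main() {
	origin := &Event{RuleName: "cpu_high", Tags: map[string]string{"env": "prod"}}
	relabel := &Pipeline{Processors: []Processor{addTeam}}
	dropTest := &Pipeline{
		FilterEnable: true,
		LabelEquals:  map[string]string{"env": "test"},
		Processors:   []Processor{dropAll},
	}

	out := runPipelines(origin, []*Pipeline{relabel, dropTest})
	fmt.Println(out.Tags, origin.Tags)
}
```

Because every rule works on its own copy, a processor that mutates or drops the event for one notify rule cannot affect what another rule, or the caller, sees.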

@@ -359,35 +446,34 @@ func (e *Dispatch) sendV2(events []*models.AlertCurEvent, notifyRuleId int64, no

	switch notifyChannel.RequestType {
	case "flashduty":
		if len(flashDutyChannelIDs) == 0 {
			flashDutyChannelIDs = []int64{0} // If no flashduty channel is configured, use 0 so that SendFlashDuty can detect it and omit the channel_id parameter
		}

		for i := range flashDutyChannelIDs {
			respBody, err := notifyChannel.SendFlashDuty(events, flashDutyChannelIDs[i], e.notifyChannelCache.GetHttpClient(notifyChannel.ID))
			logger.Infof("notify_id: %d, channel_name: %v, event:%+v, IntegrationUrl: %v dutychannel_id: %v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0], notifyChannel.RequestConfig.FlashDutyRequestConfig.IntegrationUrl, flashDutyChannelIDs[i], respBody, err)
			sender.NotifyRecord(e.ctx, events, notifyRuleId, notifyChannel.Name, strconv.FormatInt(flashDutyChannelIDs[i], 10), respBody, err)
		}
		return

	case "http":
		if e.notifyChannelCache.HttpConcurrencyAdd(notifyChannel.ID) {
			defer e.notifyChannelCache.HttpConcurrencyDone(notifyChannel.ID)
		}
		if notifyChannel.RequestConfig == nil {
			logger.Warningf("notify_id: %d, channel_name: %v, event:%+v, request config not found", notifyRuleId, notifyChannel.Name, events[0])
		// Handle http notifications in queue mode
		// Create the notification task
		task := &memsto.NotifyTask{
			Events: events,
			NotifyRuleId: notifyRuleId,
			NotifyChannel: notifyChannel,
			TplContent: tplContent,
			CustomParams: customParams,
			Sendtos: sendtos,
		}

		if notifyChannel.RequestConfig.HTTPRequestConfig == nil {
			logger.Warningf("notify_id: %d, channel_name: %v, event:%+v, http request config not found", notifyRuleId, notifyChannel.Name, events[0])
		}

		if NeedBatchContacts(notifyChannel.RequestConfig.HTTPRequestConfig) || len(sendtos) == 0 {
			resp, err := notifyChannel.SendHTTP(events, tplContent, customParams, sendtos, e.notifyChannelCache.GetHttpClient(notifyChannel.ID))
			logger.Infof("notify_id: %d, channel_name: %v, event:%+v, tplContent:%s, customParams:%v, userInfo:%+v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0], tplContent, customParams, sendtos, resp, err)

			sender.NotifyRecord(e.ctx, events, notifyRuleId, notifyChannel.Name, getSendTarget(customParams, sendtos), resp, err)
		} else {
			for i := range sendtos {
				resp, err := notifyChannel.SendHTTP(events, tplContent, customParams, []string{sendtos[i]}, e.notifyChannelCache.GetHttpClient(notifyChannel.ID))
				logger.Infof("notify_id: %d, channel_name: %v, event:%+v, tplContent:%s, customParams:%v, userInfo:%+v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0], tplContent, customParams, sendtos[i], resp, err)
				sender.NotifyRecord(e.ctx, events, notifyRuleId, notifyChannel.Name, getSendTarget(customParams, []string{sendtos[i]}), resp, err)
			}
		// Add the task to the queue
		success := e.notifyChannelCache.EnqueueNotifyTask(task)
		if !success {
			logger.Errorf("failed to enqueue notify task for channel %d, notify_id: %d", notifyChannel.ID, notifyRuleId)
			// If enqueueing fails, record a failed notification
			sender.NotifyRecord(e.ctx, events, notifyRuleId, notifyChannel.Name, getSendTarget(customParams, sendtos), "", errors.New("failed to enqueue notify task, queue is full"))
		}

	case "smtp":
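
The `http` branch above replaces inline sending with a task that is handed to a per-channel queue, and records an error when `EnqueueNotifyTask` reports that the queue is full. The diff does not show how that queue is implemented. One common way to get this "enqueue or fail immediately" behaviour in Go is a buffered channel with a non-blocking send, sketched below with hypothetical names (`taskQueue`, `Enqueue`, `submit`) and a trimmed-down `NotifyTask`.

```go
package main

import (
	"errors"
	"fmt"
)

// NotifyTask is a hypothetical, trimmed-down task; the struct in the diff
// carries events, channel, template content, and recipients.
type NotifyTask struct {
	NotifyRuleId int64
	Content      string
}

// taskQueue is a bounded FIFO backed by a buffered channel.
type taskQueue struct {
	ch chan *NotifyTask
}

func newTaskQueue(capacity int) *taskQueue {
	return &taskQueue{ch: make(chan *NotifyTask, capacity)}
}

// Enqueue never blocks: it reports false when the buffer is full.
func (q *taskQueue) Enqueue(t *NotifyTask) bool {
	select {
	case q.ch <- t:
		return true
	default:
		return false
	}
}

var errQueueFull = errors.New("failed to enqueue notify task, queue is full")

// submit enqueues a task and returns an error the caller can record
// when the queue has no free slot.
func submit(q *taskQueue, t *NotifyTask) error {
	if !q.Enqueue(t) {
		return errQueueFull
	}
	return nil
}

func main() {
	q := newTaskQueue(1)
	fmt.Println(submit(q, &NotifyTask{NotifyRuleId: 1, Content: "first"}))
	fmt.Println(submit(q, &NotifyTask{NotifyRuleId: 1, Content: "second"}))
}
```

The trade-off of this design is that a burst larger than the buffer loses notifications instead of stalling the dispatcher, which is why the failure is written to the notification record rather than silently ignored.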

@@ -416,11 +502,6 @@ func (e *Dispatch) HandleEventNotify(event *models.AlertCurEvent, isSubscribe bo
		return
	}

	if e.blockEventNotify(rule, event) {
		logger.Infof("block event notify: rule_id:%d event:%+v", rule.Id, event)
		return
	}

	fillUsers(event, e.userCache, e.userGroupCache)

	var (
@@ -448,8 +529,7 @@ func (e *Dispatch) HandleEventNotify(event *models.AlertCurEvent, isSubscribe bo
		notifyTarget.AndMerge(handler(rule, event, notifyTarget, e))
	}

	// Handle event sending; one goroutine handles all sends for a single event
	go e.HandleEventWithNotifyRule(event, isSubscribe)
	go e.HandleEventWithNotifyRule(event)
	go e.Send(rule, event, notifyTarget, isSubscribe)

	// If the event was not produced by a subscription rule, it also needs to be run through the subscription rules
@@ -458,25 +538,6 @@ func (e *Dispatch) HandleEventNotify(event *models.AlertCurEvent, isSubscribe bo
	}
}

func (e *Dispatch) blockEventNotify(rule *models.AlertRule, event *models.AlertCurEvent) bool {
	ruleType := rule.GetRuleType()

	// For host rules, first check whether the host has been deleted
	if ruleType == models.HOST {
		host, ok := e.targetCache.Get(event.TagsMap["ident"])
		if !ok || host == nil {
			return true
		}
	}

	// Recovery notification: check whether the rule configuration has changed
	// if event.IsRecovered && event.RuleHash != rule.Hash() {
	// 	return true
	// }

	return false
}

func (e *Dispatch) handleSubs(event *models.AlertCurEvent) {
	// handle alert subscribes
	subscribes := make([]*models.AlertSubscribe, 0)
@@ -646,6 +707,11 @@ func (e *Dispatch) HandleIbex(rule *models.AlertRule, event *models.AlertCurEven
	}
	json.Unmarshal([]byte(rule.RuleConfig), &ruleConfig)

	if event.IsRecovered {
		// Recovery events do not need to go through the self-healing logic
		return
	}

	for _, t := range ruleConfig.TaskTpls {
		if t.TplId == 0 {
			continue

@@ -732,10 +798,10 @@ func getSendTarget(customParams map[string]string, sendtos []string) string {

	values := make([]string, 0)
	for _, value := range customParams {
		if len(value) <= 4 {
		runes := []rune(value)
		if len(runes) <= 4 {
			values = append(values, value)
		} else {
			runes := []rune(value)
			maskedValue := string(runes[:len(runes)-4]) + "****"
			values = append(values, maskedValue)
		}
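
The change above moves the length check from bytes (`len(value)`) to runes (`len(runes)`). The difference matters for multi-byte text: a value with at most four characters can still be longer than four bytes, in which case the old check fell through to `runes[:len(runes)-4]` with a negative bound. The sketch below isolates the new logic in a hypothetical helper, `maskTail`, which is not a function in the repository.

```go
package main

import "fmt"

// maskTail hides the last four characters of value, counting runes rather
// than bytes. Values of four characters or fewer are returned unchanged.
func maskTail(value string) string {
	runes := []rune(value)
	if len(runes) <= 4 {
		return value
	}
	return string(runes[:len(runes)-4]) + "****"
}

func main() {
	// ASCII input: byte length and rune length agree.
	fmt.Println(maskTail("13800001234"))

	// Two CJK characters: 2 runes but 6 bytes, so a byte-based check
	// would wrongly treat the value as longer than four characters.
	fmt.Println(maskTail("值班"))
	fmt.Println(len("值班"), len([]rune("值班")))
}
```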

@@ -24,7 +24,7 @@ func LogEvent(event *models.AlertCurEvent, location string, err ...error) {
		location,
		event.RuleId,
		event.SubRuleId,
		event.NotifyRuleIDs,
		event.NotifyRuleIds,
		event.Cluster,
		event.TagsJSON,
		event.TriggerValue,

@@ -13,6 +13,7 @@ import (
	"sync"
	"time"

	"github.com/ccfos/nightingale/v6/alert/astats"
	"github.com/ccfos/nightingale/v6/alert/common"
	"github.com/ccfos/nightingale/v6/alert/process"
	"github.com/ccfos/nightingale/v6/dscache"
@@ -36,7 +37,6 @@ type AlertRuleWorker struct {
	DatasourceId int64
	Quit chan struct{}
	Inhibit bool
	Severity int

	Rule *models.AlertRule

@@ -99,7 +99,7 @@ func NewAlertRuleWorker(rule *models.AlertRule, datasourceId int64, Processor *p
		rule.CronPattern = fmt.Sprintf("@every %ds", interval)
	}

	arw.Scheduler = cron.New(cron.WithSeconds())
	arw.Scheduler = cron.New(cron.WithSeconds(), cron.WithChain(cron.SkipIfStillRunning(cron.DefaultLogger)))

	entryID, err := arw.Scheduler.AddFunc(rule.CronPattern, func() {
		arw.Eval()
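
The scheduler change above wraps each rule's job in `cron.SkipIfStillRunning`, so a tick that arrives while the previous `Eval` is still running is skipped rather than run concurrently. The sketch below shows the same idea using only the standard library. The wrapper `skipIfStillRunning` here is a hypothetical illustration of the technique, not the cron library's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// skipIfStillRunning wraps job so that a call made while a previous call is
// still executing returns immediately instead of running alongside it.
// The returned function reports whether the job actually ran.
func skipIfStillRunning(job func()) func() bool {
	slot := make(chan struct{}, 1) // holds one token while the job runs
	return func() bool {
		select {
		case slot <- struct{}{}:
			defer func() { <-slot }()
			job()
			return true
		default:
			return false // previous run has not finished; skip this tick
		}
	}
}

func main() {
	started := make(chan struct{})
	release := make(chan struct{})
	eval := skipIfStillRunning(func() {
		started <- struct{}{} // signal that the slow evaluation has begun
		<-release             // simulate a slow rule evaluation
	})

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		eval()
	}()

	<-started
	fmt.Println("overlapping tick ran:", eval())
	release <- struct{}{}
	wg.Wait()
}
```

For alert rules this trades evaluation frequency for bounded load: a slow data source causes missed ticks for that rule instead of a growing pile of overlapping queries.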

@@ -172,7 +172,7 @@ func (arw *AlertRuleWorker) Eval() {
	case models.LOKI:
		anomalyPoints, err = arw.GetPromAnomalyPoint(cachedRule.RuleConfig)
	default:
		anomalyPoints, recoverPoints = arw.GetAnomalyPoint(cachedRule, arw.Processor.DatasourceId())
		anomalyPoints, recoverPoints, err = arw.GetAnomalyPoint(cachedRule, arw.Processor.DatasourceId())
	}

	if err != nil {
@@ -232,7 +232,10 @@ func (arw *AlertRuleWorker) Stop() {

func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.AnomalyPoint, error) {
	var lst []models.AnomalyPoint
	var severity int
	start := time.Now()
	defer func() {
		arw.Processor.Stats.GaugeRuleEvalDuration.WithLabelValues(fmt.Sprintf("%v", arw.Rule.Id), fmt.Sprintf("%v", arw.Processor.DatasourceId())).Set(float64(time.Since(start).Milliseconds()))
	}()

	var rule *models.PromRuleConfig
	if err := json.Unmarshal([]byte(ruleConfig), &rule); err != nil {
|
||||
@@ -259,10 +262,6 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
|
||||
|
||||
arw.Inhibit = rule.Inhibit
|
||||
for i, query := range rule.Queries {
|
||||
if query.Severity < severity {
|
||||
arw.Severity = query.Severity
|
||||
}
|
||||
|
||||
readerClient := arw.PromClients.GetCli(arw.DatasourceId)
|
||||
|
||||
if readerClient == nil {
|
||||
@@ -281,9 +280,21 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
|
||||
if hasLabelLossAggregator(query) || notExactMatch(query) {
|
||||
// 若有聚合函数或非精确匹配则需要先填充变量然后查询,这个方式效率较低
|
||||
anomalyPoints = arw.VarFillingBeforeQuery(query, readerClient)
|
||||
arw.Processor.Stats.CounterVarFillingQuery.WithLabelValues(
|
||||
fmt.Sprintf("%v", arw.Rule.Id),
|
||||
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
|
||||
fmt.Sprintf("%v", i),
|
||||
"BeforeQuery",
|
||||
).Inc()
|
||||
} else {
|
||||
// 先查询再过滤变量,效率较高,但无法处理有聚合函数的情况
|
||||
anomalyPoints = arw.VarFillingAfterQuery(query, readerClient)
|
||||
arw.Processor.Stats.CounterVarFillingQuery.WithLabelValues(
|
||||
fmt.Sprintf("%v", arw.Rule.Id),
|
||||
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
|
||||
fmt.Sprintf("%v", i),
|
||||
"AfterQuery",
|
||||
).Inc()
|
||||
}
|
||||
lst = append(lst, anomalyPoints...)
|
||||
} else {
|
||||
@@ -302,7 +313,7 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
|
||||
}
|
||||
|
||||
var warnings promsdk.Warnings
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId)).Inc()
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
|
||||
value, warnings, err := readerClient.Query(context.Background(), promql, time.Now())
|
||||
if err != nil {
|
||||
logger.Errorf("rule_eval:%s promql:%s, error:%v", arw.Key(), promql, err)
|
||||
@@ -413,6 +424,7 @@ func (arw *AlertRuleWorker) VarFillingAfterQuery(query models.PromQuery, readerC
|
||||
realQuery = strings.Replace(realQuery, fmt.Sprintf("$%s", key), val, -1)
|
||||
}
|
||||
// Fetch all results that satisfy the value variables
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
|
||||
value, _, err := readerClient.Query(context.Background(), curQuery, time.Now())
|
||||
if err != nil {
|
||||
logger.Errorf("rule_eval:%s, promql:%s, error:%v", arw.Key(), curQuery, err)
|
||||
@@ -574,7 +586,7 @@ func (arw *AlertRuleWorker) getParamPermutation(paramVal map[string]models.Param
|
||||
logger.Errorf("query:%s fail to unmarshalling into string slice, error:%v", paramQuery.Query, err)
|
||||
}
|
||||
if len(query) == 0 {
|
||||
paramsKeyAllLabel, err := getParamKeyAllLabel(varToLabel[paramKey], originPromql, readerClient)
|
||||
paramsKeyAllLabel, err := getParamKeyAllLabel(varToLabel[paramKey], originPromql, readerClient, arw.DatasourceId, arw.Rule.Id, arw.Processor.Stats)
|
||||
if err != nil {
|
||||
logger.Errorf("rule_eval:%s, fail to getParamKeyAllLabel, error:%v query:%s", arw.Key(), err, paramQuery.Query)
|
||||
}
|
||||
@@ -605,7 +617,7 @@ func (arw *AlertRuleWorker) getParamPermutation(paramVal map[string]models.Param
|
||||
return res, nil
|
||||
}
|
||||
|
||||
func getParamKeyAllLabel(paramKey string, promql string, client promsdk.API) ([]string, error) {
|
||||
func getParamKeyAllLabel(paramKey string, promql string, client promsdk.API, dsId int64, rid int64, stats *astats.Stats) ([]string, error) {
|
||||
labels, metricName, err := promql2.GetLabelsAndMetricNameWithReplace(promql, "$")
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("promql:%s, get labels error:%v", promql, err)
|
||||
@@ -619,6 +631,7 @@ func getParamKeyAllLabel(paramKey string, promql string, client promsdk.API) ([]
|
||||
}
|
||||
pr := metricName + "{" + strings.Join(labelstrs, ",") + "}"
|
||||
|
||||
stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", dsId), fmt.Sprintf("%d", rid)).Inc()
|
||||
value, _, err := client.Query(context.Background(), pr, time.Now())
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("promql: %s query error: %v", pr, err)
|
||||
@@ -733,7 +746,10 @@ func combine(paramKeys []string, paraMap map[string][]string, index int, current
|
||||
|
||||
func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.AnomalyPoint, error) {
|
||||
var lst []models.AnomalyPoint
|
||||
var severity int
|
||||
start := time.Now()
|
||||
defer func() {
|
||||
arw.Processor.Stats.GaugeRuleEvalDuration.WithLabelValues(fmt.Sprintf("%v", arw.Rule.Id), fmt.Sprintf("%v", arw.Processor.DatasourceId())).Set(float64(time.Since(start).Milliseconds()))
|
||||
}()
|
||||
|
||||
var rule *models.HostRuleConfig
|
||||
if err := json.Unmarshal([]byte(ruleConfig), &rule); err != nil {
|
||||
@@ -761,10 +777,6 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
|
||||
arw.Inhibit = rule.Inhibit
|
||||
now := time.Now().Unix()
|
||||
for _, trigger := range rule.Triggers {
|
||||
if trigger.Severity < severity {
|
||||
arw.Severity = trigger.Severity
|
||||
}
|
||||
|
||||
switch trigger.Type {
|
||||
case "target_miss":
|
||||
t := now - int64(trigger.Duration)
|
||||
@@ -1276,6 +1288,7 @@ func (arw *AlertRuleWorker) VarFillingBeforeQuery(query models.PromQuery, reader
|
||||
<-semaphore
|
||||
wg.Done()
|
||||
}()
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
|
||||
value, _, err := readerClient.Query(context.Background(), promql, time.Now())
|
||||
if err != nil {
|
||||
logger.Errorf("rule_eval:%s, promql:%s, error:%v", arw.Key(), promql, err)
|
||||
@@ -1409,13 +1422,18 @@ func fillVar(curRealQuery string, paramKey string, val string) string {
|
||||
return curRealQuery
|
||||
}
|
||||
|
||||
func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64) ([]models.AnomalyPoint, []models.AnomalyPoint) {
|
||||
func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64) ([]models.AnomalyPoint, []models.AnomalyPoint, error) {
|
||||
// Get the queries and the rule's trigger conditions
|
||||
start := time.Now()
|
||||
defer func() {
|
||||
arw.Processor.Stats.GaugeRuleEvalDuration.WithLabelValues(fmt.Sprintf("%v", arw.Rule.Id), fmt.Sprintf("%v", arw.Processor.DatasourceId())).Set(float64(time.Since(start).Milliseconds()))
|
||||
}()
|
||||
|
||||
points := []models.AnomalyPoint{}
|
||||
recoverPoints := []models.AnomalyPoint{}
|
||||
ruleConfig := strings.TrimSpace(rule.RuleConfig)
|
||||
if ruleConfig == "" {
|
||||
logger.Warningf("rule_eval:%d promql is blank", rule.Id)
|
||||
logger.Warningf("rule_eval:%d ruleConfig is blank", rule.Id)
|
||||
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
|
||||
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
|
||||
fmt.Sprintf("%v", arw.Rule.Id),
|
||||
@@ -1423,7 +1441,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
"",
|
||||
).Set(0)
|
||||
|
||||
return points, recoverPoints
|
||||
return points, recoverPoints, fmt.Errorf("rule_eval:%d ruleConfig is blank", rule.Id)
|
||||
}
|
||||
|
||||
var ruleQuery models.RuleQuery
|
||||
@@ -1431,7 +1449,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
if err != nil {
|
||||
logger.Warningf("rule_eval:%d promql parse error:%s", rule.Id, err.Error())
|
||||
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
|
||||
return points, recoverPoints
|
||||
return points, recoverPoints, fmt.Errorf("rule_eval:%d promql parse error:%s", rule.Id, err.Error())
|
||||
}
|
||||
|
||||
arw.Inhibit = ruleQuery.Inhibit
|
||||
@@ -1451,12 +1469,13 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
|
||||
fmt.Sprintf("%v", i),
|
||||
).Set(-2)
|
||||
continue
|
||||
|
||||
return points, recoverPoints, fmt.Errorf("rule_eval:%d datasource:%d not exists", rule.Id, dsId)
|
||||
}
|
||||
|
||||
ctx := context.WithValue(context.Background(), "delay", int64(rule.Delay))
|
||||
series, err := plug.QueryData(ctx, query)
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId)).Inc()
|
||||
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", rule.Id)).Inc()
|
||||
if err != nil {
|
||||
logger.Warningf("rule_eval rid:%d query data error: %v", rule.Id, err)
|
||||
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_CLIENT, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
|
||||
@@ -1466,7 +1485,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
fmt.Sprintf("%v", i),
|
||||
).Set(-1)
|
||||
|
||||
continue
|
||||
return points, recoverPoints, fmt.Errorf("rule_eval:%d query data error: %v", rule.Id, err)
|
||||
}
|
||||
|
||||
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
|
||||
@@ -1500,6 +1519,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
for _, query := range ruleQuery.Queries {
|
||||
ref, unit, err := GetQueryRefAndUnit(query)
|
||||
if err != nil {
|
||||
logger.Warningf("rule_eval rid:%d query:%+v get ref and unit error:%s", rule.Id, query, err.Error())
|
||||
continue
|
||||
}
|
||||
unitMap[ref] = unit
|
||||
@@ -1579,6 +1599,11 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
}
|
||||
}
|
||||
|
||||
queries := ruleQuery.Queries
|
||||
if sample.Query != "" {
|
||||
queries = []interface{}{sample.Query}
|
||||
}
|
||||
|
||||
point := models.AnomalyPoint{
|
||||
Key: sample.MetricName(),
|
||||
Labels: sample.Metric,
|
||||
@@ -1587,7 +1612,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
Values: values,
|
||||
Severity: trigger.Severity,
|
||||
Triggered: isTriggered,
|
||||
Query: fmt.Sprintf("query:%+v trigger:%+v", ruleQuery.Queries, trigger),
|
||||
Query: fmt.Sprintf("query:%+v trigger:%+v", queries, trigger),
|
||||
RecoverConfig: trigger.RecoverConfig,
|
||||
ValuesUnit: valuesUnitMap,
|
||||
}
|
||||
@@ -1661,5 +1686,5 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
|
||||
}
|
||||
}
|
||||
|
||||
return points, recoverPoints
|
||||
return points, recoverPoints, nil
|
||||
}
|
||||
|
||||
@@ -44,6 +44,12 @@ func TimeSpanMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) b
|
||||
triggerTime := tm.Format("15:04")
|
||||
triggerWeek := strconv.Itoa(int(tm.Weekday()))
|
||||
|
||||
if rule.EnableDaysOfWeek == "" {
|
||||
// If the rule has no effective time configured, it is in effect all day by default
|
||||
|
||||
return false
|
||||
}
|
||||
|
||||
enableStime := strings.Fields(rule.EnableStime)
|
||||
enableEtime := strings.Fields(rule.EnableEtime)
|
||||
enableDaysOfWeek := strings.Split(rule.EnableDaysOfWeek, ";")
|
||||
@@ -129,7 +135,7 @@ func EventMuteStrategy(event *models.AlertCurEvent, alertMuteCache *memsto.Alert
|
||||
}
|
||||
|
||||
for i := 0; i < len(mutes); i++ {
|
||||
if matchMute(event, mutes[i]) {
|
||||
if MatchMute(event, mutes[i]) {
|
||||
return true, mutes[i].Id
|
||||
}
|
||||
}
|
||||
@@ -137,15 +143,11 @@ func EventMuteStrategy(event *models.AlertCurEvent, alertMuteCache *memsto.Alert
|
||||
return false, 0
|
||||
}
|
||||
|
||||
// matchMute uses the time given by the optional clock argument when it is passed; otherwise it takes TriggerTime from the event
|
||||
func matchMute(event *models.AlertCurEvent, mute *models.AlertMute, clock ...int64) bool {
|
||||
// MatchMute uses the time given by the optional clock argument when it is passed; otherwise it takes TriggerTime from the event
|
||||
func MatchMute(event *models.AlertCurEvent, mute *models.AlertMute, clock ...int64) bool {
|
||||
if mute.Disabled == 1 {
|
||||
return false
|
||||
}
|
||||
ts := event.TriggerTime
|
||||
if len(clock) > 0 {
|
||||
ts = clock[0]
|
||||
}
|
||||
|
||||
// If the mute is not global, check whether the datasource id matches
|
||||
if len(mute.DatasourceIdsJson) != 0 && mute.DatasourceIdsJson[0] != 0 && event.DatasourceId != 0 {
|
||||
@@ -160,37 +162,21 @@ func matchMute(event *models.AlertCurEvent, mute *models.AlertMute, clock ...int
|
||||
}
|
||||
}
|
||||
|
||||
var matchTime bool
|
||||
if mute.MuteTimeType == models.TimeRange {
|
||||
if ts < mute.Btime || ts > mute.Etime {
|
||||
if !mute.IsWithinTimeRange(event.TriggerTime) {
|
||||
return false
|
||||
}
|
||||
matchTime = true
|
||||
} else if mute.MuteTimeType == models.Periodic {
|
||||
tm := time.Unix(event.TriggerTime, 0)
|
||||
triggerTime := tm.Format("15:04")
|
||||
triggerWeek := strconv.Itoa(int(tm.Weekday()))
|
||||
|
||||
for i := 0; i < len(mute.PeriodicMutesJson); i++ {
|
||||
if strings.Contains(mute.PeriodicMutesJson[i].EnableDaysOfWeek, triggerWeek) {
|
||||
if mute.PeriodicMutesJson[i].EnableStime == mute.PeriodicMutesJson[i].EnableEtime || (mute.PeriodicMutesJson[i].EnableStime == "00:00" && mute.PeriodicMutesJson[i].EnableEtime == "23:59") {
|
||||
matchTime = true
|
||||
break
|
||||
} else if mute.PeriodicMutesJson[i].EnableStime < mute.PeriodicMutesJson[i].EnableEtime {
|
||||
if triggerTime >= mute.PeriodicMutesJson[i].EnableStime && triggerTime < mute.PeriodicMutesJson[i].EnableEtime {
|
||||
matchTime = true
|
||||
break
|
||||
}
|
||||
} else {
|
||||
if triggerTime >= mute.PeriodicMutesJson[i].EnableStime || triggerTime < mute.PeriodicMutesJson[i].EnableEtime {
|
||||
matchTime = true
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
ts := event.TriggerTime
|
||||
if len(clock) > 0 {
|
||||
ts = clock[0]
|
||||
}
|
||||
}
|
||||
if !matchTime {
|
||||
|
||||
if !mute.IsWithinPeriodicMute(ts) {
|
||||
return false
|
||||
}
|
||||
} else {
|
||||
logger.Warningf("mute time type invalid, %d", mute.MuteTimeType)
|
||||
return false
|
||||
}
|
||||
|
||||
|
||||
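The periodic branch removed from MatchMute in the hunk above compares zero-padded "HH:MM" strings and lets a window wrap past midnight. The sketch below restates that comparison as a standalone function. The name `inDailyWindow` is hypothetical, and whether `mute.IsWithinPeriodicMute` keeps exactly this logic is an assumption, since its body is not part of this section.

```go
package main

import "fmt"

// inDailyWindow reports whether trigger falls inside the [start, end) window.
// All three arguments are zero-padded "HH:MM" strings, so plain string
// comparison orders them chronologically.
// Hypothetical helper: it restates the comparison from the removed lines.
func inDailyWindow(trigger, start, end string) bool {
	// Equal bounds, or 00:00 to 23:59, are treated as "all day".
	if start == end || (start == "00:00" && end == "23:59") {
		return true
	}
	if start < end {
		// Ordinary window within a single day.
		return trigger >= start && trigger < end
	}
	// Window that wraps past midnight, for example 22:00 to 06:00.
	return trigger >= start || trigger < end
}

func main() {
	fmt.Println(inDailyWindow("23:30", "22:00", "06:00"))
	fmt.Println(inDailyWindow("12:00", "22:00", "06:00"))
}
```

The end bound is exclusive in both branches, so a trigger at exactly the end time is not muted.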
11
alert/pipeline/pipeline.go
Normal file
@@ -0,0 +1,11 @@
package pipeline

import (
	_ "github.com/ccfos/nightingale/v6/alert/pipeline/processor/callback"
	_ "github.com/ccfos/nightingale/v6/alert/pipeline/processor/eventdrop"
	_ "github.com/ccfos/nightingale/v6/alert/pipeline/processor/eventupdate"
	_ "github.com/ccfos/nightingale/v6/alert/pipeline/processor/relabel"
)

func Init() {
}
106
alert/pipeline/processor/callback/callback.go
Normal file
@@ -0,0 +1,106 @@
|
||||
package callback
|
||||
|
||||
import (
|
||||
"crypto/tls"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/alert/pipeline/processor/common"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
type HTTPConfig struct {
|
||||
URL string `json:"url"`
|
||||
Method string `json:"method,omitempty"`
|
||||
Body string `json:"body,omitempty"`
|
||||
Headers map[string]string `json:"headers"`
|
||||
AuthUsername string `json:"auth_username"`
|
||||
AuthPassword string `json:"auth_password"`
|
||||
Timeout int `json:"timeout"` // unit: ms
|
||||
SkipSSLVerify bool `json:"skip_ssl_verify"`
|
||||
Proxy string `json:"proxy"`
|
||||
Client *http.Client `json:"-"`
|
||||
}
|
||||
|
||||
// CallbackConfig is the configuration of the callback processor; it embeds HTTPConfig.
|
||||
type CallbackConfig struct {
|
||||
HTTPConfig
|
||||
}
|
||||
|
||||
func init() {
|
||||
models.RegisterProcessor("callback", &CallbackConfig{})
|
||||
}
|
||||
|
||||
func (c *CallbackConfig) Init(settings interface{}) (models.Processor, error) {
|
||||
result, err := common.InitProcessor[*CallbackConfig](settings)
|
||||
return result, err
|
||||
}
|
||||
|
||||
func (c *CallbackConfig) Process(ctx *ctx.Context, event *models.AlertCurEvent) *models.AlertCurEvent {
|
||||
if c.Client == nil {
|
||||
transport := &http.Transport{
|
||||
TLSClientConfig: &tls.Config{InsecureSkipVerify: c.SkipSSLVerify},
|
||||
}
|
||||
|
||||
if c.Proxy != "" {
|
||||
proxyURL, err := url.Parse(c.Proxy)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to parse proxy url: %v", err)
|
||||
} else {
|
||||
transport.Proxy = http.ProxyURL(proxyURL)
|
||||
}
|
||||
}
|
||||
|
||||
c.Client = &http.Client{
|
||||
Timeout: time.Duration(c.Timeout) * time.Millisecond,
|
||||
Transport: transport,
|
||||
}
|
||||
}
|
||||
|
||||
headers := make(map[string]string)
|
||||
headers["Content-Type"] = "application/json"
|
||||
for k, v := range c.Headers {
|
||||
headers[k] = v
|
||||
}
|
||||
|
||||
body, err := json.Marshal(event)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to marshal event: %v", err)
|
||||
return event
|
||||
}
|
||||
|
||||
req, err := http.NewRequest("POST", c.URL, strings.NewReader(string(body)))
|
||||
if err != nil {
|
||||
logger.Errorf("failed to create request: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
|
||||
for k, v := range headers {
|
||||
req.Header.Set(k, v)
|
||||
}
|
||||
|
||||
if c.AuthUsername != "" && c.AuthPassword != "" {
|
||||
req.SetBasicAuth(c.AuthUsername, c.AuthPassword)
|
||||
}
|
||||
|
||||
resp, err := c.Client.Do(req)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to send request: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
|
||||
b, err := io.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to read response body: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
|
||||
logger.Infof("response body: %s", string(b))
|
||||
return event
|
||||
}
|
||||
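The callback processor above marshals the event to JSON, POSTs it with a millisecond timeout, and logs the response body. The following is a minimal stdlib-only sketch of that request path, run against a throwaway local echo server. `postJSON` and `echoOnce` are hypothetical names, and the sketch closes the response body, which the code in the diff does not show.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// postJSON marshals payload, POSTs it with a JSON content type,
// and returns the response body as a string.
func postJSON(client *http.Client, url string, payload interface{}) (string, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return "", err
	}
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	// Close the body so the underlying connection can be reused.
	defer resp.Body.Close()

	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// echoOnce starts a local test server that echoes the request body,
// posts payload to it with the given timeout in milliseconds,
// and returns what the server sent back.
func echoOnce(payload interface{}, timeoutMs int) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, _ = io.Copy(w, r.Body)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: time.Duration(timeoutMs) * time.Millisecond}
	return postJSON(client, srv.URL, payload)
}

func main() {
	out, err := echoOnce(map[string]string{"rule_name": "cpu_high"}, 500)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Note that a `Timeout` of 0 on `http.Client` means no timeout at all, so an unset `timeout` field in the processor settings would leave requests unbounded.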
24
alert/pipeline/processor/common/common.go
Normal file
@@ -0,0 +1,24 @@
package common

import (
	"encoding/json"
)

// InitProcessor is a generic helper for initializing processors.
// It uses generics to simplify processor initialization.
// T must implement the models.Processor interface.
func InitProcessor[T any](settings interface{}) (T, error) {
	var zero T
	b, err := json.Marshal(settings)
	if err != nil {
		return zero, err
	}

	var result T
	err = json.Unmarshal(b, &result)
	if err != nil {
		return zero, err
	}

	return result, nil
}
|
||||
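InitProcessor converts loosely typed settings into a concrete config by marshaling them to JSON and unmarshaling into T. The sketch below repeats that round trip in a self-contained program. `initProcessor` is a local copy of the helper, and `dropConfig` is a hypothetical config type used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// initProcessor mirrors the generic helper from the diff: settings are
// marshaled to JSON and then unmarshaled into a value of type T.
func initProcessor[T any](settings interface{}) (T, error) {
	var zero T
	b, err := json.Marshal(settings)
	if err != nil {
		return zero, err
	}

	var result T
	if err := json.Unmarshal(b, &result); err != nil {
		return zero, err
	}
	return result, nil
}

// dropConfig is a hypothetical processor config, not a type from the repository.
type dropConfig struct {
	Content string `json:"content"`
}

func main() {
	// Settings as they might arrive from a decoded JSON document.
	settings := map[string]interface{}{"content": "some template text"}

	// T is a pointer type here; json.Unmarshal allocates the struct it points to.
	cfg, err := initProcessor[*dropConfig](settings)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Content)
}
```

Because T is instantiated with a pointer type, each call yields a freshly allocated config rather than mutating the receiver registered with `models.RegisterProcessor`.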
61
alert/pipeline/processor/eventdrop/event_drop.go
Normal file
@@ -0,0 +1,61 @@
package eventdrop

import (
	"bytes"
	"strings"
	texttemplate "text/template"

	"github.com/ccfos/nightingale/v6/alert/pipeline/processor/common"
	"github.com/ccfos/nightingale/v6/models"
	"github.com/ccfos/nightingale/v6/pkg/ctx"
	"github.com/ccfos/nightingale/v6/pkg/tplx"
	"github.com/toolkits/pkg/logger"
)

type EventDropConfig struct {
	Content string `json:"content"`
}

func init() {
	models.RegisterProcessor("event_drop", &EventDropConfig{})
}

func (c *EventDropConfig) Init(settings interface{}) (models.Processor, error) {
	result, err := common.InitProcessor[*EventDropConfig](settings)
	return result, err
}

func (c *EventDropConfig) Process(ctx *ctx.Context, event *models.AlertCurEvent) *models.AlertCurEvent {
	// Background: this processor enables more flexible event filtering logic.
	// Use it when neither label filtering nor attribute filtering meets the need.
	// If the template evaluates to true, the event is dropped.

	var defs = []string{
		"{{ $event := . }}",
		"{{ $labels := .TagsMap }}",
		"{{ $value := .TriggerValue }}",
	}

	text := strings.Join(append(defs, c.Content), "")

	tpl, err := texttemplate.New("eventdrop").Funcs(tplx.TemplateFuncMap).Parse(text)
	if err != nil {
		logger.Errorf("processor failed to parse template: %v event: %v", err, event)
		return event
	}

	var body bytes.Buffer
	if err = tpl.Execute(&body, event); err != nil {
		logger.Errorf("processor failed to execute template: %v event: %v", err, event)
		return event
	}

	result := strings.TrimSpace(body.String())
	logger.Infof("processor eventdrop result: %v", result)
	if result == "true" {
		logger.Infof("processor eventdrop drop event: %v", event)
		return nil
	}

	return event
}
|
||||
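The event_drop processor renders a user template against the event and drops the event when the trimmed output is exactly "true". Here is a minimal sketch of that decision using only the standard library. The `event` struct and `shouldDrop` function are hypothetical stand-ins, and the helper functions from `tplx.TemplateFuncMap` are deliberately left out.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// event is a hypothetical stand-in for models.AlertCurEvent with only
// the two fields that the predefined template variables read.
type event struct {
	TagsMap      map[string]string
	TriggerValue string
}

// shouldDrop renders content against ev and reports whether the trimmed
// output is exactly "true".
func shouldDrop(content string, ev *event) (bool, error) {
	// Same variable definitions the processor prepends to the user content.
	defs := []string{
		"{{ $event := . }}",
		"{{ $labels := .TagsMap }}",
		"{{ $value := .TriggerValue }}",
	}
	text := strings.Join(append(defs, content), "")

	tpl, err := template.New("eventdrop").Parse(text)
	if err != nil {
		return false, err
	}

	var body bytes.Buffer
	if err := tpl.Execute(&body, ev); err != nil {
		return false, err
	}
	return strings.TrimSpace(body.String()) == "true", nil
}

func main() {
	// Drop every event whose env label is "dev".
	content := `{{ if eq (index $labels "env") "dev" }}true{{ end }}`

	dev := &event{TagsMap: map[string]string{"env": "dev"}, TriggerValue: "91"}
	prod := &event{TagsMap: map[string]string{"env": "prod"}, TriggerValue: "91"}

	for _, ev := range []*event{dev, prod} {
		drop, err := shouldDrop(content, ev)
		if err != nil {
			panic(err)
		}
		fmt.Println(ev.TagsMap["env"], drop)
	}
}
```

In this sketch a parse or execution error is returned to the caller, whereas the processor in the diff logs the error and keeps the event.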
95
alert/pipeline/processor/eventupdate/event_update.go
Normal file
@@ -0,0 +1,95 @@
|
||||
package eventupdate
|
||||
|
||||
import (
|
||||
"crypto/tls"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/alert/pipeline/processor/callback"
|
||||
"github.com/ccfos/nightingale/v6/alert/pipeline/processor/common"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
// EventUpdateConfig is the configuration of the event_update processor; it embeds callback.HTTPConfig.
|
||||
type EventUpdateConfig struct {
|
||||
callback.HTTPConfig
|
||||
}
|
||||
|
||||
func init() {
|
||||
models.RegisterProcessor("event_update", &EventUpdateConfig{})
|
||||
}
|
||||
|
||||
func (c *EventUpdateConfig) Init(settings interface{}) (models.Processor, error) {
|
||||
result, err := common.InitProcessor[*EventUpdateConfig](settings)
|
||||
return result, err
|
||||
}
|
||||
|
||||
func (c *EventUpdateConfig) Process(ctx *ctx.Context, event *models.AlertCurEvent) *models.AlertCurEvent {
|
||||
if c.Client == nil {
|
||||
transport := &http.Transport{
|
||||
TLSClientConfig: &tls.Config{InsecureSkipVerify: c.SkipSSLVerify},
|
||||
}
|
||||
|
||||
if c.Proxy != "" {
|
||||
proxyURL, err := url.Parse(c.Proxy)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to parse proxy url: %v", err)
|
||||
} else {
|
||||
transport.Proxy = http.ProxyURL(proxyURL)
|
||||
}
|
||||
}
|
||||
|
||||
c.Client = &http.Client{
|
||||
Timeout: time.Duration(c.Timeout) * time.Millisecond,
|
||||
Transport: transport,
|
||||
}
|
||||
}
|
||||
|
||||
headers := make(map[string]string)
|
||||
headers["Content-Type"] = "application/json"
|
||||
for k, v := range c.Headers {
|
||||
headers[k] = v
|
||||
}
|
||||
|
||||
body, err := json.Marshal(event)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to marshal event: %v", err)
|
||||
return event
|
||||
}
|
||||
|
||||
req, err := http.NewRequest("POST", c.URL, strings.NewReader(string(body)))
|
||||
if err != nil {
|
||||
logger.Errorf("failed to create request: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
|
||||
for k, v := range headers {
|
||||
req.Header.Set(k, v)
|
||||
}
|
||||
|
||||
if c.AuthUsername != "" && c.AuthPassword != "" {
|
||||
req.SetBasicAuth(c.AuthUsername, c.AuthPassword)
|
||||
}
|
||||
|
||||
resp, err := c.Client.Do(req)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to send request: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
|
||||
b, err := io.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
logger.Errorf("failed to read response body: %v event: %v", err, event)
|
||||
return event
|
||||
}
|
||||
logger.Infof("response body: %s", string(b))
|
||||
|
||||
json.Unmarshal(b, &event)
|
||||
return event
|
||||
}
|
||||
107
alert/pipeline/processor/relabel/relabel.go
Normal file
@@ -0,0 +1,107 @@
|
||||
package relabel
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"regexp"
|
||||
"strings"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/alert/pipeline/processor/common"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/ccfos/nightingale/v6/pushgw/pconf"
|
||||
"github.com/ccfos/nightingale/v6/pushgw/writer"
|
||||
|
||||
"github.com/prometheus/common/model"
|
||||
"github.com/prometheus/prometheus/prompb"
|
||||
)
|
||||
|
||||
const (
|
||||
REPLACE_DOT = "___"
|
||||
)
|
||||
|
||||
// RelabelConfig
|
||||
type RelabelConfig struct {
|
||||
SourceLabels []string `json:"source_labels"`
|
||||
Separator string `json:"separator"`
|
||||
Regex string `json:"regex"`
|
||||
RegexCompiled *regexp.Regexp
|
||||
If string `json:"if"`
|
||||
IfRegex *regexp.Regexp
|
||||
Modulus uint64 `json:"modulus"`
|
||||
TargetLabel string `json:"target_label"`
|
||||
Replacement string `json:"replacement"`
|
||||
Action string `json:"action"`
|
||||
}
|
||||
|
||||
func init() {
|
||||
models.RegisterProcessor("relabel", &RelabelConfig{})
|
||||
}
|
||||
|
||||
func (r *RelabelConfig) Init(settings interface{}) (models.Processor, error) {
|
||||
result, err := common.InitProcessor[*RelabelConfig](settings)
|
||||
return result, err
|
||||
}
|
||||
|
||||
func (r *RelabelConfig) Process(ctx *ctx.Context, event *models.AlertCurEvent) *models.AlertCurEvent {
|
||||
sourceLabels := make([]model.LabelName, len(r.SourceLabels))
|
||||
for i := range r.SourceLabels {
|
||||
sourceLabels[i] = model.LabelName(strings.ReplaceAll(r.SourceLabels[i], ".", REPLACE_DOT))
|
||||
}
|
||||
|
||||
relabelConfigs := []*pconf.RelabelConfig{
|
||||
{
|
||||
SourceLabels: sourceLabels,
|
||||
Separator: r.Separator,
|
||||
Regex: r.Regex,
|
||||
RegexCompiled: r.RegexCompiled,
|
||||
If: r.If,
|
||||
IfRegex: r.IfRegex,
|
||||
Modulus: r.Modulus,
|
||||
TargetLabel: r.TargetLabel,
|
||||
Replacement: r.Replacement,
|
||||
Action: r.Action,
|
||||
},
|
||||
}
|
||||
|
||||
EventRelabel(event, relabelConfigs)
|
||||
return event
|
||||
}
|
||||
|
||||
func EventRelabel(event *models.AlertCurEvent, relabelConfigs []*pconf.RelabelConfig) {
|
||||
labels := make([]prompb.Label, len(event.TagsJSON))
|
||||
event.OriginalTagsJSON = make([]string, len(event.TagsJSON))
|
||||
for i, tag := range event.TagsJSON {
|
||||
label := strings.SplitN(tag, "=", 2)
|
||||
if len(label) != 2 {
|
||||
continue
|
||||
}
|
||||
event.OriginalTagsJSON[i] = tag
|
||||
|
||||
label[0] = strings.ReplaceAll(string(label[0]), ".", REPLACE_DOT)
|
||||
labels[i] = prompb.Label{Name: label[0], Value: label[1]}
|
||||
}
|
||||
|
||||
for i := 0; i < len(relabelConfigs); i++ {
|
||||
if relabelConfigs[i].Replacement == "" {
|
||||
relabelConfigs[i].Replacement = "$1"
|
||||
}
|
||||
|
||||
if relabelConfigs[i].Separator == "" {
|
||||
relabelConfigs[i].Separator = ";"
|
||||
}
|
||||
|
||||
if relabelConfigs[i].Regex == "" {
|
||||
relabelConfigs[i].Regex = "(.*)"
|
||||
}
|
||||
}
|
||||
|
||||
gotLabels := writer.Process(labels, relabelConfigs...)
|
||||
event.TagsJSON = make([]string, len(gotLabels))
|
||||
event.TagsMap = make(map[string]string, len(gotLabels))
|
||||
for i, label := range gotLabels {
|
||||
label.Name = strings.ReplaceAll(string(label.Name), REPLACE_DOT, ".")
|
||||
event.TagsJSON[i] = fmt.Sprintf("%s=%s", label.Name, label.Value)
|
||||
event.TagsMap[label.Name] = label.Value
|
||||
}
|
||||
event.Tags = strings.Join(event.TagsJSON, ",,")
|
||||
}
|
||||
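EventRelabel splits each "key=value" tag at the first "=", replaces "." in the key with "___" before relabeling, and restores the dots afterwards. The sketch below shows only that round trip, with the relabel step itself omitted. `label`, `parseTags`, and `formatTags` are hypothetical names. Unlike the code in the diff, which leaves a zero-value entry in the slice for a tag without "=", this sketch skips such tags entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceDot matches the REPLACE_DOT constant from the diff.
const replaceDot = "___"

// label is a minimal stand-in for prompb.Label.
type label struct {
	Name  string
	Value string
}

// parseTags splits each tag at the first "=" and escapes dots in the name.
// Tags without "=" are skipped.
func parseTags(tags []string) []label {
	labels := make([]label, 0, len(tags))
	for _, tag := range tags {
		kv := strings.SplitN(tag, "=", 2)
		if len(kv) != 2 {
			continue
		}
		labels = append(labels, label{
			Name:  strings.ReplaceAll(kv[0], ".", replaceDot),
			Value: kv[1],
		})
	}
	return labels
}

// formatTags restores dots in the names and rebuilds the "key=value" strings.
func formatTags(labels []label) []string {
	out := make([]string, len(labels))
	for i, l := range labels {
		name := strings.ReplaceAll(l.Name, replaceDot, ".")
		out[i] = fmt.Sprintf("%s=%s", name, l.Value)
	}
	return out
}

func main() {
	tags := []string{"host.name=web-1", "expr=a=b", "malformed"}
	labels := parseTags(tags)
	fmt.Println(labels)
	// The event's Tags field joins the entries with ",,".
	fmt.Println(strings.Join(formatTags(labels), ",,"))
}
```

One consequence of this escaping scheme is that a label name that already contains "___" comes back with a "." in its place.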
@@ -14,14 +14,13 @@ import (
|
||||
"github.com/ccfos/nightingale/v6/alert/common"
|
||||
"github.com/ccfos/nightingale/v6/alert/dispatch"
|
||||
"github.com/ccfos/nightingale/v6/alert/mute"
|
||||
"github.com/ccfos/nightingale/v6/alert/pipeline/processor/relabel"
|
||||
"github.com/ccfos/nightingale/v6/alert/queue"
|
||||
"github.com/ccfos/nightingale/v6/memsto"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/ccfos/nightingale/v6/pkg/tplx"
|
||||
"github.com/ccfos/nightingale/v6/pushgw/writer"
|
||||
|
||||
"github.com/prometheus/prometheus/prompb"
|
||||
"github.com/robfig/cron/v3"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/str"
|
||||
@@ -61,11 +60,9 @@ type Processor struct {
|
||||
pendingsUseByRecover *AlertCurEventMap
|
||||
inhibit bool
|
||||
|
||||
tagsMap map[string]string
|
||||
tagsArr []string
|
||||
target string
|
||||
targetNote string
|
||||
groupName string
|
||||
tagsMap map[string]string
|
||||
tagsArr []string
|
||||
groupName string
|
||||
|
||||
alertRuleCache *memsto.AlertRuleCacheType
|
||||
TargetCache *memsto.TargetCacheType
|
||||
@@ -154,7 +151,7 @@ func (p *Processor) Handle(anomalyPoints []models.AnomalyPoint, from string, inh
|
||||
eventsMap := make(map[string][]*models.AlertCurEvent)
|
||||
for _, anomalyPoint := range anomalyPoints {
|
||||
event := p.BuildEvent(anomalyPoint, from, now, ruleHash)
|
||||
event.NotifyRuleIDs = cachedRule.NotifyRuleIds
|
||||
event.NotifyRuleIds = cachedRule.NotifyRuleIds
|
||||
// A muted event is still effectively firing, so always add it to alertingKeys to prevent a firing event from being auto-recovered
|
||||
hash := event.Hash
|
||||
alertingKeys[hash] = struct{}{}
|
||||
@@ -196,7 +193,7 @@ func (p *Processor) Handle(anomalyPoints []models.AnomalyPoint, from string, inh
|
||||
|
||||
func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, now int64, ruleHash string) *models.AlertCurEvent {
|
||||
p.fillTags(anomalyPoint)
|
||||
p.mayHandleIdent()
|
||||
|
||||
hash := Hash(p.rule.Id, p.datasourceId, anomalyPoint)
|
||||
ds := p.datasourceCache.GetById(p.datasourceId)
|
||||
var dsName string
|
||||
@@ -216,8 +213,6 @@ func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, no
|
||||
event.DatasourceId = p.datasourceId
|
||||
event.Cluster = dsName
|
||||
event.Hash = hash
|
||||
event.TargetIdent = p.target
|
||||
event.TargetNote = p.targetNote
|
||||
event.TriggerValue = anomalyPoint.ReadableValue()
|
||||
event.TriggerValues = anomalyPoint.Values
|
||||
event.TriggerValuesJson = models.EventTriggerValues{ValuesWithUnit: anomalyPoint.ValuesUnit}
|
||||
@@ -249,15 +244,6 @@ func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, no
|
||||
logger.Warningf("unmarshal annotations json failed: %v, rule: %d", err, p.rule.Id)
|
||||
}
|
||||
|
||||
if p.target != "" {
|
||||
if pt, exist := p.TargetCache.Get(p.target); exist {
|
||||
pt.GroupNames = p.BusiGroupCache.GetNamesByBusiGroupIds(pt.GroupIds)
|
||||
event.Target = pt
|
||||
} else {
|
||||
logger.Infof("Target[ident: %s] doesn't exist in cache.", p.target)
|
||||
}
|
||||
}
|
||||
|
||||
if event.TriggerValues != "" && strings.Count(event.TriggerValues, "$") > 1 {
|
||||
// TriggerValues holds multiple variables; put all of them into TriggerValue
|
||||
event.TriggerValue = event.TriggerValues
|
||||
@@ -271,6 +257,19 @@ func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, no
|
||||
|
||||
// Apply relabeling immediately after the event is generated
|
||||
Relabel(p.rule, event)
|
||||
|
||||
// Placed after Relabel(p.rule, event) to handle the case where the ident label only appears after relabeling
|
||||
p.mayHandleIdent(event)
|
||||
|
||||
if event.TargetIdent != "" {
|
||||
if pt, exist := p.TargetCache.Get(event.TargetIdent); exist {
|
||||
pt.GroupNames = p.BusiGroupCache.GetNamesByBusiGroupIds(pt.GroupIds)
|
||||
event.Target = pt
|
||||
} else {
|
||||
logger.Infof("fill event target error, ident: %s doesn't exist in cache.", event.TargetIdent)
|
||||
}
|
||||
}
|
||||
|
||||
return event
|
||||
}
|
||||
|
||||
@@ -279,44 +278,15 @@ func Relabel(rule *models.AlertRule, event *models.AlertCurEvent) {
|
||||
return
|
||||
}
|
||||
|
||||
if len(rule.EventRelabelConfig) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
// need to keep the original label
|
||||
event.OriginalTags = event.Tags
|
||||
event.OriginalTagsJSON = make([]string, len(event.TagsJSON))
|
||||
|
||||
labels := make([]prompb.Label, len(event.TagsJSON))
|
||||
for i, tag := range event.TagsJSON {
|
||||
label := strings.SplitN(tag, "=", 2)
|
||||
event.OriginalTagsJSON[i] = tag
|
||||
labels[i] = prompb.Label{Name: label[0], Value: label[1]}
|
||||
if len(rule.EventRelabelConfig) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
for i := 0; i < len(rule.EventRelabelConfig); i++ {
|
||||
if rule.EventRelabelConfig[i].Replacement == "" {
|
||||
rule.EventRelabelConfig[i].Replacement = "$1"
|
||||
}
|
||||
|
||||
if rule.EventRelabelConfig[i].Separator == "" {
|
||||
rule.EventRelabelConfig[i].Separator = ";"
|
||||
}
|
||||
|
||||
if rule.EventRelabelConfig[i].Regex == "" {
|
||||
rule.EventRelabelConfig[i].Regex = "(.*)"
|
||||
}
|
||||
}
|
||||
|
||||
// relabel process
|
||||
relabels := writer.Process(labels, rule.EventRelabelConfig...)
|
||||
event.TagsJSON = make([]string, len(relabels))
|
||||
event.TagsMap = make(map[string]string, len(relabels))
|
||||
for i, label := range relabels {
|
||||
event.TagsJSON[i] = fmt.Sprintf("%s=%s", label.Name, label.Value)
|
||||
event.TagsMap[label.Name] = label.Value
|
||||
}
|
||||
event.Tags = strings.Join(event.TagsJSON, ",,")
|
||||
relabel.EventRelabel(event, rule.EventRelabelConfig)
|
||||
}
|
||||
|
||||
func (p *Processor) HandleRecover(alertingKeys map[string]struct{}, now int64, inhibit bool) {
|
||||
@@ -436,8 +406,8 @@ func (p *Processor) RecoverSingle(byRecover bool, hash string, now int64, value
|
||||
|
||||
func (p *Processor) handleEvent(events []*models.AlertCurEvent) {
|
||||
var fireEvents []*models.AlertCurEvent
|
||||
// severity starts at 4, so an event with a higher priority is guaranteed to be encountered
|
||||
severity := 4
|
||||
// severity starts at the lowest priority, so an event with a higher priority is guaranteed to be encountered
|
||||
severity := models.SeverityLowest
|
||||
for _, event := range events {
|
||||
if event == nil {
|
||||
continue
|
||||
@@ -567,7 +537,7 @@ func (p *Processor) RecoverAlertCurEventFromDb() {
|
||||
if alertRule == nil {
|
||||
continue
|
||||
}
|
||||
event.NotifyRuleIDs = alertRule.NotifyRuleIds
|
||||
event.NotifyRuleIds = alertRule.NotifyRuleIds
|
||||
|
||||
if event.Cate == models.HOST {
|
||||
target, exists := p.TargetCache.Get(event.TargetIdent)
|
||||
@@ -641,19 +611,19 @@ func (p *Processor) fillTags(anomalyPoint models.AnomalyPoint) {
|
||||
p.tagsArr = labelMapToArr(tagsMap)
|
||||
}
|
||||
|
||||
func (p *Processor) mayHandleIdent() {
|
||||
func (p *Processor) mayHandleIdent(event *models.AlertCurEvent) {
|
||||
// handle ident
|
||||
if ident, has := p.tagsMap["ident"]; has {
|
||||
if ident, has := event.TagsMap["ident"]; has {
|
||||
if target, exists := p.TargetCache.Get(ident); exists {
|
||||
p.target = target.Ident
|
||||
p.targetNote = target.Note
|
||||
event.TargetIdent = target.Ident
|
||||
event.TargetNote = target.Note
|
||||
} else {
|
||||
p.target = ident
|
||||
p.targetNote = ""
|
||||
event.TargetIdent = ident
|
||||
event.TargetNote = ""
|
||||
}
|
||||
} else {
|
||||
p.target = ""
|
||||
p.targetNote = ""
|
||||
event.TargetIdent = ""
|
||||
event.TargetNote = ""
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -39,7 +39,7 @@ func NewRecordRuleContext(rule *models.RecordingRule, datasourceId int64, promCl
|
||||
rule.CronPattern = fmt.Sprintf("@every %ds", rule.PromEvalInterval)
|
||||
}
|
||||
|
||||
rrc.scheduler = cron.New(cron.WithSeconds())
|
||||
rrc.scheduler = cron.New(cron.WithSeconds(), cron.WithChain(cron.SkipIfStillRunning(cron.DefaultLogger)))
|
||||
_, err := rrc.scheduler.AddFunc(rule.CronPattern, func() {
|
||||
rrc.Eval()
|
||||
})
|
||||
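The scheduler change above wraps jobs with `cron.SkipIfStillRunning`, so an evaluation that overruns its interval is not started a second time. The following is only a standard-library sketch of that idea, not the cron library's code: the wrapper name `skipIfStillRunning` and its boolean return value are assumptions made for illustration.

```go
package main

import "fmt"

// skipIfStillRunning wraps job so that a call made while a previous call is
// still in progress returns immediately without running job again.
// The returned bool reports whether job actually ran.
func skipIfStillRunning(job func()) func() bool {
	slot := make(chan struct{}, 1) // capacity 1: at most one run in flight
	return func() bool {
		select {
		case slot <- struct{}{}:
			defer func() { <-slot }() // free the slot when job returns
			job()
			return true
		default:
			return false // a previous run still holds the slot
		}
	}
}

func main() {
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan bool)

	calls := 0
	run := skipIfStillRunning(func() {
		calls++
		if calls == 1 {
			close(started)
			<-release // simulate a slow first evaluation
		}
	})

	go func() { done <- run() }()
	<-started
	fmt.Println("overlapping call ran:", run())
	close(release)
	fmt.Println("first call ran:", <-done)
	fmt.Println("later call ran:", run())
}
```

A skipped tick is simply dropped here; whether that is acceptable depends on how much evaluation jitter a recording rule can tolerate.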
@@ -56,12 +56,13 @@ func (rrc *RecordRuleContext) Key() string {
|
||||
}
|
||||
|
||||
func (rrc *RecordRuleContext) Hash() string {
|
||||
return str.MD5(fmt.Sprintf("%d_%s_%s_%d_%s",
|
||||
return str.MD5(fmt.Sprintf("%d_%s_%s_%d_%s_%s",
|
||||
rrc.rule.Id,
|
||||
rrc.rule.CronPattern,
|
||||
rrc.rule.PromQl,
|
||||
rrc.datasourceId,
|
||||
rrc.rule.AppendTags,
|
||||
rrc.rule.Name,
|
||||
))
|
||||
}
|
||||
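Adding `rrc.rule.Name` to the hashed string means that renaming a recording rule now yields a different hash. A minimal sketch of that effect, assuming `str.MD5` returns a hex-encoded MD5 digest and using a hypothetical `rule` struct in place of `models.RecordingRule` (the field types are guesses):

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// rule is a hypothetical stand-in for the fields of models.RecordingRule
// that feed the hash; the real field types may differ.
type rule struct {
	ID          int64
	CronPattern string
	PromQL      string
	AppendTags  string
	Name        string
}

// md5Hex returns the hex-encoded MD5 digest of s.
func md5Hex(s string) string {
	sum := md5.Sum([]byte(s))
	return hex.EncodeToString(sum[:])
}

// hashRule mirrors the six-field format string from the hunk above.
func hashRule(r rule, datasourceID int64) string {
	return md5Hex(fmt.Sprintf("%d_%s_%s_%d_%s_%s",
		r.ID, r.CronPattern, r.PromQL, datasourceID, r.AppendTags, r.Name))
}

func main() {
	a := rule{ID: 1, CronPattern: "@every 15s", PromQL: "up", AppendTags: "env=prod", Name: "up_copy"}
	b := a
	b.Name = "up_renamed"
	fmt.Println(hashRule(a, 7) == hashRule(b, 7))
}
```

If the caller compares hashes to decide whether a rule context must be rebuilt, a rename would then count as a change.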
|
||||
|
||||
@@ -30,12 +30,14 @@ type IbexCallBacker struct {
|
||||
|
||||
func (c *IbexCallBacker) CallBack(ctx CallBackContext) {
|
||||
if len(ctx.CallBackURL) == 0 || len(ctx.Events) == 0 {
|
||||
logger.Warningf("event_callback_ibex: url or events is empty, url: %s, events: %+v", ctx.CallBackURL, ctx.Events)
|
||||
return
|
||||
}
|
||||
|
||||
event := ctx.Events[0]
|
||||
|
||||
if event.IsRecovered {
|
||||
logger.Infof("event_callback_ibex: event is recovered, event: %+v", event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -43,8 +45,9 @@ func (c *IbexCallBacker) CallBack(ctx CallBackContext) {
|
||||
}
|
||||
|
||||
func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.AlertCurEvent) {
|
||||
logger.Infof("event_callback_ibex: url: %s, event: %+v", url, event)
|
||||
if imodels.DB() == nil && ctx.IsCenter {
|
||||
logger.Warning("event_callback_ibex: db is nil")
|
||||
logger.Warningf("event_callback_ibex: db is nil, event: %+v", event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -63,17 +66,23 @@ func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.
|
||||
|
||||
id, err := strconv.ParseInt(idstr, 10, 64)
|
||||
if err != nil {
|
||||
logger.Errorf("event_callback_ibex: failed to parse url: %s", url)
|
||||
logger.Errorf("event_callback_ibex: failed to parse url: %s event: %+v", url, event)
|
||||
return
|
||||
}
|
||||
|
||||
if host == "" {
|
||||
// the user did not pass a host in the callback url, so resolve it from the event
|
||||
host = event.TargetIdent
|
||||
|
||||
if host == "" {
|
||||
if ident, has := event.TagsMap["ident"]; has {
|
||||
host = ident
|
||||
}
|
||||
}
|
||||
}
|
||||
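The host lookup above falls back in three steps: the host parsed from the callback URL, then `event.TargetIdent`, then the `ident` tag. A minimal sketch of that order as one function, assuming a hypothetical `event` struct with only the two fields involved and a hypothetical helper name `resolveHost`:

```go
package main

import "fmt"

// event is a hypothetical stand-in for the two models.AlertCurEvent fields
// used by the fallback.
type event struct {
	TargetIdent string
	TagsMap     map[string]string
}

// resolveHost returns the first non-empty candidate in the order
// URL host, event.TargetIdent, TagsMap["ident"]. ok is false when all
// three are empty.
func resolveHost(urlHost string, ev event) (host string, ok bool) {
	if urlHost != "" {
		return urlHost, true
	}
	if ev.TargetIdent != "" {
		return ev.TargetIdent, true
	}
	if ident := ev.TagsMap["ident"]; ident != "" {
		return ident, true
	}
	return "", false
}

func main() {
	ev := event{TagsMap: map[string]string{"ident": "host-from-tag"}}
	host, ok := resolveHost("", ev)
	fmt.Println(host, ok)
}
```

Reading from a nil `TagsMap` is safe in Go and yields the empty string, so the helper needs no extra nil check.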
|
||||
if host == "" {
|
||||
logger.Error("event_callback_ibex: failed to get host")
|
||||
logger.Errorf("event_callback_ibex: failed to get host, id: %d, event: %+v", id, event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -83,21 +92,23 @@ func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.
|
||||
func CallIbex(ctx *ctx.Context, id int64, host string,
|
||||
taskTplCache *memsto.TaskTplCache, targetCache *memsto.TargetCacheType,
|
||||
userCache *memsto.UserCacheType, event *models.AlertCurEvent) {
|
||||
logger.Infof("event_callback_ibex: id: %d, host: %s, event: %+v", id, host, event)
|
||||
|
||||
tpl := taskTplCache.Get(id)
|
||||
if tpl == nil {
|
||||
logger.Errorf("event_callback_ibex: no such tpl(%d)", id)
|
||||
logger.Errorf("event_callback_ibex: no such tpl(%d), event: %+v", id, event)
|
||||
return
|
||||
}
|
||||
// check perm
|
||||
// check permission on the (tpl.GroupId, host, account) triple
|
||||
can, err := canDoIbex(tpl.UpdateBy, tpl, host, targetCache, userCache)
|
||||
if err != nil {
|
||||
logger.Errorf("event_callback_ibex: check perm fail: %v", err)
|
||||
logger.Errorf("event_callback_ibex: check perm fail: %v, event: %+v", err, event)
|
||||
return
|
||||
}
|
||||
|
||||
if !can {
|
||||
logger.Errorf("event_callback_ibex: user(%s) no permission", tpl.UpdateBy)
|
||||
logger.Errorf("event_callback_ibex: user(%s) no permission, event: %+v", tpl.UpdateBy, event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -122,7 +133,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
|
||||
|
||||
tags, err := json.Marshal(tagsMap)
|
||||
if err != nil {
|
||||
logger.Errorf("event_callback_ibex: failed to marshal tags to json: %v", tagsMap)
|
||||
logger.Errorf("event_callback_ibex: failed to marshal tags to json: %v, event: %+v", tagsMap, event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -145,7 +156,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
|
||||
|
||||
id, err = TaskAdd(in, tpl.UpdateBy, ctx.IsCenter)
|
||||
if err != nil {
|
||||
logger.Errorf("event_callback_ibex: call ibex fail: %v", err)
|
||||
logger.Errorf("event_callback_ibex: call ibex fail: %v, event: %+v", err, event)
|
||||
return
|
||||
}
|
||||
|
||||
@@ -167,7 +178,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
|
||||
}
|
||||
|
||||
if err = record.Add(ctx); err != nil {
|
||||
logger.Errorf("event_callback_ibex: persist task_record fail: %v", err)
|
||||
logger.Errorf("event_callback_ibex: persist task_record fail: %v, event: %+v", err, event)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -187,7 +198,7 @@ func canDoIbex(username string, tpl *models.TaskTpl, host string, targetCache *m
|
||||
|
||||
func TaskAdd(f models.TaskForm, authUser string, isCenter bool) (int64, error) {
|
||||
if storage.Cache == nil {
|
||||
logger.Warning("event_callback_ibex: redis cache is nil")
|
||||
logger.Warningf("event_callback_ibex: redis cache is nil, task: %+v", f)
|
||||
return 0, fmt.Errorf("redis cache is nil")
|
||||
}
|
||||
|
||||
|
||||
@@ -85,254 +85,221 @@ func MergeOperationConf() error {
|
||||
const (
|
||||
builtInOps = `
|
||||
ops:
|
||||
- name: dashboards
|
||||
cname: Dashboards
|
||||
ops:
|
||||
- name: "/dashboards"
|
||||
cname: View Dashboards
|
||||
- name: "/dashboards/add"
|
||||
cname: Add Dashboard
|
||||
- name: "/dashboards/put"
|
||||
cname: Modify Dashboard
|
||||
- name: "/dashboards/del"
|
||||
cname: Delete Dashboard
|
||||
- name: "/embedded-dashboards/put"
|
||||
cname: Modify Embedded Dashboard
|
||||
- name: "/embedded-dashboards"
|
||||
cname: View Embedded Dashboard
|
||||
- name: "/public-dashboards"
|
||||
cname: View Public Dashboard
|
||||
|
||||
- name: metric
|
||||
cname: Time Series Metrics
|
||||
ops:
|
||||
- name: "/metric/explorer"
|
||||
cname: View Metric Data
|
||||
- name: "/object/explorer"
|
||||
cname: View Object Data
|
||||
|
||||
- name: builtin-metrics
|
||||
cname: Metric Views
|
||||
ops:
|
||||
- name: "/metrics-built-in"
|
||||
cname: View Built-in Metrics
|
||||
- name: "/builtin-metrics/add"
|
||||
cname: Add Built-in Metric
|
||||
- name: "/builtin-metrics/put"
|
||||
cname: Modify Built-in Metric
|
||||
- name: "/builtin-metrics/del"
|
||||
cname: Delete Built-in Metric
|
||||
|
||||
- name: recording-rules
|
||||
cname: Recording Rule Management
|
||||
ops:
|
||||
- name: "/recording-rules"
|
||||
cname: View Recording Rules
|
||||
- name: "/recording-rules/add"
|
||||
cname: Add Recording Rule
|
||||
- name: "/recording-rules/put"
|
||||
cname: Modify Recording Rule
|
||||
- name: "/recording-rules/del"
|
||||
cname: Delete Recording Rule
|
||||
|
||||
- name: log
|
||||
cname: Log Analysis
|
||||
ops:
|
||||
- name: "/log/explorer"
|
||||
cname: View Logs
|
||||
- name: "/log/index-patterns"
|
||||
cname: View Index Patterns
|
||||
|
||||
- name: alert
|
||||
cname: Alert Rules
|
||||
ops:
|
||||
- name: "/alert-rules"
|
||||
cname: View Alert Rules
|
||||
- name: "/alert-rules/add"
|
||||
cname: Add Alert Rule
|
||||
- name: "/alert-rules/put"
|
||||
cname: Modify Alert Rule
|
||||
- name: "/alert-rules/del"
|
||||
cname: Delete Alert Rule
|
||||
|
||||
- name: alert-mutes
|
||||
cname: Alert Silence Management
|
||||
ops:
|
||||
- name: "/alert-mutes"
|
||||
cname: View Alert Silences
|
||||
- name: "/alert-mutes/add"
|
||||
cname: Add Alert Silence
|
||||
- name: "/alert-mutes/put"
|
||||
cname: Modify Alert Silence
|
||||
- name: "/alert-mutes/del"
|
||||
cname: Delete Alert Silence
|
||||
|
||||
- name: alert-subscribes
|
||||
cname: Alert Subscription Management
|
||||
ops:
|
||||
- name: "/alert-subscribes"
|
||||
cname: View Alert Subscriptions
|
||||
- name: "/alert-subscribes/add"
|
||||
cname: Add Alert Subscription
|
||||
- name: "/alert-subscribes/put"
|
||||
cname: Modify Alert Subscription
|
||||
- name: "/alert-subscribes/del"
|
||||
cname: Delete Alert Subscription
|
||||
|
||||
- name: alert-events
|
||||
cname: Alert Event Management
|
||||
ops:
|
||||
- name: "/alert-cur-events"
|
||||
cname: View Current Alerts
|
||||
- name: "/alert-cur-events/del"
|
||||
cname: Delete Current Alert
|
||||
- name: "/alert-his-events"
|
||||
cname: View Historical Alerts
|
||||
|
||||
- name: notification
|
||||
cname: Alert Notification
|
||||
ops:
|
||||
- name: "/help/notification-settings"
|
||||
cname: View Notification Settings
|
||||
- name: "/help/notification-tpls"
|
||||
cname: View Notification Templates
|
||||
|
||||
- name: job
|
||||
cname: Task Management
|
||||
ops:
|
||||
- name: "/job-tpls"
|
||||
cname: View Task Templates
|
||||
- name: "/job-tpls/add"
|
||||
cname: Add Task Template
|
||||
- name: "/job-tpls/put"
|
||||
cname: Modify Task Template
|
||||
- name: "/job-tpls/del"
|
||||
cname: Delete Task Template
|
||||
- name: "/job-tasks"
|
||||
cname: View Task Instances
|
||||
- name: "/job-tasks/add"
|
||||
cname: Add Task Instance
|
||||
- name: "/job-tasks/put"
|
||||
cname: Modify Task Instance
|
||||
|
||||
- name: targets
|
||||
- name: Infrastructure
|
||||
cname: Infrastructure
|
||||
ops:
|
||||
- name: "/targets"
|
||||
cname: View Objects
|
||||
- name: "/targets/add"
|
||||
cname: Add Object
|
||||
- name: "/targets/put"
|
||||
cname: Modify Object
|
||||
- name: "/targets/del"
|
||||
cname: Delete Object
|
||||
- name: "/targets/bind"
|
||||
cname: Bind Object
|
||||
- name: /targets
|
||||
cname: Host - View
|
||||
- name: /targets/put
|
||||
cname: Host - Modify
|
||||
- name: /targets/del
|
||||
cname: Host - Delete
|
||||
- name: /targets/bind
|
||||
cname: Host - Bind Uncategorized
|
||||
|
||||
- name: user
|
||||
cname: User Management
|
||||
- name: Explorer
|
||||
cname: Explorer
|
||||
ops:
|
||||
- name: "/users"
|
||||
cname: View User List
|
||||
- name: "/user-groups"
|
||||
cname: View User Groups
|
||||
- name: "/user-groups/add"
|
||||
cname: Add User Group
|
||||
- name: "/user-groups/put"
|
||||
cname: Modify User Group
|
||||
- name: "/user-groups/del"
|
||||
cname: Delete User Group
|
||||
- name: /metric/explorer
|
||||
cname: Metrics Explorer
|
||||
- name: /object/explorer
|
||||
cname: Quick View
|
||||
- name: /metrics-built-in
|
||||
cname: Built-in Metric - View
|
||||
- name: /builtin-metrics/add
|
||||
cname: Built-in Metric - Add
|
||||
- name: /builtin-metrics/put
|
||||
cname: Built-in Metric - Modify
|
||||
- name: /builtin-metrics/del
|
||||
cname: Built-in Metric - Delete
|
||||
- name: /recording-rules
|
||||
cname: Recording Rule - View
|
||||
- name: /recording-rules/add
|
||||
cname: Recording Rule - Add
|
||||
- name: /recording-rules/put
|
||||
cname: Recording Rule - Modify
|
||||
- name: /recording-rules/del
|
||||
cname: Recording Rule - Delete
|
||||
- name: /log/explorer
|
||||
cname: Logs Explorer
|
||||
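- name: /log/index-patterns # the frontend has a page for managing index patterns, so a permission point is needed to control it; this should later become a side drawer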
|
||||
cname: Index Pattern - View
|
||||
- name: /log/index-patterns/add
|
||||
cname: Index Pattern - Add
|
||||
- name: /log/index-patterns/put
|
||||
cname: Index Pattern - Modify
|
||||
- name: /log/index-patterns/del
|
||||
cname: Index Pattern - Delete
|
||||
- name: /dashboards
|
||||
cname: Dashboard - View
|
||||
- name: /dashboards/add
|
||||
cname: Dashboard - Add
|
||||
- name: /dashboards/put
|
||||
cname: Dashboard - Modify
|
||||
- name: /dashboards/del
|
||||
cname: Dashboard - Delete
|
||||
- name: /public-dashboards
|
||||
cname: Dashboard - View Public
|
||||
|
||||
- name: busi-groups
|
||||
cname: Business Group Management
|
||||
- name: alerting
|
||||
cname: Alerting
|
||||
ops:
|
||||
- name: "/busi-groups"
|
||||
cname: View Business Groups
|
||||
- name: "/busi-groups/add"
|
||||
cname: Add Business Group
|
||||
- name: "/busi-groups/put"
|
||||
cname: Modify Business Group
|
||||
- name: "/busi-groups/del"
|
||||
cname: Delete Business Group
|
||||
- name: /alert-rules
|
||||
cname: Alerting Rule - View
|
||||
- name: /alert-rules/add
|
||||
cname: Alerting Rule - Add
|
||||
- name: /alert-rules/put
|
||||
cname: Alerting Rule - Modify
|
||||
- name: /alert-rules/del
|
||||
cname: Alerting Rule - Delete
|
||||
- name: /alert-mutes
|
||||
cname: Mutting Rule - View
|
||||
- name: /alert-mutes/add
|
||||
cname: Mutting Rule - Add
|
||||
- name: /alert-mutes/put
|
||||
cname: Mutting Rule - Modify
|
||||
- name: /alert-mutes/del
|
||||
cname: Mutting Rule - Delete
|
||||
- name: /alert-subscribes
|
||||
cname: Subscribing Rule - View
|
||||
- name: /alert-subscribes/add
|
||||
cname: Subscribing Rule - Add
|
||||
- name: /alert-subscribes/put
|
||||
cname: Subscribing Rule - Modify
|
||||
- name: /alert-subscribes/del
|
||||
cname: Subscribing Rule - Delete
|
||||
- name: /job-tpls
|
||||
cname: Self-healing-Script - View
|
||||
- name: /job-tpls/add
|
||||
cname: Self-healing-Script - Add
|
||||
- name: /job-tpls/put
|
||||
cname: Self-healing-Script - Modify
|
||||
- name: /job-tpls/del
|
||||
cname: Self-healing-Script - Delete
|
||||
- name: /job-tasks
|
||||
cname: Self-healing-Job - View
|
||||
- name: /job-tasks/add
|
||||
cname: Self-healing-Job - Add
|
||||
- name: /job-tasks/put
|
||||
cname: Self-healing-Job - Modify
|
||||
- name: /alert-cur-events
|
||||
cname: Active Event - View
|
||||
- name: /alert-cur-events/del
|
||||
cname: Active Event - Delete
|
||||
- name: /alert-his-events
|
||||
cname: Historical Event - View
|
||||
|
||||
- name: permissions
|
||||
cname: Permission Management
|
||||
- name: Notification
|
||||
cname: Notification
|
||||
ops:
|
||||
- name: "/permissions"
|
||||
cname: View Permission Settings
|
||||
|
||||
- name: contacts
|
||||
cname: User Contact Management
|
||||
ops:
|
||||
- name: "/contacts"
|
||||
cname: User Contact Management
|
||||
- name: /notification-rules
|
||||
cname: Notification Rule - View
|
||||
- name: /notification-rules/add
|
||||
cname: Notification Rule - Add
|
||||
- name: /notification-rules/put
|
||||
cname: Notification Rule - Modify
|
||||
- name: /notification-rules/del
|
||||
cname: Notification Rule - Delete
|
||||
- name: /notification-channels
|
||||
cname: Media Type - View
|
||||
- name: /notification-channels/add
|
||||
cname: Media Type - Add
|
||||
- name: /notification-channels/put
|
||||
cname: Media Type - Modify
|
||||
- name: /notification-channels/del
|
||||
cname: Media Type - Delete
|
||||
- name: /notification-templates
|
||||
cname: Message Template - View
|
||||
- name: /notification-templates/add
|
||||
cname: Message Template - Add
|
||||
- name: /notification-templates/put
|
||||
cname: Message Template - Modify
|
||||
- name: /notification-templates/del
|
||||
cname: Message Template - Delete
|
||||
- name: /event-pipelines
|
||||
cname: Event Pipeline - View
|
||||
- name: /event-pipelines/add
|
||||
cname: Event Pipeline - Add
|
||||
- name: /event-pipelines/put
|
||||
cname: Event Pipeline - Modify
|
||||
- name: /event-pipelines/del
|
||||
cname: Event Pipeline - Delete
|
||||
- name: /help/notification-settings # controls whether the legacy notification settings menu is shown
|
||||
cname: Notification Settings - View
|
||||
- name: /help/notification-tpls # controls whether the legacy notification templates menu is shown
|
||||
cname: Notification Templates - View
|
||||
|
||||
- name: built-in-components
|
||||
cname: Template Center
|
||||
- name: Integrations
|
||||
cname: Integrations
|
||||
ops:
|
||||
- name: "/built-in-components"
|
||||
cname: View Built-in Components
|
||||
- name: "/built-in-components/add"
|
||||
cname: Add Built-in Component
|
||||
- name: "/built-in-components/put"
|
||||
cname: Modify Built-in Component
|
||||
- name: "/built-in-components/del"
|
||||
cname: Delete Built-in Component
|
||||
- name: /datasources # controls whether the data source list page menu is visible; only Admin can modify or delete data sources
|
||||
cname: Data Source - View
|
||||
- name: /components
|
||||
cname: Component - View
|
||||
- name: /components/add
|
||||
cname: Component - Add
|
||||
- name: /components/put
|
||||
cname: Component - Modify
|
||||
- name: /components/del
|
||||
cname: Component - Delete
|
||||
- name: /embedded-products
|
||||
cname: Embedded Product - View
|
||||
- name: /embedded-product/add
|
||||
cname: Embedded Product - Add
|
||||
- name: /embedded-product/put
|
||||
cname: Embedded Product - Modify
|
||||
- name: /embedded-product/delete
|
||||
cname: Embedded Product - Delete
|
||||
|
||||
- name: datasource
|
||||
cname: Data Source Management
|
||||
- name: Organization
|
||||
cname: Organization
|
||||
ops:
|
||||
- name: "/help/source"
|
||||
cname: View Data Source Configuration
|
||||
- name: /users
|
||||
cname: User - View
|
||||
- name: /users/add
|
||||
cname: User - Add
|
||||
- name: /users/put
|
||||
cname: User - Modify
|
||||
- name: /users/del
|
||||
cname: User - Delete
|
||||
- name: /user-groups
|
||||
cname: Team - View
|
||||
- name: /user-groups/add
|
||||
cname: Team - Add
|
||||
- name: /user-groups/put
|
||||
cname: Team - Modify
|
||||
- name: /user-groups/del
|
||||
cname: Team - Delete
|
||||
- name: /busi-groups
|
||||
cname: Business Group - View
|
||||
- name: /busi-groups/add
|
||||
cname: Business Group - Add
|
||||
- name: /busi-groups/put
|
||||
cname: Business Group - Modify
|
||||
- name: /busi-groups/del
|
||||
cname: Business Group - Delete
|
||||
- name: /roles
|
||||
cname: Role - View
|
||||
- name: /roles/add
|
||||
cname: Role - Add
|
||||
- name: /roles/put
|
||||
cname: Role - Modify
|
||||
- name: /roles/del
|
||||
cname: Role - Delete
|
||||
|
||||
- name: system
|
||||
cname: System Information
|
||||
- name: System Settings
|
||||
cname: System Settings
|
||||
ops:
|
||||
- name: "/help/variable-configs"
|
||||
cname: View Variable Configuration
|
||||
- name: "/help/version"
|
||||
cname: View Version Information
|
||||
- name: "/help/servers"
|
||||
cname: View Server Information
|
||||
- name: "/help/sso"
|
||||
cname: View SSO Configuration
|
||||
- name: "/site-settings"
|
||||
- name: /system/site-settings # only controls whether the menu is shown; only Admin can modify or delete
|
||||
cname: View Site Settings
|
||||
- name: /system/variable-settings
|
||||
cname: View Variable Settings
|
||||
- name: /system/sso-settings
|
||||
cname: View SSO Settings
|
||||
- name: /system/alerting-engines
|
||||
cname: View Alerting Engines
|
||||
- name: /system/version
|
||||
cname: View Product Version
|
||||
|
||||
- name: message-templates
|
||||
cname: Message Templates
|
||||
ops:
|
||||
- name: "/notification-templates"
|
||||
cname: View Message Templates
|
||||
- name: "/notification-templates/add"
|
||||
cname: Add Message Templates
|
||||
- name: "/notification-templates/put"
|
||||
cname: Modify Message Templates
|
||||
- name: "/notification-templates/del"
|
||||
cname: Delete Message Templates
|
||||
|
||||
- name: notify-rules
|
||||
cname: Notify Rules
|
||||
ops:
|
||||
- name: "/notification-rules"
|
||||
cname: View Notify Rules
|
||||
- name: "/notification-rules/add"
|
||||
cname: Add Notify Rules
|
||||
- name: "/notification-rules/put"
|
||||
cname: Modify Notify Rules
|
||||
- name: "/notification-rules/del"
|
||||
cname: Delete Notify Rules
|
||||
|
||||
- name: notify-channels
|
||||
cname: Notify Channels
|
||||
ops:
|
||||
- name: "/notification-channels"
|
||||
cname: View Notify Channels
|
||||
- name: "/notification-channels/add"
|
||||
cname: Add Notify Channels
|
||||
- name: "/notification-channels/put"
|
||||
cname: Modify Notify Channels
|
||||
- name: "/notification-channels/del"
|
||||
cname: Delete Notify Channels
|
||||
`
|
||||
)
|
||||
|
||||
@@ -25,4 +25,10 @@ var Plugins = []Plugin{
|
||||
Type: "tdengine",
|
||||
TypeName: "TDengine",
|
||||
},
|
||||
{
|
||||
Id: 5,
|
||||
Category: "logging",
|
||||
Type: "ck",
|
||||
TypeName: "ClickHouse",
|
||||
},
|
||||
}
|
||||
|
||||
@@ -13,7 +13,6 @@ import (
|
||||
alertrt "github.com/ccfos/nightingale/v6/alert/router"
|
||||
"github.com/ccfos/nightingale/v6/center/cconf"
|
||||
"github.com/ccfos/nightingale/v6/center/cconf/rsa"
|
||||
"github.com/ccfos/nightingale/v6/center/cstats"
|
||||
"github.com/ccfos/nightingale/v6/center/integration"
|
||||
"github.com/ccfos/nightingale/v6/center/metas"
|
||||
centerrt "github.com/ccfos/nightingale/v6/center/router"
|
||||
@@ -60,7 +59,6 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
|
||||
}
|
||||
|
||||
i18nx.Init(configDir)
|
||||
cstats.Init()
|
||||
flashduty.Init(config.Center.FlashDuty)
|
||||
|
||||
db, err := storage.New(config.DB)
|
||||
@@ -86,7 +84,7 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
|
||||
}
|
||||
|
||||
metas := metas.New(redis)
|
||||
idents := idents.New(ctx, redis)
|
||||
idents := idents.New(ctx, redis, config.Pushgw)
|
||||
|
||||
syncStats := memsto.NewSyncStats()
|
||||
alertStats := astats.NewSyncStats()
|
||||
@@ -94,6 +92,9 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
|
||||
if config.Center.MigrateBusiGroupLabel || models.CanMigrateBg(ctx) {
|
||||
models.MigrateBg(ctx, config.Pushgw.BusiGroupLabelKey)
|
||||
}
|
||||
if models.CanMigrateEP(ctx) {
|
||||
models.MigrateEP(ctx)
|
||||
}
|
||||
|
||||
configCache := memsto.NewConfigCache(ctx, syncStats, config.HTTP.RSA.RSAPrivateKey, config.HTTP.RSA.RSAPassWord)
|
||||
busiGroupCache := memsto.NewBusiGroupCache(ctx, syncStats)
|
||||
|
||||
@@ -6,40 +6,49 @@ import (
|
||||
"github.com/prometheus/client_golang/prometheus"
|
||||
)
|
||||
|
||||
const Service = "n9e-center"
|
||||
const (
|
||||
namespace = "n9e"
|
||||
subsystem = "center"
|
||||
)
|
||||
|
||||
var (
|
||||
labels = []string{"service", "code", "path", "method"}
|
||||
|
||||
uptime = prometheus.NewCounterVec(
|
||||
uptime = prometheus.NewCounter(
|
||||
prometheus.CounterOpts{
|
||||
Name: "uptime",
|
||||
Help: "HTTP service uptime.",
|
||||
}, []string{"service"},
|
||||
)
|
||||
|
||||
RequestCounter = prometheus.NewCounterVec(
|
||||
prometheus.CounterOpts{
|
||||
Name: "http_request_count_total",
|
||||
Help: "Total number of HTTP requests made.",
|
||||
}, labels,
|
||||
Namespace: namespace,
|
||||
Subsystem: subsystem,
|
||||
Name: "uptime",
|
||||
Help: "HTTP service uptime.",
|
||||
},
|
||||
)
|
||||
|
||||
RequestDuration = prometheus.NewHistogramVec(
|
||||
prometheus.HistogramOpts{
|
||||
Buckets: []float64{.01, .1, 1, 10},
|
||||
Name: "http_request_duration_seconds",
|
||||
Help: "HTTP request latencies in seconds.",
|
||||
}, labels,
|
||||
Namespace: namespace,
|
||||
Subsystem: subsystem,
|
||||
Buckets: prometheus.DefBuckets,
|
||||
Name: "http_request_duration_seconds",
|
||||
Help: "HTTP request latencies in seconds.",
|
||||
}, []string{"code", "path", "method"},
|
||||
)
|
||||
|
||||
RedisOperationLatency = prometheus.NewHistogramVec(
|
||||
prometheus.HistogramOpts{
|
||||
Namespace: namespace,
|
||||
Subsystem: subsystem,
|
||||
Name: "redis_operation_latency_seconds",
|
||||
Help: "Histogram of latencies for Redis operations",
|
||||
Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5},
|
||||
},
|
||||
[]string{"operation", "status"},
|
||||
)
|
||||
)
|
||||
|
||||
func Init() {
|
||||
func init() {
|
||||
// Register the summary and the histogram with Prometheus's default registry.
|
||||
prometheus.MustRegister(
|
||||
uptime,
|
||||
RequestCounter,
|
||||
RequestDuration,
|
||||
RedisOperationLatency,
|
||||
)
|
||||
|
||||
go recordUptime()
|
||||
@@ -48,6 +57,6 @@ func Init() {
|
||||
// recordUptime increases service uptime per second.
|
||||
func recordUptime() {
|
||||
for range time.Tick(time.Second) {
|
||||
uptime.WithLabelValues(Service).Inc()
|
||||
uptime.Inc()
|
||||
}
|
||||
}
|
||||
|
||||
@@ -6,6 +6,7 @@ import (
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/center/cstats"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/storage"
|
||||
|
||||
@@ -115,15 +116,23 @@ func (s *Set) updateTargets(m map[string]models.HostMeta) error {
|
||||
}
|
||||
newMap[models.WrapIdent(ident)] = meta
|
||||
}
|
||||
|
||||
start := time.Now()
|
||||
err := storage.MSet(context.Background(), s.redis, newMap)
|
||||
if err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("mset_target_meta", "fail").Observe(time.Since(start).Seconds())
|
||||
return err
|
||||
} else {
|
||||
cstats.RedisOperationLatency.WithLabelValues("mset_target_meta", "success").Observe(time.Since(start).Seconds())
|
||||
}
|
||||
|
||||
if len(extendMap) > 0 {
|
||||
err = storage.MSet(context.Background(), s.redis, extendMap)
|
||||
if err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("mset_target_extend", "fail").Observe(time.Since(start).Seconds())
|
||||
return err
|
||||
} else {
|
||||
cstats.RedisOperationLatency.WithLabelValues("mset_target_extend", "success").Observe(time.Since(start).Seconds())
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -93,10 +93,9 @@ func stat() gin.HandlerFunc {
|
||||
|
||||
code := fmt.Sprintf("%d", c.Writer.Status())
|
||||
method := c.Request.Method
|
||||
labels := []string{cstats.Service, code, c.FullPath(), method}
|
||||
labels := []string{code, c.FullPath(), method}
|
||||
|
||||
cstats.RequestCounter.WithLabelValues(labels...).Inc()
|
||||
cstats.RequestDuration.WithLabelValues(labels...).Observe(float64(time.Since(start).Seconds()))
|
||||
cstats.RequestDuration.WithLabelValues(labels...).Observe(time.Since(start).Seconds())
|
||||
}
|
||||
}
|
||||
|
||||
@@ -265,11 +264,11 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.DELETE("/self/token/:id", rt.auth(), rt.user(), rt.deleteToken)
|
||||
|
||||
pages.GET("/users", rt.auth(), rt.user(), rt.perm("/users"), rt.userGets)
|
||||
pages.POST("/users", rt.auth(), rt.admin(), rt.userAddPost)
|
||||
pages.POST("/users", rt.auth(), rt.user(), rt.perm("/users/add"), rt.userAddPost)
|
||||
pages.GET("/user/:id/profile", rt.auth(), rt.userProfileGet)
|
||||
pages.PUT("/user/:id/profile", rt.auth(), rt.admin(), rt.userProfilePut)
|
||||
pages.PUT("/user/:id/password", rt.auth(), rt.admin(), rt.userPasswordPut)
|
||||
pages.DELETE("/user/:id", rt.auth(), rt.admin(), rt.userDel)
|
||||
pages.PUT("/user/:id/profile", rt.auth(), rt.user(), rt.perm("/users/put"), rt.userProfilePut)
|
||||
pages.PUT("/user/:id/password", rt.auth(), rt.user(), rt.perm("/users/put"), rt.userPasswordPut)
|
||||
pages.DELETE("/user/:id", rt.auth(), rt.user(), rt.perm("/users/del"), rt.userDel)
|
||||
|
||||
pages.GET("/metric-views", rt.auth(), rt.metricViewGets)
|
||||
pages.DELETE("/metric-views", rt.auth(), rt.user(), rt.metricViewDel)
|
||||
@@ -390,6 +389,7 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.PUT("/busi-group/:id/alert-mute/:amid", rt.auth(), rt.user(), rt.perm("/alert-mutes/put"), rt.alertMutePutByFE)
|
||||
pages.GET("/busi-group/:id/alert-mute/:amid", rt.auth(), rt.user(), rt.perm("/alert-mutes"), rt.alertMuteGet)
|
||||
pages.PUT("/busi-group/:id/alert-mutes/fields", rt.auth(), rt.user(), rt.perm("/alert-mutes/put"), rt.bgrw(), rt.alertMutePutFields)
|
||||
pages.POST("/alert-mute-tryrun", rt.auth(), rt.user(), rt.perm("/alert-mutes/add"), rt.alertMuteTryRun)
|
||||
|
||||
pages.GET("/busi-groups/alert-subscribes", rt.auth(), rt.user(), rt.perm("/alert-subscribes"), rt.alertSubscribeGetsByGids)
|
||||
pages.GET("/busi-group/:id/alert-subscribes", rt.auth(), rt.user(), rt.perm("/alert-subscribes"), rt.bgro(), rt.alertSubscribeGets)
|
||||
@@ -444,13 +444,13 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.POST("/datasource/status/update", rt.auth(), rt.admin(), rt.datasourceUpdataStatus)
|
||||
pages.DELETE("/datasource/", rt.auth(), rt.admin(), rt.datasourceDel)
|
||||
|
||||
pages.GET("/roles", rt.auth(), rt.admin(), rt.roleGets)
|
||||
pages.POST("/roles", rt.auth(), rt.admin(), rt.roleAdd)
|
||||
pages.PUT("/roles", rt.auth(), rt.admin(), rt.rolePut)
|
||||
pages.DELETE("/role/:id", rt.auth(), rt.admin(), rt.roleDel)
|
||||
pages.GET("/roles", rt.auth(), rt.user(), rt.perm("/roles"), rt.roleGets)
|
||||
pages.POST("/roles", rt.auth(), rt.user(), rt.perm("/roles/add"), rt.roleAdd)
|
||||
pages.PUT("/roles", rt.auth(), rt.user(), rt.perm("/roles/put"), rt.rolePut)
|
||||
pages.DELETE("/role/:id", rt.auth(), rt.user(), rt.perm("/roles/del"), rt.roleDel)
|
||||
|
||||
pages.GET("/role/:id/ops", rt.auth(), rt.admin(), rt.operationOfRole)
|
||||
pages.PUT("/role/:id/ops", rt.auth(), rt.admin(), rt.roleBindOperation)
|
||||
pages.GET("/role/:id/ops", rt.auth(), rt.user(), rt.perm("/roles"), rt.operationOfRole)
|
||||
pages.PUT("/role/:id/ops", rt.auth(), rt.user(), rt.perm("/roles/put"), rt.roleBindOperation)
|
||||
pages.GET("/operation", rt.operations)
|
||||
|
||||
pages.GET("/notify-tpls", rt.auth(), rt.user(), rt.notifyTplGets)
|
||||
@@ -472,7 +472,7 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.GET("/notify-channel", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyChannelGets)
|
||||
pages.PUT("/notify-channel", rt.auth(), rt.admin(), rt.notifyChannelPuts)
|
||||
|
||||
pages.GET("/notify-contact", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyContactGets)
|
||||
pages.GET("/notify-contact", rt.auth(), rt.user(), rt.notifyContactGets)
|
||||
pages.PUT("/notify-contact", rt.auth(), rt.admin(), rt.notifyContactPuts)
|
||||
|
||||
pages.GET("/notify-config", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyConfigGet)
|
||||
@@ -481,13 +481,20 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
|
||||
pages.GET("/es-index-pattern", rt.auth(), rt.esIndexPatternGet)
|
||||
pages.GET("/es-index-pattern-list", rt.auth(), rt.esIndexPatternGetList)
|
||||
pages.POST("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternAdd)
|
||||
pages.PUT("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternPut)
|
||||
pages.DELETE("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternDel)
|
||||
pages.POST("/es-index-pattern", rt.auth(), rt.user(), rt.perm("/log/index-patterns/add"), rt.esIndexPatternAdd)
|
||||
pages.PUT("/es-index-pattern", rt.auth(), rt.user(), rt.perm("/log/index-patterns/put"), rt.esIndexPatternPut)
|
||||
pages.DELETE("/es-index-pattern", rt.auth(), rt.user(), rt.perm("/log/index-patterns/del"), rt.esIndexPatternDel)
|
||||
|
||||
pages.GET("/embedded-dashboards", rt.auth(), rt.user(), rt.perm("/embedded-dashboards"), rt.embeddedDashboardsGet)
|
||||
pages.PUT("/embedded-dashboards", rt.auth(), rt.user(), rt.perm("/embedded-dashboards/put"), rt.embeddedDashboardsPut)
|
||||
|
||||
// get the embedded-product list
|
||||
pages.GET("/embedded-product", rt.auth(), rt.user(), rt.embeddedProductGets)
|
||||
pages.GET("/embedded-product/:id", rt.auth(), rt.user(), rt.embeddedProductGet)
|
||||
pages.POST("/embedded-product", rt.auth(), rt.user(), rt.perm("/embedded-product/add"), rt.embeddedProductAdd)
|
||||
pages.PUT("/embedded-product/:id", rt.auth(), rt.user(), rt.perm("/embedded-product/put"), rt.embeddedProductPut)
|
||||
pages.DELETE("/embedded-product/:id", rt.auth(), rt.user(), rt.perm("/embedded-product/delete"), rt.embeddedProductDelete)
|
||||
|
||||
pages.GET("/user-variable-configs", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigGets)
|
||||
pages.POST("/user-variable-config", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigAdd)
|
||||
pages.PUT("/user-variable-config/:id", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigPut)
|
||||
@@ -497,20 +504,23 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.PUT("/config", rt.auth(), rt.admin(), rt.configPutByKey)
|
||||
pages.GET("/site-info", rt.siteInfo)
|
||||
|
||||
// Routes related to source tokens
|
||||
pages.POST("/source-token", rt.auth(), rt.user(), rt.sourceTokenAdd)
|
||||
|
||||
// for admin api
|
||||
pages.GET("/user/busi-groups", rt.auth(), rt.admin(), rt.userBusiGroupsGets)
|
||||
|
||||
pages.GET("/builtin-components", rt.auth(), rt.user(), rt.builtinComponentsGets)
|
||||
pages.POST("/builtin-components", rt.auth(), rt.user(), rt.perm("/built-in-components/add"), rt.builtinComponentsAdd)
|
||||
pages.PUT("/builtin-components", rt.auth(), rt.user(), rt.perm("/built-in-components/put"), rt.builtinComponentsPut)
|
||||
pages.DELETE("/builtin-components", rt.auth(), rt.user(), rt.perm("/built-in-components/del"), rt.builtinComponentsDel)
|
||||
pages.POST("/builtin-components", rt.auth(), rt.user(), rt.perm("/components/add"), rt.builtinComponentsAdd)
|
||||
pages.PUT("/builtin-components", rt.auth(), rt.user(), rt.perm("/components/put"), rt.builtinComponentsPut)
|
||||
pages.DELETE("/builtin-components", rt.auth(), rt.user(), rt.perm("/components/del"), rt.builtinComponentsDel)
|
||||
|
||||
pages.GET("/builtin-payloads", rt.auth(), rt.user(), rt.builtinPayloadsGets)
|
||||
pages.GET("/builtin-payloads/cates", rt.auth(), rt.user(), rt.builtinPayloadcatesGet)
|
||||
pages.POST("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/built-in-components/add"), rt.builtinPayloadsAdd)
|
||||
pages.GET("/builtin-payload/:id", rt.auth(), rt.user(), rt.perm("/built-in-components"), rt.builtinPayloadGet)
|
||||
pages.PUT("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/built-in-components/put"), rt.builtinPayloadsPut)
|
||||
pages.DELETE("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/built-in-components/del"), rt.builtinPayloadsDel)
|
||||
pages.POST("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/components/add"), rt.builtinPayloadsAdd)
|
||||
pages.GET("/builtin-payload/:id", rt.auth(), rt.user(), rt.perm("/components"), rt.builtinPayloadGet)
|
||||
pages.PUT("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/components/put"), rt.builtinPayloadsPut)
|
||||
pages.DELETE("/builtin-payloads", rt.auth(), rt.user(), rt.perm("/components/del"), rt.builtinPayloadsDel)
|
||||
pages.GET("/builtin-payload", rt.auth(), rt.user(), rt.builtinPayloadsGetByUUIDOrID)
|
||||
|
||||
pages.POST("/message-templates", rt.auth(), rt.user(), rt.perm("/notification-templates/add"), rt.messageTemplatesAdd)
|
||||
@@ -527,6 +537,16 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
pages.GET("/notify-rules", rt.auth(), rt.user(), rt.perm("/notification-rules"), rt.notifyRulesGet)
|
||||
pages.POST("/notify-rule/test", rt.auth(), rt.user(), rt.perm("/notification-rules"), rt.notifyTest)
|
||||
pages.GET("/notify-rule/custom-params", rt.auth(), rt.user(), rt.perm("/notification-rules"), rt.notifyRuleCustomParamsGet)
|
||||
pages.POST("/notify-rule/event-pipelines-tryrun", rt.auth(), rt.user(), rt.perm("/notification-rules/add"), rt.tryRunEventProcessorByNotifyRule)
|
||||
|
||||
// Routes related to event pipelines
|
||||
pages.GET("/event-pipelines", rt.auth(), rt.user(), rt.perm("/event-pipelines"), rt.eventPipelinesList)
|
||||
pages.POST("/event-pipeline", rt.auth(), rt.user(), rt.perm("/event-pipelines/add"), rt.addEventPipeline)
|
||||
pages.PUT("/event-pipeline", rt.auth(), rt.user(), rt.perm("/event-pipelines/put"), rt.updateEventPipeline)
|
||||
pages.GET("/event-pipeline/:id", rt.auth(), rt.user(), rt.perm("/event-pipelines"), rt.getEventPipeline)
|
||||
pages.DELETE("/event-pipelines", rt.auth(), rt.user(), rt.perm("/event-pipelines/del"), rt.deleteEventPipelines)
|
||||
pages.POST("/event-pipeline-tryrun", rt.auth(), rt.user(), rt.perm("/event-pipelines"), rt.tryRunEventPipeline)
|
||||
pages.POST("/event-processor-tryrun", rt.auth(), rt.user(), rt.perm("/event-pipelines"), rt.tryRunEventProcessor)
|
||||
|
||||
pages.POST("/notify-channel-configs", rt.auth(), rt.user(), rt.perm("/notification-channels/add"), rt.notifyChannelsAdd)
|
||||
pages.DELETE("/notify-channel-configs", rt.auth(), rt.user(), rt.perm("/notification-channels/del"), rt.notifyChannelsDel)
|
||||
@@ -647,6 +667,7 @@ func (rt *Router) Config(r *gin.Engine) {
|
||||
|
||||
service.GET("/message-templates", rt.messageTemplateGets)
|
||||
|
||||
service.GET("/event-pipelines", rt.eventPipelinesListByService)
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -1,50 +1,54 @@
|
||||
package router
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"net/http"
|
||||
"sort"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
)
|
||||
|
||||
func parseAggrRules(c *gin.Context) []*models.AggrRule {
|
||||
aggrRules := strings.Split(ginx.QueryStr(c, "rule", ""), "::") // e.g. field:group_name::field:severity::tagkey:ident
|
||||
|
||||
if len(aggrRules) == 0 {
|
||||
ginx.Bomb(http.StatusBadRequest, "rule empty")
|
||||
func getUserGroupIds(ctx *gin.Context, rt *Router, myGroups bool) ([]int64, error) {
|
||||
if !myGroups {
|
||||
return nil, nil
|
||||
}
|
||||
|
||||
rules := make([]*models.AggrRule, len(aggrRules))
|
||||
for i := 0; i < len(aggrRules); i++ {
|
||||
pair := strings.Split(aggrRules[i], ":")
|
||||
if len(pair) != 2 {
|
||||
ginx.Bomb(http.StatusBadRequest, "rule invalid")
|
||||
}
|
||||
|
||||
if !(pair[0] == "field" || pair[0] == "tagkey") {
|
||||
ginx.Bomb(http.StatusBadRequest, "rule invalid")
|
||||
}
|
||||
|
||||
rules[i] = &models.AggrRule{
|
||||
Type: pair[0],
|
||||
Value: pair[1],
|
||||
}
|
||||
}
|
||||
|
||||
return rules
|
||||
me := ctx.MustGet("user").(*models.User)
|
||||
return models.MyGroupIds(rt.Ctx, me.Id)
|
||||
}
|
||||
|
||||
func (rt *Router) alertCurEventsCard(c *gin.Context) {
|
||||
stime, etime := getTimeRange(c)
|
||||
severity := ginx.QueryInt(c, "severity", -1)
|
||||
severity := strx.IdsInt64ForAPI(ginx.QueryStr(c, "severity", ""), ",")
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
myGroups := ginx.QueryBool(c, "my_groups", false) // whether to show only the user's own groups; defaults to false
|
||||
|
||||
var gids []int64
|
||||
var err error
|
||||
if myGroups {
|
||||
gids, err = getUserGroupIds(c, rt, myGroups)
|
||||
ginx.Dangerous(err)
|
||||
if len(gids) == 0 {
|
||||
gids = append(gids, -1)
|
||||
}
|
||||
}
|
||||
|
||||
viewId := ginx.QueryInt64(c, "view_id")
|
||||
|
||||
alertView, err := models.GetAlertAggrViewByViewID(rt.Ctx, viewId)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if alertView == nil {
|
||||
ginx.Bomb(http.StatusNotFound, "alert aggr view not found")
|
||||
}
|
||||
|
||||
dsIds := queryDatasourceIds(c)
|
||||
rules := parseAggrRules(c)
|
||||
|
||||
prod := ginx.QueryStr(c, "prods", "")
|
||||
if prod == "" {
|
||||
@@ -61,17 +65,18 @@ func (rt *Router) alertCurEventsCard(c *gin.Context) {
|
||||
cates = strings.Split(cate, ",")
|
||||
}
|
||||
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView)
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView, myGroups)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
// Fetch at most 50000 events; fetching more serves no purpose
|
||||
list, err := models.AlertCurEventsGet(rt.Ctx, prods, bgids, stime, etime, severity, dsIds,
|
||||
cates, 0, query, 50000, 0)
|
||||
cates, 0, query, 50000, 0, []int64{})
|
||||
ginx.Dangerous(err)
|
||||
|
||||
cardmap := make(map[string]*AlertCard)
|
||||
for _, event := range list {
|
||||
title := event.GenCardTitle(rules)
|
||||
title, err := event.GenCardTitle(alertView.Rule)
|
||||
ginx.Dangerous(err)
|
||||
if _, has := cardmap[title]; has {
|
||||
cardmap[title].Total++
|
||||
cardmap[title].EventIds = append(cardmap[title].EventIds, event.Id)
|
||||
@@ -86,6 +91,10 @@ func (rt *Router) alertCurEventsCard(c *gin.Context) {
|
||||
Severity: event.Severity,
|
||||
}
|
||||
}
|
||||
|
||||
if cardmap[title].Severity < 1 {
|
||||
cardmap[title].Severity = 3
|
||||
}
|
||||
}
|
||||
|
||||
titles := make([]string, 0, len(cardmap))
|
||||
@@ -142,11 +151,15 @@ func (rt *Router) alertCurEventsGetByRid(c *gin.Context) {
|
||||
// List view: fetch active alerts
|
||||
func (rt *Router) alertCurEventsList(c *gin.Context) {
|
||||
stime, etime := getTimeRange(c)
|
||||
severity := ginx.QueryInt(c, "severity", -1)
|
||||
severity := strx.IdsInt64ForAPI(ginx.QueryStr(c, "severity", ""), ",")
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
limit := ginx.QueryInt(c, "limit", 20)
|
||||
myGroups := ginx.QueryBool(c, "my_groups", false) // whether to show only the user's own groups; defaults to false
|
||||
|
||||
dsIds := queryDatasourceIds(c)
|
||||
|
||||
eventIds := strx.IdsInt64ForAPI(ginx.QueryStr(c, "event_ids", ""), ",")
|
||||
|
||||
prod := ginx.QueryStr(c, "prods", "")
|
||||
if prod == "" {
|
||||
prod = ginx.QueryStr(c, "rule_prods", "")
|
||||
@@ -165,18 +178,19 @@ func (rt *Router) alertCurEventsList(c *gin.Context) {
|
||||
|
||||
ruleId := ginx.QueryInt64(c, "rid", 0)
|
||||
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView)
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView, myGroups)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
total, err := models.AlertCurEventTotal(rt.Ctx, prods, bgids, stime, etime, severity, dsIds,
|
||||
cates, ruleId, query)
|
||||
cates, ruleId, query, eventIds)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
list, err := models.AlertCurEventsGet(rt.Ctx, prods, bgids, stime, etime, severity, dsIds,
|
||||
cates, ruleId, query, limit, ginx.Offset(c, limit))
|
||||
cates, ruleId, query, limit, ginx.Offset(c, limit), eventIds)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
cache := make(map[int64]*models.UserGroup)
|
||||
|
||||
for i := 0; i < len(list); i++ {
|
||||
list[i].FillNotifyGroups(rt.Ctx, cache)
|
||||
}
|
||||
@@ -218,24 +232,60 @@ func (rt *Router) checkCurEventBusiGroupRWPermission(c *gin.Context, ids []int64
|
||||
|
||||
func (rt *Router) alertCurEventGet(c *gin.Context) {
|
||||
eid := ginx.UrlParamInt64(c, "eid")
|
||||
event, err := models.AlertCurEventGetById(rt.Ctx, eid)
|
||||
ginx.Dangerous(err)
|
||||
event, err := GetCurEventDetail(rt.Ctx, eid)
|
||||
ginx.NewRender(c).Data(event, err)
|
||||
}
|
||||
|
||||
func GetCurEventDetail(ctx *ctx.Context, eid int64) (*models.AlertCurEvent, error) {
|
||||
event, err := models.AlertCurEventGetById(ctx, eid)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
if event == nil {
|
||||
ginx.Bomb(404, "No such active event")
|
||||
return nil, fmt.Errorf("no such active event")
|
||||
}
|
||||
|
||||
if !rt.Center.AnonymousAccess.AlertDetail && rt.Center.EventHistoryGroupView {
|
||||
rt.bgroCheck(c, event.GroupId)
|
||||
}
|
||||
|
||||
ruleConfig, needReset := models.FillRuleConfigTplName(rt.Ctx, event.RuleConfig)
|
||||
ruleConfig, needReset := models.FillRuleConfigTplName(ctx, event.RuleConfig)
|
||||
if needReset {
|
||||
event.RuleConfigJson = ruleConfig
|
||||
}
|
||||
|
||||
event.LastEvalTime = event.TriggerTime
|
||||
ginx.NewRender(c).Data(event, nil)
|
||||
event.NotifyVersion, err = GetEventNotifyVersion(ctx, event.RuleId, event.NotifyRuleIds)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
event.NotifyRules, err = GetEventNorifyRuleNames(ctx, event.NotifyRuleIds)
|
||||
return event, err
|
||||
}
|
||||
|
||||
func GetEventNorifyRuleNames(ctx *ctx.Context, notifyRuleIds []int64) ([]*models.EventNotifyRule, error) {
|
||||
notifyRuleNames := make([]*models.EventNotifyRule, 0)
|
||||
notifyRules, err := models.NotifyRulesGet(ctx, "id in ?", notifyRuleIds)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
for _, notifyRule := range notifyRules {
|
||||
notifyRuleNames = append(notifyRuleNames, &models.EventNotifyRule{
|
||||
Id: notifyRule.ID,
|
||||
Name: notifyRule.Name,
|
||||
})
|
||||
}
|
||||
return notifyRuleNames, nil
|
||||
}
|
||||
|
||||
func GetEventNotifyVersion(ctx *ctx.Context, ruleId int64, notifyRuleIds []int64) (int, error) {
|
||||
if len(notifyRuleIds) != 0 {
|
||||
// If notify_rule_ids is present, assume the new alert notification method is in use
|
||||
return 1, nil
|
||||
}
|
||||
|
||||
rule, err := models.AlertRuleGetById(ctx, ruleId)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
return rule.NotifyVersion, nil
|
||||
}
|
||||
|
||||
func (rt *Router) alertCurEventsStatistics(c *gin.Context) {
|
||||
|
||||
@@ -56,7 +56,7 @@ func (rt *Router) alertHisEventsList(c *gin.Context) {
|
||||
|
||||
ruleId := ginx.QueryInt64(c, "rid", 0)
|
||||
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView)
|
||||
bgids, err := GetBusinessGroupIds(c, rt.Ctx, rt.Center.EventHistoryGroupView, false)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
total, err := models.AlertHisEventTotal(rt.Ctx, prods, bgids, stime, etime, severity,
|
||||
@@ -96,46 +96,54 @@ func (rt *Router) alertHisEventGet(c *gin.Context) {
|
||||
event.RuleConfigJson = ruleConfig
|
||||
}
|
||||
|
||||
event.NotifyVersion, err = GetEventNotifyVersion(rt.Ctx, event.RuleId, event.NotifyRuleIds)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
event.NotifyRules, err = GetEventNorifyRuleNames(rt.Ctx, event.NotifyRuleIds)
|
||||
ginx.NewRender(c).Data(event, err)
|
||||
}
|
||||
|
||||
func GetBusinessGroupIds(c *gin.Context, ctx *ctx.Context, eventHistoryGroupView bool) ([]int64, error) {
|
||||
func GetBusinessGroupIds(c *gin.Context, ctx *ctx.Context, onlySelfGroupView bool, myGroups bool) ([]int64, error) {
|
||||
bgid := ginx.QueryInt64(c, "bgid", 0)
|
||||
var bgids []int64
|
||||
|
||||
if !eventHistoryGroupView || strings.HasPrefix(c.Request.URL.Path, "/v1") {
|
||||
if strings.HasPrefix(c.Request.URL.Path, "/v1") {
|
||||
// If the request path starts with /v1, do not look up user information
|
||||
if bgid > 0 {
|
||||
return []int64{bgid}, nil
|
||||
}
|
||||
|
||||
return bgids, nil
|
||||
}
|
||||
|
||||
user := c.MustGet("user").(*models.User)
|
||||
if user.IsAdmin() {
|
||||
if myGroups || (onlySelfGroupView && !user.IsAdmin()) {
|
||||
// 1. "My business groups" is checked on the page, so look up the business groups the user belongs to
|
||||
// 2. If onlySelfGroupView is true, only the user's own business groups may be queried
|
||||
bussGroupIds, err := models.MyBusiGroupIds(ctx, user.Id)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
if len(bussGroupIds) == 0 {
|
||||
// If the user belongs to no business group, return a single 0; otherwise the query would match the entire alert history
|
||||
return []int64{0}, nil
|
||||
}
|
||||
|
||||
if bgid > 0 {
|
||||
if !slices.Contains(bussGroupIds, bgid) && !user.IsAdmin() {
|
||||
return nil, fmt.Errorf("business group ID not allowed")
|
||||
}
|
||||
|
||||
return []int64{bgid}, nil
|
||||
}
|
||||
return bgids, nil
|
||||
}
|
||||
|
||||
bussGroupIds, err := models.MyBusiGroupIds(ctx, user.Id)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
if len(bussGroupIds) == 0 {
|
||||
// If the user belongs to no business group, return a single 0; otherwise the query would match the entire alert history
|
||||
return []int64{0}, nil
|
||||
}
|
||||
|
||||
if bgid > 0 && !slices.Contains(bussGroupIds, bgid) {
|
||||
return nil, fmt.Errorf("business group ID not allowed")
|
||||
return bussGroupIds, nil
|
||||
}
|
||||
|
||||
if bgid > 0 {
|
||||
// A bgid filter parameter was passed; it takes priority
|
||||
return []int64{bgid}, nil
|
||||
}
|
||||
|
||||
return bussGroupIds, nil
|
||||
return bgids, nil
|
||||
}
|
||||
|
||||
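The group-ID resolution in `GetBusinessGroupIds` above can be sketched as a pure function. This is one plausible reading of the added lines, since the +/- markers are not visible in this rendering. The `request` struct, `resolveGroupIDs`, `contains`, and `errNotAllowed` are hypothetical names used only for illustration; the real function reads these inputs from the gin context and the database.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotAllowed stands in for the "business group ID not allowed" error.
var errNotAllowed = errors.New("business group ID not allowed")

// request is a hypothetical bundle of the inputs the handler derives
// from the HTTP request, the config flag, and the user record.
type request struct {
	bgid       int64   // "bgid" query parameter, 0 when absent
	serviceAPI bool    // request path starts with /v1
	myGroups   bool    // "my_groups" query parameter
	onlySelf   bool    // onlySelfGroupView config flag
	isAdmin    bool    // whether the user is an admin
	memberOf   []int64 // business groups the user belongs to
}

func contains(ids []int64, id int64) bool {
	for _, v := range ids {
		if v == id {
			return true
		}
	}
	return false
}

// resolveGroupIDs returns the group IDs a query should be limited to.
// A nil result means "no group restriction".
func resolveGroupIDs(r request) ([]int64, error) {
	if r.serviceAPI {
		if r.bgid > 0 {
			return []int64{r.bgid}, nil
		}
		return nil, nil
	}

	if r.myGroups || (r.onlySelf && !r.isAdmin) {
		if len(r.memberOf) == 0 {
			// A single 0 matches nothing, instead of matching everything.
			return []int64{0}, nil
		}
		if r.bgid > 0 {
			if !contains(r.memberOf, r.bgid) && !r.isAdmin {
				return nil, errNotAllowed
			}
			return []int64{r.bgid}, nil
		}
		return r.memberOf, nil
	}

	if r.bgid > 0 {
		return []int64{r.bgid}, nil
	}
	return nil, nil
}

func main() {
	// A non-admin asking for a group they are not a member of is rejected.
	ids, err := resolveGroupIDs(request{bgid: 7, onlySelf: true, memberOf: []int64{3, 5}})
	fmt.Println(ids, err)

	// With my_groups set and no bgid, the user's own groups are returned.
	ids, err = resolveGroupIDs(request{myGroups: true, memberOf: []int64{3, 5}})
	fmt.Println(ids, err)
}
```

Keeping the decision free of HTTP and database types makes each branch easy to exercise in a table-driven test.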
@@ -12,6 +12,7 @@ import (
|
||||
"gopkg.in/yaml.v2"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
"github.com/ccfos/nightingale/v6/pushgw/pconf"
|
||||
"github.com/ccfos/nightingale/v6/pushgw/writer"
|
||||
|
||||
@@ -20,7 +21,6 @@ import (
|
||||
"github.com/prometheus/prometheus/prompb"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/i18n"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
type AlertRuleModifyHookFunc func(ar *models.AlertRule)
|
||||
@@ -52,7 +52,7 @@ func getAlertCueEventTimeRange(c *gin.Context) (stime, etime int64) {
|
||||
}
|
||||
|
||||
func (rt *Router) alertRuleGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
|
||||
@@ -5,10 +5,10 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
// Return all, front-end search and paging
|
||||
@@ -31,7 +31,7 @@ func (rt *Router) alertSubscribeGets(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) alertSubscribeGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
|
||||
@@ -6,11 +6,11 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/i18n"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
type boardForm struct {
|
||||
@@ -51,9 +51,14 @@ func (rt *Router) boardAdd(c *gin.Context) {
|
||||
|
||||
func (rt *Router) boardGet(c *gin.Context) {
|
||||
bid := ginx.UrlParamStr(c, "bid")
|
||||
board, err := models.BoardGet(rt.Ctx, "id = ? or ident = ?", bid, bid)
|
||||
board, err := models.BoardGet(rt.Ctx, "ident = ?", bid)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if board == nil {
|
||||
board, err = models.BoardGet(rt.Ctx, "id = ?", bid)
|
||||
ginx.Dangerous(err)
|
||||
}
|
||||
|
||||
if board == nil {
|
||||
ginx.Bomb(http.StatusNotFound, "No such dashboard")
|
||||
}
|
||||
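The `boardGet` hunk above replaces a single `id = ? or ident = ?` query with two lookups: by `ident` first, then by `id` only when nothing matched. A minimal in-memory sketch of that precedence follows. The `board` and `store` types and their methods are hypothetical, and parsing the ID with `strconv.ParseInt` is an assumption of the sketch (the handler passes the raw string to the query).

```go
package main

import (
	"fmt"
	"strconv"
)

// board is a hypothetical, trimmed-down stand-in for the dashboard model.
type board struct {
	ID    int64
	Ident string
	Name  string
}

type store []board

func (s store) byIdent(ident string) *board {
	for i := range s {
		if s[i].Ident == ident {
			return &s[i]
		}
	}
	return nil
}

func (s store) byID(raw string) *board {
	id, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return nil // not numeric, so it cannot be an ID
	}
	for i := range s {
		if s[i].ID == id {
			return &s[i]
		}
	}
	return nil
}

// lookup prefers an ident match and falls back to the numeric ID.
func (s store) lookup(bid string) *board {
	if b := s.byIdent(bid); b != nil {
		return b
	}
	return s.byID(bid)
}

func main() {
	s := store{
		{ID: 1, Ident: "42", Name: "by-ident"},
		{ID: 42, Ident: "", Name: "by-id"},
	}
	// "42" matches an ident and an ID; the ident match is returned.
	fmt.Println(s.lookup("42").Name)
	fmt.Println(s.lookup("1").Name)
	fmt.Println(s.lookup("missing") == nil)
}
```

With two separate lookups the result no longer depends on which row the database happens to return when one string matches both columns.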
@@ -96,7 +101,7 @@ func (rt *Router) boardGet(c *gin.Context) {
|
||||
|
||||
// Fetch multiple boards by the bids parameter
|
||||
func (rt *Router) boardGetsByBids(c *gin.Context) {
|
||||
bids := str.IdsInt64(ginx.QueryStr(c, "bids", ""), ",")
|
||||
bids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "bids", ""), ",")
|
||||
boards, err := models.BoardGetsByBids(rt.Ctx, bids)
|
||||
ginx.Dangerous(err)
|
||||
ginx.NewRender(c).Data(boards, err)
|
||||
@@ -265,7 +270,7 @@ func (rt *Router) publicBoardGets(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) boardGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
|
||||
if len(gids) > 0 {
|
||||
|
||||
@@ -57,7 +57,7 @@ func (rt *Router) metricFilterDel(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if !HasPerm(gids, old.GroupsPerm, true) {
|
||||
ginx.NewRender(c).Message("no permission")
|
||||
ginx.NewRender(c).Message("forbidden")
|
||||
return
|
||||
}
|
||||
}
|
||||
@@ -79,7 +79,7 @@ func (rt *Router) metricFilterPut(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if !HasPerm(gids, old.GroupsPerm, true) {
|
||||
ginx.NewRender(c).Message("no permission")
|
||||
ginx.NewRender(c).Message("forbidden")
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
@@ -86,15 +86,11 @@ func (rt *Router) builtinMetricsDel(c *gin.Context) {
|
||||
func (rt *Router) builtinMetricsDefaultTypes(c *gin.Context) {
|
||||
lst := []string{
|
||||
"Linux",
|
||||
"Procstat",
|
||||
"cAdvisor",
|
||||
"Ping",
|
||||
"MySQL",
|
||||
"Redis",
|
||||
"Kafka",
|
||||
"Elasticsearch",
|
||||
"PostgreSQL",
|
||||
"MongoDB",
|
||||
"Memcached",
|
||||
"ClickHouse",
|
||||
}
|
||||
ginx.NewRender(c).Data(lst, nil)
|
||||
}
|
||||
@@ -102,29 +98,10 @@ func (rt *Router) builtinMetricsDefaultTypes(c *gin.Context) {
|
||||
func (rt *Router) builtinMetricsTypes(c *gin.Context) {
|
||||
collector := ginx.QueryStr(c, "collector", "")
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
disabled := ginx.QueryInt(c, "disabled", -1)
|
||||
lang := c.GetHeader("X-Language")
|
||||
|
||||
metricTypeList, err := models.BuiltinMetricTypes(rt.Ctx, lang, collector, query)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
componentList, err := models.BuiltinComponentGets(rt.Ctx, "", disabled)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
// Build a map holding the types present in componentList
|
||||
componentTypes := make(map[string]struct{})
|
||||
for _, comp := range componentList {
|
||||
componentTypes[comp.Ident] = struct{}{}
|
||||
}
|
||||
|
||||
filteredMetricTypeList := make([]string, 0)
|
||||
for _, metricType := range metricTypeList {
|
||||
if _, exists := componentTypes[metricType]; exists {
|
||||
filteredMetricTypeList = append(filteredMetricTypeList, metricType)
|
||||
}
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(filteredMetricTypeList, nil)
|
||||
ginx.NewRender(c).Data(metricTypeList, err)
|
||||
}
|
||||
|
||||
func (rt *Router) builtinMetricsCollectors(c *gin.Context) {
|
||||
|
||||
@@ -4,11 +4,11 @@ import (
|
||||
"net/http"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
type busiGroupForm struct {
|
||||
@@ -131,7 +131,7 @@ func (rt *Router) busiGroupGetsByService(c *gin.Context) {
|
||||
// This endpoint is called only from the active alerts page; it returns the number of active alerts per business group
|
||||
func (rt *Router) busiGroupAlertingsGets(c *gin.Context) {
|
||||
ids := ginx.QueryStr(c, "ids", "")
|
||||
ret, err := models.AlertNumbers(rt.Ctx, str.IdsInt64(ids))
|
||||
ret, err := models.AlertNumbers(rt.Ctx, strx.IdsInt64ForAPI(ids))
|
||||
ginx.NewRender(c).Data(ret, err)
|
||||
}
|
||||
|
||||
@@ -142,7 +142,7 @@ func (rt *Router) busiGroupGet(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) busiGroupsGetTags(c *gin.Context) {
|
||||
bgids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
bgids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
targetIdents, err := models.TargetIndentsGetByBgids(rt.Ctx, bgids)
|
||||
ginx.Dangerous(err)
|
||||
tags, err := models.TargetGetTags(rt.Ctx, targetIdents, true, "busigroup")
|
||||
|
||||
@@ -4,15 +4,15 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
func (rt *Router) chartShareGets(c *gin.Context) {
|
||||
ids := ginx.QueryStr(c, "ids", "")
|
||||
lst, err := models.ChartShareGetsByIds(rt.Ctx, str.IdsInt64(ids, ","))
|
||||
lst, err := models.ChartShareGetsByIds(rt.Ctx, strx.IdsInt64ForAPI(ids, ","))
|
||||
ginx.NewRender(c).Data(lst, err)
|
||||
}
|
||||
|
||||
|
||||
@@ -57,15 +57,21 @@ func (rt *Router) datasourceBriefs(c *gin.Context) {
|
||||
|
||||
for _, item := range list {
|
||||
item.AuthJson.BasicAuthPassword = ""
|
||||
if item.PluginType != models.PROMETHEUS {
|
||||
item.SettingsJson = nil
|
||||
} else {
|
||||
if item.PluginType == models.PROMETHEUS {
|
||||
for k, v := range item.SettingsJson {
|
||||
if strings.HasPrefix(k, "prometheus.") {
|
||||
item.SettingsJson[strings.TrimPrefix(k, "prometheus.")] = v
|
||||
delete(item.SettingsJson, k)
|
||||
}
|
||||
}
|
||||
} else if item.PluginType == "cloudwatch" {
|
||||
for k := range item.SettingsJson {
|
||||
if !strings.Contains(k, "region") {
|
||||
delete(item.SettingsJson, k)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
item.SettingsJson = nil
|
||||
}
|
||||
dss = append(dss, item)
|
||||
}
|
||||
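The per-plugin filtering of datasource settings in `datasourceBriefs` above can be sketched as follows. The function name `briefSettings` is hypothetical, and the string `"prometheus"` is assumed to be the value of `models.PROMETHEUS`. Unlike the handler, which rewrites the map in place, this sketch builds a new map so that no entry is added or removed while ranging over it.

```go
package main

import (
	"fmt"
	"strings"
)

// briefSettings returns the subset of settings that a brief listing exposes:
//   - prometheus: every key, with a leading "prometheus." stripped
//   - cloudwatch: only keys that contain "region"
//   - anything else: nothing
func briefSettings(pluginType string, settings map[string]interface{}) map[string]interface{} {
	switch pluginType {
	case "prometheus": // assumed value of models.PROMETHEUS
		out := make(map[string]interface{}, len(settings))
		for k, v := range settings {
			out[strings.TrimPrefix(k, "prometheus.")] = v
		}
		return out
	case "cloudwatch":
		out := make(map[string]interface{})
		for k, v := range settings {
			if strings.Contains(k, "region") {
				out[k] = v
			}
		}
		return out
	default:
		return nil
	}
}

func main() {
	prom := briefSettings("prometheus", map[string]interface{}{
		"prometheus.write_addr": "http://example.invalid/write",
		"timeout":               3,
	})
	fmt.Println(len(prom), prom["write_addr"])

	cw := briefSettings("cloudwatch", map[string]interface{}{
		"cloudwatch.region": "us-east-1",
		"secret_key":        "hidden",
	})
	fmt.Println(len(cw), cw["cloudwatch.region"])
}
```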
@@ -117,7 +123,7 @@ func (rt *Router) datasourceUpsert(c *gin.Context) {
|
||||
}
|
||||
err = req.Add(rt.Ctx)
|
||||
} else {
|
||||
err = req.Update(rt.Ctx, "name", "description", "cluster_name", "settings", "http", "auth", "updated_by", "updated_at", "is_default")
|
||||
err = req.Update(rt.Ctx, "name", "identifier", "description", "cluster_name", "settings", "http", "auth", "updated_by", "updated_at", "is_default")
|
||||
}
|
||||
|
||||
Render(c, nil, err)
|
||||
|
||||
141
center/router/router_embedded.go
Normal file
@@ -0,0 +1,141 @@
package router

import (
	"time"

	"github.com/ccfos/nightingale/v6/models"
	"github.com/ccfos/nightingale/v6/pkg/ctx"

	"github.com/gin-gonic/gin"
	"github.com/toolkits/pkg/ginx"
)

func (rt *Router) embeddedProductGets(c *gin.Context) {
	products, err := models.EmbeddedProductGets(rt.Ctx)
	ginx.Dangerous(err)
	// Get the list of group IDs the current user can access
	me := c.MustGet("user").(*models.User)

	if me.IsAdmin() {
		ginx.NewRender(c).Data(products, err)
		return
	}

	gids, err := models.MyGroupIds(rt.Ctx, me.Id)
	bgSet := make(map[int64]struct{}, len(gids))
	for _, id := range gids {
		bgSet[id] = struct{}{}
	}

	// Keep product links that are public, or private but accessible to the user
	var result []*models.EmbeddedProduct
	for _, product := range products {
		if !product.IsPrivate {
			result = append(result, product)
			continue
		}

		for _, tid := range product.TeamIDs {
			if _, ok := bgSet[tid]; ok {
				result = append(result, product)
				break
			}
		}
	}

	ginx.NewRender(c).Data(result, err)
}

func (rt *Router) embeddedProductGet(c *gin.Context) {
	id := ginx.UrlParamInt64(c, "id")
	if id <= 0 {
		ginx.Bomb(400, "invalid id")
	}

	data, err := models.GetEmbeddedProductByID(rt.Ctx, id)
	ginx.Dangerous(err)

	me := c.MustGet("user").(*models.User)
	hashPermission, err := hasEmbeddedProductAccess(rt.Ctx, me, data)
	ginx.Dangerous(err)

	if !hashPermission {
		ginx.Bomb(403, "forbidden")
	}

	ginx.NewRender(c).Data(data, nil)
}

func (rt *Router) embeddedProductAdd(c *gin.Context) {
	var eps []models.EmbeddedProduct
	ginx.BindJSON(c, &eps)

	me := c.MustGet("user").(*models.User)

	for i := range eps {
		eps[i].CreateBy = me.Nickname
		eps[i].UpdateBy = me.Nickname
	}

	err := models.AddEmbeddedProduct(rt.Ctx, eps)
	ginx.NewRender(c).Message(err)
}

func (rt *Router) embeddedProductPut(c *gin.Context) {
	var ep models.EmbeddedProduct
	id := ginx.UrlParamInt64(c, "id")
	ginx.BindJSON(c, &ep)

	if id <= 0 {
		ginx.Bomb(400, "invalid id")
	}

	oldProduct, err := models.GetEmbeddedProductByID(rt.Ctx, id)
	ginx.Dangerous(err)
	me := c.MustGet("user").(*models.User)

	now := time.Now().Unix()
	oldProduct.Name = ep.Name
	oldProduct.URL = ep.URL
	oldProduct.IsPrivate = ep.IsPrivate
	oldProduct.TeamIDs = ep.TeamIDs
	oldProduct.UpdateBy = me.Username
	oldProduct.UpdateAt = now

	err = models.UpdateEmbeddedProduct(rt.Ctx, oldProduct)
	ginx.NewRender(c).Message(err)
}

func (rt *Router) embeddedProductDelete(c *gin.Context) {
	id := ginx.UrlParamInt64(c, "id")
	if id <= 0 {
		ginx.Bomb(400, "invalid id")
	}

	err := models.DeleteEmbeddedProduct(rt.Ctx, id)
	ginx.NewRender(c).Message(err)
}

func hasEmbeddedProductAccess(ctx *ctx.Context, user *models.User, ep *models.EmbeddedProduct) (bool, error) {
	if user.IsAdmin() || !ep.IsPrivate {
		return true, nil
	}

	gids, err := models.MyGroupIds(ctx, user.Id)
	if err != nil {
		return false, err
	}

	groupSet := make(map[int64]struct{}, len(gids))
	for _, gid := range gids {
		groupSet[gid] = struct{}{}
	}

	for _, tid := range ep.TeamIDs {
		if _, ok := groupSet[tid]; ok {
			return true, nil
		}
	}

	return false, nil
}
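The access rule in `hasEmbeddedProductAccess` above (admins and public products pass; otherwise the user must share at least one team with the product) can be exercised in isolation. The `user` and `product` types and the `canAccess` function below are hypothetical, simplified stand-ins; the real code loads the user's group IDs from the database.

```go
package main

import "fmt"

// user and product are hypothetical, trimmed-down stand-ins for
// models.User and models.EmbeddedProduct.
type user struct {
	Admin    bool
	GroupIDs []int64
}

type product struct {
	Private bool
	TeamIDs []int64
}

// canAccess reports whether u may view p.
func canAccess(u user, p product) bool {
	if u.Admin || !p.Private {
		return true
	}

	// Build a set of the user's groups so each team check is O(1).
	member := make(map[int64]struct{}, len(u.GroupIDs))
	for _, gid := range u.GroupIDs {
		member[gid] = struct{}{}
	}

	for _, tid := range p.TeamIDs {
		if _, ok := member[tid]; ok {
			return true
		}
	}
	return false
}

func main() {
	private := product{Private: true, TeamIDs: []int64{10, 20}}
	fmt.Println(canAccess(user{GroupIDs: []int64{20}}, private)) // shares team 20
	fmt.Println(canAccess(user{GroupIDs: []int64{30}}, private)) // no shared team
	fmt.Println(canAccess(user{Admin: true}, private))           // admin bypass
}
```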
228
center/router/router_event_pipeline.go
Normal file
@@ -0,0 +1,228 @@
|
||||
package router
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
)
|
||||
|
||||
// 获取事件Pipeline列表
|
||||
func (rt *Router) eventPipelinesList(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
pipelines, err := models.ListEventPipelines(rt.Ctx)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
allTids := make([]int64, 0)
|
||||
for _, pipeline := range pipelines {
|
||||
allTids = append(allTids, pipeline.TeamIds...)
|
||||
}
|
||||
ugMap, err := models.UserGroupIdAndNameMap(rt.Ctx, allTids)
|
||||
ginx.Dangerous(err)
|
||||
for _, pipeline := range pipelines {
|
||||
for _, tid := range pipeline.TeamIds {
|
||||
pipeline.TeamNames = append(pipeline.TeamNames, ugMap[tid])
|
||||
}
|
||||
}
|
||||
|
||||
gids, err := models.MyGroupIdsMap(rt.Ctx, me.Id)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if me.IsAdmin() {
|
||||
ginx.NewRender(c).Data(pipelines, nil)
|
||||
return
|
||||
}
|
||||
|
||||
res := make([]*models.EventPipeline, 0)
|
||||
for _, pipeline := range pipelines {
|
||||
for _, tid := range pipeline.TeamIds {
|
||||
if _, ok := gids[tid]; ok {
|
||||
res = append(res, pipeline)
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(res, nil)
|
||||
}
|
||||
|
||||
// 获取单个事件Pipeline详情
|
||||
func (rt *Router) getEventPipeline(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
id := ginx.UrlParamInt64(c, "id")
|
||||
pipeline, err := models.GetEventPipeline(rt.Ctx, id)
|
||||
ginx.Dangerous(err)
|
||||
ginx.Dangerous(me.CheckGroupPermission(rt.Ctx, pipeline.TeamIds))
|
||||
|
||||
err = pipeline.FillTeamNames(rt.Ctx)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
ginx.NewRender(c).Data(pipeline, nil)
|
||||
}
|
||||
|
||||
// 创建事件Pipeline
|
||||
func (rt *Router) addEventPipeline(c *gin.Context) {
|
||||
var pipeline models.EventPipeline
|
||||
ginx.BindJSON(c, &pipeline)
|
||||
|
||||
user := c.MustGet("user").(*models.User)
|
||||
now := time.Now().Unix()
|
||||
pipeline.CreateBy = user.Username
|
||||
pipeline.CreateAt = now
|
||||
pipeline.UpdateAt = now
|
||||
pipeline.UpdateBy = user.Username
|
||||
|
||||
err := pipeline.Verify()
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusBadRequest, err.Error())
|
||||
}
|
||||
|
||||
ginx.Dangerous(user.CheckGroupPermission(rt.Ctx, pipeline.TeamIds))
|
||||
err = models.CreateEventPipeline(rt.Ctx, &pipeline)
|
||||
ginx.NewRender(c).Message(err)
|
||||
}
|
||||
|
||||
// 更新事件Pipeline
|
||||
func (rt *Router) updateEventPipeline(c *gin.Context) {
|
||||
var f models.EventPipeline
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
me := c.MustGet("user").(*models.User)
|
||||
f.UpdateBy = me.Username
|
||||
f.UpdateAt = time.Now().Unix()
|
||||
|
||||
pipeline, err := models.GetEventPipeline(rt.Ctx, f.ID)
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusNotFound, "No such event pipeline")
|
||||
}
|
||||
ginx.Dangerous(me.CheckGroupPermission(rt.Ctx, pipeline.TeamIds))
|
||||
|
||||
ginx.NewRender(c).Message(pipeline.Update(rt.Ctx, &f))
|
||||
}
|
||||
|
||||
// Delete event pipelines
|
||||
func (rt *Router) deleteEventPipelines(c *gin.Context) {
|
||||
var f struct {
|
||||
Ids []int64 `json:"ids"`
|
||||
}
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
if len(f.Ids) == 0 {
|
||||
ginx.Bomb(http.StatusBadRequest, "ids required")
|
||||
}
|
||||
|
||||
me := c.MustGet("user").(*models.User)
|
||||
for _, id := range f.Ids {
|
||||
pipeline, err := models.GetEventPipeline(rt.Ctx, id)
|
||||
ginx.Dangerous(err)
|
||||
ginx.Dangerous(me.CheckGroupPermission(rt.Ctx, pipeline.TeamIds))
|
||||
}
|
||||
|
||||
err := models.DeleteEventPipelines(rt.Ctx, f.Ids)
|
||||
ginx.NewRender(c).Message(err)
|
||||
}
|
||||
|
||||
// Test-run an event pipeline
|
||||
func (rt *Router) tryRunEventPipeline(c *gin.Context) {
|
||||
var f struct {
|
||||
EventId int64 `json:"event_id"`
|
||||
PipelineConfig models.EventPipeline `json:"pipeline_config"`
|
||||
}
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
hisEvent, err := models.AlertHisEventGetById(rt.Ctx, f.EventId)
|
||||
if err != nil || hisEvent == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event not found")
|
||||
}
|
||||
event := hisEvent.ToCur()
|
||||
|
||||
for _, p := range f.PipelineConfig.ProcessorConfigs {
|
||||
processor, err := models.GetProcessorByType(p.Typ, p.Config)
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "processor %+v type not found", p)
|
||||
}
|
||||
event = processor.Process(rt.Ctx, event)
|
||||
if event == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event is nil")
|
||||
}
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(event, nil)
|
||||
}
|
||||
|
||||
// Test-run an event processor
|
||||
func (rt *Router) tryRunEventProcessor(c *gin.Context) {
|
||||
var f struct {
|
||||
EventId int64 `json:"event_id"`
|
||||
ProcessorConfig models.ProcessorConfig `json:"processor_config"`
|
||||
}
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
hisEvent, err := models.AlertHisEventGetById(rt.Ctx, f.EventId)
|
||||
if err != nil || hisEvent == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event not found")
|
||||
}
|
||||
event := hisEvent.ToCur()
|
||||
|
||||
processor, err := models.GetProcessorByType(f.ProcessorConfig.Typ, f.ProcessorConfig.Config)
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "processor type not found")
|
||||
}
|
||||
event = processor.Process(rt.Ctx, event)
|
||||
logger.Infof("processor %+v result: %+v", f.ProcessorConfig, event)
|
||||
if event == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event is nil")
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(event, nil)
|
||||
}
|
||||
|
||||
func (rt *Router) tryRunEventProcessorByNotifyRule(c *gin.Context) {
|
||||
var f struct {
|
||||
EventId int64 `json:"event_id"`
|
||||
PipelineConfigs []models.PipelineConfig `json:"pipeline_configs"`
|
||||
}
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
hisEvent, err := models.AlertHisEventGetById(rt.Ctx, f.EventId)
|
||||
if err != nil || hisEvent == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event not found")
|
||||
}
|
||||
event := hisEvent.ToCur()
|
||||
|
||||
pids := make([]int64, 0)
|
||||
for _, pc := range f.PipelineConfigs {
|
||||
if pc.Enable {
|
||||
pids = append(pids, pc.PipelineId)
|
||||
}
|
||||
}
|
||||
|
||||
pipelines, err := models.GetEventPipelinesByIds(rt.Ctx, pids)
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "processors not found")
|
||||
}
|
||||
|
||||
for _, pl := range pipelines {
|
||||
for _, p := range pl.ProcessorConfigs {
|
||||
processor, err := models.GetProcessorByType(p.Typ, p.Config)
|
||||
if err != nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "processor %+v type not found", p)
|
||||
}
|
||||
event = processor.Process(rt.Ctx, event)
|
||||
if event == nil {
|
||||
ginx.Bomb(http.StatusBadRequest, "event is nil")
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(event, nil)
|
||||
}
|
||||
|
||||
func (rt *Router) eventPipelinesListByService(c *gin.Context) {
|
||||
pipelines, err := models.ListEventPipelines(rt.Ctx)
|
||||
ginx.NewRender(c).Data(pipelines, err)
|
||||
}
|
||||
@@ -40,6 +40,10 @@ func (rt *Router) statistic(c *gin.Context) {
|
||||
model = models.NotifyRule{}
|
||||
case "notify_channel":
|
||||
model = models.NotifyChannel{}
|
||||
case "event_pipeline":
|
||||
statistics, err = models.EventPipelineStatistics(rt.Ctx)
|
||||
ginx.NewRender(c).Data(statistics, err)
|
||||
return
|
||||
case "datasource":
|
||||
// datasource update_at is different from others
|
||||
statistics, err = models.DatasourceStatistics(rt.Ctx)
|
||||
|
||||
@@ -152,6 +152,13 @@ func (rt *Router) refreshPost(c *gin.Context) {
|
||||
return
|
||||
}
|
||||
|
||||
// Check whether this token still exists in Redis
|
||||
val, err := rt.fetchAuth(c.Request.Context(), refreshUuid)
|
||||
if err != nil || val == "" {
|
||||
ginx.NewRender(c, http.StatusUnauthorized).Message("refresh token expired")
|
||||
return
|
||||
}
|
||||
|
||||
userIdentity, ok := claims["user_identity"].(string)
|
||||
if !ok {
|
||||
// Theoretically impossible
|
||||
|
||||
@@ -10,10 +10,10 @@ import (
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/slice"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
"github.com/ccfos/nightingale/v6/pkg/tplx"
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
func (rt *Router) messageTemplatesAdd(c *gin.Context) {
|
||||
@@ -32,7 +32,7 @@ func (rt *Router) messageTemplatesAdd(c *gin.Context) {
|
||||
for _, tpl := range lst {
|
||||
ginx.Dangerous(tpl.Verify())
|
||||
if !isAdmin && !slice.HaveIntersection(gids, tpl.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
idents = append(idents, tpl.Ident)
|
||||
|
||||
@@ -75,8 +75,8 @@ func (rt *Router) messageTemplatesDel(c *gin.Context) {
|
||||
gids, err := models.MyGroupIds(rt.Ctx, me.Id)
|
||||
ginx.Dangerous(err)
|
||||
for _, t := range lst {
|
||||
if !slice.HaveIntersection[int64](gids, t.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if !slice.HaveIntersection(gids, t.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -105,8 +105,8 @@ func (rt *Router) messageTemplatePut(c *gin.Context) {
|
||||
if !me.IsAdmin() {
|
||||
gids, err := models.MyGroupIds(rt.Ctx, me.Id)
|
||||
ginx.Dangerous(err)
|
||||
if !slice.HaveIntersection[int64](gids, mt.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if !slice.HaveIntersection(gids, mt.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
|
||||
@@ -125,8 +125,8 @@ func (rt *Router) messageTemplateGet(c *gin.Context) {
|
||||
if mt == nil {
|
||||
ginx.Bomb(http.StatusNotFound, "message template not found")
|
||||
}
|
||||
if mt.Private == 1 && !slice.HaveIntersection[int64](gids, mt.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if mt.Private == 1 && !slice.HaveIntersection(gids, mt.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(mt, nil)
|
||||
@@ -137,7 +137,7 @@ func (rt *Router) messageTemplatesGet(c *gin.Context) {
|
||||
if tmp := ginx.QueryStr(c, "notify_channel_idents", ""); tmp != "" {
|
||||
notifyChannelIdents = strings.Split(tmp, ",")
|
||||
}
|
||||
notifyChannelIds := str.IdsInt64(ginx.QueryStr(c, "notify_channel_ids", ""))
|
||||
notifyChannelIds := strx.IdsInt64ForAPI(ginx.QueryStr(c, "notify_channel_ids", ""))
|
||||
if len(notifyChannelIds) > 0 {
|
||||
ginx.Dangerous(models.DB(rt.Ctx).Model(models.NotifyChannelConfig{}).
|
||||
Where("id in (?)", notifyChannelIds).Pluck("ident", &notifyChannelIdents).Error)
|
||||
|
||||
@@ -1,16 +1,18 @@
|
||||
package router
|
||||
|
||||
import (
|
||||
"math"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/alert/common"
|
||||
"github.com/ccfos/nightingale/v6/alert/mute"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
// Return all, front-end search and paging
|
||||
@@ -22,7 +24,7 @@ func (rt *Router) alertMuteGetsByBG(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) alertMuteGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
@@ -63,10 +65,45 @@ func (rt *Router) alertMuteAdd(c *gin.Context) {
|
||||
|
||||
username := c.MustGet("username").(string)
|
||||
f.CreateBy = username
|
||||
f.UpdateBy = username
|
||||
f.GroupId = ginx.UrlParamInt64(c, "id")
|
||||
ginx.NewRender(c).Message(f.Add(rt.Ctx))
|
||||
}
|
||||
|
||||
type MuteTestForm struct {
|
||||
EventId int64 `json:"event_id" binding:"required"`
|
||||
AlertMute models.AlertMute `json:"mute_config" binding:"required"`
|
||||
}
|
||||
|
||||
func (rt *Router) alertMuteTryRun(c *gin.Context) {
|
||||
|
||||
var f MuteTestForm
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
hisEvent, err := models.AlertHisEventGetById(rt.Ctx, f.EventId)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if hisEvent == nil {
|
||||
ginx.Bomb(http.StatusNotFound, "event not found")
|
||||
}
|
||||
|
||||
curEvent := *hisEvent.ToCur()
|
||||
curEvent.SetTagsMap()
|
||||
|
||||
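// Bypass the time-range check: set the time range to cover everything (0 to the int64 maximum) so that only the other match conditions (tags, strategy type, etc.) are verified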
|
||||
f.AlertMute.MuteTimeType = models.TimeRange
|
||||
f.AlertMute.Btime = 0 // smallest possible value (the Unix epoch)
|
||||
f.AlertMute.Etime = math.MaxInt64 // largest possible value (int64 upper bound)
|
||||
|
||||
if !mute.MatchMute(&curEvent, &f.AlertMute) {
|
||||
ginx.NewRender(c).Data("not match", nil)
|
||||
return
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data("mute test match", nil)
|
||||
|
||||
}
|
||||
|
||||
// Preview events (alert_cur_event) that match the mute strategy based on the following criteria:
|
||||
// business group ID (group_id, group_id), product (prod, rule_prod),
|
||||
// alert event severity (severities, severity), and event tags (tags, tags).
|
||||
|
||||
@@ -9,6 +9,7 @@ import (
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/center/cstats"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
@@ -335,6 +336,12 @@ func (rt *Router) extractTokenMetadata(r *http.Request) (*AccessDetails, error)
|
||||
return nil, errors.New("failed to parse access_uuid from jwt")
|
||||
}
|
||||
|
||||
// Only let the request through if accessUuid exists in Redis
|
||||
val, err := rt.fetchAuth(r.Context(), accessUuid)
|
||||
if err != nil || val == "" {
|
||||
return nil, errors.New("unauthorized")
|
||||
}
|
||||
|
||||
return &AccessDetails{
|
||||
AccessUuid: accessUuid,
|
||||
UserIdentity: claims["user_identity"].(string),
|
||||
@@ -355,29 +362,72 @@ func (rt *Router) extractToken(r *http.Request) string {
|
||||
}
|
||||
|
||||
func (rt *Router) createAuth(ctx context.Context, userIdentity string, td *TokenDetails) error {
|
||||
username := strings.Split(userIdentity, "-")[1]
|
||||
|
||||
// If only a single login per account is allowed, delete the previous tokens
|
||||
if rt.HTTP.JWTAuth.SingleLogin {
|
||||
delKeys, err := rt.Redis.SMembers(ctx, rt.wrapJwtKey(username)).Result()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
if len(delKeys) > 0 {
|
||||
errDel := rt.Redis.Del(ctx, delKeys...).Err()
|
||||
if errDel != nil {
|
||||
return errDel
|
||||
}
|
||||
}
|
||||
|
||||
if errDel := rt.Redis.Del(ctx, rt.wrapJwtKey(username)).Err(); errDel != nil {
|
||||
return errDel
|
||||
}
|
||||
}
|
||||
|
||||
at := time.Unix(td.AtExpires, 0)
|
||||
rte := time.Unix(td.RtExpires, 0)
|
||||
now := time.Now()
|
||||
|
||||
errAccess := rt.Redis.Set(ctx, rt.wrapJwtKey(td.AccessUuid), userIdentity, at.Sub(now)).Err()
|
||||
if errAccess != nil {
|
||||
return errAccess
|
||||
if err := rt.Redis.Set(ctx, rt.wrapJwtKey(td.AccessUuid), userIdentity, at.Sub(now)).Err(); err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("set_token", "fail").Observe(time.Since(now).Seconds())
|
||||
return err
|
||||
}
|
||||
|
||||
errRefresh := rt.Redis.Set(ctx, rt.wrapJwtKey(td.RefreshUuid), userIdentity, rte.Sub(now)).Err()
|
||||
if errRefresh != nil {
|
||||
return errRefresh
|
||||
if err := rt.Redis.Set(ctx, rt.wrapJwtKey(td.RefreshUuid), userIdentity, rte.Sub(now)).Err(); err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("set_token", "fail").Observe(time.Since(now).Seconds())
|
||||
return err
|
||||
}
|
||||
|
||||
cstats.RedisOperationLatency.WithLabelValues("set_token", "success").Observe(time.Since(now).Seconds())
|
||||
|
||||
if rt.HTTP.JWTAuth.SingleLogin {
|
||||
if err := rt.Redis.SAdd(ctx, rt.wrapJwtKey(username), rt.wrapJwtKey(td.AccessUuid), rt.wrapJwtKey(td.RefreshUuid)).Err(); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func (rt *Router) fetchAuth(ctx context.Context, givenUuid string) (string, error) {
|
||||
return rt.Redis.Get(ctx, rt.wrapJwtKey(givenUuid)).Result()
|
||||
now := time.Now()
|
||||
ret, err := rt.Redis.Get(ctx, rt.wrapJwtKey(givenUuid)).Result()
|
||||
if err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("get_token", "fail").Observe(time.Since(now).Seconds())
|
||||
} else {
|
||||
cstats.RedisOperationLatency.WithLabelValues("get_token", "success").Observe(time.Since(now).Seconds())
|
||||
}
|
||||
|
||||
return ret, err
|
||||
}
|
||||
|
||||
func (rt *Router) deleteAuth(ctx context.Context, givenUuid string) error {
|
||||
return rt.Redis.Del(ctx, rt.wrapJwtKey(givenUuid)).Err()
|
||||
err := rt.Redis.Del(ctx, rt.wrapJwtKey(givenUuid)).Err()
|
||||
if err != nil {
|
||||
cstats.RedisOperationLatency.WithLabelValues("del_token", "fail").Observe(time.Since(time.Now()).Seconds())
|
||||
} else {
|
||||
cstats.RedisOperationLatency.WithLabelValues("del_token", "success").Observe(time.Since(time.Now()).Seconds())
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (rt *Router) deleteTokens(ctx context.Context, authD *AccessDetails) error {
|
||||
|
||||
@@ -17,9 +17,6 @@ import (
|
||||
|
||||
func (rt *Router) notifyChannelsAdd(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
if !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
}
|
||||
|
||||
var lst []*models.NotifyChannelConfig
|
||||
ginx.BindJSON(c, &lst)
|
||||
@@ -55,11 +52,6 @@ func (rt *Router) notifyChannelsAdd(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) notifyChannelsDel(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
if !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
}
|
||||
|
||||
var f idsForm
|
||||
ginx.BindJSON(c, &f)
|
||||
f.Verify()
|
||||
@@ -79,9 +71,6 @@ func (rt *Router) notifyChannelsDel(c *gin.Context) {
|
||||
|
||||
func (rt *Router) notifyChannelPut(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
if !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
}
|
||||
|
||||
var f models.NotifyChannelConfig
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
@@ -31,7 +31,7 @@ func (rt *Router) notifyRulesAdd(c *gin.Context) {
|
||||
for _, nr := range lst {
|
||||
ginx.Dangerous(nr.Verify())
|
||||
if !isAdmin && !slice.HaveIntersection(gids, nr.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
|
||||
nr.CreateBy = me.Username
|
||||
@@ -56,8 +56,8 @@ func (rt *Router) notifyRulesDel(c *gin.Context) {
|
||||
gids, err := models.MyGroupIds(rt.Ctx, me.Id)
|
||||
ginx.Dangerous(err)
|
||||
for _, t := range lst {
|
||||
if !slice.HaveIntersection[int64](gids, t.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if !slice.HaveIntersection(gids, t.UserGroupIds) {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -79,8 +79,8 @@ func (rt *Router) notifyRulePut(c *gin.Context) {
|
||||
me := c.MustGet("user").(*models.User)
|
||||
gids, err := models.MyGroupIds(rt.Ctx, me.Id)
|
||||
ginx.Dangerous(err)
|
||||
if !slice.HaveIntersection[int64](gids, nr.UserGroupIds) && !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if !slice.HaveIntersection(gids, nr.UserGroupIds) && !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
|
||||
f.UpdateBy = me.Username
|
||||
@@ -99,8 +99,8 @@ func (rt *Router) notifyRuleGet(c *gin.Context) {
|
||||
ginx.Bomb(http.StatusNotFound, "notify rule not found")
|
||||
}
|
||||
|
||||
if !slice.HaveIntersection[int64](gids, nr.UserGroupIds) && !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "no permission")
|
||||
if !slice.HaveIntersection(gids, nr.UserGroupIds) && !me.IsAdmin() {
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(nr, nil)
|
||||
|
||||
@@ -45,7 +45,7 @@ func (rt *Router) notifyTplUpdateContent(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if notifyTpl.CreateBy != user.Username && !user.IsAdmin() {
|
||||
ginx.Bomb(403, "no permission")
|
||||
ginx.Bomb(403, "forbidden")
|
||||
}
|
||||
|
||||
f.UpdateAt = time.Now().Unix()
|
||||
@@ -64,7 +64,7 @@ func (rt *Router) notifyTplUpdate(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if notifyTpl.CreateBy != user.Username && !user.IsAdmin() {
|
||||
ginx.Bomb(403, "no permission")
|
||||
ginx.Bomb(403, "forbidden")
|
||||
}
|
||||
|
||||
// get the count of the same channel and name but different id
|
||||
@@ -188,7 +188,7 @@ func (rt *Router) notifyTplDel(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if notifyTpl.CreateBy != user.Username && !user.IsAdmin() {
|
||||
ginx.Bomb(403, "no permission")
|
||||
ginx.Bomb(403, "forbidden")
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Message(f.NotifyTplDelete(rt.Ctx, id))
|
||||
|
||||
@@ -3,6 +3,7 @@ package router
|
||||
import (
|
||||
"fmt"
|
||||
"sort"
|
||||
"sync"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/dscache"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
@@ -38,71 +39,116 @@ type LogResp struct {
|
||||
List []interface{} `json:"list"`
|
||||
}
|
||||
|
||||
func (rt *Router) QueryLogBatch(c *gin.Context) {
|
||||
var f QueryFrom
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
func QueryLogBatchConcurrently(anonymousAccess bool, ctx *gin.Context, f QueryFrom) (LogResp, error) {
|
||||
var resp LogResp
|
||||
var errMsg string
|
||||
var mu sync.Mutex
|
||||
var wg sync.WaitGroup
|
||||
var errs []error
|
||||
|
||||
for _, q := range f.Queries {
|
||||
if !rt.Center.AnonymousAccess.PromQuerier && !CheckDsPerm(c, q.Did, q.DsCate, q) {
|
||||
ginx.Bomb(200, "no permission")
|
||||
if !anonymousAccess && !CheckDsPerm(ctx, q.Did, q.DsCate, q) {
|
||||
return LogResp{}, fmt.Errorf("forbidden")
|
||||
}
|
||||
|
||||
plug, exists := dscache.DsCache.Get(q.DsCate, q.Did)
|
||||
if !exists {
|
||||
logger.Warningf("cluster:%d not exists query:%+v", q.Did, q)
|
||||
ginx.Bomb(200, "cluster not exists")
|
||||
return LogResp{}, fmt.Errorf("cluster not exists")
|
||||
}
|
||||
|
||||
data, total, err := plug.QueryLog(c.Request.Context(), q.Query)
|
||||
if err != nil {
|
||||
errMsg += fmt.Sprintf("query data error: %v query:%v\n ", err, q)
|
||||
logger.Warningf("query data error: %v query:%v", err, q)
|
||||
continue
|
||||
}
|
||||
wg.Add(1)
|
||||
go func(query Query) {
|
||||
defer wg.Done()
|
||||
|
||||
m := make(map[string]interface{})
|
||||
m["ref"] = q.Ref
|
||||
m["ds_id"] = q.Did
|
||||
m["ds_cate"] = q.DsCate
|
||||
m["data"] = data
|
||||
resp.List = append(resp.List, m)
|
||||
resp.Total += total
|
||||
data, total, err := plug.QueryLog(ctx.Request.Context(), query.Query)
|
||||
mu.Lock()
|
||||
defer mu.Unlock()
|
||||
if err != nil {
|
||||
errMsg := fmt.Sprintf("query data error: %v query:%v\n ", err, query)
|
||||
logger.Warningf(errMsg)
|
||||
errs = append(errs, err)
|
||||
return
|
||||
}
|
||||
|
||||
m := make(map[string]interface{})
|
||||
m["ref"] = query.Ref
|
||||
m["ds_id"] = query.Did
|
||||
m["ds_cate"] = query.DsCate
|
||||
m["data"] = data
|
||||
|
||||
resp.List = append(resp.List, m)
|
||||
resp.Total += total
|
||||
}(q)
|
||||
}
|
||||
|
||||
if errMsg != "" || len(resp.List) == 0 {
|
||||
ginx.Bomb(200, errMsg)
|
||||
wg.Wait()
|
||||
|
||||
if len(errs) > 0 {
|
||||
return LogResp{}, errs[0]
|
||||
}
|
||||
|
||||
if len(resp.List) == 0 {
|
||||
return LogResp{}, fmt.Errorf("no data")
|
||||
}
|
||||
|
||||
return resp, nil
|
||||
}
|
||||
|
||||
func (rt *Router) QueryLogBatch(c *gin.Context) {
|
||||
var f QueryFrom
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
resp, err := QueryLogBatchConcurrently(rt.Center.AnonymousAccess.PromQuerier, c, f)
|
||||
if err != nil {
|
||||
ginx.Bomb(200, "err:%v", err)
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(resp, nil)
|
||||
}
|
||||
|
||||
func (rt *Router) QueryData(c *gin.Context) {
|
||||
var f models.QueryParam
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
func QueryDataConcurrently(anonymousAccess bool, ctx *gin.Context, f models.QueryParam) ([]models.DataResp, error) {
|
||||
var resp []models.DataResp
|
||||
var err error
|
||||
var mu sync.Mutex
|
||||
var wg sync.WaitGroup
|
||||
var errs []error
|
||||
|
||||
for _, q := range f.Querys {
|
||||
if !rt.Center.AnonymousAccess.PromQuerier && !CheckDsPerm(c, f.DatasourceId, f.Cate, q) {
|
||||
ginx.Bomb(403, "no permission")
|
||||
if !anonymousAccess && !CheckDsPerm(ctx, f.DatasourceId, f.Cate, q) {
|
||||
return nil, fmt.Errorf("forbidden")
|
||||
}
|
||||
|
||||
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
|
||||
if !exists {
|
||||
logger.Warningf("cluster:%d not exists", f.DatasourceId)
|
||||
ginx.Bomb(200, "cluster not exists")
|
||||
return nil, fmt.Errorf("cluster not exists")
|
||||
}
|
||||
var datas []models.DataResp
|
||||
datas, err = plug.QueryData(c.Request.Context(), q)
|
||||
if err != nil {
|
||||
logger.Warningf("query data error: req:%+v err:%v", q, err)
|
||||
ginx.Bomb(200, "err:%v", err)
|
||||
}
|
||||
logger.Debugf("query data: req:%+v resp:%+v", q, datas)
|
||||
resp = append(resp, datas...)
|
||||
|
||||
wg.Add(1)
|
||||
go func(query interface{}) {
|
||||
defer wg.Done()
|
||||
|
||||
datas, err := plug.QueryData(ctx.Request.Context(), query)
|
||||
if err != nil {
|
||||
logger.Warningf("query data error: req:%+v err:%v", query, err)
|
||||
mu.Lock()
|
||||
errs = append(errs, err)
|
||||
mu.Unlock()
|
||||
return
|
||||
}
|
||||
|
||||
logger.Debugf("query data: req:%+v resp:%+v", query, datas)
|
||||
mu.Lock()
|
||||
resp = append(resp, datas...)
|
||||
mu.Unlock()
|
||||
}(q)
|
||||
}
|
||||
|
||||
wg.Wait()
|
||||
|
||||
if len(errs) > 0 {
|
||||
return nil, errs[0]
|
||||
}
|
||||
|
||||
// Unified handling for the API
|
||||
// Sort by .Metric
|
||||
// Ensure that curves with the same legend get the same color in dashboards
|
||||
@@ -115,41 +161,80 @@ func (rt *Router) QueryData(c *gin.Context) {
|
||||
})
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(resp, err)
|
||||
return resp, nil
|
||||
}
|
||||
|
||||
func (rt *Router) QueryData(c *gin.Context) {
|
||||
var f models.QueryParam
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
resp, err := QueryDataConcurrently(rt.Center.AnonymousAccess.PromQuerier, c, f)
|
||||
if err != nil {
|
||||
ginx.Bomb(200, "err:%v", err)
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(resp, nil)
|
||||
}
|
||||
|
||||
// QueryLogConcurrently queries logs concurrently
|
||||
func QueryLogConcurrently(anonymousAccess bool, ctx *gin.Context, f models.QueryParam) (LogResp, error) {
|
||||
var resp LogResp
|
||||
var mu sync.Mutex
|
||||
var wg sync.WaitGroup
|
||||
var errs []error
|
||||
|
||||
for _, q := range f.Querys {
|
||||
if !anonymousAccess && !CheckDsPerm(ctx, f.DatasourceId, f.Cate, q) {
|
||||
return LogResp{}, fmt.Errorf("forbidden")
|
||||
}
|
||||
|
||||
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
|
||||
if !exists {
|
||||
logger.Warningf("cluster:%d not exists query:%+v", f.DatasourceId, f)
|
||||
return LogResp{}, fmt.Errorf("cluster not exists")
|
||||
}
|
||||
|
||||
wg.Add(1)
|
||||
go func(query interface{}) {
|
||||
defer wg.Done()
|
||||
|
||||
data, total, err := plug.QueryLog(ctx.Request.Context(), query)
|
||||
logger.Debugf("query log: req:%+v resp:%+v", query, data)
|
||||
if err != nil {
|
||||
errMsg := fmt.Sprintf("query data error: %v query:%v\n ", err, query)
|
||||
logger.Warningf(errMsg)
|
||||
mu.Lock()
|
||||
errs = append(errs, err)
|
||||
mu.Unlock()
|
||||
return
|
||||
}
|
||||
|
||||
mu.Lock()
|
||||
resp.List = append(resp.List, data...)
|
||||
resp.Total += total
|
||||
mu.Unlock()
|
||||
}(q)
|
||||
}
|
||||
|
||||
wg.Wait()
|
||||
|
||||
if len(errs) > 0 {
|
||||
return LogResp{}, errs[0]
|
||||
}
|
||||
|
||||
if len(resp.List) == 0 {
|
||||
return LogResp{}, fmt.Errorf("no data")
|
||||
}
|
||||
|
||||
return resp, nil
|
||||
}
|
||||
|
||||
func (rt *Router) QueryLogV2(c *gin.Context) {
|
||||
var f models.QueryParam
|
||||
ginx.BindJSON(c, &f)
|
||||
|
||||
var resp LogResp
|
||||
var errMsg string
|
||||
for _, q := range f.Querys {
|
||||
if !rt.Center.AnonymousAccess.PromQuerier && !CheckDsPerm(c, f.DatasourceId, f.Cate, q) {
|
||||
ginx.Bomb(200, "no permission")
|
||||
}
|
||||
|
||||
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
|
||||
if !exists {
|
||||
logger.Warningf("cluster:%d not exists query:%+v", f.DatasourceId, f)
|
||||
ginx.Bomb(200, "cluster not exists")
|
||||
}
|
||||
|
||||
data, total, err := plug.QueryLog(c.Request.Context(), q)
|
||||
if err != nil {
|
||||
errMsg += fmt.Sprintf("query data error: %v query:%v\n ", err, q)
|
||||
logger.Warningf("query data error: %v query:%v", err, q)
|
||||
continue
|
||||
}
|
||||
resp.List = append(resp.List, data...)
|
||||
resp.Total += total
|
||||
}
|
||||
|
||||
if errMsg != "" || len(resp.List) == 0 {
|
||||
ginx.Bomb(200, errMsg)
|
||||
}
|
||||
|
||||
ginx.NewRender(c).Data(resp, nil)
|
||||
resp, err := QueryLogConcurrently(rt.Center.AnonymousAccess.PromQuerier, c, f)
|
||||
ginx.NewRender(c).Data(resp, err)
|
||||
}
|
||||
|
||||
func (rt *Router) QueryLog(c *gin.Context) {
|
||||
@@ -159,7 +244,7 @@ func (rt *Router) QueryLog(c *gin.Context) {
|
||||
var resp []interface{}
|
||||
for _, q := range f.Querys {
|
||||
if !rt.Center.AnonymousAccess.PromQuerier && !CheckDsPerm(c, f.DatasourceId, f.Cate, q) {
|
||||
ginx.Bomb(200, "no permission")
|
||||
ginx.Bomb(200, "forbidden")
|
||||
}
|
||||
|
||||
plug, exists := dscache.DsCache.Get("elasticsearch", f.DatasourceId)
|
||||
|
||||
@@ -6,10 +6,10 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
func (rt *Router) recordingRuleGets(c *gin.Context) {
|
||||
@@ -19,7 +19,7 @@ func (rt *Router) recordingRuleGets(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) recordingRuleGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
|
||||
36
center/router/router_source_token.go
Normal file
@@ -0,0 +1,36 @@
|
||||
package router

import (
	"net/http"
	"time"

	"github.com/ccfos/nightingale/v6/models"
	"github.com/google/uuid"

	"github.com/gin-gonic/gin"
	"github.com/toolkits/pkg/ginx"
)

// sourceTokenAdd generates a new source token
func (rt *Router) sourceTokenAdd(c *gin.Context) {
	var f models.SourceToken
	ginx.BindJSON(c, &f)

	if f.ExpireAt > 0 && f.ExpireAt <= time.Now().Unix() {
		ginx.Bomb(http.StatusBadRequest, "expire time must be in the future")
	}

	token := uuid.New().String()

	username := c.MustGet("username").(string)

	f.Token = token
	f.CreateBy = username
	f.CreateAt = time.Now().Unix()

	err := f.Add(rt.Ctx)
	ginx.Dangerous(err)

	go models.CleanupExpiredTokens(rt.Ctx)
	ginx.NewRender(c).Data(token, nil)
}
|
||||
@@ -10,13 +10,13 @@ import (
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/ctx"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
"github.com/ccfos/nightingale/v6/storage"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/prometheus/common/model"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
type TargetQuery struct {
|
||||
@@ -44,7 +44,7 @@ func (rt *Router) targetGetsByHostFilter(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) targetGets(c *gin.Context) {
|
||||
bgids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
bgids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
limit := ginx.QueryInt(c, "limit", 30)
|
||||
downtime := ginx.QueryInt64(c, "downtime", 0)
|
||||
@@ -56,7 +56,14 @@ func (rt *Router) targetGets(c *gin.Context) {
|
||||
hosts := queryStrListField(c, "hosts", ",", " ", "\n")
|
||||
|
||||
var err error
|
||||
if len(bgids) == 0 {
|
||||
if len(bgids) > 0 {
|
||||
// If the user is viewing ungrouped machines, bgids = [0] is passed in; no permission check is needed then, so that case is excluded
|
||||
if !(len(bgids) == 1 && bgids[0] == 0) {
|
||||
for _, gid := range bgids {
|
||||
rt.bgroCheck(c, gid)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
user := c.MustGet("user").(*models.User)
|
||||
if !user.IsAdmin() {
|
||||
// For a non-admin user viewing all objects, find the business groups the user has permission for
|
||||
@@ -454,7 +461,7 @@ func (rt *Router) targetBindBgids(c *gin.Context) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if !can {
|
||||
ginx.Bomb(http.StatusForbidden, "No permission. You are not admin of BG(%s)", bg.Name)
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
isNeverGrouped, checkErr := haveNeverGroupedIdent(rt.Ctx, f.Idents)
|
||||
@@ -464,7 +471,7 @@ func (rt *Router) targetBindBgids(c *gin.Context) {
|
||||
can, err := user.CheckPerm(rt.Ctx, "/targets/bind")
|
||||
ginx.Dangerous(err)
|
||||
if !can {
|
||||
ginx.Bomb(http.StatusForbidden, "No permission. Only admin can assign BG")
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -549,7 +556,7 @@ func (rt *Router) checkTargetPerm(c *gin.Context, idents []string) {
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if len(nopri) > 0 {
|
||||
ginx.Bomb(http.StatusForbidden, "No permission to operate the targets: %s", strings.Join(nopri, ", "))
|
||||
ginx.Bomb(http.StatusForbidden, "forbidden")
|
||||
}
|
||||
}
|
||||
|
||||
@@ -571,6 +578,15 @@ func (rt *Router) targetsOfAlertRule(c *gin.Context) {
|
||||
ginx.NewRender(c).Data(ret, err)
|
||||
}
|
||||
|
||||
func (rt *Router) checkTargetsExistByIndent(idents []string) {
|
||||
notExists, err := models.TargetNoExistIdents(rt.Ctx, idents)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
if len(notExists) > 0 {
|
||||
ginx.Bomb(http.StatusBadRequest, "targets not exist: %s", strings.Join(notExists, ", "))
|
||||
}
|
||||
}
|
||||
|
||||
func (rt *Router) targetsOfHostQuery(c *gin.Context) {
|
||||
var queries []models.HostQuery
|
||||
ginx.BindJSON(c, &queries)
|
||||
|
||||
@@ -1,15 +1,16 @@
|
||||
package router
|
||||
|
||||
import (
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/alert/sender"
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/i18n"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
func (rt *Router) taskGets(c *gin.Context) {
|
||||
@@ -40,7 +41,7 @@ func (rt *Router) taskGets(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) taskGetsByGids(c *gin.Context) {
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
@@ -84,20 +85,6 @@ func (rt *Router) taskGetsByGids(c *gin.Context) {
|
||||
}, nil)
|
||||
}
|
||||
|
||||
type taskForm struct {
|
||||
Title string `json:"title" binding:"required"`
|
||||
Account string `json:"account" binding:"required"`
|
||||
Batch int `json:"batch"`
|
||||
Tolerance int `json:"tolerance"`
|
||||
Timeout int `json:"timeout"`
|
||||
Pause string `json:"pause"`
|
||||
Script string `json:"script" binding:"required"`
|
||||
Args string `json:"args"`
|
||||
Action string `json:"action" binding:"required"`
|
||||
Creator string `json:"creator"`
|
||||
Hosts []string `json:"hosts" binding:"required"`
|
||||
}
|
||||
|
||||
func (rt *Router) taskRecordAdd(c *gin.Context) {
|
||||
var f *models.TaskRecord
|
||||
ginx.BindJSON(c, &f)
|
||||
@@ -112,11 +99,21 @@ func (rt *Router) taskAdd(c *gin.Context) {
|
||||
|
||||
var f models.TaskForm
|
||||
ginx.BindJSON(c, &f)
|
||||
// Filter out empty strings from f.Hosts
|
||||
hosts := make([]string, 0, len(f.Hosts))
|
||||
for i := range f.Hosts {
|
||||
if strings.TrimSpace(f.Hosts[i]) != "" {
|
||||
hosts = append(hosts, strings.TrimSpace(f.Hosts[i]))
|
||||
}
|
||||
}
|
||||
f.Hosts = hosts
|
||||
|
||||
bgid := ginx.UrlParamInt64(c, "id")
|
||||
user := c.MustGet("user").(*models.User)
|
||||
f.Creator = user.Username
|
||||
|
||||
rt.checkTargetsExistByIndent(f.Hosts)
|
||||
|
||||
err := f.Verify()
|
||||
ginx.Dangerous(err)
|
||||
|
||||
|
||||
@@ -7,6 +7,7 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
@@ -35,7 +36,7 @@ func (rt *Router) taskTplGetsByGids(c *gin.Context) {
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
limit := ginx.QueryInt(c, "limit", 20)
|
||||
|
||||
gids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
|
||||
gids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "gids", ""), ",")
|
||||
if len(gids) > 0 {
|
||||
for _, gid := range gids {
|
||||
rt.bgroCheck(c, gid)
|
||||
@@ -118,6 +119,18 @@ type taskTplForm struct {
|
||||
Hosts []string `json:"hosts"`
|
||||
}
|
||||
|
||||
func (f *taskTplForm) Verify() {
|
||||
// The incoming f.Hosts may be []string{"", "a", "b"}; empty strings need to be filtered out
|
||||
args := make([]string, 0, len(f.Hosts))
|
||||
for _, ident := range f.Hosts {
|
||||
if strings.TrimSpace(ident) != "" {
|
||||
args = append(args, strings.TrimSpace(ident))
|
||||
}
|
||||
}
|
||||
|
||||
f.Hosts = args
|
||||
}
|
||||
|
||||
func (rt *Router) taskTplAdd(c *gin.Context) {
|
||||
if !rt.Ibex.Enable {
|
||||
ginx.Bomb(400, i18n.Sprintf(c.GetHeader("X-Language"), "This functionality has not been enabled. Please contact the system administrator to activate it."))
|
||||
@@ -126,10 +139,13 @@ func (rt *Router) taskTplAdd(c *gin.Context) {
|
||||
|
||||
var f taskTplForm
|
||||
ginx.BindJSON(c, &f)
|
||||
f.Verify()
|
||||
|
||||
user := c.MustGet("user").(*models.User)
|
||||
now := time.Now().Unix()
|
||||
|
||||
rt.checkTargetsExistByIndent(f.Hosts)
|
||||
|
||||
sort.Strings(f.Tags)
|
||||
|
||||
tpl := &models.TaskTpl{
|
||||
@@ -167,6 +183,9 @@ func (rt *Router) taskTplPut(c *gin.Context) {
|
||||
|
||||
var f taskTplForm
|
||||
ginx.BindJSON(c, &f)
|
||||
f.Verify()
|
||||
|
||||
rt.checkTargetsExistByIndent(f.Hosts)
|
||||
|
||||
sort.Strings(f.Tags)
|
||||
|
||||
|
||||
@@ -47,12 +47,27 @@ func (rt *Router) userGets(c *gin.Context) {
|
||||
query := ginx.QueryStr(c, "query", "")
|
||||
order := ginx.QueryStr(c, "order", "username")
|
||||
desc := ginx.QueryBool(c, "desc", false)
|
||||
usernames := strings.Split(ginx.QueryStr(c, "usernames", ""), ",")
|
||||
phones := strings.Split(ginx.QueryStr(c, "phones", ""), ",")
|
||||
emails := strings.Split(ginx.QueryStr(c, "emails", ""), ",")
|
||||
|
||||
if len(usernames) == 1 && usernames[0] == "" {
|
||||
usernames = []string{}
|
||||
}
|
||||
|
||||
if len(phones) == 1 && phones[0] == "" {
|
||||
phones = []string{}
|
||||
}
|
||||
|
||||
if len(emails) == 1 && emails[0] == "" {
|
||||
emails = []string{}
|
||||
}
|
||||
|
||||
go rt.UserCache.UpdateUsersLastActiveTime()
|
||||
total, err := models.UserTotal(rt.Ctx, query, stime, etime)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
list, err := models.UserGets(rt.Ctx, query, limit, ginx.Offset(c, limit), stime, etime, order, desc)
|
||||
list, err := models.UserGets(rt.Ctx, query, limit, ginx.Offset(c, limit), stime, etime, order, desc, usernames, phones, emails)
|
||||
ginx.Dangerous(err)
|
||||
|
||||
user := c.MustGet("user").(*models.User)
|
||||
|
||||
@@ -6,11 +6,11 @@ import (
|
||||
|
||||
"github.com/ccfos/nightingale/v6/models"
|
||||
"github.com/ccfos/nightingale/v6/pkg/flashduty"
|
||||
"github.com/ccfos/nightingale/v6/pkg/strx"
|
||||
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/toolkits/pkg/ginx"
|
||||
"github.com/toolkits/pkg/logger"
|
||||
"github.com/toolkits/pkg/str"
|
||||
)
|
||||
|
||||
func (rt *Router) checkBusiGroupPerm(c *gin.Context) {
|
||||
@@ -32,7 +32,7 @@ func (rt *Router) userGroupGets(c *gin.Context) {
|
||||
}
|
||||
|
||||
func (rt *Router) userGroupGetsByService(c *gin.Context) {
|
||||
ids := str.IdsInt64(ginx.QueryStr(c, "ids", ""))
|
||||
ids := strx.IdsInt64ForAPI(ginx.QueryStr(c, "ids", ""))
|
||||
|
||||
if len(ids) == 0 {
|
||||
lst, err := models.UserGroupGetAll(rt.Ctx)
|
||||
@@ -111,7 +111,6 @@ func (rt *Router) userGroupPut(c *gin.Context) {
|
||||
|
||||
me := c.MustGet("user").(*models.User)
|
||||
ug := c.MustGet("user_group").(*models.UserGroup)
|
||||
oldUGName := ug.Name
|
||||
|
||||
if ug.Name != f.Name {
|
||||
// name changed, check duplication
|
||||
@@ -130,7 +129,7 @@ func (rt *Router) userGroupPut(c *gin.Context) {
|
||||
if f.IsSyncToFlashDuty || flashduty.NeedSyncTeam(rt.Ctx) {
|
||||
ugs, err := flashduty.NewUserGroupSyncer(rt.Ctx, ug)
|
||||
ginx.Dangerous(err)
|
||||
err = ugs.SyncUGPut(oldUGName)
|
||||
err = ugs.SyncUGPut()
|
||||
ginx.Dangerous(err)
|
||||
}
|
||||
ginx.NewRender(c).Message(ug.Update(rt.Ctx, "Name", "Note", "UpdateAt", "UpdateBy"))
|
||||
@@ -159,8 +158,11 @@ func (rt *Router) userGroupDel(c *gin.Context) {
|
||||
if isSyncToFlashDuty || flashduty.NeedSyncTeam(rt.Ctx) {
|
||||
ugs, err := flashduty.NewUserGroupSyncer(rt.Ctx, ug)
|
||||
ginx.Dangerous(err)
|
||||
err = ugs.SyncUGDel(ug.Name)
|
||||
ginx.Dangerous(err)
|
||||
err = ugs.SyncUGDel()
|
||||
// If the team is still referenced in duty or has already been deleted, an error is returned; this error can be ignored
|
||||
if err != nil {
|
||||
logger.Warningf("failed to sync user group %s to flashduty's team: %v", ug.Name, err)
|
||||
}
|
||||
}
|
||||
ginx.NewRender(c).Message(ug.Del(rt.Ctx))
|
||||
|
||||
|
||||
@@ -40,7 +40,7 @@ func (rt *Router) userVariableConfigPut(context *gin.Context) {
|
||||
user := context.MustGet("user").(*models.User)
|
||||
if !user.IsAdmin() && f.CreateBy != user.Username {
|
||||
// only admin or creator can update
|
||||
ginx.Bomb(403, "no permission")
|
||||
ginx.Bomb(403, "forbidden")
|
||||
}
|
||||
|
||||
ginx.NewRender(context).Message(models.ConfigsUserVariableUpdate(rt.Ctx, f))
|
||||
@@ -54,7 +54,7 @@ func (rt *Router) userVariableConfigDel(context *gin.Context) {
|
||||
user := context.MustGet("user").(*models.User)
|
||||
if !user.IsAdmin() && configs.CreateBy != user.Username {
|
||||
// only admin or creator can delete
|
||||
ginx.Bomb(403, "no permission")
|
||||
ginx.Bomb(403, "forbidden")
|
||||
}
|
||||
|
||||
if configs != nil && configs.External == models.ConfigExternal {
|
||||
|
||||
@@ -54,7 +54,7 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
|
||||
targetCache := memsto.NewTargetCache(ctx, syncStats, redis)
|
||||
busiGroupCache := memsto.NewBusiGroupCache(ctx, syncStats)
|
||||
configCvalCache := memsto.NewCvalCache(ctx, syncStats)
|
||||
idents := idents.New(ctx, redis)
|
||||
idents := idents.New(ctx, redis, config.Pushgw)
|
||||
metas := metas.New(redis)
|
||||
writers := writer.NewWriters(config.Pushgw)
|
||||
pushgwRouter := pushgwrt.New(config.HTTP, config.Pushgw, config.Alert, targetCache, busiGroupCache, idents, metas, writers, ctx)
|
||||
|
||||
@@ -24,6 +24,7 @@ type Query struct {
|
||||
Index string `json:"index" mapstructure:"index"`
|
||||
IndexPatternId int64 `json:"index_pattern" mapstructure:"index_pattern"`
|
||||
Filter string `json:"filter" mapstructure:"filter"`
|
||||
Offset int64 `json:"offset" mapstructure:"offset"`
|
||||
MetricAggr MetricAggr `json:"value" mapstructure:"value"`
|
||||
GroupBy []GroupBy `json:"group_by" mapstructure:"group_by"`
|
||||
DateField string `json:"date_field" mapstructure:"date_field"`
|
||||
@@ -347,12 +348,14 @@ func QueryData(ctx context.Context, queryParam interface{}, cliTimeout int64, ve
|
||||
if ip, ok := GetEsIndexPatternCacheType().Get(param.IndexPatternId); ok {
|
||||
param.DateField = ip.TimeField
|
||||
indexArr = []string{ip.Name}
|
||||
param.Index = ip.Name
|
||||
} else {
|
||||
return nil, fmt.Errorf("index pattern:%d not found", param.IndexPatternId)
|
||||
}
|
||||
} else {
|
||||
indexArr = strings.Split(param.Index, ",")
|
||||
}
|
||||
|
||||
q := elastic.NewRangeQuery(param.DateField)
|
||||
now := time.Now().Unix()
|
||||
var start, end int64
|
||||
@@ -370,6 +373,11 @@ func QueryData(ctx context.Context, queryParam interface{}, cliTimeout int64, ve
|
||||
start = start - delay
|
||||
}
|
||||
|
||||
if param.Offset > 0 {
|
||||
end = end - param.Offset
|
||||
start = start - param.Offset
|
||||
}
|
||||
|
||||
q.Gte(time.Unix(start, 0).UnixMilli())
|
||||
q.Lte(time.Unix(end, 0).UnixMilli())
|
||||
q.Format("epoch_millis")
|
||||
@@ -481,7 +489,7 @@ func QueryData(ctx context.Context, queryParam interface{}, cliTimeout int64, ve
|
||||
|
||||
source, _ := queryString.Source()
|
||||
b, _ := json.Marshal(source)
|
||||
logger.Debugf("query_data q:%+v tsAggr:%+v query_string:%s", param, tsAggr, string(b))
|
||||
logger.Debugf("query_data q:%+v indexArr:%+v tsAggr:%+v query_string:%s", param, indexArr, tsAggr, string(b))
|
||||
|
||||
searchSource := elastic.NewSearchSource().
|
||||
Query(queryString).
|
||||
@@ -528,7 +536,16 @@ func QueryData(ctx context.Context, queryParam interface{}, cliTimeout int64, ve
|
||||
|
||||
GetBuckts("", keys, bucketsData, metrics, "", 0, param.MetricAggr.Func)
|
||||
|
||||
return TransferData(fmt.Sprintf("%s_%s", field, param.MetricAggr.Func), param.Ref, metrics.Data), nil
|
||||
items, err := TransferData(fmt.Sprintf("%s_%s", field, param.MetricAggr.Func), param.Ref, metrics.Data), nil
|
||||
|
||||
var m map[string]interface{}
|
||||
bs, _ := json.Marshal(queryParam)
|
||||
json.Unmarshal(bs, &m)
|
||||
m["index"] = param.Index
|
||||
for i := range items {
|
||||
items[i].Query = fmt.Sprintf("%+v", m)
|
||||
}
|
||||
return items, nil
|
||||
}
|
||||
|
||||
func HitFilter(typ string) bool {
|
||||
|
||||
@@ -98,6 +98,7 @@ func GetDatasourceByType(typ string, settings map[string]interface{}) (Datasourc
|
||||
type DatasourceInfo struct {
|
||||
Id int64 `json:"id"`
|
||||
Name string `json:"name"`
|
||||
Identifier string `json:"identifier"`
|
||||
Description string `json:"description"`
|
||||
ClusterName string `json:"cluster_name"`
|
||||
Category string `json:"category"`
|
||||
|
||||
@@ -187,6 +187,7 @@ func (e *Elasticsearch) QueryData(ctx context.Context, queryParam interface{}) (
|
||||
search := func(ctx context.Context, indices []string, source interface{}, timeout int, maxShard int) (*elastic.SearchResult, error) {
|
||||
return e.Client.Search().
|
||||
Index(indices...).
|
||||
IgnoreUnavailable(true).
|
||||
Source(source).
|
||||
Timeout(fmt.Sprintf("%ds", timeout)).
|
||||
MaxConcurrentShardRequests(maxShard).
|
||||
@@ -204,7 +205,7 @@ func (e *Elasticsearch) QueryIndices() ([]string, error) {
|
||||
|
||||
func (e *Elasticsearch) QueryFields(indexs []string) ([]string, error) {
|
||||
var fields []string
|
||||
result, err := elastic.NewGetFieldMappingService(e.Client).Index(indexs...).Do(context.Background())
|
||||
result, err := elastic.NewGetFieldMappingService(e.Client).Index(indexs...).IgnoreUnavailable(true).Do(context.Background())
|
||||
if err != nil {
|
||||
return fields, err
|
||||
}
|
||||
@@ -264,6 +265,7 @@ func (e *Elasticsearch) QueryLog(ctx context.Context, queryParam interface{}) ([
|
||||
|
||||
return e.Client.Search().
|
||||
Index(indices...).
|
||||
IgnoreUnavailable(true).
|
||||
MaxConcurrentShardRequests(maxShard).
|
||||
Source(source).
|
||||
Timeout(fmt.Sprintf("%ds", timeout)).
|
||||
@@ -276,6 +278,7 @@ func (e *Elasticsearch) QueryLog(ctx context.Context, queryParam interface{}) ([
|
||||
func (e *Elasticsearch) QueryFieldValue(indexs []string, field string, query string) ([]string, error) {
|
||||
var values []string
|
||||
search := e.Client.Search().
|
||||
IgnoreUnavailable(true).
|
||||
Index(indexs...).
|
||||
Size(0)
|
||||
|
||||
@@ -359,6 +362,7 @@ func (e *Elasticsearch) QueryMapData(ctx context.Context, query interface{}) ([]
|
||||
|
||||
return e.Client.Search().
|
||||
Index(indices...).
|
||||
IgnoreUnavailable(true).
|
||||
Source(source).
|
||||
Timeout(fmt.Sprintf("%ds", timeout)).
|
||||
Do(ctx)
|
||||
|
||||
BIN doc/img/readme/2025-05-23_18-43-37.png (Normal file, binary file not shown, size 384 KiB)
BIN doc/img/readme/2025-05-23_18-46-06.png (Normal file, binary file not shown, size 345 KiB)
BIN doc/img/readme/2025-05-23_18-49-02.png (Normal file, binary file not shown, size 336 KiB)
BIN doc/img/readme/2025-05-30_08-49-28.png (Normal file, binary file not shown, size 497 KiB)
BIN doc/img/readme/logos.png (Normal file, binary file not shown, size 956 KiB)
@@ -5,7 +5,7 @@ RunMode = "release"
|
||||
# log write dir
|
||||
Dir = "logs"
|
||||
# log level: DEBUG INFO WARNING ERROR
|
||||
Level = "DEBUG"
|
||||
Level = "INFO"
|
||||
# stdout, stderr, file
|
||||
Output = "stdout"
|
||||
# # rotate by time
|
||||
@@ -75,15 +75,6 @@ DefaultRoles = ["Standard"]
|
||||
[HTTP.RSA]
|
||||
# open RSA
|
||||
OpenRSA = false
|
||||
# Before replacing the key file, make sure that there are no encrypted variables in the database "configs".
|
||||
# It is recommended to decrypt and remove all encrypted values from the database before replacing the key file.
|
||||
# This will prevent any potential issues with accessing or decrypting the variables using the new key file.
|
||||
# RSA public key (auto create)
|
||||
RSAPublicKeyPath = "etc/rsa/public.pem"
|
||||
# RSA private key (auto create)
|
||||
RSAPrivateKeyPath = "etc/rsa/private.pem"
|
||||
# RSA private key password
|
||||
RSAPassWord = "n9e@n9e!"
|
||||
|
||||
[DB]
|
||||
# postgres: DSN="host=127.0.0.1 port=5432 user=root dbname=n9e_v6 password=1234 sslmode=disable"
|
||||
@@ -138,8 +129,6 @@ AlertDetail = false
|
||||
[Pushgw]
|
||||
# use target labels in database instead of in series
|
||||
LabelRewrite = true
|
||||
# # default busigroup key name
|
||||
# BusiGroupLabelKey = "busigroup"
|
||||
ForceUseServerTS = true
|
||||
|
||||
# [Pushgw.DebugSample]
|
||||
|
||||
@@ -5,7 +5,7 @@ RunMode = "release"
|
||||
# log write dir
|
||||
Dir = "logs"
|
||||
# log level: DEBUG INFO WARNING ERROR
|
||||
Level = "DEBUG"
|
||||
Level = "INFO"
|
||||
# stdout, stderr, file
|
||||
Output = "file"
|
||||
# # rotate by time
|
||||
@@ -71,15 +71,6 @@ DefaultRoles = ["Standard"]
|
||||
[HTTP.RSA]
|
||||
# open RSA
|
||||
OpenRSA = false
|
||||
# Before replacing the key file, make sure that there are no encrypted variables in the database "configs".
|
||||
# It is recommended to decrypt and remove all encrypted values from the database before replacing the key file.
|
||||
# This will prevent any potential issues with accessing or decrypting the variables using the new key file.
|
||||
# RSA public key (auto create)
|
||||
RSAPublicKeyPath = "etc/rsa/public.pem"
|
||||
# RSA private key (auto create)
|
||||
RSAPrivateKeyPath = "etc/rsa/private.pem"
|
||||
# RSA private key password
|
||||
RSAPassWord = "n9e@n9e!"
|
||||
|
||||
[DB]
|
||||
# postgres: host=%s port=%s user=%s dbname=%s password=%s sslmode=%s
|
||||
@@ -135,8 +126,6 @@ AlertDetail = true
|
||||
[Pushgw]
|
||||
# use target labels in database instead of in series
|
||||
LabelRewrite = true
|
||||
# # default busigroup key name
|
||||
# BusiGroupLabelKey = "busigroup"
|
||||
ForceUseServerTS = true
|
||||
|
||||
# [Pushgw.DebugSample]
|
||||
|
||||
@@ -5,7 +5,7 @@ RunMode = "release"
|
||||
# log write dir
|
||||
Dir = "logs"
|
||||
# log level: DEBUG INFO WARNING ERROR
|
||||
Level = "DEBUG"
|
||||
Level = "INFO"
|
||||
# stdout, stderr, file
|
||||
Output = "stdout"
|
||||
# # rotate by time
|
||||
@@ -71,15 +71,6 @@ DefaultRoles = ["Standard"]
|
||||
[HTTP.RSA]
|
||||
# open RSA
|
||||
OpenRSA = false
|
||||
# Before replacing the key file, make sure that there are no encrypted variables in the database "configs".
|
||||
# It is recommended to decrypt and remove all encrypted values from the database before replacing the key file.
|
||||
# This will prevent any potential issues with accessing or decrypting the variables using the new key file.
|
||||
# RSA public key (auto create)
|
||||
RSAPublicKeyPath = "etc/rsa/public.pem"
|
||||
# RSA private key (auto create)
|
||||
RSAPrivateKeyPath = "etc/rsa/private.pem"
|
||||
# RSA private key password
|
||||
RSAPassWord = "n9e@n9e!"
|
||||
|
||||
[DB]
|
||||
# postgres: host=%s port=%s user=%s dbname=%s password=%s sslmode=%s
|
||||
@@ -135,8 +126,6 @@ AlertDetail = true
|
||||
[Pushgw]
|
||||
# use target labels in database instead of in series
|
||||
LabelRewrite = true
|
||||
# # default busigroup key name
|
||||
# BusiGroupLabelKey = "busigroup"
|
||||
ForceUseServerTS = true
|
||||
|
||||
# [Pushgw.DebugSample]
|
||||
|
||||
@@ -903,4 +903,16 @@ CREATE TABLE dash_annotation (
|
||||
create_by varchar(64) not null default '',
|
||||
update_at bigint not null default 0,
|
||||
update_by varchar(64) not null default ''
|
||||
);
|
||||
);
|
||||
|
||||
CREATE TABLE source_token (
|
||||
id bigserial PRIMARY KEY,
|
||||
source_type varchar(64) NOT NULL DEFAULT '',
|
||||
source_id varchar(255) NOT NULL DEFAULT '',
|
||||
token varchar(255) NOT NULL DEFAULT '',
|
||||
expire_at bigint NOT NULL DEFAULT 0,
|
||||
create_at bigint NOT NULL DEFAULT 0,
|
||||
create_by varchar(64) NOT NULL DEFAULT ''
|
||||
);
|
||||
|
||||
CREATE INDEX idx_source_token_type_id_token ON source_token (source_type, source_id, token);
|
||||
|
||||
@@ -5,7 +5,7 @@ RunMode = "release"
|
||||
# log write dir
|
||||
Dir = "logs"
|
||||
# log level: DEBUG INFO WARNING ERROR
|
||||
Level = "DEBUG"
|
||||
Level = "INFO"
|
||||
# stdout, stderr, file
|
||||
Output = "stdout"
|
||||
# # rotate by time
|
||||
@@ -130,8 +130,6 @@ AlertDetail = true
|
||||
[Pushgw]
|
||||
# use target labels in database instead of in series
|
||||
LabelRewrite = true
|
||||
# # default busigroup key name
|
||||
# BusiGroupLabelKey = "busigroup"
|
||||
ForceUseServerTS = true
|
||||
|
||||
# [Pushgw.DebugSample]
|
||||
|
||||
@@ -1,25 +0,0 @@
|
||||
#### {{if .IsRecovered}}<font color="#008800">💚{{.RuleName}}</font>{{else}}<font color="#FF0000">💔{{.RuleName}}</font>{{end}}
|
||||
|
||||
---
|
||||
{{$time_duration := sub now.Unix .FirstTriggerTime }}{{if .IsRecovered}}{{$time_duration = sub .LastEvalTime .FirstTriggerTime }}{{end}}
|
||||
- **告警级别**: {{.Severity}}级
|
||||
{{- if .RuleNote}}
|
||||
- **规则备注**: {{.RuleNote}}
|
||||
{{- end}}
|
||||
{{- if not .IsRecovered}}
|
||||
- **当次触发时值**: {{.TriggerValue}}
|
||||
- **当次触发时间**: {{timeformat .TriggerTime}}
|
||||
- **告警持续时长**: {{humanizeDurationInterface $time_duration}}
|
||||
{{- else}}
|
||||
{{- if .AnnotationsJSON.recovery_value}}
|
||||
- **恢复时值**: {{formatDecimal .AnnotationsJSON.recovery_value 4}}
|
||||
{{- end}}
|
||||
- **恢复时间**: {{timeformat .LastEvalTime}}
|
||||
- **告警持续时长**: {{humanizeDurationInterface $time_duration}}
|
||||
{{- end}}
|
||||
- **告警事件标签**:
|
||||
{{- range $key, $val := .TagsMap}}
|
||||
{{- if ne $key "rulename" }}
|
||||
- `{{$key}}`: `{{$val}}`
|
||||
{{- end}}
|
||||
{{- end}}
|
||||
@@ -1,224 +0,0 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta http-equiv="X-UA-Compatible" content="ie=edge">
|
||||
<title>夜莺告警通知</title>
|
||||
<style type="text/css">
|
||||
.wrapper {
|
||||
background-color: #f8f8f8;
|
||||
padding: 15px;
|
||||
height: 100%;
|
||||
}
|
||||
.main {
|
||||
width: 600px;
|
||||
padding: 30px;
|
||||
margin: 0 auto;
|
||||
background-color: #fff;
|
||||
font-size: 12px;
|
||||
font-family: verdana,'Microsoft YaHei',Consolas,'Deja Vu Sans Mono','Bitstream Vera Sans Mono';
|
||||
}
|
||||
header {
|
||||
border-radius: 2px 2px 0 0;
|
||||
}
|
||||
header .title {
|
||||
font-size: 14px;
|
||||
color: #333333;
|
||||
margin: 0;
|
||||
}
|
||||
header .sub-desc {
|
||||
color: #333;
|
||||
font-size: 14px;
|
||||
margin-top: 6px;
|
||||
margin-bottom: 0;
|
||||
}
|
||||
hr {
|
||||
margin: 20px 0;
|
||||
height: 0;
|
||||
border: none;
|
||||
border-top: 1px solid #e5e5e5;
|
||||
}
|
||||
em {
|
||||
font-weight: 600;
|
||||
}
|
||||
table {
|
||||
margin: 20px 0;
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
table tbody tr{
|
||||
font-weight: 200;
|
||||
font-size: 12px;
|
||||
color: #666;
|
||||
height: 32px;
|
||||
}
|
||||
|
||||
.succ {
|
||||
background-color: green;
|
||||
color: #fff;
|
||||
}
|
||||
|
||||
.fail {
|
||||
background-color: red;
|
||||
color: #fff;
|
||||
}
|
||||
|
||||
.succ th, .succ td, .fail th, .fail td {
|
||||
color: #fff;
|
||||
}
|
||||
|
||||
table tbody tr th {
|
||||
width: 80px;
|
||||
text-align: right;
|
||||
}
|
||||
.text-right {
|
||||
text-align: right;
|
||||
}
|
||||
.body {
|
||||
margin-top: 24px;
|
||||
}
|
||||
.body-text {
|
||||
color: #666666;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
.body-extra {
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
.body-extra.text-right a {
|
||||
text-decoration: none;
|
||||
color: #333;
|
||||
}
|
||||
.body-extra.text-right a:hover {
|
||||
color: #666;
|
||||
}
|
||||
.button {
|
||||
width: 200px;
|
||||
height: 50px;
|
||||
margin-top: 20px;
|
||||
text-align: center;
|
||||
border-radius: 2px;
|
||||
background: #2D77EE;
|
||||
line-height: 50px;
|
||||
font-size: 20px;
|
||||
color: #FFFFFF;
|
||||
cursor: pointer;
|
||||
}
|
||||
.button:hover {
|
||||
background: rgb(25, 115, 255);
|
||||
border-color: rgb(25, 115, 255);
|
||||
color: #fff;
|
||||
}
|
||||
footer {
|
||||
margin-top: 10px;
|
||||
text-align: right;
|
||||
}
|
||||
.footer-logo {
|
||||
text-align: right;
|
||||
}
|
||||
.footer-logo-image {
|
||||
width: 108px;
|
||||
height: 27px;
|
||||
margin-right: 10px;
|
||||
}
|
||||
.copyright {
|
||||
margin-top: 10px;
|
||||
font-size: 12px;
|
||||
text-align: right;
|
||||
color: #999;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="wrapper">
|
||||
<div class="main">
|
||||
<header>
|
||||
<h3 class="title">{{.RuleName}}</h3>
|
||||
<p class="sub-desc"></p>
|
||||
</header>
|
||||
|
||||
<hr>
|
||||
|
||||
<div class="body">
|
||||
<table cellspacing="0" cellpadding="0" border="0">
|
||||
<tbody>
|
||||
{{if .IsRecovered}}
|
||||
<tr class="succ">
|
||||
<th>级别状态:</th>
|
||||
<td>S{{.Severity}} Recovered</td>
|
||||
</tr>
|
||||
{{else}}
|
||||
<tr class="fail">
|
||||
<th>级别状态:</th>
|
||||
<td>S{{.Severity}} Triggered</td>
|
||||
</tr>
|
||||
{{end}}
|
||||
|
||||
<tr>
|
||||
<th>策略备注:</th>
|
||||
<td>{{.RuleNote}}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>设备备注:</th>
|
||||
<td>{{.TargetNote}}</td>
|
||||
</tr>
|
||||
{{if not .IsRecovered}}
|
||||
<tr>
|
||||
<th>触发时值:</th>
|
||||
<td>{{.TriggerValue}}</td>
|
||||
</tr>
|
||||
{{end}}
|
||||
|
||||
{{if .TargetIdent}}
|
||||
<tr>
|
||||
<th>监控对象:</th>
|
||||
<td>{{.TargetIdent}}</td>
|
||||
</tr>
|
||||
{{end}}
|
||||
<tr>
|
||||
<th>监控指标:</th>
|
||||
<td>{{.TagsJSON}}</td>
|
||||
</tr>
|
||||
|
||||
{{if .IsRecovered}}
|
||||
<tr>
|
||||
<th>恢复时间:</th>
|
||||
<td>{{timeformat .LastEvalTime}}</td>
|
||||
</tr>
|
||||
{{else}}
|
||||
<tr>
|
||||
<th>触发时间:</th>
|
||||
<td>
|
||||
{{timeformat .TriggerTime}}
|
||||
</td>
|
||||
</tr>
|
||||
{{end}}
|
||||
|
||||
<tr>
|
||||
<th>发送时间:</th>
|
||||
<td>
|
||||
{{timestamp}}
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<th>PromQL:</th>
|
||||
<td>
|
||||
{{.PromQl}}
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
|
||||
<hr>
|
||||
|
||||
<footer>
|
||||
<div class="copyright" style="font-style: italic">
|
||||
报警太多?使用 <a href="https://flashcat.cloud/product/flashduty/" target="_blank">FlashDuty</a> 做告警聚合降噪、排班OnCall!
|
||||
</div>
|
||||
</footer>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
@@ -1,7 +0,0 @@
|
||||
级别状态: S{{.Severity}} {{if .IsRecovered}}Recovered{{else}}Triggered{{end}}
|
||||
规则名称: {{.RuleName}}{{if .RuleNote}}
|
||||
规则备注: {{.RuleNote}}{{end}}
|
||||
监控指标: {{.TagsJSON}}
|
||||
{{if .IsRecovered}}恢复时间:{{timeformat .LastEvalTime}}{{else}}触发时间: {{timeformat .TriggerTime}}
|
||||
触发时值: {{.TriggerValue}}{{end}}
|
||||
发送时间: {{timestamp}}
|
||||
@@ -1,7 +0,0 @@
|
||||
级别状态: S{{.Severity}} {{if .IsRecovered}}Recovered{{else}}Triggered{{end}}
|
||||
规则名称: {{.RuleName}}{{if .RuleNote}}
|
||||
规则备注: {{.RuleNote}}{{end}}
|
||||
监控指标: {{.TagsJSON}}
|
||||
{{if .IsRecovered}}恢复时间:{{timeformat .LastEvalTime}}{{else}}触发时间: {{timeformat .TriggerTime}}
|
||||
触发时值: {{.TriggerValue}}{{end}}
|
||||
发送时间: {{timestamp}}
|
||||
@@ -1 +0,0 @@
|
||||
{{if .IsRecovered}}Recovered{{else}}Triggered{{end}}: {{.RuleName}} {{.TagsJSON}}
|
||||
@@ -1,7 +0,0 @@
|
||||
**级别状态**: {{if .IsRecovered}}<font color="info">S{{.Severity}} Recovered</font>{{else}}<font color="warning">S{{.Severity}} Triggered</font>{{end}}
|
||||
**规则标题**: {{.RuleName}}{{if .RuleNote}}
|
||||
**规则备注**: {{.RuleNote}}{{end}}
|
||||
**监控指标**: {{.TagsJSON}}
|
||||
{{if .IsRecovered}}**恢复时间**:{{timeformat .LastEvalTime}}{{else}}**触发时间**: {{timeformat .TriggerTime}}
|
||||
**触发时值**: {{.TriggerValue}}{{end}}
|
||||
**发送时间**: {{timestamp}}
|
||||
@@ -1,7 +0,0 @@
|
||||
**级别状态**: {{if .IsRecovered}}<font color="info">S{{.Severity}} Recovered</font>{{else}}<font color="warning">S{{.Severity}} Triggered</font>{{end}}
|
||||
**规则标题**: {{.RuleName}}{{if .RuleNote}}
|
||||
**规则备注**: {{.RuleNote}}{{end}}
|
||||
**监控指标**: {{.TagsJSON}}
|
||||
{{if .IsRecovered}}**恢复时间**:{{timeformat .LastEvalTime}}{{else}}**触发时间**: {{timeformat .TriggerTime}}
|
||||
**触发时值**: {{.TriggerValue}}{{end}}
|
||||
**发送时间**: {{timestamp}}
|
||||
@@ -107,12 +107,6 @@ insert into `role_operation`(role_name, operation) values('Standard', '/help/mig
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules-built-in');
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards-built-in');
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/trace/dependencies');
|
||||
|
||||
insert into `role_operation`(role_name, operation) values('Admin', '/help/source');
|
||||
insert into `role_operation`(role_name, operation) values('Admin', '/help/sso');
|
||||
insert into `role_operation`(role_name, operation) values('Admin', '/help/notification-tpls');
|
||||
insert into `role_operation`(role_name, operation) values('Admin', '/help/notification-settings');
|
||||
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/users');
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups');
|
||||
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups/add');
|
||||
@@ -291,6 +285,8 @@ CREATE TABLE `alert_rule` (
|
||||
`append_tags` varchar(255) not null default '' comment 'split by space: service=n9e mod=api',
|
||||
`annotations` text not null comment 'annotations',
|
||||
`extra_config` text,
|
||||
`notify_rule_ids` varchar(1024) DEFAULT '',
|
||||
`notify_version` int DEFAULT 0,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
@@ -351,6 +347,8 @@ CREATE TABLE `alert_subscribe` (
|
||||
`extra_config` text,
|
||||
`redefine_webhooks` tinyint(1) default 0,
|
||||
`for_duration` bigint not null default 0,
|
||||
`notify_rule_ids` varchar(1024) DEFAULT '',
|
||||
`notify_version` int DEFAULT 0,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
@@ -467,6 +465,7 @@ CREATE TABLE `alert_cur_event` (
|
||||
`rule_config` text not null comment 'annotations',
|
||||
`tags` varchar(1024) not null default '' comment 'merge data_tags rule_tags, split by ,,',
|
||||
`original_tags` text comment 'labels key=val,,k2=v2',
|
||||
`notify_rule_ids` text COMMENT 'notify rule ids',
|
||||
PRIMARY KEY (`id`),
|
||||
KEY (`hash`),
|
||||
KEY (`rule_id`),
|
||||
@@ -509,6 +508,7 @@ CREATE TABLE `alert_his_event` (
|
||||
`original_tags` text comment 'labels key=val,,k2=v2',
|
||||
`annotations` text not null comment 'annotations',
|
||||
`rule_config` text not null comment 'annotations',
|
||||
`notify_rule_ids` text COMMENT 'notify rule ids',
|
||||
PRIMARY KEY (`id`),
|
||||
INDEX `idx_last_eval_time` (`last_eval_time`),
|
||||
KEY (`hash`),
|
||||
@@ -533,7 +533,7 @@ CREATE TABLE `builtin_components` (
|
||||
`updated_by` varchar(191) NOT NULL DEFAULT '' COMMENT '''updater''',
|
||||
`disabled` int NOT NULL DEFAULT 0 COMMENT '''is disabled or not''',
|
||||
PRIMARY KEY (`id`),
|
||||
UNIQUE KEY `idx_ident` (`ident`)
|
||||
KEY (`ident`)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
|
||||
|
||||
CREATE TABLE `builtin_payloads` (
|
||||
@@ -560,6 +560,7 @@ CREATE TABLE `builtin_payloads` (
|
||||
|
||||
CREATE TABLE notification_record (
|
||||
`id` BIGINT PRIMARY KEY AUTO_INCREMENT,
|
||||
`notify_rule_id` BIGINT NOT NULL DEFAULT 0,
|
||||
`event_id` bigint NOT NULL COMMENT 'event history id',
|
||||
`sub_id` bigint COMMENT 'subscribed rule id',
|
||||
`channel` varchar(255) NOT NULL COMMENT 'notification channel name',
|
||||
@@ -640,6 +641,7 @@ CREATE TABLE `datasource`
|
||||
(
|
||||
`id` int unsigned NOT NULL AUTO_INCREMENT,
|
||||
`name` varchar(191) not null default '',
|
||||
`identifier` varchar(255) not null default '',
|
||||
`description` varchar(255) not null default '',
|
||||
`category` varchar(255) not null default '',
|
||||
`plugin_id` int unsigned not null default 0,
|
||||
@@ -696,6 +698,7 @@ CREATE TABLE `es_index_pattern` (
|
||||
`allow_hide_system_indices` tinyint(1) not null default 0,
|
||||
`fields_format` varchar(4096) not null default '',
|
||||
`cross_cluster_enabled` int not null default 0,
|
||||
`note` varchar(1024) not null default '',
|
||||
`create_at` bigint default '0',
|
||||
`create_by` varchar(64) default '',
|
||||
`update_at` bigint default '0',
|
||||
@@ -786,6 +789,7 @@ CREATE TABLE `notify_rule` (
|
||||
`enable` tinyint(1) not null default 0,
|
||||
`user_group_ids` varchar(255) not null default '',
|
||||
`notify_configs` text,
|
||||
`pipeline_configs` text,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
@@ -802,6 +806,7 @@ CREATE TABLE `notify_channel` (
|
||||
`param_config` text,
|
||||
`request_type` varchar(50) not null,
|
||||
`request_config` text,
|
||||
`weight` int not null default 0,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
@@ -817,6 +822,7 @@ CREATE TABLE `message_template` (
|
||||
`user_group_ids` varchar(64),
|
||||
`notify_channel_ident` varchar(64) not null default '',
|
||||
`private` int not null default 0,
|
||||
`weight` int not null default 0,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
@@ -824,6 +830,35 @@ CREATE TABLE `message_template` (
|
||||
PRIMARY KEY (`id`)
|
||||
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
|
||||
|
||||
CREATE TABLE `event_pipeline` (
|
||||
`id` bigint unsigned not null auto_increment,
|
||||
`name` varchar(128) not null,
|
||||
`team_ids` text,
|
||||
`description` varchar(255) not null default '',
|
||||
`filter_enable` tinyint(1) not null default 0,
|
||||
`label_filters` text,
|
||||
`attribute_filters` text,
|
||||
`processors` text,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
`update_by` varchar(64) not null default '',
|
||||
PRIMARY KEY (`id`)
|
||||
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
|
||||
|
||||
CREATE TABLE `embedded_product` (
|
||||
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
|
||||
`name` varchar(255) DEFAULT NULL,
|
||||
`url` varchar(255) DEFAULT NULL,
|
||||
`is_private` boolean DEFAULT NULL,
|
||||
`team_ids` varchar(255),
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
`update_by` varchar(64) not null default '',
|
||||
PRIMARY KEY (`id`)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
|
||||
|
||||
CREATE TABLE `task_meta`
|
||||
(
|
||||
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
|
||||
@@ -2180,4 +2215,16 @@ CREATE TABLE task_host_99
|
||||
UNIQUE KEY `idx_id_host` (`id`, `host`),
|
||||
PRIMARY KEY (`ii`)
|
||||
) ENGINE = InnoDB
|
||||
- DEFAULT CHARSET = utf8mb4;
+ DEFAULT CHARSET = utf8mb4;
|
||||
|
||||
CREATE TABLE `source_token` (
|
||||
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
|
||||
`source_type` varchar(64) NOT NULL DEFAULT '' COMMENT 'source type',
|
||||
`source_id` varchar(255) NOT NULL DEFAULT '' COMMENT 'source identifier',
|
||||
`token` varchar(255) NOT NULL DEFAULT '' COMMENT 'access token',
|
||||
`expire_at` bigint NOT NULL DEFAULT 0 COMMENT 'expire timestamp',
|
||||
`create_at` bigint NOT NULL DEFAULT 0 COMMENT 'create timestamp',
|
||||
`create_by` varchar(64) NOT NULL DEFAULT '' COMMENT 'creator',
|
||||
PRIMARY KEY (`id`),
|
||||
KEY `idx_source_type_id_token` (`source_type`, `source_id`, `token`)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
|
||||
|
||||
@@ -210,7 +210,55 @@ CREATE TABLE `message_template` (
|
||||
PRIMARY KEY (`id`)
|
||||
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
|
||||
|
||||
ALTER TABLE `alert_rule` ADD COLUMN `notify_rule_ids` varchar(1024) DEFAULT '';
|
||||
ALTER TABLE `alert_rule` ADD COLUMN `notify_version` int DEFAULT 0;
|
||||
|
||||
ALTER TABLE `alert_subscribe` ADD COLUMN `notify_rule_ids` varchar(1024) DEFAULT '';
|
||||
ALTER TABLE `alert_subscribe` ADD COLUMN `notify_version` int DEFAULT 0;
|
||||
|
||||
ALTER TABLE `notification_record` ADD COLUMN `notify_rule_id` BIGINT NOT NULL DEFAULT 0;
|
||||
|
||||
|
||||
/* v8.0.0-beta.9 2025-03-17 */
|
||||
ALTER TABLE `message_template` ADD COLUMN `weight` int not null default 0;
|
||||
ALTER TABLE `notify_channel` ADD COLUMN `weight` int not null default 0;
|
||||
|
||||
/* v8.0.0-beta.11 2025-04-10 */
|
||||
ALTER TABLE `es_index_pattern` ADD COLUMN `note` varchar(1024) not null default '';
|
||||
ALTER TABLE `datasource` ADD COLUMN `identifier` varchar(255) not null default '';
|
||||
|
||||
/* v8.0.0-beta.11 2025-05-15 */
|
||||
ALTER TABLE `notify_rule` ADD COLUMN `pipeline_configs` text;
|
||||
|
||||
CREATE TABLE `event_pipeline` (
|
||||
`id` bigint unsigned not null auto_increment,
|
||||
`name` varchar(128) not null,
|
||||
`team_ids` text,
|
||||
`description` varchar(255) not null default '',
|
||||
`filter_enable` tinyint(1) not null default 0,
|
||||
`label_filters` text,
|
||||
`attribute_filters` text,
|
||||
`processors` text,
|
||||
`create_at` bigint not null default 0,
|
||||
`create_by` varchar(64) not null default '',
|
||||
`update_at` bigint not null default 0,
|
||||
`update_by` varchar(64) not null default '',
|
||||
PRIMARY KEY (`id`)
|
||||
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
|
||||
|
||||
/* v8.0.0-next */
|
||||
CREATE TABLE `source_token` (
|
||||
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
|
||||
`source_type` varchar(64) NOT NULL DEFAULT '' COMMENT 'source type',
|
||||
`source_id` varchar(255) NOT NULL DEFAULT '' COMMENT 'source identifier',
|
||||
`token` varchar(255) NOT NULL DEFAULT '' COMMENT 'access token',
|
||||
`expire_at` bigint NOT NULL DEFAULT 0 COMMENT 'expire timestamp',
|
||||
`create_at` bigint NOT NULL DEFAULT 0 COMMENT 'create timestamp',
|
||||
`create_by` varchar(64) NOT NULL DEFAULT '' COMMENT 'creator',
|
||||
PRIMARY KEY (`id`),
|
||||
KEY `idx_source_type_id_token` (`source_type`, `source_id`, `token`)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
|
||||
|
||||
/* v8.0.0-beta.12 2025-06-03 */
|
||||
ALTER TABLE `alert_his_event` ADD COLUMN `notify_rule_ids` text COMMENT 'notify rule ids';
|
||||
ALTER TABLE `alert_cur_event` ADD COLUMN `notify_rule_ids` text COMMENT 'notify rule ids';
|
||||
|
||||
@@ -47,6 +47,7 @@ var PromDefaultDatasourceId int64
|
||||
func getDatasourcesFromDBLoop(ctx *ctx.Context, fromAPI bool) {
|
||||
for {
|
||||
if !fromAPI {
|
||||
foundDefaultDatasource := false
|
||||
items, err := models.GetDatasources(ctx)
|
||||
if err != nil {
|
||||
logger.Errorf("get datasource from database fail: %v", err)
|
||||
@@ -58,6 +59,7 @@ func getDatasourcesFromDBLoop(ctx *ctx.Context, fromAPI bool) {
|
||||
for _, item := range items {
|
||||
if item.PluginType == "prometheus" && item.IsDefault {
|
||||
atomic.StoreInt64(&PromDefaultDatasourceId, item.Id)
|
||||
foundDefaultDatasource = true
|
||||
}
|
||||
|
||||
logger.Debugf("get datasource: %+v", item)
|
||||
@@ -90,6 +92,12 @@ func getDatasourcesFromDBLoop(ctx *ctx.Context, fromAPI bool) {
|
||||
}
|
||||
dss = append(dss, ds)
|
||||
}
|
||||
|
||||
if !foundDefaultDatasource && atomic.LoadInt64(&PromDefaultDatasourceId) != 0 {
|
||||
logger.Debugf("no default datasource found")
|
||||
atomic.StoreInt64(&PromDefaultDatasourceId, 0)
|
||||
}
|
||||
|
||||
PutDatasources(dss)
|
||||
} else {
|
||||
FromAPIHook()
|
||||
@@ -183,7 +191,14 @@ func PutDatasources(items []datasource.DatasourceInfo) {
|
||||
ids = append(ids, item.Id)
|
||||
|
||||
// Initialize the client asynchronously; otherwise datasource sync becomes very slow
|
||||
- go DsCache.Put(typ, item.Id, ds)
+ go func() {
+     defer func() {
+         if r := recover(); r != nil {
+             logger.Errorf("panic in datasource item: %+v panic:%v", item, r)
+         }
+     }()
+     DsCache.Put(typ, item.Id, ds)
+ }()
|
||||
}
|
||||
|
||||
logger.Debugf("get plugin by type success Ids:%v", ids)
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -1,5 +1,5 @@
|
||||
{
- "name": " Kubernetes-Deployment/ Container",
+ "name": "Kubernetes / Deployment / Container",
|
||||
"tags": "Categraf",
|
||||
"configs": {
|
||||
"panels": [
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"id": 0,
|
||||
"group_id": 0,
- "name": "Kubernetes / Container",
+ "name": "Kubernetes / Pod",
|
||||
"ident": "",
|
||||
"tags": "Categraf",
|
||||
"create_at": 0,
|
||||
@@ -1748,20 +1748,34 @@
|
||||
],
|
||||
"var": [
|
||||
{
- "definition": "prometheus",
  "name": "datasource",
- "type": "datasource"
+ "type": "datasource",
+ "definition": "prometheus",
+ "defaultValue": 40
  },
  {
+ "name": "namespace",
+ "type": "query",
+ "hide": false,
  "datasource": {
  "cate": "prometheus",
  "value": "${datasource}"
  },
- "definition": "label_values(container_cpu_usage_seconds_total, pod)",
- "multi": false,
- "name": "pod_name",
+ "definition": "label_values(container_cpu_usage_seconds_total, namespace)",
  "reg": "",
- "type": "query"
+ "multi": false
+ },
+ {
+ "name": "pod_name",
+ "type": "query",
+ "hide": false,
+ "datasource": {
+ "cate": "prometheus",
+ "value": "${datasource}"
+ },
+ "definition": "label_values(container_cpu_usage_seconds_total{namespace=\"$namespace\"}, pod)",
+ "reg": "",
+ "multi": false
  }
|
||||
],
|
||||
"version": "3.0.0"
|
||||
@@ -1,5 +1,5 @@
|
||||
{
- "name": " Kubernetes-Statefulset / Container ",
+ "name": "Kubernetes / Statefulset / Container ",
|
||||
"tags": "Categraf",
|
||||
"configs": {
|
||||
"panels": [
|
||||
|
||||
342
integrations/Kubernetes/metrics/k8s-node.json
Normal file
@@ -0,0 +1,342 @@
|
||||
[
|
||||
{
|
||||
"uuid": 1745735239727485700,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "TCP当前连接数",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_netstat_Tcp_CurrEstab * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239701096000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件描述符使用数",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_filefd_allocated * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239704160000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件描述符最大限制",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_filefd_maximum * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239750006800,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件系统inode使用率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: -",
|
||||
"lang": "zh_CN",
|
||||
"expression": "100 - (node_filesystem_files_free * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} / node_filesystem_files * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} * 100)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239746991600,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件系统使用率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: -",
|
||||
"lang": "zh_CN",
|
||||
"expression": "100 - ((node_filesystem_avail_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} * 100) / node_filesystem_size_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239753550000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件系统错误数",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(node_filesystem_device_error * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}) by (mountpoint)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239743097300,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "磁盘IO使用率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_disk_io_now[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239740169500,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "磁盘写入IOPS",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_disk_writes_completed_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239734228700,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "磁盘写入速率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_disk_written_bytes_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239737122600,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "磁盘读取IOPS",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_disk_reads_completed_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239730406000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "磁盘读取速率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_disk_read_bytes_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239694202600,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "系统上下文切换率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_context_switches_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239697167400,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "系统中断率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "rate(node_intr_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239724650200,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送丢包率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_transmit_drop_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239710266000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送带宽",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_transmit_bytes_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239716205000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送错误率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_transmit_errs_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239721688800,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收丢包率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_receive_drop_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239707241500,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收带宽",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_receive_bytes_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239713318000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收错误率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(node_network_receive_errs_total[5m]) * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239783181800,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络连接跟踪条目数",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_nf_conntrack_entries * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239786134000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络连接跟踪限制",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_nf_conntrack_entries_limit * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239675145700,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点 CPU 使用率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: by",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum by (instance) (rate(node_cpu_seconds_total{mode!~\"idle|iowait|steal\"}[5m])) * on(instance) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} *100"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239691192000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点15分钟负载",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_load15 * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239685264100,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点1分钟负载",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_load1 * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239688232700,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点5分钟负载",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_load5 * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239776256800,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点Swap使用量",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_SwapTotal_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} - node_memory_SwapFree_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239779806500,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点Swap总量",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_SwapTotal_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239681529300,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点上运行的Pod数量",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(kube_pod_info * on(node) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239678397700,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存使用率",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(node_memory_MemTotal_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"} - node_memory_MemAvailable_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}) / sum(node_memory_MemTotal_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"})"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239760507400,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存详细信息 - 可用",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_MemAvailable_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239756641800,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存详细信息 - 总量",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_MemTotal_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239772786200,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存详细信息 - 空闲",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_MemFree_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239769542000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存详细信息 - 缓冲区",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_Buffers_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
},
|
||||
{
|
||||
"uuid": 1745735239764136000,
|
||||
"collector": "Node",
|
||||
"typ": "Kubernetes",
|
||||
"name": "节点内存详细信息 - 缓存",
|
||||
"unit": "",
|
||||
"note": "节点指标\n类型: *",
|
||||
"lang": "zh_CN",
|
||||
"expression": "node_memory_Cached_bytes * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~\"$node_name\"}"
|
||||
}
|
||||
]
|
||||
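Review note on `k8s-node.json`: most expressions in this file share one shape, a node-exporter series multiplied by `node_uname_info` with `on(instance, cluster) group_left(nodename)`. Because `node_uname_info` is an info metric whose value is 1, the multiplication leaves the sample value unchanged while copying the `nodename` label over, so the `$node_name` filter can be applied. The sketch below only reproduces that string pattern; `withNodeName` is a hypothetical helper and is not how the file was generated as far as the diff shows.

```go
package main

import "fmt"

// withNodeName joins a node-exporter series with node_uname_info so the
// result carries a nodename label and is restricted to $node_name.
func withNodeName(metric string) string {
	return fmt.Sprintf(
		`%s * on(instance, cluster) group_left(nodename) node_uname_info{nodename=~"$node_name"}`,
		metric,
	)
}

func main() {
	fmt.Println(withNodeName("node_load1"))
	fmt.Println(withNodeName("rate(node_context_switches_total[5m])"))
}
```

A few entries deviate from the pattern, for example the CPU usage expression joins on `instance` only and the Pod count joins on `node`, so a helper like this would not cover every expression in the file.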
282
integrations/Kubernetes/metrics/k8s-pod.json
Normal file
@@ -0,0 +1,282 @@
|
||||
[
|
||||
{
|
||||
"uuid": 1745893024149445000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "Inode数量",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_fs_inodes_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024121015300,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "不可中断任务数量",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_tasks_state{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\", state=\"uninterruptible\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024130551800,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器cache使用",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "(sum(container_memory_cache{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name))"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024108569900,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器CPU Limit",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}/container_spec_cpu_period{namespace=\"$namespace\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "(sum(container_spec_cpu_quota{namespace=\"$namespace\", pod=~\"$pod_name\"}/container_spec_cpu_period{namespace=\"$namespace\", pod=~\"$pod_name\"}) by (name))"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024112672500,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器CPU load 10",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_cpu_load_average_10s{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024026246700,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器CPU使用率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])*100) by(name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024029544000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器CPU归一化后使用率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])*100) by(name)/((sum(container_spec_cpu_quota{namespace=\"$namespace\", pod=~\"$pod_name\"}/container_spec_cpu_period{namespace=\"$namespace\", pod=~\"$pod_name\"}) by (name)))"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024146207700,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器I/O",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_fs_io_current{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024136457000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器RSS内存使用",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "(sum(container_memory_rss{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name))"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024139900200,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器内存 Limit",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_spec_memory_limit_bytes{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024032984300,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器内存使用",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "(sum(container_memory_usage_bytes{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name))"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024127585500,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器内存使用率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "((sum(container_memory_usage_bytes{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)) /(sum(container_spec_memory_limit_bytes{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)))*100"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024093620000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器内核态CPU使用率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_cpu_system_seconds_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])*100) by(name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024102879000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器发生CPU throttle的比率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m]))by(name) *100"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024143177000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器发生OOM次数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_oom_events_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024083942000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器启动时长(小时)",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum((time()-container_start_time_seconds{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"})) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024152466200,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器已使用的文件系统大小",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(container_fs_usage_bytes{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}) by (name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024097849600,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "容器用户态CPU使用率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_cpu_user_seconds_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])*100) by(name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024036896800,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件系统写入速率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_fs_writes_bytes_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])) by(name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024057722000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "文件系统读取速率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\",",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_fs_reads_bytes_total{namespace=\"$namespace\", pod=~\"$pod_name\", image!~\".*pause.*\"}[1m])) by(name)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024166898000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送丢包数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024160266500,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送数据包",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_transmit_packets_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024069935000,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送速率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_transmit_bytes_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024163721700,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络发送错误数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_transmit_errors_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024173485600,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收丢包数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_receive_packets_dropped_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024156389600,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收数据包数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_receive_packets_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024075864800,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收速率",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_receive_bytes_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
},
|
||||
{
|
||||
"uuid": 1745893024170233300,
|
||||
"collector": "Pod",
|
||||
"typ": "Kubernetes",
|
||||
"name": "网络接收错误数",
|
||||
"unit": "",
|
||||
"note": "Pod自身指标\n类型: pod=~\"$pod_name\"}[1m]))",
|
||||
"lang": "zh_CN",
|
||||
"expression": "sum(rate(container_network_receive_errors_total{namespace=\"$namespace\", pod=~\"$pod_name\"}[1m])) by(name, interface)"
|
||||
}
|
||||
]
|
||||
@@ -26,7 +26,7 @@
 "prom_ql": "",
 "queries": [
 {
-"prom_ql": "(node_filesystem_device_error{instance=\"$node\",mountpoint!~\"/var/lib/.*\",mountpoint!~\"/run.*\"}) \u003e 0",
+"prom_ql": "(node_filesystem_device_error{mountpoint!~\"/var/lib/.*\",mountpoint!~\"/run.*\"}) \u003e 0",
 "severity": 1
 }
 ],
@@ -271,7 +271,7 @@
 "prom_ql": "",
 "queries": [
 {
-"prom_ql": "(node_filefd_allocated{instance=\"$node\"}/node_filefd_maximum{instance=\"$node\"}*100) \u003e 90",
+"prom_ql": "(node_filefd_allocated/node_filefd_maximum*100) \u003e 90",
 "severity": 2
 }
 ],
8
integrations/Linux/collect/ntp/ntp.toml
Normal file
@@ -0,0 +1,8 @@
+# # collect interval
+# interval = 15
+
+# # ntp servers
+# ntp_servers = ["ntp.aliyun.com"]
+
+# # response time out seconds
+# timeout = 5
@@ -1,6 +1,6 @@
 {
-"name": "机器常用指标 - 所有机器",
-"tags": "categraf",
+"name": "机器常用指标(使用 Categraf 作为采集器,如果只想看当前业务组内的机器修改大盘变量 ident 的变量类型为机器标识即可)",
+"tags": "Categraf",
 "ident": "",
 "uuid": 1737103014612000,
 "configs": {
@@ -33,9 +33,11 @@
 "datasourceValue": "${prom}",
 "targets": [
 {
-"expr": "count(last_over_time(system_uptime{ident=~\"$ident\"}[1m]))",
-"maxDataPoints": 240,
-"refId": "A"
+"expr": "count(last_over_time(system_uptime{ident=~\"$ident\"}[$__rate_interval]))",
+"maxDataPoints": 480,
+"refId": "A",
+"step": 15,
+"instant": false
 }
 ],
 "transformations": [
@@ -48,7 +50,7 @@
 "maxPerRow": 4,
 "custom": {
 "textMode": "value",
-"graphMode": "none",
+"graphMode": "area",
 "colorMode": "background",
 "calc": "lastNotNull",
 "valueField": "Value",
@@ -333,7 +335,7 @@
 {
 "expr": "100-cpu_usage_idle{ident=~\"$ident\",cpu=\"cpu-total\"}",
 "legend": "{{ident}}",
-"maxDataPoints": 240,
+"maxDataPoints": 480,
 "refId": "A",
 "step": 15
 }
@@ -348,7 +350,8 @@
 "maxPerRow": 4,
 "options": {
 "tooltip": {
-"mode": "single"
+"mode": "all",
+"sort": "desc"
 },
 "legend": {
 "displayMode": "hidden",
@@ -428,9 +431,9 @@
 "datasourceValue": "${prom}",
 "targets": [
 {
-"expr": "rate(diskio_io_time{ident=~\"$ident\"}[1m])/10",
+"expr": "rate(diskio_io_time{ident=~\"$ident\"}[$__rate_interval])/10",
 "legend": "{{ident}} {{name}}",
-"maxDataPoints": 240,
+"maxDataPoints": 480,
 "refId": "A",
 "step": 15
 }
@@ -445,7 +448,8 @@
 "maxPerRow": 4,
 "options": {
 "tooltip": {
-"mode": "single"
+"mode": "all",
+"sort": "desc"
 },
 "legend": {
 "displayMode": "hidden",
@@ -527,7 +531,7 @@
 {
 "expr": "mem_used_percent{ident=~\"$ident\"}",
 "legend": "{{ident}}",
-"maxDataPoints": 240,
+"maxDataPoints": 480,
 "refId": "A",
 "step": 15
 }
@@ -542,7 +546,8 @@
 "maxPerRow": 4,
 "options": {
 "tooltip": {
-"mode": "single"
+"mode": "all",
+"sort": "desc"
 },
 "legend": {
 "displayMode": "hidden",
@@ -622,9 +627,9 @@
 "datasourceValue": "${prom}",
 "targets": [
 {
-"expr": "(1 - mem_swap_free / mem_swap_total)*100",
+"expr": "(1 - mem_swap_free{ident=~\"$ident\"} / mem_swap_total{ident=~\"$ident\"})*100 and mem_swap_total{ident=~\"$ident\"} > 0",
 "legend": "{{ident}}",
-"maxDataPoints": 240,
+"maxDataPoints": 480,
 "refId": "A",
 "step": 15
 }
@@ -640,7 +645,8 @@
 "maxPerRow": 4,
 "options": {
 "tooltip": {
-"mode": "single"
+"mode": "all",
+"sort": "desc"
 },
 "legend": {
 "displayMode": "hidden",
@@ -720,9 +726,9 @@
 "datasourceValue": "${prom}",
 "targets": [
 {
-"expr": "increase(kernel_vmstat_oom_kill{ident=~\"$ident\"}[5m])",
+"expr": "rate(kernel_vmstat_oom_kill{ident=~\"$ident\"}[$__rate_interval])",
 "legend": "{{ident}}",
-"maxDataPoints": 240,
+"maxDataPoints": 480,
 "refId": "A",
 "step": 15
 }
@@ -733,11 +739,12 @@
 "options": {}
 }
 ],
-"name": "5分钟内OOM次数",
+"name": "每秒OOM次数",
 "maxPerRow": 4,
 "options": {
 "tooltip": {
-"mode": "single"
+"mode": "all",
+"sort": "desc"
 },
 "legend": {
 "displayMode": "hidden",
@@ -762,8 +769,8 @@
 "steps": [
 {
 "color": "#634CD9",
-"type": "base",
-"value": null
+"value": null,
+"type": "base"
 }
 ]
 },
@@ -827,9 +834,9 @@
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(net_bytes_recv{ident=~\"$ident\"}[1m])*8",
|
||||
"expr": "rate(net_bytes_recv{ident=~\"$ident\"}[$__rate_interval])*8",
|
||||
"legend": "{{ident}} {{interface}}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
@@ -844,7 +851,8 @@
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -914,9 +922,9 @@
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(net_bytes_sent{ident=~\"$ident\"}[1m])*8",
|
||||
"expr": "rate(net_bytes_sent{ident=~\"$ident\"}[$__rate_interval])*8",
|
||||
"legend": "{{ident}} {{interface}}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "B",
|
||||
"step": 15
|
||||
}
|
||||
@@ -931,7 +939,8 @@
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -986,21 +995,7 @@
|
||||
]
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"spanNulls": false,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"description": "",
|
||||
"type": "timeseries",
|
||||
"id": "cfb80689-de7b-47fb-9155-052b796dd7f5",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
@@ -1010,43 +1005,13 @@
|
||||
"i": "cfb80689-de7b-47fb-9155-052b796dd7f5",
|
||||
"isResizable": true
|
||||
},
|
||||
"maxPerRow": 4,
|
||||
"name": "Time Wait 状态的连接数",
|
||||
"options": {
|
||||
"legend": {
|
||||
"behaviour": "showItem",
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"steps": [
|
||||
{
|
||||
"color": "#634CD9",
|
||||
"type": "base",
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
},
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
}
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
],
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "netstat_tcp_tw{ident=~\"$ident\"}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "B",
|
||||
"step": 15
|
||||
}
|
||||
@@ -1057,8 +1022,61 @@
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "3.0.0"
|
||||
"name": "Time Wait 状态的连接数",
|
||||
"description": "",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
"placement": "bottom",
|
||||
"behaviour": "showItem",
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "#634CD9",
|
||||
"value": null,
|
||||
"type": "base"
|
||||
}
|
||||
]
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "dashed"
|
||||
}
|
||||
},
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"lineInterpolation": "smooth",
|
||||
"spanNulls": false,
|
||||
"lineWidth": 2,
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"stack": "off",
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "none",
|
||||
"pointSize": 5
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "timeseries",
|
||||
@@ -1076,16 +1094,18 @@
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(net_err_in{ident=~\"$ident\"}[1m])",
|
||||
"expr": "rate(net_err_in{ident=~\"$ident\"}[$__rate_interval])",
|
||||
"legend": "{{ident}}-{{interface}}-in",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
},
|
||||
{
|
||||
"expr": "rate(net_err_out{ident=~\"$ident\"}[1m])",
|
||||
"expr": "rate(net_err_out{ident=~\"$ident\"}[$__rate_interval])",
|
||||
"legend": "{{ident}}-{{interface}}-out",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "B"
|
||||
"maxDataPoints": 480,
|
||||
"refId": "B",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
@@ -1098,7 +1118,8 @@
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -1164,16 +1185,18 @@
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "rate(net_drop_in{ident=~\"$ident\"}[1m])",
|
||||
"expr": "rate(net_drop_in{ident=~\"$ident\"}[$__rate_interval])",
|
||||
"legend": "{{ident}}-{{interface}}-in",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
},
|
||||
{
|
||||
"expr": "rate(net_drop_out{ident=~\"$ident\"}[1m])",
|
||||
"expr": "rate(net_drop_out{ident=~\"$ident\"}[$__rate_interval])",
|
||||
"legend": "{{ident}}-{{interface}}-out",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "B"
|
||||
"maxDataPoints": 480,
|
||||
"refId": "B",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
@@ -1186,7 +1209,8 @@
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -1269,7 +1293,7 @@
|
||||
{
|
||||
"expr": "disk_device_error{ident=~\"$ident\"}",
|
||||
"legend": "{{ident}} {{path}}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
@@ -1284,7 +1308,8 @@
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -1293,7 +1318,7 @@
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"decimals": 2
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
@@ -1352,7 +1377,7 @@
|
||||
{
|
||||
"expr": "100 * conntrack_ip_conntrack_count{ident=~\"$ident\"} / conntrack_ip_conntrack_max{ident=~\"$ident\"}",
|
||||
"legend": "ip_conntrack {{ident}}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
},
|
||||
@@ -1360,7 +1385,7 @@
|
||||
"__mode__": "__query__",
|
||||
"expr": "100 * conntrack_nf_conntrack_count{ident=~\"$ident\"} / conntrack_nf_conntrack_max{ident=~\"$ident\"}",
|
||||
"legend": "nf_conntrack {{ident}}",
|
||||
"maxDataPoints": 240,
|
||||
"maxDataPoints": 480,
|
||||
"refId": "B",
|
||||
"step": 15
|
||||
}
|
||||
@@ -1372,10 +1397,12 @@
|
||||
}
|
||||
],
|
||||
"name": "Conntrack使用率",
|
||||
"description": "`dmesg -T` 有时看到 conntrack table full 的报错,大概率就是 conntrack 限制太小了,需要调整内核参数",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "single"
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
@@ -1425,6 +1452,346 @@
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "timeseries",
|
||||
"id": "7c90380f-5ab6-4aa5-9070-f604985a0389",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 32,
|
||||
"i": "e7117d7c-b946-49fa-bc49-2afb0d2b3a44",
|
||||
"isResizable": true
|
||||
},
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "processes_total{ident=~\"$ident\"}",
|
||||
"legend": "",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "Process 总量",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
"placement": "bottom",
|
||||
"behaviour": "showItem",
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "rgba(44, 157, 61, 1)",
|
||||
"value": null,
|
||||
"type": "base"
|
||||
}
|
||||
]
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "dashed"
|
||||
}
|
||||
},
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"lineInterpolation": "smooth",
|
||||
"spanNulls": false,
|
||||
"lineWidth": 2,
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"stack": "off",
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "none",
|
||||
"pointSize": 5
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "timeseries",
|
||||
"id": "3334c222-dd92-49eb-9744-4ce0f59031e4",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 32,
|
||||
"i": "0ecb9f26-4c4d-40d7-9934-5116e3ffa51a",
|
||||
"isResizable": true
|
||||
},
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "procstat_rlimit_num_fds_hard{ident=~\"$ident\"}",
|
||||
"legend": "",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "进程句柄数限制(低于4096要注意)",
|
||||
"description": "以现在的硬件配置,通常句柄的 ulimit 应该比较大,如果低于 4096,大概率是忘记修改配置了,需要注意。这个数据是 Categraf 的 procstat 插件采集的。",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
"placement": "bottom",
|
||||
"behaviour": "showItem",
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "rgba(44, 157, 61, 1)",
|
||||
"value": null,
|
||||
"type": "base"
|
||||
}
|
||||
]
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "dashed"
|
||||
}
|
||||
},
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"lineInterpolation": "smooth",
|
||||
"spanNulls": false,
|
||||
"lineWidth": 2,
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"stack": "off",
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "none",
|
||||
"pointSize": 5
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "timeseries",
|
||||
"id": "c3ee640f-e654-4fc7-aa2a-0dd8e9de67cb",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 37,
|
||||
"i": "423adbbf-8c23-45ab-b7d5-9a81b72291f1",
|
||||
"isResizable": true
|
||||
},
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "ntp_offset_ms{ident=~\"$ident\"}",
|
||||
"legend": "",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "NTP时间偏移",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
"placement": "bottom",
|
||||
"behaviour": "showItem",
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"util": "milliseconds",
|
||||
"decimals": 2
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "rgba(44, 157, 61, 1)",
|
||||
"value": null,
|
||||
"type": "base"
|
||||
}
|
||||
]
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "dashed"
|
||||
}
|
||||
},
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"lineInterpolation": "smooth",
|
||||
"spanNulls": false,
|
||||
"lineWidth": 2,
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"stack": "off",
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "none",
|
||||
"pointSize": 5
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "timeseries",
|
||||
"id": "9bb8d5ef-dc4e-419f-8e95-6dbb97b2afb6",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 37,
|
||||
"i": "e97f1934-26e8-4bf3-be21-95307443f146",
|
||||
"isResizable": true
|
||||
},
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "linux_sysctl_fs_file_nr{ident=~\"$ident\"}/linux_sysctl_fs_file_max{ident=~\"$ident\"} * 100",
|
||||
"legend": "",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "操作系统文件句柄使用率",
|
||||
"description": "",
|
||||
"maxPerRow": 4,
|
||||
"options": {
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
},
|
||||
"legend": {
|
||||
"displayMode": "hidden",
|
||||
"placement": "bottom",
|
||||
"behaviour": "showItem",
|
||||
"selectMode": "single"
|
||||
},
|
||||
"standardOptions": {
|
||||
"util": "percent",
|
||||
"decimals": 0
|
||||
},
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "rgba(44, 157, 61, 1)",
|
||||
"value": null,
|
||||
"type": "base"
|
||||
}
|
||||
]
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "dashed"
|
||||
}
|
||||
},
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"lineInterpolation": "smooth",
|
||||
"spanNulls": false,
|
||||
"lineWidth": 2,
|
||||
"fillOpacity": 0.03,
|
||||
"gradientMode": "none",
|
||||
"stack": "off",
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "none",
|
||||
"pointSize": 5
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
"matcher": {
|
||||
"id": "byFrameRefID"
|
||||
},
|
||||
"properties": {
|
||||
"rightYAxisDisplay": "off"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"var": [
|
||||
@@ -1450,4 +1817,4 @@
|
||||
],
|
||||
"version": "3.0.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -1,6 +1,8 @@
|
||||
{
|
||||
"name": "机器台账表格视图",
|
||||
"tags": "",
|
||||
"name": "机器台账表格视图(使用 Categraf 作为采集器)",
|
||||
"tags": "Categraf",
|
||||
"ident": "",
|
||||
"uuid": 1717556327742611000,
|
||||
"configs": {
|
||||
"links": [
|
||||
{
|
||||
@@ -16,17 +18,7 @@
|
||||
],
|
||||
"panels": [
|
||||
{
|
||||
"custom": {
|
||||
"calc": "lastNotNull",
|
||||
"colorRange": [
|
||||
"thresholds"
|
||||
],
|
||||
"detailUrl": "/built-in-components/dashboard/detail?__uuid__=1717556327744505000&ident=${__field.labels.ident}",
|
||||
"textMode": "valueAndName",
|
||||
"valueField": "Value"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"type": "hexbin",
|
||||
"id": "21b8b3ab-26aa-47cb-b814-f310f2d143aa",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
@@ -36,18 +28,43 @@
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"maxPerRow": 4,
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(100, cpu_usage_active{cpu=\"cpu-total\", ident=~\"$ident\"})",
|
||||
"instant": true,
|
||||
"legend": "{{ident}}",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "CPU利用率",
|
||||
"maxPerRow": 4,
|
||||
"custom": {
|
||||
"textMode": "valueAndName",
|
||||
"calc": "lastNotNull",
|
||||
"valueField": "Value",
|
||||
"colorRange": [
|
||||
"thresholds"
|
||||
],
|
||||
"detailUrl": "/components/dashboard/detail?__uuid__=1737103014612000&ident=${__field.labels.ident}"
|
||||
},
|
||||
"options": {
|
||||
"standardOptions": {
|
||||
"util": "percent"
|
||||
},
|
||||
"thresholds": {
|
||||
"steps": [
|
||||
{
|
||||
"color": "#ef3c3c",
|
||||
"type": "",
|
||||
"value": 95
|
||||
"value": 95,
|
||||
"type": ""
|
||||
},
|
||||
{
|
||||
"color": "#ff656b",
|
||||
@@ -65,38 +82,15 @@
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
},
|
||||
"standardOptions": {
|
||||
"util": "percent",
|
||||
"decimals": 2
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "cpu_usage_active{cpu=\"cpu-total\", ident=~\"$ident\"}",
|
||||
"instant": true,
|
||||
"legend": "{{ident}}",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"type": "hexbin",
|
||||
"version": "3.0.0"
|
||||
}
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"calc": "lastNotNull",
|
||||
"colorRange": [
|
||||
"thresholds"
|
||||
],
|
||||
"detailUrl": "/built-in-components/dashboard/detail?__uuid__=1717556327744505000&ident=${__field.labels.ident}",
|
||||
"textMode": "valueAndName",
|
||||
"valueField": "Value"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"type": "hexbin",
|
||||
"id": "86d4a502-21f7-4981-9b38-ed8e696b6f49",
|
||||
"layout": {
|
||||
"h": 5,
|
||||
@@ -106,18 +100,43 @@
|
||||
"x": 12,
|
||||
"y": 0
|
||||
},
|
||||
"maxPerRow": 4,
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(100, mem_used_percent{ident=~\"$ident\"})",
|
||||
"instant": true,
|
||||
"legend": "{{ident}}",
|
||||
"maxDataPoints": 480,
|
||||
"refId": "A",
|
||||
"step": 15
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"name": "内存利用率",
|
||||
"maxPerRow": 4,
|
||||
"custom": {
|
||||
"textMode": "valueAndName",
|
||||
"calc": "lastNotNull",
|
||||
"valueField": "Value",
|
||||
"colorRange": [
|
||||
"thresholds"
|
||||
],
|
||||
"detailUrl": "/components/dashboard/detail?__uuid__=1737103014612000&ident=${__field.labels.ident}"
|
||||
},
|
||||
"options": {
|
||||
"standardOptions": {
|
||||
"util": "percent"
|
||||
},
|
||||
"thresholds": {
|
||||
"steps": [
|
||||
{
|
||||
"color": "#ef3c3c",
|
||||
"type": "",
|
||||
"value": 95
|
||||
"value": 95,
|
||||
"type": ""
|
||||
},
|
||||
{
|
||||
"color": "#ff656b",
|
||||
@@ -135,48 +154,15 @@
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
},
|
||||
"standardOptions": {
|
||||
"util": "percent",
|
||||
"decimals": 2
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "mem_used_percent{ident=~\"$ident\"}",
|
||||
"instant": true,
|
||||
"legend": "{{ident}}",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {}
|
||||
}
|
||||
],
|
||||
"type": "hexbin",
|
||||
"version": "3.0.0"
|
||||
}
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"aggrDimension": "ident",
|
||||
"calc": "lastNotNull",
|
||||
"colorMode": "background",
|
||||
"displayMode": "labelValuesToRows",
|
||||
"linkMode": "appendLinkColumn",
|
||||
"links": [
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "详情",
|
||||
"url": "/built-in-components/dashboard/detail?__uuid__=1717556327744505000&ident=${__field.labels.ident}"
|
||||
}
|
||||
],
|
||||
"nowrap": false,
|
||||
"showHeader": true,
|
||||
"sortColumn": "ident",
|
||||
"sortOrder": "ascend",
|
||||
"tableLayout": "fixed"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"type": "table",
|
||||
"id": "77bf513a-8504-4d33-9efe-75aaf9abc9e4",
|
||||
"layout": {
|
||||
"h": 11,
|
||||
@@ -186,10 +172,71 @@
|
||||
"x": 0,
|
||||
"y": 5
|
||||
},
|
||||
"maxPerRow": 4,
|
||||
"version": "3.1.0",
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"targets": [
|
||||
{
|
||||
"expr": "avg(cpu_usage_active{cpu=\"cpu-total\", ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "CPU使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "avg(mem_used_percent{ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "内存使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "avg(mem_total{ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "总内存",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "avg(disk_used_percent{ident=~\"$ident\",path=\"/\"}) by (ident)",
|
||||
"legend": "根分区使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "D"
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {
|
||||
"renameByName": {
|
||||
"ident": "机器"
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"name": "机器列表",
|
||||
"maxPerRow": 4,
|
||||
"custom": {
|
||||
"showHeader": true,
|
||||
"colorMode": "background",
|
||||
"nowrap": false,
|
||||
"tableLayout": "fixed",
|
||||
"calc": "lastNotNull",
|
||||
"displayMode": "labelValuesToRows",
|
||||
"aggrDimension": "ident",
|
||||
"sortColumn": "ident",
|
||||
"sortOrder": "ascend",
|
||||
"pageLimit": 500,
|
||||
"linkMode": "appendLinkColumn",
|
||||
"links": [
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "详情",
|
||||
"url": "/components/dashboard/detail?__uuid__=1737103014612000&ident=${__field.labels.ident}"
|
||||
}
|
||||
]
|
||||
},
|
||||
"options": {
|
||||
"standardOptions": {}
|
||||
"standardOptions": {
|
||||
"decimals": 2
|
||||
}
|
||||
},
|
||||
"overrides": [
|
||||
{
|
||||
@@ -199,7 +246,8 @@
|
||||
},
|
||||
"properties": {
|
||||
"standardOptions": {
|
||||
"util": "percent"
|
||||
"util": "percent",
|
||||
"decimals": 2
|
||||
},
|
||||
"valueMappings": [
|
||||
{
|
||||
@@ -239,7 +287,8 @@
|
||||
},
|
||||
"properties": {
|
||||
"standardOptions": {
|
||||
"util": "percent"
|
||||
"util": "percent",
|
||||
"decimals": 2
|
||||
},
|
||||
"valueMappings": [
|
||||
{
|
||||
@@ -320,66 +369,32 @@
|
||||
},
|
||||
"type": "special"
|
||||
}
|
||||
],
|
||||
"targets": [
|
||||
{
|
||||
"expr": "avg(cpu_usage_active{cpu=\"cpu-total\", ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "CPU使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "avg(mem_used_percent{ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "内存使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"expr": "avg(mem_total{ident=~\"$ident\"}) by (ident)",
|
||||
"legend": "总内存",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"expr": "avg(disk_used_percent{ident=~\"$ident\",path=\"/\"}) by (ident)",
|
||||
"legend": "根分区使用率",
|
||||
"maxDataPoints": 240,
|
||||
"refId": "D"
|
||||
}
|
||||
],
|
||||
"transformations": [
|
||||
{
|
||||
"id": "organize",
|
||||
"options": {
|
||||
"renameByName": {
|
||||
"ident": "机器"
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"type": "table",
|
||||
"version": "3.0.0"
|
||||
]
|
||||
}
|
||||
],
|
||||
"var": [
|
||||
{
|
||||
"definition": "prometheus",
|
||||
"name": "prom",
|
||||
"type": "datasource"
|
||||
"label": "数据源",
|
||||
"type": "datasource",
|
||||
"hide": false,
|
||||
"definition": "prometheus"
|
||||
},
|
||||
{
|
||||
"name": "ident",
|
||||
"label": "机器",
|
||||
"type": "query",
|
||||
"hide": false,
|
||||
"multi": true,
|
||||
"allOption": true,
|
||||
"allValue": ".*",
|
||||
"datasource": {
|
||||
"cate": "prometheus",
|
||||
"value": "${prom}"
|
||||
},
|
||||
"definition": "label_values(system_load1,ident)",
|
||||
"multi": true,
|
||||
"name": "ident",
|
||||
"type": "query"
|
||||
"definition": "label_values(system_load1,ident)"
|
||||
}
|
||||
],
|
||||
"version": "3.0.0"
|
||||
},
|
||||
"uuid": 1717556327742611000
|
||||
}
|
||||
}
|
||||
@@ -1,18 +1,12 @@
|
||||
{
|
||||
"id": 0,
|
||||
"group_id": 0,
|
||||
"name": "Processes by UlricQin",
|
||||
"name": "机器进程数量统计(使用 Categraf 作为采集器)",
|
||||
"tags": "Categraf",
|
||||
"ident": "",
|
||||
"tags": "Categraf Linux OS",
|
||||
"create_at": 0,
|
||||
"create_by": "",
|
||||
"update_at": 0,
|
||||
"update_by": "",
|
||||
"uuid": 1717556327738575000,
|
||||
"configs": {
|
||||
"panels": [
|
||||
{
|
||||
"custom": {
|
||||
"baseColor": "#9470FF",
|
||||
"calc": "lastNotNull",
|
||||
"serieWidth": 20,
|
||||
"sortOrder": "desc"
|
||||
@@ -41,7 +35,17 @@
|
||||
},
|
||||
"type": "range"
|
||||
}
|
||||
]
|
||||
],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "#9470FF",
|
||||
"type": "base",
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
@@ -62,7 +66,6 @@
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"baseColor": "#9470FF",
|
||||
"calc": "lastNotNull",
|
||||
"serieWidth": 20,
|
||||
"sortOrder": "desc"
|
||||
@@ -91,7 +94,17 @@
|
||||
},
|
||||
"type": "range"
|
||||
}
|
||||
]
|
||||
],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "#9470FF",
|
||||
"type": "base",
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
@@ -112,7 +125,6 @@
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"baseColor": "#9470FF",
|
||||
"calc": "lastNotNull",
|
||||
"serieWidth": 20,
|
||||
"sortOrder": "desc"
|
||||
@@ -150,7 +162,17 @@
|
||||
},
|
||||
"type": "range"
|
||||
}
|
||||
]
|
||||
],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "#9470FF",
|
||||
"type": "base",
|
||||
"value": null
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
@@ -216,30 +238,26 @@
|
||||
],
|
||||
"var": [
|
||||
{
|
||||
"definition": "prometheus",
|
||||
"label": "",
|
||||
"name": "Datasource",
|
||||
"type": "datasource"
|
||||
"label": "数据源",
|
||||
"type": "datasource",
|
||||
"hide": false,
|
||||
"definition": "prometheus"
|
||||
},
|
||||
{
|
||||
"allOption": true,
|
||||
"name": "ident",
|
||||
"label": "机器",
|
||||
"type": "query",
|
||||
"hide": false,
|
||||
"datasource": {
|
||||
"cate": "prometheus",
|
||||
"value": "${Datasource}"
|
||||
},
|
||||
"definition": "label_values(processes_running, ident)",
|
||||
"label": "Host",
|
||||
"multi": true,
|
||||
"name": "ident",
|
||||
"type": "query"
|
||||
"allOption": true
|
||||
}
|
||||
],
|
||||
"version": "3.0.0"
|
||||
},
|
||||
"public": 0,
|
||||
"public_cate": 0,
|
||||
"bgids": null,
|
||||
"built_in": 0,
|
||||
"hide": 0,
|
||||
"uuid": 1717556327738575000
|
||||
}
|
||||
}
|
||||
3322
integrations/Linux/dashboards/exporter-detail.json
Normal file
File diff suppressed because it is too large
Load Diff
@@ -1,267 +0,0 @@
|
||||
{
|
||||
"id": 0,
|
||||
"group_id": 0,
|
||||
"name": "Linux Host by Categraf Overview",
|
||||
"ident": "",
|
||||
"tags": "",
|
||||
"create_at": 0,
|
||||
"create_by": "",
|
||||
"update_at": 0,
|
||||
"update_by": "",
|
||||
"configs": {
|
||||
"links": [
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "n9e",
|
||||
"url": "https://n9e.github.io/"
|
||||
},
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "author",
|
||||
"url": "http://flashcat.cloud/"
|
||||
}
|
||||
],
|
||||
"panels": [
|
||||
{
|
||||
"collapsed": true,
|
||||
"id": "e5d14dd7-4417-42bd-b7ba-560f34d299a2",
|
||||
"layout": {
|
||||
"h": 1,
|
||||
"i": "e5d14dd7-4417-42bd-b7ba-560f34d299a2",
|
||||
"isResizable": false,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"name": "整体概况",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"calc": "lastNotNull",
|
||||
"colSpan": 1,
|
||||
"colorMode": "value",
|
||||
"textMode": "value",
|
||||
"textSize": {
|
||||
"value": 50
|
||||
}
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "41f37540-e695-492a-9d2f-24bfd2d36805",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "41f37540-e695-492a-9d2f-24bfd2d36805",
|
||||
"isResizable": true,
|
||||
"w": 3,
|
||||
"x": 0,
|
||||
"y": 1
|
||||
},
|
||||
"name": "监控机器数",
|
||||
"options": {
|
||||
"standardOptions": {}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(system_load1)",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "stat",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"baseColor": "#cd75eb",
|
||||
"calc": "lastNotNull",
|
||||
"serieWidth": 20,
|
||||
"sortOrder": "desc"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "585bfc50-7c92-42b1-88ee-5b725b640418",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "585bfc50-7c92-42b1-88ee-5b725b640418",
|
||||
"isResizable": true,
|
||||
"w": 9,
|
||||
"x": 3,
|
||||
"y": 1
|
||||
},
|
||||
"name": "内存使用率 top10",
|
||||
"options": {
|
||||
"standardOptions": {},
|
||||
"valueMappings": [
|
||||
{
|
||||
"match": {
|
||||
"from": 60
|
||||
},
|
||||
"result": {
|
||||
"color": "#f8070e"
|
||||
},
|
||||
"type": "range"
|
||||
}
|
||||
]
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (mem_used_percent))",
|
||||
"legend": "{{ident}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "barGauge",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "60b1e833-3f03-45bb-9385-a3825904a0ac",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "60b1e833-3f03-45bb-9385-a3825904a0ac",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 1
|
||||
},
|
||||
"name": "cpu使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (100-cpu_usage_idle{cpu=\"cpu-total\"}))",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"baseColor": "#9470ff",
|
||||
"calc": "lastNotNull",
|
||||
"serieWidth": 20,
|
||||
"sortOrder": "desc"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "69351db9-e646-4e5d-925a-cba29823b00d",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "69351db9-e646-4e5d-925a-cba29823b00d",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 4
|
||||
},
|
||||
"name": "磁盘分区使用率 top10",
|
||||
"options": {
|
||||
"standardOptions": {},
|
||||
"valueMappings": [
|
||||
{
|
||||
"match": {
|
||||
"from": 85
|
||||
},
|
||||
"result": {
|
||||
"color": "#f00404"
|
||||
},
|
||||
"type": "range"
|
||||
}
|
||||
]
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (disk_used_percent{path!~\"/var.*\"}))",
|
||||
"legend": "{{ident}} {{path}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "barGauge",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "e3675ed9-6d3b-4a41-8d16-d6e82037dce3",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "e3675ed9-6d3b-4a41-8d16-d6e82037dce3",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 4
|
||||
},
|
||||
"name": "设备io util top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (rate(diskio_io_time[1m])/10))",
|
||||
"legend": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
}
|
||||
],
|
||||
"var": [
|
||||
{
|
||||
"definition": "prometheus",
|
||||
"name": "prom",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"cate": "prometheus",
|
||||
"value": "${prom}"
|
||||
},
|
||||
"definition": "label_values(system_load1,ident)",
|
||||
"name": "ident",
|
||||
"type": "query"
|
||||
}
|
||||
],
|
||||
"version": "3.0.0"
|
||||
},
|
||||
"public": 0,
|
||||
"public_cate": 0,
|
||||
"bgids": null,
|
||||
"built_in": 0,
|
||||
"hide": 0,
|
||||
"uuid": 1717556327746983000
|
||||
}
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,269 +0,0 @@
|
||||
{
|
||||
"id": 0,
|
||||
"group_id": 0,
|
||||
"name": "HOST by Node Exporter Overview",
|
||||
"ident": "",
|
||||
"tags": "Prometheus Host",
|
||||
"create_at": 0,
|
||||
"create_by": "",
|
||||
"update_at": 0,
|
||||
"update_by": "",
|
||||
"configs": {
|
||||
"links": [
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "n9e",
|
||||
"url": "https://n9e.gitee.io/"
|
||||
},
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "author",
|
||||
"url": "http://flashcat.cloud/"
|
||||
}
|
||||
],
|
||||
"panels": [
|
||||
{
|
||||
"collapsed": true,
|
||||
"id": "3173366d-01a2-420e-8878-75124b0051b6",
|
||||
"layout": {
|
||||
"h": 1,
|
||||
"i": "3173366d-01a2-420e-8878-75124b0051b6",
|
||||
"isResizable": false,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"name": "整体概况",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"calc": "lastNotNull",
|
||||
"colSpan": 1,
|
||||
"colorMode": "value",
|
||||
"textMode": "value",
|
||||
"textSize": {
|
||||
"value": 40
|
||||
}
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "9a5e3292-b346-4ccf-a793-b83a2f8ac8c5",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "9a5e3292-b346-4ccf-a793-b83a2f8ac8c5",
|
||||
"isResizable": true,
|
||||
"w": 3,
|
||||
"x": 0,
|
||||
"y": 1
|
||||
},
|
||||
"name": "监控机器数",
|
||||
"options": {
|
||||
"standardOptions": {}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(node_boot_time_seconds)",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "stat",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"description": "",
|
||||
"id": "e1925fc8-cb05-467b-ba82-bb5cb6be7595",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "e1925fc8-cb05-467b-ba82-bb5cb6be7595",
|
||||
"isResizable": true,
|
||||
"w": 9,
|
||||
"x": 3,
|
||||
"y": 1
|
||||
},
|
||||
"links": [],
|
||||
"name": "cpu使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10,100-(avg by (mode, instance)(rate(node_cpu_seconds_total{mode=\"idle\"}[1m])))*100)",
|
||||
"legend": "{{instance}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "327b7e4b-6ec1-47e1-8840-d31cf4b5532b",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "327b7e4b-6ec1-47e1-8840-d31cf4b5532b",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 1
|
||||
},
|
||||
"name": "内存使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10,(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100)",
|
||||
"legend": "{{instance}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "5a9d4a65-3f73-42cc-859e-fc0b82791b59",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "5a9d4a65-3f73-42cc-859e-fc0b82791b59",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 4
|
||||
},
|
||||
"name": "磁盘分区使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10,(node_filesystem_avail_bytes{device!~'rootfs', device!~\"tmpfs\",mountpoint!~\"/var/lib.*\"} * 100) / node_filesystem_size_bytes{device!~'rootfs', device!~\"tmpfs\",mountpoint!~\"/var/lib.*\"})",
|
||||
"legend": "{{instance}}-{{mountpoint}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "fa764e4b-5ca9-45d8-b12e-604f8743f9d9",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "fa764e4b-5ca9-45d8-b12e-604f8743f9d9",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 4
|
||||
},
|
||||
"name": "设备io util top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10,rate(node_disk_io_time_seconds_total[5m]) * 100)",
|
||||
"legend": "{{instance}}-{{device}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
}
|
||||
],
|
||||
"var": [
|
||||
{
|
||||
"definition": "prometheus",
|
||||
"name": "prom",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"cate": "prometheus",
|
||||
"value": "${prom}"
|
||||
},
|
||||
"definition": "label_values(node_uname_info, instance)",
|
||||
"name": "node",
|
||||
"selected": "$node",
|
||||
"type": "query"
|
||||
}
|
||||
],
|
||||
"version": "3.0.0"
|
||||
},
|
||||
"public": 0,
|
||||
"public_cate": 0,
|
||||
"bgids": null,
|
||||
"built_in": 0,
|
||||
"hide": 0,
|
||||
"uuid": 1717556327752931000
|
||||
}
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,264 +0,0 @@
|
||||
{
|
||||
"id": 0,
|
||||
"group_id": 0,
|
||||
"name": "HOST by Telegraf Overview",
|
||||
"ident": "",
|
||||
"tags": "",
|
||||
"create_at": 0,
|
||||
"create_by": "",
|
||||
"update_at": 0,
|
||||
"update_by": "",
|
||||
"configs": {
|
||||
"links": [
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "n9e",
|
||||
"url": "https://n9e.gitee.io/"
|
||||
},
|
||||
{
|
||||
"targetBlank": true,
|
||||
"title": "author",
|
||||
"url": "http://flashcat.cloud/"
|
||||
}
|
||||
],
|
||||
"panels": [
|
||||
{
|
||||
"collapsed": true,
|
||||
"id": "0f6a1394-7cf9-4958-bcfe-2fbb59e77c12",
|
||||
"layout": {
|
||||
"h": 1,
|
||||
"i": "0f6a1394-7cf9-4958-bcfe-2fbb59e77c12",
|
||||
"isResizable": false,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"name": "整体概况",
|
||||
"type": "row"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"calc": "lastNotNull",
|
||||
"colSpan": 1,
|
||||
"colorMode": "value",
|
||||
"textMode": "value",
|
||||
"textSize": {
|
||||
"value": 50
|
||||
}
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "877b6db5-e82c-499a-9ebc-8ad72c2891a8",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "877b6db5-e82c-499a-9ebc-8ad72c2891a8",
|
||||
"isResizable": true,
|
||||
"w": 3,
|
||||
"x": 0,
|
||||
"y": 1
|
||||
},
|
||||
"name": "监控机器数",
|
||||
"options": {
|
||||
"standardOptions": {}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "count(system_load1)",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "stat",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "29a3e6ae-d278-49b3-972b-f12a6c7c091c",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "29a3e6ae-d278-49b3-972b-f12a6c7c091c",
|
||||
"isResizable": true,
|
||||
"w": 9,
|
||||
"x": 3,
|
||||
"y": 1
|
||||
},
|
||||
"name": "内存率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, mem_used_percent)",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "9f2a24d5-d19f-4651-b76d-add6b9011821",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "9f2a24d5-d19f-4651-b76d-add6b9011821",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 1
|
||||
},
|
||||
"name": "cpu使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (100-cpu_usage_idle{cpu=\"cpu-total\"}))",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "dcd60296-db84-4562-99f3-2829c2f064a4",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "dcd60296-db84-4562-99f3-2829c2f064a4",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 4
|
||||
},
|
||||
"name": "磁盘分区使用率 top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (disk_used_percent{path!~\"/var.*\"}))",
|
||||
"legend": "{{ident}}-{{path}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
},
|
||||
{
|
||||
"custom": {
|
||||
"drawStyle": "lines",
|
||||
"fillOpacity": 0.3,
|
||||
"gradientMode": "opacity",
|
||||
"lineInterpolation": "smooth",
|
||||
"lineWidth": 2,
|
||||
"stack": "off"
|
||||
},
|
||||
"datasourceCate": "prometheus",
|
||||
"datasourceValue": "${prom}",
|
||||
"id": "ef7df29d-7dce-4788-ae42-d21d842c67d6",
|
||||
"layout": {
|
||||
"h": 3,
|
||||
"i": "ef7df29d-7dce-4788-ae42-d21d842c67d6",
|
||||
"isResizable": true,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 4
|
||||
},
|
||||
"name": "设备io util top10",
|
||||
"options": {
|
||||
"legend": {
|
||||
"displayMode": "hidden"
|
||||
},
|
||||
"standardOptions": {},
|
||||
"thresholds": {},
|
||||
"tooltip": {
|
||||
"mode": "all",
|
||||
"sort": "desc"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"expr": "topk(10, (rate(diskio_io_time[1m])/10))",
|
||||
"legend": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"type": "timeseries",
|
||||
"version": "2.0.0"
|
||||
}
|
||||
],
|
||||
"var": [
|
||||
{
|
||||
"definition": "prometheus",
|
||||
"name": "prom",
|
||||
"type": "datasource"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"cate": "prometheus",
|
||||
"value": "${prom}"
|
||||
},
|
||||
"definition": "label_values(system_load1,ident)",
|
||||
"name": "ident",
|
||||
"type": "query"
|
||||
}
|
||||
],
|
||||
"version": "3.0.0"
|
||||
},
|
||||
"public": 0,
|
||||
"public_cate": 0,
|
||||
"bgids": null,
|
||||
"built_in": 0,
|
||||
"hide": 0,
|
||||
"uuid": 1717556327757522000
|
||||
}
|
||||
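@@ -53,4 +53,9 @@ nr_alloc_batch = 0
 
 ## arp_package
 
-Counts ARP packets. This plugin depends on cgo; to use it, download the `with-cgo` release package of categraf.
+Counts ARP packets. This plugin depends on cgo; to use it, download the `with-cgo` release package of categraf.
+
+
+## ntp
+
+Monitors the machine's clock offset. Given only the NTP server addresses, Categraf periodically queries them and compares the result with the local time to obtain the offset. The metric is ntp_offset_ms; as the name suggests, the unit is milliseconds, and the value should generally not exceed 1000.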
File diff suppressed because it is too large
Load Diff
1877
integrations/RabbitMQ/dashboards/rabbitmq_CN_v3.8_gt.json
Normal file
File diff suppressed because it is too large
Load Diff
@@ -78,7 +78,7 @@ func (c *ConfigCache) syncConfigs() error {
 	decryptMap, decryptErr := models.ConfigUserVariableGetDecryptMap(c.ctx, c.privateKey, c.passWord)
 	if decryptErr != nil {
 		dumper.PutSyncRecord("user_variables", start.Unix(), -1, -1, "failed to query records: "+decryptErr.Error())
-		return errors.WithMessage(err, "failed to call ConfigUserVariableGetDecryptMap")
+		return errors.WithMessage(decryptErr, "failed to call ConfigUserVariableGetDecryptMap")
 	}
 
 	c.Set(decryptMap, stat.Total, stat.LastUpdated)
163
memsto/event_processor_cache.go
Normal file
@@ -0,0 +1,163 @@
+package memsto
+
+import (
+	"fmt"
+	"sync"
+	"time"
+
+	"github.com/ccfos/nightingale/v6/dumper"
+	"github.com/ccfos/nightingale/v6/models"
+	"github.com/ccfos/nightingale/v6/pkg/ctx"
+
+	"github.com/pkg/errors"
+	"github.com/toolkits/pkg/logger"
+)
+
+type EventProcessorCacheType struct {
+	statTotal       int64
+	statLastUpdated int64
+	ctx             *ctx.Context
+	stats           *Stats
+
+	sync.RWMutex
+	eventPipelines map[int64]*models.EventPipeline // key: pipeline id
+}
+
+func NewEventProcessorCache(ctx *ctx.Context, stats *Stats) *EventProcessorCacheType {
+	epc := &EventProcessorCacheType{
+		statTotal:       -1,
+		statLastUpdated: -1,
+		ctx:             ctx,
+		stats:           stats,
+		eventPipelines:  make(map[int64]*models.EventPipeline),
+	}
+	epc.SyncEventProcessors()
+	return epc
+}
+
+func (epc *EventProcessorCacheType) Reset() {
+	epc.Lock()
+	defer epc.Unlock()
+
+	epc.statTotal = -1
+	epc.statLastUpdated = -1
+	epc.eventPipelines = make(map[int64]*models.EventPipeline)
+}
+
+func (epc *EventProcessorCacheType) StatChanged(total, lastUpdated int64) bool {
+	if epc.statTotal == total && epc.statLastUpdated == lastUpdated {
+		return false
+	}
+
+	return true
+}
+
+func (epc *EventProcessorCacheType) Set(m map[int64]*models.EventPipeline, total, lastUpdated int64) {
+	epc.Lock()
+	epc.eventPipelines = m
+	epc.Unlock()
+
+	// only one goroutine used, so no need lock
+	epc.statTotal = total
+	epc.statLastUpdated = lastUpdated
+}
+
+func (epc *EventProcessorCacheType) Get(processorId int64) *models.EventPipeline {
+	epc.RLock()
+	defer epc.RUnlock()
+	return epc.eventPipelines[processorId]
+}
+
+func (epc *EventProcessorCacheType) GetProcessorsById(processorId int64) []models.Processor {
+	epc.RLock()
+	defer epc.RUnlock()
+
+	eventPipeline, ok := epc.eventPipelines[processorId]
+	if !ok {
+		return []models.Processor{}
+	}
+
+	return eventPipeline.Processors
+}
+
+func (epc *EventProcessorCacheType) GetProcessorIds() []int64 {
+	epc.RLock()
+	defer epc.RUnlock()
+
+	count := len(epc.eventPipelines)
+	list := make([]int64, 0, count)
+	for eid := range epc.eventPipelines {
+		list = append(list, eid)
+	}
+
+	return list
+}
+
+func (epc *EventProcessorCacheType) SyncEventProcessors() {
+	err := epc.syncEventProcessors()
+	if err != nil {
+		fmt.Println("failed to sync event processors:", err)
+		exit(1)
+	}
+
+	go epc.loopSyncEventProcessors()
+}
+
+func (epc *EventProcessorCacheType) loopSyncEventProcessors() {
+	duration := time.Duration(9000) * time.Millisecond
+	for {
+		time.Sleep(duration)
+		if err := epc.syncEventProcessors(); err != nil {
+			logger.Warning("failed to sync event processors:", err)
+		}
+	}
+}
+
+func (epc *EventProcessorCacheType) syncEventProcessors() error {
+	start := time.Now()
+
+	stat, err := models.EventPipelineStatistics(epc.ctx)
+	if err != nil {
+		dumper.PutSyncRecord("event_processors", start.Unix(), -1, -1, "failed to query statistics: "+err.Error())
+		return errors.WithMessage(err, "failed to exec StatisticsGet for EventPipeline")
+	}
+
+	if !epc.StatChanged(stat.Total, stat.LastUpdated) {
+		epc.stats.GaugeCronDuration.WithLabelValues("sync_event_processors").Set(0)
+		epc.stats.GaugeSyncNumber.WithLabelValues("sync_event_processors").Set(0)
+		dumper.PutSyncRecord("event_processors", start.Unix(), -1, -1, "not changed")
+		return nil
+	}
+
+	lst, err := models.ListEventPipelines(epc.ctx)
+	if err != nil {
+		dumper.PutSyncRecord("event_processors", start.Unix(), -1, -1, "failed to query records: "+err.Error())
+		return errors.WithMessage(err, "failed to exec ListEventPipelines")
+	}
+
+	m := make(map[int64]*models.EventPipeline)
+	for i := 0; i < len(lst); i++ {
+		eventPipeline := lst[i]
+		for _, p := range eventPipeline.ProcessorConfigs {
+			processor, err := models.GetProcessorByType(p.Typ, p.Config)
+			if err != nil {
+				logger.Warningf("event_pipeline_id: %d, event:%+v, processor:%+v type not found", eventPipeline.ID, eventPipeline, p)
+				continue
+			}
+
+			eventPipeline.Processors = append(eventPipeline.Processors, processor)
+		}
+
+		m[lst[i].ID] = eventPipeline
+	}
+
+	epc.Set(m, stat.Total, stat.LastUpdated)
+
+	ms := time.Since(start).Milliseconds()
+	epc.stats.GaugeCronDuration.WithLabelValues("sync_event_processors").Set(float64(ms))
+	epc.stats.GaugeSyncNumber.WithLabelValues("sync_event_processors").Set(float64(len(m)))
+	logger.Infof("timer: sync event processors done, cost: %dms, number: %d", ms, len(m))
+	dumper.PutSyncRecord("event_processors", start.Unix(), ms, len(m), "success")
+
+	return nil
+}
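The sync routine in this new file follows a check-then-reload pattern: it first reads a cheap statistic (row count and last-update time) and reloads the full list only when that statistic differs from the cached one. The sketch below reduces the idea to the standard library so it can be run in isolation. It is a simplified illustration under stated assumptions: `pipelineCache`, `stat`, `loadStat`, and `loadAll` are hypothetical stand-ins for `EventProcessorCacheType`, the statistics result, `models.EventPipelineStatistics`, and `models.ListEventPipelines`, and pipelines are represented as plain strings.

```go
package main

import (
	"fmt"
	"sync"
)

// stat is a hypothetical stand-in for the statistics row: a record count
// plus the timestamp of the most recent update.
type stat struct{ Total, LastUpdated int64 }

// pipelineCache holds the cached items together with the stat they were
// loaded under. The stat fields start at -1 so the first refresh always loads.
type pipelineCache struct {
	mu          sync.RWMutex
	items       map[int64]string
	total       int64
	lastUpdated int64
}

func newPipelineCache() *pipelineCache {
	return &pipelineCache{items: map[int64]string{}, total: -1, lastUpdated: -1}
}

// refresh reloads the full list only when the stat has changed and reports
// whether a reload happened.
func (c *pipelineCache) refresh(loadStat func() (stat, error), loadAll func() (map[int64]string, error)) (bool, error) {
	s, err := loadStat()
	if err != nil {
		return false, fmt.Errorf("failed to query statistics: %w", err)
	}
	if s.Total == c.total && s.LastUpdated == c.lastUpdated {
		return false, nil // unchanged: skip the expensive full query
	}
	m, err := loadAll()
	if err != nil {
		return false, fmt.Errorf("failed to query records: %w", err)
	}
	c.mu.Lock()
	c.items = m
	c.mu.Unlock()
	// As in the diff, the stat fields are written without the lock on the
	// assumption that only one goroutine ever calls refresh.
	c.total, c.lastUpdated = s.Total, s.LastUpdated
	return true, nil
}

// get returns one cached item under the read lock.
func (c *pipelineCache) get(id int64) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[id]
	return v, ok
}

func main() {
	c := newPipelineCache()
	loads := 0
	loadStat := func() (stat, error) { return stat{Total: 1, LastUpdated: 100}, nil }
	loadAll := func() (map[int64]string, error) {
		loads++
		return map[int64]string{1: "relabel"}, nil
	}
	for i := 0; i < 2; i++ {
		if _, err := c.refresh(loadStat, loadAll); err != nil {
			fmt.Println("refresh failed:", err)
		}
	}
	fmt.Println("full loads:", loads)
}
```

Swapping in a freshly built map under the write lock, rather than mutating the existing one, keeps the critical section short and means readers never see a partially populated cache.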
Some files were not shown because too many files have changed in this diff.