Compare commits

...

1136 Commits

Author SHA1 Message Date
kongfei
ab83ded038 build arm64 artifacts 2022-07-07 14:15:30 +08:00
kongfei
51f82dc476 auto release with github action 2022-07-07 13:52:20 +08:00
Yening Qin
315e0ef903 fix: get clusters by api (#1030) 2022-07-07 12:29:35 +08:00
Ulric Qin
98d5dfff8e add namespace and subsystem prefix for metrics 2022-07-07 12:23:06 +08:00
Ulric Qin
6b4705608b add forward stat 2022-07-07 12:13:45 +08:00
Ulric Qin
5907817cba n9e-server: add http request stat 2022-07-07 10:52:04 +08:00
Ulric Qin
aa97ac54d1 register GaugeSampleQueueSize 2022-07-07 10:17:15 +08:00
Ulric Qin
8fe548aba9 rename mapkey alertname to rulename 2022-07-07 10:06:34 +08:00
Tripitakav
18a9288b75 fix mute bug (#1025)
Co-authored-by: tripitakav <chengzhi.shang@longbridge.sg>
2022-07-07 10:05:39 +08:00
ulricqin
fe82886f09 report sample queue size (#1027)
* report sample queue size

* report sample channel size
2022-07-07 10:00:08 +08:00
xtan
32e6993eea fix: fix event api for service (#1026)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-07 09:58:05 +08:00
ning
56b61909a3 fix: event service api 2022-07-07 09:44:26 +08:00
ulricqin
2ef541cdd7 refactor recording rule and and field disabled (#1022) 2022-07-06 17:21:14 +08:00
laiwei
6b1d283cda Merge pull request #1019 from ccfos/community-guide
add community guide and governance docs (draft)
2022-07-06 17:08:14 +08:00
laiwei
c8e5566c81 add stargazers chart 2022-07-06 16:25:54 +08:00
laiwei
7f3d9df089 update 2022-07-06 16:20:58 +08:00
laiwei
99aa4dbca8 update format 2022-07-06 16:12:59 +08:00
Ulric Qin
c193b8abd4 remove drop table sql 2022-07-06 16:08:24 +08:00
laiwei
4efdc4f169 update format 2022-07-06 16:06:13 +08:00
Tripitakav
1304a4630b Add recording rule (#1015)
* add prometheus recording rules

* fix recording rule sql

* add record rule note

* fix copy error

* add some regx

Co-authored-by: 尚承志 <chengzhi.shang@longbridge.sg>
2022-07-06 15:58:08 +08:00
laiwei
2cc3f939a7 Merge branch 'main' into community-guide 2022-07-06 15:44:11 +08:00
laiwei
d0260e564c Merge remote-tracking branch 'origin/main' into community-guide 2022-07-06 15:40:52 +08:00
laiwei
2a24179423 update readme to add community governance 2022-07-06 15:40:25 +08:00
laiwei
34082b44f1 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-06 13:12:04 +08:00
UlricQin
bfe340d24d upgrade 5.9.4 2022-07-05 16:54:15 +08:00
xtan
a9288e376d feat: persist notify cur number (#1013)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-05 16:42:20 +08:00
laiwei
cb9a03d010 update guide 2022-07-05 00:43:36 +08:00
laiwei
e62366b755 community guide 2022-07-05 00:37:01 +08:00
Ulric Qin
2a2a96d9fc add contains funcmap 2022-07-04 20:03:11 +08:00
ysyneu
64a671ae13 update kafka alerts and dashboard (#1012)
* update kafka alerts and dashboard

* update kafka dashboard

Co-authored-by: yushuangyu <yushuangyu@flashcat.cloud>
2022-07-04 19:56:51 +08:00
Ulric Qin
45945876d8 update README 2022-07-03 08:57:13 +08:00
Henry Chia
90dacd0085 fix typo (#1004)
* 修改拼写错误

修改拼写错误
exsits -> exists

* Update router_login.go
2022-06-29 19:08:58 +08:00
ning
540ef68dc8 fix: alert mute add by service 2022-06-29 11:11:12 +08:00
zheng
54cc981956 fix ForDuration (#999) 2022-06-28 16:13:23 +08:00
xtan
2e8ea354d7 refactor: now categraf v0.1.6 is fine (#993)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
Co-authored-by: ulricqin <ulric.qin@gmail.com>
2022-06-28 11:01:23 +08:00
Ulric Qin
217f52294e update categraf image version to v0.1.9 2022-06-27 15:06:14 +08:00
Ulric Qin
f9b2675077 upgrade 5.9.3 2022-06-27 15:01:34 +08:00
Ulric Qin
95fd2d99b2 code refactor 2022-06-27 12:28:40 +08:00
Ulric Qin
dba2b23e9e code refactor 2022-06-27 12:25:26 +08:00
Ulric Qin
acbc199143 code refactor 2022-06-27 12:23:25 +08:00
Ulric Qin
2449c8715e update doc 2022-06-27 12:18:42 +08:00
Ulric Qin
c62593c0eb update README 2022-06-23 10:37:47 +08:00
Ulric Qin
00cbc9342f modify vx-qrcode.png 2022-06-22 16:56:08 +08:00
Ulric Qin
df5a3a37f2 refactor docker-compose.yaml for categraf 2022-06-22 10:41:59 +08:00
xtan
5ec14c588b Feat:update docker-compose from telegraf to categraf (#992)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-22 10:25:57 +08:00
Ulric Qin
d78a3a638a update issue template 2022-06-20 15:20:50 +08:00
Ulric Qin
19d2cbfa27 add issue_template 2022-06-20 15:17:10 +08:00
chenxuan
f9af916352 fix alert put api not verify bug (#987) 2022-06-20 11:50:14 +08:00
xtan
90db12b513 Fix:fix target_up nodata judge for prometheus scrape (#986) 2022-06-17 22:44:25 +08:00
Ulric Qin
7d326ef306 use metrics as hash key 2022-06-17 09:56:10 +08:00
Ulric Qin
d0b005fb14 code refactor: set createBy when update metric_view 2022-06-16 13:17:58 +08:00
Ulric Qin
118060cf77 UPDATE README 2022-06-15 15:38:11 +08:00
Ulric Qin
d2ef68daac upgrade 5.9.2 2022-06-15 14:40:43 +08:00
Ulric Qin
8393b93c53 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-15 14:01:22 +08:00
Ulric Qin
63adcc2cd9 bugfix for alert-aggr-views 2022-06-15 14:01:01 +08:00
xtan
60c842c704 fix: NotifyMaxNumber for postgres db (#978)
* fix: NotifyMaxNumber for postgres db

* fix: sql for pg

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-13 10:56:14 +08:00
Ulric Qin
f6fd6aed7f add some categraf alerts.json 2022-06-11 17:54:52 +08:00
Ulric Qin
cb92368e5b add categraf dashboard 2022-06-11 17:40:43 +08:00
Ulric Qin
8cd97db362 add some categraf dashboard 2022-06-11 17:38:10 +08:00
Ulric Qin
94e1359895 fix handler: NotifyMaxNumber 2022-06-10 17:49:48 +08:00
Ulric Qin
1bcc5b77ec remote write and read: support header 2022-06-10 17:37:33 +08:00
Ulric Qin
ae622e0c08 fix 2022-06-10 16:36:47 +08:00
Ulric Qin
c951f7d822 support max notify number 2022-06-10 16:26:53 +08:00
Ulric Qin
6a366acc74 modify log level 2022-06-10 15:39:45 +08:00
Ulric Qin
a5f7d5e9cf modify log level 2022-06-10 15:15:13 +08:00
Ulric Qin
ea2249c30c forward samples in sequence 2022-06-10 14:20:18 +08:00
Ulric Qin
a8c60c9f2b alert_aggr_view support modify by admin 2022-06-10 13:55:26 +08:00
xtan
0581e02cf3 Feat:add common template functions (#976)
* Feat:增加常用模板函数

* Feat:修改增加模板函数的实现方式

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-08 20:40:45 +08:00
Ulric Qin
efec811b91 update README 2022-06-08 13:40:06 +08:00
Ulric Qin
f85209c817 add gif in README 2022-06-08 11:19:57 +08:00
Ulric Qin
7fda5a9a4b update README 2022-06-08 11:17:03 +08:00
Ulric Qin
ab689fc0db update README 2022-06-08 11:12:51 +08:00
Ulric Qin
cdcdbe8f70 update README 2022-06-08 11:11:15 +08:00
Ulric Qin
46ca8a409a update README 2022-06-08 10:53:58 +08:00
Ulric Qin
e5c1641b6b code refactor: move struct ReaderOptions to config 2022-06-07 18:00:11 +08:00
Ulric Qin
3e475e7e08 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-07 17:56:49 +08:00
Ulric Qin
3899144f8f add header for writer post 2022-06-07 17:56:23 +08:00
ning
b8cb9e7734 fix: linux_by_telegraf dashboard 2022-06-07 14:17:17 +08:00
xtan
c62b9edf87 fix:pg数据库脚本同步 (#974)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-06 11:48:57 +08:00
ning
0e5aea40e8 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-02 11:07:40 +08:00
ning
1dbfcd3dc8 refactor: service api 2022-06-02 11:07:31 +08:00
xtan
a4ef5fca46 pg数据库初始化脚本字段同步 (#969)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-01 19:57:17 +08:00
Ulric Qin
7cf309345f Merge branch 'main' of github.com:ccfos/nightingale 2022-06-01 12:58:36 +08:00
Ulric Qin
495632a064 fix alert rule delete by service 2022-06-01 12:58:09 +08:00
xtan
f6591e80ea Feat:提供基于Postgres的数据库初始化脚本 (#967)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-05-31 18:06:42 +08:00
Ulric Qin
ab5e8c366e code refactor 2022-05-31 14:44:57 +08:00
Ulric Qin
ce35e23a0f modify alert rule verify 2022-05-31 13:08:09 +08:00
Ulric Qin
ece263ea45 update readme 2022-05-30 09:44:19 +08:00
Ulric Qin
f777318cc7 update README 2022-05-30 09:13:44 +08:00
Ulric Qin
c29f3ecdeb update README 2022-05-30 09:12:16 +08:00
Ulric Qin
8f8740ad94 update README 2022-05-30 09:10:58 +08:00
Ulric Qin
9acabba761 use go1.18 and tidy go.mod 2022-05-30 09:08:54 +08:00
Ulric Qin
c3adcc877a use standalone mode when RedisType is blank 2022-05-30 08:39:35 +08:00
xtan
7f92e921b4 Feat:增加对redis集群模式、哨兵模式的支持 (#965)
* 修复go plugin相关错误

* Feat:增加对redis集群模式、哨兵模式的支持

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-05-30 08:36:17 +08:00
caojiaqiang
e22a4394f7 feat: 告警处理出错给Maintainer管理员发送告警信息 (#955)
* feat: 告警处理出错给管理员发送告警信息

* feat: 告警处理出错给管理员发送告警信息,发送信息自己拼接,不使用模版

* feat: 告警处理出错给管理员发送告警信息,不实用AlertCurEvent结构

* feat: 告警处理出错给管理员发送告警信息,日志打印、文本发送优化
2022-05-27 19:00:41 +08:00
xtan
070e5051c6 修复go plugin相关错误 (#964)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-05-27 17:38:15 +08:00
Yening Qin
c040dffb5f feat: add some service api
* feat: add some service api
2022-05-25 15:14:52 +08:00
Ulric Qin
c2f2a7d5e2 use post method to get datasources 2022-05-23 13:31:05 +08:00
Ulric Qin
fd29d18312 delete no use code 2022-05-23 13:29:08 +08:00
Ulric Qin
2f724075b2 loop load clusters from api 2022-05-23 13:13:35 +08:00
Ulric Qin
06224e4b20 refactor 2022-05-22 17:03:57 +08:00
Ulric Qin
f81888cd8a get prometheus info from api. code skelton 2022-05-22 16:56:58 +08:00
Ulric Qin
6a7b543ad6 add mutex for prom transport 2022-05-22 12:45:25 +08:00
Ulric Qin
6ba93527ba upgrade server.conf and webapi.conf in docker environment 2022-05-21 18:10:22 +08:00
Ulric Qin
d6d2639e3a upgrade 5.8.0 2022-05-21 17:49:33 +08:00
ulricqin
ecc51001c3 New Dashboard and support variables in alert_rule_note (#953)
* change alert rule

* Db connect update (#939)

* update target's cluster field when clustername modified in server.conf

* code refactor

* db connect update

* delete DriverName

Co-authored-by: Ulric Qin <ulric.qin@gmail.com>
Co-authored-by: zhangjiandong <zhang.jiandong@baiso.com>

* update sql struct

* change sql

* add some files for new dashboard

* add new board apis

* fix query data

* add dashboard migrate api

* rule note support template

* add value as data for template

* parse rule note before persist

* use prometheus var names

* fixbug rule note template

* refactor sql

* add logo

* refactor: add some log

* mv package poster to pkg

* add version

* compute user total in usage reporter

* feat: add some service api

Co-authored-by: 710leo <710leo@gmail.com>
Co-authored-by: countingwww <871138993@qq.com>
Co-authored-by: zhangjiandong <zhang.jiandong@baiso.com>
2022-05-20 23:48:49 +08:00
GitHamburg
e2232bfa12 update server.conf, add DisableUsageReport (#949) 2022-05-20 13:13:10 +08:00
Ulric Qin
2bea8b7c84 add usage report 2022-05-17 19:24:06 +08:00
Ulric Qin
dd5ae29f82 delete no use code 2022-05-12 10:58:28 +08:00
Ulric Qin
cb741a5521 add wait tool for docker-compose 2022-05-11 12:43:34 +08:00
Ulric Qin
9d434a36d6 add wait tool for docker-compose 2022-05-11 12:42:08 +08:00
Ulric Qin
e89760f374 code refactor 2022-05-08 16:04:20 +08:00
Ulric Qin
02dd70480d update target's cluster field when clustername modified in server.conf 2022-05-08 16:02:50 +08:00
UlricQin
c8e59cdd0c upgrade 5.7.0 2022-04-28 14:30:23 +08:00
Ulric Qin
882952de3e feature: builtin metric_view can be modified by admin 2022-04-27 10:51:12 +08:00
Ulric Qin
279bec6eaa Delete redundant judgment logic 2022-04-24 10:39:24 +08:00
Ulric Qin
614ed283c0 rename MinVersion to TLSMinVersion 2022-04-22 22:25:02 +08:00
Ulric Qin
06672d5ff9 fix user group search 2022-04-22 22:18:58 +08:00
Ulric Qin
78b8cfd365 add tls configurations for webapi 2022-04-22 22:14:59 +08:00
Ulric Qin
e0f0e08852 support redis tls 2022-04-22 21:48:56 +08:00
Ulric Qin
e00f102703 give default configuration value for QueueCount 2022-04-21 12:29:43 +08:00
Ulric Qin
3921627fa2 Merge branch 'main' of github.com:didi/nightingale 2022-04-21 12:26:44 +08:00
Ulric Qin
7a1a65c31b add queue count control chan number 2022-04-21 12:24:26 +08:00
Curith
5e763f1a8b use const http status text instead of a variable (#921) 2022-04-21 11:30:25 +08:00
Ulric Qin
808fa5839a 5.6.4 dev 2022-04-21 11:08:39 +08:00
Ulric Qin
a0c5f94017 use goroutine to send metrics to backend 2022-04-21 11:07:56 +08:00
zheng
9ba1c2c32d 优化钉钉@ 方式,允许关闭at (#917)
token_xxx?noat=1
2022-04-19 15:14:18 +08:00
zheng
5333fb8eab 优化告警格式,增加 监控对象 (#918) 2022-04-19 15:05:07 +08:00
Ulric Qin
ee4a918fc7 Merge branch 'main' of github.com:didi/nightingale 2022-04-18 13:41:59 +08:00
Ulric Qin
1dbfe3417b upgrade 5.6.3 2022-04-18 13:41:20 +08:00
Ulric Qin
c829732af0 add configs for docker env 2022-04-18 13:40:40 +08:00
Yening Qin
1b313a3202 doc: improve readme (#916)
* doc: improve readme
2022-04-16 23:30:08 +08:00
qzh
5732c4403b perf: 合并targets_up指标为一个ident,减少资源利用。 (#915) 2022-04-15 21:17:16 +08:00
Yening Qin
6033a0a743 fix: err is nil (#914) 2022-04-15 14:35:26 +08:00
zheng
e8cfe46381 按告警级别和数量排序 (#913) 2022-04-15 14:33:44 +08:00
Ulric Qin
e94f807d52 delete no used code 2022-04-15 11:10:16 +08:00
Ulric Qin
c15490e756 code refactor 2022-04-14 19:09:25 +08:00
Ulric Qin
b25c523528 code refactor, use NotifyBuiltinChannels to control 2022-04-14 18:56:14 +08:00
Ulric Qin
6d27da8ad8 delete no used code 2022-04-14 17:32:51 +08:00
Ulric Qin
d0e6788724 upgrade 5.6.2 2022-04-14 17:20:07 +08:00
Ulric Qin
1633308000 modify queue size 2022-04-14 17:19:14 +08:00
UlricQin
08141e36cb use 5.6.1 image version 2022-04-14 14:40:10 +08:00
Ulric Qin
b5cfdb1ef6 upgrade 5.6.1 2022-04-14 12:58:02 +08:00
Ulric Qin
3a97a67c7e third time: code refactor for pr 906. use channel as queue for all the receivers 2022-04-14 12:57:30 +08:00
Ulric Qin
8d6101ec5a second time: code refactor for pr 906. new concurrent-map when init; move lock to WritersType 2022-04-14 12:43:39 +08:00
Ulric Qin
e73da37bc0 first time: code refactor for pr 906 2022-04-14 11:11:14 +08:00
qzh
3d587a5762 perf(opentsdb): 数据拉取以ident分发,并把list方式改为chan方式,提高消费效率。如果有多个prometheus实例,也可以通过header中的Ident字段进行一致性hash分发。 (#906)
Co-authored-by: zhihao.qu <zhihao.qu@ly.com>
2022-04-14 10:31:36 +08:00
zheng
42a6be95e8 fix dashboard name (#911) 2022-04-13 21:34:25 +08:00
zheng
ee8c367933 修复大盘目录错误 (#910) 2022-04-13 18:39:41 +08:00
Lars Lehtonen
a20e19922e src/pkg/ibex: fix dropped error (#907) 2022-04-13 10:43:32 +08:00
Ulric Qin
d6d588c5aa upgrade image version 2022-04-11 15:00:26 +08:00
Ulric Qin
1ba0f5ab74 upgrade to 5.6.0 2022-04-08 11:43:11 +08:00
Ulric Qin
b838cb1c6f return last insert object of metric view 2022-04-08 11:07:30 +08:00
Ulric Qin
7eb665e401 modify default sql of alert_aggr_view 2022-04-08 10:11:55 +08:00
Ulric Qin
cb3e371094 parse tags for cur_events 2022-04-07 18:30:11 +08:00
ning
ea30f38b9b doc: modify linux dashboard 2022-04-07 16:06:48 +08:00
ysyneu
8187334ef6 add alerts and dashboard templates for Elasticsearch, MongoDB and Linux Process (#904)
Co-authored-by: yushuangyu <yushuangyu@flashcat.cloud>
2022-04-07 15:53:09 +08:00
Ulric Qin
ac24e8b028 fix: import builtin dashboards 2022-04-07 14:09:14 +08:00
Ulric Qin
30ba544f35 fix order metric_view 2022-04-07 12:01:16 +08:00
Ulric Qin
14b1bc3710 upgrade 5.5.1 2022-04-07 11:43:49 +08:00
Ulric Qin
e8c0d6b987 order by cate and name 2022-04-07 11:37:49 +08:00
Ulric Qin
d0efb206d9 add preset metric_view 2022-04-06 19:04:02 +08:00
Ulric Qin
8abb04afde use hostname+pid instead of ip 2022-04-06 10:27:28 +08:00
Ulric Qin
f7318cfc5a alter table user to users 2022-04-05 09:12:30 +08:00
laiwei
067727165a improve readme (#898)
* improve readme

* resize img of readme
2022-04-02 12:31:10 +08:00
Ulric Qin
544c93c7cf Merge branch 'main' of github.com:didi/nightingale 2022-04-02 12:28:58 +08:00
Ulric Qin
66bc023e51 bugfix: list builtin alerts and dashboards 2022-04-02 12:21:03 +08:00
ning
c5ea2d0d24 doc: add linux telegraf dashboard 2022-04-02 11:15:45 +08:00
Ulric Qin
9e8d9b44b1 fix NotifyRecovered logic 2022-04-01 15:20:52 +08:00
Ulric Qin
db15eaab04 add linux_by_telegraf alerts 2022-03-31 16:06:14 +08:00
Ulric Qin
9d016212c8 move sender package to common 2022-03-31 15:31:21 +08:00
Ulric Qin
a4158c476e mv poster to common package 2022-03-31 15:27:14 +08:00
Ulric Qin
0f1148e096 add jmx_by_exporter dashboard 2022-03-31 15:01:03 +08:00
Ulric Qin
5d17f006f0 check smtp configurations 2022-03-31 12:02:57 +08:00
Ulric Qin
16d303a6fb rename var 2022-03-31 10:40:14 +08:00
Ulric Qin
70e5ac4898 add alert_aggr_view 2022-03-31 10:24:42 +08:00
UlricQin
a914de63c6 upgrade docker-compose to 5.5.0 2022-03-30 15:37:55 +08:00
Ulric Qin
dec518369b update readme 2022-03-30 14:31:21 +08:00
Ulric Qin
926a4e642a Merge branch 'main' of gitee.com:n9e/nightingale 2022-03-30 13:10:42 +08:00
UlricQin
3236883cce update server.conf 2022-03-30 13:09:58 +08:00
ning
be1c3b17d6 doc: add node_exporter kafka_exporter zk_exporter's dashboard and alert template 2022-03-30 13:03:14 +08:00
Yening Qin
a67356639b feat: support OIDC (#893)
* feat: support oidc

* refactor: sso -> oidc

* refactor: add AccessToken

* refactor: change some naming
2022-03-30 11:01:02 +08:00
Lars Lehtonen
7b3cb2eb00 fix router errors (#894) 2022-03-30 10:57:54 +08:00
dependabot[bot]
8459ffb690 Bump github.com/gogo/protobuf from 1.1.1 to 1.3.2 (#895)
Bumps [github.com/gogo/protobuf](https://github.com/gogo/protobuf) from 1.1.1 to 1.3.2.
- [Release notes](https://github.com/gogo/protobuf/releases)
- [Commits](https://github.com/gogo/protobuf/compare/v1.1.1...v1.3.2)

---
updated-dependencies:
- dependency-name: github.com/gogo/protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-30 10:18:31 +08:00
Ulric Qin
b260a20646 give blank method for datadog-agent 2022-03-28 16:41:31 +08:00
Ulric Qin
db29adff5d Merge branch 'main' of github.com:didi/nightingale 2022-03-28 10:25:56 +08:00
Ulric Qin
d3576440d4 fix index of metric_view and alert_aggr_view 2022-03-28 10:15:53 +08:00
Ulric Qin
c557e383b6 add metric_view crud method 2022-03-27 19:06:31 +08:00
laiwei
768a1e37e9 add contributing of readme 2022-03-25 16:02:46 +08:00
Ulric Qin
46e2fc6ab6 add windows metrics description 2022-03-23 18:08:32 +08:00
Ulric Qin
dacf004797 add windows alerts 2022-03-23 17:47:22 +08:00
Ulric Qin
44ed81218a update promql of windows dashboard 2022-03-23 17:34:06 +08:00
Ulric Qin
d802abc86c add windows dashboard 2022-03-23 17:19:50 +08:00
Ulric Qin
4c22284ca7 add cluster field when import builtin alerts 2022-03-23 14:48:28 +08:00
Ulric Qin
929c970b42 import builtin dashboard 2022-03-23 14:04:55 +08:00
Ulric Qin
496c8d8356 handle alerts builtin 2022-03-23 13:58:45 +08:00
Ulric Qin
e707f1a23d Merge branch 'main' of github.com:didi/nightingale 2022-03-23 13:36:43 +08:00
Ulric Qin
e7145018ef add alerts and dashboards 2022-03-23 13:36:22 +08:00
Jeyrce.Lu
18164fdb16 perf: optimize alert plugin call(#886) (#891) 2022-03-22 18:10:35 +08:00
Ulric Qin
3b9e40c5d4 add severity in card 2022-03-22 15:49:59 +08:00
Ulric Qin
6d20b8ef72 fill notify groups of events 2022-03-22 15:36:51 +08:00
Ulric Qin
8bdd35975e AlertCurEventGetByIds 2022-03-22 15:24:25 +08:00
Ulric Qin
9ccdd6c3e7 fix nil pointer 2022-03-22 15:18:45 +08:00
Ulric Qin
30365a2256 code refactor 2022-03-22 15:14:56 +08:00
Ulric Qin
cdd4100a30 code refactor 2022-03-22 14:43:30 +08:00
Ulric Qin
2cd9f50357 code refactor 2022-03-22 14:38:56 +08:00
Ulric Qin
106345ff49 add debug log 2022-03-22 14:26:37 +08:00
Ulric Qin
7c8c961aef query alerts card 2022-03-22 14:10:10 +08:00
Ulric Qin
e1bd7f0267 verify alert_aggr_view 2022-03-22 11:38:16 +08:00
Ulric Qin
025c5809be add alert_aggr_view crud 2022-03-22 11:19:06 +08:00
Ulric Qin
d45fdd50e7 modify sql: add group_name for event 2022-03-21 17:35:10 +08:00
Ulric Qin
f4388d36de update readme, remove gitee docs links 2022-03-21 17:11:31 +08:00
Ulric Qin
4a62339c69 do not math.Round for metric value 2022-03-21 16:43:28 +08:00
Ulric Qin
5a9b8d6bd0 add configuration: BusiGroupLabelKey 2022-03-21 14:13:04 +08:00
Ulric Qin
8ce71de693 code refactor for append labels 2022-03-21 14:04:32 +08:00
Ulric Qin
6d9846f1f5 sync busi_group 2022-03-21 12:06:53 +08:00
Ulric Qin
c9be9b0538 add label_value field for busi_group 2022-03-21 11:44:51 +08:00
Ulric Qin
65f7214e67 update redis metrics 2022-03-20 16:33:39 +08:00
Jeyrce.Lu
302cebbbec [#886] Feature: 提供一种go plugin 告警通知方式 (#887)
* [#886] Feature: 提供一种go plugin 告警通知方式

* fix: 移除下层并发
2022-03-20 10:27:17 +08:00
zheng
46c60a32fd 修复无法删除空dashboard问题 (#889) 2022-03-17 19:07:24 +08:00
Ulric Qin
7ec6d84c7d use text for chart.configs 2022-03-17 18:57:57 +08:00
Ulric Qin
0bbdb03ace add metrics for mysqld_exporter 2022-03-17 10:38:13 +08:00
Ulric Qin
149d074206 add metrics of mysqld_exporter 2022-03-16 19:43:32 +08:00
Ulric Qin
0b491826ee modify metrics order of mysqld_exporters 2022-03-16 15:18:26 +08:00
Ulric Qin
e6d4f2540c add some mysql metric descriptions 2022-03-16 14:44:16 +08:00
Ulric Qin
fcc75710cb add some mysql metrics of mysqld_exporter 2022-03-16 13:23:35 +08:00
UlricQin
de65c5a6cf docker-compose use 5.4.1 2022-03-14 19:04:37 +08:00
Ulric Qin
fde52167b3 delete no use code 2022-03-07 18:21:10 +08:00
Ulric Qin
1ffdf3d283 bugfix: AdminRole 2022-03-07 18:19:19 +08:00
Ulric Qin
94a49c17f7 persist recovered events 2022-03-03 10:44:06 +08:00
Ulric Qin
e515039ad4 use bgrwCheck func to check alert_rule put 2022-03-03 10:25:52 +08:00
Ulric Qin
93f88296da update notify.py in docker dir 2022-03-01 17:07:20 +08:00
Ulric Qin
1f4e8e752e update docker-compose configs 2022-03-01 17:04:10 +08:00
Ulric Qin
fed9b9a19d upgrade docker-compose 2022-03-01 16:32:58 +08:00
Ulric Qin
fbcc71340d upgrade 5.4.0 2022-03-01 16:27:51 +08:00
Ulric Qin
c6356df81f +NotifyBuiltinEnable 2022-03-01 16:27:21 +08:00
Ulric Qin
085bd39684 modify mailbody 2022-03-01 14:02:38 +08:00
Ulric Qin
b73bef8a0c lower NotifyConcurrency 2022-03-01 13:52:03 +08:00
Ulric Qin
9c662de129 add smtp log 2022-03-01 13:50:51 +08:00
Ulric Qin
caa37b087c use batch send mail 2022-03-01 13:44:46 +08:00
Ulric Qin
b63c853889 use smtp.DialAndSend func 2022-03-01 13:27:23 +08:00
Ulric Qin
2ff79c7780 use golang as sender 2022-03-01 11:16:55 +08:00
Ulric Qin
403cb5a6ad not stable version 2022-02-28 23:50:02 +08:00
zheng
b43f196d86 优化只保留5位小数 (#878)
* 优化只保留5位小数

* 优化小数点保留方法
2022-02-28 18:06:23 +08:00
Ulric Qin
483b353494 Merge branch 'main' of github.com:didi/nightingale 2022-02-26 12:00:25 +08:00
Ulric Qin
cddc99981d modify perm of read tasks 2022-02-26 12:00:08 +08:00
zheng
01f1f50880 限制timestamp不能大于当前时间5分钟 (#872) 2022-02-18 19:17:28 +08:00
Ulric Qin
8664c3df37 refactor 2022-02-18 16:29:49 +08:00
eshun
f009c43878 add windows support (#867)
* add windows support

* add windows support

* add windows support

Co-authored-by: 78552423@qq.com <chenyz0812>
2022-02-18 15:55:00 +08:00
Ulric Qin
f8482601a8 Merge branch 'main' of github.com:didi/nightingale 2022-02-17 19:29:51 +08:00
Ulric Qin
8c4ab88888 return all busi-groups when subscribe 2022-02-17 19:28:35 +08:00
张哲铭
37421dd56a 兼容使用pg数据库,contacts字段json格式无法转换的问题 (#868) 2022-02-15 17:58:33 +08:00
UlricQin
5c2581a90a upgrade 5.3.4 2022-02-15 17:19:47 +08:00
Ulric Qin
6a3a630759 modify Makefile 2022-02-14 16:18:30 +08:00
Ulric Qin
fff5110e9a copy metrics.yaml from https://articles.zsxq.com/id_izcsnhl3dtd6.html 2022-02-13 14:18:15 +08:00
UlricQin
d31fe9cb71 modify user-groups query limit 2022-02-11 13:05:20 +08:00
UlricQin
bd762172d4 add space in error log 2022-02-10 17:54:52 +08:00
UlricQin
b32a7b3a9e add global callback 2022-02-10 17:32:06 +08:00
UlricQin
3ccc09674e query user-groups 2022-02-10 15:45:38 +08:00
UlricQin
c10f10010a upgrade 5.3.3 2022-01-29 13:58:59 +08:00
UlricQin
9beef8f36a add last_sent_time for alert_cur_event 2022-01-29 13:46:10 +08:00
UlricQin
8408220870 upgrade 5.3.2 2022-01-29 11:12:06 +08:00
UlricQin
2e63993b7f fix 2022-01-29 11:08:32 +08:00
UlricQin
b482c7a076 recover_duration done 2022-01-29 11:01:44 +08:00
laiwei
733abd5568 update introduction of nightingale in readme 2022-01-28 13:31:46 +08:00
Ulric Qin
dd1147f534 refactor telegraf.service 2022-01-26 09:15:41 +08:00
UlricQin
19c90d356c refactor make pack 2022-01-26 09:08:28 +08:00
Ulric Qin
c042e39d54 upgrade 2022-01-26 09:03:57 +08:00
Ulric Qin
598ae07fc2 add feature: recover_duration 2022-01-26 08:59:30 +08:00
Ulric Qin
e5d7612af9 n9e-server support basic auth for Reader 2022-01-21 23:34:25 +08:00
UlricQin
f3924dab5b delete pendings when recoverRule 2022-01-12 13:50:29 +08:00
UlricQin
ac6f49e63d upgrade 5.3.0 2022-01-11 11:56:09 +08:00
UlricQin
7f4cb3888f support falcon datamodel 2022-01-11 11:25:03 +08:00
UlricQin
120c2fe52a fix proxy Host header 2022-01-10 20:16:44 +08:00
UlricQin
b9c674d662 prometheus proxy add Header Host 2022-01-08 19:40:43 +08:00
Ulric Qin
dcee4677ed Merge branch 'main' of github.com:didi/nightingale 2022-01-08 17:52:42 +08:00
Ulric Qin
d590f6d5c1 enable_in_bg logic 2022-01-08 17:52:29 +08:00
UlricQin
850a370f9d add targets apis 2022-01-06 11:48:30 +08:00
UlricQin
40e7ede5e3 Merge branch 'main' of github.com:didi/nightingale 2022-01-04 16:47:15 +08:00
UlricQin
9a2257dd1e ldap user default role configuration 2022-01-04 16:47:03 +08:00
Ulric Qin
7b4eddc967 code refactor 2021-12-31 13:52:44 +08:00
Ulric Qin
843e37b99d code refactor 2021-12-31 13:50:12 +08:00
Ulric Qin
19981ce649 refactor 2021-12-31 13:49:14 +08:00
Ulric Qin
2740af3571 add arch.png 2021-12-31 13:45:54 +08:00
Ulric Qin
b693e80d75 check basicauth 2021-12-31 12:07:23 +08:00
Ulric Qin
e9ce679649 handle python2 encoding 2021-12-31 11:13:57 +08:00
Ulric Qin
a56d6b568b refactor log print 2021-12-30 09:37:52 +08:00
Ulric Qin
904d09d91c add datadog deflate encoding 2021-12-29 14:59:05 +08:00
Ulric Qin
3700f7a10b update datadog url 2021-12-29 14:52:22 +08:00
Ulric Qin
d57415d23d add datadog receiver 2021-12-28 11:00:48 +08:00
Ulric Qin
86649d8314 Merge branch 'main' of github.com:didi/nightingale 2021-12-27 13:30:56 +08:00
Ulric Qin
06eca94492 add datadogSeries 2021-12-27 13:30:45 +08:00
JeffreyBool
ef6f6f95c0 增加 server 指标采集 (#850)
* 修改名称

* 增加 server 指标采集
2021-12-25 22:31:49 +08:00
JeffreyBool
991a3e2ab5 添加 n9e-webapi 指标采集 (#848)
* 添加自定义发现文件

* 添加 webapi 指标
2021-12-25 22:11:41 +08:00
Ulric Qin
08c6659804 use longer varchar 2021-12-24 20:03:53 +08:00
Ulric Qin
74e4724e66 delete no use code: repeater.go 2021-12-23 22:54:37 +08:00
Ulric Qin
1ea8694769 refactor fireEvent 2021-12-23 22:43:18 +08:00
Ulric Qin
218140066b fix r.rule.NotifyRepeatStep unit 2021-12-23 22:26:53 +08:00
Ulric Qin
837cfab1bd refactor repeater 2021-12-23 22:19:49 +08:00
Ulric Qin
3428b11ea8 configuration for metrics.yaml and templates 2021-12-23 12:53:32 +08:00
Ulric Qin
f661a6bd37 refactor dingtalk.tpl 2021-12-17 13:04:39 +08:00
Ulric Qin
c3c1aa5aff refactor dingtalk.tpl 2021-12-17 12:24:24 +08:00
Ulric Qin
7bcb6acb03 refactor 2021-12-17 12:11:15 +08:00
Ulric Qin
5b22d65dba add space line 2021-12-17 12:09:35 +08:00
Ulric Qin
8570c2d287 modify dingtalk markdown 2021-12-17 12:05:41 +08:00
Ulric Qin
acc797666d test markdown 2021-12-17 11:20:32 +08:00
Ulric Qin
b62a42bed8 dingtalk use markdown 2021-12-17 11:05:15 +08:00
Ulric Qin
b452be880b update README 2021-12-16 19:57:55 +08:00
Ulric Qin
49176ae240 support grafana-agent 2021-12-16 17:58:49 +08:00
Ulric Qin
8eb4a39e7d fix index out of range 2021-12-16 17:07:27 +08:00
Ulric Qin
0f65a1f5dd add remote write api support 2021-12-16 16:59:51 +08:00
Ulric Qin
a71edc4040 extract IamLeader function and fix repeat 2021-12-15 20:52:00 +08:00
Ulric Qin
23b6cf1a68 fix repeat sender 2021-12-15 19:37:55 +08:00
Ulric Qin
3babc6c50a fix tple 2021-12-15 19:22:18 +08:00
Ulric Qin
a4ef00fe3e add send time 2021-12-15 19:16:39 +08:00
Ulric Qin
0f3bbf6368 use NotifyRepeatNext as TriggerTime when repeat notify 2021-12-15 18:37:48 +08:00
Ulric Qin
95ebc44f05 refactor notify.py 2021-12-14 21:39:01 +08:00
UlricQin
64945637e0 upgrade 5.0.0-ga-06 2021-12-14 15:29:19 +08:00
Ulric Qin
0baf977bc9 feishu done flag 2021-12-14 15:08:20 +08:00
Ulric Qin
caa33c29e9 refactor creating busi group 2021-12-13 11:12:49 +08:00
Ulric Qin
d5050338f3 use last_eval_time for filter 2021-12-11 18:14:23 +08:00
Ulric Qin
7f0877bf28 add table column: last_eval_time in alert_his_event 2021-12-11 18:07:01 +08:00
Ulric Qin
d4c4257517 code refactor for i18n when occur duplicate tagkey 2021-12-11 17:25:45 +08:00
Ulric Qin
61f76afa0d handle duplicate tagkey 2021-12-11 17:23:18 +08:00
UlricQin
fe86cb4b74 update version 2021-12-11 17:22:41 +08:00
Ulric Qin
5634f48725 remove perm of targets 2021-12-10 09:49:11 +08:00
Ulric Qin
964d50b4e7 add perm function in routers 2021-12-10 09:44:06 +08:00
Ulric Qin
d2cb48a2ef remove writer name 2021-12-09 23:07:45 +08:00
Ulric Qin
53411dc5d9 add perm 2021-12-09 22:08:22 +08:00
Ulric Qin
cab6089a37 add perm control busi-group adding 2021-12-09 22:04:16 +08:00
Ulric Qin
32fea64f3e use configuration file to control AnonymousAccess 2021-12-09 16:59:02 +08:00
Ulric Qin
bf4e0ca7c0 modify github template 2021-12-09 14:22:39 +08:00
Ulric Qin
39bd02f741 Merge branch 'main' of gitee.com:n9e/nightingale 2021-12-09 12:58:40 +08:00
Ulric Qin
930b1181ee add tmp jvm-dash.json 2021-12-09 12:58:08 +08:00
UlricQin
1aac8c1e25 delete dirty files 2021-12-08 23:59:54 +08:00
Ulric Qin
3e8b110809 upgrade 5.0.0-ga-04 2021-12-08 23:57:20 +08:00
Ulric Qin
e0c1bebb13 modify n9eetc dir 2021-12-08 23:55:51 +08:00
UlricQin
7ccb2aaa9c upgrade 5.0.0-ga-03 2021-12-08 22:48:02 +08:00
Ulric Qin
aa2e5f15ee update recover event 2021-12-08 22:31:48 +08:00
Ulric Qin
ed5e93f373 modify event url 2021-12-08 21:36:21 +08:00
Ulric Qin
48247ea7fe At least one team have rw permission 2021-12-08 13:18:53 +08:00
Ulric Qin
12a5f335bd get event detail no need login 2021-12-08 10:04:31 +08:00
Ulric Qin
5e19eadd61 add recover_time only when IsRecovered 2021-12-08 00:17:42 +08:00
Ulric Qin
0e88f0074c add recover_time 2021-12-08 00:07:25 +08:00
Ulric Qin
2bfc67686d refactor alert_subscribe.user_group_ids 2021-12-07 19:33:39 +08:00
Ulric Qin
6c2c8f9900 add feishu support 2021-12-07 18:39:44 +08:00
Ulric Qin
766bf9e401 code refactor 2021-12-07 13:50:42 +08:00
Ulric Qin
4f8fedbaa0 delete no use code 2021-12-07 13:44:14 +08:00
Ulric Qin
b108c9f11a refactor: The business group must retain at least one team 2021-12-06 21:33:36 +08:00
Ulric Qin
cc380c85b9 upgrade docker-compose's n9e image version to 5.0.0-ga-02 2021-12-06 20:34:45 +08:00
Ulric Qin
62165ce01d add operations 2021-12-06 19:23:43 +08:00
Ulric Qin
c8b05649f5 modify rule operations 2021-12-06 19:20:33 +08:00
UlricQin
94fb62fcca upgrade 5.0.0-ga-02 2021-12-06 19:06:38 +08:00
Ulric Qin
bef8e8e548 bugfix: handle rule judge 2021-12-06 18:44:56 +08:00
Ulric Qin
88063cd30e bugfix: callback ibex 2021-12-06 18:20:44 +08:00
Ulric Qin
2185fbff65 Merge branch 'main' of gitee.com:n9e/nightingale into main 2021-12-06 15:19:30 +08:00
Ulric Qin
a94a602d4f remove jwtAuth in prom api 2021-12-06 15:18:56 +08:00
UlricQin
6ed13bdccb add pub when pack 2021-12-06 10:56:55 +08:00
UlricQin
ff79ad1338 add disk and diskio metric description 2021-12-06 10:33:58 +08:00
UlricQin
f6703e11c4 add some metric desn 2021-12-06 09:40:14 +08:00
UlricQin
2e1936dcce modify doc 2021-12-06 08:35:21 +08:00
UlricQin
698ac2758f upgrade docker-compose for test 2021-12-05 22:12:05 +08:00
UlricQin
8cef8e5c9e upgrade docker-compose 2021-12-05 22:05:53 +08:00
UlricQin
c5c53466fb refactor Dockerfile 2021-12-05 21:57:36 +08:00
UlricQin
acd2e9398b add pub in Dockerfile 2021-12-05 21:41:17 +08:00
UlricQin
df97166f07 add api: check perm 2021-12-05 20:40:13 +08:00
UlricQin
022fef2b9e add telegraf.service 2021-12-05 15:39:49 +08:00
UlricQin
5f05e8fcaf ignore 2021-12-05 15:07:26 +08:00
UlricQin
7e353eb0e8 ga-01 done 2021-12-04 17:11:39 +08:00
UlricQin
499389d2c3 modify default group 2021-12-04 16:53:20 +08:00
UlricQin
7274f606dc delete no use binary 2021-12-04 16:51:53 +08:00
UlricQin
da52f125f3 ignore .payload 2021-12-04 12:09:44 +08:00
UlricQin
b418dec3ab bugfix: event mute 2021-12-04 12:07:30 +08:00
UlricQin
79401183ca bugfix 2021-12-02 17:37:42 +08:00
UlricQin
270d3b7e5b code refactor 2021-12-02 17:34:54 +08:00
UlricQin
4e3f9914f1 use i18n error when import rules and dashboards 2021-12-02 10:19:10 +08:00
UlricQin
dd8e1f2d71 add api: /api/n9e/version 2021-12-01 16:46:37 +08:00
UlricQin
f63f019e87 refactor readme 2021-12-01 15:35:38 +08:00
UlricQin
11e7c41908 add EngineDelay 2021-12-01 14:09:08 +08:00
UlricQin
57c2fd9b73 update jwt 2021-12-01 11:40:49 +08:00
UlricQin
dc9fe38735 modify args: hours->days 2021-12-01 11:26:44 +08:00
UlricQin
622d4ac165 refactor 2021-12-01 10:14:35 +08:00
Ulric Qin
3090e13be7 verify tpl tags modify 2021-11-30 18:16:09 +08:00
UlricQin
f96a36aa43 bugfix 2021-11-30 14:25:02 +08:00
UlricQin
6ad24419ab engine wait 2min 2021-11-30 12:33:37 +08:00
UlricQin
04319a6b41 add /v1/n9e/users 2021-11-30 11:57:55 +08:00
UlricQin
952f6b139d add api: get one alert-subscribe 2021-11-30 11:49:08 +08:00
UlricQin
f58cb923d4 add todo item 2021-11-29 20:29:21 +08:00
UlricQin
d43067bad4 bugfix 2021-11-29 20:06:45 +08:00
UlricQin
c17ade64e1 bugfix 2021-11-29 19:56:36 +08:00
UlricQin
4ddbba1400 bugfix 2021-11-29 15:36:15 +08:00
UlricQin
536e2a3b7c modify img width 2021-11-28 19:19:27 +08:00
UlricQin
fe97e158ef add doc 2021-11-28 19:16:30 +08:00
UlricQin
6e3ad3dd6b version 5.1 2021-11-28 18:57:49 +08:00
UlricQin
7a2b07eebd code refactor 2021-11-18 09:09:38 +08:00
UlricQin
70235eeeee Merge branch 'master' of https://github.com/didi/nightingale 2021-11-18 09:08:17 +08:00
UlricQin
f0f5af4fb0 code refactor 2021-11-18 09:07:53 +08:00
710leo
0254a4ec34 refactor: ldap search request 2021-11-11 15:16:26 +08:00
710leo
6d24b07573 feat: add get-user-by-token api 2021-09-09 23:18:58 +08:00
Ulric Qin
0d19ec267f set http.Server.ReadTimeout to 30*time.Second 2021-09-04 19:48:11 +08:00
Ulric Qin
c63987d726 add limit for local call 2021-09-04 19:45:38 +08:00
Ulric Qin
086dcad81f move var Version to config package 2021-09-04 19:35:34 +08:00
UlricQin
eaacf04c68 feature: load alert_events to memory when start 2021-09-04 16:41:55 +08:00
UlricQin
ee859df057 Merge branch 'master' of github.com:didi/nightingale 2021-09-04 16:21:01 +08:00
UlricQin
d809c6ffa9 bugfix: cannot delete alert_event when recovered 2021-09-04 16:20:46 +08:00
710leo
0cc4d85b37 refactor: remove processor logic 2021-09-04 12:43:49 +08:00
710leo
7c2f49146d refactor: modify history_alert_event column 2021-09-04 10:58:31 +08:00
UlricQin
19c2fb6f82 remove processor logic 2021-09-04 10:47:21 +08:00
710leo
882a97566b docs: upgrade 5.0.0-rc7 2021-09-03 10:46:41 +08:00
UlricQin
b1d67af206 code refactor 2021-09-03 09:51:55 +08:00
UlricQin
ca1daaaea3 code refactor 2021-09-03 09:49:19 +08:00
UlricQin
b24b1d17e6 code refactor 2021-09-03 09:23:33 +08:00
710leo
86b2dcd248 fix: support new static file 2021-09-02 12:23:20 +08:00
UlricQin
bd84c433cd set alias blank if __alias__ not found 2021-08-31 16:50:57 +08:00
Ulric Qin
238a611cbb Merge branch 'master' of github.com:didi/nightingale 2021-08-28 17:22:40 +08:00
Ulric Qin
8b951a306d code refactor 2021-08-28 17:21:29 +08:00
UlricQin
d685fa2a30 code refactor 2021-08-27 18:58:06 +08:00
Istil
ae0b036ae4 feat: support to mute by resource classpath_prefix (#785) 2021-08-27 18:21:36 +08:00
UlricQin
351dee6a12 Merge branch 'master' of github.com:didi/nightingale 2021-08-26 12:22:19 +08:00
UlricQin
8b748a7840 delete user/group when get alert rule if user/group is deleted 2021-08-26 12:21:57 +08:00
710leo
afb01b0258 fix: batch update alert rule 2021-08-26 11:03:32 +08:00
UlricQin
d92ca5f2a9 Merge branch 'master' of github.com:didi/nightingale 2021-08-24 10:35:17 +08:00
UlricQin
f21909adb8 check resource exists when bind classpath 2021-08-24 10:35:00 +08:00
710leo
2bf4a25dd1 refactor: delete classpaths tree api 2021-08-23 19:48:42 +08:00
710leo
2096e6ea8d feat: support get user by name api 2021-08-23 19:46:18 +08:00
710leo
25cad60ef8 refactor: event process 2021-08-23 18:35:54 +08:00
UlricQin
8e5b72c833 Update bug_report.md 2021-08-23 17:04:56 +08:00
UlricQin
90d9510d23 Update bug_report.md 2021-08-23 17:03:34 +08:00
710leo
58093eedf0 feat: script collect support append_tags and fix timeout 2021-08-23 16:59:55 +08:00
Istil
754ad160dc feat: support to deal with alert_event (#777) 2021-08-23 16:39:07 +08:00
UlricQin
12e3b3a490 add goproxy 2021-08-23 10:11:53 +08:00
Ulric Qin
eca12a6587 remove csrf 2021-08-22 17:55:49 +08:00
Ulric Qin
5106b73699 del unused code 2021-08-22 09:43:42 +08:00
Ulric Qin
98fe1e0121 feat: add timer: CleanStalePoints 2021-08-22 09:30:41 +08:00
Ulric Qin
de99077b32 code refactor: remove unused field(TagsLst) of MetricPoint 2021-08-22 08:45:29 +08:00
Ulric Qin
e288a3d3a9 Merge branch 'master' of github.com:didi/nightingale 2021-08-21 21:52:47 +08:00
Ulric Qin
0588d39911 bugfix: handle status when create alert_rule 2021-08-21 21:52:08 +08:00
710leo
55ec76a23d refactor: fix conflicts 2021-08-20 19:52:37 +08:00
710leo
a2eab9e5ab refactor: delete csrf check and some v1 api 2021-08-20 19:45:17 +08:00
Istil
f651af970f feat: support classpaths prefix tree display (#769) 2021-08-18 16:43:39 +08:00
Ulric Qin
1eecb324d0 code refactor: exec pull_prom early 2021-08-15 10:28:37 +08:00
Ulric Qin
60a964ae55 bugfix: for range goroutine 2021-08-15 10:08:08 +08:00
Ulric Qin
a7cf8f9ec9 !1 fix judge prom
Merge pull request !1 from Ulric Qin/judge_prom_bugfix
2021-08-14 16:20:17 +00:00
Ulric Qin
0b4e3b9656 fix judge prom 2021-08-14 23:17:06 +08:00
710leo
ca8a8701b4 fix: collect append tags 2021-08-13 16:01:33 +08:00
710leo
4ca83fcc1a docs: upgrade 5.0.0-rc5 2021-08-12 17:53:19 +08:00
710leo
509e1ef00a refactor: remove useless code 2021-08-12 17:41:57 +08:00
ning1875
42fc0527cb 1. Move the default ql to the configuration (#764)
2. add slowLogRecordSecond to  log slow query
3. Create a slice with a specified length to avoid dynamic expansion
4. slow query print fetch series time took and the result series num
2021-08-10 15:25:54 +08:00
UlricQin
8b508fc514 code refactor 2021-08-06 18:03:36 +08:00
ning1875
2f060cfb43 1. set Config.Heartbeat.LocalAddr when ip can not fetch by auto detect (#761)
2. upgrade snappy to v0.0.3 to avoid the following failure on Go 1.16
2021-08-06 12:19:36 +08:00
Istil
4eb79fb017 feat: support history alert events store (#760) 2021-08-06 12:16:32 +08:00
ning1875
c38d595cb8 1. fix CommonQuerySeries.Warning range nil pointer err (#759) 2021-08-05 11:04:50 +08:00
710leo
e29407486d update service file 2021-08-04 22:33:43 +08:00
710leo
4d3ca94e4c docs: upgrade 5.0.0-rc4 2021-08-04 17:56:26 +08:00
710leo
e27ec8136e refactor: remote write log print 2021-08-04 14:26:53 +08:00
ning1875
9383976918 1. delete recovery event from cache after event is really mark recovery (#758) 2021-08-04 12:14:47 +08:00
710leo
8764270f47 feat: support permission check api 2021-08-04 11:05:54 +08:00
Istil
2ef85d9aae feat: support batch edit notify-users, notify-channels and append-tags of alert rule (#757) 2021-08-03 14:20:22 +08:00
201806060205
5064e5b0b1 feat: support batch edit notify groups of alert rule (#756) 2021-08-02 20:16:43 +08:00
ning1875
72244b1983 1. remove some todo (#755) 2021-08-02 17:37:55 +08:00
710leo
e14d3eac4d refactor: get notify content by tpl 2021-08-01 18:32:36 +08:00
UlricQin
62cbe4a833 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-30 16:03:44 +08:00
UlricQin
6040e79d50 sort classpaths of event 2021-07-30 16:03:25 +08:00
710leo
ffe3dd6bca refactor: alert rule name duplicate check 2021-07-30 13:30:50 +08:00
710leo
853053f56d fix: alert rule put 2021-07-30 10:58:42 +08:00
710leo
dde431b422 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-30 10:38:54 +08:00
ning1875
e6cf77f34b 1. query nodata at one ts fill none (#753) 2021-07-30 10:19:03 +08:00
UlricQin
834a5e83ee Merge branch 'master' of https://github.com/didi/nightingale 2021-07-30 10:14:34 +08:00
UlricQin
f2173bff26 log regexp checker 2021-07-30 10:14:17 +08:00
710leo
c4b1da66d4 docs: change metric_description sql 2021-07-29 20:32:19 +08:00
ning1875
9c9f2973f8 1. fix remote write err assert on recoverableError (#749) 2021-07-28 12:11:36 +08:00
UlricQin
0c35f32c5c fix send_email 2021-07-27 18:31:50 +08:00
UlricQin
f2e3e3dbf1 add WECOM url 2021-07-27 15:42:14 +08:00
李伟强
5045098c91 Update notify.py (#748) 2021-07-27 15:35:37 +08:00
ning1875
20e34bfe15 1. tag-pair support fuzzy matching search (#747) 2021-07-27 15:07:54 +08:00
dependabot[bot]
0f63feed22 build(deps): bump github.com/gin-gonic/gin from 1.6.3 to 1.7.0 (#746)
Bumps [github.com/gin-gonic/gin](https://github.com/gin-gonic/gin) from 1.6.3 to 1.7.0.
- [Release notes](https://github.com/gin-gonic/gin/releases)
- [Changelog](https://github.com/gin-gonic/gin/blob/master/CHANGELOG.md)
- [Commits](https://github.com/gin-gonic/gin/compare/v1.6.3...v1.7.0)

---
updated-dependencies:
- dependency-name: github.com/gin-gonic/gin
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-07-27 15:06:58 +08:00
ning1875
ec6f3098bb notify.py add sys.encoding to avoid coding error (#744) 2021-07-27 11:38:42 +08:00
UlricQin
53f08eae30 Update build.sh (#745) 2021-07-27 11:28:11 +08:00
ning1875
cade83f075 添加debug日志 (#743)
* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 生成告警模板信息

* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 增加二开说明

* 1. notify.py 用户创建一个虚拟的用户保存上述im群 的机器人token信息 user的contacts map中

* 1. notify.py alerts目录改为原来的

* 1. notify.py dingtalk send continue匹配

* 1. push型告警支持多条件 任意一个触发就触发

* 1. prometheus查询接口 tag-keys tag-values支持 params为空的情况

* 1. prometheus查询接口 ident匹配全部改为精确匹配
2. tagKey 提示改为tag_key

* 1. prometheus查询接口 支持instance_query 对外暴露

* 1. prometheus instance_query改名为instant-query
2. page group中去掉数据查询相关path

* 1. prometheus range_query 时间戳改为秒级
2. 查询支持传入分辨率参数

* 1. 新增jmx_exporter内置大盘

* 1. 新增blackbox_exporter内置大盘
2. 新增blackbox_exporter内置告警策略

* 1. 添加一些debug帮助定位恢复的告警在db event中删除的过程
2021-07-26 17:36:32 +08:00
710leo
af93088d2f refactor: change collect config 2021-07-26 14:44:03 +08:00
710leo
e396ad4f67 refactor: change collect config 2021-07-25 21:41:37 +08:00
Ulric Qin
979a77eafa simplify logs 2021-07-25 18:52:06 +08:00
UlricQin
c0b42cf29a bugfix: reuse error var in remoteWriteProm (#741)
* fix: reuse error var when remoteWritePost

* add debug log

* remove logs
2021-07-25 18:23:47 +08:00
710leo
ae4f20bca1 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-25 18:19:48 +08:00
710leo
b947a466a9 fix: alert ineffective on sunday 2021-07-25 18:19:18 +08:00
710leo
4063998ddb docs: change user.role to user.roles in sql 2021-07-24 23:48:13 +08:00
UlricQin
266397bac3 hehe... 2021-07-24 23:47:33 +08:00
UlricQin
f023b99fa9 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-23 21:15:41 +08:00
UlricQin
bb148f9bea add more log 2021-07-23 21:15:23 +08:00
710leo
7bdcbf2e95 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-23 12:15:19 +08:00
710leo
070d1947e8 feat: temporary support for creating collection configurations api 2021-07-23 12:15:08 +08:00
UlricQin
a4c244cb61 modify user_group_ids of alert_rule_group when user_group deleted 2021-07-23 12:03:47 +08:00
UlricQin
e089271f78 code refactor 2021-07-22 23:41:22 +08:00
UlricQin
24887cce83 use group get api: handle empty usergroup 2021-07-22 23:34:06 +08:00
UlricQin
0f7a81ff11 fill objs when return alert event 2021-07-22 23:30:12 +08:00
ning1875
2d15445482 新增blackbox_exporter支持 (#740)
* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 生成告警模板信息

* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 增加二开说明

* 1. notify.py 用户创建一个虚拟的用户保存上述im群 的机器人token信息 user的contacts map中

* 1. notify.py alerts目录改为原来的

* 1. notify.py dingtalk send continue匹配

* 1. push型告警支持多条件 任意一个触发就触发

* 1. prometheus查询接口 tag-keys tag-values支持 params为空的情况

* 1. prometheus查询接口 ident匹配全部改为精确匹配
2. tagKey 提示改为tag_key

* 1. prometheus查询接口 支持instance_query 对外暴露

* 1. prometheus instance_query改名为instant-query
2. page group中去掉数据查询相关path

* 1. prometheus range_query 时间戳改为秒级
2. 查询支持传入分辨率参数

* 1. 新增jmx_exporter内置大盘

* 1. 新增blackbox_exporter内置大盘
2. 新增blackbox_exporter内置告警策略
2021-07-22 15:23:49 +08:00
UlricQin
3236d7dfd1 code refactor for name checker 2021-07-22 14:51:14 +08:00
UlricQin
8a06cac5f1 bugfix: check perm for alert_rule_delete_batch 2021-07-22 14:44:44 +08:00
UlricQin
60d5c6b55e modify readme 2021-07-22 09:51:39 +08:00
UlricQin
77c6e0dbff Merge branch 'master' of https://github.com/didi/nightingale 2021-07-21 18:13:43 +08:00
UlricQin
3e22aabe28 tag value support blank 2021-07-21 18:11:45 +08:00
ning1875
bedea9eb05 添加jmx_exporter内置大盘图 (#739)
* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 生成告警模板信息

* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 增加二开说明

* 1. notify.py 用户创建一个虚拟的用户保存上述im群 的机器人token信息 user的contacts map中

* 1. notify.py alerts目录改为原来的

* 1. notify.py dingtalk send continue匹配

* 1. push型告警支持多条件 任意一个触发就触发

* 1. prometheus查询接口 tag-keys tag-values支持 params为空的情况

* 1. prometheus查询接口 ident匹配全部改为精确匹配
2. tagKey 提示改为tag_key

* 1. prometheus查询接口 支持instance_query 对外暴露

* 1. prometheus instance_query改名为instant-query
2. page group中去掉数据查询相关path

* 1. prometheus range_query 时间戳改为秒级
2. 查询支持传入分辨率参数

* 1. 新增jmx_exporter内置大盘
2021-07-21 14:20:01 +08:00
ning1875
0d8e5ec77c prometheus range_query 时间戳和分辨率 (#738)
* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 生成告警模板信息

* 1. notify.py 支持安装channel反射发送
2. 支持钉钉群发送
3. 增加二开说明

* 1. notify.py 用户创建一个虚拟的用户保存上述im群 的机器人token信息 user的contacts map中

* 1. notify.py alerts目录改为原来的

* 1. notify.py dingtalk send continue匹配

* 1. push型告警支持多条件 任意一个触发就触发

* 1. prometheus查询接口 tag-keys tag-values支持 params为空的情况

* 1. prometheus查询接口 ident匹配全部改为精确匹配
2. tagKey 提示改为tag_key

* 1. prometheus查询接口 支持instance_query 对外暴露

* 1. prometheus instance_query改名为instant-query
2. page group中去掉数据查询相关path

* 1. prometheus range_query 时间戳改为秒级
2. 查询支持传入分辨率参数
2021-07-20 16:15:06 +08:00
UlricQin
de840af331 Multi roles support (#737)
* support multi user roles

* resources list support search tags and note
2021-07-20 14:40:50 +08:00
710leo
255b2b2320 fix: router path duplicate 2021-07-19 19:31:58 +08:00
ning1875
034b7a642e refactor: rename instance_query to instant_query 2021-07-19 19:17:22 +08:00
710leo
407f9ca6ad refactor: change query default resolution 2021-07-19 16:27:37 +08:00
ning1875
bf1d8b1be4 feat: add instance_query api (#731) 2021-07-19 16:14:52 +08:00
UlricQin
562f3ea937 code refactor 2021-07-18 16:39:10 +08:00
UlricQin
0a29fb89c4 test xorm 2021-07-18 16:30:02 +08:00
UlricQin
27daddcb72 bugfix: query alert event 2021-07-18 16:16:57 +08:00
UlricQin
c7b00ee8c6 rename default dash 2021-07-18 15:08:14 +08:00
UlricQin
1d7c7fd8af add i18n configuration 2021-07-18 09:04:01 +08:00
710leo
6b06e78b61 Merge branch 'master' of https://github.com/didi/nightingale 2021-07-17 19:33:53 +08:00
710leo
9ec1882032 docs: update changelog 2021-07-17 19:33:46 +08:00
Ulric Qin
18fc86d68a refactor plugin example 2021-07-17 19:27:10 +08:00
710leo
a628d5bb59 docs: change tpl and sql 2021-07-17 18:48:55 +08:00
710leo
df1e1cd334 docs: upgrade 5.0.0-rc2 2021-07-17 17:03:47 +08:00
710leo
d6c6eaa064 refactor: series push api 2021-07-17 15:37:16 +08:00
yubo
b4bdb08dc1 fix: support gzip/zlib with series push (#734) 2021-07-17 14:56:10 +08:00
Ulric Qin
ae9c21e293 code refactor 2021-07-16 23:46:11 +08:00
Ulric Qin
b65c8f696b check tag key 2021-07-16 23:30:09 +08:00
UlricQin
88e6e4bf56 add plugin example 2021-07-16 20:29:34 +08:00
UlricQin
8f4597045d rename preset classpath all->all.resources 2021-07-16 20:02:56 +08:00
UlricQin
a7f12ad871 support query data use guest 2021-07-16 18:41:11 +08:00
UlricQin
4e791d50d4 refactor warning info at api: check regexp 2021-07-16 08:30:48 +08:00
UlricQin
9758e55b72 code refactor: extract _s and _e func 2021-07-15 18:46:40 +08:00
710leo
473239cc9a refactor: format history points timestamp 2021-07-10 02:42:57 +08:00
710leo
477cac6ca9 fix: process event mute 2021-07-10 02:32:28 +08:00
710leo
258e9738f7 feat: add tpl & status api 2021-07-09 20:13:28 +08:00
ning1875
39de0892f1 fix: query index api 2021-07-08 20:37:23 +08:00
710leo
36ec4e09fd docs: add docs 2021-07-06 23:00:44 +08:00
710leo
aa4e6b7f36 feat: dashboards import and export 2021-07-06 22:52:36 +08:00
qinyening
6440645c5a feat: alert rule api fill users and groups (#723) 2021-07-06 21:33:19 +08:00
qinyening
4585519943 fix: tag-keys tag-values query when params are empty (#722)
Co-authored-by: ning1875 <907974064@qq.com>
2021-07-06 20:42:27 +08:00
ning1875
1f16bc9a7b refactor: send dingtalk notify 2021-07-04 18:53:33 +08:00
710leo
4b9cbf9aee refactor: add metric description in sql 2021-07-02 09:14:14 +08:00
710leo
e0cc7dbffa refactor: log regex check api 2021-06-29 20:09:40 +08:00
ning1875
fd9d78061b feat: notify support mail and dingding 2021-06-29 14:55:30 +08:00
UlricQin
b03d57f40a do not cache ident alias mapper when ident is blank 2021-06-28 18:17:14 +08:00
qinyening
4e6e70c14d release v5.0.0-rc1 (#708)
* release v5.0.0-rc1
2021-06-28 00:42:39 +08:00
710leo
2ef9a77325 upgrade 4.0.3 2021-06-27 18:13:26 +08:00
710leo
18b9fb3ee2 add some log 2021-06-25 11:46:34 +08:00
710leo
02f2554cc1 fix: nodata repeated recovery alerting 2021-06-22 23:11:55 +08:00
stonelgh
07961c9f21 m3db: fix Errorf calls (#703) 2021-06-21 15:06:44 +08:00
wjkxiaowu
f770b3cf14 add system env when plugin run (#699)
Co-authored-by: root <root@localhost.localdomain>
2021-06-15 11:13:51 +08:00
UlricQin
62dd006d50 Update README.md 2021-06-14 20:57:43 +08:00
qinyening
9ff845d375 Update README.md 2021-06-09 19:32:21 +08:00
moses
58860dca48 去除配置文件重复项 (#694) 2021-06-09 15:15:08 +08:00
710leo
f2e397f533 upgrade 4.0.2 2021-06-02 01:05:20 +08:00
710leo
5afff12848 upgrade 4.0.2 2021-06-02 00:45:35 +08:00
yubo
37abf19f0d add m3db client timeout check (#693) 2021-05-31 15:35:00 +08:00
710leo
bbbd7faeb1 bugfix: user and team info cache 2021-05-27 20:55:47 +08:00
710leo
c1382dc0aa Merge branch 'master' of https://github.com/didi/nightingale 2021-05-27 00:46:32 +08:00
710leo
a73f2654df bugfix: aggr output and alert 2021-05-27 00:46:21 +08:00
UlricQin
abc9a6ffbf Merge branch 'master' of https://github.com/didi/nightingale 2021-05-26 16:12:02 +08:00
UlricQin
87e32a159b upgrade toolkits/pkg 2021-05-26 16:11:52 +08:00
710leo
22f0aee55d add event write perm check 2021-05-25 17:54:09 +08:00
710leo
01420ff1d8 optimize user information filling 2021-05-16 17:42:53 +08:00
710leo
c4b5d13348 optimize user information filling 2021-05-16 15:42:30 +08:00
hubo
9cf2d47eef agent 增加默认tags功能, agent 增加正则匹配磁盘挂载类型过滤功能 (#683)
* agent 增加默认tags功能, agent 增加正则匹配磁盘挂载类型过滤功能

* agent 增加默认tags功能, agent 增加正则匹配磁盘挂载类型过滤功能

Co-authored-by: huboc <huboc@zbj.com>
2021-05-08 19:17:01 +08:00
Paul Chu
a9d6d6f820 支持节点迁移 (#680)
* enable promethues summary

* ADD: 添加节点迁移的方法

* FIX: node move session commit

* ADD: 注册迁移节点的接口

* MOD: fix error handle

Co-authored-by: zhupeiyuan <zhupeiyuan@fenbi.com>
2021-05-07 11:10:05 +08:00
Ulric Qin
f70d303942 fix http_response compile error 2021-05-06 17:00:18 +08:00
UlricQin
967c3aa591 Merge branch 'master' of https://github.com/didi/nightingale 2021-04-29 11:32:28 +08:00
UlricQin
3a47fb2c79 use n9e-3.8.0.tar.gz in Dockerfile 2021-04-29 11:32:19 +08:00
peng19940915
1112186d1c 新增postgresql监控 (#671)
* add postgresql & remove http_response status_code tag

* add postgresql & remove http_response status_code tag

Co-authored-by: leiyupeng <susu898287771@>
2021-04-27 23:16:07 +08:00
yubo
f40332f197 bugfix: add user.Type (#667) 2021-04-26 19:15:33 +08:00
UlricQin
a11813f4b2 Merge branch 'master' of https://github.com/didi/nightingale 2021-04-26 09:16:11 +08:00
UlricQin
13d396a388 code refactor 2021-04-26 09:15:56 +08:00
Ulric Qin
3d3458d577 add LimitNOFILE example in service files 2021-04-24 13:25:33 +08:00
Ulric Qin
e142785a9d add ams-builtin-token as server default token and refactor nginx.conf 2021-04-24 12:44:25 +08:00
yubo
ddac3a9871 add connect timeout options (#664) 2021-04-20 19:10:25 +08:00
joyexpr
bdb15aa0bb perf: mem and disk size calc from %d to %.1f (#662)
Co-authored-by: 周晓明 <zhouxiaoming@star-net.cn>
2021-04-20 19:04:40 +08:00
joyexpr
6693b131d8 perf: control command support mod name with n9e- prefix (#661)
Co-authored-by: 周晓明 <zhouxiaoming@star-net.cn>
2021-04-20 10:30:48 +08:00
joyexpr
41efc66d25 fix: send mail not work(wrong notifyType and subject) (#660) 2021-04-19 23:57:20 +08:00
710leo
a5197b4ced upgrade 4.0.1 2021-04-19 21:37:12 +08:00
710leo
d49d40768c organize configuration 2021-04-19 21:28:02 +08:00
710leo
c71264ab30 fix send message 2021-04-19 20:10:29 +08:00
710leo
8f1fd17f5c add configuration 2021-04-19 16:44:07 +08:00
UlricQin
7179bb79a0 default setting: udp not enable 2021-04-17 18:47:10 +08:00
710leo
bb64a2f1ec support static files 2021-04-16 19:21:02 +08:00
710leo
3f0dfd63d4 support static files 2021-04-15 21:23:59 +08:00
710leo
46f7ec7af9 complete version information 2021-04-15 19:35:25 +08:00
yubo
999c1b4239 bugfix: use InviteMustGet instead of InviteGet (#654)
* add fmt import
2021-04-14 20:57:27 +08:00
yubo
f6b2535cdb bugfix: use InviteMustGet instead of InviteGet (#653) 2021-04-14 12:48:26 +08:00
yubo
5f1c868006 feature: logout when the user is invalidated (#652) 2021-04-13 14:33:21 +08:00
qinyening
59366e4d3a 发布v4版本 (#651)
* init
2021-04-13 11:38:40 +08:00
710leo
bea0532872 update changelog 2021-04-12 10:46:58 +08:00
710leo
e684c583fb upgrade 3.8.0 2021-04-09 17:42:17 +08:00
710leo
eed2f073a0 Merge branch 'master' of https://github.com/didi/nightingale 2021-04-09 15:34:06 +08:00
710leo
31a03aa331 alert event modify filling user detail 2021-04-09 15:33:52 +08:00
yubo
71984c72b5 feature: add password changed notify (#647)
* feature: add password changed notify
2021-04-09 11:21:09 +08:00
yubo
72573e32cb feature: add get self permissions by nodeID (#643) 2021-04-07 13:12:00 +08:00
chixianliangGithub
50f4cc10c4 去除重复代码 (#641) 2021-04-03 16:00:32 +08:00
710leo
c2f98583e1 add ntp in agent conf 2021-03-31 11:39:37 +08:00
yubo
1ff6d0a2dc feature: add [start,end) param for clude, endpointMetric, endpoints api (#639) 2021-03-30 18:10:14 +08:00
yubo
92ac8b09c0 prober plugin use all mode as default (#634) 2021-03-26 11:17:31 +08:00
Paul Chu
384e993ca1 enable promethues summary (#630) 2021-03-24 16:08:42 +00:00
yubo
c1241fdfbc bugfix: created_at -> create_at for rdb.user table (#632) 2021-03-24 19:10:01 +08:00
yubo
be9d6ac660 use logger.Warning instead of fmt.Printf at loading plugins (#629) 2021-03-23 18:37:03 +08:00
yubo
30b469ddbd add subject for rdb rst-code/login-code mail (#628) 2021-03-22 17:27:01 +08:00
UlricQin
22ee99f222 Merge branch 'master' of https://github.com/didi/nightingale 2021-03-19 13:19:51 +08:00
UlricQin
f4675f0a34 upgrade 3.7.1 2021-03-19 13:19:27 +08:00
yubo
111c6fc1bf feature: support node event notify with webhook (#627)
* feature: support node event notify with webhook
2021-03-19 13:06:41 +08:00
710leo
0cd2761021 Merge branch 'master' of https://github.com/didi/nightingale 2021-03-19 11:12:41 +08:00
710leo
0a7c8988c6 stra add user group detail 2021-03-19 11:12:32 +08:00
UlricQin
7947533182 monapi support new timestamp 2021-03-19 10:48:40 +08:00
710leo
184c39d311 add some audit log 2021-03-18 21:22:50 +08:00
UlricQin
d89eaec596 bugfix: GetTeamsNameByIds 2021-03-18 10:03:20 +08:00
yubo
40ce0d75ed prettify msg (#620) 2021-03-17 11:57:30 +08:00
ning1875
61bd28db31 日志采集字段变更 whether_attache_one_log_line--> whether_attach_one_log_line (#619)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题

* 新增k8s-mon三个大盘文件

* 新增k8s-mon三个大盘文件

* 修改k8s-mon三个大盘文件

* 日志采集新增带上最后一条日志 到extra字段中,为后续报警做准备

* 日志采集字段变更 whether_attache_one_log_line--> whether_attach_one_log_line

* 日志采集带上日志
2021-03-15 16:03:02 +08:00
710leo
b1426945d4 fix agent proc.cpu.util 2021-03-13 18:21:21 +08:00
ning1875
dec9097ce7 transfer写m3db出错时打印metric信息帮助定位 (#615)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题

* 新增k8s-mon三个大盘文件

* 新增k8s-mon三个大盘文件

* 修改k8s-mon三个大盘文件

* transfer写m3db出错时打印metric信息帮助定位
2021-03-13 13:21:05 +08:00
ning1875
7bb93e8351 日志采集新增带上最后一条日志 到extra字段中,为后续报警做准备 (#614)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题

* 新增k8s-mon三个大盘文件

* 新增k8s-mon三个大盘文件

* 修改k8s-mon三个大盘文件

* 日志采集新增带上最后一条日志 到extra字段中,为后续报警做准备
2021-03-13 13:19:45 +08:00
alick-liming
7a84223d5b Aggr lanteness (#611)
* aggr lateness

* default value

* test

* test

Co-authored-by: alickliming <alickliming@didiglobal.com>
2021-03-12 15:10:42 +08:00
yubo
398628870c bugfix: add prober.plugins Stop() for release resource (#610) 2021-03-11 16:22:55 +08:00
yubo
3e426537c7 add maxSeriesPoints for config.transfer.m3db (#609) 2021-03-10 17:50:38 +08:00
HONG YANG
bf1bd3ef5a “massage” (#603) 2021-03-10 17:36:58 +08:00
yubo
b85b1e44ef bugfix: auth password history size (#607) 2021-03-10 17:35:12 +08:00
yubo
ff194c0382 add sample.out for mysql & redist (#605) 2021-03-09 19:10:40 +08:00
Zayscott
078f7cfc90 新增n9e模块监控大盘 (#602)
* Update changelog

* Create n9e_mudules

* Update changelog
2021-03-05 12:59:52 +08:00
stiei13wangluo
bd72a773f4 telegraf dns_query plugins (#601)
* dns_query

* dns_query

Co-authored-by: root <root@localhost.localdomain>
2021-03-05 11:54:13 +08:00
Zayscott
9637fb4bf3 Update changelog (#600) 2021-03-04 17:36:15 +08:00
yubo
22dc5c909c feature: add dryrun for collect_rule add/update (#599)
* feature: add dryrun for collect_rule add/update

* ignore sso when it is disable
2021-03-04 17:35:40 +08:00
UlricQin
cd4336d438 code refactor 2021-03-02 17:07:53 +08:00
UlricQin
fb619e0fa9 add comment 2021-03-02 17:06:47 +08:00
UlricQin
08dbc3c035 Delete n9e_rdb-v3.4.0.sql 2021-03-02 17:03:41 +08:00
UlricQin
3fba4390c5 generate upgrade sql for 3.7.0 2021-03-02 16:58:09 +08:00
UlricQin
500585a0a0 upgrade 3.7.0 2021-03-02 14:48:27 +08:00
Feng_Qi
acaa88f1a9 add ping/net_response/http_response support (#594)
* fix port check and push debug log

1:如果服务没有监听在 0.0.0.0 上,而是监听在特定地址上的话,在 127.0.0.1 上无法检测到端口。修改为如果 127.0.0.1 检测不到话,在 identity 的地址上再检测一次。
2. http push 部分缺乏 debug 日志,把 debug log 改到 push 里面以补全。

* Update cron.go

* notify add resource name and note

* Update notify.go

* Update notify.go

修复一个当 name/note 为空值且 resource 只有一台时, 由于被 config.Set 清空
因此获取下标 index out of range 导致 panic 的 bug

* add ping, net_response, http_response plugin

增加
ping
net_response
http_response
的插件支持

* Update all.go

* add example config yml

* Update notify.go
2021-02-28 07:56:35 +08:00
yubo
005dc47868 fix: https://github.com/didi/nightingale/issues/583 (#590) 2021-02-25 15:37:35 +08:00
yubo
9c1c894e29 feature: support dlopen for prober plugin (#588) 2021-02-23 18:04:03 +08:00
yubo
b055bc73c5 add a demo plugin for prober (#586)
* add a demo plugin for prober

* update demo plugin
2021-02-23 11:41:38 +08:00
yubo
322cbf27dc use testhttp instead of http for ut (#585)
* use testhttp instead of http for ut
* bugfix: add username check
2021-02-22 11:25:02 +08:00
UlricQin
417a13c1be bugfix: judge: redis conn pools 2021-02-07 17:07:00 +08:00
UlricQin
05819497e4 Update README.md 2021-02-06 11:28:07 +08:00
yubo
66c93f472a update vendor for local_build (#578) 2021-02-03 19:10:19 +08:00
710leo
023b23a0ef fix build monapi 2021-02-03 17:01:54 +08:00
710leo
900896c045 add sync stra log 2021-02-03 16:55:14 +08:00
yubo
db97453c54 build error fix: replace grpc to 1.29.1 (#577) 2021-02-03 16:42:27 +08:00
sun763625521
3fdd61edfc 新增rabbitmq、haproxy组件采集 (#575)
* add

* add prober plugin for rabbitmq

* add prober plugin for haproxy

Co-authored-by: root <root@localhost.localdomain>
Co-authored-by: UlricQin <ulric.qin@gmail.com>
2021-02-03 15:02:48 +08:00
yubo
e839c6bd6b bugfix: update session param has a mistake (#576)
* add cache counter for login part.1

* add login counter api

* feature: prober support multi-metric with different tags

* bugfix: session counter reset

* add models.stats for counter

* bugfix: update session param has a mistake
2021-02-03 15:00:14 +08:00
lynxcat
2d9bc50401 新增zookeeper,tengine采集 (#574)
* add prober plugin for elasticsearch

* 新增zookeeper,tengine插件,补齐了prober采集插件的测试

* 添加zookeeper插件描述

Co-authored-by: lynxcat <lynxcatdeng@gmail.com>
2021-02-03 14:43:39 +08:00
qinyening
c48d8b93dd add some permssion api (#572) 2021-02-03 11:01:29 +08:00
alick-liming
e2e96a04d1 权限调整 (#571) 2021-02-02 14:03:05 +08:00
yubo
c724896ecd adjust session GC interval (#569)
* keep at least 4 history passwords

* adjust gc time for session
2021-02-01 23:29:38 +08:00
燕小乙
914aaa0a96 修改k8s-mon ksm,控制平面大盘 (#567)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题

* 新增k8s-mon三个大盘文件

* 新增k8s-mon三个大盘文件

* 修改k8s-mon三个大盘文件
2021-01-31 11:56:44 +08:00
UlricQin
b5becda6fc go mod vendor 2021-01-31 11:04:17 +08:00
Ulric Qin
37868777e7 release 3.6.0 2021-01-31 10:59:01 +08:00
Ulric Qin
3663ed0235 uniq res bindings 2021-01-31 10:52:04 +08:00
yubo
7fa84af66a add session get api (#566) 2021-01-31 10:48:37 +08:00
Ulric Qin
55718a09e0 refactor bug_report.md 2021-01-30 10:14:57 +08:00
yubo
3754e0cbe3 remove local telegraf plugins.inputs (#563) 2021-01-29 23:49:48 +08:00
lynxcat
3df2536bb6 add prober plugin for elasticsearch (#562)
Co-authored-by: lynxcat <lynxcatdeng@gmail.com>
2021-01-29 23:47:50 +08:00
UlricQin
9e07f1924c Update README.md 2021-01-29 08:28:40 +08:00
yubo
fbf4544849 add accumulator for prober & generate default plugin config (#560)
* add accumulator for prober & generate default plugin config

* add prometheus plugin

* add prober plugin test util
2021-01-29 08:26:28 +08:00
lynxcat
2d4e6bb8da prober nginx 采集插件 (#557)
* add a method to get the Endpoint

* 增加nginx插件,修改control。支持./control build prober job这种多个参数

* 修改提示

Co-authored-by: lynxcat <lynxcatdeng@gmail.com>
2021-01-28 17:23:46 +08:00
UlricQin
211dfc62e4 code refactor 2021-01-28 10:07:48 +08:00
UlricQin
679c5892a4 Revert "add Prometheus as a plugin for prober (#556)" (#558)
This reverts commit 1dac755787.
2021-01-27 23:51:42 +08:00
yubo
1dac755787 add Prometheus as a plugin for prober (#556)
* update changelog

* add prometheus as a plugin for prober

* bugfix: add counter type for summary & histogram

* ignore summary, histogram for prometheus plugin
2021-01-27 23:47:53 +08:00
燕小乙
1f4e0f5e73 新增k8s-mon三个大盘文件 (#555)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题

* 新增k8s-mon三个大盘文件

* 新增k8s-mon三个大盘文件
2021-01-27 19:20:00 +08:00
UlricQin
b616894f2e code refactor 2021-01-27 17:46:56 +08:00
UlricQin
7a910709f9 code refactor 2021-01-27 10:12:37 +08:00
UlricQin
d61f8dac2e Merge branch 'master' of https://github.com/didi/nightingale 2021-01-27 08:53:14 +08:00
UlricQin
6d1fecc408 modify README 2021-01-27 08:52:40 +08:00
UlricQin
366d44959e upgrade go mod 2021-01-26 20:45:50 +08:00
UlricQin
d37c8f5387 add doc 2021-01-26 20:38:42 +08:00
UlricQin
06c5ca412a code refactor 2021-01-26 18:14:45 +08:00
yubo
afa95f79cd update changelog (#552) 2021-01-26 18:13:42 +08:00
yubo
8fe3457e0a support anonymous struct field for monapi.plugins.template (#547)
* move get collectrule api from /api/mon to /v1/mon

* support anonymous struct field for monapi.plugins.template

* add tls with mysql, redis and mongodb

* add rdb.user.pwdExpiresAt
2021-01-25 20:43:15 +08:00
alick-liming
7bfd60be86 资源排行去掉内置租户 (#544)
Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-25 10:28:40 +08:00
UlricQin
b7284ada94 use more conns for mysql 2021-01-25 10:19:50 +08:00
yubo
66e2dc73f9 remove prober RPC.port from config (#543)
* remove prober rpc.port from yml config

* remove prober.config.rpcPort && add prober.plugins.config.metrics checker
2021-01-24 14:09:08 +08:00
Ulric Qin
d254e5670b rebuild table collect_rule 2021-01-23 17:02:37 +08:00
Ulric Qin
9c945b33fb is tag value is blank, use nil instead 2021-01-23 14:08:44 +08:00
yubo
c53a66d20e remove sql.mon.collect_rule.created (#542) 2021-01-23 13:28:45 +08:00
Ulric Qin
335b113327 release 3.5.1 2021-01-23 11:04:58 +08:00
Ulric Qin
122590265d ignore EOF error 2021-01-23 10:59:57 +08:00
lynxcat
aab2f8b090 add a method to get the Endpoint (#540)
Co-authored-by: lynxcat <lynxcatdeng@gmail.com>
2021-01-23 08:50:36 +08:00
shaojie
f0a4c130f6 后端日志格式更改 (#539) 2021-01-23 08:06:02 +08:00
yubo
029f0a09ba bugfix: return err when unable to get monapi.collectRule (#537) 2021-01-22 17:39:02 +08:00
yubo
8fe3d2b0b3 bugfix: replace ref with instantiated variable for prober.rules.updatedAt (#536)
* add mon.plugins.redis descriptions

* bugfix: add region field for instances/heartbeat

* bugfix: replace ref with instantiated variable for prober.rules.updatedAt
2021-01-22 16:20:07 +08:00
UlricQin
25c31fcb2e +alarmEnabled true 2021-01-22 14:34:20 +08:00
UlricQin
2a74809294 add doc 2021-01-22 11:52:15 +08:00
UlricQin
09154e40aa 3.5.0 release 2021-01-22 11:45:05 +08:00
UlricQin
2e9b236406 3.5.0 release 2021-01-22 11:29:49 +08:00
Ulric Qin
1f88b72dba Merge branch 'master' of github.com:didi/nightingale 2021-01-21 20:49:39 +08:00
Ulric Qin
0c133afbaf wget tarball 2021-01-21 20:49:29 +08:00
710leo
2f87121e27 fix typo 2021-01-21 20:23:13 +08:00
710leo
b72e2d3fe0 Merge branch 'master' of https://github.com/didi/nightingale 2021-01-21 20:18:48 +08:00
710leo
4119414079 add tree search by user 2021-01-21 20:18:16 +08:00
Ulric Qin
955fe6795d test modify screen tpl 2021-01-21 20:16:50 +08:00
市民233
fbbb59971c feat: update go proxy module url (#532) 2021-01-21 20:03:38 +08:00
市民233
0a6df20b7d fix aliyun repo golang mod 404 (#531)
aliyun repo golang mod 404, just connect directly

go module公共代理仓库资料404,就直连
2021-01-21 16:36:46 +08:00
yubo
d640d86160 add mon.plugins.redis descriptions (#529)
* add mon.plugins.redis descriptions

* bugfix: add region field for instances/heartbeat
2021-01-21 16:35:31 +08:00
qinyening
6a70bed30f modify proc info collect (#528) 2021-01-21 00:38:18 +08:00
yubo
91503cfd25 update template document for mysql,mongo and redis (#526)
* update mysql document

* update template document for mysql,mongo and redis

* use TelegrafPlugin interface

* add mon.plugins.github as an exmpale
2021-01-20 23:07:56 +08:00
710leo
56feba9b45 Merge branch 'master' of https://github.com/didi/nightingale 2021-01-19 19:25:02 +08:00
710leo
5eec7c317c bugfix: arbitrary file reading 2021-01-19 19:24:46 +08:00
UlricQin
04c650528f Merge branch 'master' of https://github.com/didi/nightingale 2021-01-19 18:42:18 +08:00
UlricQin
e8ba0fb0bb fix ssrf 2021-01-19 18:41:51 +08:00
710leo
22cb24da09 add license 2021-01-18 23:39:23 +08:00
yubo
8204641656 add logger with monapi.plugins as telegraf.Logger interface (#522) 2021-01-18 19:25:47 +08:00
710leo
c5ba127b9e Merge branch 'master' of https://github.com/didi/nightingale 2021-01-18 13:27:02 +08:00
710leo
d2be562619 add alert & screen template 2021-01-18 13:26:37 +08:00
UlricQin
a4c8638448 code refactor 2021-01-15 19:59:26 +08:00
alick-liming
51cf58fcdf ams agent直接挂载到节点 (#518)
* ams agent直接挂载到节点

* 代码调整

Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-15 19:54:25 +08:00
qinyening
b00b7817f2 Support screen and alert template (#517) 2021-01-15 15:58:21 +08:00
燕小乙
6b1e432f6d m3db writetagged应该并发做,不然会导致transfer rpc变慢 (#514)
* m3db writetagged应该并发做,不然会导致transfer rpc变慢

* go func指针传参问题
2021-01-15 09:16:06 +08:00
UlricQin
f590194fba bugfix: same stra in nid 2021-01-14 10:06:49 +08:00
710leo
72ec59bdac Support unassigned tenant search 2021-01-13 14:04:56 +08:00
UlricQin
a07df519ab add 3.4.1 doc 2021-01-13 12:45:26 +08:00
UlricQin
7bf3049f4a upgrade 3.4.1 bugfix log_collect 2021-01-13 12:39:02 +08:00
yubo
a88315ee74 bugfix: call collect.Decode before get() (#507)
* add models.user.i18n

* bugfix: call collect.Decode before get()
2021-01-13 12:34:44 +08:00
UlricQin
c182c70b8d bugfix: check node is nil 2021-01-13 10:10:00 +08:00
alick-liming
fab8568633 新增用户添加组织字段 (#505)
Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-13 10:00:58 +08:00
yubo
74545012ed add models.user.i18n (#504) 2021-01-12 20:27:49 +08:00
UlricQin
903a1654b6 fix sql inject 2021-01-12 18:38:12 +08:00
yubo
7161c1ac4e adjust some file, variable name for prober module (#503)
* move pulgins_config.go to config dir

* add mongodb, redis yml
2021-01-11 22:08:27 +08:00
alick-liming
a9cf307cbf 租户项目粒度某类资源top数量 (#499)
* 租户项目粒度某类资源top数量

* 租户项目粒度某类资源top数量

* resname->rescate

Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-11 12:58:32 +08:00
alick-liming
f9cfcaeabe 告警hours支持 (#497)
Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-11 11:42:28 +08:00
yubo
b9aacf28e5 add start,end for transfer.index.fullmatch get (#494)
* add start,end with transfer.index.fullmatch get

* bugfix: should cleanup token before destory session when auth.extra.mode.enable
2021-01-09 14:09:06 +08:00
710leo
54512491b7 fix alert history tag display 2021-01-08 20:08:47 +08:00
UlricQin
a3fee54f7a edit changelog 2021-01-08 17:40:12 +08:00
yubo
e5f05aa724 add err log for session start (#493)
* add validate for some plugins

* restore getSessionUserWithCache method

* add validate for i18n.config

* use embed dict for i18n

* add err log for session start
2021-01-08 16:59:26 +08:00
UlricQin
312c2d1574 move patch sql to 3.4.0 2021-01-08 16:11:38 +08:00
yubo
7289636d35 use embed dict for i18n (#492)
* add validate for some plugins

* restore getSessionUserWithCache method

* add validate for i18n.config

* use embed dict for i18n
2021-01-08 15:11:41 +08:00
UlricQin
733da1ea94 converge delete 2021-01-08 15:02:31 +08:00
UlricQin
a1b4344943 use beego client in sender_sms 2021-01-08 10:28:45 +08:00
UlricQin
3c26beb48c Update README.md 2021-01-07 22:55:07 +08:00
yubo
c0049326b6 Add mongdb as a plugin (#489)
* bugfix: whiteList list return empty

* support multi-dict for i18n && add mongodb for monapi as a plugin

* use 10day as max lifetime for extra mode auth

* bugfix: ignore i18n with default value

* Spelling mistakes
2021-01-07 20:06:48 +08:00
yubo
543d345aea Dev (#487)
* add rdb config auth.debug for white_list

* update prober config support mode param

* feature: support access-token control with max connection, idle time, ...

* add token/session delete with auth check

* enable debug user for auth

* skip init sso db if not enable
2021-01-07 09:13:04 +08:00
UlricQin
fb1354898c Merge branch 'master' of https://github.com/didi/nightingale 2021-01-06 12:17:15 +08:00
UlricQin
e0263edc54 fix genNameAndNoteByResources when note is blank 2021-01-06 12:17:06 +08:00
alick-liming
b3a7d7c9a8 用户组织下拉接口 (#486)
Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-06 11:14:38 +08:00
alick-liming
a8008a9418 Nodename search (#484)
* node name search

* node name search

* node name search

* node name search

Co-authored-by: alickliming <alickliming@didi.global.com>
2021-01-05 14:31:18 +08:00
Ulric Qin
e82f560c44 code refactor 2021-01-03 18:13:48 +08:00
Ulric Qin
3589c7de69 compatible for blank tag value 2021-01-03 17:57:11 +08:00
Ulric Qin
72cf2c7578 add changelog for 3.4.0 2021-01-01 12:12:04 +08:00
Ulric Qin
eddd77d2d9 add patch sql for hbs 2021-01-01 11:42:17 +08:00
UlricQin
5d61468de6 add vendor 2021-01-01 11:11:47 +08:00
Ulric Qin
e85debddfc upgrade 3.4.0 2021-01-01 10:46:59 +08:00
yubo
d45ea02562 Rdb (#479)
* use collector interface

* mysql can work fine

* add basecollector

* add prober & monapi.plugins

* enable mysql plugins work

* rename collector -> manager

* add white list access check for rdb

* add cache module for authConfig & session

* rollback n9e_rdb_3.3.0.sql

* add sql ddl document

* add white_list, pwd, login access control

* add email code for login & reset password

* use sessionUsername instead of cookieUsername

* remove cookie name and data from session

* rename userName to username

* add remote_addr with session connection

* add get user by sid with cache

* enable cookie life time could be zero

* go mod tidy

* Rdb with session & monapi with telegraf (#456)

* use collector interface

* mysql can work fine

* add basecollector

* add prober & monapi.plugins

* enable mysql plugins work

* rename collector -> manager

* add white list access check for rdb

* add cache module for authConfig & session

* rollback n9e_rdb_3.3.0.sql

* add sql ddl document

* add white_list, pwd, login access control

* add email code for login & reset password

* use sessionUsername instead of cookieUsername

* remove cookie name and data from session

* rename userName to username

* add remote_addr with session connection

* add get user by sid with cache

* enable cookie life time could be zero

* go mod tidy

* add plugins config for prober

* add prober plugin expression parse

* update transfer default config for m3

* Rdb (#458)

* bugfix: session gc

* use flag for pwdMustInclude

* change user login function

* delete invite token after use

* bugfix: login response

* add sessionStart middle ware

* add auth module

* add i18n for rdb

* add i18n.zh for rdb.auth

* add mon plugins(redis, mongodb)

* update config

* add sub struct into definitions

* clean up sid cache after session destory

* bugfix: get user return nil when not found

* update i18n

* bugfix: ignore cache nologin user

* add user for callback output

* add password change api

* update default configfile & sql patch

* merge mon http middleware from rdb

* remove sso logout, sso already supporte one time auth
2021-01-01 10:41:30 +08:00
710leo
bad43090ff Add user dispname under the node 2020-12-31 18:09:42 +08:00
710leo
c2867d9638 Modify alert function 2020-12-31 15:11:59 +08:00
UlricQin
6dbbbac344 bugfix: insert task_meta sql inject 2020-12-31 13:01:43 +08:00
UlricQin
e903f609a5 Update README.md 2020-12-30 09:19:10 +08:00
Tiny
9dd1f1f90b 钉钉告警可以at特定指定人员 (#475)
* - add dingtalk robot @ special person
should config phone number in user IM field

* send msg by each robot when there are more than one robot in stra
2020-12-29 19:43:02 +08:00
Ulric Qin
4b22390faf refactor ip shell 2020-12-29 12:00:17 +08:00
710leo
9dbfef3df8 Optimize log output 2020-12-27 15:25:50 +08:00
UlricQin
28c794b2da modify stra.timeout default = 5000 2020-12-23 20:37:31 +08:00
UlricQin
ee96ce5046 test port listen on 127.0.0.1 and identity.ip and ::1 2020-12-23 15:57:11 +08:00
UlricQin
28d311e759 bugfix sql 注入 2020-12-22 11:56:20 +08:00
710leo
d355393074 fix: 同比变化率告警函数 2020-12-20 13:00:45 +08:00
moses
7b220da936 变化率绝对值算法问题 (#462) 2020-12-20 12:42:01 +08:00
Paul Chu
a1130b0e7c MOD: agent 文件系统 rw 探测 (#465) 2020-12-20 12:34:21 +08:00
710leo
8bba55e441 fix node role save 2020-12-17 20:26:13 +08:00
Paul Chu
d6d2e32b2e FIX: 修复订阅大盘图表 (#461)
* FIX: 修复短信报警模板的转义问题

报警说明里的信息由于 html template 的转义,会将部分字符转义为 html 表示,但是短信内容不需要转义。
向 template 模板添加 unescaped 处理函数,并在模板文件中使用 unescaped 标识不需要转义的字段,实现避免转义

* FIX: html template func 需要在 phase 之前添加

* FIX: use the filename as template name

* FIX: template name

* FIX: 修复订阅大盘图表

Co-authored-by: zhupeiyuan <zhupeiyuan@fenbi.com>
2020-12-17 15:24:02 +08:00
Feng_Qi
37c8317410 告警信息增加设备名称(name)和设备备注(note) (#460)
* fix port check and push debug log

1:如果服务没有监听在 0.0.0.0 上,而是监听在特定地址上的话,在 127.0.0.1 上无法检测到端口。修改为如果 127.0.0.1 检测不到话,在 identity 的地址上再检测一次。
2. http push 部分缺乏 debug 日志,把 debug log 改到 push 里面以补全。

* Update cron.go

* notify add resource name and note

* Update notify.go
2020-12-16 19:51:06 +08:00
yubo
19337b230c change transfer.m3db.env default -> default_env (#459) 2020-12-16 19:37:16 +08:00
UlricQin
d0ec6d8244 Merge branch 'master' of https://github.com/didi/nightingale 2020-12-16 15:22:08 +08:00
UlricQin
a232e7dfcd add some logs 2020-12-16 15:21:44 +08:00
qinyening
9b0a8dbb07 Agent 默认不采集 disk.rw.error 指标 (#455)
* configurable FsRWMetrics collect
2020-12-12 17:27:57 +08:00
ysicing
2e7a2a07ac fix: ldap登录如果配置允许coverAttributes,会导致panic (#454)
fix #453

Signed-off-by: ysicing <i@ysicing.me>
2020-12-12 03:46:13 +08:00
UlricQin
36e119770a code refactor 2020-12-10 09:58:48 +08:00
UlricQin
ac2efd2baf bugfix 2020-12-09 20:20:21 +08:00
alick-liming
5bd9c5fefc 支持全局数据统计 1.rdb分类资源数量 2.告警数量 (#447) 2020-12-08 23:47:09 +08:00
Ulric Qin
68bccb4d3f update readme 2020-12-06 08:16:28 +08:00
Ulric Qin
284b2c0db3 update readme 2020-12-06 08:08:58 +08:00
710leo
83a63da6c4 update vendor 2020-12-04 21:30:02 +08:00
710leo
dc7c0885a7 feat: support get nodes by ids 2020-12-04 21:27:14 +08:00
UlricQin
82a42f3649 host filter support id field 2020-12-04 18:56:28 +08:00
Paul Chu
30b600fe36 BUGFIX: 修复短信报警模板的转义问题 (#440)
* FIX: 修复短信报警模板的转义问题

报警说明里的信息由于 html template 的转义,会将部分字符转义为 html 表示,但是短信内容不需要转义。
向 template 模板添加 unescaped 处理函数,并在模板文件中使用 unescaped 标识不需要转义的字段,实现避免转义

* FIX: html template func 需要在 phase 之前添加

* FIX: use the filename as template name

* FIX: template name

Co-authored-by: zhupeiyuan <zhupeiyuan@fenbi.com>
2020-12-04 16:18:05 +08:00
Ulric Qin
eedfc99064 upgrade 3.3.1 2020-12-03 21:22:50 +08:00
Ulric Qin
80c316201d code refactor 2020-12-03 21:11:39 +08:00
710leo
a8f7f6a04e judge refactor 2020-12-03 11:50:45 +08:00
alick-liming
94eb306692 ams agent上报注册代码调整 (#436)
* rdb资源增加volume

* rdb用户增加创建时间

* rdb用户添加时间

* rdb新增添加用户时间代码调整

* test

* 1.agent上报扩展字段 2.rdb标签批量修改

* 代码调整

* 代码调整

* ams扩展代码调整

* test

* test

* 测试

* 错误调整

* ams agent上报注册代码调整

* map clear返回值去掉

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-12-02 11:49:26 +08:00
Hayden
e673c5340c Update README.md (#434)
Change the README
2020-12-02 11:48:12 +08:00
DemoLiang
3c7c836b64 [FIX][issue#433]优化告警收敛,查询告警事件统计数据数据时使用created做between查询数据库时耗时非常久的问题,修改问使用etime 可以命中etime索引,提升查询性能,4C8G查询100W告警事件大概9s耗时,修改为etime大概700ms耗时 (#435) 2020-12-02 11:38:35 +08:00
Rick
7068faaa92 clean cache, reduce image size (#430)
* clean cache, reduce image size

* remove set -ex, image size indeed reduced
2020-12-02 04:16:24 +08:00
Ulric Qin
d063bc0e78 for jeff, dirty data 2020-12-01 23:39:03 +08:00
Ulric Qin
c6442ed68a add agent log 2020-12-01 22:32:13 +08:00
Ulric Qin
ebb95a8292 add debug log 2020-12-01 22:13:16 +08:00
alick-liming
7a185b5054 ams 扩展字段bugfix (#429)
* rdb资源增加volume

* rdb用户增加创建时间

* rdb用户添加时间

* rdb新增添加用户时间代码调整

* test

* 1.agent上报扩展字段 2.rdb标签批量修改

* 代码调整

* 代码调整

* ams扩展代码调整

* test

* test

* 测试

* 错误调整

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-12-01 19:43:23 +08:00
710leo
0bd9b5b0d1 update dict.json 2020-12-01 10:56:44 +08:00
qinyening
74e85cdadc support i18n (#431) 2020-11-30 21:11:46 +08:00
alick-liming
82dadb31b5 1. ams扩展字段 2.rdb标签批量修改 (#428)
* rdb资源增加volume

* rdb用户增加创建时间

* rdb用户添加时间

* rdb新增添加用户时间代码调整

* test

* 1.agent上报扩展字段 2.rdb标签批量修改

* 代码调整

* 代码调整

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-30 15:17:43 +08:00
UlricQin
b65e56b9fd report change value 2020-11-30 14:36:07 +08:00
qinyening
0b696202e7 support get users by org (#426)
* support get users by org
2020-11-28 11:42:46 +08:00
Ulric Qin
5d0c6c0c6e add build_local 2020-11-28 10:24:26 +08:00
snow_white
99d2d7a2ae 修复metricIndex的TagkvMap初始化两次造成的内存浪费 (#424)
Co-authored-by: zhuxingtao <zhuxingtao@baijiahulian.com>
2020-11-28 10:12:49 +08:00
UlricQin
0e8b626528 Update m3db-install.md 2020-11-28 09:18:38 +08:00
yubo
d80ce1d8c5 update auth/logout (#417) 2020-11-24 18:56:18 +08:00
UlricQin
86c0520076 fix job scheduler 2020-11-24 16:26:54 +08:00
UlricQin
966870200b code refactor 2020-11-23 11:05:58 +08:00
Ulric Qin
a3004a5140 add send_voice 2020-11-21 02:09:00 +08:00
710leo
dc97556807 update changelog 2020-11-21 00:50:15 +08:00
710leo
79d8aeee11 monapi get index configurable 2020-11-21 00:43:33 +08:00
710leo
d5430256c7 fix rdb sql 2020-11-20 18:21:10 +08:00
UlricQin
6691801721 code refactor 2020-11-20 18:04:35 +08:00
yubo
46c1c972ab docs: add m3db install (#414)
* docs: add m3db install

* change transfer default settings
2020-11-20 07:37:00 +08:00
UlricQin
4339a653ce Update README.md 2020-11-19 08:27:45 +08:00
yubo
299122f965 add nid to transfer query data/index (#411) 2020-11-18 23:46:51 +08:00
dujiashu
a6b160caed fix typo and add render (#410) 2020-11-18 18:16:56 +08:00
710leo
a5672bc1b3 Merge branch 'master' of https://github.com/didi/nightingale 2020-11-18 16:17:30 +08:00
710leo
6e5dff6454 update sql 2020-11-18 16:17:21 +08:00
yubo
38e060f704 When the time exceeds the limit, adjust the end time (#409) 2020-11-18 16:14:58 +08:00
710leo
227652ec8f Merge branch 'master' of https://github.com/didi/nightingale 2020-11-18 16:14:12 +08:00
710leo
fa263eb68d fix sql 2020-11-18 16:14:01 +08:00
yubo
d3992b81ef use consolFun instead of aggrFunc with resample (#408) 2020-11-18 11:13:45 +08:00
yubo
c430657738 transfer query end = end+1 (#406) 2020-11-17 22:08:32 +08:00
yubo
d78301567b M3db 2 (#404)
* bugfix: transfer ignore counter when tag is empty

* add m3db benchmark
2020-11-17 22:07:50 +08:00
710leo
bddd93cd80 udpate rdb sql 2020-11-17 18:32:52 +08:00
qinyening
a659820b07 refactor judge (#405)
* refactor judge
* rdb user add organization typ status
2020-11-17 18:27:22 +08:00
UlricQin
d32f9ef763 del no use code for transfer 2020-11-16 16:45:04 +08:00
yubo
920dd9a947 Add http_middleware to transfer (#402)
* validate ui query, add aggrFun support for resample

* add http_middleware to transfer
2020-11-16 16:33:02 +08:00
dujiashu
3f352a393b code refactory (#403)
* support tt automation by job

* format

* import order

* use map to avoid repetition

* add api for sync from ccp by force

* fix bug

* code refactory

* delete unused code

* delete func

* code refactory

* rename labels key

Co-authored-by: dujiashu <dujiashu@didiglobal.com>
2020-11-16 16:29:42 +08:00
yubo
86df27587e bugfix: query index by clude only get last record (#401)
* support time limit for m3db query

* bugfix: query index by clude only get last record
2020-11-16 13:25:35 +08:00
UlricQin
7b1ccd956b Merge branch 'master' of https://github.com/didi/nightingale 2020-11-16 11:03:54 +08:00
UlricQin
e928faf4f9 refactor /api/rdb/nodes?cate=xx 2020-11-16 11:01:36 +08:00
Ulric Qin
70e2cefd98 go mod vendor 2020-11-16 08:56:04 +08:00
Ulric Qin
49e8dbe5f6 准备发版3.3.0 2020-11-14 17:23:30 +08:00
Ulric Qin
250dd0c92d no cache for *.json 2020-11-13 21:32:10 +08:00
dujiashu
d4adafbcb7 support sync container by force (#399)
* support tt automation by job

* format

* import order

* use map to avoid repetition

* add api for sync from ccp by force

* fix bug

* code refactory

* delete unused code

* delete func

Co-authored-by: dujiashu <dujiashu@didiglobal.com>
2020-11-13 21:22:23 +08:00
alick-liming
ffc98f31c9 rdb新增用户添加时间代码调整 (#398)
* rdb资源增加volume

* rdb用户增加创建时间

* rdb用户添加时间

* rdb新增添加用户时间代码调整

* test

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-13 14:18:49 +08:00
alick-liming
1a71de851a rdb新增用户添加时间 (#397)
* rdb资源增加volume

* rdb用户增加创建时间

* rdb用户添加时间

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-13 13:39:28 +08:00
yubo
dd67efe0f6 support time limit for m3db query (#396) 2020-11-12 15:50:31 +08:00
yubo
033383eea4 validate ui query, add aggrFun support for resample (#392) 2020-11-12 11:50:40 +08:00
UlricQin
ee873a4ae2 code refactor 2020-11-12 11:24:54 +08:00
alick-liming
5667a6ee09 rdb资源增加volume (#394)
Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-11 20:49:48 +08:00
UlricQin
d26a4a35ab code refactor 2020-11-11 16:44:57 +08:00
gcxfd
d7acc88c05 sed ip to host for docker (#391) 2020-11-11 13:50:09 +08:00
710leo
1ab6dfceb1 fix docker-compose redis connection 2020-11-10 23:55:25 +08:00
yubo
a90c746626 bugfix: variable scope problem (#389)
* support openID2.0

* generate UUID if it's not set

* add m3db support

* add test shell

* update transfer.yml

* remove klog

* use remote m3 repo

* remove some file

* add description for tansfer.m3db config

* add query data for ui

* bugfix: Variable scope problem
2020-11-10 16:21:09 +08:00
gcxfd
fb1b5802ab fix redis connection (#386) 2020-11-10 16:06:18 +08:00
yubo
69ceeff9b8 M3db (#388)
* support openID2.0

* generate UUID if it's not set

* add m3db support

* add test shell

* update transfer.yml

* remove klog

* use remote m3 repo

* remove some file

* add description for tansfer.m3db config

* add query data for ui
2020-11-10 14:58:27 +08:00
yubo
b2baef0643 Auth with SSO (#387)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code

* feature: support reset password by sms code

* remove deprecated api/code

* feature: support image captcha

* use db instead of memory cache for sso.auth.state

* add authLogin for login, v1/login; support (*)[.local].tpl for tpl file

* add username to sms-code api

* disable captcha by default in rdb.yml
2020-11-10 13:23:37 +08:00
yubo
2d1a2fd187 M3db (#385)
* support openID2.0

* generate UUID if it's not set

* add m3db support

* add test shell

* update transfer.yml

* remove klog

* use remote m3 repo

* remove some file
2020-11-09 19:52:44 +08:00
Ulric Qin
6d02d8876a Merge branch 'master' of github.com:didi/nightingale 2020-11-07 08:18:52 +08:00
Ulric Qin
712d0051d9 code refactor 2020-11-07 08:18:33 +08:00
UlricQin
e64a892275 add cleaner config 2020-11-06 17:25:14 +08:00
alick-liming
2e8a8966d7 rdb资源数量统计接口增加容器和交换机 (#383)
* rdb 资源cate分类统计接口

* 代码调节

* rdb:rabbitmq资源操作名称统一

* rdb控制台资源统计增加交换机,弹性云服务器

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-06 10:16:05 +08:00
alick-liming
f9b3db4058 rdb:rabbitmq资源操作名称统一 (#382)
* rdb 资源cate分类统计接口

* 代码调节

* rdb:rabbitmq资源操作名称统一

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-05 19:25:31 +08:00
alick-liming
633f224be6 rdb资源cate分类统计接口 (#380)
* rdb 资源cate分类统计接口

* 代码调节

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-05 14:06:57 +08:00
DemoLiang
79501b46fe [FIX] 修复告警策略配置不发送告警恢复的逻辑判断跟注释不相符的问题,当前告警策略定义的是0 发送告警恢复,1不发送告警恢复,而实际代码逻辑上现在判断的是0不发送告警恢复,修改为1 不发送告警恢复的判断 (#373) 2020-11-05 12:32:17 +08:00
yubo
df55398100 add checkPassword to reset password by sms code (#378)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code

* feature: support reset password by sms code

* remove deprecated api/code

* feature: support image captcha

* use db instead of memory cache for sso.auth.state

* add authLogin for login, v1/login; support (*)[.local].tpl for tpl file

* add username to sms-code api
2020-11-05 12:20:02 +08:00
alick-liming
2bcb20d710 rdb:资源解除注册,统一成单个uuid来处理 (#376)
* im wechat

* im wechat

* im add wechat_robot dingtalk_robot

* metaq 资源解除注册改为单个解除方式

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-11-04 22:29:37 +08:00
UlricQin
79ae96f15d add some api 2020-11-04 21:47:55 +08:00
UlricQin
ac6d269d90 code refactor 2020-11-03 12:18:17 +08:00
UlricQin
a4026e8c25 code refactor 2020-11-03 12:17:17 +08:00
UlricQin
b448dad860 Merge branch 'master' of https://github.com/didi/nightingale 2020-11-03 12:11:32 +08:00
yubo
17762d9daa merge login & v1Login (#375)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code

* feature: support reset password by sms code

* remove deprecated api/code

* feature: support image captcha

* use db instead of memory cache for sso.auth.state

* add authLogin for login, v1/login; support (*)[.local].tpl for tpl file
2020-11-02 15:54:15 +08:00
UlricQin
205201668c check password more strict 2020-11-02 13:44:14 +08:00
UlricQin
522cfca0af login fail, check your username and password 2020-11-02 13:27:23 +08:00
yubo
9bef8ddee3 login with sso,captcha,sms-code (#374)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code

* feature: support reset password by sms code

* remove deprecated api/code

* feature: support image captcha

* use db instead of memory cache for sso.auth.state
2020-11-02 12:47:41 +08:00
UlricQin
7999c1fbe5 render user json do not return uuid 2020-11-02 12:38:22 +08:00
dujiashu
bec893d662 validate hosts unique (#371)
* support tt automation by job

* format

* import order

* use map to avoid repetition

Co-authored-by: dujiashu <dujiashu@didiglobal.com>
2020-11-01 11:07:17 +08:00
dujiashu
6f999f6a87 support tt automation by job (#370)
* support tt automation by job

* format

* import order

Co-authored-by: dujiashu <dujiashu@didiglobal.com>
2020-10-31 23:13:28 +08:00
710leo
10fef82225 modify node specification 2020-10-30 13:27:44 +08:00
710leo
ebbcfa3157 upgrade 3.2.0 2020-10-29 18:15:08 +08:00
qinyening
313144bebf agent支持metrics指标采集能力 (#368) 2020-10-29 16:54:48 +08:00
yubo
c6b5a5b400 feature: support reset password by sms code (#365)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code

* feature: support reset password by sms code
2020-10-29 07:03:57 +08:00
yubo
1fdcbd848c Dev (#361)
* add logout v2 for sso

* support sms-code login

* use db instead of memory cache for login code
2020-10-27 17:51:39 +08:00
UlricQin
e63e741ad6 job api for tt 2020-10-27 09:57:58 +08:00
Ulric Qin
282aede691 upgrade 3.1.6 2020-10-25 20:20:10 +08:00
Ulric Qin
e5b95921cf add some validator for hostFieldNew 2020-10-25 20:06:46 +08:00
Ulric Qin
8c6726800f host field management done 2020-10-25 20:00:08 +08:00
Ulric Qin
6987b3b4d4 add host fields 2020-10-25 19:33:34 +08:00
Ulric Qin
28a2196143 use ips when recycle and del hosts 2020-10-25 18:16:20 +08:00
yubo
5b9a03a261 add OAuth2.0 callback/authorize V2 for UI (#353)
* support openID2.0

* generate UUID if it's not set

* change OAuth2 callback method to API style
2020-10-23 15:22:06 +08:00
710leo
228ffcfbc9 Merge branch 'master' of https://github.com/didi/nightingale 2020-10-22 21:23:59 +08:00
710leo
cc3b3575b6 sync from internal 2020-10-22 21:23:39 +08:00
Ulric Qin
ecacc71596 pack script 2020-10-22 17:50:32 +08:00
Ulric Qin
b2ad4b6995 upgrade 3.1.5 2020-10-22 17:49:15 +08:00
710leo
86929d8f69 fix monapi clean stra 2020-10-22 11:26:56 +08:00
710leo
39fa7e3e17 refactor /v1/rdb/node/:id/resources 2020-10-21 17:43:35 +08:00
Ulric Qin
70bc909565 code refactor 2020-10-19 22:29:09 +08:00
UlricQin
9ebe967e28 code refactor 2020-10-19 10:04:20 +08:00
Ulric Qin
93c35fd0ec upgrade 3.1.4 2020-10-18 10:17:39 +08:00
qinyening
2e80e82fc4 change hbs api & change perm point (#344)
* change hbs api & change perm point
2020-10-17 17:32:15 +08:00
UlricQin
2d19a1e86a code refactor 2020-10-15 20:29:06 +08:00
yubo
91700ab93e [ADD] generate UUID if it's not set (#338)
* support openID2.0

* generate UUID if it's not set
2020-10-14 15:02:58 +08:00
yubo
ecc736be8b support openID2.0 (#337) 2020-10-14 13:30:53 +08:00
Ulric Qin
8feb2287cc code refactor 2020-10-13 13:14:38 +08:00
Ulric Qin
e8d907156c upgrade 3.1.3 2020-10-13 13:09:02 +08:00
Ulric Qin
744980f119 cannot modify node-category to tenant 2020-10-13 09:35:34 +08:00
Ulric Qin
7f4dd8859e code refactor 2020-10-12 18:04:59 +08:00
Ulric Qin
f3962266b4 add changelog 2020-10-12 11:56:35 +08:00
UlricQin
b867c985ed bugfix: job callback for mon use method: post 2020-10-12 11:53:08 +08:00
Ulric Qin
0baa06cee0 rdb support wechat_robot and dingtalk_robot 2020-10-11 08:33:30 +08:00
alick-liming
e84d7f8741 im add wechat_robot and dingtalk_robot (#332)
* im wechat

* im wechat

* im add wechat_robot dingtalk_robot

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-10-11 08:24:04 +08:00
UlricQin
8c5e9534b2 code refactor 2020-10-10 10:55:38 +08:00
UlricQin
8f5a9c9349 code refactor 2020-10-10 10:54:34 +08:00
Ulric Qin
e11d1dd5d9 code refactor 2020-10-09 23:34:28 +08:00
Ulric Qin
df7210f9ad code refactor 2020-10-09 23:31:04 +08:00
Ulric Qin
8eead759d9 code refactor 2020-10-09 23:02:31 +08:00
Ulric Qin
c6b9fd181f pack exclude etc/*.local.yml 2020-10-09 23:02:02 +08:00
Ulric Qin
c4ad9f1e88 refactor wechat sender 2020-10-09 22:47:10 +08:00
alick-liming
8455994118 im wechat (#330)
* im wechat

* im wechat

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-10-09 22:42:19 +08:00
frank0417
2495405511 remove duplicate RDB sql (#328) 2020-10-09 16:24:28 +08:00
Ulric Qin
68f6312953 add gitee site url 2020-10-08 20:20:33 +08:00
Ulric Qin
e87b8dc368 code refactor 2020-10-08 20:13:08 +08:00
Ulric Qin
6083484d19 add some doc 2020-10-08 20:12:25 +08:00
Ulric Qin
3bdd0a7265 modify tarball download url 2020-10-08 19:54:40 +08:00
Ulric Qin
9d509efe35 upgrade 3.0.1 2020-10-08 19:47:49 +08:00
Ulric Qin
97acc7d1d0 del stras if node not exists 2020-10-08 19:46:54 +08:00
Ulric Qin
3534aa7e69 bugfix: GetLeafNidsForMon check node is nil 2020-10-08 19:39:52 +08:00
Ulric Qin
700370f70f add arch image and stra.json 2020-10-08 15:23:39 +08:00
Ulric Qin
95c96b3894 add service files 2020-10-08 10:55:27 +08:00
Ulric Qin
fdce55cf3b upgrade httpbli 2020-10-08 10:26:21 +08:00
Ulric Qin
9d118e0ef3 exclude *.local.yml when pack 2020-10-02 11:22:35 +08:00
Ulric Qin
18ef32a34c bugfix: log collect locker 2020-10-02 11:17:22 +08:00
Ulric Qin
cfd81a91fc add doc 2020-09-30 11:07:11 +08:00
Ulric Qin
87ebbcd7ec upgrade pub dir 2020-09-29 17:42:53 +08:00
yimeng
4b2ebf4761 docker-compose v0.2 (#318)
* fix regular dot

* add yarn.lock to .gitignore

* add docker-compose

* add docker-compose

* Delete PORTForm.tsx

* Delete .gitignore

* add .gitignore

* Update .gitignore

* add ifconfig

* fix dockerfile filename

* remove nginx.sh in Dockerfile

* 1 修复agent缺失命令 2 二次构建二进制 3 mysql启动等待

* 1 修复agent缺失命令 2 二次构建二进制 3 mysql启动等待
2020-09-29 11:18:07 +08:00
yimeng
1482cfcf32 docker-compose快速体验夜莺v3 (#316)
* fix regular dot

* add yarn.lock to .gitignore

* add docker-compose

* add docker-compose

* Delete PORTForm.tsx

* Delete .gitignore

* add .gitignore

* Update .gitignore
2020-09-28 19:05:38 -05:00
Ulric Qin
472ed62c12 Merge branch 'master' of https://github.com/didi/nightingale 2020-09-28 22:55:48 +08:00
Ulric Qin
04822e9d8f bugfix: address.yml: give default config for rdb.addresses 2020-09-28 22:55:26 +08:00
alick-liming
fe81c6cad7 配置调整,连接检查 (#314)
* 1.rabbitmq 配置文件 2.连接检查

* 代码调整

* 启动协程

* rabbitmq 连接检查代码调优

* rabbitmq 连接检查代码调优

Co-authored-by: alickliming <alickliming@didi.global.com>
2020-09-28 17:20:31 +08:00
UlricQin
6e6771ff3b upgrade pub dir 2020-09-28 14:54:12 +08:00
UlricQin
e5fc1bef44 code refactor 2020-09-28 12:33:27 +08:00
litianshun
68b213b737 influxdb query sql bug fix (#313)
* bug fix influxdb query data, query index

Co-authored-by: litianshun <litianshun@meicai.cn>
2020-09-28 11:10:16 +08:00
UlricQin
ceb86a2d5f bugfix: sql schema 2020-09-28 10:07:10 +08:00
UlricQin
6bd28ba82e code refactor 2020-09-27 23:27:11 +08:00
UlricQin
dbc0c0ad40 add install doc 2020-09-27 23:20:23 +08:00
Ulric Qin
6baebfad11 pub dir set to /home/n9e/pub 2020-09-27 23:03:45 +08:00
UlricQin
536b154aaa add some snapshot 2020-09-27 23:02:54 +08:00
UlricQin
db75a5fb74 modify nginx for static files 2020-09-27 19:57:43 +08:00
Ulric Qin
be1e161f31 add vendor Makefile 2020-09-27 19:27:10 +08:00
Ulric Qin
7864114d7c add readme 2020-09-26 23:59:34 +08:00
Ulric Qin
405114a893 code refactor 2020-09-26 23:30:34 +08:00
Ulric Qin
a90adf1212 Merge branch 'master' of github.com:didi/nightingale 2020-09-26 23:27:26 +08:00
Ulric Qin
6c5fb4cd35 code refactor 2020-09-26 23:26:58 +08:00
Ulric Qin
a443710669 go mod refactor 2020-09-26 23:11:34 +08:00
Ulric Qin
9e881bc9a5 use rabbitmq 2020-09-26 23:09:03 +08:00
Ulric Qin
4197cfba98 add nginx.conf 2020-09-26 22:58:14 +08:00
Ulric Qin
4ec1086fc9 add tsdb.yml 2020-09-26 22:41:34 +08:00
Ulric Qin
3d11f2cacc code refactor 2020-09-26 22:38:24 +08:00
Ulric Qin
b4ce2e8167 code refactor 2020-09-26 22:33:31 +08:00
Ulric Qin
5527ed0a86 code refactor 2020-09-26 22:28:24 +08:00
Ulric Qin
107ad572f8 code refactor 2020-09-26 22:21:23 +08:00
Ulric Qin
f16315328b refactor control 2020-09-26 21:58:37 +08:00
Ulric Qin
ae705e1b40 refactor operation log new 2020-09-26 21:45:23 +08:00
710leo
2d9287805e add vendor 2020-09-26 17:02:52 +08:00
710leo
ed35ddc388 3.0.0 2020-09-26 16:53:10 +08:00
710leo
8379624581 clean code 2020-09-26 16:37:46 +08:00
710leo
86b31575eb clean code 2020-09-26 16:28:27 +08:00
UlricQin
654e4278fa Update control 2020-09-22 12:56:38 +08:00
sunyu
f9c6c0465a 支持容器化部署配置自监控,兼容之前的配置 (#310)
* "配置化collector推送地址"

* "配置化collector推送地址"

* "兼容collector之前的配置"

Co-authored-by: 孙宇 <suny129@chinaunicom.cn>
Co-authored-by: suny129 <1061691533@qq.com>
2020-09-22 10:43:52 +08:00
UlricQin
9cb3bd564b code refactor: stats.init 2020-09-22 10:42:50 +08:00
sunyu
bc884175be Configure collector push address (#296)
* "配置化collector推送地址"

* "配置化collector推送地址"

Co-authored-by: 孙宇 <suny129@chinaunicom.cn>
2020-09-16 20:33:00 +08:00
qinyening
fb6e789909 Merge pull request #300 from zhuxingtao/master
fix: misspelling
2020-09-12 12:41:06 +08:00
zhuxingtao
944982898e fix: misspelling 2020-09-09 19:23:40 +08:00
chixianliangGithub
a8357dffb9 更改代码顺序,提高代码性能 (#291)
Co-authored-by: chixl <chixl@t3go.cn>
2020-09-02 15:05:38 +08:00
ronething-bot
50f5c6a98d fix: link error (#284) 2020-08-28 10:06:32 +08:00
UlricQin
f5d050f3f2 delete /api/transfer/v2 2020-08-25 16:19:47 +08:00
UlricQin
2a79303241 add QueryDataV2 2020-08-25 15:29:23 +08:00
xingren23
a72fa5b8dd collect config sys enable,default true (#272)
Co-authored-by: wangzhiguo04 <wangzhiguo04@meicai.cn>
2020-07-26 12:31:31 +08:00
xingren23
66421ae557 refactor collect push , sys -> core (#271)
Co-authored-by: wangzhiguo04 <wangzhiguo04@meicai.cn>
2020-07-26 12:28:45 +08:00
jsers
4b2f6a2c27 web build 2020-07-23 10:11:57 +08:00
jsers
9b99fe61e0 Alarm stra allows to configure the same metrics (#260) 2020-07-23 10:11:27 +08:00
jsers
e0584066a9 feat: add stddev func option (#211) 2020-07-23 10:10:57 +08:00
710leo
d70e60d4a5 fix exclude leaf nid when sync stra 2020-07-22 21:11:00 +08:00
youtwo123
7abec2ccb8 [transfer] fix and trigger generates event twice bug (#266)
* [transfer] fix and trigger generates event twice bug
* [monapi] stra excl all leaf nodes under exclNid
2020-07-22 17:08:38 +08:00
UlricQin
16a39410b8 del space line 2020-07-20 10:00:57 +08:00
杨善阳
b63c4e510c 修改adress.go的convPort方法,支持IPv6地址之间建立连接。 (#248)
monapi:
  http: '[::]:5800'
  address:
  - '[::1]'
2020-07-20 09:58:03 +08:00
xingren23
520dda70c0 refactor transfer datasources for ui/judge, implement tsdb(+index) an… (#246)
* refactor transfer datasources for ui/judge, implement tsdb(+index) and influxdb

* fix error string; fix import identidy ; refactor pushendpoint init

* fix influx queryData

Co-authored-by: wangzhiguo04 <wangzhiguo04@meicai.cn>
2020-07-20 09:57:22 +08:00
UlricQin
b6169ac706 agent interface /v1/push compatible with open-falcon 2020-07-11 10:42:34 +08:00
hutaishi
080d921791 本身403组件名的,写成404组件名了。 (#244) 2020-07-09 10:13:24 +08:00
710leo
a283329e4d Change tagkv count limit 2020-07-08 18:14:49 +08:00
710leo
5bb48df01d Merge branch 'master' of https://github.com/didi/nightingale 2020-07-07 14:12:02 +08:00
710leo
b3e961a3c6 filterString rm ':' 2020-07-07 14:11:50 +08:00
UlricQin
420b61ab52 use release mode in collector 2020-07-06 14:46:39 +08:00
UlricQin
dbd81eed2b code refactor 2020-07-03 12:38:20 +08:00
710leo
d0a00236ba fix upgrading link in readme 2020-07-02 14:52:54 +08:00
UlricQin
01aa9352aa set judge.query.maxConn default to 100 2020-07-01 16:27:26 +08:00
Ulric Qin
5c258520a2 upgrade 2.7.2: refactor index syncing 2020-06-30 00:11:12 +08:00
710leo
c232260e46 Fix change 2020-06-30 00:04:31 +08:00
710leo
bce825ff32 Change push index from async to sync 2020-06-29 23:57:15 +08:00
710leo
3c1ed52bb9 Change push index from async to sync 2020-06-29 23:49:20 +08:00
Ulric Qin
982fc6aaa2 upgrade 2.7.1 2020-06-29 16:12:23 +08:00
Ulric Qin
19890460b9 index.rebuildInterval modify to 6h 2020-06-28 17:05:22 +08:00
Ulric Qin
a46824c8ab Merge branch 'master' of github.com:didi/nightingale 2020-06-28 17:03:05 +08:00
Ulric Qin
bbfa03c894 index.rebuildInterval modify to 12h 2020-06-28 17:02:51 +08:00
710leo
56d1f7b6eb Fix judge get index addrs 2020-06-27 15:34:49 +08:00
710leo
1d2e183839 Delete collect which does not find nid 2020-06-26 18:20:37 +08:00
710leo
d741f24e8c Not allowed same collect name in same nodepath 2020-06-26 16:31:28 +08:00
710leo
22489f2dec Support tagkv check 2020-06-26 16:12:54 +08:00
mt
163c116871 transfer support kafka (#227)
* 修改:     etc/transfer.yml
	修改:     go.mod
	修改:     go.sum
	修改:     src/modules/transfer/backend/init.go
	新文件:   src/modules/transfer/backend/kafka.go
	修改:     src/modules/transfer/backend/sender.go
	修改:     src/modules/transfer/http/routes/push_router.go
	修改:     src/modules/transfer/rpc/push.go
	新文件:   vendor/github.com/Shopify/sarama/.gitignore
	新文件:   vendor/github.com/Shopify/sarama/.golangci.yml
	新文件:   vendor/github.com/Shopify/sarama/CHANGELOG.md
	新文件:   vendor/github.com/Shopify/sarama/LICENSE
	新文件:   vendor/github.com/Shopify/sarama/Makefile
	新文件:   vendor/github.com/Shopify/sarama/README.md
	新文件:   vendor/github.com/Shopify/sarama/Vagrantfile
	新文件:   vendor/github.com/Shopify/sarama/acl_bindings.go
	新文件:   vendor/github.com/Shopify/sarama/acl_create_request.go
	新文件:   vendor/github.com/Shopify/sarama/acl_create_response.go
	新文件:   vendor/github.com/Shopify/sarama/acl_delete_request.go
	新文件:   vendor/github.com/Shopify/sarama/acl_delete_response.go
	新文件:   vendor/github.com/Shopify/sarama/acl_describe_request.go
	新文件:   vendor/github.com/Shopify/sarama/acl_describe_response.go
	新文件:   vendor/github.com/Shopify/sarama/acl_filter.go
	新文件:   vendor/github.com/Shopify/sarama/acl_types.go
	新文件:   vendor/github.com/Shopify/sarama/add_offsets_to_txn_request.go
	新文件:   vendor/github.com/Shopify/sarama/add_offsets_to_txn_response.go
	新文件:   vendor/github.com/Shopify/sarama/add_partitions_to_txn_request.go
	新文件:   vendor/github.com/Shopify/sarama/add_partitions_to_txn_response.go
	新文件:   vendor/github.com/Shopify/sarama/admin.go
	新文件:   vendor/github.com/Shopify/sarama/alter_configs_request.go
	新文件:   vendor/github.com/Shopify/sarama/alter_configs_response.go
	新文件:   vendor/github.com/Shopify/sarama/alter_partition_reassignments_request.go
	新文件:   vendor/github.com/Shopify/sarama/alter_partition_reassignments_response.go
	新文件:   vendor/github.com/Shopify/sarama/api_versions_request.go
	新文件:   vendor/github.com/Shopify/sarama/api_versions_response.go
	新文件:   vendor/github.com/Shopify/sarama/async_producer.go
	新文件:   vendor/github.com/Shopify/sarama/balance_strategy.go
	新文件:   vendor/github.com/Shopify/sarama/broker.go
	新文件:   vendor/github.com/Shopify/sarama/client.go
	新文件:   vendor/github.com/Shopify/sarama/compress.go
	新文件:   vendor/github.com/Shopify/sarama/config.go
	新文件:   vendor/github.com/Shopify/sarama/config_resource_type.go
	新文件:   vendor/github.com/Shopify/sarama/consumer.go
	新文件:   vendor/github.com/Shopify/sarama/consumer_group.go
	新文件:   vendor/github.com/Shopify/sarama/consumer_group_members.go
	新文件:   vendor/github.com/Shopify/sarama/consumer_metadata_request.go
	新文件:   vendor/github.com/Shopify/sarama/consumer_metadata_response.go
	新文件:   vendor/github.com/Shopify/sarama/control_record.go
	新文件:   vendor/github.com/Shopify/sarama/crc32_field.go
	新文件:   vendor/github.com/Shopify/sarama/create_partitions_request.go
	新文件:   vendor/github.com/Shopify/sarama/create_partitions_response.go
	新文件:   vendor/github.com/Shopify/sarama/create_topics_request.go
	新文件:   vendor/github.com/Shopify/sarama/create_topics_response.go
	新文件:   vendor/github.com/Shopify/sarama/decompress.go
	新文件:   vendor/github.com/Shopify/sarama/delete_groups_request.go
	新文件:   vendor/github.com/Shopify/sarama/delete_groups_response.go
	新文件:   vendor/github.com/Shopify/sarama/delete_records_request.go
	新文件:   vendor/github.com/Shopify/sarama/delete_records_response.go
	新文件:   vendor/github.com/Shopify/sarama/delete_topics_request.go
	新文件:   vendor/github.com/Shopify/sarama/delete_topics_response.go
	新文件:   vendor/github.com/Shopify/sarama/describe_configs_request.go
	新文件:   vendor/github.com/Shopify/sarama/describe_configs_response.go
	新文件:   vendor/github.com/Shopify/sarama/describe_groups_request.go
	新文件:   vendor/github.com/Shopify/sarama/describe_groups_response.go
	新文件:   vendor/github.com/Shopify/sarama/describe_log_dirs_request.go
	新文件:   vendor/github.com/Shopify/sarama/describe_log_dirs_response.go
	新文件:   vendor/github.com/Shopify/sarama/dev.yml
	新文件:   vendor/github.com/Shopify/sarama/encoder_decoder.go
	新文件:   vendor/github.com/Shopify/sarama/end_txn_request.go
	新文件:   vendor/github.com/Shopify/sarama/end_txn_response.go
	新文件:   vendor/github.com/Shopify/sarama/errors.go
	新文件:   vendor/github.com/Shopify/sarama/fetch_request.go
	新文件:   vendor/github.com/Shopify/sarama/fetch_response.go
	新文件:   vendor/github.com/Shopify/sarama/find_coordinator_request.go
	新文件:   vendor/github.com/Shopify/sarama/find_coordinator_response.go
	新文件:   vendor/github.com/Shopify/sarama/go.mod
	新文件:   vendor/github.com/Shopify/sarama/go.sum
	新文件:   vendor/github.com/Shopify/sarama/gssapi_kerberos.go
	新文件:   vendor/github.com/Shopify/sarama/heartbeat_request.go
	新文件:   vendor/github.com/Shopify/sarama/heartbeat_response.go
	新文件:   vendor/github.com/Shopify/sarama/init_producer_id_request.go
	新文件:   vendor/github.com/Shopify/sarama/init_producer_id_response.go
	新文件:   vendor/github.com/Shopify/sarama/join_group_request.go
	新文件:   vendor/github.com/Shopify/sarama/join_group_response.go
	新文件:   vendor/github.com/Shopify/sarama/kerberos_client.go
	新文件:   vendor/github.com/Shopify/sarama/leave_group_request.go
	新文件:   vendor/github.com/Shopify/sarama/leave_group_response.go
	新文件:   vendor/github.com/Shopify/sarama/length_field.go
	新文件:   vendor/github.com/Shopify/sarama/list_groups_request.go
	新文件:   vendor/github.com/Shopify/sarama/list_groups_response.go
	新文件:   vendor/github.com/Shopify/sarama/list_partition_reassignments_request.go
	新文件:   vendor/github.com/Shopify/sarama/list_partition_reassignments_response.go
	新文件:   vendor/github.com/Shopify/sarama/message.go
	新文件:   vendor/github.com/Shopify/sarama/message_set.go
	新文件:   vendor/github.com/Shopify/sarama/metadata_request.go
	新文件:   vendor/github.com/Shopify/sarama/metadata_response.go
	新文件:   vendor/github.com/Shopify/sarama/metrics.go
	新文件:   vendor/github.com/Shopify/sarama/mockbroker.go
	新文件:   vendor/github.com/Shopify/sarama/mockkerberos.go
	新文件:   vendor/github.com/Shopify/sarama/mockresponses.go
	新文件:   vendor/github.com/Shopify/sarama/offset_commit_request.go
	新文件:   vendor/github.com/Shopify/sarama/offset_commit_response.go
	新文件:   vendor/github.com/Shopify/sarama/offset_fetch_request.go
	新文件:   vendor/github.com/Shopify/sarama/offset_fetch_response.go
	新文件:   vendor/github.com/Shopify/sarama/offset_manager.go
	新文件:   vendor/github.com/Shopify/sarama/offset_request.go
	新文件:   vendor/github.com/Shopify/sarama/offset_response.go
	新文件:   vendor/github.com/Shopify/sarama/packet_decoder.go
	新文件:   vendor/github.com/Shopify/sarama/packet_encoder.go
	新文件:   vendor/github.com/Shopify/sarama/partitioner.go
	新文件:   vendor/github.com/Shopify/sarama/prep_encoder.go
	新文件:   vendor/github.com/Shopify/sarama/produce_request.go
	新文件:   vendor/github.com/Shopify/sarama/produce_response.go
	新文件:   vendor/github.com/Shopify/sarama/produce_set.go
	新文件:   vendor/github.com/Shopify/sarama/real_decoder.go
	新文件:   vendor/github.com/Shopify/sarama/real_encoder.go
	新文件:   vendor/github.com/Shopify/sarama/record.go
	新文件:   vendor/github.com/Shopify/sarama/record_batch.go
	新文件:   vendor/github.com/Shopify/sarama/records.go
	新文件:   vendor/github.com/Shopify/sarama/request.go
	新文件:   vendor/github.com/Shopify/sarama/response_header.go
	新文件:   vendor/github.com/Shopify/sarama/sarama.go
	新文件:   vendor/github.com/Shopify/sarama/sasl_authenticate_request.go
	新文件:   vendor/github.com/Shopify/sarama/sasl_authenticate_response.go
	新文件:   vendor/github.com/Shopify/sarama/sasl_handshake_request.go
	新文件:   vendor/github.com/Shopify/sarama/sasl_handshake_response.go
	新文件:   vendor/github.com/Shopify/sarama/sticky_assignor_user_data.go
	新文件:   vendor/github.com/Shopify/sarama/sync_group_request.go
	新文件:   vendor/github.com/Shopify/sarama/sync_group_response.go
	新文件:   vendor/github.com/Shopify/sarama/sync_producer.go
	新文件:   vendor/github.com/Shopify/sarama/timestamp.go
	新文件:   vendor/github.com/Shopify/sarama/txn_offset_commit_request.go
	新文件:   vendor/github.com/Shopify/sarama/txn_offset_commit_response.go
	新文件:   vendor/github.com/Shopify/sarama/utils.go
	新文件:   vendor/github.com/Shopify/sarama/zstd.go
	新文件:   vendor/github.com/eapache/go-resiliency/LICENSE
	新文件:   vendor/github.com/eapache/go-resiliency/breaker/README.md
	新文件:   vendor/github.com/eapache/go-resiliency/breaker/breaker.go
	新文件:   vendor/github.com/eapache/go-xerial-snappy/.gitignore
	新文件:   vendor/github.com/eapache/go-xerial-snappy/.travis.yml
	新文件:   vendor/github.com/eapache/go-xerial-snappy/LICENSE
	新文件:   vendor/github.com/eapache/go-xerial-snappy/README.md
	新文件:   vendor/github.com/eapache/go-xerial-snappy/fuzz.go
	新文件:   vendor/github.com/eapache/go-xerial-snappy/snappy.go
	新文件:   vendor/github.com/eapache/queue/.gitignore
	新文件:   vendor/github.com/eapache/queue/.travis.yml
	新文件:   vendor/github.com/eapache/queue/LICENSE
	新文件:   vendor/github.com/eapache/queue/README.md
	新文件:   vendor/github.com/eapache/queue/queue.go
	新文件:   vendor/github.com/golang/snappy/.gitignore
	新文件:   vendor/github.com/golang/snappy/AUTHORS
	新文件:   vendor/github.com/golang/snappy/CONTRIBUTORS
	新文件:   vendor/github.com/golang/snappy/LICENSE
	新文件:   vendor/github.com/golang/snappy/README
	新文件:   vendor/github.com/golang/snappy/decode.go
	新文件:   vendor/github.com/golang/snappy/decode_amd64.go
	新文件:   vendor/github.com/golang/snappy/decode_amd64.s
	新文件:   vendor/github.com/golang/snappy/decode_other.go
	新文件:   vendor/github.com/golang/snappy/encode.go
	新文件:   vendor/github.com/golang/snappy/encode_amd64.go
	新文件:   vendor/github.com/golang/snappy/encode_amd64.s
	新文件:   vendor/github.com/golang/snappy/encode_other.go
	新文件:   vendor/github.com/golang/snappy/go.mod
	新文件:   vendor/github.com/golang/snappy/snappy.go
	新文件:   vendor/github.com/hashicorp/go-uuid/.travis.yml
	新文件:   vendor/github.com/hashicorp/go-uuid/LICENSE
	新文件:   vendor/github.com/hashicorp/go-uuid/README.md
	新文件:   vendor/github.com/hashicorp/go-uuid/go.mod
	新文件:   vendor/github.com/hashicorp/go-uuid/uuid.go
	新文件:   vendor/github.com/jcmturner/gofork/LICENSE
	新文件:   vendor/github.com/jcmturner/gofork/encoding/asn1/README.md
	新文件:   vendor/github.com/jcmturner/gofork/encoding/asn1/asn1.go
	新文件:   vendor/github.com/jcmturner/gofork/encoding/asn1/common.go
	新文件:   vendor/github.com/jcmturner/gofork/encoding/asn1/marshal.go
	新文件:   vendor/github.com/jcmturner/gofork/x/crypto/pbkdf2/pbkdf2.go
	新文件:   vendor/github.com/klauspost/compress/LICENSE
	新文件:   vendor/github.com/klauspost/compress/fse/README.md
	新文件:   vendor/github.com/klauspost/compress/fse/bitreader.go
	新文件:   vendor/github.com/klauspost/compress/fse/bitwriter.go
	新文件:   vendor/github.com/klauspost/compress/fse/bytereader.go
	新文件:   vendor/github.com/klauspost/compress/fse/compress.go
	新文件:   vendor/github.com/klauspost/compress/fse/decompress.go
	新文件:   vendor/github.com/klauspost/compress/fse/fse.go
	新文件:   vendor/github.com/klauspost/compress/huff0/.gitignore
	新文件:   vendor/github.com/klauspost/compress/huff0/README.md
	新文件:   vendor/github.com/klauspost/compress/huff0/bitreader.go
	新文件:   vendor/github.com/klauspost/compress/huff0/bitwriter.go
	新文件:   vendor/github.com/klauspost/compress/huff0/bytereader.go
	新文件:   vendor/github.com/klauspost/compress/huff0/compress.go
	新文件:   vendor/github.com/klauspost/compress/huff0/decompress.go
	新文件:   vendor/github.com/klauspost/compress/huff0/huff0.go
	新文件:   vendor/github.com/klauspost/compress/snappy/.gitignore
	新文件:   vendor/github.com/klauspost/compress/snappy/AUTHORS
	新文件:   vendor/github.com/klauspost/compress/snappy/CONTRIBUTORS
	新文件:   vendor/github.com/klauspost/compress/snappy/LICENSE
	新文件:   vendor/github.com/klauspost/compress/snappy/README
	新文件:   vendor/github.com/klauspost/compress/snappy/decode.go
	新文件:   vendor/github.com/klauspost/compress/snappy/decode_amd64.go
	新文件:   vendor/github.com/klauspost/compress/snappy/decode_amd64.s
	新文件:   vendor/github.com/klauspost/compress/snappy/decode_other.go
	新文件:   vendor/github.com/klauspost/compress/snappy/encode.go
	新文件:   vendor/github.com/klauspost/compress/snappy/encode_amd64.go
	新文件:   vendor/github.com/klauspost/compress/snappy/encode_amd64.s
	新文件:   vendor/github.com/klauspost/compress/snappy/encode_other.go
	新文件:   vendor/github.com/klauspost/compress/snappy/runbench.cmd
	新文件:   vendor/github.com/klauspost/compress/snappy/snappy.go
	新文件:   vendor/github.com/klauspost/compress/zstd/README.md
	新文件:   vendor/github.com/klauspost/compress/zstd/bitreader.go
	新文件:   vendor/github.com/klauspost/compress/zstd/bitwriter.go
	新文件:   vendor/github.com/klauspost/compress/zstd/blockdec.go
	新文件:   vendor/github.com/klauspost/compress/zstd/blockenc.go
	新文件:   vendor/github.com/klauspost/compress/zstd/blocktype_string.go
	新文件:   vendor/github.com/klauspost/compress/zstd/bytebuf.go
	新文件:   vendor/github.com/klauspost/compress/zstd/bytereader.go
	新文件:   vendor/github.com/klauspost/compress/zstd/decoder.go
	新文件:   vendor/github.com/klauspost/compress/zstd/decoder_options.go
	新文件:   vendor/github.com/klauspost/compress/zstd/enc_dfast.go
	新文件:   vendor/github.com/klauspost/compress/zstd/enc_fast.go
	新文件:   vendor/github.com/klauspost/compress/zstd/enc_params.go
	新文件:   vendor/github.com/klauspost/compress/zstd/encoder.go
	新文件:   vendor/github.com/klauspost/compress/zstd/encoder_options.go
	新文件:   vendor/github.com/klauspost/compress/zstd/framedec.go
	新文件:   vendor/github.com/klauspost/compress/zstd/frameenc.go
	新文件:   vendor/github.com/klauspost/compress/zstd/fse_decoder.go
	新文件:   vendor/github.com/klauspost/compress/zstd/fse_encoder.go
	新文件:   vendor/github.com/klauspost/compress/zstd/fse_predefined.go
	新文件:   vendor/github.com/klauspost/compress/zstd/hash.go
	新文件:   vendor/github.com/klauspost/compress/zstd/history.go
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/LICENSE.txt
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/README.md
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/xxhash.go
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/xxhash_amd64.go
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/xxhash_amd64.s
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/xxhash_other.go
	新文件:   vendor/github.com/klauspost/compress/zstd/internal/xxhash/xxhash_safe.go
	新文件:   vendor/github.com/klauspost/compress/zstd/seqdec.go
	新文件:   vendor/github.com/klauspost/compress/zstd/seqenc.go
	新文件:   vendor/github.com/klauspost/compress/zstd/snappy.go
	新文件:   vendor/github.com/klauspost/compress/zstd/zstd.go
	新文件:   vendor/github.com/pierrec/lz4/.gitignore
	新文件:   vendor/github.com/pierrec/lz4/.travis.yml
	新文件:   vendor/github.com/pierrec/lz4/LICENSE
	新文件:   vendor/github.com/pierrec/lz4/README.md
	新文件:   vendor/github.com/pierrec/lz4/block.go
	新文件:   vendor/github.com/pierrec/lz4/debug.go
	新文件:   vendor/github.com/pierrec/lz4/debug_stub.go
	新文件:   vendor/github.com/pierrec/lz4/decode_amd64.go
	新文件:   vendor/github.com/pierrec/lz4/decode_amd64.s
	新文件:   vendor/github.com/pierrec/lz4/decode_other.go
	新文件:   vendor/github.com/pierrec/lz4/errors.go
	新文件:   vendor/github.com/pierrec/lz4/internal/xxh32/xxh32zero.go
	新文件:   vendor/github.com/pierrec/lz4/lz4.go
	新文件:   vendor/github.com/pierrec/lz4/lz4_go1.10.go
	新文件:   vendor/github.com/pierrec/lz4/lz4_notgo1.10.go
	新文件:   vendor/github.com/pierrec/lz4/reader.go
	新文件:   vendor/github.com/pierrec/lz4/writer.go
	新文件:   vendor/github.com/rcrowley/go-metrics/.gitignore
	新文件:   vendor/github.com/rcrowley/go-metrics/.travis.yml
	新文件:   vendor/github.com/rcrowley/go-metrics/LICENSE
	新文件:   vendor/github.com/rcrowley/go-metrics/README.md
	新文件:   vendor/github.com/rcrowley/go-metrics/counter.go
	新文件:   vendor/github.com/rcrowley/go-metrics/debug.go
	新文件:   vendor/github.com/rcrowley/go-metrics/ewma.go
	新文件:   vendor/github.com/rcrowley/go-metrics/gauge.go
	新文件:   vendor/github.com/rcrowley/go-metrics/gauge_float64.go
	新文件:   vendor/github.com/rcrowley/go-metrics/graphite.go
	新文件:   vendor/github.com/rcrowley/go-metrics/healthcheck.go
	新文件:   vendor/github.com/rcrowley/go-metrics/histogram.go
	新文件:   vendor/github.com/rcrowley/go-metrics/json.go
	新文件:   vendor/github.com/rcrowley/go-metrics/log.go
	新文件:   vendor/github.com/rcrowley/go-metrics/memory.md
	新文件:   vendor/github.com/rcrowley/go-metrics/meter.go
	新文件:   vendor/github.com/rcrowley/go-metrics/metrics.go
	新文件:   vendor/github.com/rcrowley/go-metrics/opentsdb.go
	新文件:   vendor/github.com/rcrowley/go-metrics/registry.go
	新文件:   vendor/github.com/rcrowley/go-metrics/runtime.go
	新文件:   vendor/github.com/rcrowley/go-metrics/runtime_cgo.go
	新文件:   vendor/github.com/rcrowley/go-metrics/runtime_gccpufraction.go
	新文件:   vendor/github.com/rcrowley/go-metrics/runtime_no_cgo.go
	新文件:   vendor/github.com/rcrowley/go-metrics/runtime_no_gccpufraction.go
	新文件:   vendor/github.com/rcrowley/go-metrics/sample.go
	新文件:   vendor/github.com/rcrowley/go-metrics/syslog.go
	新文件:   vendor/github.com/rcrowley/go-metrics/timer.go
	新文件:   vendor/github.com/rcrowley/go-metrics/validate.sh
	新文件:   vendor/github.com/rcrowley/go-metrics/writer.go
	删除:     vendor/github.com/shirou/gopsutil/mem/types_openbsd.go
	删除:     vendor/github.com/shirou/gopsutil/process/types_darwin.go
	删除:     vendor/github.com/shirou/gopsutil/process/types_freebsd.go
	删除:     vendor/github.com/shirou/gopsutil/process/types_openbsd.go
	删除:     vendor/github.com/ugorji/go/codec/xml.go
	新文件:   vendor/golang.org/x/crypto/AUTHORS
	新文件:   vendor/golang.org/x/crypto/CONTRIBUTORS
	新文件:   vendor/golang.org/x/crypto/LICENSE
	新文件:   vendor/golang.org/x/crypto/PATENTS
	新文件:   vendor/golang.org/x/crypto/md4/md4.go
	新文件:   vendor/golang.org/x/crypto/md4/md4block.go
	新文件:   vendor/golang.org/x/crypto/pbkdf2/pbkdf2.go
	新文件:   vendor/golang.org/x/net/AUTHORS
	新文件:   vendor/golang.org/x/net/CONTRIBUTORS
	新文件:   vendor/golang.org/x/net/LICENSE
	新文件:   vendor/golang.org/x/net/PATENTS
	新文件:   vendor/golang.org/x/net/internal/socks/client.go
	新文件:   vendor/golang.org/x/net/internal/socks/socks.go
	新文件:   vendor/golang.org/x/net/proxy/dial.go
	新文件:   vendor/golang.org/x/net/proxy/direct.go
	新文件:   vendor/golang.org/x/net/proxy/per_host.go
	新文件:   vendor/golang.org/x/net/proxy/proxy.go
	新文件:   vendor/golang.org/x/net/proxy/socks5.go
	删除:     vendor/golang.org/x/sys/unix/mkasm_darwin.go
	删除:     vendor/golang.org/x/sys/unix/mkpost.go
	删除:     vendor/golang.org/x/sys/unix/mksyscall.go
	删除:     vendor/golang.org/x/sys/unix/mksyscall_aix_ppc.go
	删除:     vendor/golang.org/x/sys/unix/mksyscall_aix_ppc64.go
	删除:     vendor/golang.org/x/sys/unix/mksyscall_solaris.go
	删除:     vendor/golang.org/x/sys/unix/mksysctl_openbsd.go
	删除:     vendor/golang.org/x/sys/unix/mksysnum.go
	删除:     vendor/golang.org/x/sys/unix/types_aix.go
	删除:     vendor/golang.org/x/sys/unix/types_darwin.go
	删除:     vendor/golang.org/x/sys/unix/types_dragonfly.go
	删除:     vendor/golang.org/x/sys/unix/types_freebsd.go
	删除:     vendor/golang.org/x/sys/unix/types_netbsd.go
	删除:     vendor/golang.org/x/sys/unix/types_openbsd.go
	删除:     vendor/golang.org/x/sys/unix/types_solaris.go
	删除:     vendor/golang.org/x/text/unicode/norm/maketables.go
	删除:     vendor/golang.org/x/text/unicode/norm/triegen.go
	删除:     vendor/golang.org/x/tools/go/gcexportdata/main.go
	新文件:   vendor/gopkg.in/jcmturner/aescts.v1/.gitignore
	新文件:   vendor/gopkg.in/jcmturner/aescts.v1/LICENSE
	新文件:   vendor/gopkg.in/jcmturner/aescts.v1/README.md
	新文件:   vendor/gopkg.in/jcmturner/aescts.v1/aescts.go
	新文件:   vendor/gopkg.in/jcmturner/dnsutils.v1/.gitignore
	新文件:   vendor/gopkg.in/jcmturner/dnsutils.v1/.travis.yml
	新文件:   vendor/gopkg.in/jcmturner/dnsutils.v1/LICENSE
	新文件:   vendor/gopkg.in/jcmturner/dnsutils.v1/srv.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/LICENSE
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/asn1tools/tools.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/ASExchange.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/TGSExchange.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/cache.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/client.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/network.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/passwd.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/session.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/client/settings.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/config/error.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/config/hosts.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/config/krb5conf.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/credentials/ccache.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/credentials/credentials.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/aes128-cts-hmac-sha1-96.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/aes128-cts-hmac-sha256-128.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/aes256-cts-hmac-sha1-96.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/aes256-cts-hmac-sha384-192.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/common/common.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/crypto.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/des3-cbc-sha1-kd.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/etype/etype.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rc4-hmac.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc3961/encryption.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc3961/keyDerivation.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc3961/nfold.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc3962/encryption.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc3962/keyDerivation.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc4757/checksum.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc4757/encryption.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc4757/keyDerivation.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc4757/msgtype.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc8009/encryption.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/crypto/rfc8009/keyDerivation.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/gssapi/MICToken.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/gssapi/README.md
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/gssapi/contextFlags.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/gssapi/gssapi.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/gssapi/wrapToken.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/addrtype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/adtype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/asnAppTag/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/chksumtype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/errorcode/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/etypeID/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/flags/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/keyusage/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/msgtype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/nametype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/iana/patype/constants.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/kadmin/changepasswddata.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/kadmin/message.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/kadmin/passwd.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/keytab/keytab.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/krberror/error.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/APRep.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/APReq.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KDCRep.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KDCReq.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KRBCred.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KRBError.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KRBPriv.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/KRBSafe.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/messages/Ticket.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/client_claims.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/client_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/credentials_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/device_claims.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/device_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/kerb_validation_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/pac_type.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/s4u_delegation_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/signature_data.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/supplemental_cred.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/pac/upn_dns_info.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/Authenticator.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/AuthorizationData.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/Cryptosystem.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/HostAddress.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/KerberosFlags.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/PAData.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/PrincipalName.go
	新文件:   vendor/gopkg.in/jcmturner/gokrb5.v7/types/TypedData.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/LICENSE
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/claims.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/common.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/filetime.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/group_membership.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/kerb_sid_and_attributes.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/reader.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/rpc_unicode_string.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/sid.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/mstypes/user_session_key.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/arrays.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/decoder.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/error.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/header.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/pipe.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/primitives.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/rawbytes.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/strings.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/tags.go
	新文件:   vendor/gopkg.in/jcmturner/rpc.v1/ndr/union.go
	修改:     vendor/gopkg.in/yaml.v2/.travis.yml
	修改:     vendor/gopkg.in/yaml.v2/decode.go
	修改:     vendor/gopkg.in/yaml.v2/scannerc.go
	修改:     vendor/gopkg.in/yaml.v2/yaml.go
	修改:     vendor/gopkg.in/yaml.v2/yamlh.go
	修改:     vendor/modules.txt

* Update sender.go

* Update sender.go

* Update kafka.go

* Update kafka.go

Co-authored-by: 马涛 <matao@staff.sina.com.cn>
2020-06-26 12:07:44 +08:00
Ulric Qin
7d5d791376 support SUBTRACT 2020-06-20 22:59:47 +08:00
Ulric Qin
4b3f11418e code refactor 2020-06-20 22:17:38 +08:00
Ulric Qin
0b4d1639c6 support type: SUBTRACT 2020-06-20 22:08:10 +08:00
dongdong
018d19857d Fix stddev function (#224)
* Correct stddev function usage

* Simplify code

* Fix CI error
2020-06-20 13:56:52 +08:00
jsers
4233c36fd0 web build 2020-06-18 16:04:49 +08:00
jsers
8375caaaba fix: clear useless data when subscribing 2020-06-18 16:04:03 +08:00
jsers
12ae7bbf56 web build 2020-06-17 11:40:09 +08:00
jsers
8555ad5118 style: optimize alarm scene table style (#198) 2020-06-17 11:39:15 +08:00
jsers
eeffa02f59 web build 2020-06-17 11:12:22 +08:00
jsers
7003e3a03b fix: graph legend is not available 2020-06-17 11:10:19 +08:00
jsers
fc023fd833 fix: Graph typescript error 2020-06-17 11:07:58 +08:00
dongdong
b45a968a9a Add stddev funtion for judge (#214) 2020-06-16 14:26:00 +08:00
710leo
8dac520a79 change go.mod 2020-06-15 15:57:10 +08:00
710leo
6237673725 Change module to v2 in go.mod file 2020-06-15 15:35:23 +08:00
UlricQin
0f9ec99c1d upgrade 2.7.0 2020-06-13 18:06:57 +08:00
UlricQin
433da35d34 delete no use config 2020-06-11 20:28:58 +08:00
UlricQin
1a675ed40e refactor port listening checker 2020-06-11 17:31:46 +08:00
matao3754
a1c47b7ca3 添加endpoint屏蔽,并且修改页面对应必填项,去除mcache不必要的锁
* 修改:     go.mod
	修改:     go.sum
	重命名:   pub/index-c6eeb66b35b8fd41c6bf.css -> pub/index-835a6df5e01917561f12.css
	新文件:   pub/index-835a6df5e01917561f12.js
	新文件:   pub/index-835a6df5e01917561f12.js.map
	删除:     pub/index-c6eeb66b35b8fd41c6bf.js
	删除:     pub/index-c6eeb66b35b8fd41c6bf.js.map
	修改:     pub/index.html
	修改:     src/model/user.go
	修改:     src/modules/monapi/cron/mask.go
	修改:     src/modules/monapi/mcache/mask.go
	修改:     web/package-lock.json
	修改:     web/src/pages/Monitor/Silence/CustomForm.tsx

* Update mask.go

Co-authored-by: 马涛 <matao@staff.sina.com.cn>
2020-06-07 09:36:58 +08:00
jsers
6c8e7b024f web build 2020-06-03 17:41:35 +08:00
jsers
76d7b8c3b8 fix: carry tags when strategy batch cloning (#190) 2020-06-03 17:41:35 +08:00
710leo
2b1387620b Merge branch 'master' of github.com:didi/nightingale 2020-06-01 21:12:19 +08:00
Ulric Qin
2d0fa2d26f upgrade 2.6.1 2020-06-01 20:54:47 +08:00
710leo
3eca4b3dac Refactor: remove tsdb xxhash key 2020-06-01 20:48:55 +08:00
jsers
436ae6c610 web build 2020-06-01 18:07:07 +08:00
jsers
70c00f1424 fix: HTML encode in the tooltip 2020-06-01 18:07:07 +08:00
UlricQin
61fc79ff47 add snmp funcs 2020-06-01 16:36:12 +08:00
UlricQin
7dfefedf77 fix time location parse 2020-06-01 16:26:41 +08:00
jsers
48ce07eaa1 web build (#183) 2020-06-01 13:10:06 +08:00
yanli
61d0f87f5b Update LOGForm.tsx (#166) 2020-06-01 09:54:27 +08:00
yanli
2ee7668382 Server Add log time format 02 / 01 / 2006:15:04:05 (#165)
Server Add log time format 02 / 01 / 2006:15:04:05
2020-06-01 09:53:58 +08:00
yanli
eaf1d1be6d Add log time format 02 / 01 / 2006:15:04:05 (#164)
* Add log time format 02 / 01 / 2006:15:04:05

* Server Add log time format 02 / 01 / 2006:15:04:05
2020-06-01 09:53:36 +08:00
jsers
36a24add6e web build 2020-05-29 17:34:42 +08:00
jsers
e66c14b086 feat: add stdin and env filed in the plugin collect page 2020-05-29 17:34:42 +08:00
710leo
6b646e3510 Merge branch 'master' of github.com:didi/nightingale 2020-05-29 16:46:49 +08:00
710leo
a727a7f377 Change plugin collect env format 2020-05-29 16:41:33 +08:00
UlricQin
a366f14434 upgrade 2.4.1: bugfix: default token 2020-05-29 16:02:00 +08:00
UlricQin
834669bf36 upgrade 2.4.0: support grafana 2020-05-28 20:28:48 +08:00
Ulric Qin
3c1c43ae9c monapi support grafana 2020-05-28 18:42:42 +08:00
710leo
0d2860dd8e Plugin collect support stdin and env 2020-05-27 21:05:38 +08:00
710leo
ea25842f9d Optimize alert function 2020-05-27 21:01:08 +08:00
sven
4b21874251 bug fix nodata (#157)
* bug fix nodata
2020-05-26 12:33:39 +08:00
710leo
36e000b988 Update control version 2020-05-24 17:44:42 +08:00
710leo
0d5ca9306e Fix rrd update when value less than 1 2020-05-24 17:40:22 +08:00
jsers
fb070a9843 fix: same key causes console warning (#153) 2020-05-22 14:28:03 +08:00
jsers
f8426529fe web build 2020-05-22 14:17:17 +08:00
jsers
79b3698b64 fix: search users 2020-05-22 14:17:17 +08:00
UlricQin
e74db54485 node post api support chinese 2020-05-22 13:27:07 +08:00
710leo
5f14e76b95 Update version 2020-05-22 00:14:43 +08:00
710leo
2c45f9585d Fix false alert when alert duration less than metric step 2020-05-21 23:56:17 +08:00
Ulric Qin
9015460c02 upgrade 2.2.0: support process mem and cpu collector 2020-05-19 11:05:30 +08:00
UlricQin
85ce228137 add vendor 2020-05-19 10:19:37 +08:00
UlricQin
0546d6cb99 Revert "update vendor"
This reverts commit 68826d39ae.
2020-05-19 10:15:30 +08:00
UlricQin
68826d39ae update vendor 2020-05-19 10:01:37 +08:00
zengwh
1accfdf9a8 support collect process mem util, mem used and cpu util metrics (#152)
* support send to influxdb

* fix influxdb config var

* fix name error

* fix remove unused variable

* support collect proc mem util, mem used and cpu util metrics

* opt calc timestamp

* opt code
2020-05-18 19:25:54 +08:00
Ulric Qin
91d9f3b29f upgrade 2.1.0 2020-05-16 18:46:53 +08:00
zengwh
7fa09f19de Transfer backend support opentsdb (#149)
* support send to opentsdb

* fix issue
2020-05-16 14:32:32 +08:00
Gavin Gao
c610d645bb Merge pull request #147 from GaoJiasheng/master
delete useless lock
2020-05-16 08:47:20 +08:00
gaojiasheng
5207b2fcba delete useless lock 2020-05-16 00:14:42 +08:00
jsers
806a392d0f web build 2020-05-15 18:27:51 +08:00
jsers
6211d4849a rm unused code 2020-05-15 18:27:20 +08:00
jsers
74cfb17d5f web build 2020-05-15 18:25:39 +08:00
jsers
e5cef8ccc0 fix: normalize dirty list to tree 2020-05-15 18:25:39 +08:00
710leo
fe9c84f263 Change verson to 2.0.0 2020-05-15 17:35:03 +08:00
710leo
e9e399de85 Update vendor 2020-05-15 17:30:07 +08:00
710leo
99bd108901 Plugin collect support different params 2020-05-15 17:23:00 +08:00
zengwh
973f94f510 support send to influxdb (#146)
* support send to influxdb

* fix influxdb config var

* fix name error

* fix remove unused variable
2020-05-15 16:41:54 +08:00
710leo
fc4fa979af Merge branch 'master' of github.com:didi/nightingale 2020-05-14 20:38:22 +08:00
710leo
f004543ce0 Optimize nodata alert value & fix plugin collect cycle not work 2020-05-14 20:38:04 +08:00
UlricQin
29f7044695 refactor error message 2020-05-13 18:17:30 +08:00
710leo
e196dcae79 Change control version to 1.4.1 2020-05-13 16:15:22 +08:00
710leo
fbb11de178 Fix parse port collect file 2020-05-13 16:09:44 +08:00
jsers
b19246021d web build 2020-05-13 14:27:20 +08:00
jsers
358b4a9529 fix: return list path 2020-05-13 14:26:33 +08:00
jsers
ba03584e67 web build 2020-05-13 13:20:14 +08:00
jsers
faa020a19a feat: Cache pagesize to local (#140) 2020-05-13 13:19:31 +08:00
jsers
4d08dd4e8c Fix the color of the serie in the comparison 2020-05-13 13:19:31 +08:00
710leo
ecb87505bc Change readme 2020-05-12 22:39:27 +08:00
jsers
5d0093bd5a web build 2020-05-12 21:43:36 +08:00
jsers
42efaafd1e Enable plugin collect page 2020-05-12 21:42:17 +08:00
710leo
71f9aa0618 Merge branch 'master' of github.com:didi/nightingale 2020-05-12 20:07:24 +08:00
710leo
a6de4d35a9 Support operating plugin collection by web 2020-05-12 20:07:11 +08:00
UlricQin
69d568d54e flush stdout 2020-05-12 13:39:33 +08:00
UlricQin
180ee94c55 code refactor 2020-05-12 12:17:06 +08:00
UlricQin
857b905934 add debug log 2020-05-12 11:40:30 +08:00
陈键冬
9264aa6293 Version management (#141)
* Provide a better version management approach

* Fix typo
2020-05-11 20:23:53 +08:00
Ulric Qin
b1c1fda547 add nic prefix: ens 2020-05-10 19:26:52 +08:00
Ulric Qin
b278e29f75 ignore mount point prefix: /run 2020-05-10 19:25:44 +08:00
Ulric Qin
6e10ab4b55 mount ignore refactor 2020-05-10 19:18:46 +08:00
Ulric Qin
04e6f36cac ignore invalid metric 2020-05-10 17:53:36 +08:00
Ulric Qin
6907ffa56c enhance endpoint query 2020-05-09 23:44:55 +08:00
chenjiandongx
275ecdeacc Chore 2020-05-09 21:56:32 +08:00
chenjiandongx
db0ab16315 Remove unnessary zero length 2020-05-09 21:51:23 +08:00
chenjiandongx
c1f38e5ca4 Make variable name more meaningful 2020-05-09 21:44:18 +08:00
jsers
bd35f53950 Optimize yAxis domain 2020-05-09 18:12:07 +08:00
jsers
b1bf15d354 Upgrade ts-graph 2020-05-09 18:08:42 +08:00
jsers
6bc60c11dc web build 2020-05-09 16:46:02 +08:00
jsers
4eeae1a3ac Hide plugin collect entry 2020-05-09 16:44:18 +08:00
jsers
06a537cd1d web build 2020-05-09 11:18:24 +08:00
jsers
82345bc0e2 feat: add plugin collect page 2020-05-09 11:17:30 +08:00
710leo
7d774562c7 Add delete index by endpoint api 2020-05-07 22:16:35 +08:00
陈键冬
a45ca89ecb No 'this' anymore (#135) 2020-05-07 21:42:39 +08:00
Ulric Qin
f609a84c18 display children node events 2020-05-06 22:09:32 +08:00
710leo
afb80933d0 fix: counter type check 2020-04-30 17:51:45 +08:00
qinxiaohui
25a1d30877 expose default configuration 2020-04-29 17:11:41 +08:00
jsers
c559b8eb85 web build 2020-04-29 10:50:19 +08:00
jsers
c27c0b09a9 fix: service name validation 2020-04-29 10:49:35 +08:00
710leo
9c129acdb4 refactor: standardized code 2020-04-28 22:42:15 +08:00
710leo
7f1a947226 refactor: add timestamp check 2020-04-28 19:26:07 +08:00
710leo
f648a6c8c2 refactor: change some log print 2020-04-27 23:36:35 +08:00
jsers
e60322a612 web build 2020-04-27 13:58:54 +08:00
jsers
a30e6d6be7 fix: type and query filter does not work (#104) 2020-04-27 13:56:47 +08:00
jsers
8b0e2b8042 modify 'extra' placement (#106) 2020-04-27 13:56:47 +08:00
jsers
d6f8c48b70 fix: dynamic value does not work (#122) 2020-04-27 13:56:47 +08:00
Ulric Qin
10cf6d94df bugfix: do not update alias if alias is blank whenendpoint import 2020-04-25 17:32:09 +08:00
710leo
b32775eb85 Change pr template 2020-04-25 12:47:06 +08:00
710leo
c5e9c28848 Change issue template 2020-04-25 11:50:30 +08:00
710leo
d9b89f24b5 refactor: sort event tags & change extra 2020-04-23 18:30:56 +08:00
陈键冬
322eb97a45 Add issue template (#112)
* Add issue template
2020-04-23 14:24:25 +08:00
yimeng
894723f7e6 service regular fix (#110)
* fix regular dot

* add yarn.lock to .gitignore
2020-04-23 10:52:50 +08:00
710leo
c3870840e8 Merge branch 'master' of github.com:didi/nightingale 2020-04-23 10:43:49 +08:00
710leo
12d4024710 fix: clude api get redundant tags 2020-04-23 10:35:23 +08:00
qinxiaohui
4bbb39913c Merge branch 'master' of github.com:didi/nightingale 2020-04-22 17:44:32 +08:00
qinxiaohui
12ebdd184a give port 22 as example 2020-04-22 17:44:22 +08:00
710leo
9cb07f1252 fix: nodata strategy does not work when index or tsdb down 2020-04-22 14:54:59 +08:00
710leo
2c285e63c1 optimize nodata 2020-04-21 22:41:43 +08:00
710leo
256dceb624 fix: recovery duration 2020-04-20 22:46:05 +08:00
710leo
45fe14b439 get plugin collect from monapi 2020-04-20 19:07:38 +08:00
陈键冬
0d496f71ca Add PR template (#100) 2020-04-19 21:06:41 +08:00
陈键冬
747c18a076 Fix logger format bug (#99) 2020-04-19 21:05:33 +08:00
penggy
8790094b15 change collector readme, fix url path (#98)
* change collector readme, fix url path

* add error return

* fix word mistake

* optimization code
2020-04-19 00:11:16 +08:00
陈键冬
e32b8e44d1 Reuse the pool module (#97)
* Reuse the pool module

* Remove useless init function

* Rename variable
2020-04-18 14:19:32 +08:00
陈键冬
c0675e9a2a Clean up gomod and vendor (#96) 2020-04-18 08:25:21 +08:00
陈键冬
c14a7a170a Try to fix broken tests (#89)
* Try to fix broken tests

* Remove uesless function
2020-04-17 18:52:32 +08:00
陈键冬
2d8ffb43e6 Standardized transfer code (#94)
* Standardized transfer code
2020-04-17 16:55:48 +08:00
710leo
b50b85f367 Merge branch 'master' of github.com:didi/nightingale 2020-04-16 15:38:51 +08:00
710leo
0b2d4d6d19 fix: concurrent map 2020-04-16 15:35:33 +08:00
sven
a3d956f780 nodata不显示judgeItem tags (#90)
Co-authored-by: litianshun <litianshun@meicai.cn>
2020-04-15 21:03:20 +08:00
jsers
438ec4a0f3 web build 2020-04-15 10:58:20 +08:00
jsers
01c47fd48f fix: remove all falsey values 2020-04-15 10:57:30 +08:00
jsers
b1a0580603 web build 2020-04-14 17:00:46 +08:00
jsers
47500b0a43 fix: get extra value 2020-04-14 17:00:46 +08:00
710leo
56f8241ea5 fix index slice append 2020-04-14 16:02:08 +08:00
jsers
c59cb74f12 web build 2020-04-14 14:30:33 +08:00
jsers
4845a8e87b fix: Complete translation 2020-04-14 14:29:42 +08:00
jsers
0769b02549 feat: Charts support time comparisons 2020-04-14 14:21:31 +08:00
jsers
500afafdf3 Add extra field 2020-04-14 13:12:57 +08:00
jsers
60fcbcb3eb fix: Fixed serie color 2020-04-14 13:07:13 +08:00
2286 changed files with 41598 additions and 568098 deletions

View File

@@ -1 +0,0 @@
web

67
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file
View File

@@ -0,0 +1,67 @@
name: Bug Report
description: Report a bug encountered while running Nightingale
labels: ["kind/bug"]
body:
- type: markdown
attributes:
value: |
Thanks for taking time to fill out this bug report!
The more detailed the form is filled in, the easier the problem will be solved.
- type: textarea
id: config
attributes:
label: Relevant server.conf | webapi.conf
description: Place config in the toml code section. This will be automatically formatted into toml, so no need for backticks.
render: toml
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant logs
description: categraf | telegraf | server | webapi | prometheus | chrome request/response ...
render: text
validations:
required: true
- type: input
id: system-info
attributes:
label: System info
description: Include nightingale version, operating system, and other relevant details
placeholder: ex. n9e 5.9.2, n9e-fe 5.5.0, categraf 0.1.0, Ubuntu 20.04, Docker 20.10.8
validations:
required: true
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: Describe the steps to reproduce the bug.
value: |
1.
2.
3.
...
validations:
required: true
- type: textarea
id: expected-behavior
attributes:
label: Expected behavior
description: Describe what you expected to happen when you performed the above steps.
validations:
required: true
- type: textarea
id: actual-behavior
attributes:
label: Actual behavior
description: Describe what actually happened when you performed the above steps.
validations:
required: true
- type: textarea
id: additional-info
attributes:
label: Additional info
description: Include gist of relevant config, logs, etc.
validations:
required: false

5
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1,5 @@
blank_issues_enabled: false
contact_links:
- name: Nightingale docs
url: https://n9e.github.io/
about: You may want to read through the document before asking questions.

11
.github/ISSUE_TEMPLATE/enhancement.md vendored Normal file
View File

@@ -0,0 +1,11 @@
---
name: Enhancement Request
about: Suggest an enhancement to the nightingale project
labels: kind/feature
---
<!-- Please only use this template for submitting enhancement requests -->
**What would you like to be added**:
**Why is this needed**:

14
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,14 @@
**What type of PR is this?**
**What this PR does / why we need it**:
<!--
"Nice to have" "You need it" is not a good reason. :)
-->
**Which issue(s) this PR fixes**:
<!--
Usage: `Fixes #<issue number>`, or "Fixes (paste link of issue)"
-->
Fixes #
**Special notes for your reviewer**:

View File

@@ -1,26 +1,32 @@
name: Go
name: Release
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
tags:
- 'v*'
env:
GO_VERSION: 1.18
jobs:
build:
name: Build
goreleaser:
runs-on: ubuntu-latest
steps:
- name: Set up Go 1.13
uses: actions/setup-go@v1
with:
go-version: 1.13
id: go
- name: Check out code into the Go module directory
uses: actions/checkout@v2
- name: Build
run: ./control build
- name: Checkout Source Code
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Setup Go Environment
uses: actions/setup-go@v3
with:
go-version: ${{ env.GO_VERSION }}
- uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v3
with:
version: latest
args: release --rm-dist
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

25
.gitignore vendored
View File

@@ -29,24 +29,27 @@ _test
/build
/dist
/etc/*.local.yml
/etc/log/log.test.json
/etc/*.local.conf
/etc/plugins/*.local.yml
/data*
/tarball
/run
/vendor
/tmp
/pub
/n9e
/docker/pub
/docker/n9e
/docker/mysqldata
.alerts
.idea
.index
.vscode
.DS_Store
.cache-loader
.payload
queries.active
/n9e-*
/src/modules/index/index
/src/modules/collector/collector
/src/modules/transfer/transfer
/src/modules/tsdb/tsdb
/src/modules/monapi/monapi
/web/node_modules
/web/.cache-loader

88
.goreleaser.yaml Normal file
View File

@@ -0,0 +1,88 @@
before:
hooks:
# You may remove this if you don't use go modules.
- go mod tidy
snapshot:
name_template: '{{ .Tag }}'
checksum:
name_template: 'checksums.txt'
changelog:
skip: true
builds:
- id: build
hooks:
pre:
- ./fe.sh
main: ./src/
binary: n9e
env:
- CGO_ENABLED=0
goos:
- linux
goarch:
- amd64
- arm64
ldflags:
- -s -w
- -X github.com/didi/nightingale/v5/src/pkg/version.VERSION={{ .Tag }}-{{.Commit}}
archives:
- id: n9e
builds:
- build
format: tar.gz
format_overrides:
- goos: windows
format: zip
name_template: "n9e-v{{ .Version }}-{{ .Os }}-{{ .Arch }}"
wrap_in_directory: false
files:
- docker/*
- etc/*
- pub/*
release:
github:
owner: ccfos
name: nightingale
name_template: "v{{ .Version }}"
dockers:
- image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
goos: linux
goarch: amd64
ids:
- build
dockerfile: docker/Dockerfile.goreleaser
extra_files:
- pub
use: buildx
build_flag_templates:
- "--platform=linux/amd64"
- image_templates:
- flashcatcloud/nightingale:{{ .Version }}-arm64v8
goos: linux
goarch: arm64
ids:
- build
dockerfile: docker/Dockerfile.goreleaser
extra_files:
- pub
use: buildx
build_flag_templates:
- "--platform=linux/arm64/v8"
docker_manifests:
- name_template: flashcatcloud/nightingale:{{ .Version }}
image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
- flashcatcloud/nightingale:{{ .Version }}-arm64v8
- name_template: flashcatcloud/nightingale:latest
image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
- flashcatcloud/nightingale:{{ .Version }}-arm64v8

View File

@@ -1,11 +0,0 @@
FROM golang:1.13
LABEL maintainer="llitfkitfk@gmail.com,chenjiandongx@qq.com"
WORKDIR /app
RUN apt-get update && apt-get install net-tools -y
COPY . .
RUN ./control build docker
RUN mv /app/bin/* /usr/local/bin

42
Makefile Normal file
View File

@@ -0,0 +1,42 @@
.PHONY: start build
NOW = $(shell date -u '+%Y%m%d%I%M%S')
RELEASE_VERSION = 5.9.4
APP = n9e
SERVER_BIN = $(APP)
# RELEASE_ROOT = release
# RELEASE_SERVER = release/${APP}
# GIT_COUNT = $(shell git rev-list --all --count)
# GIT_HASH = $(shell git rev-parse --short HEAD)
# RELEASE_TAG = $(RELEASE_VERSION).$(GIT_COUNT).$(GIT_HASH)
all: build
build:
go build -ldflags "-w -s -X github.com/didi/nightingale/v5/src/pkg/version.VERSION=$(RELEASE_VERSION)" -o $(SERVER_BIN) ./src
# start:
# @go run -ldflags "-X main.VERSION=$(RELEASE_TAG)" ./cmd/${APP}/main.go web -c ./configs/config.toml -m ./configs/model.conf --menu ./configs/menu.yaml
run_webapi:
nohup ./n9e webapi > webapi.log 2>&1 &
run_server:
nohup ./n9e server > server.log 2>&1 &
# swagger:
# @swag init --parseDependency --generalInfo ./cmd/${APP}/main.go --output ./internal/app/swagger
# wire:
# @wire gen ./internal/app
# test:
# cd ./internal/app/test && go test -v
# clean:
# rm -rf data release $(SERVER_BIN) internal/app/test/data cmd/${APP}/data
pack: build
rm -rf $(APP)-$(RELEASE_VERSION).tar.gz
tar -zcvf $(APP)-$(RELEASE_VERSION).tar.gz docker etc $(SERVER_BIN) pub/font pub/index.html pub/assets pub/image

118
README.md
View File

@@ -1,50 +1,102 @@
<img src="https://s3-gz01.didistatic.com/n9e-pub/image/n9e-logo-bg-white.png" width="200" alt="Nightingale"/>
<br>
<img src="doc/img/ccf-n9e.png" width="240">
[中文简介](README_ZH.md)
[English](./README_EN.md) | [中文](./README.md)
Nightingale is a fork of Open-Falcon, and all the core modules have been greatly optimized. It integrates the best practices of DiDi. You can think of it as the next generation of Open-Falcon, and use directly in production environment.
> 夜莺是一款开源的云原生监控系统,采用 All-In-One 的设计,提供企业级的功能特性,开箱即用的产品体验。推荐升级您的 Prometheus + AlertManager + Grafana 组合方案到夜莺。
## Documentation
**夜莺监控具有以下特点:**
Nightingale user manual: [https://n9e.didiyun.com/](https://n9e.didiyun.com/)
#### 1. 开箱即用
支持 Docker、Helm Chart 等多种部署方式,内置多种监控大盘、快捷视图、告警规则模板,导入即可快速使用,活跃、专业的社区用户也在持续迭代和沉淀更多的最佳实践于产品中;
## Compile
#### 2. 兼容并包
支持 [Categraf](https://github.com/flashcatcloud/categraf)、Telegraf、Grafana-agent 等多种采集器,支持 Prometheus、VictoriaMetrics、M3DB 等各种时序数据库,支持对接 Grafana与云原生生态无缝集成
```bash
mkdir -p $GOPATH/src/github.com/didi
cd $GOPATH/src/github.com/didi
git clone https://github.com/didi/nightingale.git
cd nightingale
./control build
```
#### 3. 开放社区
托管于[中国计算机学会开源发展委员会](https://www.ccf.org.cn/kyfzwyh/),有[快猫星云](https://flashcat.cloud)的持续投入,和数千名社区用户的积极参与,以及夜莺监控项目清晰明确的定位,都保证了夜莺开源社区健康、长久的发展;
## Quickstart with Docker
#### 4. 高性能
得益于夜莺的多数据源管理引擎,和夜莺引擎侧优秀的架构设计,借助于高性能时序库,可以满足数亿时间线的采集、存储、告警分析场景,节省大量成本;
We has offered a Docker demo for the users who want to give it a try. Before you get started, make sure you have installed **Docker** & **docker-compose** and there are some details you should know.
#### 5. 高可用
夜莺监控组件均可水平扩展,无单点,已在上千家企业部署落地,经受了严苛的生产实践检验。众多互联网头部公司,夜莺集群机器达百台,处理十亿级时间线,重度使用夜莺监控;
* We highly recommend users prepare a new VM environment to use it.
* All the core components will be installed on your OS according to the `docker-compose.yaml`.
* Nightingale will use the following ports, `80`, `5800`, `5810`, `5811`, `5820`, `5821`, `5830`, `5831`, `5840`, `5841`, `6379`, `2058`, `3306`.
#### 6. 灵活扩展
夜莺监控可部署在1核1G的云主机可在上百台机器部署集群可运行在K8s中也可将时序库、告警引擎等组件下沉到各机房、各region兼顾边缘部署和中心化管理
Okay. Run it! Once the docker finish its jobs, visits http://your-env-ip in your broswer. Default username and password is `root:root`.
```bash
$ docker-compose up -d
```
![dashboard](https://user-images.githubusercontent.com/19553554/78956965-8b9c6180-7b16-11ea-9747-6ed5e62b068d.png)
#### 如果您在使用 Prometheus 过程中,有以下的一个或者多个需求场景,推荐您升级到夜莺:
## Team
- Prometheus、Alertmanager、Grafana 等多个系统较为割裂,缺乏统一视图,无法开箱即用;
- 通过修改配置文件来管理 Prometheus、Alertmanager 的方式,学习曲线大,协同有难度;
- 数据量过大而无法扩展您的 Prometheus 集群;
- 生产环境运行多套 Prometheus 集群,面临管理和使用成本高的问题;
[ulricqin](https://github.com/ulricqin) [710leo](https://github.com/710leo) [jsers](https://github.com/jsers) [hujter](https://github.com/hujter) [n4mine](https://github.com/n4mine) [heli567](https://github.com/heli567)
#### 如果您在使用 Zabbix有以下的场景推荐您升级到夜莺
## Community
- 监控的数据量太大,希望有更好的扩展解决方案;
- 学习曲线高,多人多团队模式下,希望有更好的协同使用效率;
- 微服务和云原生架构下监控数据的生命周期多变、监控数据维度基数高Zabbix 数据模型不易适配;
Nightingale is developed in open. Here we set up an organization, [github.com/n9e](https://github.com/n9e), which is used to communicate and contribute. We sincerely hope more developers can use their creativity to make lots of related projects for the Nightingale ecosystem.
#### 如果您在使用 [open-falcon](https://github.com/open-falcon/falcon-plus),我们更推荐您升级到夜莺:
- 关于open-falcon和夜莺的详细介绍请参考阅读[《云原生监控的十个特点和趋势》](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
#### 我们推荐您使用 [Categraf](https://github.com/flashcatcloud/categraf) 作为首选的监控数据采集器:
- Categraf 是夜莺监控的默认采集器采用开放插件机制和 all-in-one 的设计同时支持 metric、log、trace、event 的采集。Categraf 不仅可以采集 CPU、内存、网络等系统层面的指标也集成了众多开源组件的采集能力支持K8s生态。Categraf 内置了对应的仪表盘和告警规则开箱即用。
## 资料链接
- [快速安装](https://mp.weixin.qq.com/s/iEC4pfL1TgjMDOWYh8H-FA)
- [详细文档](https://n9e.github.io/)
- [社区分享](https://n9e.github.io/docs/prologue/share/)
## 产品演示
<img src="doc/img/intro.gif" width="680">
## 架构介绍
<img src="doc/img/arch-product.png" width="680">
Nightingale 可以接收各种采集器上报的监控数据(比如 [Categraf](https://github.com/flashcatcloud/categraf)、telegraf、grafana-agent、Prometheus并写入多种流行的时序数据库中可以支持Prometheus、M3DB、VictoriaMetrics、Thanos、TDEngine等提供告警规则、屏蔽规则、订阅规则的配置能力提供监控数据的查看能力提供告警自愈机制告警触发之后自动回调某个webhook地址或者执行某个脚本提供历史告警事件的存储管理、分组查看的能力。
<img src="doc/img/arch-system.png" width="680">
夜莺 v5 版本的设计非常简单,核心是 server 和 webapi 两个模块webapi 无状态放到中心端承接前端请求将用户配置写入数据库server 是告警引擎和数据转发模块,一般随着时序库走,一个时序库就对应一套 server每套 server 可以只用一个实例也可以多个实例组成集群server 可以接收 Categraf、Telegraf、Grafana-Agent、Datadog-Agent、Falcon-Plugins 上报的数据,写入后端时序库,周期性从数据库同步告警规则,然后查询时序库做告警判断。每套 server 依赖一个 redis。
<img src="doc/img/install-vm.png" width="680">
如果单机版本的 Prometheus 性能不够或容灾较差,我们推荐使用 [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics)VictoriaMetrics 架构较为简单性能优异易于部署和运维架构图如上。VictoriaMetrics 更详尽的文档,还请参考其[官网](https://victoriametrics.com/)。
## 如何参与
开源项目要更有生命力,离不开开放的治理架构和源源不断的开发者和用户共同参与,我们致力于建立开放、中立的开源治理架构,吸纳更多来自企业、高校等各方面对云原生监控感兴趣、有热情的计算机专业人士,打造专业、有活力的开发者社区。关于《夜莺开源项目和社区治理架构(草案)》,请查阅 [doc/community-governance.md](./doc/community-governance.md).
**我们欢迎您以各种方式参与到夜莺开源项目和开源社区中来,工作包括不限于**
- 补充和完善文档 => [n9e.github.io](https://n9e.github.io/)
- 分享您在使用夜莺监控过程中的最佳实践和经验心得 => [文章分享](https://n9e.github.io/docs/prologue/share/)
- 提交产品建议 =》 [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
- 提交代码,让夜莺监控更快、更稳、更好用 => [github pull request](https://github.com/didi/nightingale/pulls)
**尊重、认可和记录每一位贡献者的工作**是夜莺开源社区的第一指导原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区知识沉淀的贡献:
1. 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
2. 提问之前请先搜索 [github issue](https://github.com/ccfos/nightingale/issues)
3. 我们优先推荐通过提交 github issue 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml) | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
4. 最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题)
## 联系我们
- 推荐您关注夜莺监控公众号,及时获取相关产品和社区动态
<img src="doc/img/n9e-vx-new.png" width="180">
## Stargazers over time
[![Stargazers over time](https://starchart.cc/ccfos/nightingale.svg)](https://starchart.cc/ccfos/nightingale)
## License
<img alt="Apache-2.0 license" src="https://s3-gz01.didistatic.com/n9e-pub/image/apache.jpeg" width="128">
Nightingale is available under the Apache-2.0 license. See the [LICENSE](LICENSE) file for more info.
- [Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)

68
README_EN.md Normal file
View File

@@ -0,0 +1,68 @@
<img src="doc/img/ccf-n9e.png" width="240">
Nightingale is an enterprise-level cloud-native monitoring system, which can be used as drop-in replacement of Prometheus for alerting and management.
[English](./README_EN.md) | [中文](./README.md)
## Introduction
Nightingale is an cloud-native monitoring system by All-In-On design, support enterprise-class functional features with an out-of-the-box experience. We recommend upgrading your `Prometheus` + `AlertManager` + `Grafana` combo solution to Nightingale.
- **Multiple prometheus data sources management**: manage all alerts and dashboards in one centralized visually view;
- **Out-of-the-box alert rule**: built-in multiple alert rules, reuse alert rules template by one-click import with detailed explanation of metrics;
- **Multiple modes for visualizing data**: out-of-the-box dashboards, instance customize views, expression browser and Grafana integration;
- **Multiple collection clients**: support using Promethues Exporter、Telegraf、Datadog Agent to collecting metrics;
- **Integration of multiple storage**: support Prometheus, M3DB, VictoriaMetrics, Influxdb, TDEngine as storage solutions, and original support for PromQL;
- **Fault self-healing**: support the ability to self-heal from failures by configuring webhook;
#### If you are using Prometheus and have one or more of the following requirement scenarios, it is recommended that you upgrade to Nightingale:
- Multiple systems such as Prometheus, Alertmanager, Grafana, etc. are fragmented and lack a unified view and cannot be used out of the box;
- The way to manage Prometheus and Alertmanager by modifying configuration files has a big learning curve and is difficult to collaborate;
- Too much data to scale-up your Prometheus cluster;
- Multiple Prometheus clusters running in production environments, which faced high management and usage costs;
#### If you are using Zabbix and have the following scenarios, it is recommended that you upgrade to Nightingale:
- Monitoring too much data and wanting a better scalable solution;
- A high learning curve and a desire for better efficiency of collaborative use in a multi-person, multi-team model;
- Microservice and cloud-native architectures with variable monitoring data lifecycles and high monitoring data dimension bases, which are not easily adaptable to the Zabbix data model;
#### If you are using [open-falcon](https://github.com/open-falcon/falcon-plus), we recommend you to upgrade to Nightingale
- For more information about open-falcon and Nightingale, please refer to read [Ten features and trends of cloud-native monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## Quickstart
- [n9e.github.io/quickstart](https://n9e.github.io/docs/install/compose/)
## Documentation
- [n9e.github.io](https://n9e.github.io/)
## Example of use
<img src="doc/img/intro.gif" width="680">
## System Architecture
#### A typical Nightingale deployment architecture:
<img src="doc/img/arch-system.png" width="680">
#### Typical deployment architecture using VictoriaMetrics as storage:
<img src="doc/img/install-vm.png" width="680">
## Contact us and feedback questions
- We recommend that you use [github issue](https://github.com/didi/nightingale/issues) as the preferred channel for issue feedback and requirement submission;
- You can join our WeChat group
<img src="doc/img/n9e-vx-new.png" width="180">
## Contributing
We welcome your participation in the Nightingale open source project and open source community in a variety of ways:
- Feedback on problems and bugs => [github issue](https://github.com/didi/nightingale/issues)
- Additional and improved documentation => [n9e.github.io](https://n9e.github.io/)
- Share your best practices and insights on using Nightingale => [User Story](https://github.com/didi/nightingale/issues/897)
- Join our community events => [Nightingale wechat group](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png)
- Submit code to make Nightingale better =>[github PR](https://github.com/didi/nightingale/pulls)
## License
Nightingale with [Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE) open source license.

View File

@@ -1,49 +0,0 @@
<img src="https://s3-gz01.didistatic.com/n9e-pub/image/n9e-logo-bg-white.png" width="200" alt="Nightingale"/>
<br>
[English Introduction](README.md)
Nightingale 是一套衍生自 Open-Falcon 的互联网监控解决方案,融入了部分滴滴生产环境的最佳实践,灵活易用,稳定可靠,是一个生产环境直接可用的版本 :-)
## 文档
使用手册请参考:[夜莺使用手册](https://n9e.didiyun.com/)
## 编译
```bash
mkdir -p $GOPATH/src/github.com/didi
cd $GOPATH/src/github.com/didi
git clone https://github.com/didi/nightingale.git
cd nightingale
./control build
```
## 快速开始
使用 docker 和 docker-compose 环境可以快速部署一整套 nightingale 系统,涵盖了所有的核心组件。
* 强烈建议使用一个新的虚拟环境来部署和测试这个系统。
* 系统组件占用了以下端口,`80`, `5800`, `5810`, `5811`, `5820`, `5821`, `5830`, `5831`, `5840`, `5841`, `6379`, `2058`, `3306`,部署前请确保这些端口没有被使用。
使用 docker-compose 一键构建部署,完成以后可以使用浏览器打开 http://your-env-ip。 默认的登录账号密码均为 `root`
```bash
$ docker-compose up -d
```
![dashboard](https://user-images.githubusercontent.com/19553554/78956965-8b9c6180-7b16-11ea-9747-6ed5e62b068d.png)
## 团队
[ulricqin](https://github.com/ulricqin) [710leo](https://github.com/710leo) [jsers](https://github.com/jsers) [hujter](https://github.com/hujter) [n4mine](https://github.com/n4mine) [heli567](https://github.com/heli567)
## 社区
[github.com/n9e](https://github.com/n9e) 是为夜莺所创建的 Organization用于收集和开发夜莺周边项目。
## License
<img alt="Apache-2.0 license" src="https://s3-gz01.didistatic.com/n9e-pub/image/apache.jpeg" width="128">
Nightingale 基于 Apache-2.0 许可证进行分发和使用,更多信息参见 [LICENSE](LICENSE)。

224
control
View File

@@ -1,224 +0,0 @@
#!/bin/bash
CWD=$(cd $(dirname $0)/; pwd)
cd $CWD
usage()
{
echo $"Usage: $0 {start|stop|restart|status|build|pack} <module>"
exit 0
}
start_all()
{
# http: 5800
test -x n9e-monapi && start monapi
# http: 5810 ; rpc: 5811
test -x n9e-transfer && start transfer
# http: 5820 ; rpc: 5821
test -x n9e-tsdb && start tsdb
# http: 5830 ; rpc: 5831
test -x n9e-index && start index
# http: 5840 ; rpc: 5841
test -x n9e-judge && start judge
# http: 2058
test -x n9e-collector && start collector
}
start()
{
mod=$1
if [ "x${mod}" = "x" ]; then
usage
return
fi
if [ "x${mod}" = "xall" ]; then
start_all
return
fi
binfile=n9e-${mod}
if [ ! -f $binfile ]; then
echo "file[$binfile] not found"
exit 1
fi
if [ $(ps aux|grep -v grep|grep -v control|grep "$binfile" -c) -gt 0 ]; then
echo "${mod} already started"
return
fi
mkdir -p logs/$mod
nohup $CWD/$binfile &> logs/${mod}/stdout.log &
for((i=1;i<=15;i++)); do
if [ $(ps aux|grep -v grep|grep -v control|grep "$binfile" -c) -gt 0 ]; then
echo "${mod} started"
return
fi
sleep 0.2
done
echo "cannot start ${mod}"
exit 1
}
stop_all()
{
test -x n9e-monapi && stop monapi
test -x n9e-transfer && stop transfer
test -x n9e-tsdb && stop tsdb
test -x n9e-index && stop index
test -x n9e-judge && stop judge
test -x n9e-collector && stop collector
}
stop()
{
mod=$1
if [ "x${mod}" = "x" ]; then
usage
return
fi
if [ "x${mod}" = "xall" ]; then
stop_all
return
fi
binfile=n9e-${mod}
if [ $(ps aux|grep -v grep|grep -v control|grep "$binfile" -c) -eq 0 ]; then
echo "${mod} already stopped"
return
fi
ps aux|grep -v grep|grep -v control|grep "$binfile"|awk '{print $2}'|xargs kill
for((i=1;i<=15;i++)); do
if [ $(ps aux|grep -v grep|grep -v control|grep "$binfile" -c) -eq 0 ]; then
echo "${mod} stopped"
return
fi
sleep 0.2
done
echo "cannot stop $mod"
exit 1
}
restart()
{
mod=$1
if [ "x${mod}" = "x" ]; then
usage
return
fi
if [ "x${mod}" = "xall" ]; then
stop_all
start_all
return
fi
stop $mod
start $mod
status
}
status()
{
ps aux|grep -v grep|grep "n9e"
}
build_one()
{
mod=$1
go build -mod=vendor -o n9e-${mod} --tags "md5" src/modules/${mod}/${mod}.go
}
build_docker()
{
mod=$1
go build -mod=vendor -o bin/n9e-${mod} --tags "md5" src/modules/${mod}/${mod}.go
}
build()
{
export GO111MODULE=on
mod=$1
if [ "x${mod}" = "x" ]; then
build_one monapi
build_one transfer
build_one index
build_one judge
build_one collector
build_one tsdb
return
fi
if [ "x${mod}" = "xdocker" ]; then
build_docker monapi
build_docker transfer
build_docker index
build_docker judge
build_docker collector
build_docker tsdb
return
fi
build_one $mod
}
reload()
{
mod=$1
if [ "x${mod}" = "x" ]; then
echo "arg: <mod> is necessary"
return
fi
build_one $mod
restart $mod
}
pack()
{
v=$(date +%Y-%m-%d-%H-%M-%S)
tar zcvf n9e-$v.tar.gz control sql plugin pub etc/log etc/port etc/service etc/nginx.conf etc/mysql.yml etc/address.yml \
n9e-collector etc/collector.yml \
n9e-tsdb etc/tsdb.yml \
n9e-index etc/index.yml \
n9e-judge etc/judge.yml \
n9e-transfer etc/transfer.yml \
n9e-monapi etc/monapi.yml
}
case "$1" in
start)
start $2
;;
stop)
stop $2
;;
restart)
restart $2
;;
status)
status
;;
build)
build $2
;;
reload)
reload $2
;;
pack)
pack
;;
*)
usage
esac

View File

@@ -1,689 +0,0 @@
# 前端接口
`POST /api/portal/auth/login`
校验用户登录信息的接口is_ldap=0表示不使用LDAP账号验证is_ldap=1表示使用LDAP账号验证
```
{
"username": "",
"password": "",
"is_ldap": 0
}
```
---
`GET /api/portal/auth/logout`
退出当前账号,如果请求成功,前端需要跳转到登录页面
---
`GET /api/portal/self/profile`
获取个人信息,可以用此接口校验用户是否登录了
---
`PUT /api/portal/self/profile`
更新个人信息
```
{
"dispname": "",
"phone": "",
"email": "",
"im": ""
}
```
---
`PUT /api/portal/self/password`
更新个人密码,新密码输入两次做校验,在前端完成
```
{
"oldpass": "",
"newpass": ""
}
```
---
`GET /api/portal/user`
获取用户列表支持搜索搜索条件参数是query每页显示条数是limit页码是p如果当前用户是root则展示相关操作按钮如果不是则所有按钮不展示只是查看
---
`POST /api/portal/user`
root账号新增一个用户is_root字段表示新增的这个用户是否是个root
```
{
"username": "",
"password": "",
"dispname": "",
"phone": "",
"email": "",
"im": "",
"is_root": 0
}
```
---
`GET /api/portal/user/:id/profile`
获取某个人的信息
---
`PUT /api/portal/user/:id/profile`
root账号修改某人的信息
```
{
"dispname": "",
"phone": "",
"email": "",
"im": "",
"is_root": 0
}
```
---
`PUT /api/portal/user/:id/password`
root账号重置某人的密码输入两次新密码保证一致的校验由前端来做
```
{
"password": ""
}
```
---
`DELETE /api/portal/user/:id`
root账号来删除某个用户
---
`GET /api/portal/team`
获取团队列表支持搜索搜索条件参数是query每页显示条数是limit页码是p
---
`POST /api/portal/team`
创建团队mgmt=0表示成员管理制mgmt=1表示管理员管理制admins是团队管理员的id列表members是团队普通成员的id列表
```
{
"ident": "",
"name": "",
"mgmt: 0,
"admins": [],
"members": []
}
```
---
`PUT /api/portal/team/:id`
修改团队信息
```
{
"ident": "",
"name": "",
"mgmt: 0,
"admins": [],
"members": []
}
```
---
`DELETE /api/portal/team/:id`
删除团队
---
`GET /api/portal/endpoint`
获取endpoint列表用于【服务树】-【对象列表】页面该页展示endpoint列表搜索条件参数是query每页显示条数是limit页码是p如果要做批量筛选则同时要指定用哪个字段(参数名字是field)来筛选只支持ident和alias批量筛选的内容是batch即batch和field一般是同时出现的
---
`POST /api/portal/endpoint`
导入endpoint要求传入列表每一条是ident::alias拼接在一起
```
{
"endpoints": []
}
```
---
`PUT /api/portal/endpoint/:id`
修改一个endpoint的alias信息
```
{
"alias": ""
}
```
---
`DELETE /api/portal/endpoint`
删除多个endpointids参数放到request body里
```
{
"ids": [10000, 200000]
}
```
---
`GET /api/portal/endpoints/bindings`
查询endpoint的绑定关系QueryStringidents逗号分隔多个
---
`GET /api/portal/endpoints/bynodeids`
根据节点id查询挂载了哪些endpointQueryStringids逗号分隔的多个节点id
---
`GET /api/portal/tree`
查询整颗服务树
---
`GET /api/portal/tree/search`
根据节点路径(path)查询服务树子树
---
`POST /api/portal/node`
创建服务树节点pid表示父节点idleaf=0表示非叶子节点leaf=1表示叶子节点note是备注信息
```
{
"pid": 0,
"name": "",
"leaf": 0,
"note": ""
}
```
---
`PUT /api/portal/node/:id/name`
服务树节点改名
```
{
"name": ""
}
```
---
`DELETE /api/portal/node/:id`
删除服务树节点
---
`GET /api/portal/node/:id/endpoint`
获取节点下面的endpoint列表查询字符串使用query每页展示多少条使用limit页码使用p如要批量筛选一行一个使用batch同时必须指定field即使用哪个字段进行批量筛选有ident和alias可选
---
`POST /api/portal/node/:id/endpoint-bind`
绑定一批endpoint到当前节点del_old=1表示同时删除老的挂载关系
```
{
"idents": [],
"del_old": 0
}
```
---
`POST /api/portal/node/:id/endpoint-unbind`
解绑endpoint和节点的挂载关系
```
{
"idents": []
}
```
---
`GET /api/portal/nodes/search`
搜索节点limit表示最多返回多少条query是搜索条件
---
`GET /api/portal/nodes/leafids`
获取节点对应的叶子节点的id参数是ids逗号分隔的多个节点id
---
`GET /api/portal/nodes/pids`
获取节点对应的父、祖节点的id参数是ids逗号分隔的多个节点id
---
`GET /api/portal/nodes/byids`
查询节点的信息参数是ids逗号分隔的多个节点id返回这多个节点的信息
---
`GET /api/portal/node/:id/maskconf`
获取报警屏蔽列表,因为已经是某个节点下的了,量比较少,后端不分页
---
`POST /api/portal/maskconf`
创建一个报警屏蔽策略
```
{
"nid": 0,
"endpoints": [],
"metric": "",
"tags": "",
"cause": "",
"btime": 1563838361,
"etime": 1563838461
}
```
---
`PUT /api/portal/maskconf/:id`
修改一个报警屏蔽策略
```
{
"endpoints": [],
"metric": "",
"tags": "",
"cause": "",
"btime": 1563838361,
"etime": 1563838461
}
```
---
`DELETE /api/portal/maskconf/:id`
删除一个报警屏蔽策略
---
`GET /api/portal/node/:id/screen`
获取screen列表因为已经是某个节点下的了量比较少后端不分页
---
`POST /api/portal/node/:id/screen`
创建screen
```
{
"name": ""
}
```
---
`PUT /api/portal/screen/:id`
修改screen其中node_id顺带也可以修改这样screen相当于直接挪动了挂载节点
```
{
"name": "",
"node_id": 0
}
```
---
`DELETE /api/portal/screen/:id`
删除某个screen
---
`GET /api/portal/screen/:id/subclass`
获取screen下面的子类返回的subclass按照weight字段排序
---
`POST /api/portal/screen/:id/subclass`
创建subclass
```
{
"name": "",
"weight": 0
}
```
---
`PUT /api/portal/subclass`
批量修改subclass
```
[
{
"id": 1,
"name": "a",
"weight": 1
},
{
"id": 2,
"name": "b",
"weight": 0
}
]
```
---
`DELETE /api/portal/subclass/:id`
删除某个subclass
---
`PUT /api/portal/subclasses/loc`
修改subclass的location即所属的screen
```
[
{
"id": 1,
"screen_id": 1
},
{
"id": 2,
"screen_id": 1
}
]
```
---
`GET /api/portal/subclass/:id/chart`
获取chart列表根据chart的weight排序不分页
---
`POST /api/portal/subclass/:id/chart`
创建chart
```
{
"configs": "",
"weight": 0
}
```
---
`PUT /api/portal/chart/:id`
修改某个chart的信息
```
{
"subclass_id": 1,
"configs": ""
}
```
---
`DELETE /api/portal/chart/:id`
删除某个chart
---
`PUT /api/portal/charts/weights`
修改chart的排序权重
```
{
"id": 1,
"weight": 9
}
```
---
`GET /api/portal/tmpchart`
获取临时图参数是QueryStringids逗号分隔的多个id
---
`POST /api/portal/tmpchart`
创建一个临时图返回生成的临时图的id列表
```
[
{
"configs": ""
},
{
"configs": ""
}
]
```
---
`GET /api/portal/event/cur`
获取某个节点下的未恢复报警列表QueryString
- 节点路径nodepath
- 开始时间stime
- 结束时间etime
- 每页条数limit
- 查询条件query
- 优先级priorities逗号分隔的多个
- 发送类型sendtypes逗号分隔的多个
---
`GET /api/portal/event/cur/:id`
获取当前某一个未恢复的报警
---
`DELETE /api/portal/event/cur/:id`
删除当前某一个未恢复的报警
---
`POST /api/portal/event/curs/claim`
认领某一些未恢复的告警避免告警升级到老板那里id和nodepath只能传入一个不能同时传入也不能一个都不传业务上的语义是要么认领某一个告警事件要么认领某个节点下的所有告警事件
```
{
"id": 1,
"nodepath": ""
}
```
---
`GET /api/portal/event/his`
获取某个节点下的所有报警列表QueryString
- 节点路径nodepath
- 开始时间stime
- 结束时间etime
- 每页条数limit
- 查询条件query
- 优先级priorities逗号分隔的多个
- 发送类型sendtypes逗号分隔的多个
- 事件类型type
---
`GET /api/portal/event/his/:id`
获取某个历史告警事件
---
`GET /api/portal/collect/list`
获取某个节点下面配置的采集策略必传QueryString: nid表示节点id后端不分页返回全部
---
`GET /api/portal/collect`
获取单个采集配置的详情QueryStringtype和id都必传type是port、proc、log之一。
---
`POST /api/portal/collect`
创建一个采集策略
```
{
"type": "",
"data": {
# 端口、进程、日志配置不同参阅model/collect.go
}
}
```
---
`PUT /api/portal/collect`
修改一个采集策略
```
{
"type": "",
"data": {
# 端口、进程、日志配置不同参阅model/collect.go
}
}
```
---
`DELETE /api/portal/collect`
删除采集策略
```
{
"type": "",
"ids": []
}
```
---
`POST /api/portal/collect/check`
校验用户输入的数据是否匹配正则,跟商业版本数据结构一致
---
`POST /api/portal/stra`
新增告警策略,跟商业版本数据结构一致
---
`PUT /api/portal/stra`
修改告警策略,跟商业版本数据结构一致
---
`DELETE /api/portal/stra`
删除告警策略,跟商业版本数据结构一致
---
`GET /api/portal/stra`
获取告警策略列表,跟商业版本数据结构一致
---
`GET /api/portal/stra/:id`
获取单个告警策略,跟商业版本数据结构一致

View File

@@ -0,0 +1,52 @@
# 夜莺开源项目和社区治理架构(草案)
#### 用户(User)
>欢迎任何个人、公司以及组织,使用 Nightingale并积极的反馈 bug、提交功能需求、以及相互帮助我们推荐使用 github issue 来跟踪 bug 和管理需求。
#### 贡献者(Contributer)
>欢迎每一位用户,包括但不限于以下列方式参与到 Nightingale 开源项目并做出贡献:
>1. 在 [github issue](https://github.com/ccfos/nightingale/issues) 中积极参与讨论;
>2. 提交代码补丁;
>3. 修订、补充和完善文档;
>4. 提交建议 / 批评;
#### 提交者(Committer)
>Committer 是指拥有 Nightingale 代码仓库写操作权限的贡献者,而且他们也签署了 Nightingale 项目贡献者许可协议CLA他们拥有 ccf.org.cn 为后缀的邮箱地址。原则上 Committer 能够自主决策某个代码补丁是否可以合入到 Nightingale 代码仓库,但是项目管委会拥有最终的决策权。
#### 项目管委会成员(PMC Member)
> 项目管委会成员,从贡献者或者 Committer 中选举产生,他们拥有 Nightingale 代码仓库的写操作权限,拥有 ccf.org.cn 为后缀的邮箱地址,拥有 Nightingale 社区相关事务的投票权、以及提名 Committer 候选人的权利。 项目管委会作为一个实体,为整个项目的发展全权负责。
#### 项目管委会主席(PMC Chair)
> 项目管委会主席采用任命制,由 [CCF ODC](https://www.ccf.org.cn/kyfzwyh/) 从项目管委会成员中任命产生。项目管委会作为一个统一的实体,来管理和领导 Nightingale 项目。管委会主席是 CCF ODC 和项目管委会之间的沟通桥梁,履行特定的项目管理职责。
# 沟通机制(Communication)
1. 我们推荐使用邮件列表来反馈建议(待发布);
2. 我们推荐使用 [github issue](https://github.com/ccfos/nightingale/issues) 跟踪 bug 和管理需求;
3. 我们推荐使用 [github milestone](https://github.com/ccfos/nightingale/milestones) 来管理项目进度和规划;
4. 我们推荐使用腾讯会议来定期召开项目例会;
# 文档(Documentation)
1. 我们推荐使用 [github pages](https://n9e.github.io) 来沉淀文档;
2. 我们推荐使用 [gitlink wiki](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq) 来沉淀FAQ
# 运营机制(Operation)
1. 我们定期组织用户、贡献者、项目管委会成员之间的沟通会议,讨论项目开发的目标、方案、进度,以及讨论相关需求的合理性、优先级等议题;
2. 我们定期组织 meetup (线上&线下),创造良好的用户交流分享环境,并沉淀相关内容到文档站点;
3. 我们定期组织 Nightingale 开发者大会,分享 best user story、同步年度开发目标和计划、讨论新技术方向等
# 社区指导原则(Philosophy)
- 尊重、认可和记录每一位贡献者的工作;
# 关于提问的原则
按照**尊重、认可、记录每一位贡献者的工作**原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区的知识沉淀的贡献:
1. 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
2. 提问之前请先搜索 [github issue](https://github.com/ccfos/nightingale/issues)
3. 我们优先推荐通过提交 github issue 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml) | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
4. 最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题)

BIN
doc/img/alert-events.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 450 KiB

BIN
doc/img/arch-product.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 127 KiB

BIN
doc/img/arch-system.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 198 KiB

BIN
doc/img/arch.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 198 KiB

BIN
doc/img/ccf-n9e.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 26 KiB

BIN
doc/img/dingtalk.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 242 KiB

BIN
doc/img/install-vm.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 141 KiB

BIN
doc/img/intro.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.1 MiB

BIN
doc/img/mysql-alerts.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 393 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.3 MiB

BIN
doc/img/n9e-vx-new.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 331 KiB

BIN
doc/img/redis-dash.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 801 KiB

BIN
doc/img/vm-cluster-arch.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 127 KiB

BIN
doc/img/wx.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 99 KiB

View File

@@ -1,90 +0,0 @@
version: "3"
volumes:
mysql-data:
services:
nginx:
image: nginx:stable-alpine
network_mode: host
ports:
- 80:80
volumes:
- ./docker/nginx/nginx.conf:/etc/nginx/nginx.conf
- ./docker/nginx/conf.d:/etc/nginx/conf.d
- ./pub:/home/n9e/pub
nightingale:
build: .
image: nightingale
monapi:
image: nightingale
network_mode: host
restart: always
command: n9e-monapi
ports:
- 5800:5800
transfer:
image: nightingale
network_mode: host
restart: always
command: n9e-transfer
ports:
- 5810:5810
- 5811:5811
tsdb:
image: nightingale
network_mode: host
restart: always
command: n9e-tsdb
ports:
- 5820:5820
- 5821:5821
index:
image: nightingale
network_mode: host
restart: always
command: n9e-index
ports:
- 5830:5830
- 5831:5831
judge:
image: nightingale
network_mode: host
restart: always
command: n9e-judge
ports:
- 5840:5840
- 5841:5841
collector:
image: nightingale
network_mode: host
restart: always
command: n9e-collector
ports:
- 2058:2058
redis:
image: redis
network_mode: host
restart: always
ports:
- 6379:6379
mysql:
image: mysql:5.7
network_mode: host
restart: always
environment:
- MYSQL_ROOT_PASSWORD=1234
ports:
- 3306:3306
volumes:
- ./sql:/docker-entrypoint-initdb.d
- mysql-data:/var/lib/mysql

7
docker/.dockerignore Normal file
View File

@@ -0,0 +1,7 @@
ibexetc
initsql
mysqletc
n9eetc
prometc
build.sh
docker-compose.yaml

14
docker/Dockerfile Normal file
View File

@@ -0,0 +1,14 @@
FROM python:2
#FROM ubuntu:21.04
WORKDIR /app
ADD n9e /app
ADD http://download.flashcat.cloud/wait /wait
RUN mkdir -p /app/pub && chmod +x /wait
ADD pub /app/pub/
RUN chmod +x n9e
EXPOSE 19000
EXPOSE 18000
CMD ["/app/n9e", "-h"]

View File

@@ -0,0 +1,14 @@
FROM --platform=$BUILDPLATFORM python:2
WORKDIR /app
ADD n9e /app
ADD http://download.flashcat.cloud/wait /wait
RUN mkdir -p /app/pub && chmod +x /wait
ADD pub /app/pub/
RUN chmod +x n9e
EXPOSE 19000
EXPOSE 18000
CMD ["/app/n9e", "-h"]

20
docker/build.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/bin/sh
if [ $# -ne 1 ]; then
echo "$0 <tag>"
exit 0
fi
tag=$1
echo "tag: ${tag}"
rm -rf n9e pub
cp ../n9e .
cp -r ../pub .
docker build -t nightingale:${tag} .
docker tag nightingale:${tag} ulric2019/nightingale:${tag}
docker push ulric2019/nightingale:${tag}
rm -rf n9e pub

View File

@@ -0,0 +1,45 @@
[global]
# whether print configs
print_configs = false
# add label(agent_hostname) to series
# "" -> auto detect hostname
# "xx" -> use specified string xx
# "$hostname" -> auto detect hostname
# "$ip" -> auto detect ip
# "$hostname-$ip" -> auto detect hostname and ip to replace the vars
hostname = "$HOSTNAME"
# will not add label(agent_hostname) if true
omit_hostname = false
# s | ms
precision = "ms"
# global collect interval
interval = 15
[global.labels]
source="categraf"
# region = "shanghai"
# env = "localhost"
[writer_opt]
# default: 2000
batch = 2000
# channel(as queue) size
chan_size = 10000
[[writers]]
url = "http://nserver:19000/prometheus/v1/write"
# Basic auth username
basic_auth_user = ""
# Basic auth password
basic_auth_pass = ""
# timeout settings, unit: ms
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100

View File

@@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect per cpu
# collect_per_cpu = false

View File

@@ -0,0 +1,11 @@
# # collect interval
# interval = 15
# # By default stats will be gathered for all mount points.
# # Set mount_points will restrict the stats to only the specified mount points.
# mount_points = ["/"]
# Ignore mount points by filesystem type.
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
ignore_mount_points = ["/boot"]

View File

@@ -0,0 +1,6 @@
# # collect interval
# interval = 15
# # By default, categraf will gather stats for all devices including disk partitions.
# # Setting devices will restrict the stats to the specified devices.
# devices = ["sda", "sdb", "vd*"]

View File

@@ -0,0 +1,63 @@
# # collect interval
# interval = 15
[[instances]]
# # append some labels for series
# labels = { region="cloud", product="n9e" }
# # interval = global.interval * interval_times
# interval_times = 1
## Docker Endpoint
## To use TCP, set endpoint = "tcp://[ip]:[port]"
## To use environment variables (ie, docker-machine), set endpoint = "ENV"
endpoint = "unix:///var/run/docker.sock"
## Set to true to collect Swarm metrics(desired_replicas, running_replicas)
gather_services = false
gather_extend_memstats = false
container_id_label_enable = true
container_id_label_short_style = true
## Containers to include and exclude. Globs accepted.
## Note that an empty array for both will include all containers
container_name_include = []
container_name_exclude = []
## Container states to include and exclude. Globs accepted.
## When empty only containers in the "running" state will be captured.
## example: container_state_include = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
## example: container_state_exclude = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
# container_state_include = []
# container_state_exclude = []
## Timeout for docker list, info, and stats commands
timeout = "5s"
## Specifies for which classes a per-device metric should be issued
## Possible values are 'cpu' (cpu0, cpu1, ...), 'blkio' (8:0, 8:1, ...) and 'network' (eth0, eth1, ...)
## Please note that this setting has no effect if 'perdevice' is set to 'true'
perdevice_include = []
## Specifies for which classes a total metric should be issued. Total is an aggregated of the 'perdevice' values.
## Possible values are 'cpu', 'blkio' and 'network'
## Total 'cpu' is reported directly by Docker daemon, and 'network' and 'blkio' totals are aggregated by this plugin.
## Please note that this setting has no effect if 'total' is set to 'false'
total_include = ["cpu", "blkio", "network"]
## Which environment variables should we use as a tag
##tag_env = ["JAVA_HOME", "HEAP_SIZE"]
## docker labels to include and exclude as tags. Globs accepted.
## Note that an empty array for both will include all labels as tags
docker_label_include = []
docker_label_exclude = ["annotation*", "io.kubernetes*", "*description*", "*maintainer*", "*hash", "*author*"]
## Optional TLS Config
# use_tls = false
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false

View File

@@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@@ -0,0 +1,124 @@
# # collect interval
# interval = 15
# file: /proc/vmstat
[white_list]
oom_kill = 1
nr_free_pages = 0
nr_alloc_batch = 0
nr_inactive_anon = 0
nr_active_anon = 0
nr_inactive_file = 0
nr_active_file = 0
nr_unevictable = 0
nr_mlock = 0
nr_anon_pages = 0
nr_mapped = 0
nr_file_pages = 0
nr_dirty = 0
nr_writeback = 0
nr_slab_reclaimable = 0
nr_slab_unreclaimable = 0
nr_page_table_pages = 0
nr_kernel_stack = 0
nr_unstable = 0
nr_bounce = 0
nr_vmscan_write = 0
nr_vmscan_immediate_reclaim = 0
nr_writeback_temp = 0
nr_isolated_anon = 0
nr_isolated_file = 0
nr_shmem = 0
nr_dirtied = 0
nr_written = 0
numa_hit = 0
numa_miss = 0
numa_foreign = 0
numa_interleave = 0
numa_local = 0
numa_other = 0
workingset_refault = 0
workingset_activate = 0
workingset_nodereclaim = 0
nr_anon_transparent_hugepages = 0
nr_free_cma = 0
nr_dirty_threshold = 0
nr_dirty_background_threshold = 0
pgpgin = 0
pgpgout = 0
pswpin = 0
pswpout = 0
pgalloc_dma = 0
pgalloc_dma32 = 0
pgalloc_normal = 0
pgalloc_movable = 0
pgfree = 0
pgactivate = 0
pgdeactivate = 0
pgfault = 0
pgmajfault = 0
pglazyfreed = 0
pgrefill_dma = 0
pgrefill_dma32 = 0
pgrefill_normal = 0
pgrefill_movable = 0
pgsteal_kswapd_dma = 0
pgsteal_kswapd_dma32 = 0
pgsteal_kswapd_normal = 0
pgsteal_kswapd_movable = 0
pgsteal_direct_dma = 0
pgsteal_direct_dma32 = 0
pgsteal_direct_normal = 0
pgsteal_direct_movable = 0
pgscan_kswapd_dma = 0
pgscan_kswapd_dma32 = 0
pgscan_kswapd_normal = 0
pgscan_kswapd_movable = 0
pgscan_direct_dma = 0
pgscan_direct_dma32 = 0
pgscan_direct_normal = 0
pgscan_direct_movable = 0
pgscan_direct_throttle = 0
zone_reclaim_failed = 0
pginodesteal = 0
slabs_scanned = 0
kswapd_inodesteal = 0
kswapd_low_wmark_hit_quickly = 0
kswapd_high_wmark_hit_quickly = 0
pageoutrun = 0
allocstall = 0
pgrotated = 0
drop_pagecache = 0
drop_slab = 0
numa_pte_updates = 0
numa_huge_pte_updates = 0
numa_hint_faults = 0
numa_hint_faults_local = 0
numa_pages_migrated = 0
pgmigrate_success = 0
pgmigrate_fail = 0
compact_migrate_scanned = 0
compact_free_scanned = 0
compact_isolated = 0
compact_stall = 0
compact_fail = 0
compact_success = 0
htlb_buddy_alloc_success = 0
htlb_buddy_alloc_fail = 0
unevictable_pgs_culled = 0
unevictable_pgs_scanned = 0
unevictable_pgs_rescued = 0
unevictable_pgs_mlocked = 0
unevictable_pgs_munlocked = 0
unevictable_pgs_cleared = 0
unevictable_pgs_stranded = 0
thp_fault_alloc = 0
thp_fault_fallback = 0
thp_collapse_alloc = 0
thp_collapse_alloc_failed = 0
thp_split = 0
thp_zero_page_alloc = 0
thp_zero_page_alloc_failed = 0
balloon_inflate = 0
balloon_deflate = 0
balloon_migrate = 0

View File

@@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect platform specified metrics
collect_platform_fields = true

View File

@@ -0,0 +1,8 @@
# # collect interval
# interval = 15
# # whether collect protocol stats on Linux
# collect_protocol_stats = false
# # setting interfaces will tell categraf to gather these explicit interfaces
# interfaces = ["eth0"]

View File

@@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@@ -0,0 +1,8 @@
# # collect interval
# interval = 15
# # force use ps command to gather
# force_ps = false
# # force use /proc to gather
# force_proc = false

View File

@@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect metric: system_n_users
# collect_user_number = false

View File

@@ -0,0 +1,35 @@
[logs]
## key 占位符
api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
## 是否开启日志采集
enable = false
## 接受日志的server地址
send_to = "127.0.0.1:17878"
## 发送日志的协议 http/tcp
send_type = "http"
## 是否压缩发送
use_compress = false
## 是否采用ssl
send_with_tls = false
##
batch_wait = 5
## 日志offset信息保存目录
run_path = "/opt/categraf/run"
## 最多同时采集多少个日志文件
open_files_limit = 100
## 定期扫描目录下是否有新增日志
scan_period = 10
##
frame_size = 10
##
collect_container_all = true
## 全局的处理规则
[[logs.Processing_rules]]
## 单个日志采集配置
[[logs.items]]
## file/journald
type = "file"
## type=file时 path必填type=journald时 port必填
path = "/opt/tomcat/logs/*.txt"
source = "tomcat"
service = "my_service"

178
docker/docker-compose.yaml Normal file
View File

@@ -0,0 +1,178 @@
version: "3.7"
networks:
nightingale:
driver: bridge
services:
mysql:
image: "mysql:5.7"
container_name: mysql
hostname: mysql
restart: always
ports:
- "3306:3306"
environment:
TZ: Asia/Shanghai
MYSQL_ROOT_PASSWORD: 1234
volumes:
- ./mysqldata:/var/lib/mysql/
- ./initsql:/docker-entrypoint-initdb.d/
- ./mysqletc/my.cnf:/etc/my.cnf
networks:
- nightingale
redis:
image: "redis:6.2"
container_name: redis
hostname: redis
restart: always
ports:
- "6379:6379"
environment:
TZ: Asia/Shanghai
networks:
- nightingale
prometheus:
image: prom/prometheus
container_name: prometheus
hostname: prometheus
restart: always
environment:
TZ: Asia/Shanghai
volumes:
- ./prometc:/etc/prometheus
ports:
- "9090:9090"
networks:
- nightingale
command:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus"
- "--web.console.libraries=/usr/share/prometheus/console_libraries"
- "--web.console.templates=/usr/share/prometheus/consoles"
- "--enable-feature=remote-write-receiver"
- "--query.lookback-delta=2m"
ibex:
image: ulric2019/ibex:0.3
container_name: ibex
hostname: ibex
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
WAIT_HOSTS: mysql:3306
ports:
- "10090:10090"
- "20090:20090"
volumes:
- ./ibexetc:/app/etc
networks:
- nightingale
depends_on:
- mysql
links:
- mysql:mysql
command: >
sh -c "/wait && /app/ibex server"
nwebapi:
image: ulric2019/nightingale:5.9.4
container_name: nwebapi
hostname: nwebapi
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
WAIT_HOSTS: mysql:3306, redis:6379
volumes:
- ./n9eetc:/app/etc
ports:
- "18000:18000"
networks:
- nightingale
depends_on:
- mysql
- redis
- prometheus
- ibex
links:
- mysql:mysql
- redis:redis
- prometheus:prometheus
- ibex:ibex
command: >
sh -c "/wait && /app/n9e webapi"
nserver:
image: ulric2019/nightingale:5.9.4
container_name: nserver
hostname: nserver
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
WAIT_HOSTS: mysql:3306, redis:6379
volumes:
- ./n9eetc:/app/etc
ports:
- "19000:19000"
networks:
- nightingale
depends_on:
- mysql
- redis
- prometheus
- ibex
links:
- mysql:mysql
- redis:redis
- prometheus:prometheus
- ibex:ibex
command: >
sh -c "/wait && /app/n9e server"
categraf:
image: "flashcatcloud/categraf:v0.1.9"
container_name: "categraf"
hostname: "categraf01"
restart: always
environment:
TZ: Asia/Shanghai
HOST_PROC: /hostfs/proc
HOST_SYS: /hostfs/sys
HOST_MOUNT_PREFIX: /hostfs
volumes:
- ./categraf/conf:/etc/categraf/conf
- /:/hostfs
- /var/run/docker.sock:/var/run/docker.sock
ports:
- "8094:8094/tcp"
networks:
- nightingale
depends_on:
- nserver
links:
- nserver:nserver
agentd:
image: ulric2019/ibex:0.3
container_name: agentd
hostname: agentd
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
volumes:
- ./ibexetc:/app/etc
networks:
- nightingale
depends_on:
- ibex
links:
- ibex:ibex
command:
- "/app/ibex"
- "agentd"

View File

@@ -0,0 +1,38 @@
# debug, release
RunMode = "release"
# task meta storage dir
MetaDir = "./meta"
[HTTP]
Enable = true
# http listening address
Host = "0.0.0.0"
# http listening port
Port = 2090
# https cert file path
CertFile = ""
# https key file path
KeyFile = ""
# whether print access log
PrintAccessLog = true
# whether enable pprof
PProf = false
# http graceful shutdown timeout, unit: s
ShutdownTimeout = 30
# max content length: 64M
MaxContentLength = 67108864
# http server read timeout, unit: s
ReadTimeout = 20
# http server write timeout, unit: s
WriteTimeout = 40
# http server idle timeout, unit: s
IdleTimeout = 120
[Heartbeat]
# unit: ms
Interval = 1000
# rpc servers
Servers = ["ibex:20090"]
# $ip or $hostname or specified string
Host = "categraf01"

View File

@@ -0,0 +1,97 @@
# debug, release
RunMode = "release"
[Log]
# log write dir
Dir = "logs-server"
# log level: DEBUG INFO WARNING ERROR
Level = "DEBUG"
# stdout, stderr, file
Output = "stdout"
# # rotate by time
# KeepHours: 4
# # rotate by size
# RotateNum = 3
# # unit: MB
# RotateSize = 256
[HTTP]
Enable = true
# http listening address
Host = "0.0.0.0"
# http listening port
Port = 10090
# https cert file path
CertFile = ""
# https key file path
KeyFile = ""
# whether print access log
PrintAccessLog = true
# whether enable pprof
PProf = false
# http graceful shutdown timeout, unit: s
ShutdownTimeout = 30
# max content length: 64M
MaxContentLength = 67108864
# http server read timeout, unit: s
ReadTimeout = 20
# http server write timeout, unit: s
WriteTimeout = 40
# http server idle timeout, unit: s
IdleTimeout = 120
[BasicAuth]
# using when call apis
ibex = "ibex"
[RPC]
Listen = "0.0.0.0:20090"
[Heartbeat]
# auto detect if blank
IP = ""
# unit: ms
Interval = 1000
[Output]
# database | remote
ComeFrom = "database"
AgtdPort = 2090
[Gorm]
# enable debug mode or not
Debug = false
# mysql postgres
DBType = "mysql"
# unit: s
MaxLifetime = 7200
# max open connections
MaxOpenConns = 150
# max idle connections
MaxIdleConns = 50
# table prefix
TablePrefix = ""
[MySQL]
# mysql address host:port
Address = "mysql:3306"
# mysql username
User = "root"
# mysql password
Password = "1234"
# database name
DBName = "ibex"
# connection params
Parameters = "charset=utf8mb4&parseTime=True&loc=Local&allowNativePasswords=true"
[Postgres]
# pg address host:port
Address = "postgres:5432"
# pg user
User = "root"
# pg password
Password = "1234"
# database name
DBName = "ibex"
# ssl mode
SSLMode = "disable"

500
docker/initsql/a-n9e.sql Normal file
View File

@@ -0,0 +1,500 @@
set names utf8mb4;
drop database if exists n9e_v5;
create database n9e_v5;
use n9e_v5;
CREATE TABLE `users` (
`id` bigint unsigned not null auto_increment,
`username` varchar(64) not null comment 'login name, cannot rename',
`nickname` varchar(64) not null comment 'display name, chinese name',
`password` varchar(128) not null default '',
`phone` varchar(16) not null default '',
`email` varchar(64) not null default '',
`portrait` varchar(255) not null default '' comment 'portrait image url',
`roles` varchar(255) not null comment 'Admin | Standard | Guest, split by space',
`contacts` varchar(1024) comment 'json e.g. {wecom:xx, dingtalk_robot_token:yy}',
`maintainer` tinyint(1) not null default 0,
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`username`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into `users`(id, username, nickname, password, roles, create_at, create_by, update_at, update_by) values(1, 'root', '超管', 'root.2020', 'Admin', unix_timestamp(now()), 'system', unix_timestamp(now()), 'system');
CREATE TABLE `user_group` (
`id` bigint unsigned not null auto_increment,
`name` varchar(128) not null default '',
`note` varchar(255) not null default '',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`create_by`),
KEY (`update_at`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into user_group(id, name, create_at, create_by, update_at, update_by) values(1, 'demo-root-group', unix_timestamp(now()), 'root', unix_timestamp(now()), 'root');
CREATE TABLE `user_group_member` (
`group_id` bigint unsigned not null,
`user_id` bigint unsigned not null,
KEY (`group_id`),
KEY (`user_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into user_group_member(group_id, user_id) values(1, 1);
CREATE TABLE `configs` (
`id` bigint unsigned not null auto_increment,
`ckey` varchar(191) not null,
`cval` varchar(1024) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`ckey`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `role` (
`id` bigint unsigned not null auto_increment,
`name` varchar(191) not null default '',
`note` varchar(255) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into `role`(name, note) values('Admin', 'Administrator role');
insert into `role`(name, note) values('Standard', 'Ordinary user role');
insert into `role`(name, note) values('Guest', 'Readonly user role');
CREATE TABLE `role_operation`(
`role_name` varchar(128) not null,
`operation` varchar(191) not null,
KEY (`role_name`),
KEY (`operation`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- Admin is special, who has no concrete operation but can do anything.
insert into `role_operation`(role_name, operation) values('Guest', '/metric/explorer');
insert into `role_operation`(role_name, operation) values('Guest', '/object/explorer');
insert into `role_operation`(role_name, operation) values('Guest', '/help/version');
insert into `role_operation`(role_name, operation) values('Guest', '/help/contact');
insert into `role_operation`(role_name, operation) values('Standard', '/metric/explorer');
insert into `role_operation`(role_name, operation) values('Standard', '/object/explorer');
insert into `role_operation`(role_name, operation) values('Standard', '/help/version');
insert into `role_operation`(role_name, operation) values('Standard', '/help/contact');
insert into `role_operation`(role_name, operation) values('Standard', '/users');
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups');
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups/add');
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups/put');
insert into `role_operation`(role_name, operation) values('Standard', '/user-groups/del');
insert into `role_operation`(role_name, operation) values('Standard', '/busi-groups');
insert into `role_operation`(role_name, operation) values('Standard', '/busi-groups/add');
insert into `role_operation`(role_name, operation) values('Standard', '/busi-groups/put');
insert into `role_operation`(role_name, operation) values('Standard', '/busi-groups/del');
insert into `role_operation`(role_name, operation) values('Standard', '/targets');
insert into `role_operation`(role_name, operation) values('Standard', '/targets/add');
insert into `role_operation`(role_name, operation) values('Standard', '/targets/put');
insert into `role_operation`(role_name, operation) values('Standard', '/targets/del');
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards');
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards/add');
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards/put');
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards/del');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules/add');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules/put');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules/del');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-mutes');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-mutes/add');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-mutes/del');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-subscribes');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-subscribes/add');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-subscribes/put');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-subscribes/del');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-cur-events');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-cur-events/del');
insert into `role_operation`(role_name, operation) values('Standard', '/alert-his-events');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tpls');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tpls/add');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tpls/put');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tpls/del');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks/add');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks/put');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/add');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/put');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/del');
-- for alert_rule | collect_rule | mute | dashboard grouping
CREATE TABLE `busi_group` (
`id` bigint unsigned not null auto_increment,
`name` varchar(191) not null,
`label_enable` tinyint(1) not null default 0,
`label_value` varchar(191) not null default '' comment 'if label_enable: label_value can not be blank',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into busi_group(id, name, create_at, create_by, update_at, update_by) values(1, 'Default Busi Group', unix_timestamp(now()), 'root', unix_timestamp(now()), 'root');
CREATE TABLE `busi_group_member` (
`id` bigint unsigned not null auto_increment,
`busi_group_id` bigint not null comment 'busi group id',
`user_group_id` bigint not null comment 'user group id',
`perm_flag` char(2) not null comment 'ro | rw',
PRIMARY KEY (`id`),
KEY (`busi_group_id`),
KEY (`user_group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into busi_group_member(busi_group_id, user_group_id, perm_flag) values(1, 1, "rw");
-- for dashboard new version
CREATE TABLE `board` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`name` varchar(191) not null,
`tags` varchar(255) not null comment 'split by space',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`group_id`, `name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- for dashboard new version
CREATE TABLE `board_payload` (
`id` bigint unsigned not null comment 'dashboard id',
`payload` mediumtext not null,
UNIQUE KEY (`id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- deprecated
CREATE TABLE `dashboard` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`name` varchar(191) not null,
`tags` varchar(255) not null comment 'split by space',
`configs` varchar(8192) comment 'dashboard variables',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`group_id`, `name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- deprecated
-- auto create the first subclass 'Default chart group' of dashboard
CREATE TABLE `chart_group` (
`id` bigint unsigned not null auto_increment,
`dashboard_id` bigint unsigned not null,
`name` varchar(255) not null,
`weight` int not null default 0,
PRIMARY KEY (`id`),
KEY (`dashboard_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- deprecated
CREATE TABLE `chart` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint unsigned not null comment 'chart group id',
`configs` text,
`weight` int not null default 0,
PRIMARY KEY (`id`),
KEY (`group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `chart_share` (
`id` bigint unsigned not null auto_increment,
`cluster` varchar(128) not null,
`configs` text,
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
primary key (`id`),
key (`create_at`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_rule` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`cluster` varchar(128) not null,
`name` varchar(255) not null,
`note` varchar(1024) not null default '',
`prod` varchar(255) not null default '',
`algorithm` varchar(255) not null default '',
`algo_params` varchar(255),
`delay` int not null default 0,
`severity` tinyint(1) not null comment '1:Emergency 2:Warning 3:Notice',
`disabled` tinyint(1) not null comment '0:enabled 1:disabled',
`prom_for_duration` int not null comment 'prometheus for, unit:s',
`prom_ql` text not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`enable_stime` char(5) not null default '00:00',
`enable_etime` char(5) not null default '23:59',
`enable_days_of_week` varchar(32) not null default '' comment 'split by space: 0 1 2 3 4 5 6',
`enable_in_bg` tinyint(1) not null default 0 comment '1: only this bg 0: global',
`notify_recovered` tinyint(1) not null comment 'whether notify when recovery',
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_repeat_step` int not null default 0 comment 'unit: min',
`notify_max_number` int not null default 0 comment '',
`recover_duration` int not null default 0 comment 'unit: s',
`callbacks` varchar(255) not null default '' comment 'split by space: http://a.com/api/x http://a.com/api/y',
`runbook_url` varchar(255),
`append_tags` varchar(255) not null default '' comment 'split by space: service=n9e mod=api',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`group_id`),
KEY (`update_at`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_mute` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`prod` varchar(255) not null default '',
`cluster` varchar(128) not null,
`tags` varchar(4096) not null default '' comment 'json,map,tagkey->regexp|value',
`cause` varchar(255) not null default '',
`btime` bigint not null default 0 comment 'begin time',
`etime` bigint not null default 0 comment 'end time',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`create_at`),
KEY (`group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_subscribe` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`cluster` varchar(128) not null,
`rule_id` bigint not null default 0,
`tags` varchar(4096) not null default '' comment 'json,map,tagkey->regexp|value',
`redefine_severity` tinyint(1) default 0 comment 'is redefine severity?',
`new_severity` tinyint(1) not null comment '0:Emergency 1:Warning 2:Notice',
`redefine_channels` tinyint(1) default 0 comment 'is redefine channels?',
`new_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`user_group_ids` varchar(250) not null comment 'split by space 1 34 5, notify cc to user_group_ids',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`update_at`),
KEY (`group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `target` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`cluster` varchar(128) not null comment 'append to alert event as field',
`ident` varchar(191) not null comment 'target id',
`note` varchar(255) not null default '' comment 'append to alert event as field',
`tags` varchar(512) not null default '' comment 'append to series data as tags, split by space, append external space at suffix',
`update_at` bigint not null default 0,
PRIMARY KEY (`id`),
UNIQUE KEY (`ident`),
KEY (`group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- case1: target_idents; case2: target_tags
-- CREATE TABLE `collect_rule` (
-- `id` bigint unsigned not null auto_increment,
-- `group_id` bigint not null default 0 comment 'busi group id',
-- `cluster` varchar(128) not null,
-- `target_idents` varchar(512) not null default '' comment 'ident list, split by space',
-- `target_tags` varchar(512) not null default '' comment 'filter targets by tags, split by space',
-- `name` varchar(191) not null default '',
-- `note` varchar(255) not null default '',
-- `step` int not null,
-- `type` varchar(64) not null comment 'e.g. port proc log plugin',
-- `data` text not null,
-- `append_tags` varchar(255) not null default '' comment 'split by space: e.g. mod=n9e dept=cloud',
-- `create_at` bigint not null default 0,
-- `create_by` varchar(64) not null default '',
-- `update_at` bigint not null default 0,
-- `update_by` varchar(64) not null default '',
-- PRIMARY KEY (`id`),
-- KEY (`group_id`, `type`, `name`)
-- ) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `metric_view` (
`id` bigint unsigned not null auto_increment,
`name` varchar(191) not null default '',
`cate` tinyint(1) not null comment '0: preset 1: custom',
`configs` varchar(8192) not null default '',
`create_at` bigint not null default 0,
`create_by` bigint not null default 0 comment 'user id',
`update_at` bigint not null default 0,
PRIMARY KEY (`id`),
KEY (`create_by`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
insert into metric_view(name, cate, configs) values('Host View', 0, '{"filters":[{"oper":"=","label":"__name__","value":"cpu_usage_idle"}],"dynamicLabels":[],"dimensionLabels":[{"label":"ident","value":""}]}');
CREATE TABLE `recording_rule` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default '0' comment 'group_id',
`cluster` varchar(128) not null,
`name` varchar(255) not null comment 'new metric name',
`note` varchar(255) not null comment 'rule note',
`disabled` tinyint(1) not null comment '0:enabled 1:disabled',
`prom_ql` varchar(8192) not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`append_tags` varchar(255) default '' comment 'split by space: service=n9e mod=api',
`create_at` bigint default '0',
`create_by` varchar(64) default '',
`update_at` bigint default '0',
`update_by` varchar(64) default '',
PRIMARY KEY (`id`),
KEY `group_id` (`group_id`),
KEY `update_at` (`update_at`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_aggr_view` (
`id` bigint unsigned not null auto_increment,
`name` varchar(191) not null default '',
`rule` varchar(2048) not null default '',
`cate` tinyint(1) not null comment '0: preset 1: custom',
`create_at` bigint not null default 0,
`create_by` bigint not null default 0 comment 'user id',
`update_at` bigint not null default 0,
PRIMARY KEY (`id`),
KEY (`create_by`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
insert into alert_aggr_view(name, rule, cate) values('By BusiGroup, Severity', 'field:group_name::field:severity', 0);
insert into alert_aggr_view(name, rule, cate) values('By RuleName', 'field:rule_name', 0);
CREATE TABLE `alert_cur_event` (
`id` bigint unsigned not null comment 'use alert_his_event.id',
`cluster` varchar(128) not null,
`group_id` bigint unsigned not null comment 'busi group id of rule',
`group_name` varchar(255) not null default '' comment 'busi group name',
`hash` varchar(64) not null comment 'rule_id + vector_pk',
`rule_id` bigint unsigned not null,
`rule_name` varchar(255) not null,
`rule_note` varchar(2048) not null default 'alert rule note',
`rule_prod` varchar(255) not null default '',
`rule_algo` varchar(255) not null default '',
`severity` tinyint(1) not null comment '0:Emergency 1:Warning 2:Notice',
`prom_for_duration` int not null comment 'prometheus for, unit:s',
`prom_ql` varchar(8192) not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`callbacks` varchar(255) not null default '' comment 'split by space: http://a.com/api/x http://a.com/api/y',
`runbook_url` varchar(255),
`notify_recovered` tinyint(1) not null comment 'whether notify when recovery',
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_repeat_next` bigint not null default 0 comment 'next timestamp to notify, get repeat settings from rule',
`notify_cur_number` int not null default 0 comment '',
`target_ident` varchar(191) not null default '' comment 'target ident, also in tags',
`target_note` varchar(191) not null default '' comment 'target note',
`trigger_time` bigint not null,
`trigger_value` varchar(255) not null,
`tags` varchar(1024) not null default '' comment 'merge data_tags rule_tags, split by ,,',
PRIMARY KEY (`id`),
KEY (`hash`),
KEY (`rule_id`),
KEY (`trigger_time`, `group_id`),
KEY (`notify_repeat_next`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_his_event` (
`id` bigint unsigned not null AUTO_INCREMENT,
`is_recovered` tinyint(1) not null,
`cluster` varchar(128) not null,
`group_id` bigint unsigned not null comment 'busi group id of rule',
`group_name` varchar(255) not null default '' comment 'busi group name',
`hash` varchar(64) not null comment 'rule_id + vector_pk',
`rule_id` bigint unsigned not null,
`rule_name` varchar(255) not null,
`rule_note` varchar(2048) not null default 'alert rule note',
`rule_prod` varchar(255) not null default '',
`rule_algo` varchar(255) not null default '',
`severity` tinyint(1) not null comment '0:Emergency 1:Warning 2:Notice',
`prom_for_duration` int not null comment 'prometheus for, unit:s',
`prom_ql` varchar(8192) not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`callbacks` varchar(255) not null default '' comment 'split by space: http://a.com/api/x http://a.com/api/y',
`runbook_url` varchar(255),
`notify_recovered` tinyint(1) not null comment 'whether notify when recovery',
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_cur_number` int not null default 0 comment '',
`target_ident` varchar(191) not null default '' comment 'target ident, also in tags',
`target_note` varchar(191) not null default '' comment 'target note',
`trigger_time` bigint not null,
`trigger_value` varchar(255) not null,
`recover_time` bigint not null default 0,
`last_eval_time` bigint not null default 0 comment 'for time filter',
`tags` varchar(1024) not null default '' comment 'merge data_tags rule_tags, split by ,,',
PRIMARY KEY (`id`),
KEY (`hash`),
KEY (`rule_id`),
KEY (`trigger_time`, `group_id`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `task_tpl`
(
`id` int unsigned NOT NULL AUTO_INCREMENT,
`group_id` int unsigned not null comment 'busi group id',
`title` varchar(255) not null default '',
`account` varchar(64) not null,
`batch` int unsigned not null default 0,
`tolerance` int unsigned not null default 0,
`timeout` int unsigned not null default 0,
`pause` varchar(255) not null default '',
`script` text not null,
`args` varchar(512) not null default '',
`tags` varchar(255) not null default '' comment 'split by space',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`group_id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `task_tpl_host`
(
`ii` int unsigned NOT NULL AUTO_INCREMENT,
`id` int unsigned not null comment 'task tpl id',
`host` varchar(128) not null comment 'ip or hostname',
PRIMARY KEY (`ii`),
KEY (`id`, `host`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `task_record`
(
`id` bigint unsigned not null comment 'ibex task id',
`group_id` bigint not null comment 'busi group id',
`ibex_address` varchar(128) not null,
`ibex_auth_user` varchar(128) not null default '',
`ibex_auth_pass` varchar(128) not null default '',
`title` varchar(255) not null default '',
`account` varchar(64) not null,
`batch` int unsigned not null default 0,
`tolerance` int unsigned not null default 0,
`timeout` int unsigned not null default 0,
`pause` varchar(255) not null default '',
`script` text not null,
`args` varchar(512) not null default '',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`create_at`, `group_id`),
KEY (`create_by`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

1362
docker/initsql/b-ibex.sql Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
GRANT ALL ON *.* TO 'root'@'127.0.0.1' IDENTIFIED BY '1234';
GRANT ALL ON *.* TO 'root'@'localhost' IDENTIFIED BY '1234';
GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY '1234';

View File

@@ -0,0 +1,551 @@
-- n9e version 5.8 for postgres
CREATE TABLE users (
id bigserial,
username varchar(64) not null ,
nickname varchar(64) not null ,
password varchar(128) not null default '',
phone varchar(16) not null default '',
email varchar(64) not null default '',
portrait varchar(255) not null default '' ,
roles varchar(255) not null ,
contacts varchar(1024) ,
maintainer smallint not null default 0,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE users ADD CONSTRAINT users_pk PRIMARY KEY (id);
ALTER TABLE users ADD CONSTRAINT users_un UNIQUE (username);
COMMENT ON COLUMN users.username IS 'login name, cannot rename';
COMMENT ON COLUMN users.nickname IS 'display name, chinese name';
COMMENT ON COLUMN users.portrait IS 'portrait image url';
COMMENT ON COLUMN users.roles IS 'Admin | Standard | Guest, split by space';
COMMENT ON COLUMN users.contacts IS 'json e.g. {wecom:xx, dingtalk_robot_token:yy}';
insert into users(id, username, nickname, password, roles, create_at, create_by, update_at, update_by) values(1, 'root', '超管', 'root.2020', 'Admin', date_part('epoch',current_timestamp)::int, 'system', date_part('epoch',current_timestamp)::int, 'system');
CREATE TABLE user_group (
id bigserial,
name varchar(128) not null default '',
note varchar(255) not null default '',
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE user_group ADD CONSTRAINT user_group_pk PRIMARY KEY (id);
CREATE INDEX user_group_create_by_idx ON user_group (create_by);
CREATE INDEX user_group_update_at_idx ON user_group (update_at);
insert into user_group(id, name, create_at, create_by, update_at, update_by) values(1, 'demo-root-group', date_part('epoch',current_timestamp)::int, 'root', date_part('epoch',current_timestamp)::int, 'root');
CREATE TABLE user_group_member (
group_id bigint not null,
user_id bigint not null
) ;
CREATE INDEX user_group_member_group_id_idx ON user_group_member (group_id);
CREATE INDEX user_group_member_user_id_idx ON user_group_member (user_id);
insert into user_group_member(group_id, user_id) values(1, 1);
CREATE TABLE configs (
id bigserial,
ckey varchar(191) not null,
cval varchar(1024) not null default ''
) ;
ALTER TABLE configs ADD CONSTRAINT configs_pk PRIMARY KEY (id);
ALTER TABLE configs ADD CONSTRAINT configs_un UNIQUE (ckey);
CREATE TABLE role (
id bigserial,
name varchar(191) not null default '',
note varchar(255) not null default ''
) ;
ALTER TABLE role ADD CONSTRAINT role_pk PRIMARY KEY (id);
ALTER TABLE role ADD CONSTRAINT role_un UNIQUE ("name");
insert into role(name, note) values('Admin', 'Administrator role');
insert into role(name, note) values('Standard', 'Ordinary user role');
insert into role(name, note) values('Guest', 'Readonly user role');
CREATE TABLE role_operation(
role_name varchar(128) not null,
operation varchar(191) not null
) ;
CREATE INDEX role_operation_role_name_idx ON role_operation (role_name);
CREATE INDEX role_operation_operation_idx ON role_operation (operation);
-- Admin is special, who has no concrete operation but can do anything.
insert into role_operation(role_name, operation) values('Guest', '/metric/explorer');
insert into role_operation(role_name, operation) values('Guest', '/object/explorer');
insert into role_operation(role_name, operation) values('Guest', '/help/version');
insert into role_operation(role_name, operation) values('Guest', '/help/contact');
insert into role_operation(role_name, operation) values('Standard', '/metric/explorer');
insert into role_operation(role_name, operation) values('Standard', '/object/explorer');
insert into role_operation(role_name, operation) values('Standard', '/help/version');
insert into role_operation(role_name, operation) values('Standard', '/help/contact');
insert into role_operation(role_name, operation) values('Standard', '/users');
insert into role_operation(role_name, operation) values('Standard', '/user-groups');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/add');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/put');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/del');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/add');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/put');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/del');
insert into role_operation(role_name, operation) values('Standard', '/targets');
insert into role_operation(role_name, operation) values('Standard', '/targets/add');
insert into role_operation(role_name, operation) values('Standard', '/targets/put');
insert into role_operation(role_name, operation) values('Standard', '/targets/del');
insert into role_operation(role_name, operation) values('Standard', '/dashboards');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/add');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/put');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/put');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/put');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-cur-events');
insert into role_operation(role_name, operation) values('Standard', '/alert-cur-events/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-his-events');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/add');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/put');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/del');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks/add');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks/put');
-- for alert_rule | collect_rule | mute | dashboard grouping
CREATE TABLE busi_group (
id bigserial,
name varchar(191) not null,
label_enable smallint not null default 0,
label_value varchar(191) not null default '' ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE busi_group ADD CONSTRAINT busi_group_pk PRIMARY KEY (id);
ALTER TABLE busi_group ADD CONSTRAINT busi_group_un UNIQUE ("name");
COMMENT ON COLUMN busi_group.label_value IS 'if label_enable: label_value can not be blank';
insert into busi_group(id, name, create_at, create_by, update_at, update_by) values(1, 'Default Busi Group', date_part('epoch',current_timestamp)::int, 'root', date_part('epoch',current_timestamp)::int, 'root');
CREATE TABLE busi_group_member (
id bigserial,
busi_group_id bigint not null ,
user_group_id bigint not null ,
perm_flag char(2) not null
) ;
ALTER TABLE busi_group_member ADD CONSTRAINT busi_group_member_pk PRIMARY KEY (id);
CREATE INDEX busi_group_member_busi_group_id_idx ON busi_group_member (busi_group_id);
CREATE INDEX busi_group_member_user_group_id_idx ON busi_group_member (user_group_id);
COMMENT ON COLUMN busi_group_member.busi_group_id IS 'busi group id';
COMMENT ON COLUMN busi_group_member.user_group_id IS 'user group id';
COMMENT ON COLUMN busi_group_member.perm_flag IS 'ro | rw';
insert into busi_group_member(busi_group_id, user_group_id, perm_flag) values(1, 1, 'rw');
-- for dashboard new version
CREATE TABLE board (
id bigserial not null ,
group_id bigint not null default 0 ,
name varchar(191) not null,
tags varchar(255) not null ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE board ADD CONSTRAINT board_pk PRIMARY KEY (id);
ALTER TABLE board ADD CONSTRAINT board_un UNIQUE (group_id,"name");
COMMENT ON COLUMN board.group_id IS 'busi group id';
COMMENT ON COLUMN board.tags IS 'split by space';
-- for dashboard new version
CREATE TABLE board_payload (
id bigint not null ,
payload text not null
);
ALTER TABLE board_payload ADD CONSTRAINT board_payload_un UNIQUE (id);
COMMENT ON COLUMN board_payload.id IS 'dashboard id';
-- deprecated
CREATE TABLE dashboard (
id bigserial,
group_id bigint not null default 0 ,
name varchar(191) not null,
tags varchar(255) not null ,
configs varchar(8192) ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE dashboard ADD CONSTRAINT dashboard_pk PRIMARY KEY (id);
-- deprecated
-- auto create the first subclass 'Default chart group' of dashboard
CREATE TABLE chart_group (
id bigserial,
dashboard_id bigint not null,
name varchar(255) not null,
weight int not null default 0
) ;
ALTER TABLE chart_group ADD CONSTRAINT chart_group_pk PRIMARY KEY (id);
-- deprecated
CREATE TABLE chart (
id bigserial,
group_id bigint not null ,
configs text,
weight int not null default 0
) ;
ALTER TABLE chart ADD CONSTRAINT chart_pk PRIMARY KEY (id);
CREATE TABLE chart_share (
id bigserial,
cluster varchar(128) not null,
configs text,
create_at bigint not null default 0,
create_by varchar(64) not null default ''
) ;
ALTER TABLE chart_share ADD CONSTRAINT chart_share_pk PRIMARY KEY (id);
CREATE INDEX chart_share_create_at_idx ON chart_share (create_at);
CREATE TABLE alert_rule (
id bigserial NOT NULL,
group_id int8 NOT NULL DEFAULT 0,
"cluster" varchar(128) NOT NULL,
"name" varchar(255) NOT NULL,
note varchar(1024) NOT NULL,
severity int2 NOT NULL,
disabled int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql text NOT NULL,
prom_eval_interval int4 NOT NULL,
enable_stime bpchar(5) NOT NULL DEFAULT '00:00'::bpchar,
enable_etime bpchar(5) NOT NULL DEFAULT '23:59'::bpchar,
enable_days_of_week varchar(32) NOT NULL DEFAULT ''::character varying,
enable_in_bg int2 NOT NULL DEFAULT 0,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_repeat_step int4 NOT NULL DEFAULT 0,
notify_max_number int4 not null default 0,
recover_duration int4 NOT NULL DEFAULT 0,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
append_tags varchar(255) NOT NULL DEFAULT ''::character varying,
create_at int8 NOT NULL DEFAULT 0,
create_by varchar(64) NOT NULL DEFAULT ''::character varying,
update_at int8 NOT NULL DEFAULT 0,
update_by varchar(64) NOT NULL DEFAULT ''::character varying,
prod varchar(255) NOT NULL DEFAULT ''::character varying,
algorithm varchar(255) NOT NULL DEFAULT ''::character varying,
algo_params varchar(255) NULL,
delay int4 NOT NULL DEFAULT 0,
CONSTRAINT alert_rule_pk PRIMARY KEY (id)
);
CREATE INDEX alert_rule_group_id_idx ON alert_rule USING btree (group_id);
CREATE INDEX alert_rule_update_at_idx ON alert_rule USING btree (update_at);
COMMENT ON COLUMN alert_rule.group_id IS 'busi group id';
COMMENT ON COLUMN alert_rule.severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_rule.disabled IS '0:enabled 1:disabled';
COMMENT ON COLUMN alert_rule.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_rule.prom_ql IS 'promql';
COMMENT ON COLUMN alert_rule.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_rule.enable_stime IS '00:00';
COMMENT ON COLUMN alert_rule.enable_etime IS '23:59';
COMMENT ON COLUMN alert_rule.enable_days_of_week IS 'split by space: 0 1 2 3 4 5 6';
COMMENT ON COLUMN alert_rule.enable_in_bg IS '1: only this bg 0: global';
COMMENT ON COLUMN alert_rule.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_rule.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_rule.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_rule.notify_repeat_step IS 'unit: min';
COMMENT ON COLUMN alert_rule.recover_duration IS 'unit: s';
COMMENT ON COLUMN alert_rule.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_rule.append_tags IS 'split by space: service=n9e mod=api';
CREATE TABLE alert_mute (
id bigserial,
group_id bigint not null default 0 ,
prod varchar(255) NOT NULL DEFAULT '' ,
cluster varchar(128) not null,
tags varchar(4096) not null default '' ,
cause varchar(255) not null default '',
btime bigint not null default 0 ,
etime bigint not null default 0 ,
create_at bigint not null default 0,
create_by varchar(64) not null default ''
) ;
ALTER TABLE alert_mute ADD CONSTRAINT alert_mute_pk PRIMARY KEY (id);
CREATE INDEX alert_mute_create_at_idx ON alert_mute (create_at);
CREATE INDEX alert_mute_group_id_idx ON alert_mute (group_id);
COMMENT ON COLUMN alert_mute.group_id IS 'busi group id';
COMMENT ON COLUMN alert_mute.tags IS 'json,map,tagkey->regexp|value';
COMMENT ON COLUMN alert_mute.btime IS 'begin time';
COMMENT ON COLUMN alert_mute.etime IS 'end time';
CREATE TABLE alert_subscribe (
id bigserial,
group_id bigint not null default 0 ,
cluster varchar(128) not null,
rule_id bigint not null default 0,
tags jsonb not null ,
redefine_severity smallint default 0 ,
new_severity smallint not null ,
redefine_channels smallint default 0 ,
new_channels varchar(255) not null default '' ,
user_group_ids varchar(250) not null ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE alert_subscribe ADD CONSTRAINT alert_subscribe_pk PRIMARY KEY (id);
CREATE INDEX alert_subscribe_group_id_idx ON alert_subscribe (group_id);
CREATE INDEX alert_subscribe_update_at_idx ON alert_subscribe (update_at);
COMMENT ON COLUMN alert_subscribe.group_id IS 'busi group id';
COMMENT ON COLUMN alert_subscribe.tags IS 'json,map,tagkey->regexp|value';
COMMENT ON COLUMN alert_subscribe.redefine_severity IS 'is redefine severity?';
COMMENT ON COLUMN alert_subscribe.new_severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_subscribe.redefine_channels IS 'is redefine channels?';
COMMENT ON COLUMN alert_subscribe.new_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_subscribe.user_group_ids IS 'split by space 1 34 5, notify cc to user_group_ids';
CREATE TABLE target (
id bigserial,
group_id bigint not null default 0 ,
cluster varchar(128) not null ,
ident varchar(191) not null ,
note varchar(255) not null default '' ,
tags varchar(512) not null default '' ,
update_at bigint not null default 0
) ;
ALTER TABLE target ADD CONSTRAINT target_pk PRIMARY KEY (id);
ALTER TABLE target ADD CONSTRAINT target_un UNIQUE (ident);
CREATE INDEX target_group_id_idx ON target (group_id);
COMMENT ON COLUMN target.group_id IS 'busi group id';
COMMENT ON COLUMN target."cluster" IS 'append to alert event as field';
COMMENT ON COLUMN target.ident IS 'target id';
COMMENT ON COLUMN target.note IS 'append to alert event as field';
COMMENT ON COLUMN target.tags IS 'append to series data as tags, split by space, append external space at suffix';
CREATE TABLE metric_view (
id bigserial,
name varchar(191) not null default '',
cate smallint not null ,
configs varchar(8192) not null default '',
create_at bigint not null default 0,
create_by bigint not null default 0 ,
update_at bigint not null default 0
) ;
ALTER TABLE metric_view ADD CONSTRAINT metric_view_pk PRIMARY KEY (id);
CREATE INDEX metric_view_create_by_idx ON metric_view (create_by);
COMMENT ON COLUMN metric_view.cate IS '0: preset 1: custom';
COMMENT ON COLUMN metric_view.create_by IS 'user id';
insert into metric_view(name, cate, configs) values('Host View', 0, '{"filters":[{"oper":"=","label":"__name__","value":"cpu_usage_idle"}],"dynamicLabels":[],"dimensionLabels":[{"label":"ident","value":""}]}');
CREATE TABLE alert_aggr_view (
id bigserial,
name varchar(191) not null default '',
rule varchar(2048) not null default '',
cate smallint not null ,
create_at bigint not null default 0,
create_by bigint not null default 0 ,
update_at bigint not null default 0
) ;
ALTER TABLE alert_aggr_view ADD CONSTRAINT alert_aggr_view_pk PRIMARY KEY (id);
CREATE INDEX alert_aggr_view_create_by_idx ON alert_aggr_view (create_by);
COMMENT ON COLUMN alert_aggr_view.cate IS '0: preset 1: custom';
COMMENT ON COLUMN alert_aggr_view.create_by IS 'user id';
insert into alert_aggr_view(name, rule, cate) values('By BusiGroup, Severity', 'field:group_name::field:severity', 0);
insert into alert_aggr_view(name, rule, cate) values('By RuleName', 'field:rule_name', 0);
CREATE TABLE alert_cur_event (
id bigserial NOT NULL,
"cluster" varchar(128) NOT NULL,
group_id int8 NOT NULL,
group_name varchar(255) NOT NULL DEFAULT ''::character varying,
hash varchar(64) NOT NULL,
rule_id int8 NOT NULL,
rule_name varchar(255) NOT NULL,
rule_note varchar(2048) NOT NULL DEFAULT 'alert rule note'::character varying,
severity int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql varchar(8192) NOT NULL,
prom_eval_interval int4 NOT NULL,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_repeat_next int8 NOT NULL DEFAULT 0,
notify_cur_number int4 not null default 0,
target_ident varchar(191) NOT NULL DEFAULT ''::character varying,
target_note varchar(191) NOT NULL DEFAULT ''::character varying,
trigger_time int8 NOT NULL,
trigger_value varchar(255) NOT NULL,
tags varchar(1024) NOT NULL DEFAULT ''::character varying,
rule_prod varchar(255) NOT NULL DEFAULT ''::character varying,
rule_algo varchar(255) NOT NULL DEFAULT ''::character varying,
CONSTRAINT alert_cur_event_pk PRIMARY KEY (id)
);
CREATE INDEX alert_cur_event_hash_idx ON alert_cur_event USING btree (hash);
CREATE INDEX alert_cur_event_notify_repeat_next_idx ON alert_cur_event USING btree (notify_repeat_next);
CREATE INDEX alert_cur_event_rule_id_idx ON alert_cur_event USING btree (rule_id);
CREATE INDEX alert_cur_event_trigger_time_idx ON alert_cur_event USING btree (trigger_time, group_id);
COMMENT ON COLUMN alert_cur_event.id IS 'use alert_his_event.id';
COMMENT ON COLUMN alert_cur_event.group_id IS 'busi group id of rule';
COMMENT ON COLUMN alert_cur_event.group_name IS 'busi group name';
COMMENT ON COLUMN alert_cur_event.hash IS 'rule_id + vector_pk';
COMMENT ON COLUMN alert_cur_event.rule_note IS 'alert rule note';
COMMENT ON COLUMN alert_cur_event.severity IS '1:Emergency 2:Warning 3:Notice';
COMMENT ON COLUMN alert_cur_event.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_cur_event.prom_ql IS 'promql';
COMMENT ON COLUMN alert_cur_event.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_cur_event.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_cur_event.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_cur_event.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_cur_event.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_cur_event.notify_repeat_next IS 'next timestamp to notify, get repeat settings from rule';
COMMENT ON COLUMN alert_cur_event.target_ident IS 'target ident, also in tags';
COMMENT ON COLUMN alert_cur_event.target_note IS 'target note';
COMMENT ON COLUMN alert_cur_event.tags IS 'merge data_tags rule_tags, split by ,,';
CREATE TABLE alert_his_event (
id bigserial NOT NULL,
is_recovered int2 NOT NULL,
"cluster" varchar(128) NOT NULL,
group_id int8 NOT NULL,
group_name varchar(255) NOT NULL DEFAULT ''::character varying,
hash varchar(64) NOT NULL,
rule_id int8 NOT NULL,
rule_name varchar(255) NOT NULL,
rule_note varchar(2048) NOT NULL DEFAULT 'alert rule note'::character varying,
severity int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql varchar(8192) NOT NULL,
prom_eval_interval int4 NOT NULL,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_cur_number int4 not null default 0,
target_ident varchar(191) NOT NULL DEFAULT ''::character varying,
target_note varchar(191) NOT NULL DEFAULT ''::character varying,
trigger_time int8 NOT NULL,
trigger_value varchar(255) NOT NULL,
recover_time int8 NOT NULL DEFAULT 0,
last_eval_time int8 NOT NULL DEFAULT 0,
tags varchar(1024) NOT NULL DEFAULT ''::character varying,
rule_prod varchar(255) NOT NULL DEFAULT ''::character varying,
rule_algo varchar(255) NOT NULL DEFAULT ''::character varying,
CONSTRAINT alert_his_event_pk PRIMARY KEY (id)
);
CREATE INDEX alert_his_event_hash_idx ON alert_his_event USING btree (hash);
CREATE INDEX alert_his_event_rule_id_idx ON alert_his_event USING btree (rule_id);
CREATE INDEX alert_his_event_trigger_time_idx ON alert_his_event USING btree (trigger_time, group_id);
COMMENT ON COLUMN alert_his_event.group_id IS 'busi group id of rule';
COMMENT ON COLUMN alert_his_event.group_name IS 'busi group name';
COMMENT ON COLUMN alert_his_event.hash IS 'rule_id + vector_pk';
COMMENT ON COLUMN alert_his_event.rule_note IS 'alert rule note';
COMMENT ON COLUMN alert_his_event.severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_his_event.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_his_event.prom_ql IS 'promql';
COMMENT ON COLUMN alert_his_event.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_his_event.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_his_event.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_his_event.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_his_event.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_his_event.target_ident IS 'target ident, also in tags';
COMMENT ON COLUMN alert_his_event.target_note IS 'target note';
COMMENT ON COLUMN alert_his_event.last_eval_time IS 'for time filter';
COMMENT ON COLUMN alert_his_event.tags IS 'merge data_tags rule_tags, split by ,,';
CREATE TABLE task_tpl
(
id serial,
group_id int not null ,
title varchar(255) not null default '',
account varchar(64) not null,
batch int not null default 0,
tolerance int not null default 0,
timeout int not null default 0,
pause varchar(255) not null default '',
script text not null,
args varchar(512) not null default '',
tags varchar(255) not null default '' ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE task_tpl ADD CONSTRAINT task_tpl_pk PRIMARY KEY (id);
CREATE INDEX task_tpl_group_id_idx ON task_tpl (group_id);
COMMENT ON COLUMN task_tpl.group_id IS 'busi group id';
COMMENT ON COLUMN task_tpl.tags IS 'split by space';
CREATE TABLE task_tpl_host
(
ii serial,
id int not null ,
host varchar(128) not null
) ;
ALTER TABLE task_tpl_host ADD CONSTRAINT task_tpl_host_pk PRIMARY KEY (id);
CREATE INDEX task_tpl_host_id_idx ON task_tpl_host (id,host);
COMMENT ON COLUMN task_tpl_host.id IS 'task tpl id';
COMMENT ON COLUMN task_tpl_host.host IS 'ip or hostname';
CREATE TABLE task_record
(
id bigserial,
group_id bigint not null ,
ibex_address varchar(128) not null,
ibex_auth_user varchar(128) not null default '',
ibex_auth_pass varchar(128) not null default '',
title varchar(255) not null default '',
account varchar(64) not null,
batch int not null default 0,
tolerance int not null default 0,
timeout int not null default 0,
pause varchar(255) not null default '',
script text not null,
args varchar(512) not null default '',
create_at bigint not null default 0,
create_by varchar(64) not null default ''
) ;
ALTER TABLE task_record ADD CONSTRAINT task_record_pk PRIMARY KEY (id);
CREATE INDEX task_record_create_at_idx ON task_record (create_at,group_id);
CREATE INDEX task_record_create_by_idx ON task_record (create_by);
COMMENT ON COLUMN task_record.id IS 'ibex task id';
COMMENT ON COLUMN task_record.group_id IS 'busi group id';

5
docker/mysqletc/my.cnf Normal file
View File

@@ -0,0 +1,5 @@
[mysqld]
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
datadir = /var/lib/mysql
bind-address = 0.0.0.0

View File

@@ -0,0 +1,392 @@
[
{
"name": "Elastic Cluster Red status",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": " elasticsearch_cluster_health_status{color=\"red\"} == 1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchClusterRed"
]
},
{
"name": "Elastic Cluster Yellow status",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "elasticsearch_cluster_health_status{color=\"yellow\"} == 1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchClusterYellow"
]
},
{
"name": "Elasticsearch disk out of space of the instance",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes * 100 < 10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchDiskOutOfSpace"
]
},
{
"name": "Elasticsearch disk space low of the instance",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes * 100 < 20",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchDiskSpaceLow"
]
},
{
"name": "Elasticsearch Heap Usage Too High of the instance",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "(elasticsearch_jvm_memory_used_bytes{area=\"heap\"} / elasticsearch_jvm_memory_max_bytes{area=\"heap\"}) * 100 > 90",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchHeapUsageTooHigh"
]
},
{
"name": "Elasticsearch Heap Usage warning of the instance",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "(elasticsearch_jvm_memory_used_bytes{area=\"heap\"} / elasticsearch_jvm_memory_max_bytes{area=\"heap\"}) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchHeapUsageWarning"
]
},
{
"name": "Elasticsearch initializing shards of the instance",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 900,
"prom_ql": "elasticsearch_cluster_health_initializing_shards > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchInitializingShards"
]
},
{
"name": "Elasticsearch no new documents of the instance",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 300,
"prom_ql": "rate(elasticsearch_indices_docs{es_data_node=\"true\"}[5m]) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchNoNewDocuments"
]
},
{
"name": "Elasticsearch pending tasks of the instance",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 900,
"prom_ql": "elasticsearch_cluster_health_number_of_pending_tasks > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchPendingTasks"
]
},
{
"name": "Elasticsearch relocation shards of the instance",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 900,
"prom_ql": "elasticsearch_cluster_health_relocating_shards > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchRelocationShards"
]
},
{
"name": "Elasticsearch unassigned shards of the instance",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "elasticsearch_cluster_health_unassigned_shards > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchUnassignedShards"
]
},
{
"name": "Elasticsearch Unhealthy Data Nodes",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "elasticsearch_cluster_health_number_of_data_nodes < number_of_data_nodes",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchHealthyDataNodes"
]
},
{
"name": "Elasticsearch Unhealthy Nodes",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": " elasticsearch_cluster_health_number_of_nodes < number_of_nodes",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ElasticsearchHealthyNodes"
]
}
]

View File

@@ -0,0 +1,30 @@
[
{
"name": "HTTP地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "http_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,58 @@
[
{
"name": "数据有丢失风险-同步副本数小于3",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(kafka_topic_partition_in_sync_replica) by (topic) < 3",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "消费能力不足-积压消息数超过50条",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(kafka_topic_partition_current_offset{instance=\"$instance\"}) by (topic) - sum(kafka_consumergroup_current_offset{instance=\"$instance\"}) by (topic) ",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,243 @@
[
{
"name": "监控对象失联",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "max_over_time(target_up[130s]) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-CPU较高请关注",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "cpu_usage_idle{cpu=\"cpu-total\"} < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-内存较高,请关注",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mem_available_percent < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-IO有点繁忙",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(diskio_io_time[1m])/10 > 99",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-预计再有4小时写满",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "predict_linear(disk_free[1h], 4*3600) < 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-入向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_in[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-出向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_out[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网络连接-TME_WAIT数量超过2万",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "netstat_tcp_time_wait > 20000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,344 @@
[
{
"name": "inode资源不足-使用率超过90",
"note": "",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(100 - ((node_filesystem_files_free * 100) / node_filesystem_files))>90",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "内存资源不足-利用率大于75%",
"note": "需要扩容或者升级配置了",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100 > 75",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [
"dingtalk"
],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "内存资源不足-利用率大于95%",
"note": "需要扩容或者升级配置了",
"severity": 1,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100 > 95",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [
"dingtalk"
],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "文件句柄不足-使用率超过90%",
"note": "可以将文件句柄limit调大或者扩容",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(node_filefd_allocated{instance=\"$node\"}/node_filefd_maximum{instance=\"$node\"}*100) > 90",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "某磁盘无法正常读写",
"note": "",
"severity": 1,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(node_filesystem_device_error{instance=\"$node\",mountpoint!~\"/var/lib/.*\",mountpoint!~\"/run.*\"}) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "磁盘需要清理了-利用率达到92%",
"note": "",
"severity": 1,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "(100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes) ) > 92 ",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [
"dingtalk"
],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "系统conntrack需要调整-使用率超过80%",
"note": "",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "node_nf_conntrack_entries / node_nf_conntrack_entries_limit*100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "系统出现oom",
"note": "",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "increase(node_vmstat_oom_kill[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡入方向丢包",
"note": "",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "rate(node_network_receive_drop_total{device=~\"e.*\"}[1m]) > 3",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡出方向丢包",
"note": "",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "rate(node_network_transmit_drop_total{device=~\"e.*\"}[1m]) > 3",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "计算资源不足-机器每个核平均负载大于10",
"note": "需要扩容或者升级配置了",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "avg (node_load1) by (instance)/count(count(node_cpu_seconds_total) by (cpu,instance)) by (instance) >10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "运行进程数过多-超过3000",
"note": "建议扩容",
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "node_procs_running > 3000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,393 @@
[
{
"name": "有地址PING不通请注意",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ping_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "有监控对象失联",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "target_up != 1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "有端口探测失败,请注意",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "net_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-CPU较高请关注",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "cpu_usage_idle{cpu=\"cpu-total\"} < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-内存较高,请关注",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mem_available_percent < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-IO非常繁忙",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(diskio_io_time[1m])/10 > 99",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-预计再有4小时写满",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "predict_linear(disk_free[1h], 4*3600) < 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-入向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_in[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-出向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_out[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网络连接-TME_WAIT数量超过2万",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "netstat_tcp_time_wait > 20000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-有进程数为0某进程可能挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_lookup_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-进程句柄限制过小",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_rlimit_num_fds_soft < 2048",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-采集失败",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_lookup_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,242 @@
[
{
"name": "Mongo出现Assert错误",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 1800,
"prom_ql": "rate(mongodb_asserts_total{type=~\"regular|message\"}[5m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoAssertsDetected"
]
},
{
"name": "Mongo出现游标超时",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 1800,
"prom_ql": "rate(mongodb_mongod_metrics_cursor_timed_out_total[5m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoRecurrentCursorTimeout"
]
},
{
"name": "Mongo出现页错误中断",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 1800,
"prom_ql": "rate(mongodb_extra_info_page_faults_total[5m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoRecurrentMemoryPageFaults"
]
},
{
"name": "Mongo刚刚有重启请注意",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mongodb_instance_uptime_seconds < 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoRestarted"
]
},
{
"name": "Mongo副本集主从延迟超过30s",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mongodb_mongod_replset_member_replication_lag > 30",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoSlaveReplicationLag(>30s)"
]
},
{
"name": "Mongo实例挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "MongoServerDown",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoServerDown"
]
},
{
"name": "Mongo操作平均耗时超过250秒",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 600,
"prom_ql": "rate(mongodb_mongod_op_latencies_latency_total[5m]) / rate(mongodb_mongod_op_latencies_ops_total[5m]) > 250000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoOperationHighLatency"
]
},
{
"name": "Mongo连接数已超过80%",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mongodb_connections{state=\"current\"}) / avg by (instance) (mongodb_connections{state=\"available\"}) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MongoTooManyConnections(>80%)"
]
}
]

View File

@@ -0,0 +1,302 @@
[
{
"name": "MysqlInnodbLogWaits",
"note": "MySQL innodb log writes stalling",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "rate(mysql_global_status_innodb_log_waits[15m]) > 10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlInnodbLogWaits"
]
},
{
"name": "MysqlSlaveIoThreadNotRunning",
"note": "MySQL Slave IO thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_io_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveIoThreadNotRunning"
]
},
{
"name": "MysqlSlaveReplicationLag",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) (mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay) > 30",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveReplicationLag"
]
},
{
"name": "MysqlSlaveSqlThreadNotRunning",
"note": "MySQL Slave SQL thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_sql_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveSqlThreadNotRunning"
]
},
{
"name": "Mysql刚刚有重启请注意",
"note": "MySQL has just been restarted, less than one minute ago",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_global_status_uptime < 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlRestarted"
]
},
{
"name": "Mysql实例挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_up == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlDown"
]
},
{
"name": "Mysql打开了很多文件句柄请注意",
"note": "More than 80% of MySQL files open",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_open_files) / avg by (instance)(mysql_global_variables_open_files_limit) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighOpenFiles"
]
},
{
"name": "Mysql最近一分钟有慢查询出现",
"note": "MySQL server mysql has some new slow query",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "increase(mysql_global_status_slow_queries[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlowQueries"
]
},
{
"name": "Mysql有超过60%的连接是running状态",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_running) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighThreadsRunning"
]
},
{
"name": "Mysql连接数已超过80%",
"note": "More than 80% of MySQL connections are in use",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_connected) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlTooManyConnections"
]
}
]

View File

@@ -0,0 +1,302 @@
[
{
"name": "MysqlInnodbLogWaits",
"note": "MySQL innodb log writes stalling",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "rate(mysql_global_status_innodb_log_waits[15m]) > 10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlInnodbLogWaits"
]
},
{
"name": "MysqlSlaveIoThreadNotRunning",
"note": "MySQL Slave IO thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_io_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveIoThreadNotRunning"
]
},
{
"name": "MysqlSlaveReplicationLag",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) (mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay) > 30",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveReplicationLag"
]
},
{
"name": "MysqlSlaveSqlThreadNotRunning",
"note": "MySQL Slave SQL thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_sql_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveSqlThreadNotRunning"
]
},
{
"name": "Mysql刚刚有重启请注意",
"note": "MySQL has just been restarted, less than one minute ago",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_global_status_uptime < 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlRestarted"
]
},
{
"name": "Mysql实例挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_up == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlDown"
]
},
{
"name": "Mysql打开了很多文件句柄请注意",
"note": "More than 80% of MySQL files open",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_innodb_num_open_files) / avg by (instance)(mysql_global_variables_open_files_limit) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighOpenFiles"
]
},
{
"name": "Mysql最近一分钟有慢查询出现",
"note": "MySQL server mysql has some new slow query",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "increase(mysql_global_status_slow_queries[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlowQueries"
]
},
{
"name": "Mysql有超过60%的连接是running状态",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_running) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighThreadsRunning"
]
},
{
"name": "Mysql连接数已超过80%",
"note": "More than 80% of MySQL connections are in use",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_connected) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlTooManyConnections"
]
}
]

View File

@@ -0,0 +1,30 @@
[
{
"name": "网络地址探活失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "net_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,30 @@
[
{
"name": "NTP时间偏移太大",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ntp_offset_ms > 1000 or ntp_offset_ms < -1000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,30 @@
[
{
"name": "PING地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ping_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,92 @@
[
{
"name": "Process X high number of open files",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "avg by (instance) (namedprocess_namegroup_worst_fd_ratio{groupname=\"X\"}) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ProcessHighOpenFiles"
]
},
{
"name": "Process X is down",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "sum by (instance) (namedprocess_namegroup_num_procs{groupname=\"X\"}) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ProcessNotRunning"
]
},
{
"name": "Process X is restarted",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "namedprocess_namegroup_oldest_start_time_seconds{groupname=\"X\"} > time() - 60 ",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=ProcessRestarted"
]
}
]

View File

@@ -0,0 +1,62 @@
[
{
"name": "进程监控-有进程数为0某进程可能挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_lookup_count == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-进程句柄限制过小",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_rlimit_num_fds_soft < 2048",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,182 @@
[
{
"name": "Redis Ping 延迟高大于100毫秒",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_ping_use_seconds > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighPingLatency"
]
},
{
"name": "Redis内存使用率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_maxmemory > 0 and (redis_used_memory / redis_maxmemory) > 0.85",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighMemoryUsage"
]
},
{
"name": "Redis出现拒绝连接",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "(rate(redis_rejected_connections[5m])) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisRejectedConnHigh"
]
},
{
"name": "Redis刚刚有重启请注意",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "redis_uptime_in_seconds < 600",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowUptime"
]
},
{
"name": "Redis较低的命中率",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(redis_keyspace_hits[5m])\n/\n(rate(redis_keyspace_misses[5m]) + rate(redis_keyspace_hits[5m]))\n< 0.9",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowHitRatio"
]
},
{
"name": "Redis驱逐率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "(sum(rate(redis_evicted_keys[5m])) / sum(redis_keyspace_keys)) > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighKeysEvictionRatio"
]
}
]

View File

@@ -0,0 +1,212 @@
[
{
"name": "Redis内存使用率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_memory_max_bytes > 0 and (redis_memory_used_bytes / redis_memory_max_bytes) > 0.85",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighMemoryUsage"
]
},
{
"name": "Redis出现拒绝连接",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "(rate(redis_rejected_connections_total[5m])) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisRejectedConnHigh"
]
},
{
"name": "Redis刚刚有重启请注意",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "redis_uptime_in_seconds < 600",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowUptime"
]
},
{
"name": "Redis客户端连接数过高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "(redis_connected_clients / redis_config_maxclients) > 0.85",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighClientsUsage"
]
},
{
"name": "Redis延迟高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(rate(redis_commands_duration_seconds_total[5m])) / sum(rate(redis_commands_processed_total[5m])) > 0.25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighResponseTime"
]
},
{
"name": "Redis较低的命中率",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(redis_keyspace_hits_total[5m])\n/\n(rate(redis_keyspace_misses_total[5m]) + rate(redis_keyspace_hits_total[5m]))\n< 0.9",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowHitRatio"
]
},
{
"name": "Redis驱逐率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "(sum(rate(redis_evicted_keys_total[5m])) / sum(redis_db_keys)) > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighKeysEvictionRatio"
]
}
]

View File

@@ -0,0 +1,182 @@
[
{
"name": "CPU利用率高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "100 * sum by (instance) (rate(windows_cpu_time_total{mode != 'idle'}[5m])) / count by (instance) (windows_cpu_core_frequency_mhz) > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighCPUUsage"
]
},
{
"name": "机器在15分钟内发生过重启",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "time() - windows_system_system_up_time < 900",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=UpTimeLessThan15Min"
]
},
{
"name": "物理内存使用率过高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "100 * (windows_cs_physical_memory_bytes - windows_os_physical_memory_free_bytes) / windows_cs_physical_memory_bytes > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighPhysicalMemoryUsage"
]
},
{
"name": "硬盘快要写满了,注意",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "100 * (windows_logical_disk_size_bytes - windows_logical_disk_free_bytes) / windows_logical_disk_size_bytes > 90",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=LogicalDiskFull"
]
},
{
"name": "系统有In方向丢包情况出现",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "100 * rate(windows_net_packets_received_errors[5m]) / (rate(windows_net_packets_received_errors[5m]) + rate(windows_net_packets_received_total[5m])>0) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighInboundErrorRate"
]
},
{
"name": "系统有Out方向丢包情况出现",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "100 * rate(windows_net_packets_outbound_errors[5m]) / (rate(windows_net_packets_outbound_errors[5m]) + rate(windows_net_packets_sent_total[5m])>0) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighOutboundErrorRate"
]
}
]

View File

@@ -0,0 +1,114 @@
[
{
"name": "Zookeeper leader 个数大于1",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(zk_server_leader) > 1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "Zookeeper 实例运行异常",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "zk_ruok == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "Zookeeper 没有 leader 了",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(zk_server_leader) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "Zookeeper 挂掉了",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "zk_up == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@@ -0,0 +1,431 @@
[
{
"name": "ElasticSearch - 模板",
"tags": "Prometheus ElasticSearch ES",
"configs": "{\"var\":[{\"name\":\"cluster\",\"definition\":\"label_values(elasticsearch_indices_docs,cluster)\",\"options\":[\"elasticsearch-cluster\"]},{\"name\":\"instance\",\"definition\":\"label_values(elasticsearch_indices_docs{cluster=\\\"$cluster\\\", name!=\\\"\\\"},instance)\",\"options\":[\"10.206.0.7:9200\"]},{\"name\":\"name\",\"definition\":\"label_values(elasticsearch_indices_docs{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\", name!=\\\"\\\"},name)\",\"options\":[\"node-2\",\"node-1\",\"node-3\"],\"multi\":true,\"allOption\":true}]}",
"chart_groups": [
{
"name": "KPI",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_number_of_nodes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Data Nodes\",\"description\":\"集群数据节点数量\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_number_of_nodes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Nodes\",\"description\":\"集群节点数量\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":3,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum (elasticsearch_process_cpu_percent{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"} ) / count (elasticsearch_process_cpu_percent{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"} )\"}],\"name\":\"CPU usage Avg\",\"description\":\"平均CPU使用率\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"special\":80,\"from\":80},\"result\":{\"color\":\"#f90606\"}},{\"type\":\"range\",\"match\":{\"special\":70,\"from\":70},\"result\":{\"color\":\"#f5ac0f\"}},{\"type\":\"range\",\"match\":{\"to\":70},\"result\":{\"color\":\"#21c00c\"}}],\"standardOptions\":{\"util\":\"percent\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":9,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum (elasticsearch_jvm_memory_used_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}) / sum (elasticsearch_jvm_memory_max_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}) * 100\"}],\"name\":\"JVM memory used Avg\",\"description\":\"平均JVM内存使用率\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"special\":80,\"from\":80},\"result\":{\"color\":\"#f12c09\"}},{\"type\":\"range\",\"match\":{\"from\":70,\"to\":80},\"result\":{\"color\":\"#fbca18\"}},{\"type\":\"range\",\"match\":{\"to\":70},\"result\":{\"color\":\"#21c00c\"}}],\"standardOptions\":{\"util\":\"percent\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":12,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum (elasticsearch_process_open_files_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"})\"}],\"name\":\"Open file descriptors\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":15,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(elasticsearch_breakers_tripped{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"})\"}],\"name\":\"Tripped for breakers\",\"description\":\"集群断路器阻断总数\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"result\":{\"text\":\"\",\"color\":\"#21c00c\"},\"match\":{\"special\":0}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":18,\"y\":0,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_number_of_pending_tasks{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Pending tasks\",\"description\":\"等待执行的集群变更任务数量\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":5},\"result\":{\"color\":\"#f24207\"}},{\"type\":\"range\",\"match\":{\"from\":1,\"to\":5},\"result\":{\"color\":\"#f9a006\"}},{\"type\":\"range\",\"match\":{\"to\":1},\"result\":{\"color\":\"#21c00c\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":21,\"y\":0,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_status{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",color=\\\"red\\\"}==1 or (elasticsearch_cluster_health_status{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",color=\\\"green\\\"}==1)+4 or (elasticsearch_cluster_health_status{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",color=\\\"yellow\\\"}==1)+2\"}],\"name\":\"Cluster health\",\"description\":\"绿色:健康,黄色:存在副本分片异常,红色:存在主分片异常\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"result\":{\"text\":\"N/A\"},\"match\":{}},{\"type\":\"special\",\"match\":{\"special\":5},\"result\":{\"text\":\"Green\",\"color\":\"#21c00c\"}},{\"type\":\"special\",\"match\":{\"special\":3},\"result\":{\"text\":\"Yellow\",\"color\":\"#f9e406\"}},{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"text\":\"Red\",\"color\":\"#f43606\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":0,\"y\":0,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "Shards",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_active_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Active shards\",\"description\":\"活跃分片数\\nAggregate total of all shards across all indices, which includes replica shards\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_active_primary_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Active primary shards\",\"description\":\"活跃主分片数\\nThe number of primary shards in your cluster. This is an aggregate total across all indices.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":4,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_initializing_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Initializing shards\",\"description\":\"创建中分片数量\\nCount of shards that are being freshly created\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":8,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_relocating_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Relocating shards\",\"description\":\"迁移中分片数量\\nThe number of shards that are currently moving from one node to another node.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_unassigned_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Unassigned shards\",\"description\":\"未分配的分片数量\\nThe number of shards that exist in the cluster state, but cannot be found in the cluster itself\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_cluster_health_delayed_unassigned_shards{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\"}\"}],\"name\":\"Delayed shards\",\"description\":\"暂缓分配的分片数量当节点丢失会发生选主和数据拷贝。为了较少网络抖动等原因导致的重分配情况配置delay参数该值为等待delay到期将被重分配的分片数量\\nShards delayed to reduce reallocation overhead\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"5\"}}",
"weight": 0
}
]
},
{
"name": "JVM Garbage Collection",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_jvm_gc_collection_seconds_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}} - {{gc}}\"}],\"name\":\"GC count\",\"description\":\"GC运行次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_jvm_gc_collection_seconds_sum{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}} - {{gc}}\"}],\"name\":\"GC time\",\"description\":\"GC运行耗时\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Translog",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_translog_operations{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Total translog operations\",\"description\":\"translog大小byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_translog_size_in_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Total translog size in bytes\",\"description\":\"translog大小byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Breakers",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_breakers_tripped{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}: {{breaker}}\"}],\"name\":\"Tripped for breakers\",\"description\":\"节点断路器阻断总数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_breakers_estimated_size_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}: {{breaker}}\"},{\"expr\":\"elasticsearch_breakers_limit_size_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"B\",\"legend\":\"{{name}}: limit for {{breaker}}\"}],\"name\":\"Estimated size in bytes of breaker\",\"description\":\"预估内存大小和限制内存大小\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Cpu and Memory",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_os_load1{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"load1: {{name}}\"},{\"expr\":\"elasticsearch_os_load5{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"B\",\"legend\":\"load5: {{name}}\"},{\"expr\":\"elasticsearch_os_load15{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"C\",\"legend\":\"load15: {{name}}\"}],\"name\":\"Load average\",\"description\":\"1m/5m/15m系统负载\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_process_cpu_percent{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"CPU usage\",\"description\":\"进程CPU占比\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_jvm_memory_used_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}} used: {{area}}\"},{\"expr\":\"elasticsearch_jvm_memory_max_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"B\",\"legend\":\"{{name}} max: {{area}}\"},{\"expr\":\"elasticsearch_jvm_memory_pool_peak_used_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"C\",\"legend\":\"{{name}} peak used pool: {{pool}}\"}],\"name\":\"JVM memory usage\",\"description\":\"进程内存占用/限制/峰值byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_jvm_memory_committed_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}} committed: {{area}}\"},{\"expr\":\"elasticsearch_jvm_memory_max_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"refId\":\"B\",\"legend\":\"{{name}} max: {{area}}\"}],\"name\":\"JVM memory committed\",\"description\":\"JVM申请/限制内存byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Disk and Network",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"1-(elasticsearch_filesystem_data_available_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}/elasticsearch_filesystem_data_size_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"})\",\"legend\":\"{{name}}: {{path}}\"}],\"name\":\"Disk usage\",\"description\":\"磁盘使用率\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_transport_tx_size_bytes_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}: sent\"},{\"expr\":\"-irate(elasticsearch_transport_rx_size_bytes_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"B\",\"legend\":\"{{name}}: received\"}],\"name\":\"Network usage\",\"description\":\"网络流量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Documents",
"weight": 7,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_docs{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Documents count on node\",\"description\":\"节点文档总数,不包含已删除文档和子文档以及刚索引的文档\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_indexing_index_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Documents indexed rate\",\"description\":\"平均每秒索引文档数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_docs_deleted{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Documents deleted rate\",\"description\":\"平均每秒删除文档数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(elasticsearch_indices_merges_docs_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Documents merged rate\",\"description\":\"平均每秒合并文档数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":2,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_merges_total_size_bytes_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Documents merged bytes\",\"description\":\"平均每秒合并文档大小\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "Times",
"weight": 8,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_search_query_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Query time\",\"description\":\"查询操作耗时(秒)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_indexing_index_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Indexing time\",\"description\":\"索引操作耗时(秒)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_merges_total_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Merging time\",\"description\":\"合并操作耗时(秒)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_store_throttle_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Throttle time for index store\",\"description\":\"索引存储限制耗时(秒)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Total Operations states",
"weight": 9,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_indices_indexing_index_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}: indexing\"},{\"expr\":\"irate(elasticsearch_indices_search_query_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"B\",\"legend\":\"{{name}}: query\"},{\"expr\":\"irate(elasticsearch_indices_search_fetch_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"C\",\"legend\":\"{{name}}: fetch\"},{\"expr\":\"irate(elasticsearch_indices_merges_total_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"D\",\"legend\":\"{{name}}: merges\"},{\"expr\":\"irate(elasticsearch_indices_refresh_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"E\",\"legend\":\"{{name}}: refresh\"},{\"expr\":\"irate(elasticsearch_indices_flush_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"F\",\"legend\":\"{{name}}: flush\"},{\"expr\":\"irate(elasticsearch_indices_get_exists_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"G\",\"legend\":\"{{name}}: get_exists\"},{\"expr\":\"irate(elasticsearch_indices_get_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"H\",\"legend\":\"{{name}}: get_time\"},{\"expr\":\"irate(elasticsearch_indices_get_missing_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"I\",\"legend\":\"{{name}}: get_missing\"},{\"expr\":\"irate(elasticsearch_indices_indexing_delete_time_seconds_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"K\",\"legend\":\"{{name}}: indexing_delete\"},{\"expr\":\"irate(elasticsearch_indices_get_time_seconds{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"L\",\"legend\":\"{{name}}: get\"}],\"name\":\"Total Operations time\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(elasticsearch_indices_indexing_index_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}: indexing\"},{\"expr\":\"rate(elasticsearch_indices_search_query_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"B\",\"legend\":\"{{name}}: query\"},{\"expr\":\"rate(elasticsearch_indices_search_fetch_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"C\",\"legend\":\"{{name}}: fetch\"},{\"expr\":\"rate(elasticsearch_indices_merges_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"D\",\"legend\":\"{{name}}: merges\"},{\"expr\":\"rate(elasticsearch_indices_refresh_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"E\",\"legend\":\"{{name}}: refresh\"},{\"expr\":\"rate(elasticsearch_indices_flush_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"F\",\"legend\":\"{{name}}: flush\"},{\"expr\":\"rate(elasticsearch_indices_get_exists_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"G\",\"legend\":\"{{name}}: get_exists\"},{\"expr\":\"rate(elasticsearch_indices_get_missing_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"H\",\"legend\":\"{{name}}: get_missing\"},{\"expr\":\"rate(elasticsearch_indices_get_tota{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"I\",\"legend\":\"{{name}}: get\"},{\"expr\":\"rate(elasticsearch_indices_indexing_delete_total{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"refId\":\"K\",\"legend\":\"{{name}}: indexing_delete\"}],\"name\":\"Total Operations rate\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Thread Pool",
"weight": 10,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_thread_pool_rejected_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}: {{ type }}\"}],\"name\":\"Thread Pool operations rejected\",\"description\":\"线程池reject次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_thread_pool_active_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}: {{ type }}\"}],\"name\":\"Thread Pool threads active\",\"description\":\"活跃线程数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_thread_pool_queue_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}: {{ type }}\"}],\"name\":\"Thread Pool threads queued\",\"description\":\"排队等待线程任务数量\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"irate(elasticsearch_thread_pool_completed_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}: {{ type }}\"}],\"name\":\"Thread Pool operations completed\",\"description\":\"线程池complete次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Caches",
"weight": 11,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_fielddata_memory_size_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Field data memory size\",\"description\":\"fielddata cache内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(elasticsearch_indices_fielddata_evictions{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Field data evictions\",\"description\":\"fielddata cache平均每秒内存剔除次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_query_cache_memory_size_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Query cache size\",\"description\":\"query cache内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(elasticsearch_indices_query_cache_evictions{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Query cache evictions\",\"description\":\"query cache平均每秒内存剔除次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":2,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(elasticsearch_indices_filter_cache_evictions{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}[5m])\",\"legend\":\"{{name}}\"}],\"name\":\"Evictions from filter cache\",\"description\":\"老版本的filter cache内存剔除次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "Segments",
"weight": 12,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segments_count{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Count of index segments\",\"description\":\"segment个数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segments_memory_bytes{instance=\\\"$instance\\\",cluster=\\\"$cluster\\\",name=~\\\"$name\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Current memory size of segments in bytes\",\"description\":\"segment内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Count of documents and Total size",
"weight": 13,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_docs_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Count of documents with only primary shards\",\"description\":\"主分片文档数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"stack\":\"off\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_store_size_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Total size of stored index data in bytes with only primary shards on all nodes\",\"description\":\"主分片索引容量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_store_size_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Total size of stored index data in bytes with all shards on all nodes\",\"description\":\"所有分片索引容量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":4,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Index writer",
"weight": 14,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_index_writer_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Index writer with only primary shards on all nodes in bytes\",\"description\":\"主分片索引写入数据量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_index_writer_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Index writer with all shards on all nodes in bytes\",\"description\":\"所有分片索引写入数据量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Segments",
"weight": 15,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_count_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Segments with only primary shards on all nodes\",\"description\":\"主分片segment数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_count_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Segments with all shards on all nodes\",\"description\":\"所有分片segment总数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of segments with only primary shards on all nodes in bytes\",\"description\":\"主分片segment容量\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":4,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of segments with all shards on all nodes in bytes\",\"description\":\"所有分片segment容量\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":6,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Doc values",
"weight": 16,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_doc_values_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Doc values with only primary shards on all nodes in bytes\",\"description\":\"主分片doc value内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_doc_values_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Doc values with all shards on all nodes in bytes\",\"description\":\"所有分片doc value内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Fields",
"weight": 17,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_fields_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of fields with only primary shards on all nodes in bytes\",\"description\":\"分片field内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_fields_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of fields with all shards on all nodes in bytes\",\"description\":\"所有分片field内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Fixed bit",
"weight": 18,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_fixed_bit_set_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of fixed bit with only primary shards on all nodes in bytes\",\"description\":\"主分片fixed bit set内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_fixed_bit_set_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of fixed bit with all shards on all nodes in bytes\",\"description\":\"所有分片fixed bit set内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Norms",
"weight": 19,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_norms_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of norms with only primary shards on all nodes in bytes\",\"description\":\"主分片normalization factor内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_norms_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of norms with all shards on all nodes in bytes\",\"description\":\"所有分片normalization factor内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Points",
"weight": 20,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_points_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of points with only primary shards on all nodes in bytes\",\"description\":\"主分片point内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_points_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of points with all shards on all nodes in bytes\",\"description\":\"所有分片point内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Terms",
"weight": 21,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_terms_memory_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of terms with only primary shards on all nodes in bytes\",\"description\":\"主分片term内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_terms_memory_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Number of terms with all shards on all nodes in bytes\",\"description\":\"所有分片term内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Indices: Terms",
"weight": 22,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_version_map_memory_bytes_primary{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of version map with only primary shards on all nodes in bytes\",\"description\":\"所有分片version map内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"elasticsearch_indices_segment_version_map_memory_bytes_total{instance=~\\\"$instance\\\"}\",\"legend\":\"{{index}}\"}],\"name\":\"Size of version map with all shards on all nodes in bytes\",\"description\":\"所有分片version map内存占用byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,19 @@
[
{
"name": "HTTP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(http_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(http_response_response_code) by (target)\",\"refId\":\"B\",\"legend\":\"status code\"},{\"expr\":\"max(http_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"},{\"expr\":\"max(http_response_cert_expire_timestamp) by (target) - time()\",\"refId\":\"D\",\"legend\":\"cert expire\"}],\"name\":\"URL Details\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}},{\"type\":\"special\",\"matcher\":{\"value\":\"D\"},\"properties\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,125 @@
[
{
"name": "JMX - 模板",
"tags": "Prometheus JMX",
"configs": "{\"var\":[{\"name\":\"job\",\"definition\":\"jmx_exporter\"},{\"name\":\"instance\",\"definition\":\"label_values(jmx_scrape_error,instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"up{job=\\\"$job\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Status\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"text\":\"UP\",\"color\":\"#1eac02\"}},{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"DOWN\",\"color\":\"#f00a0a\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"time() - process_start_time_seconds{job=\\\"$job\\\",instance=\\\"$instance\\\"}\"}],\"name\":\"Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"os_available_processors{job=\\\"$job\\\",instance=\\\"$instance\\\"}\"}],\"name\":\"Available CPUs\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"os_open_file_descriptor_count{job=\\\"$job\\\",instance=\\\"$instance\\\"}\"}],\"name\":\"Open file descriptors\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "JVM Memory",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_bytes_used{area=\\\"heap\\\",job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_bytes_max{area=\\\"heap\\\",job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Max\"}],\"name\":\"JVM Memory(heap)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_bytes_used{area=\\\"nonheap\\\",job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_bytes_max{area=\\\"nonheap\\\",job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Max\"}],\"name\":\"JVM Memory(nonheap)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Memory Pool",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"CodeHeap 'non-nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"CodeHeap 'non-nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"CodeHeap 'non-nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"CodeHeap 'non-nmethods'\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"CodeHeap 'profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"CodeHeap 'profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"CodeHeap 'profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"CodeHeap 'profiled nmethods'\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"CodeHeap 'non-profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"CodeHeap 'non-profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"CodeHeap 'non-profiled nmethods'\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"CodeHeap 'non-profiled nmethods'\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"G1 Eden Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"G1 Eden Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"G1 Eden Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"G1 Eden Space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"Compressed Class Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"Compressed Class Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"Compressed Class Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"Compressed Class Space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"G1 Survivor Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"G1 Survivor Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"G1 Survivor Space\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"G1 Survivor Space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"G1 Old Gen\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"G1 Old Gen\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"G1 Old Gen\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"G1 Old Gen\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_memory_pool_bytes_max{pool=\\\"Metaspace\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Max\"},{\"expr\":\"jvm_memory_pool_bytes_used{pool=\\\"Metaspace\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Used\"},{\"expr\":\"jvm_memory_pool_bytes_committed{pool=\\\"Metaspace\\\", job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Committed\"}],\"name\":\"Metaspace\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":2,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "GC",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"increase(jvm_gc_collection_seconds_sum{job=\\\"$job\\\",instance=~\\\"$instance\\\"}[1m])\"}],\"name\":\"过去一分钟GC耗时\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"increase(jvm_gc_collection_seconds_count{job=\\\"$job\\\",instance=\\\"$instance\\\"}[1m])\",\"legend\":\"\"}],\"name\":\"过去一分钟GC次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"increase(jvm_gc_collection_seconds_sum{job=\\\"$job\\\",instance=\\\"$instance\\\"}[1m])/increase(jvm_gc_collection_seconds_count{job=\\\"$job\\\",instance=\\\"$instance\\\"}[1m])\",\"legend\":\"\"}],\"name\":\"过去一分钟每次GC平均耗时\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"bars\",\"stack\":\"off\",\"fillOpacity\":1},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Threads and Class loading",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_threads_current{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"current\"},{\"expr\":\"jvm_threads_daemon{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"daemon\"},{\"expr\":\"jvm_threads_deadlocked{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"deadlocked\"}],\"name\":\"Threads\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"jvm_classes_loaded{job=\\\"$job\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Class loading\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Physical memory",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"os_total_physical_memory_bytes{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"legend\":\"Total physical memory\"},{\"expr\":\"os_committed_virtual_memory_bytes{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"Committed virtual memory\"},{\"expr\":\"os_free_physical_memory_bytes{job=\\\"$job\\\",instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"Free physical memory\"}],\"name\":\"Physical memory\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,63 @@
[
{
"name": "Kafka - 模板",
"tags": "Kafka Prometheus ",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(kafka_brokers, instance)\"},{\"name\":\"job\",\"definition\":\"label_values(kafka_brokers, job)\"}]}",
"chart_groups": [
{
"name": "overview",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"count(count by (topic) (kafka_topic_partitions))\"}],\"name\":\"topics\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":8,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_brokers\"}],\"name\":\"brokers\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":0,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(kafka_topic_partitions)\"}],\"name\":\"partitions\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "throughput",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(kafka_topic_partition_current_offset{instance=\\\"$instance\\\"}[1m])) by (topic)\"}],\"name\":\"Message in per second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(kafka_consumer_lag_millis{instance=\\\"$instance\\\"}) by (consumergroup, topic) \",\"legend\":\"{{consumergroup}} (topic: {{topic}})\"}],\"name\":\"Latency by Consumer Group\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"humantimeMilliseconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(kafka_consumergroup_current_offset{instance=\\\"$instance\\\"}[1m])) by (topic)\"}],\"name\":\"Message consume per second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(kafka_topic_partition_current_offset{instance=\\\"$instance\\\"}) by (topic) - sum(kafka_consumergroup_current_offset{instance=\\\"$instance\\\"}) by (topic) \",\"legend\":\"{{consumergroup}} (topic: {{topic}})\"}],\"name\":\"Lag by Consumer Group\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "patition/replicate",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_topic_partitions{instance=\\\"$instance\\\"}\",\"legend\":\"{{topic}}\"}],\"name\":\"Partitions per Topic\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"seriesToRows\"},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_topic_partition_under_replicated_partition\",\"legend\":\"{{topic}}-{{partition}}\"}],\"name\":\"Under Replicated\",\"description\":\"副本不同步预案\\n1. Restart the Zookeeper leader.\\n2. Restart the broker\\\\brokers that are not replicating some of the partitions.\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"seriesToRows\"},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,223 @@
[
{
"name": "HOST - 模板",
"tags": "Prometheus Host",
"configs": "{\"var\":[{\"name\":\"node\",\"definition\":\"label_values(node_cpu_seconds_total, instance)\",\"selected\":\"$node\",\"options\":[\"tt-fc-es01.nj:12345\",\"tt-fc-es02.nj:12345\",\"tt-fc-dev01.nj:12345\",\"10.206.0.13:9100\"]}],\"links\":[{\"title\":\"n9e\",\"url\":\"https://n9e.gitee.io/\",\"targetBlank\":true},{\"title\":\"author\",\"url\":\"http://flashcat.cloud/\",\"targetBlank\":true}]}",
"chart_groups": [
{
"name": "整体概况",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(10,100-(avg by (mode, instance)(rate(node_cpu_seconds_total{mode=\\\"idle\\\"}[1m])))*100)\",\"legend\":\"{{instance}}\"}],\"name\":\"cpu使用率 top10\",\"links\":[],\"description\":\"\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":9,\"x\":3,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(10,(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100)\",\"legend\":\"{{instance}}\"}],\"name\":\"内存使用率 top10\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(10,(node_filesystem_avail_bytes{device!~'rootfs', device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"} * 100) / node_filesystem_size_bytes{device!~'rootfs', device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"})\",\"legend\":\"{{instance}}-{{mountpoint}}\"}],\"name\":\"磁盘分区使用率 top10\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(10,rate(node_disk_io_time_seconds_total[5m]) * 100)\",\"legend\":\"{{instance}}-{{device}}\"}],\"name\":\"设备io util top10\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"count(node_boot_time_seconds)\"}],\"name\":\"监控机器数\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":40}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":0,\"y\":0,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "单机概况",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"(node_memory_MemTotal_bytes{instance=\\\"$node\\\"} - node_memory_MemFree_bytes{instance=\\\"$node\\\"} - (node_memory_Cached_bytes{instance=\\\"$node\\\"} + node_memory_Buffers_bytes{instance=\\\"$node\\\"}))/node_memory_MemTotal_bytes{instance=\\\"$node\\\"}*100\"}],\"name\":\"内存使用率\",\"description\":\"如果内存使用率超过50%,则需要扩容或者升级配置了\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":25}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"result\":{\"color\":\"#369903\"},\"match\":{\"from\":0,\"to\":50}},{\"type\":\"range\",\"match\":{\"from\":50,\"to\":100},\"result\":{\"color\":\"#e3170d\"}}],\"standardOptions\":{\"util\":\"percent\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"(((count(count(node_cpu_seconds_total{instance=\\\"$node\\\"}) by (cpu))) - avg(sum by (mode)(rate(node_cpu_seconds_total{mode='idle',instance=\\\"$node\\\"}[1m])))) * 100) / count(count(node_cpu_seconds_total{instance=\\\"$node\\\"}) by (cpu))\"}],\"name\":\"CPU使用率\",\"description\":\"如果cpu使用率超过50%可以通过top命令查看机器上是否有异常进程如果没有异常进程则说明服务需要扩容或者机器需要升级配置了\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":30}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"result\":{\"color\":\"#369903\"},\"match\":{\"from\":0,\"to\":50}},{\"type\":\"range\",\"match\":{\"special\":50,\"from\":50,\"to\":100},\"result\":{\"color\":\"#b22222\"}}],\"standardOptions\":{\"util\":\"percent\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"max(100 - ((node_filesystem_avail_bytes{instance=\\\"$node\\\",} * 100) / node_filesystem_size_bytes{instance=\\\"$node\\\"}))\",\"legend\":\"{{mountpoint}}\"}],\"name\":\"磁盘分区使用率最大值\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\",\"decimals\":1},\"thresholds\":{\"steps\":[{\"value\":90,\"color\":\"#f90101\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_time_seconds{instance=\\\"$node\\\"} - node_boot_time_seconds{instance=\\\"$node\\\"}\"}],\"name\":\"启动时长\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"title\":null,\"value\":20}},\"options\":{\"valueMappings\":[],\"standardOptions\":{\"util\":\"humantimeSeconds\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":21,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"(node_memory_SwapTotal_bytes{instance=\\\"$node\\\"} - node_memory_SwapFree_bytes{instance=\\\"$node\\\"})\"}],\"name\":\"SWAP内存使用\",\"description\":\"swap使用过高会影响系统io性能如果内存够用但swap使用很高可以调小swappiness的值\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":30}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"result\":{\"color\":\"#369903\"},\"match\":{\"from\":0,\"to\":50}},{\"type\":\"range\",\"match\":{\"special\":50,\"from\":50,\"to\":80},\"result\":{\"color\":\"#fb9b2d\"}},{\"type\":\"range\",\"match\":{\"from\":80,\"to\":100000},\"result\":{\"color\":\"#d10000\"}}],\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":12,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_vmstat_oom_kill{instance=\\\"$node\\\"}[1m])\",\"legend\":\"OOM\"}],\"name\":\"OOM次数\",\"description\":\"大于0说明有进程内存不够用了需要考虑扩容或升级配置了\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{\"steps\":[{\"value\":1,\"color\":\"#f90101\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":1,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"max(rate(node_disk_io_time_seconds_total{instance=\\\"$node\\\"}[5m]) * 100)\",\"legend\":\"{{device}}\"}],\"name\":\"磁盘设备io util 最大值\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":1,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_filefd_allocated{instance=\\\"$node\\\"}/node_filefd_maximum{instance=\\\"$node\\\"}*100\"}],\"name\":\"FD使用率\",\"description\":\"如果超过80%,建议把文件描述符的最大个数调大,或者扩容\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":25}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"result\":{\"color\":\"#369903\"},\"match\":{\"from\":0,\"to\":50}},{\"type\":\"range\",\"match\":{\"special\":50,\"from\":50,\"to\":80},\"result\":{\"color\":\"#fb9b2d\"}},{\"type\":\"range\",\"match\":{\"from\":80,\"to\":100},\"result\":{\"color\":\"#d10000\"}}],\"standardOptions\":{\"util\":\"percent\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":15,\"y\":0,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"max(100 - ((node_filesystem_files_free{instance=\\\"$node\\\",mountpoint!~\\\"/var/lib/.*\\\",mountpoint!~\\\"/run/user.*\\\"} * 100) / node_filesystem_files{instance=\\\"$node\\\",mountpoint!~\\\"/var/lib/.*\\\",mountpoint!~\\\"/run/user.*\\\"}))\",\"legend\":\"{{mountpoint}}\"}],\"name\":\"inode分区使用率最大值\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\",\"decimals\":1},\"thresholds\":{\"steps\":[{\"value\":50,\"color\":\"#f90101\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":1,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(node_filesystem_device_error{instance=\\\"$node\\\",mountpoint!~\\\"/var/lib/.*\\\",mountpoint!~\\\"/run.*\\\"})\",\"legend\":\"{{mountpoint}}\"}],\"name\":\"写文件错误数总和\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":30}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":0,\"to\":0},\"result\":{\"color\":\"#369903\"}},{\"type\":\"range\",\"match\":{\"from\":1,\"to\":10000},\"result\":{\"color\":\"#f0310f\"}}],\"standardOptions\":{\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":3,\"x\":18,\"y\":0,\"i\":\"9\"}}",
"weight": 0
}
]
},
{
"name": "系统指标",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"node_procs_running{instance=\\\"$node\\\"}\",\"legend\":\"{{mountpoint}}\"}],\"name\":\"进程数\",\"description\":\"进程数超过2000可以考虑扩容了\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{\"steps\":[{\"value\":2000,\"color\":\"#ff0000\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_timex_offset_seconds{instance=\\\"$node\\\"}\",\"legend\":\"ntp偏移\"}],\"name\":\"NTP偏移\",\"description\":\"\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{\"steps\":[]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_intr_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"Interrupts\"},{\"expr\":\"irate(node_context_switches_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"context switches\"}],\"name\":\"上下文切换/中断\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_entropy_available_bits{instance=\\\"$node\\\"}\",\"legend\":\"entropy\"}],\"name\":\"熵池大小\",\"description\":\"熵池太小 ,程序使用随机函数会阻塞,可以安装 rng-tools 工具增加熵池大小,可参考\\n<a href=\\\"https://codeantenna.com/a/Ab6aMd3NSA\\\" target=\\\"_blank\\\">https://codeantenna.com/a/Ab6aMd3NSA</a> \",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{\"steps\":[{\"value\":100,\"color\":\"#f70202\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "CPU详情",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\" (avg by (mode)(rate(node_cpu_seconds_total{instance=\\\"$node\\\",mode!=\\\"idle\\\"}[1m])))*100\",\"legend\":\"{{mode}}\"}],\"name\":\"CPU使用率详情\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\" (avg by (mode)(rate(node_cpu_seconds_total{instance=\\\"$node\\\",mode=\\\"idle\\\"}[1m])))*100\",\"legend\":\"cpu_idle\"}],\"name\":\"CPU空闲率\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{\"steps\":[{\"value\":10,\"color\":\"#f90101\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_load1{instance=\\\"$node\\\"}\",\"legend\":\"load1\"},{\"expr\":\"node_load5{instance=\\\"$node\\\"}\",\"legend\":\"load5\"},{\"expr\":\"node_load15{instance=\\\"$node\\\"}\",\"legend\":\"load15\"}],\"name\":\"CPU负载\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{\"steps\":[]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "内存详情",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"node_memory_HugePages_Total{instance=\\\"$node\\\"}\",\"legend\":\"HugePages_Total\"},{\"expr\":\"node_memory_Hugepagesize_bytes{instance=\\\"$node\\\"}\",\"legend\":\"HugePages_Size\"},{\"expr\":\"node_memory_HugePages_Surp{instance=\\\"$node\\\"}\",\"legend\":\"HugePages_Surp \"},{\"expr\":\"node_memory_HugePages_Free{instance=\\\"$node\\\"}\",\"legend\":\"HugePages_Free\"},{\"expr\":\"node_memory_HugePages_Rsvd{instance=\\\"$node\\\"}\",\"legend\":\"HugePages_Rsvd\"},{\"expr\":\"node_memory_AnonHugePages_bytes{instance=\\\"$node\\\"}\",\"legend\":\"AnonHugePages\"},{\"expr\":\"node_memory_Inactive_file_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Inactive_file\"},{\"expr\":\"node_memory_Inactive_anon_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Inactive_anon\"},{\"expr\":\"node_memory_Active_file_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Active_file\"},{\"expr\":\"node_memory_Active_anon_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Active_anon\"},{\"expr\":\"node_memory_Unevictable_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Unevictable\"},{\"expr\":\"node_memory_AnonPages_bytes{instance=\\\"$node\\\"}\",\"legend\":\"AnonPages\"},{\"expr\":\"node_memory_Shmem_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Shmem\"},{\"expr\":\"node_memory_Mapped_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Mapped\"},{\"expr\":\"node_memory_Cached_bytes{instance=\\\"$node\\\"} \",\"legend\":\"Cache\"},{\"expr\":\"node_memory_SwapCached_bytes{instance=\\\"$node\\\"}\",\"legend\":\"SwapCache\"},{\"expr\":\"node_memory_Mlocked_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Mlocked\"},{\"expr\":\"node_memory_Buffers_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Buffers\"}],\"name\":\"用户态内存使用\",\"description\":\"内存指标可参考链接 [/PROC/MEMINFO之谜](http://linuxperf.com/?p=142) \",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.35,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_memory_Slab_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Slab \"},{\"expr\":\"node_memory_SReclaimable_bytes{instance=\\\"$node\\\"}\",\"legend\":\"SReclaimable \"},{\"expr\":\"node_memory_SUnreclaim_bytes{instance=\\\"$node\\\"}\",\"legend\":\"SUnreclaim \"},{\"expr\":\"node_memory_VmallocUsed_bytes{instance=\\\"$node\\\"}\",\"legend\":\"VmallocUsed\"},{\"expr\":\"node_memory_VmallocChunk_bytes{instance=\\\"$node\\\"}\",\"legend\":\"VmallocChunk\"},{\"expr\":\"node_memory_KernelStack_bytes{instance=\\\"$node\\\"}\",\"legend\":\"KernelStack\"},{\"expr\":\"node_memory_Bounce_bytes{instance=\\\"$node\\\"}\",\"legend\":\"Bounce \"}],\"name\":\"内核态内存使用\",\"description\":\"内存指标可参考链接 [/PROC/MEMINFO之谜](http://linuxperf.com/?p=142) \",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_memory_DirectMap1G_bytes{instance=\\\"$node\\\"}\",\"legend\":\"DirectMap1G\"},{\"expr\":\"node_memory_DirectMap2M_bytes{instance=\\\"$node\\\"}\",\"legend\":\"DirectMap2M\"},{\"expr\":\"node_memory_DirectMap4k_bytes{instance=\\\"$node\\\"}\",\"legend\":\"DirectMap4K\"}],\"name\":\"TLB效率\",\"description\":\"/proc/meminfo中的DirectMap所统计的不是关于内存的使用而是一个反映TLB效率的指标。TLB(Translation Lookaside Buffer)是位于CPU上的缓存用于将内存的虚拟地址翻译成物理地址由于TLB的大小有限不能缓存的地址就需要访问内存里的page table来进行翻译速度慢很多。为了尽可能地将地址放进TLB缓存新的CPU硬件支持比4k更大的页面从而达到减少地址数量的目的 比如2MB4MB甚至1GB的内存页视不同的硬件而定。”DirectMap4k”表示映射为4kB的内存数量 “DirectMap2M”表示映射为2MB的内存数量以此类推。所以DirectMap其实是一个反映TLB效率的指标\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_memory_NFS_Unstable_bytes{instance=\\\"$node\\\"}\",\"legend\":\"NFS Unstable\"},{\"expr\":\"node_memory_Writeback_bytes{instance=\\\"$node\\\"}\",\"legend\":\"memory_Writeback\"},{\"expr\":\"node_memory_Dirty_bytes{instance=\\\"$node\\\"}\",\"legend\":\"memory_Dirty\"}],\"name\":\"dirty page\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "磁盘详情",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"node_filesystem_avail_bytes{instance=\\\"$node\\\",device!~'rootfs', device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - Available\"},{\"expr\":\"node_filesystem_free_bytes{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - Free\"},{\"expr\":\"node_filesystem_size_bytes{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - Total\"}],\"name\":\"磁盘空间\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":null},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_filesystem_files{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - total\"},{\"expr\":\"node_filesystem_files_free{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - free\"}],\"name\":\"inode\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_filefd_maximum{instance=\\\"$node\\\"}\",\"legend\":\"Max open files\"},{\"expr\":\"node_filefd_allocated{instance=\\\"$node\\\"}\",\"legend\":\"Open files\"}],\"name\":\"fd使用\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_filesystem_readonly{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - ReadOnly\"},{\"expr\":\"node_filesystem_device_error{instance=\\\"$node\\\",device!~'rootfs',device!~\\\"tmpfs\\\",mountpoint!~\\\"/var/lib.*\\\"}\",\"legend\":\"{{mountpoint}} - Device error\"}],\"name\":\"读写错误\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_disk_reads_completed_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}}-reads\"},{\"expr\":\"rate(node_disk_writes_completed_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}} - Writes\"},{\"expr\":\"rate(node_disk_reads_merged_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}} - Read merged\"},{\"expr\":\"rate(node_disk_writes_merged_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}} - Write merged\"}],\"name\":\"IO/Merged次数\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_disk_read_bytes_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}}-Read bytes\"},{\"expr\":\"rate(node_disk_written_bytes_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}} - Written bytes\"}],\"name\":\"读写数据大小\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_disk_io_time_seconds_total{instance=\\\"$node\\\"}[1m])\",\"legend\":\"{{device}}\"}],\"name\":\"io util\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"(rate(node_disk_read_time_seconds_total{instance=\\\"$node\\\"}[1m]) / rate(node_disk_reads_completed_total{instance=\\\"$node\\\"}[1m])+rate(node_disk_write_time_seconds_total{instance=\\\"$node\\\"}[1m]) / rate(node_disk_writes_completed_total{instance=\\\"$node\\\"}[1m]))*1000\",\"legend\":\"{{device}}\"}],\"name\":\"io await\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.64,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"(rate(node_disk_read_bytes_total{instance=\\\"$node\\\"}[1m]) + rate(node_disk_written_bytes_total{instance=\\\"$node\\\"}[1m]))\\n/\\n(rate(node_disk_reads_completed_total{instance=\\\"$node\\\"}[1m]) + rate(node_disk_writes_completed_total{instance=\\\"$node\\\"}[1m]))\",\"legend\":\"avgrq-sz\"}],\"name\":\"avgrq-sz\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":4,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_disk_io_time_weighted_seconds_total{instance=\\\"$node\\\"}[1m])\\n\",\"legend\":\"{{device}}\"}],\"name\":\"avgqu-sz\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":4,\"i\":\"9\"}}",
"weight": 0
}
]
},
{
"name": "网络详情",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_network_receive_bytes_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])*8\",\"legend\":\"{{device}} - in\"},{\"expr\":\"rate(node_network_transmit_bytes_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])*8\",\"legend\":\"{{device}} - out\"}],\"name\":\"出入流量大小\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_network_receive_packets_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - in\"},{\"expr\":\"rate(node_network_transmit_packets_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - out\"}],\"name\":\"packets\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_network_receive_errs_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - in\"},{\"expr\":\"rate(node_network_transmit_errs_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - out\"}],\"name\":\"error\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(node_network_receive_drop_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - in\"},{\"expr\":\"rate(node_network_transmit_drop_total{instance=\\\"$node\\\",device=~\\\"e.*\\\"}[1m])\",\"legend\":\"{{device}} - out\"}],\"name\":\"drop\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_nf_conntrack_entries{instance=\\\"$node\\\"}\",\"legend\":\"NF conntrack entries\"},{\"expr\":\"node_nf_conntrack_entries_limit{instance=\\\"$node\\\"}\",\"legend\":\"NF conntrack limit\"}],\"name\":\"nf_conntrack\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_sockstat_TCP_alloc{instance=\\\"$node\\\"}\",\"legend\":\"TCP_alloc\"},{\"expr\":\"node_sockstat_TCP_inuse{instance=\\\"$node\\\"}\",\"legend\":\"TCP_inuse\"},{\"expr\":\"node_sockstat_TCP_orphan{instance=\\\"$node\\\"}\",\"legend\":\"TCP_orphan\"},{\"expr\":\"node_sockstat_TCP_tw{instance=\\\"$node\\\"}\",\"legend\":\"TCP_tw\"},{\"expr\":\"node_netstat_Tcp_CurrEstab{instance=\\\"$node\\\"}\",\"legend\":\"TCP_CurrEstab\"}],\"name\":\"tcp\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.27,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"node_sockstat_sockets_used{instance=\\\"$node\\\"}\",\"legend\":\"Sockets_used\"}],\"name\":\"socket\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":2,\"i\":\"6\"}}",
"weight": 0
}
]
}
]
}
]

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,109 @@
[
{
"name": "MongoDB Overview - 模板",
"tags": "Prometheus MongoDB",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(mongodb_up,instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_up{instance=\\\"$instance\\\"}\",\"time\":{\"num\":1,\"unit\":\"hour\",\"description\":\"小时\"}}],\"name\":\"Up\",\"description\":\"实例数\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":1},\"result\":{\"color\":\"#53b503\"}},{\"type\":\"range\",\"match\":{\"special\":null,\"from\":0,\"to\":1},\"result\":{\"color\":\"#e70d0d\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_instance_uptime_seconds{instance='$instance'}\",\"time\":{\"num\":1,\"unit\":\"hour\",\"description\":\"小时\"}}],\"name\":\"Uptime\",\"description\":\"启用时长\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"title\":null}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#ec7718\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_memory{instance='$instance'} * 1024 * 1024\",\"legend\":\"{{type}}\"}],\"name\":\"Memory\",\"description\":\"内存占用MiB\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_extra_info_page_faults_total{instance=\\\"$instance\\\"}[5m])\",\"legend\":\"{{type}}\"}],\"name\":\"Page Faults\",\"description\":\"页缺失中断次数 Page faults indicate that requests are processed from disk either because an index is missing or there is not enough memory for the data set. Consider increasing memory or sharding out.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"none\",\"decimals\":null},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_ss_network_bytesOut{instance='$instance'}[5m])\",\"legend\":\"bytesOut\"},{\"expr\":\"rate(mongodb_ss_network_bytesIn{instance='$instance'}[5m])\",\"refId\":\"B\",\"legend\":\"bytesIn\"}],\"name\":\"Network I/O\",\"description\":\"网络流量byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_connections{instance=\\\"$instance\\\", state=\\\"current\\\"}\",\"legend\":\"Connections\"}],\"name\":\"Connections\",\"description\":\"连接数 Keep in mind the hard limit on the maximum number of connections set by your distribution.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_asserts_total{instance=\\\"$instance\\\"}[5m])\",\"legend\":\"{{type}}\"}],\"name\":\"Assert Events\",\"description\":\"断言错误次数 Asserts are not important by themselves, but you can correlate spikes with other graphs.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_mongod_global_lock_current_queue{instance=\\\"$instance\\\"}\",\"legend\":\"{{type}}\"}],\"name\":\"Lock Queue\",\"description\":\"等待获取锁操作数量 Any number of queued operations for long periods of time is an indication of possible issues. Find the cause and fix it before requests get stuck in the queue.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":2,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "Operation Info",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_op_counters_total{instance=\\\"$instance\\\", type!=\\\"command\\\"}[5m])\",\"legend\":\"{{type}}\"},{\"expr\":\"rate(mongodb_mongod_op_counters_repl_total{instance=\\\"$instance\\\", type!~\\\"(command|query|getmore)\\\"}[5m]) or \\nrate(mongodb_mongos_op_counters_repl_total{instance=\\\"$instance\\\", type!~\\\"(command|query|getmore)\\\"}[5m])\",\"refId\":\"B\",\"legend\":\"repl_{{type}}\"},{\"expr\":\"rate(mongodb_mongod_metrics_ttl_deleted_documents_total{instance=\\\"$instance\\\"}[5m]) or \\nrate(mongodb_mongos_metrics_ttl_deleted_documents_total{instance=\\\"$instance\\\"}[5m])\",\"refId\":\"C\",\"legend\":\"ttl_delete\"}],\"name\":\"Command Operations\",\"description\":\"接收请求数 Shows how many times a command is executed per second on average during the selected interval.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_mongod_metrics_document_total{instance=\\\"$instance\\\"}[5m])\",\"legend\":\"{{state}}\"}],\"name\":\"Document Operations\",\"description\":\"文档操作数 When used in combination with 'Command Operations', this graph can help identify write amplification. For example, when one insert or update command actually inserts or updates hundreds, thousands, or even millions of documents.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_mongod_op_latencies_latency_total{instance='$instance'}[5m]) / rate(mongodb_mongod_op_latencies_ops_total{instance='$instance'}[5m]) / 1000\",\"legend\":\"{{type}}\"}],\"name\":\"Response Time\",\"description\":\"操作详情耗时(毫秒)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"milliseconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(increase(mongodb_mongod_metrics_query_executor_total{instance=\\\"$instance\\\", state=\\\"scanned_objects\\\"}[5m])) / sum(increase(mongodb_mongod_metrics_document_total{instance=\\\"$instance\\\", state=\\\"returned\\\"}[5m]))\",\"legend\":\"Document\"},{\"expr\":\"sum(increase(mongodb_mongod_metrics_query_executor_total{instance=\\\"$instance\\\", state=\\\"scanned\\\"}[5m])) / sum(increase(mongodb_mongod_metrics_document_total{instance=\\\"$instance\\\", state=\\\"returned\\\"}[5m]))\",\"refId\":\"B\",\"legend\":\"Index\"}],\"name\":\"Query Efficiency\",\"description\":\"查询效率\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":2,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_mongod_metrics_cursor_open{instance=\\\"$instance\\\"}\",\"legend\":\"{{state}}\"}],\"name\":\"Cursors\",\"description\":\"游标数量 Helps identify why connections are increasing. Shows active cursors compared to cursors being automatically killed after 10 minutes due to an application not closing the connection.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "Cache Info",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_mongod_wiredtiger_cache_bytes{instance='$instance'}\",\"legend\":\"{{type}}\"}],\"name\":\"Cache Size\",\"description\":\"缓存大小byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_mongod_wiredtiger_cache_bytes_total{instance='$instance'}[5m])\",\"legend\":\"{{type}}\"}],\"name\":\"Cache I/O\",\"description\":\"写入或读取的缓存数据大小byte\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"100 * sum(mongodb_mongod_wiredtiger_cache_pages{instance='$instance',type=\\\"dirty\\\"}) / sum(mongodb_mongod_wiredtiger_cache_pages{instance='$instance',type=\\\"total\\\"})\",\"legend\":\"dirty rate\"}],\"name\":\"Cache Dirty Pages Rate\",\"description\":\"缓存脏页占比\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(mongodb_mongod_wiredtiger_cache_evicted_total{instance='$instance'}[5m])\",\"legend\":\"evicted pages\"}],\"name\":\"Cache Evicted Pages\",\"description\":\"缓存剔除页数量\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "ReplSet Info",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"time() - mongodb_mongod_replset_member_election_date\"}],\"name\":\"Replset Election\",\"description\":\"副本集选主时间\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#f24526\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{\"util\":\"seconds\",\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"mongodb_mongod_replset_member_replication_lag{instance=\\\"$instance\\\"}\",\"legend\":\"lag\"}],\"name\":\"Replset Lag Seconds\",\"description\":\"副本集成员主从同步延迟\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"seconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,119 @@
[
{
"name": "MySQL Overview - 模板",
"tags": "Prometheus MySQL",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(mysql_global_status_uptime, instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(mysql_global_status_uptime{instance=~\\\"$instance\\\"})\"}],\"name\":\"MySQL Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#ec7718\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#369603\"}}],\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_queries{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Current QPS\",\"description\":\"mysql_global_status_queries\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":100},\"result\":{\"color\":\"#05a31f\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#ea3939\"}}],\"standardOptions\":{\"decimals\":2}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"avg(mysql_global_variables_innodb_buffer_pool_size{instance=~\\\"$instance\\\"})\"}],\"name\":\"InnoDB Buffer Pool\",\"description\":\"**InnoDB Buffer Pool Size**\\n\\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(increase(mysql_global_status_table_locks_waited{instance=~\\\"$instance\\\"}[5m]))\"}],\"name\":\"Table Locks Waited(5min)\",\"description\":\"**Table Locks**\\n\\nMySQL takes a number of different locks for varying reasons. In this graph we see how many Table level locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could actually be row locks as it only takes table level locks in a few specific cases.\\n\\nIt is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":1},\"result\":{\"color\":\"#e70d0d\"}},{\"type\":\"range\",\"match\":{\"to\":1},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Connections\"},{\"expr\":\"sum(mysql_global_status_max_used_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Used Connections\"},{\"expr\":\"sum(mysql_global_variables_max_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Connections\"},{\"expr\":\"sum(rate(mysql_global_status_aborted_connects{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Aborted Connections\"}],\"name\":\"MySQL Connections\",\"description\":\"**Max Connections** \\n\\nMax Connections is the maximum permitted number of simultaneous client connections. By default, this is 151. Increasing this value increases the number of file descriptors that mysqld requires. If the required number of descriptors are not available, the server reduces the value of Max Connections.\\n\\nmysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege, such as root.\\n\\nMax Used Connections is the maximum number of connections that have been in use simultaneously since the server started.\\n\\nConnections is the number of connection attempts (successful or not) to the MySQL server.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Connected\"},{\"expr\":\"sum(mysql_global_status_threads_running{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Running\"}],\"name\":\"MySQL Client Thread Activity\",\"description\":\"Threads Connected is the number of open connections, while Threads Running is the number of threads not sleeping.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Query Performance",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_created_tmp_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_disk_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Disk Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_files{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Files\"}],\"name\":\"MySQL Temporary Objects\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.64,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_select_full_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_full_range_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Range Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_range{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range\"},{\"expr\":\"sum(rate(mysql_global_status_select_range_check{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range Check\"},{\"expr\":\"sum(rate(mysql_global_status_select_scan{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Scan\"}],\"name\":\"MySQL Select Types\",\"description\":\"**MySQL Select Types**\\n\\nAs with most relational databases, selecting based on indexes is more efficient than scanning an entire table's data. Here we see the counters for selects not done with indexes.\\n\\n* ***Select Scan*** is how many queries caused full table scans, in which all the data in the table had to be read and either discarded or returned.\\n* ***Select Range*** is how many queries used a range scan, which means MySQL scanned all rows in a given range.\\n* ***Select Full Join*** is the number of joins that are not joined on an index, this is usually a huge performance hit.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"stack\":\"off\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.41},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_sort_rows{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Rows\"},{\"expr\":\"sum(rate(mysql_global_status_sort_range{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Range\"},{\"expr\":\"sum(rate(mysql_global_status_sort_merge_passes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Merge Passes\"},{\"expr\":\"sum(rate(mysql_global_status_sort_scan{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Scan\"}],\"name\":\"MySQL Sorts\",\"description\":\"**MySQL Sorts**\\n\\nDue to a query's structure, order, or other requirements, MySQL sorts the rows before returning them. For example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to return 10 to 1.\\n\\nThis graph also shows when sorts had to scan a whole table or a given range of a table in order to return the results and which could not have been sorted via an index.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_slow_queries{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Slow Queries\"}],\"name\":\"MySQL Slow Queries\",\"description\":\"**MySQL Slow Queries**\\n\\nSlow queries are defined as queries being slower than the long_query_time setting. For example, if you have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this graph.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"bars\",\"stack\":\"off\",\"fillOpacity\":0.81},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_bytes_received{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Inbound\"},{\"expr\":\"sum(rate(mysql_global_status_bytes_sent{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Outbound\"}],\"name\":\"MySQL Network Traffic\",\"description\":\"**MySQL Network Traffic**\\n\\nHere we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from MySQL and Inbound is network traffic MySQL has received.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Commands, Handlers",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"topk(10, rate(mysql_global_status_commands_total{instance=~\\\"$instance\\\"}[5m])>0)\",\"legend\":\"Com_{{command}}\"}],\"name\":\"Top Command Counters\",\"description\":\"**Top Command Counters**\\n\\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.2,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler!~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Handlers\",\"description\":\"**MySQL Handlers**\\n\\nHandler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.\\n\\nThis is in fact the layer between the Storage Engine and MySQL.\\n\\n* `read_rnd_next` is incremented when the server performs a full table scan and this is a counter you don't really want to see with a high value.\\n* `read_key` is incremented when a read is done with an index.\\n* `read_next` is incremented when the storage engine is asked to 'read the next index entry'. A high value means a lot of index scans are being done.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":3},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler=~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Transaction Handlers\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Open Files",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_variables_open_files_limit{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files Limit\"},{\"expr\":\"mysql_global_status_open_files{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files\"}],\"name\":\"MySQL Open Files\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Table Openings",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n/\\n(\\nrate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n+\\nrate(mysql_global_status_table_open_cache_misses{instance=~\\\"$instance\\\"}[5m])\\n)\",\"legend\":\"Table Open Cache Hit Ratio\"}],\"name\":\"Table Open Cache Hit Ratio Mysql 5.6.6+\",\"description\":\"**MySQL Table Open Cache Status**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_status_open_tables{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Tables\"},{\"expr\":\"mysql_global_variables_table_open_cache{instance=~\\\"$instance\\\"}\",\"legend\":\"Table Open Cache\"}],\"name\":\"MySQL Open Tables\",\"description\":\"**MySQL Open Tables**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,119 @@
[
{
"name": "MySQL Overview - 模板",
"tags": "Prometheus MySQL",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(mysql_global_status_uptime, instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(mysql_global_status_uptime{instance=~\\\"$instance\\\"})\"}],\"name\":\"MySQL Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#ec7718\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#369603\"}}],\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_queries{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Current QPS\",\"description\":\"mysql_global_status_queries\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":100},\"result\":{\"color\":\"#05a31f\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#ea3939\"}}],\"standardOptions\":{\"decimals\":2}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"avg(mysql_global_variables_innodb_buffer_pool_size{instance=~\\\"$instance\\\"})\"}],\"name\":\"InnoDB Buffer Pool\",\"description\":\"**InnoDB Buffer Pool Size**\\n\\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(increase(mysql_global_status_table_locks_waited{instance=~\\\"$instance\\\"}[5m]))\"}],\"name\":\"Table Locks Waited(5min)\",\"description\":\"**Table Locks**\\n\\nMySQL takes a number of different locks for varying reasons. In this graph we see how many Table level locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could actually be row locks as it only takes table level locks in a few specific cases.\\n\\nIt is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":1},\"result\":{\"color\":\"#e70d0d\"}},{\"type\":\"range\",\"match\":{\"to\":1},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Connections\"},{\"expr\":\"sum(mysql_global_status_max_used_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Used Connections\"},{\"expr\":\"sum(mysql_global_variables_max_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Connections\"},{\"expr\":\"sum(rate(mysql_global_status_aborted_connects{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Aborted Connections\"}],\"name\":\"MySQL Connections\",\"description\":\"**Max Connections** \\n\\nMax Connections is the maximum permitted number of simultaneous client connections. By default, this is 151. Increasing this value increases the number of file descriptors that mysqld requires. If the required number of descriptors are not available, the server reduces the value of Max Connections.\\n\\nmysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege, such as root.\\n\\nMax Used Connections is the maximum number of connections that have been in use simultaneously since the server started.\\n\\nConnections is the number of connection attempts (successful or not) to the MySQL server.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Connected\"},{\"expr\":\"sum(mysql_global_status_threads_running{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Running\"}],\"name\":\"MySQL Client Thread Activity\",\"description\":\"Threads Connected is the number of open connections, while Threads Running is the number of threads not sleeping.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Query Performance",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_created_tmp_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_disk_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Disk Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_files{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Files\"}],\"name\":\"MySQL Temporary Objects\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.64,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_select_full_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_full_range_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Range Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_range{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range\"},{\"expr\":\"sum(rate(mysql_global_status_select_range_check{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range Check\"},{\"expr\":\"sum(rate(mysql_global_status_select_scan{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Scan\"}],\"name\":\"MySQL Select Types\",\"description\":\"**MySQL Select Types**\\n\\nAs with most relational databases, selecting based on indexes is more efficient than scanning an entire table's data. Here we see the counters for selects not done with indexes.\\n\\n* ***Select Scan*** is how many queries caused full table scans, in which all the data in the table had to be read and either discarded or returned.\\n* ***Select Range*** is how many queries used a range scan, which means MySQL scanned all rows in a given range.\\n* ***Select Full Join*** is the number of joins that are not joined on an index, this is usually a huge performance hit.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"stack\":\"off\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.41},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_sort_rows{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Rows\"},{\"expr\":\"sum(rate(mysql_global_status_sort_range{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Range\"},{\"expr\":\"sum(rate(mysql_global_status_sort_merge_passes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Merge Passes\"},{\"expr\":\"sum(rate(mysql_global_status_sort_scan{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Scan\"}],\"name\":\"MySQL Sorts\",\"description\":\"**MySQL Sorts**\\n\\nDue to a query's structure, order, or other requirements, MySQL sorts the rows before returning them. For example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to return 10 to 1.\\n\\nThis graph also shows when sorts had to scan a whole table or a given range of a table in order to return the results and which could not have been sorted via an index.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_slow_queries{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Slow Queries\"}],\"name\":\"MySQL Slow Queries\",\"description\":\"**MySQL Slow Queries**\\n\\nSlow queries are defined as queries being slower than the long_query_time setting. For example, if you have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this graph.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"bars\",\"stack\":\"off\",\"fillOpacity\":0.81},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_bytes_received{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Inbound\"},{\"expr\":\"sum(rate(mysql_global_status_bytes_sent{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Outbound\"}],\"name\":\"MySQL Network Traffic\",\"description\":\"**MySQL Network Traffic**\\n\\nHere we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from MySQL and Inbound is network traffic MySQL has received.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Commands, Handlers",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"topk(10, rate(mysql_global_status_commands_total{instance=~\\\"$instance\\\"}[5m])>0)\",\"legend\":\"Com_{{command}}\"}],\"name\":\"Top Command Counters\",\"description\":\"**Top Command Counters**\\n\\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.2,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler!~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Handlers\",\"description\":\"**MySQL Handlers**\\n\\nHandler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.\\n\\nThis is in fact the layer between the Storage Engine and MySQL.\\n\\n* `read_rnd_next` is incremented when the server performs a full table scan and this is a counter you don't really want to see with a high value.\\n* `read_key` is incremented when a read is done with an index.\\n* `read_next` is incremented when the storage engine is asked to 'read the next index entry'. A high value means a lot of index scans are being done.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":3},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler=~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Transaction Handlers\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Open Files",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_variables_open_files_limit{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files Limit\"},{\"expr\":\"mysql_global_status_innodb_num_open_files{instance=~\\\"$instance\\\"}\",\"legend\":\"InnoDB Open Files\"}],\"name\":\"MySQL Open Files\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Table Openings",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n/\\n(\\nrate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n+\\nrate(mysql_global_status_table_open_cache_misses{instance=~\\\"$instance\\\"}[5m])\\n)\",\"legend\":\"Table Open Cache Hit Ratio\"}],\"name\":\"Table Open Cache Hit Ratio\",\"description\":\"**MySQL Table Open Cache Status**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_status_open_tables{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Tables\"},{\"expr\":\"mysql_global_variables_table_open_cache{instance=~\\\"$instance\\\"}\",\"legend\":\"Table Open Cache\"}],\"name\":\"MySQL Open Tables\",\"description\":\"**MySQL Open Tables**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,19 @@
[
{
"name": "TCP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(net_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(net_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Targets\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,127 @@
[
{
"name": "Oracle - 模板",
"tags": "Telegraf",
"configs": "{\"var\":[{\"name\":\"ident\",\"definition\":\"label_values(oracle_up,ident)\",\"options\":[\"tt-fc-log00.nj\"]},{\"name\":\"instance\",\"definition\":\"label_values(oracle_up{ident=\\\"$ident\\\"},instance)\"}]}",
"chart_groups": [
{
"name": "Activities",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_execute_count_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"execute count / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_commits_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user commits / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_rollbacks_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user rollbacks / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_up{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"status\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"text\":\"UP\",\"color\":\"#5ea70f\"}},{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"DOWN\",\"color\":\"#f60f0f\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Waits",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_wait_time_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{wait_class}}\"}],\"name\":\"Time waited\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Tablespace",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}/oracle_tablespace_max_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Used Percent\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_free{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Free space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "IO and TPS",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_requests_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_user_transaction_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"TPS\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_megabytes_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}*1024*1024\"}],\"name\":\"IO Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sessions_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\",status=\\\"ACTIVE\\\"}\",\"legend\":\"\"}],\"name\":\"Sessions\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Hit Ratio",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_buffer_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Buffer Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_redo_allocation_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Redo Allocation Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_row_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Row Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_library_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Library Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Physical Read Write",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Total Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW Total IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,19 @@
[
{
"name": "PING探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(ping_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(ping_percent_packet_loss) by (target)\",\"refId\":\"B\",\"legend\":\"Packet Loss %\"},{\"expr\":\"max(httpresponse_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Ping\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,153 @@
[
{
"name": "Linux Process - 模板",
"tags": "Prometheus Process",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(namedprocess_namegroup_num_procs, instance)\",\"multi\":false,\"options\":[\"tt-fc-es02.nj:12346\"]},{\"definition\":\"label_values(namedprocess_namegroup_cpu_seconds_total{instance=~\\\"$instance\\\"},groupname)\",\"name\":\"processes\",\"multi\":true,\"options\":[\"(sd-pam)\",\"NetworkManager\",\"YDLive\",\"YDPython\",\"YDService\",\"agent\",\"agetty\",\"atd\",\"auditd\",\"barad_agent\",\"bash\",\"chronyd\",\"crond\",\"dbus-daemon\",\"fc-agent\",\"fc-alert\",\"fc-checker\",\"gpg-agent\",\"less\",\"lsmd\",\"mongod\",\"mysql\",\"mysqld\",\"nginx\",\"ngo\",\"node\",\"openvpn\",\"podman pause\",\"polkitd\",\"process-agent\",\"redis-server\",\"rsyslogd\",\"sedispatch\",\"sgagent\",\"sh\",\"ssh-agent\",\"sshd\",\"sssd\",\"sssd_be\",\"sssd_nss\",\"su\",\"sudo\",\"systemd\",\"systemd-journal\",\"systemd-logind\",\"systemd-udevd\",\"tat_agent\",\"trace-agent\",\"tuned\",\"unbound-anchor\",\"vminsert-prod\",\"vmselect-prod\",\"vmstorage-prod\"],\"allOption\":true}]}",
"chart_groups": [
{
"name": "Cpu Usage",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(rate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"user\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) + ignoring(mode) rate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"system\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])) or (irate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"user\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) + ignoring(mode) irate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"system\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Total CPU cores used\",\"description\":\"进程占用CPU时间用户态+内核态倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5, rate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"system\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or ( irate(namedprocess_namegroup_cpu_seconds_total{mode=\\\"system\\\",groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by System CPU cores used\",\"description\":\"进程占用CPU时间内核态倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Memory Usage",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( (avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"swapped\\\",instance=~\\\"$instance\\\"}[5m])+ ignoring (memtype) avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"resident\\\",instance=~\\\"$instance\\\"}[5m])) or (avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"swapped\\\",instance=~\\\"$instance\\\"}[5m])+ ignoring (memtype) avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"resident\\\",instance=~\\\"$instance\\\"}[5m])) ))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Used memory\",\"description\":\"进程常驻内存与交换空间平均占用容量之和倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5, (avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"resident\\\",instance=~\\\"$instance\\\"}[5m]) or avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"resident\\\",instance=~\\\"$instance\\\"}[5m]) ))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Resident Memory\",\"description\":\"进程常驻内存平均占用容量之和倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"swapped\\\",instance=~\\\"$instance\\\"}[5m]) or avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"swapped\\\",instance=~\\\"$instance\\\"}[5m]))) \",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Swapped Memory\",\"description\":\"进程交换内存平均占用容量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"virtual\\\",instance=~\\\"$instance\\\"}[5m]) or avg_over_time(namedprocess_namegroup_memory_bytes{groupname=~\\\"$processes\\\", memtype=\\\"virtual\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Virtual Memory\",\"description\":\"进程虚拟内存平均占用容量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Disk IO Usage",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(rate(namedprocess_namegroup_write_bytes_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or irate(namedprocess_namegroup_write_bytes_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Bytes Written\",\"description\":\"进程写数据量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(rate(namedprocess_namegroup_read_bytes_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or irate(namedprocess_namegroup_read_bytes_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Bytes Read\",\"description\":\"进程读数据量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Process and Thread Counts",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(max_over_time(namedprocess_namegroup_num_procs{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or max_over_time(namedprocess_namegroup_num_procs{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by number of processes instances\",\"description\":\"同名进程数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(max_over_time(namedprocess_namegroup_num_threads{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or max_over_time(namedprocess_namegroup_num_threads{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by number of threads\",\"description\":\"进程内线程数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Context Switches",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( rate(namedprocess_namegroup_context_switches_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\",ctxswitchtype=\\\"voluntary\\\"}[5m]) or irate(namedprocess_namegroup_context_switches_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\",ctxswitchtype=\\\"voluntary\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top Processes by Voluntary Context Switches\",\"description\":\"进程自愿中断如I/O完成次数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( rate(namedprocess_namegroup_context_switches_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\",ctxswitchtype=\\\"nonvoluntary\\\"}[5m]) or irate(namedprocess_namegroup_context_switches_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\",ctxswitchtype=\\\"nonvoluntary\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top Processes by Non-Voluntary Context Switches\",\"description\":\"进程被迫中断如CPU时间耗尽次数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "File Descriptors",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,(max_over_time(namedprocess_namegroup_open_filedesc{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or max_over_time(namedprocess_namegroup_open_filedesc{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Open File Descriptors\",\"description\":\"打开文件数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( max_over_time(namedprocess_namegroup_worst_fd_ratio{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or max_over_time(namedprocess_namegroup_worst_fd_ratio{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) ))*100\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by File Descriptor Usage Percent\",\"description\":\"已打开文件数与允许打开文件数占比倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percent\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Page Faults",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( rate(namedprocess_namegroup_major_page_faults_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or irate(namedprocess_namegroup_major_page_faults_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Major Page Faults\",\"description\":\"主要页缺失次数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( rate(namedprocess_namegroup_minor_page_faults_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m]) or irate(namedprocess_namegroup_minor_page_faults_total{groupname=~\\\"$processes\\\",instance=~\\\"$instance\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top processes by Minor Page Faults\",\"description\":\"次要页缺失次数倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Statuses",
"weight": 7,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( max_over_time(namedprocess_namegroup_states{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\", state=\\\"Running\\\"}[5m]) or max_over_time(namedprocess_namegroup_states{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\", state=\\\"Running\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top running processes\",\"description\":\"运行态同名进程数量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,( max_over_time(namedprocess_namegroup_states{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\", state=\\\"Waiting\\\"}[5m]) or max_over_time(namedprocess_namegroup_states{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\", state=\\\"Waiting\\\"}[5m])))\",\"legend\":\"{{groupname}}\"}],\"name\":\"Top of processes waiting on IO\",\"description\":\"等待IO状态同名进程数量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Kernel Waits(WCHAN)",
"weight": 8,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,sum(avg_over_time(namedprocess_namegroup_threads_wchan{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\"}[5m])) by (wchan) )\",\"legend\":\"{{wchan}}\"}],\"name\":\"Kernel waits for All\",\"description\":\"内核函数等待线程数量倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"topk(5,sum(avg_over_time(namedprocess_namegroup_threads_wchan{instance=~\\\"$instance\\\", groupname=~\\\"$processes\\\"}[5m])) by (wchan,groupname) )\",\"legend\":\"{{wchan}}\"}],\"name\":\"Kernel wait Details for All\",\"description\":\"内核函数等待线程数量按进程统计倒排前5\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Uptime",
"weight": 9,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"time()-(namedprocess_namegroup_oldest_start_time_seconds{instance=~\\\"$instance\\\",groupname=~\\\"$processes\\\"}>0)\",\"legend\":\"{{groupname}}\"}],\"name\":\"Processes by uptime\",\"description\":\"进程启动时长\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"seriesToRows\"},\"options\":{\"standardOptions\":{\"util\":\"seconds\"}},\"overrides\":[{\"properties\":{\"standardOptions\":{\"util\":\"seconds\"},\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#f91010\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#21f312\"}}]}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,221 @@
[
{
"name": "RabbitMQ 3.8+",
"tags": "",
"configs": "{\"var\":[{\"definition\":\"label_values(rabbitmq_identity_info, rabbitmq_cluster)\",\"name\":\"rabbitmq_cluster\"}]}",
"chart_groups": [
{
"name": "Overview",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Ready messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10000},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":100000},\"result\":{\"color\":\"#f50a0a\"}},{\"type\":\"range\",\"match\":{\"to\":9999},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Incoming messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) - sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Publishers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_connections * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Connections\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queues * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Queues\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Unacknowledged messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":99},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":500},\"result\":{\"color\":\"#d0021b\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":1,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Outgoing messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":1,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Consumers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":1,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Channels\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":1,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Nodes\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":null,\"from\":3},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":8},\"result\":{\"color\":\"#e70909\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":1,\"i\":\"9\"}}",
"weight": 0
}
]
},
{
"name": "Nodes",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"nodes\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelsOfSeriesToRows\",\"columns\":[\"rabbitmq_cluster\",\"rabbitmq_node\",\"rabbitmq_version\",\"erlang_version\"]},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":1,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_resident_memory_limit_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_resident_memory_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Memory available before publishers blocked\",\"description\":\"If the value is zero or less, the memory alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the memory alarm is triggered with a slight delay.\\n\\nThe kernel's view of the amount of memory used by the node can differ from what the node itself can observe. This means that this value can be negative for a sustained period of time.\\n\\nBy default nodes use resident set size (RSS) to compute how much memory they use. This strategy can be changed (see the guides below).\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Memory Alarms](https://www.rabbitmq.com/memory.html)\\n* [Reasoning About Memory Use](https://www.rabbitmq.com/memory-use.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":1,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_disk_space_available_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Disk space available before publishers blocked\",\"description\":\"This metric is reported for the partition where the RabbitMQ data directory is stored.\\n\\nIf the value is zero or less, the disk alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the free disk space alarm is triggered with a slight delay.\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Disk Space Alarms](https://www.rabbitmq.com/disk-alarms.html)\\n* [Disk Space](https://www.rabbitmq.com/production-checklist.html#resource-limits-disk-space)\\n* [Persistence Configuration](https://www.rabbitmq.com/persistence-conf.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"File descriptors available\",\"description\":\"When this value reaches zero, new connections will not be accepted and disk write operations may fail.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Open File Handles Limit](https://www.rabbitmq.com/production-checklist.html#resource-limits-file-handle-limit)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"TCP sockets available\",\"description\":\"When this value reaches zero, new connections will not be accepted.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Networking and RabbitMQ](https://www.rabbitmq.com/networking.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "QUEUED MESSAGES",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages ready to be delivered to consumers\",\"description\":\"Total number of ready messages ready to be delivered to consumers.\\n\\nAim to keep this value as low as possible. RabbitMQ behaves best when messages are flowing through it. It's OK for publishers to occasionally outpace consumers, but the expectation is that consumers will eventually process all ready messages.\\n\\nIf this metric keeps increasing, your system will eventually run out of memory and/or disk space. Consider using TTL or Queue Length Limit to prevent unbounded message growth.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\\n* [Queue Length Limit](https://www.rabbitmq.com/maxlength.html)\\n* [Time-To-Live and Expiration](https://www.rabbitmq.com/ttl.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages pending consumer acknowledgement\",\"description\":\"The total number of messages that are either in-flight to consumers, currently being processed by consumers or simply waiting for the consumer acknowledgements to be processed by the queue. Until the queue processes the message acknowledgement, the message will remain unacknowledged.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "INCOMING MESSAGES",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages published / s\",\"description\":\"The incoming message rate before any routing rules are applied.\\n\\nIf this value is lower than the number of messages published to queues, it may indicate that some messages are delivered to more than one queue.\\n\\nIf this value is higher than the number of messages published to queues, messages cannot be routed and will either be dropped or returned to publishers.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_confirmed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages confirmed to publishers / s\",\"description\":\"The rate of messages confirmed by the broker to publishers. Publishers must opt-in to receive message confirmations.\\n\\nIf this metric is consistently at zero it may suggest that publisher confirms are not used by clients. The safety of published messages is likely to be at risk.\\n\\n* [Publisher Confirms](https://www.rabbitmq.com/confirms.html#publisher-confirms)\\n* [Publisher Confirms and Data Safety](https://www.rabbitmq.com/publishers.html#data-safety)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queue_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages routed to queues / s\",\"description\":\"The rate of messages received from publishers and successfully routed to the master queue replicas.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unconfirmed[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages unconfirmed to publishers / s\",\"description\":\"The rate of messages received from publishers that have publisher confirms enabled and the broker has not confirmed yet.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_dropped_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages dropped / s\",\"description\":\"The rate of messages that cannot be routed and are dropped. \\n\\nAny value above zero means message loss and likely suggests a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_returned_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages returned to publishers / s\",\"description\":\"The rate of messages that cannot be routed and are returned back to publishers.\\n\\nSustained values above zero may indicate a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
}
]
},
{
"name": "OUTGOING MESSAGES",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(\\n (rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\n (rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\\n) by(rabbitmq_node)\"}],\"name\":\"Messages delivered / s\",\"description\":\"The rate of messages delivered to consumers. It includes messages that have been redelivered.\\n\\nThis metric does not include messages that have been fetched by consumers using `basic.get` (consumed by polling).\\n\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages redelivered / s\",\"description\":\"The rate of messages that have been redelivered to consumers. It includes messages that have been requeued automatically and redelivered due to channel exceptions or connection closures.\\n\\nHaving some redeliveries is expected, but if this metric is consistently non-zero, it is worth investigating why.\\n\\n* [Negative Acknowledgement and Requeuing of Deliveries](https://www.rabbitmq.com/confirms.html#consumer-nacks-requeue)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered with manual ack / s\",\"description\":\"The rate of message deliveries to consumers that use manual acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ waits for consumers to acknowledge messages before more messages can be delivered.\\n\\nThis is the safest way of consuming messages.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered auto ack / s\",\"description\":\"The rate of message deliveries to consumers that use automatic acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ does not wait for consumers to acknowledge message deliveries.\\n\\nThis mode is fire-and-forget and does not offer any delivery safety guarantees. It tends to provide higher throughput and it may lead to consumer overload and higher consumer memory usage.\\n\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_acked_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages acknowledged / s\",\"description\":\"The rate of message acknowledgements coming from consumers that use manual acknowledgement mode.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with auto ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use automatic acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_empty_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations that yield no result / s\",\"description\":\"The rate of polling consumer operations that yield no result.\\n\\nAny value above zero means that RabbitMQ resources are wasted by polling consumers.\\n\\nCompare this metric to the other polling consumer metrics to see the inefficiency rate.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":3,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with manual ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use manual acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":3,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "QUEUES",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_queues * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total queues\",\"description\":\"Total number of queue masters per node. \\n\\nThis metric makes it easy to see sub-optimal queue distribution in a cluster.\\n\\n* [Queue Masters, Data Locality](https://www.rabbitmq.com/ha.html#master-migration-data-locality)\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_declared_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues declared / s\",\"description\":\"The rate of queue declarations performed by clients.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_created_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues created / s\",\"description\":\"The rate of new queues created (as opposed to redeclarations).\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_deleted_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues deleted / s\",\"description\":\"The rate of queues deleted.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "CHANNELS",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_channels * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total channels\",\"description\":\"Total number of channels on all currently opened connections.\\n\\nIf this metric grows monotonically it is highly likely a channel leak in one of the applications. Confirm channel leaks by using the _Channels opened_ and _Channels closed_ metrics.\\n\\n* [Channel Leak](https://www.rabbitmq.com/channels.html#channel-leaks)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels opened / s\",\"description\":\"The rate of new channels opened by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels closed / s\",\"description\":\"The rate of channels closed by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "CONNECTIONS",
"weight": 7,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections closed / s\",\"description\":\"The rate of connections closed. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_connections * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total connections\",\"description\":\"Total number of client connections.\\n\\nIf this metric grows monotonically it is highly likely a connection leak in one of the applications. Confirm connection leaks by using the _Connections opened_ and _Connections closed_ metrics.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections opened / s\",\"description\":\"The rate of new connections opened by clients. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,77 @@
[
{
"name": "Redis Overview - 模板",
"tags": "Redis Prometheus",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(redis_uptime_in_seconds,instance)\",\"selected\":\"10.206.0.16:6379\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(redis_uptime_in_seconds{instance=~\\\"$instance\\\"})\"}],\"name\":\"Redis Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_connected_clients{instance=~\\\"$instance\\\"})\"}],\"name\":\"Connected Clients\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_used_memory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Memory Used\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":128000000},\"result\":{\"color\":\"#079e05\"}},{\"type\":\"range\",\"match\":{\"from\":128000000},\"result\":{\"color\":\"#f10909\"}}],\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_maxmemory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Max Memory Limit\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Commands",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(redis_total_commands_processed{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Commands Executed / sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(redis_keyspace_hits{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"hits\"},{\"expr\":\"irate(redis_keyspace_misses{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"misses\"}],\"name\":\"Hits / Misses per Sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"topk(5, irate(redis_cmdstat_calls{instance=~\\\"$instance\\\"} [1m]))\",\"legend\":\"{{command}}\"}],\"name\":\"Top Commands\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Keys",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum (redis_keyspace_keys{instance=~\\\"$instance\\\"}) by (db)\",\"legend\":\"{{db}}\"}],\"name\":\"Total Items per DB\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_expired_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"expired\"},{\"expr\":\"sum(rate(redis_evicted_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"evicted\"}],\"name\":\"Expired / Evicted\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_keyspace_keys{instance=~\\\"$instance\\\"}) - sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"not expiring\"},{\"expr\":\"sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"expiring\"}],\"name\":\"Expiring vs Not-Expiring Keys\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_total_net_input_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"input\"},{\"expr\":\"sum(rate(redis_total_net_output_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"output\"}],\"name\":\"Network I/O\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,77 @@
[
{
"name": "Redis Overview - 模板",
"tags": "Redis Prometheus",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(redis_uptime_in_seconds,instance)\",\"selected\":\"10.206.0.16:6379\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(redis_uptime_in_seconds{instance=~\\\"$instance\\\"})\"}],\"name\":\"Redis Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_connected_clients{instance=~\\\"$instance\\\"})\"}],\"name\":\"Connected Clients\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_memory_used_bytes{instance=~\\\"$instance\\\"}\"}],\"name\":\"Memory Used\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":128000000},\"result\":{\"color\":\"#079e05\"}},{\"type\":\"range\",\"match\":{\"from\":128000000},\"result\":{\"color\":\"#f10909\"}}],\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_memory_max_bytes{instance=~\\\"$instance\\\"}\"}],\"name\":\"Max Memory Limit\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Commands",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(redis_commands_processed_total{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Commands Executed / sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(redis_keyspace_hits_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"hits\"},{\"expr\":\"irate(redis_keyspace_misses_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"misses\"}],\"name\":\"Hits / Misses per Sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"topk(5, irate(redis_commands_total{instance=~\\\"$instance\\\"} [1m]))\",\"legend\":\"{{cmd}}\"}],\"name\":\"Top Commands\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Keys",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum (redis_db_keys{instance=~\\\"$instance\\\"}) by (db)\",\"legend\":\"{{db}}\"}],\"name\":\"Total Items per DB\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_expired_keys_total{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"expired\"},{\"expr\":\"sum(rate(redis_evicted_keys_total{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"evicted\"}],\"name\":\"Expired / Evicted\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_db_keys{instance=~\\\"$instance\\\"}) - sum(redis_db_keys_expiring{instance=~\\\"$instance\\\"}) \",\"legend\":\"not expiring\"},{\"expr\":\"sum(redis_db_keys_expiring{instance=~\\\"$instance\\\"}) \",\"legend\":\"expiring\"}],\"name\":\"Expiring vs Not-Expiring Keys\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_net_input_bytes_total{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"input\"},{\"expr\":\"sum(rate(redis_net_output_bytes_total{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"output\"}],\"name\":\"Network I/O\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,63 @@
[
{
"name": "Tomcat - 模板",
"tags": "Categraf",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(tomcat_up, instance)\"}],\"links\":[{\"title\":\"n9e\",\"url\":\"https://n9e.gitee.io/\",\"targetBlank\":true}]}",
"chart_groups": [
{
"name": "connector",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_bytes_sent{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"sent\"},{\"expr\":\"rate(tomcat_connector_bytes_received{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"received\"}],\"name\":\"Traffic Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_request_count{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"tomcat_connector_request_count\"},{\"expr\":\"rate(tomcat_connector_error_count{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"tomcat_connector_error_count\"}],\"name\":\"Request count / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_connector_max_threads{instance=\\\"$instance\\\"}\",\"legend\":\"max_threads\"},{\"expr\":\"tomcat_connector_current_thread_count{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"current_thread_count\"},{\"expr\":\"tomcat_connector_current_threads_busy{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"current_threads_busy\"}],\"name\":\"Tread\",\"description\":\"max_threads: The maximum number of allowed worker threads.\\ncurrent_thread_count: The number of threads managed by the thread pool\\ncurrent_threads_busy: The number of threads that are in use\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_processing_time{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"{{name}}-processing_time\"}],\"name\":\"Processing time\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "mem used",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memory_max{instance=\\\"$instance\\\"}\",\"legend\":\"max\"},{\"expr\":\"tomcat_jvm_memory_total{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"used\"},{\"expr\":\"tomcat_jvm_memory_free{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"free\"}],\"name\":\"Mem Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "memorypool",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_used{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_max{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Max\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_committed{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Committed\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_init{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Init\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,99 @@
[
{
"name": "Windows - 模板",
"tags": "Windows Prometheus",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(windows_system_system_up_time, instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"time() - windows_system_system_up_time{instance=~\\\"$instance\\\"}\",\"legend\":\"\"}],\"name\":\"Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"windows_cs_logical_processors{instance=~\\\"$instance\\\"}\",\"legend\":\"\"}],\"name\":\"CPU Core Total\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"windows_cs_physical_memory_bytes{instance=~\\\"$instance\\\"}\"}],\"name\":\"Memory Total\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"windows_os_processes{instance=~\\\"$instance\\\"}\"}],\"name\":\"Process Total\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":100},\"result\":{\"color\":\"#109d06\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#d11010\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "CPU Memory Disk",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"100 * sum by (instance) (rate(windows_cpu_time_total{mode != 'idle'}[5m])) / count by (instance) (windows_cpu_core_frequency_mhz) \",\"legend\":\"CPU Util\"}],\"name\":\"Cpu Util\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"100 - (windows_os_physical_memory_free_bytes{instance=~\\\"$instance\\\"} / windows_cs_physical_memory_bytes{instance=~\\\"$instance\\\"})*100\"}],\"name\":\"Memory Util\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{\"steps\":[{\"value\":70,\"color\":\"#e71313\"}]}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"100 - (windows_logical_disk_free_bytes{instance=~\\\"$instance\\\"} / windows_logical_disk_size_bytes{instance=~\\\"$instance\\\"})*100\",\"legend\":\"{{volume}}\"}],\"name\":\"Disk Util\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"windows_logical_disk_free_bytes{instance=~\\\"$instance\\\"}\",\"legend\":\"{{volume}} Free\"},{\"expr\":\"windows_logical_disk_size_bytes{instance=~\\\"$instance\\\"}\",\"legend\":\"{{volume}} Total\"}],\"name\":\"Disk Free\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Disk IO",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"irate(windows_logical_disk_read_bytes_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"{{volume}} Read\"},{\"expr\":\"irate(windows_logical_disk_write_bytes_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"{{volume}} Write\"}],\"name\":\"Read/Write Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(windows_logical_disk_reads_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"{{volume}} Read\"},{\"expr\":\"irate(windows_logical_disk_writes_total{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"{{volume}} Write\"}],\"name\":\"Read/Write / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"irate(windows_net_bytes_sent_total{instance=~\\\"$instance\\\",nic!~'isatap.*|VPN.*'}[5m])*8\",\"legend\":\"{{nic}} Sent\"},{\"expr\":\"irate(windows_net_bytes_received_total{instance=~\\\"$instance\\\",nic!~'isatap.*|VPN.*'}[5m])*8\",\"legend\":\"{{nic}} Received\"}],\"name\":\"Sent/Received bits / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bitsIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"(irate(windows_net_bytes_total{instance=~\\\"$instance\\\",nic!~'isatap.*|VPN.*'}[5m]) * 8 / windows_net_current_bandwidth{instance=~\\\"$instance\\\",nic!~'isatap.*|VPN.*'}) * 100\"}],\"name\":\"Network Util\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(windows_net_packets_outbound_discarded{instance=~\\\"$instance\\\", nic!~'isatap.*|VPN.*'}[5m]) + irate(windows_net_packets_outbound_errors{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"outbound\"},{\"expr\":\"irate(windows_net_packets_received_discarded{job=~\\\"$job\\\",instance=~\\\"$instance\\\", nic!~'isatap.*|VPN.*'}[5m]) + irate(windows_net_packets_received_errors{job=~\\\"$job\\\",instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"received\"}],\"name\":\"Packets / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "System",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"windows_system_threads{instance=~\\\"$instance\\\"}\"}],\"name\":\"System Threads\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(windows_system_exception_dispatches_total{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"System exception dispatches\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@@ -0,0 +1,55 @@
[
{
"name": "Zookeeper - 模板",
"tags": "",
"configs": "{\"var\":[{\"name\":\"job\",\"definition\":\"label_values(zk_up,job)\"},{\"definition\":\"label_values(zk_up,instance)\",\"name\":\"instance\"}]}",
"chart_groups": [
{
"name": "overview",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_up{job=\\\"$job\\\", instance=\\\"$instance\\\"}\",\"legend\":\"up\"}],\"name\":\"up\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":40}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"color\":\"#3d950e\",\"text\":\"UP\"}},{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"color\":\"#f01414\",\"text\":\"DOWN\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_znode_count{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}\"}],\"name\":\"zk_znode_count\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_watch_count{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}\"}],\"name\":\"zk_watch_count\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_ephemerals_count{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"zk_ephemerals_count\"}],\"name\":\"zk_ephemerals_count\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(zk_packets_sent{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"{{instance}}-sent\"},{\"expr\":\"rate(zk_packets_received{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}[5m])\",\"refId\":\"B\",\"legend\":\"{{instance}}-received\"}],\"name\":\"Pakages\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":1,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_num_alive_connections{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}\"}],\"name\":\"alive_connections\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":3,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_open_file_descriptor_count{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}-open\"},{\"expr\":\"zk_max_file_descriptor_count{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"{{instance}}-max\"}],\"name\":\"file_descriptor\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":3,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_avg_latency{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}-avg\"},{\"expr\":\"zk_min_latency{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"{{instance}}-min\"},{\"expr\":\"zk_max_latency{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"{{instance}}-max\"}],\"name\":\"latency(ms)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":3,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_outstanding_requests{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}\"}],\"name\":\"outstanding_requests\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":3,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"zk_approximate_data_size{job=~\\\"$job\\\", instance=~\\\"$instance\\\"}\",\"legend\":\"{{instance}}\"}],\"name\":\"approximate_data_size\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":1,\"i\":\"9\"}}",
"weight": 0
}
]
}
]
}
]

494
docker/n9eetc/metrics.yaml Normal file
View File

@@ -0,0 +1,494 @@
cpu_usage_idle: CPU空闲率单位%
cpu_usage_active: CPU使用率单位%
cpu_usage_system: CPU内核态时间占比单位%
cpu_usage_user: CPU用户态时间占比单位%
cpu_usage_nice: 低优先级用户态CPU时间占比也就是进程nice值被调整为1-19之间的CPU时间。这里注意nice可取值范围是-20到19数值越大优先级反而越低单位%
cpu_usage_iowait: CPU等待I/O的时间占比单位%
cpu_usage_irq: CPU处理硬中断的时间占比单位%
cpu_usage_softirq: CPU处理软中断的时间占比单位%
cpu_usage_steal: 在虚拟机环境下有该指标表示CPU被其他虚拟机争用的时间占比超过20就表示争抢严重单位%
cpu_usage_guest: 通过虚拟化运行其他操作系统的时间也就是运行虚拟机的CPU时间占比单位%
cpu_usage_guest_nice: 以低优先级运行虚拟机的时间占比(单位:%
disk_free: 硬盘分区剩余量单位byte
disk_used: 硬盘分区使用量单位byte
disk_used_percent: 硬盘分区使用率(单位:%
disk_total: 硬盘分区总量单位byte
disk_inodes_free: 硬盘分区inode剩余量
disk_inodes_used: 硬盘分区inode使用量
disk_inodes_total: 硬盘分区inode总量
diskio_io_time: 从设备视角来看I/O请求总时间队列中有I/O请求就计数单位毫秒counter类型需要用函数求rate才有使用价值
diskio_iops_in_progress: 已经分配给设备驱动且尚未完成的IO请求不包含在队列中但尚未分配给设备驱动的IO请求gauge类型
diskio_merged_reads: 相邻读请求merge读的次数counter类型
diskio_merged_writes: 相邻写请求merge写的次数counter类型
diskio_read_bytes: 读取的byte数量counter类型需要用函数求rate才有使用价值
diskio_read_time: 读请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_reads: 读请求次数counter类型需要用函数求rate才有使用价值
diskio_weighted_io_time: 从I/O请求视角来看I/O等待总时间如果同时有多个I/O请求时间会叠加单位毫秒
diskio_write_bytes: 写入的byte数量counter类型需要用函数求rate才有使用价值
diskio_write_time: 写请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_writes: 写请求次数counter类型需要用函数求rate才有使用价值
kernel_boot_time: 内核启动时间
kernel_context_switches: 内核上下文切换次数
kernel_entropy_avail: linux系统内部的熵池
kernel_interrupts: 内核中断次数
kernel_processes_forked: fork的进程数
mem_active: 活跃使用的内存总数(包括cache和buffer内存)
mem_available: 应用程序可用内存数
mem_available_percent: 内存剩余百分比(0~100)
mem_buffered: 用来给文件做缓冲大小
mem_cached: 被高速缓冲存储器cache memory用的内存的大小等于 diskcache minus SwapCache
mem_commit_limit: 根据超额分配比率('vm.overcommit_ratio'这是当前在系统上分配可用的内存总量这个限制只是在模式2('vm.overcommit_memory')时启用
mem_committed_as: 目前在系统上分配的内存量。是所有进程申请的内存的总和
mem_dirty: 等待被写回到磁盘的内存大小
mem_free: 空闲内存数
mem_high_free: 未被使用的高位内存大小
mem_high_total: 高位内存总大小Highmem是指所有内存高于860MB的物理内存,Highmem区域供用户程序使用或用于页面缓存。该区域不是直接映射到内核空间。内核必须使用不同的手法使用该段内存
mem_huge_page_size: 每个大页的大小
mem_huge_pages_free: 池中尚未分配的 HugePages 数量
mem_huge_pages_total: 预留HugePages的总个数
mem_inactive: 空闲的内存数(包括free和avalible的内存)
mem_low_free: 未被使用的低位大小
mem_low_total: 低位内存总大小,低位可以达到高位内存一样的作用,而且它还能够被内核用来记录一些自己的数据结构
mem_mapped: 设备和文件等映射的大小
mem_page_tables: 管理内存分页页面的索引表的大小
mem_shared: 多个进程共享的内存总额
mem_slab: 内核数据结构缓存的大小,可以减少申请和释放内存带来的消耗
mem_sreclaimable: 可收回Slab的大小
mem_sunreclaim: 不可收回Slab的大小SUnreclaim+SReclaimableSlab
mem_swap_cached: 被高速缓冲存储器cache memory用的交换空间的大小已经被交换出来的内存但仍然被存放在swapfile中。用来在需要的时候很快的被替换而不需要再次打开I/O端口
mem_swap_free: 未被使用交换空间的大小
mem_swap_total: 交换空间的总大小
mem_total: 内存总数
mem_used: 已用内存数
mem_used_percent: 已用内存数百分比(0~100)
mem_vmalloc_chunk: 最大的连续未被使用的vmalloc区域
mem_vmalloc_totalL: 可以vmalloc虚拟内存大小
mem_vmalloc_used: vmalloc已使用的虚拟内存大小
mem_write_back: 正在被写回到磁盘的内存大小
mem_write_back_tmp: FUSE用于临时写回缓冲区的内存
net_bytes_recv: 网卡收包总数(bytes)
net_bytes_sent: 网卡发包总数(bytes)
net_drop_in: 网卡收丢包数量
net_drop_out: 网卡发丢包数量
net_err_in: 网卡收包错误数量
net_err_out: 网卡发包错误数量
net_packets_recv: 网卡收包数量
net_packets_sent: 网卡发包数量
netstat_tcp_established: ESTABLISHED状态的网络链接数
netstat_tcp_fin_wait1: FIN_WAIT1状态的网络链接数
netstat_tcp_fin_wait2: FIN_WAIT2状态的网络链接数
netstat_tcp_last_ack: LAST_ACK状态的网络链接数
netstat_tcp_listen: LISTEN状态的网络链接数
netstat_tcp_syn_recv: SYN_RECV状态的网络链接数
netstat_tcp_syn_sent: SYN_SENT状态的网络链接数
netstat_tcp_time_wait: TIME_WAIT状态的网络链接数
netstat_udp_socket: UDP状态的网络链接数
processes_blocked: 不可中断的睡眠状态下的进程数('U','D','L')
processes_dead: 回收中的进程数('X')
processes_idle: 挂起的空闲进程数('I')
processes_paging: 分页进程数('P')
processes_running: 运行中的进程数('R')
processes_sleeping: 可中断进程数('S')
processes_stopped: 暂停状态进程数('T')
processes_total: 总进程数
processes_total_threads: 总线程数
processes_unknown: 未知状态进程数
processes_zombies: 僵尸态进程数('Z')
swap_used_percent: Swap空间换出数据量
system_load1: 1分钟平均load值
system_load5: 5分钟平均load值
system_load15: 15分钟平均load值
system_n_users: 用户数
system_n_cpus: CPU核数
system_uptime: 系统启动时间
nginx_accepts: 自nginx启动起,与客户端建立过得连接总数
nginx_active: 当前nginx正在处理的活动连接数,等于Reading/Writing/Waiting总和
nginx_handled: 自nginx启动起,处理过的客户端连接总数
nginx_reading: 正在读取HTTP请求头部的连接总数
nginx_requests: 自nginx启动起,处理过的客户端请求总数,由于存在HTTP Krrp-Alive请求,该值会大于handled值
nginx_upstream_check_fall: upstream_check模块检测到后端失败的次数
nginx_upstream_check_rise: upstream_check模块对后端的检测次数
nginx_upstream_check_status_code: 后端upstream的状态,up为1,down为0
nginx_waiting: 开启 keep-alive 的情况下,这个值等于 active (reading+writing), 意思就是 Nginx 已经处理完正在等候下一次请求指令的驻留连接
nginx_writing: 正在向客户端发送响应的连接总数
http_response_content_length: HTTP消息实体的传输长度
http_response_http_response_code: http响应状态码
http_response_response_time: http响应用时
http_response_result_code: url探测结果0为正常否则url无法访问
# [mysqld_exporter]
mysql_global_status_uptime: The number of seconds that the server has been up.(Gauge)
mysql_global_status_uptime_since_flush_status: The number of seconds since the most recent FLUSH STATUS statement.(Gauge)
mysql_global_status_queries: The number of statements executed by the server. This variable includes statements executed within stored programs, unlike the Questions variable. It does not count COM_PING or COM_STATISTICS commands.(Counter)
mysql_global_status_threads_connected: The number of currently open connections.(Counter)
mysql_global_status_connections: The number of connection attempts (successful or not) to the MySQL server.(Gauge)
mysql_global_status_max_used_connections: The maximum number of connections that have been in use simultaneously since the server started.(Gauge)
mysql_global_status_threads_running: The number of threads that are not sleeping.(Gauge)
mysql_global_status_questions: The number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. This variable does not count COM_PING, COM_STATISTICS, COM_STMT_PREPARE, COM_STMT_CLOSE, or COM_STMT_RESET commands.(Counter)
mysql_global_status_threads_cached: The number of threads in the thread cache.(Counter)
mysql_global_status_threads_created: The number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections.(Counter)
mysql_global_status_created_tmp_tables: The number of internal temporary tables created by the server while executing statements.(Counter)
mysql_global_status_created_tmp_disk_tables: The number of internal on-disk temporary tables created by the server while executing statements. You can compare the number of internal on-disk temporary tables created to the total number of internal temporary tables created by comparing Created_tmp_disk_tables and Created_tmp_tables values.(Counter)
mysql_global_status_created_tmp_files: How many temporary files mysqld has created.(Counter)
mysql_global_status_select_full_join: The number of joins that perform table scans because they do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.(Counter)
mysql_global_status_select_full_range_join: The number of joins that used a range search on a reference table.(Counter)
mysql_global_status_select_range: The number of joins that used ranges on the first table. This is normally not a critical issue even if the value is quite large.(Counter)
mysql_global_status_select_range_check: The number of joins without keys that check for key usage after each row. If this is not 0, you should carefully check the indexes of your tables.(Counter)
mysql_global_status_select_scan: The number of joins that did a full scan of the first table.(Counter)
mysql_global_status_sort_rows: The number of sorted rows.(Counter)
mysql_global_status_sort_range: The number of sorts that were done using ranges.(Counter)
mysql_global_status_sort_merge_passes: The number of merge passes that the sort algorithm has had to do. If this value is large, you should consider increasing the value of the sort_buffer_size system variable.(Counter)
mysql_global_status_sort_scan: The number of sorts that were done by scanning the table.(Counter)
mysql_global_status_slow_queries: The number of queries that have taken more than long_query_time seconds. This counter increments regardless of whether the slow query log is enabled.(Counter)
mysql_global_status_aborted_connects: The number of failed attempts to connect to the MySQL server.(Counter)
mysql_global_status_aborted_clients: The number of connections that were aborted because the client died without closing the connection properly.(Counter)
mysql_global_status_table_locks_immediate: The number of times that a request for a table lock could be granted immediately. Locks Immediate rising and falling is normal activity.(Counter)
mysql_global_status_table_locks_waited: The number of times that a request for a table lock could not be granted immediately and a wait was needed. If this is high and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.(Counter)
mysql_global_status_bytes_received: The number of bytes received from all clients.(Counter)
mysql_global_status_bytes_sent: The number of bytes sent to all clients.(Counter)
mysql_global_status_innodb_page_size: InnoDB page size (default 16KB). Many values are counted in pages; the page size enables them to be easily converted to bytes.(Gauge)
mysql_global_status_buffer_pool_pages: The number of pages in the InnoDB buffer pool.(Gauge)
mysql_global_status_commands_total: The number of times each xxx statement has been executed.(Counter)
mysql_global_status_handlers_total: Handler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes. This is in fact the layer between the Storage Engine and MySQL.(Counter)
mysql_global_status_opened_files: The number of files that have been opened with my_open() (a mysys library function). Parts of the server that open files without using this function do not increment the count.(Counter)
mysql_global_status_open_tables: The number of tables that are open.(Gauge)
mysql_global_status_opened_tables: The number of tables that have been opened. If Opened_tables is big, your table_open_cache value is probably too small.(Counter)
mysql_global_status_table_open_cache_hits: The number of hits for open tables cache lookups.(Counter)
mysql_global_status_table_open_cache_misses: The number of misses for open tables cache lookups.(Counter)
mysql_global_status_table_open_cache_overflows: The number of overflows for the open tables cache.(Counter)
mysql_global_status_innodb_num_open_files: The number of files InnoDB currently holds open.(Gauge)
mysql_global_status_connection_errors_total: These variables provide information about errors that occur during the client connection process.(Counter)
mysql_global_status_innodb_buffer_pool_read_requests: The number of logical read requests.(Counter)
mysql_global_status_innodb_buffer_pool_reads: The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from disk.(Counter)
mysql_global_variables_thread_cache_size: How many threads the server should cache for reuse.(Gauge)
mysql_global_variables_max_connections: The maximum permitted number of simultaneous client connections.(Gauge)
mysql_global_variables_innodb_buffer_pool_size: The size in bytes of the buffer pool, the memory area where InnoDB caches table and index data. The default value is 134217728 bytes (128MB).(Gauge)
mysql_global_variables_innodb_log_buffer_size: The size in bytes of the buffer that InnoDB uses to write to the log files on disk.(Gauge)
mysql_global_variables_key_buffer_size: Index blocks for MyISAM tables are buffered and are shared by all threads.(Gauge)
mysql_global_variables_query_cache_size: The amount of memory allocated for caching query results.(Gauge)
mysql_global_variables_table_open_cache: The number of open tables for all threads.(Gauge)
mysql_global_variables_open_files_limit: The number of file descriptors available to mysqld from the operating system.(Gauge)
# [redis_exporter]
redis_active_defrag_running: When activedefrag is enabled, this indicates whether defragmentation is currently active, and the CPU percentage it intends to utilize.
redis_allocator_active_bytes: Total bytes in the allocator active pages, this includes external-fragmentation.
redis_allocator_allocated_bytes: Total bytes allocated form the allocator, including internal-fragmentation. Normally the same as used_memory.
redis_allocator_frag_bytes: Delta between allocator_active and allocator_allocated. See note about mem_fragmentation_bytes.
redis_allocator_frag_ratio: Ratio between allocator_active and allocator_allocated. This is the true (external) fragmentation metric (not mem_fragmentation_ratio).
redis_allocator_resident_bytes: Total bytes resident (RSS) in the allocator, this includes pages that can be released to the OS (by MEMORY PURGE, or just waiting).
redis_allocator_rss_bytes: Delta between allocator_resident and allocator_active.
redis_allocator_rss_ratio: Ratio between allocator_resident and allocator_active. This usually indicates pages that the allocator can and probably will soon release back to the OS.
redis_aof_current_rewrite_duration_sec: Duration of the on-going AOF rewrite operation if any.
redis_aof_enabled: Flag indicating AOF logging is activated.
redis_aof_last_bgrewrite_status: Status of the last AOF rewrite operation.
redis_aof_last_cow_size_bytes: The size in bytes of copy-on-write memory during the last AOF rewrite operation.
redis_aof_last_rewrite_duration_sec: Duration of the last AOF rewrite operation in seconds.
redis_aof_last_write_status: Status of the last write operation to the AOF.
redis_aof_rewrite_in_progress: Flag indicating a AOF rewrite operation is on-going.
redis_aof_rewrite_scheduled: Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete.
redis_blocked_clients: Number of clients pending on a blocking call (BLPOP, BRPOP, BRPOPLPUSH, BLMOVE, BZPOPMIN, BZPOPMAX).
redis_client_recent_max_input_buffer_bytes: Biggest input buffer among current client connections.
redis_client_recent_max_output_buffer_bytes: Biggest output buffer among current client connections.
redis_cluster_enabled: Indicate Redis cluster is enabled.
redis_commands_duration_seconds_total: The total CPU time consumed by these commands.(Counter)
redis_commands_processed_total: Total number of commands processed by the server.(Counter)
redis_commands_total: The number of calls that reached command execution (not rejected).(Counter)
redis_config_maxclients: The value of the maxclients configuration directive. This is the upper limit for the sum of connected_clients, connected_slaves and cluster_connections.
redis_config_maxmemory: The value of the maxmemory configuration directive.
redis_connected_clients: Number of client connections (excluding connections from replicas).
redis_connected_slaves: Number of connected replicas.
redis_connections_received_total: Total number of connections accepted by the server.(Counter)
redis_cpu_sys_children_seconds_total: System CPU consumed by the background processes.(Counter)
redis_cpu_sys_seconds_total: System CPU consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads).(Counter)
redis_cpu_user_children_seconds_total: User CPU consumed by the background processes.(Counter)
redis_cpu_user_seconds_total: User CPU consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads).(Counter)
redis_db_keys: Total number of keys by DB.
redis_db_keys_expiring: Total number of expiring keys by DB
redis_defrag_hits: Number of value reallocations performed by active the defragmentation process.
redis_defrag_misses: Number of aborted value reallocations started by the active defragmentation process.
redis_defrag_key_hits: Number of keys that were actively defragmented.
redis_defrag_key_misses: Number of keys that were skipped by the active defragmentation process.
redis_evicted_keys_total: Number of evicted keys due to maxmemory limit.(Counter)
redis_expired_keys_total: Total number of key expiration events.(Counter)
redis_expired_stale_percentage: The percentage of keys probably expired.
redis_expired_time_cap_reached_total: The count of times that active expiry cycles have stopped early.
redis_exporter_last_scrape_connect_time_seconds: The duration(in seconds) to connect when scrape.
redis_exporter_last_scrape_duration_seconds: The last scrape duration.
redis_exporter_last_scrape_error: The last scrape error status.
redis_exporter_scrape_duration_seconds_count: Durations of scrapes by the exporter
redis_exporter_scrape_duration_seconds_sum: Durations of scrapes by the exporter
redis_exporter_scrapes_total: Current total redis scrapes.(Counter)
redis_instance_info: Information about the Redis instance.
redis_keyspace_hits_total: Hits total.(Counter)
redis_keyspace_misses_total: Misses total.(Counter)
redis_last_key_groups_scrape_duration_milliseconds: Duration of the last key group metrics scrape in milliseconds.
redis_last_slow_execution_duration_seconds: The amount of time needed for last slow execution, in seconds.
redis_latest_fork_seconds: The amount of time needed for last fork, in seconds.
redis_lazyfree_pending_objects: The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option).
redis_master_repl_offset: The server's current replication offset.
redis_mem_clients_normal: Memory used by normal clients.(Gauge)
redis_mem_clients_slaves: Memory used by replica clients - Starting Redis 7.0, replica buffers share memory with the replication backlog, so this field can show 0 when replicas don't trigger an increase of memory usage.
redis_mem_fragmentation_bytes: Delta between used_memory_rss and used_memory. Note that when the total fragmentation bytes is low (few megabytes), a high ratio (e.g. 1.5 and above) is not an indication of an issue.
redis_mem_fragmentation_ratio: Ratio between used_memory_rss and used_memory. Note that this doesn't only includes fragmentation, but also other process overheads (see the allocator_* metrics), and also overheads like code, shared libraries, stack, etc.
redis_mem_not_counted_for_eviction_bytes: (Gauge)
redis_memory_max_bytes: Max memory limit in bytes.
redis_memory_used_bytes: Total number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc)
redis_memory_used_dataset_bytes: The size in bytes of the dataset (used_memory_overhead subtracted from used_memory)
redis_memory_used_lua_bytes: Number of bytes used by the Lua engine.
redis_memory_used_overhead_bytes: The sum in bytes of all overheads that the server allocated for managing its internal data structures.
redis_memory_used_peak_bytes: Peak memory consumed by Redis (in bytes)
redis_memory_used_rss_bytes: Number of bytes that Redis allocated as seen by the operating system (a.k.a resident set size). This is the number reported by tools such as top(1) and ps(1)
redis_memory_used_scripts_bytes: Number of bytes used by cached Lua scripts
redis_memory_used_startup_bytes: Initial amount of memory consumed by Redis at startup in bytes
redis_migrate_cached_sockets_total: The number of sockets open for MIGRATE purposes
redis_net_input_bytes_total: Total input bytes(Counter)
redis_net_output_bytes_total: Total output bytes(Counter)
redis_process_id: Process ID
redis_pubsub_channels: Global number of pub/sub channels with client subscriptions
redis_pubsub_patterns: Global number of pub/sub pattern with client subscriptions
redis_rdb_bgsave_in_progress: Flag indicating a RDB save is on-going
redis_rdb_changes_since_last_save: Number of changes since the last dump
redis_rdb_current_bgsave_duration_sec: Duration of the on-going RDB save operation if any
redis_rdb_last_bgsave_duration_sec: Duration of the last RDB save operation in seconds
redis_rdb_last_bgsave_status: Status of the last RDB save operation
redis_rdb_last_cow_size_bytes: The size in bytes of copy-on-write memory during the last RDB save operation
redis_rdb_last_save_timestamp_seconds: Epoch-based timestamp of last successful RDB save
redis_rejected_connections_total: Number of connections rejected because of maxclients limit(Counter)
redis_repl_backlog_first_byte_offset: The master offset of the replication backlog buffer
redis_repl_backlog_history_bytes: Size in bytes of the data in the replication backlog buffer
redis_repl_backlog_is_active: Flag indicating replication backlog is active
redis_replica_partial_resync_accepted: The number of accepted partial resync requests(Gauge)
redis_replica_partial_resync_denied: The number of denied partial resync requests(Gauge)
redis_replica_resyncs_full: The number of full resyncs with replicas
redis_replication_backlog_bytes: Memory used by replication backlog
redis_second_repl_offset: The offset up to which replication IDs are accepted.
redis_slave_expires_tracked_keys: The number of keys tracked for expiry purposes (applicable only to writable replicas)(Gauge)
redis_slowlog_last_id: Last id of slowlog
redis_slowlog_length: Total slowlog
redis_start_time_seconds: Start time of the Redis instance since unix epoch in seconds.
redis_target_scrape_request_errors_total: Errors in requests to the exporter
redis_up: Flag indicating redis instance is up
redis_uptime_in_seconds: Number of seconds since Redis server start
# [windows_exporter]
windows_cpu_clock_interrupts_total: Total number of received and serviced clock tick interrupts(counter)
windows_cpu_core_frequency_mhz: Core frequency in megahertz(gauge)
windows_cpu_cstate_seconds_total: Time spent in low-power idle state(counter)
windows_cpu_dpcs_total: Total number of received and serviced deferred procedure calls (DPCs)(counter)
windows_cpu_idle_break_events_total: Total number of time processor was woken from idle(counter)
windows_cpu_interrupts_total: Total number of received and serviced hardware interrupts(counter)
windows_cpu_parking_status: Parking Status represents whether a processor is parked or not(gauge)
windows_cpu_processor_performance: Processor Performance is the average performance of the processor while it is executing instructions, as a percentage of the nominal performance of the processor. On some processors, Processor Performance may exceed 100%(gauge)
windows_cpu_time_total: Time that processor spent in different modes (idle, user, system, ...)(counter)
windows_cs_hostname: Labeled system hostname information as provided by ComputerSystem.DNSHostName and ComputerSystem.Domain(gauge)
windows_cs_logical_processors: ComputerSystem.NumberOfLogicalProcessors(gauge)
windows_cs_physical_memory_bytes: ComputerSystem.TotalPhysicalMemory(gauge)
windows_exporter_build_info: A metric with a constant '1' value labeled by version, revision, branch, and goversion from which windows_exporter was built.(gauge)
windows_exporter_collector_duration_seconds: Duration of a collection.(gauge)
windows_exporter_collector_success: Whether the collector was successful.(gauge)
windows_exporter_collector_timeout: Whether the collector timed out.(gauge)
windows_exporter_perflib_snapshot_duration_seconds: Duration of perflib snapshot capture(gauge)
windows_logical_disk_free_bytes: Free space in bytes (LogicalDisk.PercentFreeSpace)(gauge)
windows_logical_disk_idle_seconds_total: Seconds that the disk was idle (LogicalDisk.PercentIdleTime)(counter)
windows_logical_disk_read_bytes_total: The number of bytes transferred from the disk during read operations (LogicalDisk.DiskReadBytesPerSec)(counter)
windows_logical_disk_read_latency_seconds_total: Shows the average time, in seconds, of a read operation from the disk (LogicalDisk.AvgDiskSecPerRead)(counter)
windows_logical_disk_read_seconds_total: Seconds that the disk was busy servicing read requests (LogicalDisk.PercentDiskReadTime)(counter)
windows_logical_disk_read_write_latency_seconds_total: Shows the time, in seconds, of the average disk transfer (LogicalDisk.AvgDiskSecPerTransfer)(counter)
windows_logical_disk_reads_total: The number of read operations on the disk (LogicalDisk.DiskReadsPerSec)(counter)
windows_logical_disk_requests_queued: The number of requests queued to the disk (LogicalDisk.CurrentDiskQueueLength)(gauge)
windows_logical_disk_size_bytes: Total space in bytes (LogicalDisk.PercentFreeSpace_Base)(gauge)
windows_logical_disk_split_ios_total: The number of I/Os to the disk were split into multiple I/Os (LogicalDisk.SplitIOPerSec)(counter)
windows_logical_disk_write_bytes_total: The number of bytes transferred to the disk during write operations (LogicalDisk.DiskWriteBytesPerSec)(counter)
windows_logical_disk_write_latency_seconds_total: Shows the average time, in seconds, of a write operation to the disk (LogicalDisk.AvgDiskSecPerWrite)(counter)
windows_logical_disk_write_seconds_total: Seconds that the disk was busy servicing write requests (LogicalDisk.PercentDiskWriteTime)(counter)
windows_logical_disk_writes_total: The number of write operations on the disk (LogicalDisk.DiskWritesPerSec)(counter)
windows_net_bytes_received_total: (Network.BytesReceivedPerSec)(counter)
windows_net_bytes_sent_total: (Network.BytesSentPerSec)(counter)
windows_net_bytes_total: (Network.BytesTotalPerSec)(counter)
windows_net_current_bandwidth: (Network.CurrentBandwidth)(gauge)
windows_net_packets_outbound_discarded_total: (Network.PacketsOutboundDiscarded)(counter)
windows_net_packets_outbound_errors_total: (Network.PacketsOutboundErrors)(counter)
windows_net_packets_received_discarded_total: (Network.PacketsReceivedDiscarded)(counter)
windows_net_packets_received_errors_total: (Network.PacketsReceivedErrors)(counter)
windows_net_packets_received_total: (Network.PacketsReceivedPerSec)(counter)
windows_net_packets_received_unknown_total: (Network.PacketsReceivedUnknown)(counter)
windows_net_packets_sent_total: (Network.PacketsSentPerSec)(counter)
windows_net_packets_total: (Network.PacketsPerSec)(counter)
windows_os_info: OperatingSystem.Caption, OperatingSystem.Version(gauge)
windows_os_paging_free_bytes: OperatingSystem.FreeSpaceInPagingFiles(gauge)
windows_os_paging_limit_bytes: OperatingSystem.SizeStoredInPagingFiles(gauge)
windows_os_physical_memory_free_bytes: OperatingSystem.FreePhysicalMemory(gauge)
windows_os_process_memory_limix_bytes: OperatingSystem.MaxProcessMemorySize(gauge)
windows_os_processes: OperatingSystem.NumberOfProcesses(gauge)
windows_os_processes_limit: OperatingSystem.MaxNumberOfProcesses(gauge)
windows_os_time: OperatingSystem.LocalDateTime(gauge)
windows_os_timezone: OperatingSystem.LocalDateTime(gauge)
windows_os_users: OperatingSystem.NumberOfUsers(gauge)
windows_os_virtual_memory_bytes: OperatingSystem.TotalVirtualMemorySize(gauge)
windows_os_virtual_memory_free_bytes: OperatingSystem.FreeVirtualMemory(gauge)
windows_os_visible_memory_bytes: OperatingSystem.TotalVisibleMemorySize(gauge)
windows_service_info: A metric with a constant '1' value labeled with service information(gauge)
windows_service_start_mode: The start mode of the service (StartMode)(gauge)
windows_service_state: The state of the service (State)(gauge)
windows_service_status: The status of the service (Status)(gauge)
windows_system_context_switches_total: Total number of context switches (WMI source is PerfOS_System.ContextSwitchesPersec)(counter)
windows_system_exception_dispatches_total: Total number of exceptions dispatched (WMI source is PerfOS_System.ExceptionDispatchesPersec)(counter)
windows_system_processor_queue_length: Length of processor queue (WMI source is PerfOS_System.ProcessorQueueLength)(gauge)
windows_system_system_calls_total: Total number of system calls (WMI source is PerfOS_System.SystemCallsPersec)(counter)
windows_system_system_up_time: System boot time (WMI source is PerfOS_System.SystemUpTime)(gauge)
windows_system_threads: Current number of threads (WMI source is PerfOS_System.Threads)(gauge)
# [node_exporter]
# SYSTEM
# CPU context switch 次数
node_context_switches_total: context_switches
# Interrupts 次数
node_intr_total: Interrupts
# 运行的进程数
node_procs_running: Processes in runnable state
# 熵池大小
node_entropy_available_bits: Entropy available to random number generators
node_time_seconds: System time in seconds since epoch (1970)
node_boot_time_seconds: Node boot time, in unixtime
# CPU
node_cpu_seconds_total: Seconds the CPUs spent in each mode
node_load1: cpu load 1m
node_load5: cpu load 5m
node_load15: cpu load 15m
# MEM
# 内核态
# 用户追踪已从交换区获取但尚未修改的页面的内存
node_memory_SwapCached_bytes: Memory that keeps track of pages that have been fetched from swap but not yet been modified
# 内核用于缓存数据结构供自己使用的内存
node_memory_Slab_bytes: Memory used by the kernel to cache data structures for its own use
# slab中可回收的部分
node_memory_SReclaimable_bytes: SReclaimable - Part of Slab, that might be reclaimed, such as caches
# slab中不可回收的部分
node_memory_SUnreclaim_bytes: Part of Slab, that cannot be reclaimed on memory pressure
# Vmalloc内存区的大小
node_memory_VmallocTotal_bytes: Total size of vmalloc memory area
# vmalloc已分配的内存虚拟地址空间上的连续的内存
node_memory_VmallocUsed_bytes: Amount of vmalloc area which is used
# vmalloc区可用的连续最大快的大小通过此指标可以知道vmalloc可分配连续内存的最大值
node_memory_VmallocChunk_bytes: Largest contigious block of vmalloc area which is free
# 内存的硬件故障删除掉的内存页的总大小
node_memory_HardwareCorrupted_bytes: Amount of RAM that the kernel identified as corrupted / not working
# 用于在虚拟和物理内存地址之间映射的内存
node_memory_PageTables_bytes: Memory used to map between virtual and physical memory addresses (gauge)
# 内核栈内存,常驻内存,不可回收
node_memory_KernelStack_bytes: Kernel memory stack. This is not reclaimable
# 用来访问高端内存复制高端内存的临时buffer称为“bounce buffering”会降低I/O 性能
node_memory_Bounce_bytes: Memory used for block device bounce buffers
#用户态
# 单个巨页大小
node_memory_Hugepagesize_bytes: Huge Page size
# 系统分配的常驻巨页数
node_memory_HugePages_Total: Total size of the pool of huge pages
# 系统空闲的巨页数
node_memory_HugePages_Free: Huge pages in the pool that are not yet allocated
# 进程已申请但未使用的巨页数
node_memory_HugePages_Rsvd: Huge pages for which a commitment to allocate from the pool has been made, but no allocation
# 超过系统设定的常驻HugePages数量的个数
node_memory_HugePages_Surp: Huge pages in the pool above the value in /proc/sys/vm/nr_hugepages
# 透明巨页 Transparent HugePages (THP)
node_memory_AnonHugePages_bytes: Memory in anonymous huge pages
# inactivelist中的File-backed内存
node_memory_Inactive_file_bytes: File-backed memory on inactive LRU list
# inactivelist中的Anonymous内存
node_memory_Inactive_anon_bytes: Anonymous and swap cache on inactive LRU list, including tmpfs (shmem)
# activelist中的File-backed内存
node_memory_Active_file_bytes: File-backed memory on active LRU list
# activelist中的Anonymous内存
node_memory_Active_anon_bytes: Anonymous and swap cache on active least-recently-used (LRU) list, including tmpfs
# 禁止换出的页,对应 Unevictable 链表
node_memory_Unevictable_bytes: Amount of unevictable memory that can't be swapped out for a variety of reasons
# 共享内存
node_memory_Shmem_bytes: Used shared memory (shared between several processes, thus including RAM disks)
# 匿名页内存大小
node_memory_AnonPages_bytes: Memory in user pages not backed by files
# 被关联的内存页大小
node_memory_Mapped_bytes: Used memory in mapped pages files which have been mmaped, such as libraries
# file-backed内存页缓存大小
node_memory_Cached_bytes: Parked file data (file content) cache
# 系统中有多少匿名页曾经被swap-out、现在又被swap-in并且swap-in之后页面中的内容一直没发生变化
node_memory_SwapCached_bytes: Memory that keeps track of pages that have been fetched from swap but not yet been modified
# 被mlock()系统调用锁定的内存大小
node_memory_Mlocked_bytes: Size of pages locked to memory using the mlock() system call
# 块设备(block device)所占用的缓存页
node_memory_Buffers_bytes: Block device (e.g. harddisk) cache
node_memory_SwapTotal_bytes: Memory information field SwapTotal_bytes
node_memory_SwapFree_bytes: Memory information field SwapFree_bytes
# DISK
node_filesystem_files_free: Filesystem space available to non-root users in byte
node_filesystem_free_bytes: Filesystem free space in bytes
node_filesystem_size_bytes: Filesystem size in bytes
node_filesystem_files_free: Filesystem total free file nodes
node_filesystem_files: Filesystem total free file nodes
node_filefd_maximum: Max open files
node_filefd_allocated: Open files
node_filesystem_readonly: Filesystem read-only status
node_filesystem_device_error: Whether an error occurred while getting statistics for the given device
node_disk_reads_completed_total: The total number of reads completed successfully
node_disk_writes_completed_total: The total number of writes completed successfully
node_disk_reads_merged_total: The number of reads merged
node_disk_writes_merged_total: The number of writes merged
node_disk_read_bytes_total: The total number of bytes read successfully
node_disk_written_bytes_total: The total number of bytes written successfully
node_disk_io_time_seconds_total: Total seconds spent doing I/Os
node_disk_read_time_seconds_total: The total number of seconds spent by all reads
node_disk_write_time_seconds_total: The total number of seconds spent by all writes
node_disk_io_time_weighted_seconds_total: The weighted of seconds spent doing I/Os
# NET
node_network_receive_bytes_total: Network device statistic receive_bytes (counter)
node_network_transmit_bytes_total: Network device statistic transmit_bytes (counter)
node_network_receive_packets_total: Network device statistic receive_bytes
node_network_transmit_packets_total: Network device statistic transmit_bytes
node_network_receive_errs_total: Network device statistic receive_errs
node_network_transmit_errs_total: Network device statistic transmit_errs
node_network_receive_drop_total: Network device statistic receive_drop
node_network_transmit_drop_total: Network device statistic transmit_drop
node_nf_conntrack_entries: Number of currently allocated flow entries for connection tracking
node_sockstat_TCP_alloc: Number of TCP sockets in state alloc
node_sockstat_TCP_inuse: Number of TCP sockets in state inuse
node_sockstat_TCP_orphan: Number of TCP sockets in state orphan
node_sockstat_TCP_tw: Number of TCP sockets in state tw
node_netstat_Tcp_CurrEstab: Statistic TcpCurrEstab
node_sockstat_sockets_used: Number of IPv4 sockets in use
# [kafka_exporter]
kafka_brokers: count of kafka_brokers (gauge)
kafka_topic_partitions: Number of partitions for this Topic (gauge)
kafka_topic_partition_current_offset: Current Offset of a Broker at Topic/Partition (gauge)
kafka_consumergroup_current_offset: Current Offset of a ConsumerGroup at Topic/Partition (gauge)
kafka_consumer_lag_millis: Current approximation of consumer lag for a ConsumerGroup at Topic/Partition (gauge)
kafka_topic_partition_under_replicated_partition: 1 if Topic/Partition is under Replicated
# [zookeeper_exporter]
zk_znode_count: The total count of znodes stored
zk_ephemerals_count: The number of Ephemerals nodes
zk_watch_count: The number of watchers setup over Zookeeper nodes.
zk_approximate_data_size: Size of data in bytes that a zookeeper server has in its data tree
zk_outstanding_requests: Number of currently executing requests
zk_packets_sent: Count of the number of zookeeper packets sent from a server
zk_packets_received: Count of the number of zookeeper packets received by a server
zk_num_alive_connections: Number of active clients connected to a zookeeper server
zk_open_file_descriptor_count: Number of file descriptors that a zookeeper server has open
zk_max_file_descriptor_count: Maximum number of file descriptors that a zookeeper server can open
zk_avg_latency: Average time in milliseconds for requests to be processed
zk_min_latency: Maximum time in milliseconds for a request to be processed
zk_max_latency: Minimum time in milliseconds for a request to be processed

View File

@@ -0,0 +1,216 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import sys
import json
import urllib2
import smtplib
from email.mime.text import MIMEText
reload(sys)
sys.setdefaultencoding('utf8')
notify_channel_funcs = {
"email":"email",
"sms":"sms",
"voice":"voice",
"dingtalk":"dingtalk",
"wecom":"wecom",
"feishu":"feishu"
}
mail_host = "smtp.163.com"
mail_port = 994
mail_user = "ulricqin"
mail_pass = "password"
mail_from = "ulricqin@163.com"
class Sender(object):
@classmethod
def send_email(cls, payload):
if mail_user == "ulricqin" and mail_pass == "password":
print("invalid smtp configuration")
return
users = payload.get('event').get("notify_users_obj")
emails = {}
for u in users:
if u.get("email"):
emails[u.get("email")] = 1
if not emails:
return
recipients = emails.keys()
mail_body = payload.get('tpls').get("mailbody.tpl", "mailbody.tpl not found")
message = MIMEText(mail_body, 'html', 'utf-8')
message['From'] = mail_from
message['To'] = ", ".join(recipients)
message["Subject"] = payload.get('tpls').get("subject.tpl", "subject.tpl not found")
try:
smtp = smtplib.SMTP_SSL(mail_host, mail_port)
smtp.login(mail_user, mail_pass)
smtp.sendmail(mail_from, recipients, message.as_string())
smtp.close()
except smtplib.SMTPException, error:
print(error)
@classmethod
def send_wecom(cls, payload):
users = payload.get('event').get("notify_users_obj")
tokens = {}
for u in users:
contacts = u.get("contacts")
if contacts.get("wecom_robot_token", ""):
tokens[contacts.get("wecom_robot_token", "")] = 1
opener = urllib2.build_opener(urllib2.HTTPHandler())
method = "POST"
for t in tokens:
url = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key={}".format(t)
body = {
"msgtype": "markdown",
"markdown": {
"content": payload.get('tpls').get("wecom.tpl", "wecom.tpl not found")
}
}
request = urllib2.Request(url, data=json.dumps(body))
request.add_header("Content-Type",'application/json;charset=utf-8')
request.get_method = lambda: method
try:
connection = opener.open(request)
print(connection.read())
except urllib2.HTTPError, error:
print(error)
@classmethod
def send_dingtalk(cls, payload):
event = payload.get('event')
users = event.get("notify_users_obj")
rule_name = event.get("rule_name")
event_state = "Triggered"
if event.get("is_recovered"):
event_state = "Recovered"
tokens = {}
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
contacts = u.get("contacts")
if contacts.get("dingtalk_robot_token", ""):
tokens[contacts.get("dingtalk_robot_token", "")] = 1
opener = urllib2.build_opener(urllib2.HTTPHandler())
method = "POST"
for t in tokens:
url = "https://oapi.dingtalk.com/robot/send?access_token={}".format(t)
body = {
"msgtype": "markdown",
"markdown": {
"title": "{} - {}".format(event_state, rule_name),
"text": payload.get('tpls').get("dingtalk.tpl", "dingtalk.tpl not found") + ' '.join(["@"+i for i in phones.keys()])
},
"at": {
"atMobiles": phones.keys(),
"isAtAll": False
}
}
request = urllib2.Request(url, data=json.dumps(body))
request.add_header("Content-Type",'application/json;charset=utf-8')
request.get_method = lambda: method
try:
connection = opener.open(request)
print(connection.read())
except urllib2.HTTPError, error:
print(error)
@classmethod
def send_feishu(cls, payload):
users = payload.get('event').get("notify_users_obj")
tokens = {}
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
contacts = u.get("contacts")
if contacts.get("feishu_robot_token", ""):
tokens[contacts.get("feishu_robot_token", "")] = 1
opener = urllib2.build_opener(urllib2.HTTPHandler())
method = "POST"
for t in tokens:
url = "https://open.feishu.cn/open-apis/bot/v2/hook/{}".format(t)
body = {
"msg_type": "text",
"content": {
"text": payload.get('tpls').get("feishu.tpl", "feishu.tpl not found")
},
"at": {
"atMobiles": phones.keys(),
"isAtAll": False
}
}
request = urllib2.Request(url, data=json.dumps(body))
request.add_header("Content-Type",'application/json;charset=utf-8')
request.get_method = lambda: method
try:
connection = opener.open(request)
print(connection.read())
except urllib2.HTTPError, error:
print(error)
@classmethod
def send_sms(cls, payload):
users = payload.get('event').get("notify_users_obj")
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
if phones:
print("send_sms not implemented, phones: {}".format(phones.keys()))
@classmethod
def send_voice(cls, payload):
users = payload.get('event').get("notify_users_obj")
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
if phones:
print("send_voice not implemented, phones: {}".format(phones.keys()))
def main():
payload = json.load(sys.stdin)
with open(".payload", 'w') as f:
f.write(json.dumps(payload, indent=4))
for ch in payload.get('event').get('notify_channels'):
send_func_name = "send_{}".format(notify_channel_funcs.get(ch.strip()))
if not hasattr(Sender, send_func_name):
print("function: {} not found", send_func_name)
continue
send_func = getattr(Sender, send_func_name)
send_func(payload)
def hello():
print("hello nightingale")
if __name__ == "__main__":
if len(sys.argv) == 1:
main()
elif sys.argv[1] == "hello":
hello()
else:
print("I am confused")

68
docker/n9eetc/script/notify.py Executable file
View File

@@ -0,0 +1,68 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import sys
import json
class Sender(object):
@classmethod
def send_email(cls, payload):
# already done in go code
pass
@classmethod
def send_wecom(cls, payload):
# already done in go code
pass
@classmethod
def send_dingtalk(cls, payload):
# already done in go code
pass
@classmethod
def send_feishu(cls, payload):
# already done in go code
pass
@classmethod
def send_sms(cls, payload):
users = payload.get('event').get("notify_users_obj")
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
if phones:
print("send_sms not implemented, phones: {}".format(phones.keys()))
@classmethod
def send_voice(cls, payload):
users = payload.get('event').get("notify_users_obj")
phones = {}
for u in users:
if u.get("phone"):
phones[u.get("phone")] = 1
if phones:
print("send_voice not implemented, phones: {}".format(phones.keys()))
def main():
payload = json.load(sys.stdin)
with open(".payload", 'w') as f:
f.write(json.dumps(payload, indent=4))
for ch in payload.get('event').get('notify_channels'):
send_func_name = "send_{}".format(ch.strip())
if not hasattr(Sender, send_func_name):
print("function: {} not found", send_func_name)
continue
send_func = getattr(Sender, send_func_name)
send_func(payload)
def hello():
print("hello nightingale")
if __name__ == "__main__":
if len(sys.argv) == 1:
main()
elif sys.argv[1] == "hello":
hello()
else:
print("I am confused")

View File

@@ -0,0 +1,9 @@
.phony: all
all: plugin
.phony: plugin
plugin:
export GOPROXY=http://goproxy.cn,direct
go build -buildmode=plugin -o notify.so notify.go

View File

@@ -0,0 +1,35 @@
通过go plugin模式处理告警通知
---
相比于调用py脚本方式该方式一般无需考虑依赖问题
### (1) 编写动态链接库逻辑
```go
package main
type inter interface {
Descript() string
Notify([]byte)
}
// 0、Descript 可用于该插件在 server 中的描述
// 1、在 Notify 方法中实现要处理的自定义逻辑
```
实现以上接口的 `struct` 实例即为合法 `plugin`
### (2) 构建链接库
参考 `notify.go` 实现方式,执行 `make` 后可以看到生成一个 `notify.so` 链接文件,放到 n9e 对应项目位置即可
### (3) 更新 n9e 配置
```text
[Alerting.CallPlugin]
Enable = false
PluginPath = "./etc/script/notify.so"
# 注意此处caller必须在notify.so中作为变量暴露,首字母必须大写才能暴露
Caller = "N9eCaller"
```

Some files were not shown because too many files have changed in this diff Show More