Compare commits

...

347 Commits

Author SHA1 Message Date
ning
8f8f24ccfe refactor: change sql-template api 2023-11-30 16:57:42 +08:00
ning
0f2257b8bb Merge branch 'main' of github.com:ccfos/nightingale 2023-11-30 12:06:59 +08:00
ning
8bd99f13c1 docs: change i18n 2023-11-30 12:06:41 +08:00
shardingHe
f8deb89592 feat: add gorm logger (#1768)
* add gorm logger

* add gorm logger implement, slow log save to warning-level file

* optimize gorm logger

* optimize code

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-11-30 12:02:23 +08:00
Yening Qin
701407581b feat: support multi resources get (#1780)
* get alert rules by gids

* add perm check

* code refactor

* add tasks api

* code refactor

* add site-info api

* update targets api

* add /site-settings
2023-11-28 15:37:40 +08:00
ning
ba2ee05bc0 docs: rename clone board name 2023-11-23 20:38:12 +08:00
ning
c6e649129e refactor: role ops api 2023-11-22 14:49:10 +08:00
Yening Qin
329249ea99 refactor: ops i18n (#1778)
* ops i18n

* update gomod

* code refactor
2023-11-22 13:57:30 +08:00
Yening Qin
65d8a30396 refactor: datasource api (#1777)
* ds refactor
2023-11-20 23:28:15 +08:00
ning
e29a45c4a3 refactor: update role_operation 2023-11-17 17:13:44 +08:00
Yening Qin
438078cdc5 target add agent version (#1776)
* add agent-version
2023-11-17 16:57:56 +08:00
ning
ae07ba7523 docs: update default cas config 2023-11-15 12:14:06 +08:00
ning
f201b12dd8 Merge branch 'main' of github.com:ccfos/nightingale 2023-11-15 11:58:36 +08:00
ning
ee5322f406 change alert_rule callbacks length 2023-11-15 11:58:21 +08:00
Yening Qin
60a2e0c963 feat: writer support mult addrs (#1775)
* support mult write
2023-11-15 10:56:13 +08:00
Ulric Qin
2b55ed9b46 code refactor 2023-11-15 10:33:38 +08:00
ning
68eb7cb57e docs: update migrate sql 2023-11-13 17:54:33 +08:00
ning
6387b601b1 docs: add auto migrate sql 2023-11-13 17:52:10 +08:00
ning
af58fa8802 code refactor 2023-11-13 17:40:31 +08:00
ning
80daea5744 Merge branch 'main' of github.com:ccfos/nightingale 2023-11-13 17:33:38 +08:00
ning
bf9a471484 docs: add auto migrate sql 2023-11-13 17:31:56 +08:00
shardingHe
195ed9761c docs: built-in dashboards remove defaultValue for datasource (#1773)
Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-11-13 14:31:13 +08:00
Ulric Qin
fdc0123681 update logo 2023-11-11 22:21:15 +08:00
ning
6fd75ae552 Merge branch 'main' of github.com:ccfos/nightingale 2023-11-10 14:42:31 +08:00
ning
10c462a477 fix: insert ibex-settings perm point 2023-11-10 14:42:18 +08:00
Ulric Qin
694c43292a add api: /api/n9e/user/busi-groups 2023-11-09 20:07:30 +08:00
ning
cfa78dc9e2 feat: alert_subscribe support note and disabled 2023-11-09 00:15:36 +08:00
ning
cc80f5b685 fix: add ibex-settings perm point 2023-11-08 19:08:58 +08:00
ning
58f4a11669 refactor: datasource update 2023-11-08 18:17:22 +08:00
Yening Qin
4f57624a67 refactor: timeteries handle hook (#1772)
* refactor:  timeteries handle hook
2023-11-08 15:11:39 +08:00
Yening Qin
9558520dcd feat: host filter support * (#1769) 2023-11-07 11:19:18 +08:00
dependabot[bot]
8ded3623a4 build(deps): bump golang.org/x/image from 0.5.0 to 0.10.0 (#1767)
Bumps [golang.org/x/image](https://github.com/golang/image) from 0.5.0 to 0.10.0.
- [Commits](https://github.com/golang/image/compare/v0.5.0...v0.10.0)

---
updated-dependencies:
- dependency-name: golang.org/x/image
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-07 11:12:54 +08:00
ning
12fcca2faf docs: change integrations 2023-11-03 15:32:27 +08:00
xtan
9dc20fc674 fix: fix windows dashboard (#1762) 2023-10-27 19:32:23 +08:00
Lars Lehtonen
16430550d1 models: fix clobbered error (#1764) 2023-10-27 19:32:07 +08:00
ning
b34b66785d change notify api perm 2023-10-27 19:22:13 +08:00
ning
2cf38b6027 docs: update built-in linux dashboard 2023-10-27 19:08:46 +08:00
ning
e1b4edaa68 refactor: user-variable-configs api 2023-10-27 18:58:22 +08:00
ning
97f3f70d57 refactor perm api 2023-10-27 18:13:04 +08:00
ning
cee0ce6620 code refactor 2023-10-27 17:40:11 +08:00
ning
89b659695f add ops 2023-10-27 17:35:00 +08:00
Yening Qin
d52848ab1b feat: http support print body log (#1757)
* add log
2023-10-24 11:37:34 +08:00
shardingHe
314a8d71ef Fix:use text template to replace data (#1756)
* add text/template func, avoid escape issues caused by special characters

* rename the function of template

* change use of template. ReplaceTemplateUseHtml replaced to ReplaceTemplateUseText

* change use of template. ReplaceTemplateUseHtml replaced to ReplaceTemplateUseText

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-23 17:22:34 +08:00
shardingHe
bfa85cd8f1 fix: add func for tplx, avoid escape issues caused by special characters (#1755)
* add text/template func, avoid escape issues caused by special characters

* rename the function of template

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-23 16:35:08 +08:00
shardingHe
2254cb1f87 fix: move dashboards to the right place (#1753)
* move dashboards to the right place

* move dashboards to the right place

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-21 11:49:48 +08:00
ning
17cbfb8453 change alert_rule extra_config type 2023-10-20 11:26:30 +08:00
ning
6b89b7b4a5 change configs cval type 2023-10-20 10:20:08 +08:00
shardingHe
4fe5828d8d feat: macro variable in Configs (#1725)
* configs form user crud

* configs form user crud

* code refactor

* code refactor

* configs for user variable update & add InitRSAConfig for alert.go

* configs for user variable update

* remove InitRSAConfig for alert.go

* add config of Center.Encryption

* migrate for config and set default value

* add annotation for Center.Encryption

* code refactor

* code refactor

* code refactor again

* code refactor

* remove userVariableCheck bool return

* remove userVariableCheck bool return

* optimize InitRSAConfig

* optimize InitRSAConfig

* optimize InitRSAConfig

* macro variable

* optimize config

* remove test function user-variable-ras

* code refactor config_cache

* code refactor again

* code refactor again

* ReplaceMacroVariables return string value & optimize code

* add user variable check on ckey, it is not recommended to use the period "."

* change configs ckey+external=uniqueness

* configs add add external value

* configs remove isInternal check(ckey is no longer unique field)

* update userVariableCheck

* migrate drop unique filed limit

* refactor rsa generate logic. read key pairs from config.toml(HTTP.RSA) if file is not exist generate rsa key paris and saving to database(configs).

* Improve the RSA key generation logic. In the first step, attempt to read key pairs from the database or the 'config.toml' file. If they don't exist, generate a new set of key pairs and store them in the database.

* optimize code

* ckey use C-style naming convention and optimize smtp validations

* change email logger. print host & port

* update init rsa config logically

* update migrate about DropUniqueFiled

* update migrate about DropUniqueFiled

* make migrating at the first step

* move config of rsa. generate in db.

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-10-19 15:02:59 +08:00
Yening Qin
98422d696e refactor: heartbeat api (#1747) 2023-10-17 17:19:37 +08:00
shardingHe
e3103faeae refactor: Integrations sync from categraf (#1739)
* sync fist step

* integrations sync 14

* integrations sync 20

* integrations sync

* integrations sync 1 & add aliyun alerts

* update v2 dashboard to v3, and move dashboard to cloudwatch, cAdvisor, nginx_upstream_check. make sure all dashboards's name is unique.

* update readme

* delete nsq readme file

* rename switch_legacy,delete sockstat

* Rename the directory to match the lowercase plugin name

* add AMD_ROCm_SMI

* add ipvs

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-17 11:05:52 +08:00
Ulric Qin
0b23ddffb2 ignore rsa 2023-10-16 16:23:50 +08:00
Ulric Qin
37fa12e214 remove rust wait tool 2023-10-16 13:29:14 +08:00
Ulric Qin
328f8ac125 Merge branch 'main' of github.com:ccfos/nightingale 2023-10-16 12:25:21 +08:00
Ulric Qin
744749d22b remove rust wait tool 2023-10-16 12:25:09 +08:00
Yening Qin
a56fd039b4 fix eval query err (#1743) 2023-10-16 12:18:27 +08:00
Ulric Qin
e16867b72a code refactor 2023-10-16 12:12:09 +08:00
Ulric Qin
ff6756447b refactor docker compose configurations 2023-10-15 17:05:35 +08:00
Ulric Qin
546980a906 add tcpx.WaitHosts 2023-10-15 14:26:54 +08:00
ning
f93e2ad4b6 fix: push sample 2023-10-13 11:54:31 +08:00
ning
68732d6b31 refactor: rsa generate file 2023-10-12 20:33:42 +08:00
shardingHe
c3b8146e7f fix: rsa decrypt & generate method (#1740)
* Maintain consistency between encryption and decryption methods

* Maintain consistency between encryption and decryption methods

* Maintain consistency between encryption and decryption methods

* optimize code

* optimize code

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-12 19:34:55 +08:00
ning
271b7ca8a5 refactor get rule type 2023-10-12 19:33:05 +08:00
ning
2fa87cc428 fix: tdengine alert inhibit 2023-10-12 16:44:39 +08:00
ning
ffa7c4ee79 docs: delete ops.yaml 2023-10-12 15:50:33 +08:00
ning
86c4374238 refactor: unified heartbeat api 2023-10-12 15:27:02 +08:00
dependabot[bot]
4ee8f1b9ad build(deps): bump golang.org/x/net from 0.10.0 to 0.17.0 (#1738)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.10.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.10.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-12 14:47:43 +08:00
ning
47dcd2b054 refactor: update go.mod 2023-10-12 14:44:45 +08:00
yuweizzz
d099e6b85c refactor: convert task script line endings (#1717) 2023-10-09 10:30:39 +08:00
shardingHe
320401e8f3 fix: sync event ID for edge model (#1710)
* sync event id for edge model

* update PostByUrlsWithResp

* update test file

* update test file

* update mock struct

* update mock struct

* update mock struct

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-09 10:29:04 +08:00
Ulric Qin
6d75244f8f Merge branch 'main' of github.com:ccfos/nightingale 2023-10-09 07:01:27 +08:00
Ulric Qin
05ac5d51b5 forward metrics one by one 2023-10-09 07:01:13 +08:00
shardingHe
2b7c9a9673 feat: merge operation permission points from built-in (#1729)
* merge operation permission points from build-in

* optimizing spelling

* optimizing ops

* remove duplicate role operation

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-10-07 11:49:52 +08:00
Yening Qin
f76bfdf6b3 Datasource add is_default column (#1728)
* ds add is_default

* code refactor
2023-09-28 15:05:14 +08:00
Lars Lehtonen
7ebb70b896 prom: fix dropped errors (#1727) 2023-09-27 20:27:18 +08:00
Ulric Qin
960ed6bf70 Merge branch 'main' of github.com:ccfos/nightingale 2023-09-27 10:00:30 +08:00
Ulric Qin
b4fd4f1087 code refactor 2023-09-27 09:59:49 +08:00
shardingHe
109d6db1fc fix: Optimize user config (#1724)
* optimize initRSAFile

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-09-26 10:18:38 +08:00
shardingHe
07e2d9ed10 feat: configs from user crud (#1714)
* configs form user crud

* configs for user variable update & add InitRSAConfig for alert.go

* configs for user variable update

* remove InitRSAConfig for alert.go

* add config of Center.Encryption

* migrate for config and set default value

* add annotation for Center.Encryption

* remove userVariableCheck bool return

* optimize InitRSAConfig

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-09-25 20:44:07 +08:00
ning
c5cd6c0337 docs: update integrations 2023-09-25 13:27:35 +08:00
Yening Qin
fe1d566326 feat: add tdengine datasource (#1722)
* refactor: add alert stats

* add tdengine

* add sql tpl api
2023-09-25 11:40:20 +08:00
Lars Lehtonen
cedc918a09 pkg/secu: fix dropped error (#1698) 2023-09-25 11:03:00 +08:00
shardingHe
1e6c0865dd feat: merge i18n configuration (#1701)
* merge expand config and build-in, prioritize the settings within the expand config options in case of conflicts

* code refactor

* code refactor

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-09-22 17:49:22 +08:00
shardingHe
7649986b55 fix: oauth2 add debug log (#1706)
* oauth2 add CallbackOutput debug log

* code refactor

* code refactor

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-09-22 17:48:14 +08:00
yanli
86a82b409a Update user_group.go (#1718)
Fix :修复同步用户组接口BUG
2023-09-21 17:15:11 +08:00
710leo
f6ad9bdf82 add oidc log 2023-09-19 17:27:50 +08:00
Yening Qin
a647526084 config api (#1716) 2023-09-19 16:29:12 +08:00
qifenggang
44ed90e181 set target_ident when cache is not sync (#1708) 2023-09-18 17:51:59 +08:00
shardingHe
3e7273701d feat: add host ip for target (#1711)
* add host ip for target

* update host ip from heartbeat

* update host ip from heartbeat

* batch update host ip from heartbeat

* batch update host ip from heartbeat

* refactor code

* update target when either gid or host_ip has a new value

* rollback

* add debug log

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-09-14 14:45:16 +08:00
Ulric Qin
d77ed30940 add some sso logs 2023-09-11 18:30:30 +08:00
yuweizzz
5ae80e67a3 Update http_response_by_categraf.json (#1704) 2023-09-08 08:34:46 +08:00
idcdog
184389be33 1、增加victoriametrics单级版本视图(视图项定义参考官方grafana版本);2、针对集群版本视图调整datasource命名以便适配最新版本n9e (#1702) 2023-09-07 08:42:02 +08:00
ning
c1f022001f refactor: loki datasource 2023-09-01 23:14:08 +08:00
Tripitakav
616d56d515 feat: support loki datasources (#1700) 2023-09-01 23:07:27 +08:00
ning
10a0b5099e add debug log 2023-08-30 17:38:01 +08:00
chixianliangGithub
0815605298 Update reader.go (#1689)
remove  invalid code
2023-08-30 10:24:26 +08:00
ning
2df3216b32 refactor: event stats api 2023-08-29 15:33:24 +08:00
shardingHe
74491c666d Compute statistics for alert_cur_events (#1697)
* add statistics of alert_cur_events

* code refactor

* code refactor

* code refactor

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-29 15:15:32 +08:00
Ulric Qin
29a2eb6f2f code refactor 2023-08-28 11:13:29 +08:00
Ulric Qin
baf56746ce Merge branch 'main' of github.com:ccfos/nightingale 2023-08-28 10:33:48 +08:00
Ulric Qin
5867c5af8f update dashboard table demo 2023-08-28 10:33:34 +08:00
shardingHe
4a358f5cff fix: Automatically migrate recent table structure changes. (#1696)
* auto migrate alerting_engines[engine_cluster],task_record[event_id],chart_share[datasource_id],recording_rule[datasource_ids]

* auto migrate code refactor

* auto migrate drop chart_share

* auto migrate drop dashboard_id

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-25 18:46:22 +08:00
Ulric Qin
13f2b008fd code refactor 2023-08-25 17:52:09 +08:00
Ulric Qin
84400cd657 code refactor 2023-08-25 10:20:41 +08:00
Ulric Qin
f2a3a6933e Merge branch 'main' of github.com:ccfos/nightingale 2023-08-25 10:15:04 +08:00
Ulric Qin
0a4d1cad4c add linux_by_categraf_zh.json 2023-08-25 10:14:52 +08:00
李明
08f472f9ee fix: PostgreSQL bool int convert error (#1695)
* fix: postgres bool int

* fix: code optimize

* fix: updateby

* optimize

* Update es_index_pattern.go

---------

Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-24 20:53:14 +08:00
ning
7f73945c8d Merge branch 'main' of github.com:ccfos/nightingale 2023-08-24 16:09:43 +08:00
ning
56a7860b5a docs: update integrations 2023-08-24 16:07:55 +08:00
shardingHe
25dab86b8e feat: add mute preview and an option to delete the active alerts (#1692)
* add mute preview and an option to delete these active alerts(del_alert_cur:true)


---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-24 15:56:42 +08:00
Ulric Qin
35b90ca162 upgrade mysql version in docker-compose 2023-08-23 16:43:09 +08:00
Ulric Qin
5babee6de3 Merge branch 'main' of github.com:ccfos/nightingale 2023-08-22 20:43:38 +08:00
Ulric Qin
7567d440a9 update metrics.yaml 2023-08-22 20:43:17 +08:00
shardingHe
2ecd799dab fix: attempt to send an email (#1691)
* After configuring the SMTP, attempt to send the email.

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-22 17:43:12 +08:00
shardingHe
5b3561f983 fix: alert pre-save validation (#1690)
* add interface of validation rule
---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-22 17:42:05 +08:00
shardingHe
cce3711c02 fix: Edge config check (#1686)
* check the configuration for the CenterApi in the edge model

* set default timeout 5000 ms

* Update edge.go

* Update post.go

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-21 16:46:55 +08:00
ning
9cdbda0828 refactor: alert rule mute strategy 2023-08-19 20:17:24 +08:00
ning
9c4775fd38 refactor: alert subscribe gets api 2023-08-18 11:20:13 +08:00
ning
212e0aa4c3 refactor: alert subscribe verify 2023-08-18 10:20:33 +08:00
xtan
05300ec0e9 fix: compatible with pg (#1683)
* fix: fix pg migrate

* fix: compatible with pg

* fix: compatible with pg
2023-08-18 10:00:00 +08:00
shardingHe
67fb49e54e After configuring the SMTP, attempt to send the email. (#1684)
* After configuring the SMTP, attempt to send the email.

* refactor code

* refactor code

* Update router_notify_config.go

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-17 20:30:53 +08:00
shardingHe
7164b696b1 Change validate rule ignore non-default channels (#1680)
* change validate rule ignore non-default channels

* refactor code

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-08-17 18:18:20 +08:00
Yening Qin
8728167733 feat: sso support skip tls verify (#1685)
* refactor: sso support skip tls verify

* fix: update oidc

* fix: cas init enable
2023-08-17 18:03:26 +08:00
ning
6e80a63b68 refactor: cur event del 2023-08-17 14:20:03 +08:00
kongfei605
9e43a22ec3 use redis.Cmdable instead of Redis (#1681) 2023-08-16 18:37:57 +08:00
Ulric Qin
49d8ed4a6f Merge branch 'main' of github.com:ccfos/nightingale 2023-08-16 17:44:03 +08:00
Ulric Qin
c7b537e6c7 expose tags_map 2023-08-16 17:43:48 +08:00
shardingHe
f1cdd2fa46 refactor: modify the alert_subscribe to make the datasource optional (#1679)
* subscribe change 'pord','datasource_ids' to optional item

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-16 14:23:16 +08:00
ning
3d5ad02274 feat: notification proxy supports http 2023-08-14 18:00:51 +08:00
Ulric Qin
1cb9f4becf code refactor 2023-08-14 15:01:36 +08:00
Ulric Qin
0d0dafbe49 code refactor 2023-08-14 15:00:05 +08:00
Ulric Qin
048d1df2d1 code refactor 2023-08-14 14:59:28 +08:00
ning
4fb4154e30 feat: add FormatDecimal 2023-08-10 23:15:09 +08:00
ning
0be69bbccd feat: add FormatDecimal 2023-08-10 22:58:36 +08:00
shardingHe
7015a40256 refactor: alert subscribe verify check (#1666)
* add BusiGroupFilter for alert_subscribe ,copy from TagFiler

* refactor BusiGroupFilter

* refactor BusiGroupFilter

* refactor BusiGroupFilter

* AlertSubscribe verify check

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-09 13:26:00 +08:00
Ulric Qin
03cca642e9 modify email words 2023-08-08 16:30:51 +08:00
ulricqin
579fd3780b Update community-governance.md 2023-08-08 10:55:14 +08:00
Ulric Qin
a85d91c10e Merge branch 'main' of github.com:ccfos/nightingale 2023-08-08 07:55:40 +08:00
Ulric Qin
af31c496a1 datasource checker for loki 2023-08-08 07:55:27 +08:00
shardingHe
f9efbaa954 refactor: use config arguments (#1665)
Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-07 19:03:15 +08:00
Ulric Qin
d541ec7f20 Merge branch 'main' of github.com:ccfos/nightingale 2023-08-07 09:16:05 +08:00
Ulric Qin
1d847e2c6f refactor datasource check of loki 2023-08-07 09:15:52 +08:00
xtan
2fedf4f075 docs: pg init sql (#1663) 2023-08-07 08:33:11 +08:00
Tripitakav
e9a02c4c80 refactor: sync rule to scheduler (#1657) 2023-08-07 08:29:50 +08:00
ning
8beaccdded refactor: GetTagFilters 2023-08-05 12:39:10 +08:00
shardingHe
af6003da6d feat: Add BusiGroupFilter for alert_subscribe (#1660)
* add BusiGroupFilter for alert_subscribe ,copy from TagFiler

* refactor BusiGroupFilter

* refactor BusiGroupFilter

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-08-04 18:30:00 +08:00
ning
76ac2cd013 refactor version api 2023-08-03 18:06:09 +08:00
ning
859876e3f8 change version api 2023-08-03 16:34:42 +08:00
ning
7d49e7fb34 feat: add github version api 2023-08-03 15:46:18 +08:00
ning
6c42ae9077 fix: query-range skip tls verify 2023-08-03 14:42:45 +08:00
Yening Qin
15dcc60407 refactor: proxy api (#1656)
* refactor: proxy api
2023-08-02 17:20:06 +08:00
Ulric Qin
5b811b7003 Merge branch 'main' of github.com:ccfos/nightingale 2023-08-02 16:22:09 +08:00
Ulric Qin
55d670fe3c code refactor 2023-08-02 16:21:57 +08:00
ning
ac3a5e52c7 docs: update ldap config 2023-08-02 14:16:20 +08:00
李明
2abe00e251 fix: post err process (#1653) 2023-08-02 13:33:44 +08:00
热心网友吴溢豪
1bd3c29e39 fix: open cstats init (#1654)
Co-authored-by: wuyh_1 <wuyh_1@chinatelecom.cn>
2023-08-02 13:28:36 +08:00
Ulric Qin
1a8087bda7 update zookeeper markdown 2023-08-02 09:36:24 +08:00
Ulric Qin
72b4c2b1ec update markdown of vmware 2023-08-02 09:20:09 +08:00
Ulric Qin
38e6820d7b update markdown of VictoriaMetrics 2023-08-02 09:10:14 +08:00
Ulric Qin
765b3a57fe update markdown of tomcat 2023-08-02 09:03:59 +08:00
Ulric Qin
1c4a32f8fa code refactor 2023-08-02 09:01:39 +08:00
Ulric Qin
3f258fcebf update markdown of springboot 2023-08-02 08:57:39 +08:00
Ulric Qin
140f2cbfa8 update markdown if snmp 2023-08-02 08:44:45 +08:00
Ulric Qin
6aacd77492 update markdown of redis 2023-08-02 08:34:51 +08:00
Ulric Qin
ef3f46f8b7 update markdown of integration RabbitMQ 2023-08-02 08:26:06 +08:00
Ulric Qin
0cdd25d2cf update markdown of integration Processes 2023-08-02 08:20:51 +08:00
Ulric Qin
5d02ce0636 update markdown of integration procstat 2023-08-02 08:17:41 +08:00
Ulric Qin
0cd1228ba7 update postgres markdown 2023-08-02 07:38:46 +08:00
Ulric Qin
0595401d14 update oracle markdown 2023-08-02 07:30:03 +08:00
Yening Qin
d724f8cc8e fix get tpl (#1655) 2023-08-02 00:48:02 +08:00
Ulric Qin
a3f5d458d7 add nginx markdown 2023-08-01 18:21:17 +08:00
Ulric Qin
76bfb130b0 code refactor 2023-08-01 18:05:50 +08:00
Ulric Qin
184bb78e3b add markdown of integration n9e 2023-08-01 17:49:49 +08:00
Ulric Qin
6a41af2cb2 update markdown of integration mysql 2023-08-01 17:42:30 +08:00
Ulric Qin
faa149cc87 code refactor 2023-08-01 17:26:10 +08:00
Ulric Qin
24592fe480 code refactor 2023-08-01 17:18:12 +08:00
Ulric Qin
4be53082e0 code refactor 2023-08-01 17:04:36 +08:00
Ulric Qin
ae8c9c668c code refactor 2023-08-01 16:52:21 +08:00
Ulric Qin
b0c15af04f code refactor 2023-08-01 16:39:46 +08:00
Ulric Qin
c05b710aff update markdown of kafka integration 2023-08-01 16:21:45 +08:00
Ulric Qin
4299c48aef update markdown of IPMI integration 2023-08-01 15:58:27 +08:00
Ulric Qin
ae0523dec0 code refactor 2023-08-01 15:50:26 +08:00
Ulric Qin
e18a6bda7b update markdown of integration http_response 2023-08-01 15:47:45 +08:00
Ulric Qin
e64be95f1c code refactor 2023-08-01 15:25:14 +08:00
Ulric Qin
a1aa0150f8 update markdown of integration elasticsearch 2023-08-01 15:14:27 +08:00
Ulric Qin
32f9cb5996 update markdown of ceph integration 2023-08-01 14:59:10 +08:00
Ulric Qin
3b7e692b01 update markdown of aliyun integration 2023-08-01 14:54:52 +08:00
Yening Qin
6491eba1da check datasource (#1651) 2023-07-31 15:32:03 +08:00
ning
bb7ea7e809 code refactor 2023-07-27 16:36:04 +08:00
ning
169930e3b8 docs: add markdown 2023-07-27 16:33:38 +08:00
ning
8e14047f36 docs: add markdown 2023-07-27 16:25:07 +08:00
ning
fd29a96f7b docs: remove jaeger 2023-07-27 15:24:58 +08:00
ning
820c12f230 docs: update markdown 2023-07-27 15:13:54 +08:00
ning
ff3550e7b3 docs: update markdown 2023-07-27 15:09:32 +08:00
ning
b65e43351d Merge branch 'main' of github.com:ccfos/nightingale 2023-07-27 15:02:58 +08:00
ning
3fb74b632b docs: update markdown 2023-07-27 15:02:35 +08:00
xtan
253e54344d docs: fix docker-compose for pg-vm (#1649) 2023-07-27 10:50:38 +08:00
ning
f1ee7d24a6 fix: sub rule filter 2023-07-26 18:14:45 +08:00
ning
475673b3e7 fix: admin role get targets 2023-07-26 16:49:38 +08:00
Yening Qin
dd49afef01 support markdown api and downtime select (#1645) 2023-07-25 17:06:25 +08:00
kongfei605
d0c842fe87 Merge pull request #1644 from ccfos/docker_update
install requests lib for python3
2023-07-25 11:19:14 +08:00
kongfei
b873bd161e install requests lib for python3 2023-07-25 11:18:28 +08:00
yimiaoxiehou
60b76b9ccc add static ttf file route (#1641)
Co-authored-by: chenzebin <chenzebin@ut.cn>
2023-07-20 19:40:03 +08:00
ning
ef39ee2f66 Merge branch 'main' of github.com:ccfos/nightingale 2023-07-20 18:01:21 +08:00
ning
6c83c2ef9b fix: panic when query data get cli is nil 2023-07-20 18:01:09 +08:00
李明
9495ec67ab feat: support index pattern datasource_id param (#1640)
* feat:support index pattern datasource id param
2023-07-20 14:09:21 +08:00
青牛踏雪
bb5680f6c4 fix windows dashboards label_values (#1639) 2023-07-20 11:49:20 +08:00
李明
acbe49f518 index pattern basic op (#1635)
* index pattern basic op
2023-07-20 11:21:51 +08:00
青牛踏雪
9dd55938c2 add Gitlab dashboard and alert rules based on categraf acquisition (#1636) 2023-07-20 11:17:21 +08:00
shardingHe
5433e6e27e AlertAggrView update verify (#1637)
Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-07-19 20:33:56 +08:00
ning
2dd6eb5f0f fix: get targets 2023-07-19 15:56:25 +08:00
Yening Qin
1731713dbb fix: get all target by guest user (#1634)
* fix: targets api get all

* code refactor
2023-07-19 15:08:10 +08:00
Ulric Qin
327ddb7bad code refactor 2023-07-17 19:47:58 +08:00
Ulric Qin
9e4adc1fa2 code refactor 2023-07-17 17:47:14 +08:00
Ulric Qin
bce7fdb470 code refactor 2023-07-17 17:13:56 +08:00
Ulric Qin
b79422962c code refactor 2023-07-17 17:13:39 +08:00
Ulric Qin
e5989ae5c2 rename integration Mongo to MongoDB 2023-07-17 17:11:27 +08:00
Ulric Qin
64feafa3a6 code refactor 2023-07-17 12:18:15 +08:00
Ulric Qin
52e4fa4d0d rename obs to dumper 2023-07-17 07:04:01 +08:00
Ulric Qin
6462c02861 rename obs to dumper 2023-07-17 07:01:19 +08:00
Ulric Qin
c657182659 refactor forward series 2023-07-16 11:35:56 +08:00
Ulric Qin
04d93eff34 refactor observe functions 2023-07-16 10:29:55 +08:00
Ulric Qin
40d60aeb4a add observe 2023-07-16 10:06:41 +08:00
Ulric Qin
ac875fa1b9 fix logger format output 2023-07-16 06:38:57 +08:00
shardingHe
b7c3e8a4f5 add interface of validation rule (#1606)
* add interface of validation rule

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
Co-authored-by: Yening Qin <710leo@gmail.com>
2023-07-14 14:16:35 +08:00
ning
2524e15947 Merge branch 'main' of github.com:ccfos/nightingale 2023-07-14 11:45:07 +08:00
ning
995c579403 docs: update built-in alert rule 2023-07-14 11:44:55 +08:00
Ulric Qin
848b7ac1ae Merge branch 'main' of github.com:ccfos/nightingale 2023-07-14 11:37:50 +08:00
Ulric Qin
9476b5ba7c code refactor 2023-07-14 11:37:39 +08:00
ning
7b58696bdc Merge branch 'main' of github.com:ccfos/nightingale 2023-07-14 11:16:19 +08:00
ning
6159178d99 set alert_rule.promql empty 2023-07-14 11:16:06 +08:00
青牛踏雪
99e5e0c117 add MinIO dashboard and alert rules based on categraf acquisition (#1625)
* add MinIO  dashboard and alert rules based on categraf acquisition

* add MinIO dashboard and alert rules based on categraf acquisition

* add MinIO dashboard and alert rules based on categraf acquisition

* add MinIO dashboard and alert rules based on categraf acquisition
2023-07-14 10:37:35 +08:00
Yening Qin
be1a3c1d8b sub and mute rule by severity (#1621)
* sub severity

* mute by severity
2023-07-13 11:16:32 +08:00
Yening Qin
f6378b055c docs: optimize the name of the integrations directory 2023-07-12 18:09:05 +08:00
Yening Qin
2574bb19cd rename integration ceph 2023-07-12 17:59:59 +08:00
Yening Qin
aa9d43cc69 rename integration 2023-07-12 17:58:20 +08:00
李明
d7f18ebec1 add mute hook (#1617) 2023-07-12 16:39:43 +08:00
Ulric Qin
b40f6976bb code refactor 2023-07-12 15:31:09 +08:00
Ulric Qin
cd1db57b7c code refactor 2023-07-12 15:05:58 +08:00
青牛踏雪
5a6ca42c75 add ceph dashboard and alert rules based on categraf acquisition (#1619) 2023-07-11 15:08:00 +08:00
Ulric Qin
80874a743c refactor logic: do not extract ident when ignore_ident exists 2023-07-11 14:56:53 +08:00
ulricqin
6cc612564f fix alert mute compute (#1618) 2023-07-11 10:12:09 +08:00
Yening Qin
909bbb5e66 refactor alert eval (#1616) 2023-07-10 18:49:15 +08:00
青牛踏雪
ff3ea7de58 update postgresql dashboard and alert rules based on categraf acquisition (#1613) 2023-07-07 15:09:27 +08:00
kongfei605
dd316e6ce1 alerts rule and dashboards for pg (#1612) 2023-07-07 14:04:55 +08:00
kongfei605
ba893e77cd update title of tidb alerts (#1611) 2023-07-07 14:04:15 +08:00
青牛踏雪
21904f1e39 add kafka dashboard and alert rules based on categraf acquisition (#1607)
* add kafka dashboard and alert rules based on categraf acquisition

* add kafka dashboard and alert rules based on categraf acquisition
2023-07-06 19:54:37 +08:00
kongfei605
b5d5ecbab2 Merge pull request #1605 from longzhuquan/main
添加TiDB大盘,告警规则
2023-07-06 16:11:19 +08:00
Talon
ee612908ac feat(Login): add rsa to password (#1604) 2023-07-06 16:09:04 +08:00
Yong Wang (IT)
2ee04dffac 添加TiDB大盘,告警规则 2023-07-06 15:54:39 +08:00
青牛踏雪
be25adf990 add dashboard and alert rules based on categraf acquisition (#1603) 2023-07-06 15:50:53 +08:00
dependabot[bot]
ab72b6e1ba build(deps): bump google.golang.org/grpc from 1.51.0 to 1.53.0 (#1602) 2023-07-06 15:50:04 +08:00
laiwei
a4718e7a45 use star-history 2023-07-04 11:26:01 +08:00
青牛踏雪
f948d50d8b add springboot actuator 2.0 dashboard (#1601) 2023-07-03 19:38:40 +08:00
Ulric Qin
cb797d5913 Merge branch 'main' of github.com:ccfos/nightingale 2023-07-03 19:13:34 +08:00
Ulric Qin
8941c192de code refactor 2023-07-03 19:13:23 +08:00
alick-liming
5b726c1e61 optimize i18n format (#1600) 2023-07-01 15:45:59 +08:00
xtan
03871a0bf0 feat: provide alert info to ibex via stdin (#1599)
* feat: provide alert info to ibex via stdin

* refactor: rename tags to stdin

* refactor: format json to ibex
2023-06-30 19:03:56 +08:00
青牛踏雪
e002e9cb8f add VictoriaMetrics New Alerts Rule & add VictoriaMetrics Images. (#1598) 2023-06-29 15:47:48 +08:00
qifenggang
d414831c79 heartbeat update target table update_at field (#1595)
Co-authored-by: qifenggang <qifenggang@sina.com>
2023-06-29 15:47:11 +08:00
alick-liming
89807ada94 i18n const -> var (#1594) 2023-06-28 22:24:53 +08:00
青牛踏雪
351a31b079 fix ipmi readme.md (#1592)
* fix ipmi readme.md

* fix ipmi readme.md
2023-06-28 14:25:12 +08:00
青牛踏雪
af0127c905 add the ipmi dashboards & alerts rules (#1588) 2023-06-27 21:07:10 +08:00
青牛踏雪
95612e7140 add the kube-state-metrics, prometheus, kube-controller-plane alarm & record rules (#1586) 2023-06-26 16:45:45 +08:00
ning
a338b5233c code refactor 2023-06-25 20:30:58 +08:00
ning
ad26225f63 refactor: recording rule model 2023-06-25 19:14:19 +08:00
ning
16db570f18 refactor datasource 2023-06-25 17:04:02 +08:00
ning
97c68360a1 Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-25 10:23:16 +08:00
ning
00192b9d0f code refactor 2023-06-25 10:23:04 +08:00
Ulric Qin
e745253d08 refactor integrations and add configuration: UseFileAssets 2023-06-22 18:27:25 +08:00
ning
76905c55d5 refactor: loki datasource check 2023-06-21 21:36:21 +08:00
kongfei605
d4bce5456b snmp & smart dashboards (#1581)
* snmp & smart dashboards

* update

* update snmp
2023-06-21 20:42:50 +08:00
Ulric Qin
58136d30e6 code refactor 2023-06-21 14:59:38 +08:00
Ulric Qin
563fb0330a code refactor 2023-06-21 14:35:36 +08:00
Ulric Qin
c2ab3b4240 Merge branch 'main' of github.com:ccfos/nightingale 2023-06-21 14:05:22 +08:00
Ulric Qin
f5dde6e4d6 fix wrong descriptions 2023-06-21 14:05:08 +08:00
青牛踏雪
a9779703dd add AliYun monitor dashboard & readme.md (#1579) 2023-06-20 15:37:08 +08:00
青牛踏雪
9f4a9e77ae add vmware RabbitMQ monitor dashboard & alerts & readme.md (#1578) 2023-06-20 13:45:11 +08:00
Ulric Qin
df37071c3d refactor pushgw 2023-06-20 10:34:14 +08:00
ning
fa164ac5d2 Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-16 18:17:21 +08:00
ning
f5de4c3f22 refactor db2fe 2023-06-16 18:17:14 +08:00
ning
dd9099af0a refactor db2fe 2023-06-16 18:13:28 +08:00
dependabot[bot]
5bdb63a818 build(deps): bump golang.org/x/image (#1575)
Bumps [golang.org/x/image](https://github.com/golang/image) from 0.0.0-20190501045829-6d32002ffd75 to 0.5.0.
- [Commits](https://github.com/golang/image/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: golang.org/x/image
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-16 17:58:27 +08:00
ning
8a4c709e87 refactor models 2023-06-16 17:44:26 +08:00
xtan
75f6e07c40 feat: add verification code for login (#1566)
* feat: add verification code for login

* feat: 支持图形验证码开关
2023-06-16 14:47:11 +08:00
Yening Qin
de9b11a049 recording rule add query configs (#1574)
* add query config

* migrate table
2023-06-16 14:40:54 +08:00
ning
067b3f91a7 refactor: change default notify tpl 2023-06-15 19:21:18 +08:00
ning
5d215a89b6 refactor: optimize alert mute when time is 23:59:xx 2023-06-15 18:02:25 +08:00
ning
63679c15dd Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-15 17:01:29 +08:00
ning
38229a43dc refactor: notify tpl 2023-06-15 17:01:17 +08:00
青牛踏雪
1d1ae238d4 add elasticsearch_by_categraf monitor dashboard & alerts & markdown (#1573) 2023-06-15 16:30:06 +08:00
ning
c2d300c0f1 Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-15 16:17:38 +08:00
ning
bcb89017a0 refactor: remove default notify template file 2023-06-15 16:17:26 +08:00
Ulric Qin
e04a3eed5f Merge branch 'main' of github.com:ccfos/nightingale 2023-06-15 15:26:13 +08:00
Ulric Qin
e77cf40938 add n9e v6 dashboard 2023-06-15 15:25:58 +08:00
ning
cb66b19d70 refactor: change configs.cval length 2023-06-15 14:35:21 +08:00
ning
9edf05c19a Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-15 11:22:39 +08:00
ning
6a6b4a2283 update ops 2023-06-15 11:22:26 +08:00
青牛踏雪
0473bb3925 add springboot actuator monitor dashboard & alerts & markdown (#1571) 2023-06-15 08:04:00 +08:00
ning
4afc3a60a4 Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-14 13:56:13 +08:00
shardingHe
e9c9a3ac58 feat: notify tpl support add and delete (#1567)
* notifyTpl add and delete

* notifyTpl add and delete

* optimization notifyTpl

* optimization notifyTpl

* optimization notifyTpl

* optimization notifyTpl

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-06-14 13:51:53 +08:00
ning
98260e239e Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-14 13:26:18 +08:00
ning
f751b2034d fix: recovery event tags being lost after promql modification 2023-06-14 13:26:01 +08:00
青牛踏雪
9ce22a33f0 add vmware vsphere monitor dashboard & alerts & readme.md (#1565) 2023-06-13 07:18:04 +08:00
laiwei
3da64ca0fe refine readme 2023-06-12 20:49:25 +08:00
ning
9a883dc02c refactor: feishu_card sender 2023-06-12 12:21:49 +08:00
Ulric Qin
5ab6fe7e56 code refactor 2023-06-12 10:22:51 +08:00
shardingHe
c730eaa860 Move feishucard (#1563)
* Fix an exception situation where the prod and cate fields cannot be updated.

* add feishucard.tpl

* move feishucard to v6

---------

Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-06-11 21:04:02 +08:00
ning
5ba2d6bc8e fix: concurrent map writes 2023-06-09 17:49:34 +08:00
ning
64feee79ff Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-09 10:07:54 +08:00
ning
c490ab09ad fix: cli upgrade 2023-06-09 10:07:42 +08:00
shardingHe
61762e894c Fix: an exception situation where the prod and cate fields cannot be updated. (#1561)
Co-authored-by: shardingHe <wangzihe@flashcat.cloud>
2023-06-08 15:05:29 +08:00
ning
ac4ff33dff refactor: remove phone space 2023-06-07 10:22:44 +08:00
ning
72abeea51f add user login log 2023-06-06 13:41:22 +08:00
ning
6ec2b42669 code refactor 2023-06-05 15:30:10 +08:00
ning
a93e967d30 refactor: update target_up 2023-06-05 14:59:56 +08:00
ning
b5984b7871 Merge branch 'main' of ssh://github.com/ccfos/nightingale 2023-06-05 14:42:40 +08:00
ning
70ccbbc929 update target_up 2023-06-05 14:42:28 +08:00
dependabot[bot]
79d4fc508c build(deps): bump github.com/gin-gonic/gin from 1.9.0 to 1.9.1 (#1559)
Bumps [github.com/gin-gonic/gin](https://github.com/gin-gonic/gin) from 1.9.0 to 1.9.1.
- [Release notes](https://github.com/gin-gonic/gin/releases)
- [Changelog](https://github.com/gin-gonic/gin/blob/master/CHANGELOG.md)
- [Commits](https://github.com/gin-gonic/gin/compare/v1.9.0...v1.9.1)

---
updated-dependencies:
- dependency-name: github.com/gin-gonic/gin
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-03 20:38:05 +08:00
ning
794f0f874f change HostDatasourceId 2023-06-02 13:12:53 +08:00
Ulric Qin
aff53e8be3 Merge branch 'main' of github.com:ccfos/nightingale 2023-06-02 12:11:47 +08:00
Ulric Qin
2de6847323 refactor fe.sh 2023-06-02 12:11:35 +08:00
ning
eed037a3a1 change heartbeat api 2023-06-02 11:57:15 +08:00
ning
4099c467bb code refactor 2023-06-02 11:42:40 +08:00
ning
6b51adbc9a code refactor 2023-06-02 11:29:30 +08:00
ning
307be1dda2 fix: datasource bind to alert engine where update 2023-06-02 11:22:05 +08:00
ning
7da6145ec6 fix: promClients hit 2023-06-02 10:19:42 +08:00
Ulric Qin
0e4298a592 use standard http client instead of beego client 2023-06-02 09:58:10 +08:00
Ulric Qin
037fab74eb code refactor 2023-06-02 09:20:24 +08:00
Ulric Qin
fb849928c9 code refactor 2023-06-02 08:33:41 +08:00
Ulric Qin
7833aae0a1 code refactor 2023-06-02 08:13:08 +08:00
Ulric Qin
6edd71b1f0 code refactor 2023-06-02 08:06:40 +08:00
ulricqin
2f2f310a40 add hearbeat api for pushgw (#1560) 2023-06-02 08:06:23 +08:00
Ulric Qin
14bfdaa2ee code refactor 2023-06-02 07:39:04 +08:00
Ulric Qin
ffd0a69e43 fix: leaking connections 2023-06-02 07:31:21 +08:00
Ulric Qin
5b79d0ef46 code refactor 2023-06-01 21:20:21 +08:00
Ulric Qin
8f2a885a7d code refactor 2023-06-01 21:16:51 +08:00
Ulric Qin
31f6300c16 code refactor 2023-06-01 21:09:04 +08:00
Ulric Qin
54710c22f0 code refactor 2023-06-01 21:03:18 +08:00
Ulric Qin
352aa2b6b1 code refactor 2023-06-01 20:57:36 +08:00
Ulric Qin
624e5b5e62 debug 2023-06-01 20:45:51 +08:00
ning
65e3b5c8f1 fix: goreleaser 2023-06-01 20:28:28 +08:00
ning
750732f203 docs: update makefile 2023-06-01 20:16:23 +08:00
ning
9957711643 add n9e-edge 2023-06-01 20:01:29 +08:00
Ulric Qin
8f4fb0d28b code refactor for fe.sh 2023-06-01 19:28:54 +08:00
Ulric Qin
5d63f23cfc code refactor 2023-06-01 18:27:00 +08:00
Ulric Qin
c0fb8d22db code refactor 2023-06-01 18:02:01 +08:00
ulricqin
1732b297b1 refactor basic auth configurations: merge HTTP.Pushgw and HTTP.Heartbeat to HTTP.APIForAgent; merge HTTP.Alert and HTTP.Service to HTTP.APIForService (#1558) 2023-06-01 16:23:19 +08:00
660 changed files with 117360 additions and 16935 deletions

11
.gitignore vendored
View File

@@ -30,7 +30,11 @@ _test
/dist
/etc/*.local.yml
/etc/*.local.conf
/etc/rsa/*
/etc/plugins/*.local.yml
/etc/script/rules.yaml
/etc/script/alert-rules.json
/etc/script/record-rules.json
/data*
/tarball
/run
@@ -40,9 +44,12 @@ _test
/n9e
/docker/pub
/docker/n9e
/docker/mysqldata
/docker/experience_pg_vm/pgdata
/docker/compose-bridge/mysqldata
/docker/compose-host-network/mysqldata
/docker/compose-postgres/pgdata
/etc.local*
/front/statik/statik.go
/docker/compose-bridge/etc-nightingale/rsa/
.alerts
.idea

View File

@@ -15,7 +15,8 @@ builds:
- id: build
hooks:
pre:
- ./fe.sh
- cmd: sh -x ./fe.sh
output: true
main: ./cmd/center/
binary: n9e
env:
@@ -41,22 +42,9 @@ builds:
ldflags:
- -s -w
- -X github.com/ccfos/nightingale/v6/pkg/version.Version={{ .Tag }}-{{.Commit}}
- id: build-alert
main: ./cmd/alert/
binary: n9e-alert
env:
- CGO_ENABLED=0
goos:
- linux
goarch:
- amd64
- arm64
ldflags:
- -s -w
- -X github.com/ccfos/nightingale/v6/pkg/version.Version={{ .Tag }}-{{.Commit}}
- id: build-pushgw
main: ./cmd/pushgw/
binary: n9e-pushgw
- id: build-edge
main: ./cmd/edge/
binary: n9e-edge
env:
- CGO_ENABLED=0
goos:
@@ -73,8 +61,7 @@ archives:
builds:
- build
- build-cli
- build-alert
- build-pushgw
- build-edge
format: tar.gz
format_overrides:
- goos: windows
@@ -84,7 +71,6 @@ archives:
files:
- docker/*
- etc/*
- pub/*
- integrations/*
- cli/*
- n9e.sql
@@ -104,7 +90,6 @@ dockers:
- build
dockerfile: docker/Dockerfile.goreleaser
extra_files:
- pub
- etc
- integrations
use: buildx
@@ -118,7 +103,6 @@ dockers:
- build
dockerfile: docker/Dockerfile.goreleaser.arm64
extra_files:
- pub
- etc
- integrations
use: buildx

View File

@@ -1,4 +1,4 @@
.PHONY: prebuild start build
.PHONY: prebuild build
ROOT:=$(shell pwd -P)
GIT_COMMIT:=$(shell git --work-tree ${ROOT} rev-parse 'HEAD^{commit}')
@@ -6,16 +6,19 @@ _GIT_VERSION:=$(shell git --work-tree ${ROOT} describe --tags --abbrev=14 "${GIT
TAG=$(shell echo "${_GIT_VERSION}" | awk -F"-" '{print $$1}')
RELEASE_VERSION:="$(TAG)-$(GIT_COMMIT)"
all: prebuild build
prebuild:
echo "begin download and embed the front-end file..."
sh fe.sh
echo "front-end file download and embedding completed."
all: build
build:
go build -ldflags "-w -s -X github.com/ccfos/nightingale/v6/pkg/version.Version=$(RELEASE_VERSION)" -o n9e ./cmd/center/main.go
build-edge:
go build -ldflags "-w -s -X github.com/ccfos/nightingale/v6/pkg/version.Version=$(RELEASE_VERSION)" -o n9e-edge ./cmd/edge/
build-alert:
go build -ldflags "-w -s -X github.com/ccfos/nightingale/v6/pkg/version.Version=$(RELEASE_VERSION)" -o n9e-alert ./cmd/alert/main.go
@@ -28,10 +31,10 @@ build-cli:
run:
nohup ./n9e > n9e.log 2>&1 &
run_alert:
run-alert:
nohup ./n9e-alert > n9e-alert.log 2>&1 &
run_pushgw:
run-pushgw:
nohup ./n9e-pushgw > n9e-pushgw.log 2>&1 &
release:

View File

@@ -1,6 +1,6 @@
<p align="center">
<a href="https://github.com/ccfos/nightingale">
<img src="doc/img/nightingale_logo_h.png" alt="nightingale - cloud native monitoring" width="240" /></a>
<img src="doc/img/Nightingale_L_V.png" alt="nightingale - cloud native monitoring" width="240" /></a>
</p>
<p align="center">
@@ -20,36 +20,77 @@
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-blue"/>
</p>
<p align="center">
告警管理专家,一体化开源观测平台!
An open-source cloud-native monitoring system that is <b>all-in-one</b> <br/>
<b>Out-of-the-box</b>, it integrates data collection, visualization, and monitoring alert <br/>
We recommend upgrading your <b>Prometheus + AlertManager + Grafana</b> combination to Nightingale!
</p>
[English](./README_en.md) | [中文](./README.md)
[English](./README.md) | [中文](./README_zh.md)
## 资料
- 文档:[https://flashcat.cloud/docs/](https://flashcat.cloud/docs/)
- 论坛提问:[https://answer.flashcat.cloud/](https://answer.flashcat.cloud/)
- 报Bug[https://github.com/ccfos/nightingale/issues](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&projects=&template=bug_report.yml)
- 商业版本:[企业版](https://mp.weixin.qq.com/s/FOwnnGPkRao2ZDV574EHrw) | [专业版](https://mp.weixin.qq.com/s/uM2a8QUDJEYwdBpjkbQDxA) 感兴趣请 [联系我们交流试用](https://flashcat.cloud/contact/)
## Highlighted Features
## 功能和特点
- **Out-of-the-box**
- Supports multiple deployment methods such as **Docker, Helm Chart, and cloud services**, integrates data collection, monitoring, and alerting into one system, and comes with various monitoring dashboards, quick views, and alert rule templates. **It greatly reduces the construction cost, learning cost, and usage cost of cloud-native monitoring systems**.
- **Professional Alerting**
- Provides visual alert configuration and management, supports various alert rules, offers the ability to configure silence and subscription rules, supports multiple alert delivery channels, and has features such as alert self-healing and event management.
- **Cloud-Native**
- Quickly builds an enterprise-level cloud-native monitoring system through a turnkey approach, supports multiple collectors such as [Categraf](https://github.com/flashcatcloud/categraf), Telegraf, and Grafana-agent, supports multiple data sources such as Prometheus, VictoriaMetrics, M3DB, ElasticSearch, and Jaeger, and is compatible with importing Grafana dashboards. **It seamlessly integrates with the cloud-native ecosystem**.
- **High Performance and High Availability**
- Due to the multi-data-source management engine of Nightingale and its excellent architecture design, and utilizing a high-performance time-series database, it can handle data collection, storage, and alert analysis scenarios with billions of time-series data, saving a lot of costs.
- Nightingale components can be horizontally scaled with no single point of failure. It has been deployed in thousands of enterprises and tested in harsh production practices. Many leading Internet companies have used Nightingale for cluster machines with hundreds of nodes, processing billions of time-series data.
- **Flexible Extension and Centralized Management**
- Nightingale can be deployed on a 1-core 1G cloud host, deployed in a cluster of hundreds of machines, or run in Kubernetes. Time-series databases, alert engines, and other components can also be decentralized to various data centers and regions, balancing edge deployment with centralized management. **It solves the problem of data fragmentation and lack of unified views**.
- **统一接入各种时序库**:支持对接 Prometheus、VictoriaMetrics、Thanos、Mimir、M3DB 等多种时序库,实现统一告警管理
- **专业告警能力**:内置支持多种告警规则,可以扩展支持所有通知媒介,支持告警屏蔽、告警抑制、告警自愈、告警事件管理
- **无缝搭配 [FlashDuty](https://flashcat.cloud/product/flashcat-duty/)**实现告警聚合收敛、认领、升级、排班、IM集成确保告警处理不遗漏减少打扰更好协同
- **支持所有常见采集器**:支持 categraf、telegraf、grafana-agent、datadog-agent、给类 exporter 作为采集器,没有什么数据是不能监控的
- **统一的观测平台**:从 v6 版本开始,支持接入 ElasticSearch、Jaeger 数据源,逐步实现日志、链路、指标的一体化观测
## 产品示意图
#### If you are using Prometheus and have one or more of the following requirement scenarios, it is recommended that you upgrade to Nightingale:
- Multiple systems such as Prometheus, Alertmanager, Grafana, etc. are fragmented and lack a unified view and cannot be used out of the box;
- The way to manage Prometheus and Alertmanager by modifying configuration files has a big learning curve and is difficult to collaborate;
- Too much data to scale-up your Prometheus cluster;
- Multiple Prometheus clusters running in production environments, which faced high management and usage costs;
#### If you are using Zabbix and have the following scenarios, it is recommended that you upgrade to Nightingale:
- Monitoring too much data and wanting a better scalable solution;
- A high learning curve and a desire for better efficiency of collaborative use in a multi-person, multi-team model;
- Microservice and cloud-native architectures with variable monitoring data lifecycles and high monitoring data dimension bases, which are not easily adaptable to the Zabbix data model;
#### If you are using [open-falcon](https://github.com/open-falcon/falcon-plus), we recommend you to upgrade to Nightingale
- For more information about open-falcon and Nightingale, please refer to read [Ten features and trends of cloud-native monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## Getting Started
[https://n9e.github.io/](https://n9e.github.io/)
## Screenshots
https://user-images.githubusercontent.com/792850/216888712-2565fcea-9df5-47bd-a49e-d60af9bd76e8.mp4
## Architecture
## 加入交流群
<img src="doc/img/arch-product.png" width="600">
欢迎加入 QQ 交流群群号479290895也可以扫下方二维码加入微信交流群
Nightingale monitoring can receive monitoring data reported by various collectors (such as [Categraf](https://github.com/flashcatcloud/categraf) , telegraf, grafana-agent, Prometheus, etc.) and write them to various popular time-series databases (such as Prometheus, M3DB, VictoriaMetrics, Thanos, TDEngine, etc.). It provides configuration capabilities for alert rules, silence rules, and subscription rules, as well as the ability to view monitoring data. It also provides automatic alarm self-healing mechanisms (such as automatically calling back to a webhook address or executing a script after an alarm is triggered), and the ability to store and manage historical alarm events and view them in groups.
<img src="doc/img/wecom.png" width="240">
If the performance of a standalone time-series database (such as Prometheus) has bottlenecks or poor disaster recovery, we recommend using [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics). The VictoriaMetrics architecture is relatively simple, has excellent performance, and is easy to deploy and maintain. The architecture diagram is as shown above. For more detailed documentation on VictoriaMetrics, please refer to its [official website](https://victoriametrics.com/).
**We welcome you to participate in the Nightingale open-source project and community in various ways, including but not limited to**
- Adding and improving documentation => [n9e.github.io](https://n9e.github.io/)
- Sharing your best practices and experience in using Nightingale monitoring => [Article sharing]((https://n9e.github.io/docs/prologue/share/))
- Submitting product suggestions => [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
- Submitting code to make Nightingale monitoring faster, more stable, and easier to use => [github pull request](https://github.com/didi/nightingale/pulls)
**Respecting, recognizing, and recording the work of every contributor** is the first guiding principle of the Nightingale open-source community. We advocate effective questioning, which not only respects the developer's time but also contributes to the accumulation of knowledge in the entire community
- Before asking a question, please first refer to the [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
- We use [GitHub Discussions](https://github.com/ccfos/nightingale/discussions) as the communication forum. You can search and ask questions here.
- We also recommend that you join ours [Slack channel](https://n9e-talk.slack.com/) to exchange experiences with other Nightingale users.
## Who is using Nightingale
You can register your usage and share your experience by posting on **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)**.
## Stargazers over time
[![Stargazers over time](https://starchart.cc/ccfos/nightingale.svg)](https://starchart.cc/ccfos/nightingale)
@@ -60,9 +101,4 @@ https://user-images.githubusercontent.com/792850/216888712-2565fcea-9df5-47bd-a4
</a>
## License
[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)
## 社区管理
[夜莺开源项目和社区治理架构(草案)](./doc/community-governance.md)
[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)

View File

@@ -1,104 +0,0 @@
<p align="center">
<a href="https://github.com/ccfos/nightingale">
<img src="doc/img/nightingale_logo_h.png" alt="nightingale - cloud native monitoring" width="240" /></a>
</p>
<p align="center">
<img alt="GitHub latest release" src="https://img.shields.io/github/v/release/ccfos/nightingale"/>
<a href="https://n9e.github.io">
<img alt="Docs" src="https://img.shields.io/badge/docs-get%20started-brightgreen"/></a>
<a href="https://hub.docker.com/u/flashcatcloud">
<img alt="Docker pulls" src="https://img.shields.io/docker/pulls/flashcatcloud/nightingale"/></a>
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/ccfos/nightingale">
<img alt="GitHub Repo issues" src="https://img.shields.io/github/issues/ccfos/nightingale">
<img alt="GitHub Repo issues closed" src="https://img.shields.io/github/issues-closed/ccfos/nightingale">
<img alt="GitHub forks" src="https://img.shields.io/github/forks/ccfos/nightingale">
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img alt="GitHub contributors" src="https://img.shields.io/github/contributors-anon/ccfos/nightingale"/></a>
<a href="https://n9e-talk.slack.com/">
<img alt="GitHub contributors" src="https://img.shields.io/badge/join%20slack-%23n9e-brightgreen.svg"/></a>
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-blue"/>
</p>
<p align="center">
An open-source cloud-native monitoring system that is <b>all-in-one</b> <br/>
<b>Out-of-the-box</b>, it integrates data collection, visualization, and monitoring alert <br/>
We recommend upgrading your <b>Prometheus + AlertManager + Grafana</b> combination to Nightingale!
</p>
[English](./README.md) | [中文](./README_ZH.md)
## Highlighted Features
- **Out-of-the-box**
- Supports multiple deployment methods such as **Docker, Helm Chart, and cloud services**, integrates data collection, monitoring, and alerting into one system, and comes with various monitoring dashboards, quick views, and alert rule templates. **It greatly reduces the construction cost, learning cost, and usage cost of cloud-native monitoring systems**.
- **Professional Alerting**
- Provides visual alert configuration and management, supports various alert rules, offers the ability to configure silence and subscription rules, supports multiple alert delivery channels, and has features such as alert self-healing and event management.
- **Cloud-Native**
- Quickly builds an enterprise-level cloud-native monitoring system through a turnkey approach, supports multiple collectors such as [Categraf](https://github.com/flashcatcloud/categraf), Telegraf, and Grafana-agent, supports multiple data sources such as Prometheus, VictoriaMetrics, M3DB, ElasticSearch, and Jaeger, and is compatible with importing Grafana dashboards. **It seamlessly integrates with the cloud-native ecosystem**.
- **High Performance and High Availability**
- Due to the multi-data-source management engine of Nightingale and its excellent architecture design, and utilizing a high-performance time-series database, it can handle data collection, storage, and alert analysis scenarios with billions of time-series data, saving a lot of costs.
- Nightingale components can be horizontally scaled with no single point of failure. It has been deployed in thousands of enterprises and tested in harsh production practices. Many leading Internet companies have used Nightingale for cluster machines with hundreds of nodes, processing billions of time-series data.
- **Flexible Extension and Centralized Management**
- Nightingale can be deployed on a 1-core 1G cloud host, deployed in a cluster of hundreds of machines, or run in Kubernetes. Time-series databases, alert engines, and other components can also be decentralized to various data centers and regions, balancing edge deployment with centralized management. **It solves the problem of data fragmentation and lack of unified views**.
#### If you are using Prometheus and have one or more of the following requirement scenarios, it is recommended that you upgrade to Nightingale:
- Multiple systems such as Prometheus, Alertmanager, Grafana, etc. are fragmented and lack a unified view and cannot be used out of the box;
- The way to manage Prometheus and Alertmanager by modifying configuration files has a big learning curve and is difficult to collaborate;
- Too much data to scale-up your Prometheus cluster;
- Multiple Prometheus clusters running in production environments, which faced high management and usage costs;
#### If you are using Zabbix and have the following scenarios, it is recommended that you upgrade to Nightingale:
- Monitoring too much data and wanting a better scalable solution;
- A high learning curve and a desire for better efficiency of collaborative use in a multi-person, multi-team model;
- Microservice and cloud-native architectures with variable monitoring data lifecycles and high monitoring data dimension bases, which are not easily adaptable to the Zabbix data model;
#### If you are using [open-falcon](https://github.com/open-falcon/falcon-plus), we recommend you to upgrade to Nightingale
- For more information about open-falcon and Nightingale, please refer to read [Ten features and trends of cloud-native monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## Getting Started
[English Doc](https://n9e.github.io/) | [中文文档](http://n9e.flashcat.cloud/)
## Screenshots
https://user-images.githubusercontent.com/792850/216888712-2565fcea-9df5-47bd-a49e-d60af9bd76e8.mp4
## Architecture
<img src="doc/img/arch-product.png" width="600">
Nightingale monitoring can receive monitoring data reported by various collectors (such as [Categraf](https://github.com/flashcatcloud/categraf) , telegraf, grafana-agent, Prometheus, etc.) and write them to various popular time-series databases (such as Prometheus, M3DB, VictoriaMetrics, Thanos, TDEngine, etc.). It provides configuration capabilities for alert rules, silence rules, and subscription rules, as well as the ability to view monitoring data. It also provides automatic alarm self-healing mechanisms (such as automatically calling back to a webhook address or executing a script after an alarm is triggered), and the ability to store and manage historical alarm events and view them in groups.
If the performance of a standalone time-series database (such as Prometheus) has bottlenecks or poor disaster recovery, we recommend using [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics). The VictoriaMetrics architecture is relatively simple, has excellent performance, and is easy to deploy and maintain. The architecture diagram is as shown above. For more detailed documentation on VictoriaMetrics, please refer to its [official website](https://victoriametrics.com/).
**We welcome you to participate in the Nightingale open-source project and community in various ways, including but not limited to**
- Adding and improving documentation => [n9e.github.io](https://n9e.github.io/)
- Sharing your best practices and experience in using Nightingale monitoring => [Article sharing]((https://n9e.github.io/docs/prologue/share/))
- Submitting product suggestions => [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
- Submitting code to make Nightingale monitoring faster, more stable, and easier to use => [github pull request](https://github.com/didi/nightingale/pulls)
**Respecting, recognizing, and recording the work of every contributor** is the first guiding principle of the Nightingale open-source community. We advocate effective questioning, which not only respects the developer's time but also contributes to the accumulation of knowledge in the entire community
- Before asking a question, please first refer to the [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
- We use [GitHub Discussions](https://github.com/ccfos/nightingale/discussions) as the communication forum. You can search and ask questions here.
- We also recommend that you join ours [Slack channel](https://n9e-talk.slack.com/) to exchange experiences with other Nightingale users.
## Who is using Nightingale
You can register your usage and share your experience by posting on **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)**.
## Stargazers over time
[![Stargazers over time](https://starchart.cc/ccfos/nightingale.svg)](https://starchart.cc/ccfos/nightingale)
## Contributors
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img src="https://contrib.rocks/image?repo=ccfos/nightingale" />
</a>
## License
[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)

74
README_zh.md Normal file
View File

@@ -0,0 +1,74 @@
<p align="center">
<a href="https://github.com/ccfos/nightingale">
<img src="doc/img/Nightingale_L_V.png" alt="nightingale - cloud native monitoring" width="240" /></a>
</p>
<p align="center">
<a href="https://flashcat.cloud/docs/">
<img alt="Docs" src="https://img.shields.io/badge/docs-get%20started-brightgreen"/></a>
<a href="https://hub.docker.com/u/flashcatcloud">
<img alt="Docker pulls" src="https://img.shields.io/docker/pulls/flashcatcloud/nightingale"/></a>
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img alt="GitHub contributors" src="https://img.shields.io/github/contributors-anon/ccfos/nightingale"/></a>
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/ccfos/nightingale">
<br/><img alt="GitHub Repo issues" src="https://img.shields.io/github/issues/ccfos/nightingale">
<img alt="GitHub Repo issues closed" src="https://img.shields.io/github/issues-closed/ccfos/nightingale">
<img alt="GitHub forks" src="https://img.shields.io/github/forks/ccfos/nightingale">
<img alt="GitHub latest release" src="https://img.shields.io/github/v/release/ccfos/nightingale"/>
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-blue"/>
<a href="https://n9e-talk.slack.com/">
<img alt="GitHub contributors" src="https://img.shields.io/badge/join%20slack-%23n9e-brightgreen.svg"/></a>
</p>
<p align="center">
告警管理专家,一体化的开源可观测平台
</p>
[English](./README.md) | [中文](./README_zh.md)
夜莺Nightingale是中国计算机学会托管的开源云原生可观测工具最早由滴滴于 2020 年孵化并开源,并于 2022 年正式捐赠予中国计算机学会。夜莺采用 All-in-One 的设计理念,集数据采集、可视化、监控告警、数据分析于一体,与云原生生态紧密集成,融入了顶级互联网公司可观测性最佳实践,沉淀了众多社区专家经验,开箱即用。
## 资料
- 文档:[flashcat.cloud/docs](https://flashcat.cloud/docs/)
- 提问:[answer.flashcat.cloud](https://answer.flashcat.cloud/)
- 报Bug[github.com/ccfos/nightingale/issues](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&projects=&template=bug_report.yml)
## 功能和特点
- 统一接入各种时序库:支持对接 Prometheus、VictoriaMetrics、Thanos、Mimir、M3DB 等多种时序库,实现统一告警管理
- 专业告警能力:内置支持多种告警规则,可以扩展支持所有通知媒介,支持告警屏蔽、告警抑制、告警自愈、告警事件管理
- 高性能可视化引擎支持多种图表样式内置众多Dashboard模版也可导入Grafana模版开箱即用开源协议商业友好
- 无缝搭配 [Flashduty](https://flashcat.cloud/product/flashcat-duty/)实现告警聚合收敛、认领、升级、排班、IM集成确保告警处理不遗漏减少打扰更好协同
- 支持所有常见采集器:支持 [Categraf](https://flashcat.cloud/product/categraf)、telegraf、grafana-agent、datadog-agent、各种 exporter 作为采集器,没有什么数据是不能监控的
- 一体化观测平台:从 v6 版本开始,支持接入 ElasticSearch、Jaeger 数据源,实现日志、链路、指标多维度的统一可观测
## 产品演示
![演示](doc/img/n9e-screenshot-gif-v6.gif)
## 部署架构
![架构](doc/img/n9e-arch-latest.png)
## 加入交流群
欢迎加入 QQ 交流群群号479290895QQ 群适合群友互助,夜莺研发人员通常不在群里。如果要报 bug 请到[这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&projects=&template=bug_report.yml),提问到[这里](https://answer.flashcat.cloud/)。
## Stargazers over time
[![Stargazers over time](https://api.star-history.com/svg?repos=ccfos/nightingale&type=Date)](https://star-history.com/#ccfos/nightingale&Date)
## Contributors
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img src="https://contrib.rocks/image?repo=ccfos/nightingale" />
</a>
## 社区治理
[夜莺开源项目和社区治理架构(草案)](./doc/community-governance.md)
## License
[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)

View File

@@ -2,11 +2,10 @@ package aconf
import (
"path"
"github.com/toolkits/pkg/runner"
)
type Alert struct {
Disable bool
EngineDelay int64
Heartbeat HeartbeatConfig
Alerting Alerting
@@ -54,9 +53,9 @@ type Ibex struct {
Timeout int64
}
func (a *Alert) PreCheck() {
func (a *Alert) PreCheck(configDir string) {
if a.Alerting.TemplatesDir == "" {
a.Alerting.TemplatesDir = path.Join(runner.Cwd, "etc", "template")
a.Alerting.TemplatesDir = path.Join(configDir, "template")
}
if a.Alerting.NotifyConcurrency == 0 {
@@ -70,4 +69,8 @@ func (a *Alert) PreCheck() {
if a.Heartbeat.EngineName == "" {
a.Heartbeat.EngineName = "default"
}
if a.EngineDelay == 0 {
a.EngineDelay = 30
}
}

View File

@@ -15,6 +15,7 @@ import (
"github.com/ccfos/nightingale/v6/alert/router"
"github.com/ccfos/nightingale/v6/alert/sender"
"github.com/ccfos/nightingale/v6/conf"
"github.com/ccfos/nightingale/v6/dumper"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
@@ -23,6 +24,7 @@ import (
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/pconf"
"github.com/ccfos/nightingale/v6/pushgw/writer"
"github.com/ccfos/nightingale/v6/tdengine"
)
func Initialize(configDir string, cryptoKey string) (func(), error) {
@@ -41,22 +43,27 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
syncStats := memsto.NewSyncStats()
alertStats := astats.NewSyncStats()
configCache := memsto.NewConfigCache(ctx, syncStats, nil, "")
targetCache := memsto.NewTargetCache(ctx, syncStats, nil)
busiGroupCache := memsto.NewBusiGroupCache(ctx, syncStats)
alertMuteCache := memsto.NewAlertMuteCache(ctx, syncStats)
alertRuleCache := memsto.NewAlertRuleCache(ctx, syncStats)
notifyConfigCache := memsto.NewNotifyConfigCache(ctx)
notifyConfigCache := memsto.NewNotifyConfigCache(ctx, configCache)
dsCache := memsto.NewDatasourceCache(ctx, syncStats)
userCache := memsto.NewUserCache(ctx, syncStats)
userGroupCache := memsto.NewUserGroupCache(ctx, syncStats)
promClients := prom.NewPromClient(ctx, config.Alert.Heartbeat)
promClients := prom.NewPromClient(ctx)
tdengineClients := tdengine.NewTdengineClient(ctx, config.Alert.Heartbeat)
externalProcessors := process.NewExternalProcessors()
Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache, alertRuleCache, notifyConfigCache, dsCache, ctx, promClients)
Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache, alertRuleCache, notifyConfigCache, dsCache, ctx, promClients, tdengineClients, userCache, userGroupCache)
r := httpx.GinEngine(config.Global.RunMode, config.HTTP)
rt := router.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
rt.Config(r)
dumper.ConfigRouter(r)
httpClean := httpx.Init(config.HTTP, r)
@@ -67,9 +74,8 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
}
func Start(alertc aconf.Alert, pushgwc pconf.Pushgw, syncStats *memsto.Stats, alertStats *astats.Stats, externalProcessors *process.ExternalProcessorsType, targetCache *memsto.TargetCacheType, busiGroupCache *memsto.BusiGroupCacheType,
alertMuteCache *memsto.AlertMuteCacheType, alertRuleCache *memsto.AlertRuleCacheType, notifyConfigCache *memsto.NotifyConfigCacheType, datasourceCache *memsto.DatasourceCacheType, ctx *ctx.Context, promClients *prom.PromClientMap) {
userCache := memsto.NewUserCache(ctx, syncStats)
userGroupCache := memsto.NewUserGroupCache(ctx, syncStats)
alertMuteCache *memsto.AlertMuteCacheType, alertRuleCache *memsto.AlertRuleCacheType, notifyConfigCache *memsto.NotifyConfigCacheType, datasourceCache *memsto.DatasourceCacheType, ctx *ctx.Context,
promClients *prom.PromClientMap, tdendgineClients *tdengine.TdengineClientMap, userCache *memsto.UserCacheType, userGroupCache *memsto.UserGroupCacheType) {
alertSubscribeCache := memsto.NewAlertSubscribeCache(ctx, syncStats)
recordingRuleCache := memsto.NewRecordingRuleCache(ctx, syncStats)
@@ -80,14 +86,14 @@ func Start(alertc aconf.Alert, pushgwc pconf.Pushgw, syncStats *memsto.Stats, al
writers := writer.NewWriters(pushgwc)
record.NewScheduler(alertc, recordingRuleCache, promClients, writers, alertStats)
eval.NewScheduler(alertc, externalProcessors, alertRuleCache, targetCache, busiGroupCache, alertMuteCache, datasourceCache, promClients, naming, ctx, alertStats)
eval.NewScheduler(alertc, externalProcessors, alertRuleCache, targetCache, busiGroupCache, alertMuteCache, datasourceCache, promClients, tdendgineClients, naming, ctx, alertStats)
dp := dispatch.NewDispatch(alertRuleCache, userCache, userGroupCache, alertSubscribeCache, targetCache, notifyConfigCache, alertc.Alerting, ctx)
dp := dispatch.NewDispatch(alertRuleCache, userCache, userGroupCache, alertSubscribeCache, targetCache, notifyConfigCache, alertc.Alerting, ctx, alertStats)
consumer := dispatch.NewConsumer(alertc.Alerting, ctx, dp)
go dp.ReloadTpls()
go consumer.LoopConsume()
go queue.ReportQueueSize(alertStats)
go sender.StartEmailSender(notifyConfigCache.GetSMTP()) // todo
go sender.InitEmailSender(notifyConfigCache.GetSMTP())
}

View File

@@ -10,22 +10,52 @@ const (
)
type Stats struct {
CounterSampleTotal *prometheus.CounterVec
CounterAlertsTotal *prometheus.CounterVec
GaugeAlertQueueSize prometheus.Gauge
GaugeSampleQueueSize *prometheus.GaugeVec
RequestDuration *prometheus.HistogramVec
ForwardDuration *prometheus.HistogramVec
AlertNotifyTotal *prometheus.CounterVec
AlertNotifyErrorTotal *prometheus.CounterVec
CounterAlertsTotal *prometheus.CounterVec
GaugeAlertQueueSize prometheus.Gauge
CounterRuleEval *prometheus.CounterVec
CounterQueryDataErrorTotal *prometheus.CounterVec
CounterRecordEval *prometheus.CounterVec
CounterRecordEvalErrorTotal *prometheus.CounterVec
CounterMuteTotal *prometheus.CounterVec
}
func NewSyncStats() *Stats {
// 从各个接收接口接收到的监控数据总量
CounterSampleTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
CounterRuleEval := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "samples_received_total",
Help: "Total number samples received.",
}, []string{"cluster", "channel"})
Name: "rule_eval_total",
Help: "Number of rule eval.",
}, []string{})
CounterRecordEval := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "record_eval_total",
Help: "Number of record eval.",
}, []string{})
CounterRecordEvalErrorTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "record_eval_error_total",
Help: "Number of record eval error.",
}, []string{})
AlertNotifyTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "alert_notify_total",
Help: "Number of send msg.",
}, []string{"channel"})
AlertNotifyErrorTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "alert_notify_error_total",
Help: "Number of send msg.",
}, []string{"channel"})
// 产生的告警总量
CounterAlertsTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
@@ -33,7 +63,7 @@ func NewSyncStats() *Stats {
Subsystem: subsystem,
Name: "alerts_total",
Help: "Total number alert events.",
}, []string{"cluster"})
}, []string{"cluster", "type", "busi_group"})
// 内存中的告警事件队列的长度
GaugeAlertQueueSize := prometheus.NewGauge(prometheus.GaugeOpts{
@@ -43,51 +73,41 @@ func NewSyncStats() *Stats {
Help: "The size of alert queue.",
})
// 数据转发队列,各个队列的长度
GaugeSampleQueueSize := prometheus.NewGaugeVec(prometheus.GaugeOpts{
CounterQueryDataErrorTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "sample_queue_size",
Help: "The size of sample queue.",
}, []string{"cluster", "channel_number"})
Name: "query_data_error_total",
Help: "Number of query data error.",
}, []string{"datasource"})
// 一些重要的请求,比如接收数据的请求,应该统计一下延迟情况
RequestDuration := prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: namespace,
Subsystem: subsystem,
Buckets: []float64{.01, .1, 1},
Name: "http_request_duration_seconds",
Help: "HTTP request latencies in seconds.",
}, []string{"code", "path", "method"},
)
// 发往后端TSDB延迟如何
ForwardDuration := prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: namespace,
Subsystem: subsystem,
Buckets: []float64{.1, 1, 10},
Name: "forward_duration_seconds",
Help: "Forward samples to TSDB. latencies in seconds.",
}, []string{"cluster", "channel_number"},
)
CounterMuteTotal := prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: namespace,
Subsystem: subsystem,
Name: "mute_total",
Help: "Number of mute.",
}, []string{"group"})
prometheus.MustRegister(
CounterSampleTotal,
CounterAlertsTotal,
GaugeAlertQueueSize,
GaugeSampleQueueSize,
RequestDuration,
ForwardDuration,
AlertNotifyTotal,
AlertNotifyErrorTotal,
CounterRuleEval,
CounterQueryDataErrorTotal,
CounterRecordEval,
CounterRecordEvalErrorTotal,
CounterMuteTotal,
)
return &Stats{
CounterSampleTotal: CounterSampleTotal,
CounterAlertsTotal: CounterAlertsTotal,
GaugeAlertQueueSize: GaugeAlertQueueSize,
GaugeSampleQueueSize: GaugeSampleQueueSize,
RequestDuration: RequestDuration,
ForwardDuration: ForwardDuration,
CounterAlertsTotal: CounterAlertsTotal,
GaugeAlertQueueSize: GaugeAlertQueueSize,
AlertNotifyTotal: AlertNotifyTotal,
AlertNotifyErrorTotal: AlertNotifyErrorTotal,
CounterRuleEval: CounterRuleEval,
CounterQueryDataErrorTotal: CounterQueryDataErrorTotal,
CounterRecordEval: CounterRecordEval,
CounterRecordEvalErrorTotal: CounterRecordEvalErrorTotal,
CounterMuteTotal: CounterMuteTotal,
}
}

View File

@@ -15,6 +15,7 @@ type AnomalyPoint struct {
Value float64 `json:"value"`
Severity int `json:"severity"`
Triggered bool `json:"triggered"`
Query string `json:"query"`
}
func NewAnomalyPoint(key string, labels map[string]string, ts int64, value float64, severity int) AnomalyPoint {

View File

@@ -22,6 +22,14 @@ func MatchTags(eventTagsMap map[string]string, itags []models.TagFilter) bool {
}
return true
}
func MatchGroupsName(groupName string, groupFilter []models.TagFilter) bool {
for _, filter := range groupFilter {
if !matchTag(groupName, filter) {
return false
}
}
return true
}
func matchTag(value string, filter models.TagFilter) bool {
switch filter.Func {

View File

@@ -61,6 +61,13 @@ func (e *Consumer) consume(events []interface{}, sema *semaphore.Semaphore) {
func (e *Consumer) consumeOne(event *models.AlertCurEvent) {
LogEvent(event, "consume")
eventType := "alert"
if event.IsRecovered {
eventType = "recovery"
}
e.dispatch.astats.CounterAlertsTotal.WithLabelValues(event.Cluster, eventType, event.GroupName).Inc()
if err := event.ParseRule("rule_name"); err != nil {
event.RuleName = fmt.Sprintf("failed to parse rule name: %v", err)
}
@@ -83,11 +90,16 @@ func (e *Consumer) consumeOne(event *models.AlertCurEvent) {
}
func (e *Consumer) persist(event *models.AlertCurEvent) {
if event.Status != 0 {
return
}
if !e.ctx.IsCenter {
event.DB2FE()
err := poster.PostByUrls(e.ctx, "/v1/n9e/event-persist", event)
var err error
event.Id, err = poster.PostByUrlsWithResp[int64](e.ctx, "/v1/n9e/event-persist", event)
if err != nil {
logger.Errorf("event%+v persist err:%v", event, err)
logger.Errorf("event:%+v persist err:%v", event, err)
}
return
}

View File

@@ -9,6 +9,7 @@ import (
"time"
"github.com/ccfos/nightingale/v6/alert/aconf"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/alert/common"
"github.com/ccfos/nightingale/v6/alert/sender"
"github.com/ccfos/nightingale/v6/memsto"
@@ -28,11 +29,13 @@ type Dispatch struct {
alerting aconf.Alerting
senders map[string]sender.Sender
tpls map[string]*template.Template
ExtraSenders map[string]sender.Sender
Senders map[string]sender.Sender
tpls map[string]*template.Template
ExtraSenders map[string]sender.Sender
BeforeSenderHook func(*models.AlertCurEvent) bool
ctx *ctx.Context
ctx *ctx.Context
astats *astats.Stats
RwLock sync.RWMutex
}
@@ -40,7 +43,7 @@ type Dispatch struct {
// 创建一个 Notify 实例
func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.UserCacheType, userGroupCache *memsto.UserGroupCacheType,
alertSubscribeCache *memsto.AlertSubscribeCacheType, targetCache *memsto.TargetCacheType, notifyConfigCache *memsto.NotifyConfigCacheType,
alerting aconf.Alerting, ctx *ctx.Context) *Dispatch {
alerting aconf.Alerting, ctx *ctx.Context, astats *astats.Stats) *Dispatch {
notify := &Dispatch{
alertRuleCache: alertRuleCache,
userCache: userCache,
@@ -51,11 +54,13 @@ func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.Us
alerting: alerting,
senders: make(map[string]sender.Sender),
tpls: make(map[string]*template.Template),
ExtraSenders: make(map[string]sender.Sender),
Senders: make(map[string]sender.Sender),
tpls: make(map[string]*template.Template),
ExtraSenders: make(map[string]sender.Sender),
BeforeSenderHook: func(*models.AlertCurEvent) bool { return true },
ctx: ctx,
ctx: ctx,
astats: astats,
}
return notify
}
@@ -63,7 +68,7 @@ func NewDispatch(alertRuleCache *memsto.AlertRuleCacheType, userCache *memsto.Us
func (e *Dispatch) ReloadTpls() error {
err := e.relaodTpls()
if err != nil {
logger.Error("failed to reload tpls: %v", err)
logger.Errorf("failed to reload tpls: %v", err)
}
duration := time.Duration(9000) * time.Millisecond
@@ -83,23 +88,24 @@ func (e *Dispatch) relaodTpls() error {
smtp := e.notifyConfigCache.GetSMTP()
senders := map[string]sender.Sender{
models.Email: sender.NewSender(models.Email, tmpTpls, smtp),
models.Dingtalk: sender.NewSender(models.Dingtalk, tmpTpls, smtp),
models.Wecom: sender.NewSender(models.Wecom, tmpTpls, smtp),
models.Feishu: sender.NewSender(models.Feishu, tmpTpls, smtp),
models.Mm: sender.NewSender(models.Mm, tmpTpls, smtp),
models.Telegram: sender.NewSender(models.Telegram, tmpTpls, smtp),
models.Email: sender.NewSender(models.Email, tmpTpls, smtp),
models.Dingtalk: sender.NewSender(models.Dingtalk, tmpTpls),
models.Wecom: sender.NewSender(models.Wecom, tmpTpls),
models.Feishu: sender.NewSender(models.Feishu, tmpTpls),
models.Mm: sender.NewSender(models.Mm, tmpTpls),
models.Telegram: sender.NewSender(models.Telegram, tmpTpls),
models.FeishuCard: sender.NewSender(models.FeishuCard, tmpTpls),
}
e.RwLock.RLock()
for channel, sender := range e.ExtraSenders {
senders[channel] = sender
for channelName, extraSender := range e.ExtraSenders {
senders[channelName] = extraSender
}
e.RwLock.RUnlock()
e.RwLock.Lock()
e.tpls = tmpTpls
e.senders = senders
e.Senders = senders
e.RwLock.Unlock()
return nil
}
@@ -140,7 +146,7 @@ func (e *Dispatch) HandleEventNotify(event *models.AlertCurEvent, isSubscribe bo
}
// 处理事件发送,这里用一个goroutine处理一个event的所有发送事件
go e.Send(rule, event, notifyTarget, isSubscribe)
go e.Send(rule, event, notifyTarget)
// 如果是不是订阅规则出现的event, 则需要处理订阅规则的event
if !isSubscribe {
@@ -167,45 +173,73 @@ func (e *Dispatch) handleSubs(event *models.AlertCurEvent) {
// handleSub 处理订阅规则的event,注意这里event要使用值传递,因为后面会修改event的状态
func (e *Dispatch) handleSub(sub *models.AlertSubscribe, event models.AlertCurEvent) {
if sub.IsDisabled() || !sub.MatchCluster(event.DatasourceId) {
if sub.IsDisabled() {
return
}
if !sub.MatchCluster(event.DatasourceId) {
return
}
if !sub.MatchProd(event.RuleProd) {
return
}
if !common.MatchTags(event.TagsMap, sub.ITags) {
return
}
// event BusiGroups filter
if !common.MatchGroupsName(event.GroupName, sub.IBusiGroups) {
return
}
if sub.ForDuration > (event.TriggerTime - event.FirstTriggerTime) {
return
}
if len(sub.SeveritiesJson) != 0 {
match := false
for _, s := range sub.SeveritiesJson {
if s == event.Severity || s == 0 {
match = true
break
}
}
if !match {
return
}
}
sub.ModifyEvent(&event)
LogEvent(&event, "subscribe")
event.SubRuleId = sub.Id
e.HandleEventNotify(&event, true)
}
func (e *Dispatch) Send(rule *models.AlertRule, event *models.AlertCurEvent, notifyTarget *NotifyTarget, isSubscribe bool) {
for channel, uids := range notifyTarget.ToChannelUserMap() {
ctx := sender.BuildMessageContext(rule, event, uids, e.userCache)
e.RwLock.RLock()
s := e.senders[channel]
e.RwLock.RUnlock()
if s == nil {
logger.Debugf("no sender for channel: %s", channel)
continue
func (e *Dispatch) Send(rule *models.AlertRule, event *models.AlertCurEvent, notifyTarget *NotifyTarget) {
needSend := e.BeforeSenderHook(event)
if needSend {
for channel, uids := range notifyTarget.ToChannelUserMap() {
msgCtx := sender.BuildMessageContext(rule, []*models.AlertCurEvent{event}, uids, e.userCache, e.astats)
e.RwLock.RLock()
s := e.Senders[channel]
e.RwLock.RUnlock()
if s == nil {
logger.Debugf("no sender for channel: %s", channel)
continue
}
s.Send(msgCtx)
}
logger.Debugf("send event: %s, channel: %s", event.Hash, channel)
for i := 0; i < len(ctx.Users); i++ {
logger.Debug("send event to user: ", ctx.Users[i])
}
s.Send(ctx)
}
// handle event callbacks
sender.SendCallbacks(e.ctx, notifyTarget.ToCallbackList(), event, e.targetCache, e.userCache, e.notifyConfigCache.GetIbex())
sender.SendCallbacks(e.ctx, notifyTarget.ToCallbackList(), event, e.targetCache, e.userCache, e.notifyConfigCache.GetIbex(), e.astats)
// handle global webhooks
sender.SendWebhooks(notifyTarget.ToWebhookList(), event)
sender.SendWebhooks(notifyTarget.ToWebhookList(), event, e.astats)
// handle plugin call
go sender.MayPluginNotify(e.genNoticeBytes(event), e.notifyConfigCache.GetNotifyScript())
go sender.MayPluginNotify(e.genNoticeBytes(event), e.notifyConfigCache.GetNotifyScript(), e.astats)
}
type Notice struct {

View File

@@ -12,6 +12,8 @@ import (
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/tdengine"
"github.com/toolkits/pkg/logger"
)
@@ -29,7 +31,8 @@ type Scheduler struct {
alertMuteCache *memsto.AlertMuteCacheType
datasourceCache *memsto.DatasourceCacheType
promClients *prom.PromClientMap
promClients *prom.PromClientMap
tdengineClients *tdengine.TdengineClientMap
naming *naming.Naming
@@ -38,8 +41,8 @@ type Scheduler struct {
}
func NewScheduler(aconf aconf.Alert, externalProcessors *process.ExternalProcessorsType, arc *memsto.AlertRuleCacheType, targetCache *memsto.TargetCacheType,
busiGroupCache *memsto.BusiGroupCacheType, alertMuteCache *memsto.AlertMuteCacheType, datasourceCache *memsto.DatasourceCacheType, promClients *prom.PromClientMap, naming *naming.Naming,
ctx *ctx.Context, stats *astats.Stats) *Scheduler {
busiGroupCache *memsto.BusiGroupCacheType, alertMuteCache *memsto.AlertMuteCacheType, datasourceCache *memsto.DatasourceCacheType,
promClients *prom.PromClientMap, tdengineClients *tdengine.TdengineClientMap, naming *naming.Naming, ctx *ctx.Context, stats *astats.Stats) *Scheduler {
scheduler := &Scheduler{
aconf: aconf,
alertRules: make(map[string]*AlertRuleWorker),
@@ -52,8 +55,9 @@ func NewScheduler(aconf aconf.Alert, externalProcessors *process.ExternalProcess
alertMuteCache: alertMuteCache,
datasourceCache: datasourceCache,
promClients: promClients,
naming: naming,
promClients: promClients,
tdengineClients: tdengineClients,
naming: naming,
ctx: ctx,
stats: stats,
@@ -85,8 +89,11 @@ func (s *Scheduler) syncAlertRules() {
if rule == nil {
continue
}
if rule.IsPrometheusRule() {
ruleType := rule.GetRuleType()
if rule.IsPrometheusRule() || rule.IsLokiRule() || rule.IsTdengineRule() {
datasourceIds := s.promClients.Hit(rule.DatasourceIdsJson)
datasourceIds = append(datasourceIds, s.tdengineClients.Hit(rule.DatasourceIdsJson)...)
for _, dsId := range datasourceIds {
if !naming.DatasourceHashRing.IsHit(dsId, fmt.Sprintf("%d", rule.Id), s.aconf.Heartbeat.Endpoint) {
continue
@@ -97,13 +104,18 @@ func (s *Scheduler) syncAlertRules() {
continue
}
if ds.PluginType != ruleType {
logger.Debugf("datasource %d category is %s not %s", dsId, ds.PluginType, ruleType)
continue
}
if ds.Status != "enabled" {
logger.Debugf("datasource %d status is %s", dsId, ds.Status)
continue
}
processor := process.NewProcessor(rule, dsId, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.promClients, s.ctx, s.stats)
processor := process.NewProcessor(rule, dsId, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.ctx, s.stats)
alertRule := NewAlertRuleWorker(rule, dsId, processor, s.promClients, s.ctx)
alertRule := NewAlertRuleWorker(rule, dsId, processor, s.promClients, s.tdengineClients, s.ctx)
alertRuleWorkers[alertRule.Hash()] = alertRule
}
} else if rule.IsHostRule() && s.ctx.IsCenter {
@@ -111,8 +123,8 @@ func (s *Scheduler) syncAlertRules() {
if !naming.DatasourceHashRing.IsHit(naming.HostDatasource, fmt.Sprintf("%d", rule.Id), s.aconf.Heartbeat.Endpoint) {
continue
}
processor := process.NewProcessor(rule, 0, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.promClients, s.ctx, s.stats)
alertRule := NewAlertRuleWorker(rule, 0, processor, s.promClients, s.ctx)
processor := process.NewProcessor(rule, 0, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.ctx, s.stats)
alertRule := NewAlertRuleWorker(rule, 0, processor, s.promClients, s.tdengineClients, s.ctx)
alertRuleWorkers[alertRule.Hash()] = alertRule
} else {
// 如果 rule 不是通过 prometheus engine 来告警的,则创建为 externalRule
@@ -128,7 +140,7 @@ func (s *Scheduler) syncAlertRules() {
logger.Debugf("datasource %d status is %s", dsId, ds.Status)
continue
}
processor := process.NewProcessor(rule, dsId, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.promClients, s.ctx, s.stats)
processor := process.NewProcessor(rule, dsId, s.alertRuleCache, s.targetCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.ctx, s.stats)
externalRuleWorkers[processor.Key()] = processor
}
}

View File

@@ -11,8 +11,11 @@ import (
"github.com/ccfos/nightingale/v6/alert/process"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/hash"
"github.com/ccfos/nightingale/v6/pkg/parser"
promsdk "github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/tdengine"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/str"
@@ -28,19 +31,21 @@ type AlertRuleWorker struct {
processor *process.Processor
promClients *prom.PromClientMap
ctx *ctx.Context
promClients *prom.PromClientMap
tdengineClients *tdengine.TdengineClientMap
ctx *ctx.Context
}
func NewAlertRuleWorker(rule *models.AlertRule, datasourceId int64, processor *process.Processor, promClients *prom.PromClientMap, ctx *ctx.Context) *AlertRuleWorker {
func NewAlertRuleWorker(rule *models.AlertRule, datasourceId int64, processor *process.Processor, promClients *prom.PromClientMap, tdengineClients *tdengine.TdengineClientMap, ctx *ctx.Context) *AlertRuleWorker {
arw := &AlertRuleWorker{
datasourceId: datasourceId,
quit: make(chan struct{}),
rule: rule,
processor: processor,
promClients: promClients,
ctx: ctx,
promClients: promClients,
tdengineClients: tdengineClients,
ctx: ctx,
}
return arw
@@ -69,14 +74,16 @@ func (arw *AlertRuleWorker) Start() {
if interval <= 0 {
interval = 10
}
ticker := time.NewTicker(time.Duration(interval) * time.Second)
go func() {
defer ticker.Stop()
for {
select {
case <-arw.quit:
return
default:
case <-ticker.C:
arw.Eval()
time.Sleep(time.Duration(interval) * time.Second)
}
}
}()
@@ -85,17 +92,23 @@ func (arw *AlertRuleWorker) Start() {
func (arw *AlertRuleWorker) Eval() {
cachedRule := arw.rule
if cachedRule == nil {
//logger.Errorf("rule_eval:%s rule not found", arw.Key())
// logger.Errorf("rule_eval:%s rule not found", arw.Key())
return
}
arw.processor.Stats.CounterRuleEval.WithLabelValues().Inc()
typ := cachedRule.GetRuleType()
var lst []common.AnomalyPoint
var anomalyPoints []common.AnomalyPoint
var recoverPoints []common.AnomalyPoint
switch typ {
case models.PROMETHEUS:
lst = arw.GetPromAnomalyPoint(cachedRule.RuleConfig)
anomalyPoints = arw.GetPromAnomalyPoint(cachedRule.RuleConfig)
case models.HOST:
lst = arw.GetHostAnomalyPoint(cachedRule.RuleConfig)
anomalyPoints = arw.GetHostAnomalyPoint(cachedRule.RuleConfig)
case models.TDENGINE:
anomalyPoints, recoverPoints = arw.GetTdengineAnomalyPoint(cachedRule, arw.processor.DatasourceId())
case models.LOKI:
anomalyPoints = arw.GetPromAnomalyPoint(cachedRule.RuleConfig)
default:
return
}
@@ -105,7 +118,11 @@ func (arw *AlertRuleWorker) Eval() {
return
}
arw.processor.Handle(lst, "inner", arw.inhibit)
arw.processor.Handle(anomalyPoints, "inner", arw.inhibit)
for _, point := range recoverPoints {
str := fmt.Sprintf("%v", point.Value)
arw.processor.RecoverSingle(process.Hash(cachedRule.Id, arw.processor.DatasourceId(), point), point.Timestamp, &str)
}
}
func (arw *AlertRuleWorker) Stop() {
@@ -151,11 +168,13 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) []common.Anom
value, warnings, err := readerClient.Query(context.Background(), promql, time.Now())
if err != nil {
logger.Errorf("rule_eval:%s promql:%s, error:%v", arw.Key(), promql, err)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
if len(warnings) > 0 {
logger.Errorf("rule_eval:%s promql:%s, warnings:%v", arw.Key(), promql, warnings)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
@@ -163,12 +182,118 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) []common.Anom
points := common.ConvertAnomalyPoints(value)
for i := 0; i < len(points); i++ {
points[i].Severity = query.Severity
points[i].Query = promql
}
lst = append(lst, points...)
}
return lst
}
func (arw *AlertRuleWorker) GetTdengineAnomalyPoint(rule *models.AlertRule, dsId int64) ([]common.AnomalyPoint, []common.AnomalyPoint) {
// 获取查询和规则判断条件
points := []common.AnomalyPoint{}
recoverPoints := []common.AnomalyPoint{}
ruleConfig := strings.TrimSpace(rule.RuleConfig)
if ruleConfig == "" {
logger.Warningf("rule_eval:%d promql is blank", rule.Id)
return points, recoverPoints
}
var ruleQuery models.RuleQuery
err := json.Unmarshal([]byte(ruleConfig), &ruleQuery)
if err != nil {
logger.Warningf("rule_eval:%d promql parse error:%s", rule.Id, err.Error())
return points, recoverPoints
}
arw.inhibit = ruleQuery.Inhibit
if len(ruleQuery.Queries) > 0 {
seriesStore := make(map[uint64]*models.DataResp)
seriesTagIndex := make(map[uint64][]uint64)
for _, query := range ruleQuery.Queries {
cli := arw.tdengineClients.GetCli(dsId)
if cli == nil {
logger.Warningf("rule_eval:%d tdengine client is nil", rule.Id)
continue
}
series, err := cli.Query(query)
if err != nil {
logger.Warningf("rule_eval rid:%d query data error: %v", rule.Id, err)
continue
}
// 此条日志很重要,是告警判断的现场值
logger.Debugf("rule_eval rid:%d req:%+v resp:%+v", rule.Id, query, series)
for i := 0; i < len(series); i++ {
serieHash := hash.GetHash(series[i].Metric, series[i].Ref)
tagHash := hash.GetTagHash(series[i].Metric)
seriesStore[serieHash] = series[i]
// 将曲线按照相同的 tag 分组
if _, exists := seriesTagIndex[tagHash]; !exists {
seriesTagIndex[tagHash] = make([]uint64, 0)
}
seriesTagIndex[tagHash] = append(seriesTagIndex[tagHash], serieHash)
}
}
// 判断
for _, trigger := range ruleQuery.Triggers {
for _, seriesHash := range seriesTagIndex {
m := make(map[string]float64)
var ts int64
var sample *models.DataResp
var value float64
for _, serieHash := range seriesHash {
series, exists := seriesStore[serieHash]
if !exists {
logger.Warningf("rule_eval rid:%d series:%+v not found", rule.Id, series)
continue
}
t, v, exists := series.Last()
if !exists {
logger.Warningf("rule_eval rid:%d series:%+v value not found", rule.Id, series)
continue
}
if !strings.Contains(trigger.Exp, "$"+series.Ref) {
// 表达式中不包含该变量
continue
}
m["$"+series.Ref] = v
m["$"+series.Ref+"."+series.MetricName()] = v
ts = int64(t)
sample = series
value = v
}
isTriggered := parser.Calc(trigger.Exp, m)
// 此条日志很重要,是告警判断的现场值
logger.Debugf("rule_eval rid:%d trigger:%+v exp:%s res:%v m:%v", rule.Id, trigger, trigger.Exp, isTriggered, m)
point := common.AnomalyPoint{
Key: sample.MetricName(),
Labels: sample.Metric,
Timestamp: int64(ts),
Value: value,
Severity: trigger.Severity,
Triggered: isTriggered,
}
if isTriggered {
points = append(points, point)
} else {
recoverPoints = append(recoverPoints, point)
}
}
}
}
return points, recoverPoints
}
func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) []common.AnomalyPoint {
var lst []common.AnomalyPoint
var severity int
@@ -198,6 +323,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) []common.Anom
targets, err := models.MissTargetGetsByFilter(arw.ctx, query, t)
if err != nil {
logger.Errorf("rule_eval:%s query:%v, error:%v", arw.Key(), query, err)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
for _, target := range targets {
@@ -219,6 +345,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) []common.Anom
targets, err := models.TargetGetsByFilter(arw.ctx, query, 0, 0)
if err != nil {
logger.Errorf("rule_eval:%s query:%v, error:%v", arw.Key(), query, err)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
var targetMap = make(map[string]*models.Target)
@@ -250,12 +377,14 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) []common.Anom
count, err := models.MissTargetCountByFilter(arw.ctx, query, t)
if err != nil {
logger.Errorf("rule_eval:%s query:%v, error:%v", arw.Key(), query, err)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
total, err := models.TargetCountByFilter(arw.ctx, query)
if err != nil {
logger.Errorf("rule_eval:%s query:%v, error:%v", arw.Key(), query, err)
arw.processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.datasourceId)).Inc()
continue
}
pct := float64(count) / float64(total) * 100

View File

@@ -13,7 +13,11 @@ import (
)
func IsMuted(rule *models.AlertRule, event *models.AlertCurEvent, targetCache *memsto.TargetCacheType, alertMuteCache *memsto.AlertMuteCacheType) bool {
if TimeNonEffectiveMuteStrategy(rule, event) {
if rule.Disabled == 1 {
return true
}
if TimeSpanMuteStrategy(rule, event) {
return true
}
@@ -32,12 +36,9 @@ func IsMuted(rule *models.AlertRule, event *models.AlertCurEvent, targetCache *m
return false
}
// TimeNonEffectiveMuteStrategy 根据规则配置的告警时间过滤,如果产生的告警不在规则配置的告警时间内,则不告警
func TimeNonEffectiveMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
if rule.Disabled == 1 {
return true
}
// TimeSpanMuteStrategy 根据规则配置的告警生效时间过滤,如果产生的告警不在规则配置的告警生效时间内,则不告警,即被mute
// 时间范围左闭右开默认范围00:00-24:00
func TimeSpanMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
tm := time.Unix(event.TriggerTime, 0)
triggerTime := tm.Format("15:04")
triggerWeek := strconv.Itoa(int(tm.Weekday()))
@@ -52,18 +53,33 @@ func TimeNonEffectiveMuteStrategy(rule *models.AlertRule, event *models.AlertCur
if !strings.Contains(enableDaysOfWeek[i], triggerWeek) {
continue
}
if enableStime[i] <= enableEtime[i] {
if triggerTime < enableStime[i] || triggerTime > enableEtime[i] {
continue
if enableStime[i] < enableEtime[i] {
if enableEtime[i] == "23:59" {
// 02:00-23:59这种情况做个特殊处理相当于左闭右闭区间了
if triggerTime < enableStime[i] {
// mute, 即没生效
continue
}
} else {
// 02:00-04:00 或者 02:00-24:00
if triggerTime < enableStime[i] || triggerTime >= enableEtime[i] {
// mute, 即没生效
continue
}
}
} else {
if triggerTime < enableStime[i] && triggerTime > enableEtime[i] {
} else if enableStime[i] > enableEtime[i] {
// 21:00-09:00
if triggerTime < enableStime[i] && triggerTime >= enableEtime[i] {
// mute, 即没生效
continue
}
}
// 到这里说明当前时刻在告警规则的某组生效时间范围内,直接返回 false
// 到这里说明当前时刻在告警规则的某组生效时间范围内,即没有 mute直接返回 false
return false
}
return true
}
@@ -156,13 +172,16 @@ func matchMute(event *models.AlertCurEvent, mute *models.AlertMute, clock ...int
for i := 0; i < len(mute.PeriodicMutesJson); i++ {
if strings.Contains(mute.PeriodicMutesJson[i].EnableDaysOfWeek, triggerWeek) {
if mute.PeriodicMutesJson[i].EnableStime <= mute.PeriodicMutesJson[i].EnableEtime {
if mute.PeriodicMutesJson[i].EnableStime == mute.PeriodicMutesJson[i].EnableEtime {
matchTime = true
break
} else if mute.PeriodicMutesJson[i].EnableStime < mute.PeriodicMutesJson[i].EnableEtime {
if triggerTime >= mute.PeriodicMutesJson[i].EnableStime && triggerTime < mute.PeriodicMutesJson[i].EnableEtime {
matchTime = true
break
}
} else {
if triggerTime < mute.PeriodicMutesJson[i].EnableStime || triggerTime >= mute.PeriodicMutesJson[i].EnableEtime {
if triggerTime >= mute.PeriodicMutesJson[i].EnableStime || triggerTime < mute.PeriodicMutesJson[i].EnableEtime {
matchTime = true
break
}
@@ -174,5 +193,21 @@ func matchMute(event *models.AlertCurEvent, mute *models.AlertMute, clock ...int
return false
}
var matchSeverity bool
if len(mute.SeveritiesJson) > 0 {
for _, s := range mute.SeveritiesJson {
if event.Severity == s || s == 0 {
matchSeverity = true
break
}
}
} else {
matchSeverity = true
}
if !matchSeverity {
return false
}
return common.MatchTags(event.TagsMap, mute.ITags)
}

View File

@@ -16,7 +16,7 @@ type DatasourceHashRingType struct {
}
// for alert_rule sharding
var HostDatasource int64 = 100000
var HostDatasource int64 = 99999999
var DatasourceHashRing = DatasourceHashRingType{Rings: make(map[int64]*consistent.Consistent)}
func NewConsistentHashRing(replicas int32, nodes []string) *consistent.Consistent {
@@ -53,10 +53,8 @@ func (chr *DatasourceHashRingType) GetNode(datasourceId int64, pk string) (strin
func (chr *DatasourceHashRingType) IsHit(datasourceId int64, pk string, currentNode string) bool {
node, err := chr.GetNode(datasourceId, pk)
if err != nil {
if errors.Is(err, consistent.ErrEmptyCircle) {
logger.Debugf("rule id:%s is not work, datasource id:%d is not assigned to active alert engine", pk, datasourceId)
} else {
logger.Debugf("rule id:%s is not work, datasource id:%d failed to get node from hashring:%v", pk, datasourceId, err)
if !errors.Is(err, consistent.ErrEmptyCircle) {
logger.Errorf("rule id:%s is not work, datasource id:%d failed to get node from hashring:%v", pk, datasourceId, err)
}
return false
}
@@ -68,3 +66,14 @@ func (chr *DatasourceHashRingType) Set(datasourceId int64, r *consistent.Consist
defer chr.Unlock()
chr.Rings[datasourceId] = r
}
func (chr *DatasourceHashRingType) Clear() {
chr.Lock()
defer chr.Unlock()
for id := range chr.Rings {
if id == HostDatasource {
continue
}
delete(chr.Rings, id)
}
}

View File

@@ -96,6 +96,16 @@ func (n *Naming) heartbeat() error {
}
}
if len(datasourceIds) == 0 {
DatasourceHashRing.Clear()
for dsId := range localss {
if dsId == HostDatasource {
continue
}
delete(localss, dsId)
}
}
for i := 0; i < len(datasourceIds); i++ {
servers, err := n.ActiveServers(datasourceIds[i])
if err != nil {
@@ -157,3 +167,13 @@ func (n *Naming) ActiveServers(datasourceId int64) ([]string, error) {
// 30秒内有心跳就认为是活的
return models.AlertingEngineGetsInstances(n.ctx, "datasource_id = ? and clock > ?", datasourceId, time.Now().Unix()-30)
}
func (n *Naming) ActiveServersByEngineName() ([]string, error) {
if !n.ctx.IsCenter {
lst, err := poster.GetByUrls[[]string](n.ctx, "/v1/n9e/servers-active?engine_name="+n.heartbeatConfig.EngineName)
return lst, err
}
// 30秒内有心跳就认为是活的
return models.AlertingEngineGetsInstances(n.ctx, "engine_cluster = ? and clock > ?", n.heartbeatConfig.EngineName, time.Now().Unix()-30)
}

View File

@@ -23,6 +23,8 @@ import (
"github.com/toolkits/pkg/str"
)
type EventMuteHookFunc func(event *models.AlertCurEvent) bool
type ExternalProcessorsType struct {
ExternalLock sync.RWMutex
Processors map[string]*Processor
@@ -43,6 +45,8 @@ func (e *ExternalProcessorsType) GetExternalAlertRule(datasourceId, id int64) (*
return processor, has
}
type HandleEventFunc func(event *models.AlertCurEvent)
type Processor struct {
datasourceId int64
@@ -65,7 +69,11 @@ type Processor struct {
promClients *prom.PromClientMap
ctx *ctx.Context
stats *astats.Stats
Stats *astats.Stats
HandleFireEventHook HandleEventFunc
HandleRecoverEventHook HandleEventFunc
EventMuteHook EventMuteHookFunc
}
func (p *Processor) Key() string {
@@ -86,7 +94,7 @@ func (p *Processor) Hash() string {
}
func NewProcessor(rule *models.AlertRule, datasourceId int64, atertRuleCache *memsto.AlertRuleCacheType, targetCache *memsto.TargetCacheType,
busiGroupCache *memsto.BusiGroupCacheType, alertMuteCache *memsto.AlertMuteCacheType, datasourceCache *memsto.DatasourceCacheType, promClients *prom.PromClientMap, ctx *ctx.Context,
busiGroupCache *memsto.BusiGroupCacheType, alertMuteCache *memsto.AlertMuteCacheType, datasourceCache *memsto.DatasourceCacheType, ctx *ctx.Context,
stats *astats.Stats) *Processor {
p := &Processor{
@@ -99,9 +107,12 @@ func NewProcessor(rule *models.AlertRule, datasourceId int64, atertRuleCache *me
atertRuleCache: atertRuleCache,
datasourceCache: datasourceCache,
promClients: promClients,
ctx: ctx,
stats: stats,
ctx: ctx,
Stats: stats,
HandleFireEventHook: func(event *models.AlertCurEvent) {},
HandleRecoverEventHook: func(event *models.AlertCurEvent) {},
EventMuteHook: func(event *models.AlertCurEvent) bool { return false },
}
p.mayHandleGroup()
@@ -130,9 +141,15 @@ func (p *Processor) Handle(anomalyPoints []common.AnomalyPoint, from string, inh
hash := event.Hash
alertingKeys[hash] = struct{}{}
if mute.IsMuted(cachedRule, event, p.TargetCache, p.alertMuteCache) {
p.Stats.CounterMuteTotal.WithLabelValues(event.GroupName).Inc()
logger.Debugf("rule_eval:%s event:%v is muted", p.Key(), event)
continue
}
if p.EventMuteHook(event) {
continue
}
tagHash := TagHash(anomalyPoint)
eventsMap[tagHash] = append(eventsMap[tagHash], event)
}
@@ -174,6 +191,8 @@ func (p *Processor) BuildEvent(anomalyPoint common.AnomalyPoint, from string, no
event.RuleConfig = p.rule.RuleConfig
event.RuleConfigJson = p.rule.RuleConfigJson
event.Severity = anomalyPoint.Severity
event.ExtraConfig = p.rule.ExtraConfigJSON
event.PromQl = anomalyPoint.Query
if from == "inner" {
event.LastEvalTime = now
@@ -227,6 +246,8 @@ func (p *Processor) RecoverSingle(hash string, now int64, value *string) {
cachedRule.UpdateEvent(event)
event.IsRecovered = true
event.LastEvalTime = now
p.HandleRecoverEventHook(event)
p.pushEventToQueue(event)
}
@@ -284,9 +305,12 @@ func (p *Processor) fireEvent(event *models.AlertCurEvent) {
if cachedRule == nil {
return
}
logger.Debugf("rule_eval:%s event:%+v fire", p.Key(), event)
if fired, has := p.fires.Get(event.Hash); has {
p.fires.UpdateLastEvalTime(event.Hash, event.LastEvalTime)
event.FirstTriggerTime = fired.FirstTriggerTime
p.HandleFireEventHook(event)
if cachedRule.NotifyRepeatStep == 0 {
logger.Debugf("rule_eval:%s event:%+v repeat is zero nothing to do", p.Key(), event)
@@ -296,11 +320,10 @@ func (p *Processor) fireEvent(event *models.AlertCurEvent) {
}
// 之前发送过告警了,这次是否要继续发送,要看是否过了通道静默时间
if event.LastEvalTime > fired.LastSentTime+int64(cachedRule.NotifyRepeatStep)*60 {
if event.LastEvalTime >= fired.LastSentTime+int64(cachedRule.NotifyRepeatStep)*60 {
if cachedRule.NotifyMaxNumber == 0 {
// 最大可以发送次数如果是0表示不想限制最大发送次数一直发即可
event.NotifyCurNumber = fired.NotifyCurNumber + 1
event.FirstTriggerTime = fired.FirstTriggerTime
p.pushEventToQueue(event)
} else {
// 有最大发送次数的限制,就要看已经发了几次了,是否达到了最大发送次数
@@ -309,7 +332,6 @@ func (p *Processor) fireEvent(event *models.AlertCurEvent) {
return
} else {
event.NotifyCurNumber = fired.NotifyCurNumber + 1
event.FirstTriggerTime = fired.FirstTriggerTime
p.pushEventToQueue(event)
}
}
@@ -317,6 +339,7 @@ func (p *Processor) fireEvent(event *models.AlertCurEvent) {
} else {
event.NotifyCurNumber = 1
event.FirstTriggerTime = event.TriggerTime
p.HandleFireEventHook(event)
p.pushEventToQueue(event)
}
}
@@ -327,7 +350,6 @@ func (p *Processor) pushEventToQueue(e *models.AlertCurEvent) {
p.fires.Set(e.Hash, e)
}
p.stats.CounterAlertsTotal.WithLabelValues(fmt.Sprintf("%d", e.DatasourceId)).Inc()
dispatch.LogEvent(e, "push_queue")
if !queue.EventQueue.PushFront(e) {
logger.Warningf("event_push_queue: queue is full, event:%+v", e)
@@ -405,7 +427,13 @@ func (p *Processor) mayHandleIdent() {
if target, exists := p.TargetCache.Get(ident); exists {
p.target = target.Ident
p.targetNote = target.Note
} else {
p.target = ident
p.targetNote = ""
}
} else {
p.target = ""
p.targetNote = ""
}
}
@@ -432,7 +460,7 @@ func labelMapToArr(m map[string]string) []string {
}
func Hash(ruleId, datasourceId int64, vector common.AnomalyPoint) string {
return str.MD5(fmt.Sprintf("%d_%s_%d_%d", ruleId, vector.Labels.String(), datasourceId, vector.Severity))
return str.MD5(fmt.Sprintf("%d_%s_%d_%d_%s", ruleId, vector.Labels.String(), datasourceId, vector.Severity, vector.Query))
}
func TagHash(vector common.AnomalyPoint) string {

View File

@@ -6,6 +6,7 @@ import (
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/writer"
@@ -18,18 +19,18 @@ type RecordRuleContext struct {
datasourceId int64
quit chan struct{}
rule *models.RecordingRule
// writers *writer.WritersType
rule *models.RecordingRule
promClients *prom.PromClientMap
stats *astats.Stats
}
func NewRecordRuleContext(rule *models.RecordingRule, datasourceId int64, promClients *prom.PromClientMap, writers *writer.WritersType) *RecordRuleContext {
func NewRecordRuleContext(rule *models.RecordingRule, datasourceId int64, promClients *prom.PromClientMap, writers *writer.WritersType, stats *astats.Stats) *RecordRuleContext {
return &RecordRuleContext{
datasourceId: datasourceId,
quit: make(chan struct{}),
rule: rule,
promClients: promClients,
//writers: writers,
stats: stats,
}
}
@@ -54,20 +55,23 @@ func (rrc *RecordRuleContext) Start() {
if interval <= 0 {
interval = 10
}
ticker := time.NewTicker(time.Duration(interval) * time.Second)
go func() {
defer ticker.Stop()
for {
select {
case <-rrc.quit:
return
default:
case <-ticker.C:
rrc.Eval()
time.Sleep(time.Duration(interval) * time.Second)
}
}
}()
}
func (rrc *RecordRuleContext) Eval() {
rrc.stats.CounterRecordEval.WithLabelValues().Inc()
promql := strings.TrimSpace(rrc.rule.PromQl)
if promql == "" {
logger.Errorf("eval:%s promql is blank", rrc.Key())
@@ -76,17 +80,20 @@ func (rrc *RecordRuleContext) Eval() {
if rrc.promClients.IsNil(rrc.datasourceId) {
logger.Errorf("eval:%s reader client is nil", rrc.Key())
rrc.stats.CounterRecordEvalErrorTotal.WithLabelValues().Inc()
return
}
value, warnings, err := rrc.promClients.GetCli(rrc.datasourceId).Query(context.Background(), promql, time.Now())
if err != nil {
logger.Errorf("eval:%s promql:%s, error:%v", rrc.Key(), promql, err)
rrc.stats.CounterRecordEvalErrorTotal.WithLabelValues().Inc()
return
}
if len(warnings) > 0 {
logger.Errorf("eval:%s promql:%s, warnings:%v", rrc.Key(), promql, warnings)
rrc.stats.CounterRecordEvalErrorTotal.WithLabelValues().Inc()
return
}

View File

@@ -15,7 +15,7 @@ const (
LabelName = "__name__"
)
func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []*prompb.TimeSeries) {
func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []prompb.TimeSeries) {
switch value.Type() {
case model.ValVector:
items, ok := value.(model.Vector)
@@ -31,7 +31,7 @@ func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []*
s.Timestamp = time.Unix(item.Timestamp.Unix(), 0).UnixNano() / 1e6
s.Value = float64(item.Value)
l := labelsToLabelsProto(item.Metric, rule)
lst = append(lst, &prompb.TimeSeries{
lst = append(lst, prompb.TimeSeries{
Labels: l,
Samples: []prompb.Sample{s},
})
@@ -63,7 +63,7 @@ func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []*
Value: float64(v.Value),
})
}
lst = append(lst, &prompb.TimeSeries{
lst = append(lst, prompb.TimeSeries{
Labels: l,
Samples: slst,
})
@@ -78,7 +78,7 @@ func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []*
return
}
lst = append(lst, &prompb.TimeSeries{
lst = append(lst, prompb.TimeSeries{
Labels: nil,
Samples: []prompb.Sample{{Value: float64(item.Value), Timestamp: time.Unix(item.Timestamp.Unix(), 0).UnixNano() / 1e6}},
})
@@ -89,9 +89,9 @@ func ConvertToTimeSeries(value model.Value, rule *models.RecordingRule) (lst []*
return
}
func labelsToLabelsProto(labels model.Metric, rule *models.RecordingRule) (result []*prompb.Label) {
func labelsToLabelsProto(labels model.Metric, rule *models.RecordingRule) (result []prompb.Label) {
//name
nameLs := &prompb.Label{
nameLs := prompb.Label{
Name: LabelName,
Value: rule.Name,
}
@@ -101,7 +101,7 @@ func labelsToLabelsProto(labels model.Metric, rule *models.RecordingRule) (resul
continue
}
if model.LabelNameRE.MatchString(string(k)) {
result = append(result, &prompb.Label{
result = append(result, prompb.Label{
Name: string(k),
Value: string(v),
})
@@ -111,7 +111,7 @@ func labelsToLabelsProto(labels model.Metric, rule *models.RecordingRule) (resul
for _, v := range rule.AppendTagsJSON {
index := strings.Index(v, "=")
if model.LabelNameRE.MatchString(v[:index]) {
result = append(result, &prompb.Label{
result = append(result, prompb.Label{
Name: v[:index],
Value: v[index+1:],
})

View File

@@ -72,7 +72,7 @@ func (s *Scheduler) syncRecordRules() {
continue
}
recordRule := NewRecordRuleContext(rule, dsId, s.promClients, s.writers)
recordRule := NewRecordRuleContext(rule, dsId, s.promClients, s.writers, s.stats)
recordRules[recordRule.Hash()] = recordRule
}
}

View File

@@ -39,13 +39,13 @@ func New(httpConfig httpx.Config, alert aconf.Alert, amc *memsto.AlertMuteCacheT
}
func (rt *Router) Config(r *gin.Engine) {
if !rt.HTTP.Alert.Enable {
if !rt.HTTP.APIForService.Enable {
return
}
service := r.Group("/v1/n9e")
if len(rt.HTTP.Alert.BasicAuth) > 0 {
service.Use(gin.BasicAuth(rt.HTTP.Alert.BasicAuth))
if len(rt.HTTP.APIForService.BasicAuth) > 0 {
service.Use(gin.BasicAuth(rt.HTTP.APIForService.BasicAuth))
}
service.POST("/event", rt.pushEventToQueue)
service.POST("/event-persist", rt.eventPersist)

View File

@@ -72,8 +72,6 @@ func (rt *Router) pushEventToQueue(c *gin.Context) {
event.NotifyChannels = strings.Join(event.NotifyChannelsJSON, " ")
event.NotifyGroups = strings.Join(event.NotifyGroupsJSON, " ")
rt.AlertStats.CounterAlertsTotal.WithLabelValues(event.Cluster).Inc()
dispatch.LogEvent(event, "http_push_queue")
if !queue.EventQueue.PushFront(event) {
msg := fmt.Sprintf("event:%+v push_queue err: queue is full", event)
@@ -87,7 +85,8 @@ func (rt *Router) eventPersist(c *gin.Context) {
var event *models.AlertCurEvent
ginx.BindJSON(c, &event)
event.FE2DB()
ginx.NewRender(c).Message(models.EventPersist(rt.Ctx, event))
err := models.EventPersist(rt.Ctx, event)
ginx.NewRender(c).Data(event.Id, err)
}
type eventForm struct {

View File

@@ -1,11 +1,13 @@
package sender
import (
"encoding/json"
"strconv"
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/aconf"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
@@ -15,7 +17,8 @@ import (
"github.com/toolkits/pkg/logger"
)
func SendCallbacks(ctx *ctx.Context, urls []string, event *models.AlertCurEvent, targetCache *memsto.TargetCacheType, userCache *memsto.UserCacheType, ibexConf aconf.Ibex) {
func SendCallbacks(ctx *ctx.Context, urls []string, event *models.AlertCurEvent, targetCache *memsto.TargetCacheType, userCache *memsto.UserCacheType,
ibexConf aconf.Ibex, stats *astats.Stats) {
for _, url := range urls {
if url == "" {
continue
@@ -32,9 +35,11 @@ func SendCallbacks(ctx *ctx.Context, urls []string, event *models.AlertCurEvent,
url = "http://" + url
}
stats.AlertNotifyTotal.WithLabelValues("rule_callback").Inc()
resp, code, err := poster.PostJSON(url, 5*time.Second, event, 3)
if err != nil {
logger.Errorf("event_callback_fail(rule_id=%d url=%s), resp: %s, err: %v, code: %d", event.RuleId, url, string(resp), err, code)
stats.AlertNotifyErrorTotal.WithLabelValues("rule_callback").Inc()
} else {
logger.Infof("event_callback_succ(rule_id=%d url=%s), resp: %s, code: %d", event.RuleId, url, string(resp), code)
}
@@ -50,6 +55,7 @@ type TaskForm struct {
Pause string `json:"pause"`
Script string `json:"script"`
Args string `json:"args"`
Stdin string `json:"stdin"`
Action string `json:"action"`
Creator string `json:"creator"`
Hosts []string `json:"hosts"`
@@ -90,7 +96,7 @@ func handleIbex(ctx *ctx.Context, url string, event *models.AlertCurEvent, targe
return
}
tpl, err := models.TaskTplGet(ctx, "id = ?", id)
tpl, err := models.TaskTplGetById(ctx, id)
if err != nil {
logger.Errorf("event_callback_ibex: failed to get tpl: %v", err)
return
@@ -114,6 +120,30 @@ func handleIbex(ctx *ctx.Context, url string, event *models.AlertCurEvent, targe
return
}
tagsMap := make(map[string]string)
for i := 0; i < len(event.TagsJSON); i++ {
pair := strings.TrimSpace(event.TagsJSON[i])
if pair == "" {
continue
}
arr := strings.Split(pair, "=")
if len(arr) != 2 {
continue
}
tagsMap[arr[0]] = arr[1]
}
// 附加告警级别 告警触发值标签
tagsMap["alert_severity"] = strconv.Itoa(event.Severity)
tagsMap["alert_trigger_value"] = event.TriggerValue
tags, err := json.Marshal(tagsMap)
if err != nil {
logger.Errorf("event_callback_ibex: failed to marshal tags to json: %v", tagsMap)
return
}
// call ibex
in := TaskForm{
Title: tpl.Title + " FH: " + host,
@@ -124,6 +154,7 @@ func handleIbex(ctx *ctx.Context, url string, event *models.AlertCurEvent, targe
Pause: tpl.Pause,
Script: tpl.Script,
Args: tpl.Args,
Stdin: string(tags),
Action: "start",
Creator: tpl.UpdateBy,
Hosts: []string{host},

View File

@@ -5,6 +5,7 @@ import (
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
@@ -32,7 +33,7 @@ type DingtalkSender struct {
}
func (ds *DingtalkSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
@@ -40,7 +41,7 @@ func (ds *DingtalkSender) Send(ctx MessageContext) {
if len(urls) == 0 {
return
}
message := BuildTplMessage(ds.tpl, ctx.Event)
message := BuildTplMessage(ds.tpl, ctx.Events)
for _, url := range urls {
var body dingtalk
@@ -49,7 +50,7 @@ func (ds *DingtalkSender) Send(ctx MessageContext) {
body = dingtalk{
Msgtype: "markdown",
Markdown: dingtalkMarkdown{
Title: ctx.Event.RuleName,
Title: ctx.Events[0].RuleName,
Text: message,
},
}
@@ -57,7 +58,7 @@ func (ds *DingtalkSender) Send(ctx MessageContext) {
body = dingtalk{
Msgtype: "markdown",
Markdown: dingtalkMarkdown{
Title: ctx.Event.RuleName,
Title: ctx.Events[0].RuleName,
Text: message + "\n" + strings.Join(ats, " "),
},
At: dingtalkAt{
@@ -66,7 +67,8 @@ func (ds *DingtalkSender) Send(ctx MessageContext) {
},
}
}
ds.doSend(url, body)
doSend(url, body, models.Dingtalk, ctx.Stats)
}
}
@@ -81,7 +83,7 @@ func (ds *DingtalkSender) extract(users []*models.User) ([]string, []string) {
}
if token, has := user.ExtractToken(models.Dingtalk); has {
url := token
if !strings.HasPrefix(token, "https://") {
if !strings.HasPrefix(token, "https://") && !strings.HasPrefix(token, "http://") {
url = "https://oapi.dingtalk.com/robot/send?access_token=" + token
}
urls = append(urls, url)
@@ -90,11 +92,14 @@ func (ds *DingtalkSender) extract(users []*models.User) ([]string, []string) {
return urls, ats
}
func (ds *DingtalkSender) doSend(url string, body dingtalk) {
func doSend(url string, body interface{}, channel string, stats *astats.Stats) {
stats.AlertNotifyTotal.WithLabelValues(channel).Inc()
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
if err != nil {
logger.Errorf("dingtalk_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
logger.Errorf("%s_sender: result=fail url=%s code=%d error=%v response=%s", channel, url, code, err, string(res))
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
} else {
logger.Infof("dingtalk_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
logger.Infof("%s_sender: result=succ url=%s code=%d response=%s", channel, url, code, string(res))
}
}

View File

@@ -2,6 +2,7 @@ package sender
import (
"crypto/tls"
"errors"
"html/template"
"time"
@@ -22,19 +23,21 @@ type EmailSender struct {
}
func (es *EmailSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
tos := extract(ctx.Users)
var subject string
if es.subjectTpl != nil {
subject = BuildTplMessage(es.subjectTpl, ctx.Event)
subject = BuildTplMessage(es.subjectTpl, []*models.AlertCurEvent{ctx.Events[0]})
} else {
subject = ctx.Event.RuleName
subject = ctx.Events[0].RuleName
}
content := BuildTplMessage(es.contentTpl, ctx.Event)
content := BuildTplMessage(es.contentTpl, ctx.Events)
es.WriteEmail(subject, content, tos)
ctx.Stats.AlertNotifyTotal.WithLabelValues(models.Email).Add(float64(len(tos)))
}
func extract(users []*models.User) []string {
@@ -47,7 +50,7 @@ func extract(users []*models.User) []string {
return tos
}
func (es *EmailSender) SendEmail(subject, content string, tos []string, stmp aconf.SMTPConfig) {
func SendEmail(subject, content string, tos []string, stmp aconf.SMTPConfig) error {
conf := stmp
d := gomail.NewDialer(conf.Host, conf.Port, conf.User, conf.Pass)
@@ -64,8 +67,9 @@ func (es *EmailSender) SendEmail(subject, content string, tos []string, stmp aco
err := d.DialAndSend(m)
if err != nil {
logger.Errorf("email_sender: failed to send: %v", err)
return errors.New("email_sender: failed to send: " + err.Error())
}
return nil
}
func (es *EmailSender) WriteEmail(subject, content string, tos []string) {
@@ -96,19 +100,21 @@ var mailQuit = make(chan struct{})
func RestartEmailSender(smtp aconf.SMTPConfig) {
close(mailQuit)
mailQuit = make(chan struct{})
StartEmailSender(smtp)
startEmailSender(smtp)
}
func StartEmailSender(smtp aconf.SMTPConfig) {
func InitEmailSender(smtp aconf.SMTPConfig) {
mailch = make(chan *gomail.Message, 100000)
startEmailSender(smtp)
}
func startEmailSender(smtp aconf.SMTPConfig) {
conf := smtp
if conf.Host == "" || conf.Port == 0 {
logger.Warning("SMTP configurations invalid")
return
}
logger.Infof("start email sender... %+v", conf)
logger.Infof("start email sender... conf.Host:%+v,conf.Port:%+v", conf.Host, conf.Port)
d := gomail.NewDialer(conf.Host, conf.Port, conf.User, conf.Pass)
if conf.InsecureSkipVerify {

View File

@@ -3,12 +3,8 @@ package sender
import (
"html/template"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
"github.com/toolkits/pkg/logger"
)
type feishuContent struct {
@@ -31,11 +27,11 @@ type FeishuSender struct {
}
func (fs *FeishuSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
urls, ats := fs.extract(ctx.Users)
message := BuildTplMessage(fs.tpl, ctx.Event)
message := BuildTplMessage(fs.tpl, ctx.Events)
for _, url := range urls {
body := feishu{
Msgtype: "text",
@@ -49,7 +45,7 @@ func (fs *FeishuSender) Send(ctx MessageContext) {
IsAtAll: false,
}
}
fs.doSend(url, body)
doSend(url, body, models.Feishu, ctx.Stats)
}
}
@@ -63,7 +59,7 @@ func (fs *FeishuSender) extract(users []*models.User) ([]string, []string) {
}
if token, has := user.ExtractToken(models.Feishu); has {
url := token
if !strings.HasPrefix(token, "https://") {
if !strings.HasPrefix(token, "https://") && !strings.HasPrefix(token, "http://") {
url = "https://open.feishu.cn/open-apis/bot/v2/hook/" + token
}
urls = append(urls, url)
@@ -71,12 +67,3 @@ func (fs *FeishuSender) extract(users []*models.User) ([]string, []string) {
}
return urls, ats
}
func (fs *FeishuSender) doSend(url string, body feishu) {
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
if err != nil {
logger.Errorf("feishu_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
} else {
logger.Infof("feishu_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
}
}

131
alert/sender/feishucard.go Normal file
View File

@@ -0,0 +1,131 @@
package sender
import (
"fmt"
"html/template"
"strings"
"github.com/ccfos/nightingale/v6/models"
)
type Conf struct {
WideScreenMode bool `json:"wide_screen_mode"`
EnableForward bool `json:"enable_forward"`
}
type Te struct {
Content string `json:"content"`
Tag string `json:"tag"`
}
type Element struct {
Tag string `json:"tag"`
Text Te `json:"text"`
Content string `json:"content"`
Elements []Element `json:"elements"`
}
type Titles struct {
Content string `json:"content"`
Tag string `json:"tag"`
}
type Headers struct {
Title Titles `json:"title"`
Template string `json:"template"`
}
type Cards struct {
Config Conf `json:"config"`
Elements []Element `json:"elements"`
Header Headers `json:"header"`
}
type feishuCard struct {
feishu
Card Cards `json:"card"`
}
type FeishuCardSender struct {
tpl *template.Template
}
const (
Recovered = "recovered"
Triggered = "triggered"
)
var (
body = feishuCard{
feishu: feishu{Msgtype: "interactive"},
Card: Cards{
Config: Conf{
WideScreenMode: true,
EnableForward: true,
},
Header: Headers{
Title: Titles{
Tag: "plain_text",
},
},
Elements: []Element{
{
Tag: "div",
Text: Te{
Tag: "lark_md",
},
},
{
Tag: "hr",
},
{
Tag: "note",
Elements: []Element{
{
Tag: "lark_md",
},
},
},
},
},
}
)
func (fs *FeishuCardSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
urls, _ := fs.extract(ctx.Users)
message := BuildTplMessage(fs.tpl, ctx.Events)
color := "red"
lowerUnicode := strings.ToLower(message)
if strings.Count(lowerUnicode, Recovered) > 0 && strings.Count(lowerUnicode, Triggered) > 0 {
color = "orange"
} else if strings.Count(lowerUnicode, Recovered) > 0 {
color = "green"
}
SendTitle := fmt.Sprintf("🔔 %s", ctx.Events[0].RuleName)
body.Card.Header.Title.Content = SendTitle
body.Card.Header.Template = color
body.Card.Elements[0].Text.Content = message
body.Card.Elements[2].Elements[0].Content = SendTitle
for _, url := range urls {
doSend(url, body, models.FeishuCard, ctx.Stats)
}
}
func (fs *FeishuCardSender) extract(users []*models.User) ([]string, []string) {
urls := make([]string, 0, len(users))
ats := make([]string, 0)
for i := range users {
if token, has := users[i].ExtractToken(models.FeishuCard); has {
url := token
if !strings.HasPrefix(token, "https://") && !strings.HasPrefix(token, "http://") {
url = "https://open.feishu.cn/open-apis/bot/v2/hook/" + strings.TrimSpace(token)
}
urls = append(urls, url)
}
}
return urls, ats
}

View File

@@ -4,10 +4,9 @@ import (
"html/template"
"net/url"
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
"github.com/toolkits/pkg/logger"
)
@@ -15,6 +14,7 @@ import (
type MatterMostMessage struct {
Text string
Tokens []string
Stats *astats.Stats
}
type mm struct {
@@ -28,7 +28,7 @@ type MmSender struct {
}
func (ms *MmSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
@@ -36,11 +36,12 @@ func (ms *MmSender) Send(ctx MessageContext) {
if len(urls) == 0 {
return
}
message := BuildTplMessage(ms.tpl, ctx.Event)
message := BuildTplMessage(ms.tpl, ctx.Events)
SendMM(MatterMostMessage{
Text: message,
Tokens: urls,
Stats: ctx.Stats,
})
}
@@ -87,13 +88,7 @@ func SendMM(message MatterMostMessage) {
Username: username,
Text: txt + message.Text,
}
res, code, err := poster.PostJSON(ur, time.Second*5, body, 3)
if err != nil {
logger.Errorf("mm_sender: result=fail url=%s code=%d error=%v response=%s", ur, code, err, string(res))
} else {
logger.Infof("mm_sender: result=succ url=%s code=%d response=%s", ur, code, string(res))
}
doSend(ur, body, models.Mm, message.Stats)
}
}
}

View File

@@ -6,6 +6,7 @@ import (
"os/exec"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/toolkits/pkg/file"
@@ -13,20 +14,22 @@ import (
"github.com/toolkits/pkg/sys"
)
func MayPluginNotify(noticeBytes []byte, notifyScript models.NotifyScript) {
func MayPluginNotify(noticeBytes []byte, notifyScript models.NotifyScript, stats *astats.Stats) {
if len(noticeBytes) == 0 {
return
}
alertingCallScript(noticeBytes, notifyScript)
alertingCallScript(noticeBytes, notifyScript, stats)
}
func alertingCallScript(stdinBytes []byte, notifyScript models.NotifyScript) {
func alertingCallScript(stdinBytes []byte, notifyScript models.NotifyScript, stats *astats.Stats) {
// not enable or no notify.py? do nothing
config := notifyScript
if !config.Enable || config.Content == "" {
return
}
channel := "script"
stats.AlertNotifyTotal.WithLabelValues(channel).Inc()
fpath := ".notify_scriptt"
if config.Type == 1 {
fpath = config.Content
@@ -36,6 +39,7 @@ func alertingCallScript(stdinBytes []byte, notifyScript models.NotifyScript) {
oldContent, err := file.ToString(fpath)
if err != nil {
logger.Errorf("event_script_notify_fail: read script file err: %v", err)
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
return
}
@@ -48,12 +52,14 @@ func alertingCallScript(stdinBytes []byte, notifyScript models.NotifyScript) {
_, err := file.WriteString(fpath, config.Content)
if err != nil {
logger.Errorf("event_script_notify_fail: write script file err: %v", err)
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
return
}
err = os.Chmod(fpath, 0777)
if err != nil {
logger.Errorf("event_script_notify_fail: chmod script file err: %v", err)
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
return
}
}
@@ -83,13 +89,14 @@ func alertingCallScript(stdinBytes []byte, notifyScript models.NotifyScript) {
if err != nil {
logger.Errorf("event_script_notify_fail: kill process %s occur error %v", fpath, err)
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
}
return
}
if err != nil {
logger.Errorf("event_script_notify_fail: exec script %s occur error: %v, output: %s", fpath, err, buf.String())
stats.AlertNotifyErrorTotal.WithLabelValues(channel).Inc()
return
}

View File

@@ -5,6 +5,7 @@ import (
"html/template"
"github.com/ccfos/nightingale/v6/alert/aconf"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
)
@@ -17,13 +18,14 @@ type (
// MessageContext 一个event所生成的告警通知的上下文
MessageContext struct {
Users []*models.User
Rule *models.AlertRule
Event *models.AlertCurEvent
Users []*models.User
Rule *models.AlertRule
Events []*models.AlertCurEvent
Stats *astats.Stats
}
)
func NewSender(key string, tpls map[string]*template.Template, smtp aconf.SMTPConfig) Sender {
func NewSender(key string, tpls map[string]*template.Template, smtp ...aconf.SMTPConfig) Sender {
switch key {
case models.Dingtalk:
return &DingtalkSender{tpl: tpls[models.Dingtalk]}
@@ -31,8 +33,10 @@ func NewSender(key string, tpls map[string]*template.Template, smtp aconf.SMTPCo
return &WecomSender{tpl: tpls[models.Wecom]}
case models.Feishu:
return &FeishuSender{tpl: tpls[models.Feishu]}
case models.FeishuCard:
return &FeishuCardSender{tpl: tpls[models.FeishuCard]}
case models.Email:
return &EmailSender{subjectTpl: tpls["mailsubject"], contentTpl: tpls[models.Email], smtp: smtp}
return &EmailSender{subjectTpl: tpls[models.EmailSubject], contentTpl: tpls[models.Email], smtp: smtp[0]}
case models.Mm:
return &MmSender{tpl: tpls[models.Mm]}
case models.Telegram:
@@ -41,23 +45,33 @@ func NewSender(key string, tpls map[string]*template.Template, smtp aconf.SMTPCo
return nil
}
func BuildMessageContext(rule *models.AlertRule, event *models.AlertCurEvent, uids []int64, userCache *memsto.UserCacheType) MessageContext {
func BuildMessageContext(rule *models.AlertRule, events []*models.AlertCurEvent, uids []int64, userCache *memsto.UserCacheType, stats *astats.Stats) MessageContext {
users := userCache.GetByUserIds(uids)
return MessageContext{
Rule: rule,
Event: event,
Users: users,
Rule: rule,
Events: events,
Users: users,
Stats: stats,
}
}
func BuildTplMessage(tpl *template.Template, event *models.AlertCurEvent) string {
type BuildTplMessageFunc func(tpl *template.Template, events []*models.AlertCurEvent) string
var BuildTplMessage BuildTplMessageFunc = buildTplMessage
func buildTplMessage(tpl *template.Template, events []*models.AlertCurEvent) string {
if tpl == nil {
return "tpl for current sender not found, please check configuration"
}
var body bytes.Buffer
if err := tpl.Execute(&body, event); err != nil {
return err.Error()
var content string
for _, event := range events {
var body bytes.Buffer
if err := tpl.Execute(&body, event); err != nil {
return err.Error()
}
content += body.String() + "\n\n"
}
return body.String()
return content
}

View File

@@ -3,10 +3,9 @@ package sender
import (
"html/template"
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
"github.com/toolkits/pkg/logger"
)
@@ -14,6 +13,7 @@ import (
type TelegramMessage struct {
Text string
Tokens []string
Stats *astats.Stats
}
type telegram struct {
@@ -26,15 +26,16 @@ type TelegramSender struct {
}
func (ts *TelegramSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
tokens := ts.extract(ctx.Users)
message := BuildTplMessage(ts.tpl, ctx.Event)
message := BuildTplMessage(ts.tpl, ctx.Events)
SendTelegram(TelegramMessage{
Text: message,
Tokens: tokens,
Stats: ctx.Stats,
})
}
@@ -55,7 +56,7 @@ func SendTelegram(message TelegramMessage) {
continue
}
var url string
if strings.HasPrefix(message.Tokens[i], "https://") {
if strings.HasPrefix(message.Tokens[i], "https://") || strings.HasPrefix(message.Tokens[i], "http://") {
url = message.Tokens[i]
} else {
array := strings.Split(message.Tokens[i], "/")
@@ -72,11 +73,6 @@ func SendTelegram(message TelegramMessage) {
Text: message.Text,
}
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
if err != nil {
logger.Errorf("telegram_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
} else {
logger.Infof("telegram_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
}
doSend(url, body, models.Telegram, message.Stats)
}
}

View File

@@ -3,16 +3,17 @@ package sender
import (
"bytes"
"encoding/json"
"io/ioutil"
"io"
"net/http"
"time"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/models"
"github.com/toolkits/pkg/logger"
)
func SendWebhooks(webhooks []*models.Webhook, event *models.AlertCurEvent) {
func SendWebhooks(webhooks []*models.Webhook, event *models.AlertCurEvent, stats *astats.Stats) {
for _, conf := range webhooks {
if conf.Url == "" || !conf.Enable {
continue
@@ -50,9 +51,11 @@ func SendWebhooks(webhooks []*models.Webhook, event *models.AlertCurEvent) {
Timeout: time.Duration(conf.Timeout) * time.Second,
}
stats.AlertNotifyTotal.WithLabelValues("webhook").Inc()
var resp *http.Response
resp, err = client.Do(req)
if err != nil {
stats.AlertNotifyErrorTotal.WithLabelValues("webhook").Inc()
logger.Errorf("event_webhook_fail, ruleId: [%d], eventId: [%d], url: [%s], error: [%s]", event.RuleId, event.Id, conf.Url, err)
continue
}
@@ -60,7 +63,7 @@ func SendWebhooks(webhooks []*models.Webhook, event *models.AlertCurEvent) {
var body []byte
if resp.Body != nil {
defer resp.Body.Close()
body, _ = ioutil.ReadAll(resp.Body)
body, _ = io.ReadAll(resp.Body)
}
logger.Debugf("event_webhook_succ, url: %s, response code: %d, body: %s", conf.Url, resp.StatusCode, string(body))

View File

@@ -3,12 +3,8 @@ package sender
import (
"html/template"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
"github.com/toolkits/pkg/logger"
)
type wecomMarkdown struct {
@@ -25,11 +21,11 @@ type WecomSender struct {
}
func (ws *WecomSender) Send(ctx MessageContext) {
if len(ctx.Users) == 0 || ctx.Rule == nil || ctx.Event == nil {
if len(ctx.Users) == 0 || len(ctx.Events) == 0 {
return
}
urls := ws.extract(ctx.Users)
message := BuildTplMessage(ws.tpl, ctx.Event)
message := BuildTplMessage(ws.tpl, ctx.Events)
for _, url := range urls {
body := wecom{
Msgtype: "markdown",
@@ -37,7 +33,7 @@ func (ws *WecomSender) Send(ctx MessageContext) {
Content: message,
},
}
ws.doSend(url, body)
doSend(url, body, models.Wecom, ctx.Stats)
}
}
@@ -46,7 +42,7 @@ func (ws *WecomSender) extract(users []*models.User) []string {
for _, user := range users {
if token, has := user.ExtractToken(models.Wecom); has {
url := token
if !strings.HasPrefix(token, "https://") {
if !strings.HasPrefix(token, "https://") && !strings.HasPrefix(token, "http://") {
url = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=" + token
}
urls = append(urls, url)
@@ -54,12 +50,3 @@ func (ws *WecomSender) extract(users []*models.User) []string {
}
return urls
}
func (ws *WecomSender) doSend(url string, body wecom) {
res, code, err := poster.PostJSON(url, time.Second*5, body, 3)
if err != nil {
logger.Errorf("wecom_sender: result=fail url=%s code=%d error=%v response=%s", url, code, err, string(res))
} else {
logger.Infof("wecom_sender: result=succ url=%s code=%d response=%s", url, code, string(res))
}
}

View File

@@ -8,6 +8,7 @@ type Center struct {
I18NHeaderKey string
MetricDesc MetricDescType
AnonymousAccess AnonymousAccess
UseFileAssets bool
}
type Plugin struct {

View File

@@ -1,9 +1,11 @@
package cconf
import (
"fmt"
"path"
"github.com/toolkits/pkg/file"
"gopkg.in/yaml.v2"
)
var Operations = Operation{}
@@ -36,3 +38,142 @@ func GetAllOps(ops []Ops) []string {
}
return ret
}
func MergeOperationConf() error {
opsBuiltIn := Operation{}
err := yaml.Unmarshal([]byte(builtInOps), &opsBuiltIn)
if err != nil {
return fmt.Errorf("cannot parse builtInOps: %s", err.Error())
}
configOpsMap := make(map[string]struct{})
for _, op := range Operations.Ops {
configOpsMap[op.Name] = struct{}{}
}
//If the opBu.Name is not a constant in the target (Operations.Ops), add Ops from the built-in options
for _, opBu := range opsBuiltIn.Ops {
if _, has := configOpsMap[opBu.Name]; !has {
Operations.Ops = append(Operations.Ops, opBu)
}
}
return nil
}
const (
builtInOps = `
ops:
- name: dashboards
cname: 仪表盘
ops:
- "/dashboards"
- "/dashboards/add"
- "/dashboards/put"
- "/dashboards/del"
- "/dashboards-built-in"
- name: alert
cname: 告警规则
ops:
- "/alert-rules"
- "/alert-rules/add"
- "/alert-rules/put"
- "/alert-rules/del"
- "/alert-rules-built-in"
- name: alert-mutes
cname: 告警静默管理
ops:
- "/alert-mutes"
- "/alert-mutes/add"
- "/alert-mutes/put"
- "/alert-mutes/del"
- name: alert-subscribes
cname: 告警订阅管理
ops:
- "/alert-subscribes"
- "/alert-subscribes/add"
- "/alert-subscribes/put"
- "/alert-subscribes/del"
- name: alert-events
cname: 告警事件管理
ops:
- "/alert-cur-events"
- "/alert-cur-events/del"
- "/alert-his-events"
- name: recording-rules
cname: 记录规则管理
ops:
- "/recording-rules"
- "/recording-rules/add"
- "/recording-rules/put"
- "/recording-rules/del"
- name: metric
cname: 时序指标
ops:
- "/metric/explorer"
- "/object/explorer"
- name: log
cname: 日志分析
ops:
- "/log/explorer"
- "/log/index-patterns"
- name: targets
cname: 基础设施
ops:
- "/targets"
- "/targets/add"
- "/targets/put"
- "/targets/del"
- name: job
cname: 任务管理
ops:
- "/job-tpls"
- "/job-tpls/add"
- "/job-tpls/put"
- "/job-tpls/del"
- "/job-tasks"
- "/job-tasks/add"
- "/job-tasks/put"
- "/ibex-settings"
- name: user
cname: 用户管理
ops:
- "/users"
- "/user-groups"
- "/user-groups/add"
- "/user-groups/put"
- "/user-groups/del"
- name: permissions
cname: 权限管理
ops:
- "/permissions"
- name: busi-groups
cname: 业务分组管理
ops:
- "/busi-groups"
- "/busi-groups/add"
- "/busi-groups/put"
- "/busi-groups/del"
- name: system
cname: 系统信息
ops:
- "/help/variable-configs"
- "/help/version"
- "/help/servers"
- "/help/source"
- "/help/sso"
- "/help/notification-tpls"
- "/help/notification-settings"
- "/help/migrate"
- "/site-settings"
`
)

View File

@@ -15,8 +15,14 @@ var Plugins = []Plugin{
},
{
Id: 3,
Category: "logging",
Type: "jaeger",
TypeName: "Jaeger",
Category: "loki",
Type: "loki",
TypeName: "Loki",
},
{
Id: 4,
Category: "timeseries",
Type: "tdengine",
TypeName: "TDengine",
},
}

View File

@@ -0,0 +1,105 @@
package rsa
import (
"os"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/httpx"
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/pkg/errors"
"github.com/toolkits/pkg/file"
"github.com/toolkits/pkg/logger"
)
func InitRSAConfig(ctx *ctx.Context, rsaConfig *httpx.RSAConfig) error {
// 1.Load RSA keys from Database
rsaPassWord, err := models.ConfigsGet(ctx, models.RSA_PASSWORD)
if err != nil {
return errors.WithMessagef(err, "cannot query config(%s)", models.RSA_PASSWORD)
}
privateKeyVal, err := models.ConfigsGet(ctx, models.RSA_PRIVATE_KEY)
if err != nil {
return errors.WithMessagef(err, "cannot query config(%s)", models.RSA_PRIVATE_KEY)
}
publicKeyVal, err := models.ConfigsGet(ctx, models.RSA_PUBLIC_KEY)
if err != nil {
return errors.WithMessagef(err, "cannot query config(%s)", models.RSA_PUBLIC_KEY)
}
if rsaPassWord != "" && privateKeyVal != "" && publicKeyVal != "" {
rsaConfig.RSAPassWord = rsaPassWord
rsaConfig.RSAPrivateKey = []byte(privateKeyVal)
rsaConfig.RSAPublicKey = []byte(publicKeyVal)
return nil
}
// 2.Read RSA configuration from file if exists
if file.IsExist(rsaConfig.RSAPrivateKeyPath) && file.IsExist(rsaConfig.RSAPublicKeyPath) {
//password already read from config
rsaConfig.RSAPrivateKey, rsaConfig.RSAPublicKey, err = readConfigFile(rsaConfig)
if err != nil {
return errors.WithMessage(err, "failed to read rsa config from file")
}
return nil
}
// 3.Generate RSA keys if not exist
rsaConfig.RSAPassWord, rsaConfig.RSAPrivateKey, rsaConfig.RSAPublicKey, err = initRSAKeyPairs(ctx, rsaConfig.RSAPassWord)
if err != nil {
return errors.WithMessage(err, "failed to generate rsa key pair")
}
return nil
}
func initRSAKeyPairs(ctx *ctx.Context, rsaPassWord string) (password string, privateByte, publicByte []byte, err error) {
// Generate RSA keys
// Generate RSA password
if rsaPassWord != "" {
logger.Debug("Using existing RSA password")
password = rsaPassWord
err = models.ConfigsSet(ctx, models.RSA_PASSWORD, password)
if err != nil {
err = errors.WithMessagef(err, "failed to set config(%s)", models.RSA_PASSWORD)
return
}
} else {
password, err = models.InitRSAPassWord(ctx)
if err != nil {
err = errors.WithMessage(err, "failed to generate rsa password")
return
}
}
privateByte, publicByte, err = secu.GenerateRsaKeyPair(password)
if err != nil {
err = errors.WithMessage(err, "failed to generate rsa key pair")
return
}
// Save generated RSA keys
err = models.ConfigsSet(ctx, models.RSA_PRIVATE_KEY, string(privateByte))
if err != nil {
err = errors.WithMessagef(err, "failed to set config(%s)", models.RSA_PRIVATE_KEY)
return
}
err = models.ConfigsSet(ctx, models.RSA_PUBLIC_KEY, string(publicByte))
if err != nil {
err = errors.WithMessagef(err, "failed to set config(%s)", models.RSA_PUBLIC_KEY)
return
}
return
}
func readConfigFile(rsaConfig *httpx.RSAConfig) (privateBuf, publicBuf []byte, err error) {
publicBuf, err = os.ReadFile(rsaConfig.RSAPublicKeyPath)
if err != nil {
err = errors.WithMessagef(err, "could not read RSAPublicKeyPath %q", rsaConfig.RSAPublicKeyPath)
return
}
privateBuf, err = os.ReadFile(rsaConfig.RSAPrivateKeyPath)
if err != nil {
err = errors.WithMessagef(err, "could not read RSAPrivateKeyPath %q", rsaConfig.RSAPrivateKeyPath)
}
return
}

15
center/cconf/sql_tpl.go Normal file
View File

@@ -0,0 +1,15 @@
package cconf
var TDengineSQLTpl = map[string]string{
"load5": "SELECT _wstart as ts, last(load5) FROM $database.system WHERE host = '$server' and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"process_total": "SELECT _wstart as ts, last(total) FROM $database.processes WHERE host = '$server' and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"thread_total": "SELECT _wstart as ts, last(total) FROM $database.threads WHERE host = '$server' and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"cpu_idle": "SELECT _wstart as ts, last(usage_idle) * -1 + 100 FROM $database.cpu WHERE (host = '$server' and cpu = 'cpu-total') and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"mem_used_percent": "SELECT _wstart as ts, last(used_percent) FROM $database.mem WHERE (host = '$server') and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"disk_used_percent": "SELECT _wstart as ts, last(used_percent) FROM $database.disk WHERE (host = '$server' and path = '/') and _ts >= $from and _ts <= $to interval($interval) fill(null)",
"cpu_context_switches": "select ts, derivative(context_switches, 1s, 0) as context FROM (SELECT _wstart as ts, avg(context_switches) as context_switches FROM $database.kernel WHERE host = '$server' and _ts >= $from and _ts <= $to interval($interval) )",
"tcp": "SELECT _wstart as ts, avg(tcp_close) as CLOSED, avg(tcp_close_wait) as CLOSE_WAIT, avg(tcp_closing) as CLOSING, avg(tcp_established) as ESTABLISHED, avg(tcp_fin_wait1) as FIN_WAIT1, avg(tcp_fin_wait2) as FIN_WAIT2, avg(tcp_last_ack) as LAST_ACK, avg(tcp_syn_recv) as SYN_RECV, avg(tcp_syn_sent) as SYN_SENT, avg(tcp_time_wait) as TIME_WAIT FROM $database.netstat WHERE host = '$server' and _ts >= $from and _ts <= $to interval($interval)",
"net_bytes_recv": "SELECT _wstart as ts, derivative(bytes_recv,1s, 1) as bytes_in FROM $database.net WHERE host = '$server' and interface = '$netif' and _ts >= $from and _ts <= $to group by tbname",
"net_bytes_sent": "SELECT _wstart as ts, derivative(bytes_sent,1s, 1) as bytes_out FROM $database.net WHERE host = '$server' and interface = '$netif' and _ts >= $from and _ts <= $to group by tbname",
"disk_total": "SELECT _wstart as ts, avg(total) AS total, avg(used) as used FROM $database.disk WHERE path = '$mountpoint' and _ts >= $from and _ts <= $to interval($interval) group by host",
}

View File

@@ -8,19 +8,25 @@ import (
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/alert/process"
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/center/cconf/rsa"
"github.com/ccfos/nightingale/v6/center/cstats"
"github.com/ccfos/nightingale/v6/center/metas"
"github.com/ccfos/nightingale/v6/center/sso"
"github.com/ccfos/nightingale/v6/conf"
"github.com/ccfos/nightingale/v6/dumper"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/models/migrate"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/httpx"
"github.com/ccfos/nightingale/v6/pkg/i18nx"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/ccfos/nightingale/v6/pkg/version"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/pushgw/writer"
"github.com/ccfos/nightingale/v6/storage"
"github.com/ccfos/nightingale/v6/tdengine"
alertrt "github.com/ccfos/nightingale/v6/alert/router"
centerrt "github.com/ccfos/nightingale/v6/center/router"
@@ -36,21 +42,31 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
cconf.LoadMetricsYaml(configDir, config.Center.MetricsYamlFile)
cconf.LoadOpsYaml(configDir, config.Center.OpsYamlFile)
cconf.MergeOperationConf()
logxClean, err := logx.Init(config.Log)
if err != nil {
return nil, err
}
i18nx.Init()
i18nx.Init(configDir)
cstats.Init()
db, err := storage.New(config.DB)
if err != nil {
return nil, err
}
ctx := ctx.NewContext(context.Background(), db, true)
migrate.Migrate(db)
models.InitRoot(ctx)
redis, err := storage.NewRedis(config.Redis)
err = rsa.InitRSAConfig(ctx, &config.HTTP.RSA)
if err != nil {
return nil, err
}
var redis storage.Redis
redis, err = storage.NewRedis(config.Redis)
if err != nil {
return nil, err
}
@@ -63,22 +79,29 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
sso := sso.Init(config.Center, ctx)
configCache := memsto.NewConfigCache(ctx, syncStats, config.HTTP.RSA.RSAPrivateKey, config.HTTP.RSA.RSAPassWord)
busiGroupCache := memsto.NewBusiGroupCache(ctx, syncStats)
targetCache := memsto.NewTargetCache(ctx, syncStats, redis)
dsCache := memsto.NewDatasourceCache(ctx, syncStats)
alertMuteCache := memsto.NewAlertMuteCache(ctx, syncStats)
alertRuleCache := memsto.NewAlertRuleCache(ctx, syncStats)
notifyConfigCache := memsto.NewNotifyConfigCache(ctx)
notifyConfigCache := memsto.NewNotifyConfigCache(ctx, configCache)
userCache := memsto.NewUserCache(ctx, syncStats)
userGroupCache := memsto.NewUserGroupCache(ctx, syncStats)
promClients := prom.NewPromClient(ctx, config.Alert.Heartbeat)
promClients := prom.NewPromClient(ctx)
tdengineClients := tdengine.NewTdengineClient(ctx, config.Alert.Heartbeat)
externalProcessors := process.NewExternalProcessors()
alert.Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache, alertRuleCache, notifyConfigCache, dsCache, ctx, promClients)
alert.Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache, alertRuleCache, notifyConfigCache, dsCache, ctx, promClients, tdengineClients, userCache, userGroupCache)
writers := writer.NewWriters(config.Pushgw)
go version.GetGithubVersion()
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
centerRouter := centerrt.New(config.HTTP, config.Center, cconf.Operations, dsCache, notifyConfigCache, promClients, redis, sso, ctx, metas, targetCache)
centerRouter := centerrt.New(config.HTTP, config.Center, cconf.Operations, dsCache, notifyConfigCache, promClients, tdengineClients,
redis, sso, ctx, metas, idents, targetCache, userCache, userGroupCache)
pushgwRouter := pushgwrt.New(config.HTTP, config.Pushgw, targetCache, busiGroupCache, idents, writers, ctx)
r := httpx.GinEngine(config.Global.RunMode, config.HTTP)
@@ -86,6 +109,7 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
centerRouter.Config(r)
alertrtRouter.Config(r)
pushgwRouter.Config(r)
dumper.ConfigRouter(r)
httpClean := httpx.Init(config.HTTP, r)

View File

@@ -3,6 +3,8 @@ package router
import (
"fmt"
"net/http"
"path"
"runtime"
"strings"
"time"
@@ -15,12 +17,17 @@ import (
"github.com/ccfos/nightingale/v6/pkg/aop"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/httpx"
"github.com/ccfos/nightingale/v6/pkg/version"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/storage"
"github.com/ccfos/nightingale/v6/tdengine"
"github.com/gin-gonic/gin"
"github.com/rakyll/statik/fs"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/runner"
)
type Router struct {
@@ -30,15 +37,18 @@ type Router struct {
DatasourceCache *memsto.DatasourceCacheType
NotifyConfigCache *memsto.NotifyConfigCacheType
PromClients *prom.PromClientMap
TdendgineClients *tdengine.TdengineClientMap
Redis storage.Redis
MetaSet *metas.Set
IdentSet *idents.Set
TargetCache *memsto.TargetCacheType
Sso *sso.SsoClient
UserCache *memsto.UserCacheType
UserGroupCache *memsto.UserGroupCacheType
Ctx *ctx.Context
}
func New(httpConfig httpx.Config, center cconf.Center, operations cconf.Operation, ds *memsto.DatasourceCacheType, ncc *memsto.NotifyConfigCacheType,
pc *prom.PromClientMap, redis storage.Redis, sso *sso.SsoClient, ctx *ctx.Context, metaSet *metas.Set, tc *memsto.TargetCacheType) *Router {
func New(httpConfig httpx.Config, center cconf.Center, operations cconf.Operation, ds *memsto.DatasourceCacheType, ncc *memsto.NotifyConfigCacheType, pc *prom.PromClientMap, tdendgineClients *tdengine.TdengineClientMap, redis storage.Redis, sso *sso.SsoClient, ctx *ctx.Context, metaSet *metas.Set, idents *idents.Set, tc *memsto.TargetCacheType, uc *memsto.UserCacheType, ugc *memsto.UserGroupCacheType) *Router {
return &Router{
HTTP: httpConfig,
Center: center,
@@ -46,10 +56,14 @@ func New(httpConfig httpx.Config, center cconf.Center, operations cconf.Operatio
DatasourceCache: ds,
NotifyConfigCache: ncc,
PromClients: pc,
TdendgineClients: tdendgineClients,
Redis: redis,
MetaSet: metaSet,
IdentSet: idents,
TargetCache: tc,
Sso: sso,
UserCache: uc,
UserGroupCache: ugc,
Ctx: ctx,
}
}
@@ -95,10 +109,32 @@ func (rt *Router) configNoRoute(r *gin.Engine, fs *http.FileSystem) {
suffix := arr[len(arr)-1]
switch suffix {
case "png", "jpeg", "jpg", "svg", "ico", "gif", "css", "js", "html", "htm", "gz", "zip", "map":
c.FileFromFS(c.Request.URL.Path, *fs)
case "png", "jpeg", "jpg", "svg", "ico", "gif", "css", "js", "html", "htm", "gz", "zip", "map", "ttf":
if !rt.Center.UseFileAssets {
c.FileFromFS(c.Request.URL.Path, *fs)
} else {
cwdarr := []string{"/"}
if runtime.GOOS == "windows" {
cwdarr[0] = ""
}
cwdarr = append(cwdarr, strings.Split(runner.Cwd, "/")...)
cwdarr = append(cwdarr, "pub")
cwdarr = append(cwdarr, strings.Split(c.Request.URL.Path, "/")...)
c.File(path.Join(cwdarr...))
}
default:
c.FileFromFS("/", *fs)
if !rt.Center.UseFileAssets {
c.FileFromFS("/", *fs)
} else {
cwdarr := []string{"/"}
if runtime.GOOS == "windows" {
cwdarr[0] = ""
}
cwdarr = append(cwdarr, strings.Split(runner.Cwd, "/")...)
cwdarr = append(cwdarr, "pub")
cwdarr = append(cwdarr, "index.html")
c.File(path.Join(cwdarr...))
}
}
})
}
@@ -113,7 +149,10 @@ func (rt *Router) Config(r *gin.Engine) {
if err != nil {
logger.Errorf("cannot create statik fs: %v", err)
}
r.StaticFS("/pub", statikFS)
if !rt.Center.UseFileAssets {
r.StaticFS("/pub", statikFS)
}
pagesPrefix := "/api/n9e"
pages := r.Group(pagesPrefix)
@@ -124,18 +163,38 @@ func (rt *Router) Config(r *gin.Engine) {
pages.POST("/query-range-batch", rt.promBatchQueryRange)
pages.POST("/query-instant-batch", rt.promBatchQueryInstant)
pages.GET("/datasource/brief", rt.datasourceBriefs)
pages.POST("/ds-query", rt.QueryData)
pages.POST("/logs-query", rt.QueryLog)
pages.POST("/tdengine-databases", rt.tdengineDatabases)
pages.POST("/tdengine-tables", rt.tdengineTables)
pages.POST("/tdengine-columns", rt.tdengineColumns)
} else {
pages.Any("/proxy/:id/*url", rt.auth(), rt.dsProxy)
pages.POST("/query-range-batch", rt.auth(), rt.promBatchQueryRange)
pages.POST("/query-instant-batch", rt.auth(), rt.promBatchQueryInstant)
pages.GET("/datasource/brief", rt.auth(), rt.datasourceBriefs)
pages.GET("/datasource/brief", rt.auth(), rt.user(), rt.datasourceBriefs)
pages.POST("/ds-query", rt.auth(), rt.QueryData)
pages.POST("/logs-query", rt.auth(), rt.QueryLog)
pages.POST("/tdengine-databases", rt.auth(), rt.tdengineDatabases)
pages.POST("/tdengine-tables", rt.auth(), rt.tdengineTables)
pages.POST("/tdengine-columns", rt.auth(), rt.tdengineColumns)
}
pages.GET("/sql-template", rt.QuerySqlTemplate)
pages.POST("/auth/login", rt.jwtMock(), rt.loginPost)
pages.POST("/auth/logout", rt.jwtMock(), rt.logoutPost)
pages.POST("/auth/logout", rt.jwtMock(), rt.auth(), rt.logoutPost)
pages.POST("/auth/refresh", rt.jwtMock(), rt.refreshPost)
pages.POST("/auth/captcha", rt.jwtMock(), rt.generateCaptcha)
pages.POST("/auth/captcha-verify", rt.jwtMock(), rt.captchaVerify)
pages.GET("/auth/ifshowcaptcha", rt.ifShowCaptcha)
pages.GET("/auth/sso-config", rt.ssoConfigNameGet)
pages.GET("/auth/rsa-config", rt.rsaConfigGet)
pages.GET("/auth/redirect", rt.loginRedirect)
pages.GET("/auth/redirect/cas", rt.loginRedirectCas)
pages.GET("/auth/redirect/oauth", rt.loginRedirectOAuth)
@@ -203,7 +262,9 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/builtin-boards-cates", rt.auth(), rt.user(), rt.builtinBoardCateGets)
pages.POST("/builtin-boards-detail", rt.auth(), rt.user(), rt.builtinBoardDetailGets)
pages.GET("/integrations/icon/:cate/:name", rt.builtinIcon)
pages.GET("/integrations/makedown/:cate", rt.builtinMarkdown)
pages.GET("/busi-groups/boards", rt.auth(), rt.user(), rt.perm("/dashboards"), rt.boardGetsByGids)
pages.GET("/busi-group/:id/boards", rt.auth(), rt.user(), rt.perm("/dashboards"), rt.bgro(), rt.boardGets)
pages.POST("/busi-group/:id/boards", rt.auth(), rt.user(), rt.perm("/dashboards/add"), rt.bgrw(), rt.boardAdd)
pages.POST("/busi-group/:id/board/:bid/clone", rt.auth(), rt.user(), rt.perm("/dashboards/add"), rt.bgrw(), rt.boardClone)
@@ -221,6 +282,7 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/alert-rules/builtin/alerts-cates", rt.auth(), rt.user(), rt.builtinAlertCateGets)
pages.GET("/alert-rules/builtin/list", rt.auth(), rt.user(), rt.builtinAlertRules)
pages.GET("/busi-groups/alert-rules", rt.auth(), rt.user(), rt.perm("/alert-rules"), rt.alertRuleGetsByGids)
pages.GET("/busi-group/:id/alert-rules", rt.auth(), rt.user(), rt.perm("/alert-rules"), rt.alertRuleGets)
pages.POST("/busi-group/:id/alert-rules", rt.auth(), rt.user(), rt.perm("/alert-rules/add"), rt.bgrw(), rt.alertRuleAddByFE)
pages.POST("/busi-group/:id/alert-rules/import", rt.auth(), rt.user(), rt.perm("/alert-rules/add"), rt.bgrw(), rt.alertRuleAddByImport)
@@ -228,7 +290,9 @@ func (rt *Router) Config(r *gin.Engine) {
pages.PUT("/busi-group/:id/alert-rules/fields", rt.auth(), rt.user(), rt.perm("/alert-rules/put"), rt.bgrw(), rt.alertRulePutFields)
pages.PUT("/busi-group/:id/alert-rule/:arid", rt.auth(), rt.user(), rt.perm("/alert-rules/put"), rt.alertRulePutByFE)
pages.GET("/alert-rule/:arid", rt.auth(), rt.user(), rt.perm("/alert-rules"), rt.alertRuleGet)
pages.PUT("/busi-group/alert-rule/validate", rt.auth(), rt.user(), rt.perm("/alert-rules/put"), rt.alertRuleValidation)
pages.GET("/busi-groups/recording-rules", rt.auth(), rt.user(), rt.perm("/recording-rules"), rt.recordingRuleGetsByGids)
pages.GET("/busi-group/:id/recording-rules", rt.auth(), rt.user(), rt.perm("/recording-rules"), rt.recordingRuleGets)
pages.POST("/busi-group/:id/recording-rules", rt.auth(), rt.user(), rt.perm("/recording-rules/add"), rt.bgrw(), rt.recordingRuleAddByFE)
pages.DELETE("/busi-group/:id/recording-rules", rt.auth(), rt.user(), rt.perm("/recording-rules/del"), rt.bgrw(), rt.recordingRuleDel)
@@ -236,12 +300,15 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/recording-rule/:rrid", rt.auth(), rt.user(), rt.perm("/recording-rules"), rt.recordingRuleGet)
pages.PUT("/busi-group/:id/recording-rules/fields", rt.auth(), rt.user(), rt.perm("/recording-rules/put"), rt.recordingRulePutFields)
pages.GET("/busi-groups/alert-mutes", rt.auth(), rt.user(), rt.perm("/alert-mutes"), rt.alertMuteGetsByGids)
pages.GET("/busi-group/:id/alert-mutes", rt.auth(), rt.user(), rt.perm("/alert-mutes"), rt.bgro(), rt.alertMuteGetsByBG)
pages.POST("/busi-group/:id/alert-mutes/preview", rt.auth(), rt.user(), rt.perm("/alert-mutes/add"), rt.bgrw(), rt.alertMutePreview)
pages.POST("/busi-group/:id/alert-mutes", rt.auth(), rt.user(), rt.perm("/alert-mutes/add"), rt.bgrw(), rt.alertMuteAdd)
pages.DELETE("/busi-group/:id/alert-mutes", rt.auth(), rt.user(), rt.perm("/alert-mutes/del"), rt.bgrw(), rt.alertMuteDel)
pages.PUT("/busi-group/:id/alert-mute/:amid", rt.auth(), rt.user(), rt.perm("/alert-mutes/put"), rt.alertMutePutByFE)
pages.PUT("/busi-group/:id/alert-mutes/fields", rt.auth(), rt.user(), rt.perm("/alert-mutes/put"), rt.bgrw(), rt.alertMutePutFields)
pages.GET("/busi-groups/alert-subscribes", rt.auth(), rt.user(), rt.perm("/alert-subscribes"), rt.alertSubscribeGetsByGids)
pages.GET("/busi-group/:id/alert-subscribes", rt.auth(), rt.user(), rt.perm("/alert-subscribes"), rt.bgro(), rt.alertSubscribeGets)
pages.GET("/alert-subscribe/:sid", rt.auth(), rt.user(), rt.perm("/alert-subscribes"), rt.alertSubscribeGet)
pages.POST("/busi-group/:id/alert-subscribes", rt.auth(), rt.user(), rt.perm("/alert-subscribes/add"), rt.bgrw(), rt.alertSubscribeAdd)
@@ -262,12 +329,14 @@ func (rt *Router) Config(r *gin.Engine) {
pages.POST("/alert-cur-events/card/details", rt.auth(), rt.alertCurEventsCardDetails)
pages.GET("/alert-his-events/list", rt.auth(), rt.alertHisEventsList)
pages.DELETE("/alert-cur-events", rt.auth(), rt.user(), rt.perm("/alert-cur-events/del"), rt.alertCurEventDel)
pages.GET("/alert-cur-events/stats", rt.auth(), rt.alertCurEventsStatistics)
pages.GET("/alert-aggr-views", rt.auth(), rt.alertAggrViewGets)
pages.DELETE("/alert-aggr-views", rt.auth(), rt.user(), rt.alertAggrViewDel)
pages.POST("/alert-aggr-views", rt.auth(), rt.user(), rt.alertAggrViewAdd)
pages.PUT("/alert-aggr-views", rt.auth(), rt.user(), rt.alertAggrViewPut)
pages.GET("/busi-groups/task-tpls", rt.auth(), rt.user(), rt.perm("/job-tpls"), rt.taskTplGetsByGids)
pages.GET("/busi-group/:id/task-tpls", rt.auth(), rt.user(), rt.perm("/job-tpls"), rt.bgro(), rt.taskTplGets)
pages.POST("/busi-group/:id/task-tpls", rt.auth(), rt.user(), rt.perm("/job-tpls/add"), rt.bgrw(), rt.taskTplAdd)
pages.DELETE("/busi-group/:id/task-tpl/:tid", rt.auth(), rt.user(), rt.perm("/job-tpls/del"), rt.bgrw(), rt.taskTplDel)
@@ -276,6 +345,7 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/busi-group/:id/task-tpl/:tid", rt.auth(), rt.user(), rt.perm("/job-tpls"), rt.bgro(), rt.taskTplGet)
pages.PUT("/busi-group/:id/task-tpl/:tid", rt.auth(), rt.user(), rt.perm("/job-tpls/put"), rt.bgrw(), rt.taskTplPut)
pages.GET("/busi-groups/tasks", rt.auth(), rt.user(), rt.perm("/job-tasks"), rt.taskGetsByGids)
pages.GET("/busi-group/:id/tasks", rt.auth(), rt.user(), rt.perm("/job-tasks"), rt.bgro(), rt.taskGets)
pages.POST("/busi-group/:id/tasks", rt.auth(), rt.user(), rt.perm("/job-tasks/add"), rt.bgrw(), rt.taskAdd)
pages.GET("/busi-group/:id/task/*url", rt.auth(), rt.user(), rt.perm("/job-tasks"), rt.taskProxy)
@@ -284,7 +354,7 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/servers", rt.auth(), rt.admin(), rt.serversGet)
pages.GET("/server-clusters", rt.auth(), rt.admin(), rt.serverClustersGet)
pages.POST("/datasource/list", rt.auth(), rt.datasourceList)
pages.POST("/datasource/list", rt.auth(), rt.user(), rt.datasourceList)
pages.POST("/datasource/plugin/list", rt.auth(), rt.pluginList)
pages.POST("/datasource/upsert", rt.auth(), rt.admin(), rt.datasourceUpsert)
pages.POST("/datasource/desc", rt.auth(), rt.admin(), rt.datasourceGet)
@@ -300,34 +370,71 @@ func (rt *Router) Config(r *gin.Engine) {
pages.PUT("/role/:id/ops", rt.auth(), rt.admin(), rt.roleBindOperation)
pages.GET("/operation", rt.operations)
pages.GET("/notify-tpls", rt.auth(), rt.admin(), rt.notifyTplGets)
pages.GET("/notify-tpls", rt.auth(), rt.user(), rt.perm("/help/notification-tpls"), rt.notifyTplGets)
pages.PUT("/notify-tpl/content", rt.auth(), rt.admin(), rt.notifyTplUpdateContent)
pages.PUT("/notify-tpl", rt.auth(), rt.admin(), rt.notifyTplUpdate)
pages.POST("/notify-tpl", rt.auth(), rt.admin(), rt.notifyTplAdd)
pages.DELETE("/notify-tpl/:id", rt.auth(), rt.admin(), rt.notifyTplDel)
pages.POST("/notify-tpl/preview", rt.auth(), rt.admin(), rt.notifyTplPreview)
pages.GET("/sso-configs", rt.auth(), rt.admin(), rt.ssoConfigGets)
pages.PUT("/sso-config", rt.auth(), rt.admin(), rt.ssoConfigUpdate)
pages.GET("/webhooks", rt.auth(), rt.admin(), rt.webhookGets)
pages.GET("/webhooks", rt.auth(), rt.user(), rt.webhookGets)
pages.PUT("/webhooks", rt.auth(), rt.admin(), rt.webhookPuts)
pages.GET("/notify-script", rt.auth(), rt.admin(), rt.notifyScriptGet)
pages.GET("/notify-script", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyScriptGet)
pages.PUT("/notify-script", rt.auth(), rt.admin(), rt.notifyScriptPut)
pages.GET("/notify-channel", rt.auth(), rt.admin(), rt.notifyChannelGets)
pages.GET("/notify-channel", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyChannelGets)
pages.PUT("/notify-channel", rt.auth(), rt.admin(), rt.notifyChannelPuts)
pages.GET("/notify-contact", rt.auth(), rt.admin(), rt.notifyContactGets)
pages.GET("/notify-contact", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyContactGets)
pages.PUT("/notify-contact", rt.auth(), rt.admin(), rt.notifyContactPuts)
pages.GET("/notify-config", rt.auth(), rt.admin(), rt.notifyConfigGet)
pages.GET("/notify-config", rt.auth(), rt.user(), rt.perm("/help/notification-settings"), rt.notifyConfigGet)
pages.PUT("/notify-config", rt.auth(), rt.admin(), rt.notifyConfigPut)
pages.PUT("/smtp-config-test", rt.auth(), rt.admin(), rt.attemptSendEmail)
pages.GET("/es-index-pattern", rt.auth(), rt.esIndexPatternGet)
pages.GET("/es-index-pattern-list", rt.auth(), rt.esIndexPatternGetList)
pages.POST("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternAdd)
pages.PUT("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternPut)
pages.DELETE("/es-index-pattern", rt.auth(), rt.admin(), rt.esIndexPatternDel)
pages.GET("/user-variable-configs", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigGets)
pages.POST("/user-variable-config", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigAdd)
pages.PUT("/user-variable-config/:id", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigPut)
pages.DELETE("/user-variable-config/:id", rt.auth(), rt.user(), rt.perm("/help/variable-configs"), rt.userVariableConfigDel)
pages.GET("/config", rt.auth(), rt.admin(), rt.configGetByKey)
pages.PUT("/config", rt.auth(), rt.admin(), rt.configPutByKey)
pages.GET("/site-info", rt.siteInfo)
// for admin api
pages.GET("/user/busi-groups", rt.auth(), rt.admin(), rt.userBusiGroupsGets)
}
if rt.HTTP.Service.Enable {
r.GET("/api/n9e/versions", func(c *gin.Context) {
v := version.Version
lastIndex := strings.LastIndex(version.Version, "-")
if lastIndex != -1 {
v = version.Version[:lastIndex]
}
gv := version.GithubVersion.Load()
if gv != nil {
ginx.NewRender(c).Data(gin.H{"version": v, "github_verison": gv.(string)}, nil)
} else {
ginx.NewRender(c).Data(gin.H{"version": v, "github_verison": ""}, nil)
}
})
if rt.HTTP.APIForService.Enable {
service := r.Group("/v1/n9e")
if len(rt.HTTP.Service.BasicAuth) > 0 {
service.Use(gin.BasicAuth(rt.HTTP.Service.BasicAuth))
if len(rt.HTTP.APIForService.BasicAuth) > 0 {
service.Use(gin.BasicAuth(rt.HTTP.APIForService.BasicAuth))
}
{
service.Any("/prometheus/*url", rt.dsProxy)
@@ -369,6 +476,8 @@ func (rt *Router) Config(r *gin.Engine) {
service.GET("/alert-his-events", rt.alertHisEventsList)
service.GET("/alert-his-event/:eid", rt.alertHisEventGet)
service.GET("/task-tpl/:tid", rt.taskTplGetByService)
service.GET("/config/:id", rt.configGet)
service.GET("/configs", rt.configsGet)
service.GET("/config", rt.configGetByKey)
@@ -384,14 +493,16 @@ func (rt *Router) Config(r *gin.Engine) {
service.GET("/notify-tpls", rt.notifyTplGets)
service.POST("/task-record-add", rt.taskRecordAdd)
service.GET("/user-variable/decrypt", rt.userVariableGetDecryptByService)
}
}
if rt.HTTP.Heartbeat.Enable {
if rt.HTTP.APIForAgent.Enable {
heartbeat := r.Group("/v1/n9e")
{
if len(rt.HTTP.Heartbeat.BasicAuth) > 0 {
heartbeat.Use(gin.BasicAuth(rt.HTTP.Heartbeat.BasicAuth))
if len(rt.HTTP.APIForAgent.BasicAuth) > 0 {
heartbeat.Use(gin.BasicAuth(rt.HTTP.APIForAgent.BasicAuth))
}
heartbeat.POST("/heartbeat", rt.heartbeat)
}

View File

@@ -69,6 +69,11 @@ func (rt *Router) alertAggrViewPut(c *gin.Context) {
return
}
}
ginx.NewRender(c).Message(view.Update(rt.Ctx, f.Name, f.Rule, f.Cate, me.Id))
view.Name = f.Name
view.Rule = f.Rule
view.Cate = f.Cate
if view.CreateBy == 0 {
view.CreateBy = me.Id
}
ginx.NewRender(c).Message(view.Update(rt.Ctx))
}

View File

@@ -4,6 +4,7 @@ import (
"net/http"
"sort"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
@@ -182,10 +183,19 @@ func (rt *Router) alertCurEventDel(c *gin.Context) {
ginx.BindJSON(c, &f)
f.Verify()
rt.checkCurEventBusiGroupRWPermission(c, f.Ids)
ginx.NewRender(c).Message(models.AlertCurEventDel(rt.Ctx, f.Ids))
}
func (rt *Router) checkCurEventBusiGroupRWPermission(c *gin.Context, ids []int64) {
set := make(map[int64]struct{})
for i := 0; i < len(f.Ids); i++ {
event, err := models.AlertCurEventGetById(rt.Ctx, f.Ids[i])
// event group id is 0, ignore perm check
set[0] = struct{}{}
for i := 0; i < len(ids); i++ {
event, err := models.AlertCurEventGetById(rt.Ctx, ids[i])
ginx.Dangerous(err)
if _, has := set[event.GroupId]; !has {
@@ -193,8 +203,6 @@ func (rt *Router) alertCurEventDel(c *gin.Context) {
set[event.GroupId] = struct{}{}
}
}
ginx.NewRender(c).Message(models.AlertCurEventDel(rt.Ctx, f.Ids))
}
func (rt *Router) alertCurEventGet(c *gin.Context) {
@@ -208,3 +216,8 @@ func (rt *Router) alertCurEventGet(c *gin.Context) {
ginx.NewRender(c).Data(event, nil)
}
func (rt *Router) alertCurEventsStatistics(c *gin.Context) {
ginx.NewRender(c).Data(models.AlertCurEventStatistics(rt.Ctx, time.Now()), nil)
}

View File

@@ -2,6 +2,7 @@ package router
import (
"net/http"
"strconv"
"strings"
"time"
@@ -10,6 +11,7 @@ import (
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
"github.com/toolkits/pkg/str"
)
// Return all, front-end search and paging
@@ -26,6 +28,27 @@ func (rt *Router) alertRuleGets(c *gin.Context) {
ginx.NewRender(c).Data(ars, err)
}
func (rt *Router) alertRuleGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
ars, err := models.AlertRuleGetsByBGIds(rt.Ctx, gids)
if err == nil {
cache := make(map[int64]*models.UserGroup)
for i := 0; i < len(ars); i++ {
ars[i].FillNotifyGroups(rt.Ctx, cache)
ars[i].FillSeverities()
}
}
ginx.NewRender(c).Data(ars, err)
}
func (rt *Router) alertRulesGetByService(c *gin.Context) {
prods := []string{}
prodStr := ginx.QueryStr(c, "prods", "")
@@ -271,3 +294,54 @@ func (rt *Router) alertRuleGet(c *gin.Context) {
ginx.NewRender(c).Data(ar, err)
}
// pre validation before save rule
func (rt *Router) alertRuleValidation(c *gin.Context) {
var f models.AlertRule //new
ginx.BindJSON(c, &f)
if len(f.NotifyChannelsJSON) > 0 && len(f.NotifyGroupsJSON) > 0 { //Validation NotifyChannels
ngids := make([]int64, 0, len(f.NotifyChannelsJSON))
for i := range f.NotifyGroupsJSON {
id, _ := strconv.ParseInt(f.NotifyGroupsJSON[i], 10, 64)
ngids = append(ngids, id)
}
userGroups := rt.UserGroupCache.GetByUserGroupIds(ngids)
uids := make([]int64, 0)
for i := range userGroups {
uids = append(uids, userGroups[i].UserIds...)
}
users := rt.UserCache.GetByUserIds(uids)
//If any users have a certain notify channel's token, it will be okay. Otherwise, this notify channel is absent of tokens.
ancs := make([]string, 0, len(f.NotifyChannelsJSON)) //absent Notify Channels
for i := range f.NotifyChannelsJSON {
flag := true
//ignore non-default channels
switch f.NotifyChannelsJSON[i] {
case models.Dingtalk, models.Wecom, models.Feishu, models.Mm,
models.Telegram, models.Email, models.FeishuCard:
// do nothing
default:
continue
}
//default channels
for ui := range users {
if _, b := users[ui].ExtractToken(f.NotifyChannelsJSON[i]); b {
flag = false
break
}
}
if flag {
ancs = append(ancs, f.NotifyChannelsJSON[i])
}
}
if len(ancs) > 0 {
ginx.NewRender(c).Message("All users are missing notify channel configurations. Please check for missing tokens (each channel should be configured with at least one user). %s", ancs)
return
}
}
ginx.NewRender(c).Message("")
}

View File

@@ -8,27 +8,52 @@ import (
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
// Return all, front-end search and paging
func (rt *Router) alertSubscribeGets(c *gin.Context) {
bgid := ginx.UrlParamInt64(c, "id")
lst, err := models.AlertSubscribeGets(rt.Ctx, bgid)
if err == nil {
ugcache := make(map[int64]*models.UserGroup)
for i := 0; i < len(lst); i++ {
ginx.Dangerous(lst[i].FillUserGroups(rt.Ctx, ugcache))
}
ginx.Dangerous(err)
rulecache := make(map[int64]string)
for i := 0; i < len(lst); i++ {
ginx.Dangerous(lst[i].FillRuleName(rt.Ctx, rulecache))
}
ugcache := make(map[int64]*models.UserGroup)
rulecache := make(map[int64]string)
for i := 0; i < len(lst); i++ {
ginx.Dangerous(lst[i].FillDatasourceIds(rt.Ctx))
}
for i := 0; i < len(lst); i++ {
ginx.Dangerous(lst[i].FillUserGroups(rt.Ctx, ugcache))
ginx.Dangerous(lst[i].FillRuleName(rt.Ctx, rulecache))
ginx.Dangerous(lst[i].FillDatasourceIds(rt.Ctx))
ginx.Dangerous(lst[i].DB2FE())
}
ginx.NewRender(c).Data(lst, err)
}
func (rt *Router) alertSubscribeGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
lst, err := models.AlertSubscribeGetsByBGIds(rt.Ctx, gids)
ginx.Dangerous(err)
ugcache := make(map[int64]*models.UserGroup)
rulecache := make(map[int64]string)
for i := 0; i < len(lst); i++ {
ginx.Dangerous(lst[i].FillUserGroups(rt.Ctx, ugcache))
ginx.Dangerous(lst[i].FillRuleName(rt.Ctx, rulecache))
ginx.Dangerous(lst[i].FillDatasourceIds(rt.Ctx))
ginx.Dangerous(lst[i].DB2FE())
}
ginx.NewRender(c).Data(lst, err)
}
@@ -83,6 +108,9 @@ func (rt *Router) alertSubscribePut(c *gin.Context) {
rt.Ctx,
"name",
"disabled",
"prod",
"cate",
"datasource_ids",
"cluster",
"rule_id",
"tags",
@@ -96,7 +124,10 @@ func (rt *Router) alertSubscribePut(c *gin.Context) {
"webhooks",
"for_duration",
"redefine_webhooks",
"datasource_ids",
"severities",
"extra_config",
"busi_groups",
"note",
))
}

View File

@@ -9,6 +9,7 @@ import (
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
type boardForm struct {
@@ -207,12 +208,24 @@ func (rt *Router) boardGets(c *gin.Context) {
ginx.NewRender(c).Data(boards, err)
}
func (rt *Router) boardGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
query := ginx.QueryStr(c, "query", "")
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
boards, err := models.BoardGetsByBGIds(rt.Ctx, gids, query)
ginx.NewRender(c).Data(boards, err)
}
func (rt *Router) boardClone(c *gin.Context) {
me := c.MustGet("user").(*models.User)
bo := rt.Board(ginx.UrlParamInt64(c, "bid"))
newBoard := &models.Board{
Name: bo.Name + " Copy",
Name: bo.Name + " Cloned",
Tags: bo.Tags,
GroupId: bo.GroupId,
CreateBy: me.Username,

View File

@@ -78,7 +78,7 @@ func (rt *Router) builtinBoardCateGets(c *gin.Context) {
}
me := c.MustGet("user").(*models.User)
buildinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
builtinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
if err != nil {
logger.Warningf("get builtin favorites fail: %v", err)
}
@@ -91,6 +91,9 @@ func (rt *Router) builtinBoardCateGets(c *gin.Context) {
boardCate.Name = dir
files, err := file.FilesUnder(fp + "/" + dir + "/dashboards")
ginx.Dangerous(err)
if len(files) == 0 {
continue
}
var boards []Payload
for _, f := range files {
@@ -114,7 +117,7 @@ func (rt *Router) builtinBoardCateGets(c *gin.Context) {
}
boardCate.Boards = boards
if _, ok := buildinFavoritesMap[dir]; ok {
if _, ok := builtinFavoritesMap[dir]; ok {
boardCate.Favorite = true
}
@@ -170,7 +173,7 @@ func (rt *Router) builtinAlertCateGets(c *gin.Context) {
}
me := c.MustGet("user").(*models.User)
buildinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
builtinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
if err != nil {
logger.Warningf("get builtin favorites fail: %v", err)
}
@@ -207,7 +210,7 @@ func (rt *Router) builtinAlertCateGets(c *gin.Context) {
alertCate.IconUrl = fmt.Sprintf("/api/n9e/integrations/icon/%s/%s", dir, iconFiles[0])
}
if _, ok := buildinFavoritesMap[dir]; ok {
if _, ok := builtinFavoritesMap[dir]; ok {
alertCate.Favorite = true
}
@@ -230,7 +233,7 @@ func (rt *Router) builtinAlertRules(c *gin.Context) {
}
me := c.MustGet("user").(*models.User)
buildinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
builtinFavoritesMap, err := models.BuiltinCateGetByUserId(rt.Ctx, me.Id)
if err != nil {
logger.Warningf("get builtin favorites fail: %v", err)
}
@@ -243,6 +246,9 @@ func (rt *Router) builtinAlertRules(c *gin.Context) {
alertCate.Name = dir
files, err := file.FilesUnder(fp + "/" + dir + "/alerts")
ginx.Dangerous(err)
if len(files) == 0 {
continue
}
alertRules := make(map[string][]models.AlertRule)
for _, f := range files {
@@ -268,7 +274,7 @@ func (rt *Router) builtinAlertRules(c *gin.Context) {
alertCate.IconUrl = fmt.Sprintf("/api/n9e/integrations/icon/%s/%s", dir, iconFiles[0])
}
if _, ok := buildinFavoritesMap[dir]; ok {
if _, ok := builtinFavoritesMap[dir]; ok {
alertCate.Favorite = true
}
@@ -309,3 +315,26 @@ func (rt *Router) builtinIcon(c *gin.Context) {
iconPath := fp + "/" + cate + "/icon/" + ginx.UrlParamStr(c, "name")
c.File(path.Join(iconPath))
}
func (rt *Router) builtinMarkdown(c *gin.Context) {
fp := rt.Center.BuiltinIntegrationsDir
if fp == "" {
fp = path.Join(runner.Cwd, "integrations")
}
cate := ginx.UrlParamStr(c, "cate")
var markdown []byte
markdownDir := fp + "/" + cate + "/markdown"
markdownFiles, err := file.FilesUnder(markdownDir)
if err != nil {
logger.Warningf("get markdown fail: %v", err)
} else if len(markdownFiles) > 0 {
f := markdownFiles[0]
fn := markdownDir + "/" + f
markdown, err = file.ReadBytes(fn)
if err != nil {
logger.Warningf("get collect fail: %v", err)
}
}
ginx.NewRender(c).Data(string(markdown), nil)
}

View File

@@ -0,0 +1,114 @@
package router
import (
"context"
"time"
"github.com/ccfos/nightingale/v6/storage"
"github.com/gin-gonic/gin"
captcha "github.com/mojocn/base64Captcha"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
type CaptchaRedisStore struct {
redis storage.Redis
}
func (s *CaptchaRedisStore) Set(id string, value string) error {
ctx := context.Background()
err := s.redis.Set(ctx, id, value, time.Duration(300*time.Second)).Err()
if err != nil {
logger.Errorf("captcha id set to redis error : %s", err.Error())
return err
}
return nil
}
func (s *CaptchaRedisStore) Get(id string, clear bool) string {
ctx := context.Background()
val, err := s.redis.Get(ctx, id).Result()
if err != nil {
logger.Errorf("captcha id get from redis error : %s", err.Error())
return ""
}
if clear {
s.redis.Del(ctx, id)
}
return val
}
func (s *CaptchaRedisStore) Verify(id, answer string, clear bool) bool {
old := s.Get(id, clear)
return old == answer
}
func (rt *Router) newCaptchaRedisStore() *CaptchaRedisStore {
if captchaStore == nil {
captchaStore = &CaptchaRedisStore{redis: rt.Redis}
}
return captchaStore
}
var captchaStore *CaptchaRedisStore
type CaptchaReqBody struct {
Id string
VerifyValue string
}
// 生成图形验证码
func (rt *Router) generateCaptcha(c *gin.Context) {
var driver = captcha.NewDriverMath(60, 200, 0, captcha.OptionShowHollowLine, nil, nil, []string{"wqy-microhei.ttc"})
cc := captcha.NewCaptcha(driver, rt.newCaptchaRedisStore())
//data:image/png;base64
id, b64s, err := cc.Generate()
if err != nil {
ginx.NewRender(c).Message(err)
return
}
ginx.NewRender(c).Data(gin.H{
"imgdata": b64s,
"captchaid": id,
}, nil)
}
// 验证
func (rt *Router) captchaVerify(c *gin.Context) {
var param CaptchaReqBody
ginx.BindJSON(c, &param)
//verify the captcha
if captchaStore.Verify(param.Id, param.VerifyValue, true) {
ginx.NewRender(c).Message("")
return
}
ginx.NewRender(c).Message("incorrect verification code")
}
// 验证码开关
func (rt *Router) ifShowCaptcha(c *gin.Context) {
if rt.HTTP.ShowCaptcha.Enable {
ginx.NewRender(c).Data(gin.H{
"show": true,
}, nil)
return
}
ginx.NewRender(c).Data(gin.H{
"show": false,
}, nil)
}
// 验证
func CaptchaVerify(id string, value string) bool {
//verify the captcha
return captchaStore.Verify(id, value, true)
}

View File

@@ -62,3 +62,8 @@ func (rt *Router) contactKeysGets(c *gin.Context) {
ginx.NewRender(c).Data(labelAndKeys, nil)
}
func (rt *Router) siteInfo(c *gin.Context) {
config, err := models.ConfigsGet(rt.Ctx, "site_info")
ginx.NewRender(c).Data(config, err)
}

View File

@@ -2,6 +2,7 @@ package router
import (
"github.com/ccfos/nightingale/v6/models"
"time"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
@@ -25,28 +26,49 @@ func (rt *Router) configGetByKey(c *gin.Context) {
ginx.NewRender(c).Data(config, err)
}
func (rt *Router) configPutByKey(c *gin.Context) {
var f models.Configs
ginx.BindJSON(c, &f)
username := c.MustGet("username").(string)
ginx.NewRender(c).Message(models.ConfigsSetWithUname(rt.Ctx, f.Ckey, f.Cval, username))
}
func (rt *Router) configsDel(c *gin.Context) {
var f idsForm
ginx.BindJSON(c, &f)
ginx.NewRender(c).Message(models.ConfigsDel(rt.Ctx, f.Ids))
}
func (rt *Router) configsPut(c *gin.Context) {
func (rt *Router) configsPut(c *gin.Context) { //for APIForService
var arr []models.Configs
ginx.BindJSON(c, &arr)
username := c.GetString("user")
if username == "" {
username = "default"
}
now := time.Now().Unix()
for i := 0; i < len(arr); i++ {
arr[i].UpdateBy = username
arr[i].UpdateAt = now
ginx.Dangerous(arr[i].Update(rt.Ctx))
}
ginx.NewRender(c).Message(nil)
}
func (rt *Router) configsPost(c *gin.Context) {
func (rt *Router) configsPost(c *gin.Context) { //for APIForService
var arr []models.Configs
ginx.BindJSON(c, &arr)
username := c.GetString("user")
if username == "" {
username = "default"
}
now := time.Now().Unix()
for i := 0; i < len(arr); i++ {
arr[i].CreateBy = username
arr[i].UpdateBy = username
arr[i].CreateAt = now
arr[i].UpdateAt = now
ginx.Dangerous(arr[i].Add(rt.Ctx))
}

View File

@@ -3,6 +3,7 @@ package router
import (
"crypto/tls"
"fmt"
"io"
"net/http"
"net/url"
"strings"
@@ -25,6 +26,11 @@ type listReq struct {
}
func (rt *Router) datasourceList(c *gin.Context) {
if rt.DatasourceCache.DatasourceCheckHook(c) {
Render(c, []int{}, nil)
return
}
var req listReq
ginx.BindJSON(c, &req)
@@ -32,8 +38,10 @@ func (rt *Router) datasourceList(c *gin.Context) {
category := req.Category
name := req.Name
user := c.MustGet("user").(*models.User)
list, err := models.GetDatasourcesGetsBy(rt.Ctx, typ, category, name, "")
Render(c, list, err)
Render(c, rt.DatasourceCache.DatasourceFilter(list, user), err)
}
func (rt *Router) datasourceGetsByService(c *gin.Context) {
@@ -42,29 +50,33 @@ func (rt *Router) datasourceGetsByService(c *gin.Context) {
ginx.NewRender(c).Data(lst, err)
}
type datasourceBrief struct {
Id int64 `json:"id"`
Name string `json:"name"`
PluginType string `json:"plugin_type"`
}
func (rt *Router) datasourceBriefs(c *gin.Context) {
var dss []datasourceBrief
var dss []*models.Datasource
list, err := models.GetDatasourcesGetsBy(rt.Ctx, "", "", "", "")
ginx.Dangerous(err)
for i := range list {
dss = append(dss, datasourceBrief{
dss = append(dss, &models.Datasource{
Id: list[i].Id,
Name: list[i].Name,
PluginType: list[i].PluginType,
})
}
if !rt.Center.AnonymousAccess.PromQuerier {
user := c.MustGet("user").(*models.User)
dss = rt.DatasourceCache.DatasourceFilter(dss, user)
}
ginx.NewRender(c).Data(dss, err)
}
func (rt *Router) datasourceUpsert(c *gin.Context) {
if rt.DatasourceCache.DatasourceCheckHook(c) {
Render(c, []int{}, nil)
return
}
var req models.Datasource
ginx.BindJSON(c, &req)
username := Username(c)
@@ -94,7 +106,7 @@ func (rt *Router) datasourceUpsert(c *gin.Context) {
}
err = req.Add(rt.Ctx)
} else {
err = req.Update(rt.Ctx, "name", "description", "cluster_name", "settings", "http", "auth", "updated_by", "updated_at")
err = req.Update(rt.Ctx, "name", "description", "cluster_name", "settings", "http", "auth", "updated_by", "updated_at", "is_default")
}
Render(c, nil, err)
@@ -105,6 +117,10 @@ func DatasourceCheck(ds models.Datasource) error {
return fmt.Errorf("url is empty")
}
if !strings.HasPrefix(ds.HTTPJson.Url, "http") {
return fmt.Errorf("url must start with http or https")
}
client := &http.Client{
Transport: &http.Transport{
TLSClientConfig: &tls.Config{
@@ -123,16 +139,33 @@ func DatasourceCheck(ds models.Datasource) error {
if ds.PluginType == models.PROMETHEUS {
subPath := "/api/v1/query"
query := url.Values{}
if strings.Contains(fullURL, "loki") {
if ds.HTTPJson.IsLoki() {
subPath = "/api/v1/labels"
query.Add("start", "1")
query.Add("end", "2")
} else {
query.Add("query", "1+1")
}
fullURL = fmt.Sprintf("%s%s?%s", ds.HTTPJson.Url, subPath, query.Encode())
req, err = http.NewRequest("POST", fullURL, nil)
req, err = http.NewRequest("GET", fullURL, nil)
if err != nil {
logger.Errorf("Error creating request: %v", err)
return fmt.Errorf("request url:%s failed", fullURL)
}
} else if ds.PluginType == models.TDENGINE {
fullURL = fmt.Sprintf("%s/rest/sql", ds.HTTPJson.Url)
req, err = http.NewRequest("POST", fullURL, strings.NewReader("show databases"))
if err != nil {
logger.Errorf("Error creating request: %v", err)
return fmt.Errorf("request url:%s failed", fullURL)
}
}
if ds.PluginType == models.LOKI {
subPath := "/api/v1/labels"
fullURL = fmt.Sprintf("%s%s", ds.HTTPJson.Url, subPath)
req, err = http.NewRequest("GET", fullURL, nil)
if err != nil {
logger.Errorf("Error creating request: %v", err)
return fmt.Errorf("request url:%s failed", fullURL)
@@ -156,13 +189,19 @@ func DatasourceCheck(ds models.Datasource) error {
if resp.StatusCode != 200 {
logger.Errorf("Error making request: %v\n", resp.StatusCode)
return fmt.Errorf("request url:%s failed code:%d", fullURL, resp.StatusCode)
body, _ := io.ReadAll(resp.Body)
return fmt.Errorf("request url:%s failed code:%d body:%s", fullURL, resp.StatusCode, string(body))
}
return nil
}
func (rt *Router) datasourceGet(c *gin.Context) {
if rt.DatasourceCache.DatasourceCheckHook(c) {
Render(c, []int{}, nil)
return
}
var req models.Datasource
ginx.BindJSON(c, &req)
err := req.Get(rt.Ctx)
@@ -170,6 +209,11 @@ func (rt *Router) datasourceGet(c *gin.Context) {
}
func (rt *Router) datasourceUpdataStatus(c *gin.Context) {
if rt.DatasourceCache.DatasourceCheckHook(c) {
Render(c, []int{}, nil)
return
}
var req models.Datasource
ginx.BindJSON(c, &req)
username := Username(c)
@@ -179,6 +223,11 @@ func (rt *Router) datasourceUpdataStatus(c *gin.Context) {
}
func (rt *Router) datasourceDel(c *gin.Context) {
if rt.DatasourceCache.DatasourceCheckHook(c) {
Render(c, []int{}, nil)
return
}
var ids []int64
ginx.BindJSON(c, &ids)
err := models.DatasourceDel(rt.Ctx, ids)

View File

@@ -0,0 +1,81 @@
package router
import (
"net/http"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
// 创建 ES Index Pattern
func (rt *Router) esIndexPatternAdd(c *gin.Context) {
var f models.EsIndexPattern
ginx.BindJSON(c, &f)
username := c.MustGet("username").(string)
now := time.Now().Unix()
f.CreateAt = now
f.CreateBy = username
f.UpdateAt = now
f.UpdateBy = username
err := f.Add(rt.Ctx)
ginx.NewRender(c).Message(err)
}
// 更新 ES Index Pattern
func (rt *Router) esIndexPatternPut(c *gin.Context) {
var f models.EsIndexPattern
ginx.BindJSON(c, &f)
id := ginx.QueryInt64(c, "id")
esIndexPattern, err := models.EsIndexPatternGetById(rt.Ctx, id)
ginx.Dangerous(err)
if esIndexPattern == nil {
ginx.NewRender(c, http.StatusNotFound).Message("No such EsIndexPattern")
return
}
f.UpdateBy = c.MustGet("username").(string)
ginx.NewRender(c).Message(esIndexPattern.Update(rt.Ctx, f))
}
// 删除 ES Index Pattern
func (rt *Router) esIndexPatternDel(c *gin.Context) {
var f idsForm
ginx.BindJSON(c, &f)
if len(f.Ids) == 0 {
ginx.Bomb(http.StatusBadRequest, "ids empty")
}
ginx.NewRender(c).Message(models.EsIndexPatternDel(rt.Ctx, f.Ids))
}
// ES Index Pattern列表
func (rt *Router) esIndexPatternGetList(c *gin.Context) {
datasourceId := ginx.QueryInt64(c, "datasource_id", 0)
var lst []*models.EsIndexPattern
var err error
if datasourceId != 0 {
lst, err = models.EsIndexPatternGets(rt.Ctx, "datasource_id = ?", datasourceId)
} else {
lst, err = models.EsIndexPatternGets(rt.Ctx, "")
}
ginx.NewRender(c).Data(lst, err)
}
// ES Index Pattern 单个数据
func (rt *Router) esIndexPatternGet(c *gin.Context) {
id := ginx.QueryInt64(c, "id")
item, err := models.EsIndexPatternGet(rt.Ctx, "id=?", id)
ginx.NewRender(c).Data(item, err)
}

View File

@@ -44,6 +44,10 @@ func (rt *Router) statistic(c *gin.Context) {
statistics, err = models.DatasourceStatistics(rt.Ctx)
ginx.NewRender(c).Data(statistics, err)
return
case "user_variable":
statistics, err = models.ConfigsUserVariableStatistics(rt.Ctx)
ginx.NewRender(c).Data(statistics, err)
return
default:
ginx.Bomb(http.StatusBadRequest, "invalid name")
}

View File

@@ -4,12 +4,15 @@ import (
"compress/gzip"
"encoding/json"
"io/ioutil"
"sort"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
func (rt *Router) heartbeat(c *gin.Context) {
@@ -35,17 +38,52 @@ func (rt *Router) heartbeat(c *gin.Context) {
err = json.Unmarshal(bs, &req)
ginx.Dangerous(err)
req.Offset = (time.Now().UnixMilli() - req.UnixTime)
req.RemoteAddr = c.ClientIP()
// maybe from pushgw
if req.Offset == 0 {
req.Offset = (time.Now().UnixMilli() - req.UnixTime)
}
if req.RemoteAddr == "" {
req.RemoteAddr = c.ClientIP()
}
rt.MetaSet.Set(req.Hostname, req)
var items = make(map[string]struct{})
items[req.Hostname] = struct{}{}
rt.IdentSet.MSet(items)
gid := ginx.QueryInt64(c, "gid", 0)
if target, has := rt.TargetCache.Get(req.Hostname); has && target != nil {
gid := ginx.QueryInt64(c, "gid", 0)
hostIp := strings.TrimSpace(req.HostIp)
if gid != 0 {
target, has := rt.TargetCache.Get(req.Hostname)
if has && target.GroupId != gid {
err = models.TargetUpdateBgid(rt.Ctx, []string{req.Hostname}, gid, false)
filed := make(map[string]interface{})
if gid != 0 && gid != target.GroupId {
filed["group_id"] = gid
}
if hostIp != "" && hostIp != target.HostIp {
filed["host_ip"] = hostIp
}
if len(req.GlobalLabels) > 0 {
lst := []string{}
for k, v := range req.GlobalLabels {
lst = append(lst, k+"="+v)
}
sort.Strings(lst)
labels := strings.Join(lst, " ")
if target.Tags != labels {
filed["tags"] = labels
}
}
if len(filed) > 0 {
err := target.UpdateFieldsMap(rt.Ctx, filed)
if err != nil {
logger.Errorf("update target fields failed, err: %v", err)
}
}
logger.Debugf("heartbeat field:%+v target: %v", filed, *target)
}
ginx.NewRender(c).Message(err)

View File

@@ -1,6 +1,7 @@
package router
import (
"encoding/base64"
"fmt"
"net/http"
"strconv"
@@ -12,6 +13,7 @@ import (
"github.com/ccfos/nightingale/v6/pkg/ldapx"
"github.com/ccfos/nightingale/v6/pkg/oauth2x"
"github.com/ccfos/nightingale/v6/pkg/oidcx"
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/pelletier/go-toml/v2"
"github.com/dgrijalva/jwt-go"
@@ -21,20 +23,40 @@ import (
)
type loginForm struct {
Username string `json:"username" binding:"required"`
Password string `json:"password" binding:"required"`
Username string `json:"username" binding:"required"`
Password string `json:"password" binding:"required"`
Captchaid string `json:"captchaid"`
Verifyvalue string `json:"verifyvalue"`
}
func (rt *Router) loginPost(c *gin.Context) {
var f loginForm
ginx.BindJSON(c, &f)
logger.Infof("username:%s login from:%s", f.Username, c.ClientIP())
user, err := models.PassLogin(rt.Ctx, f.Username, f.Password)
if rt.HTTP.ShowCaptcha.Enable {
if !CaptchaVerify(f.Captchaid, f.Verifyvalue) {
ginx.NewRender(c).Message("incorrect verification code")
return
}
}
authPassWord := f.Password
// need decode
if rt.HTTP.RSA.OpenRSA {
decPassWord, err := secu.Decrypt(f.Password, rt.HTTP.RSA.RSAPrivateKey, rt.HTTP.RSA.RSAPassWord)
if err != nil {
logger.Errorf("RSA Decrypt failed: %v username: %s", err, f.Username)
ginx.NewRender(c).Message(err)
return
}
authPassWord = decPassWord
}
user, err := models.PassLogin(rt.Ctx, f.Username, authPassWord)
if err != nil {
// pass validate fail, try ldap
if rt.Sso.LDAP.Enable {
roles := strings.Join(rt.Sso.LDAP.DefaultRoles, " ")
user, err = models.LdapLogin(rt.Ctx, f.Username, f.Password, roles, rt.Sso.LDAP)
user, err = models.LdapLogin(rt.Ctx, f.Username, authPassWord, roles, rt.Sso.LDAP)
if err != nil {
logger.Debugf("ldap login failed: %v username: %s", err, f.Username)
ginx.NewRender(c).Message(err)
@@ -67,6 +89,7 @@ func (rt *Router) loginPost(c *gin.Context) {
}
func (rt *Router) logoutPost(c *gin.Context) {
logger.Infof("username:%s login from:%s", c.GetString("username"), c.ClientIP())
metadata, err := rt.extractTokenMetadata(c.Request)
if err != nil {
ginx.NewRender(c, http.StatusBadRequest).Message("failed to parse jwt token")
@@ -207,7 +230,7 @@ func (rt *Router) loginCallback(c *gin.Context) {
ret, err := rt.Sso.OIDC.Callback(rt.Redis, c.Request.Context(), code, state)
if err != nil {
logger.Debugf("sso.callback() get ret %+v error %v", ret, err)
logger.Errorf("sso_callback fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
ginx.NewRender(c).Data(CallbackOutput{}, err)
return
}
@@ -492,10 +515,23 @@ type SsoConfigOutput struct {
}
func (rt *Router) ssoConfigNameGet(c *gin.Context) {
var oidcDisplayName, casDisplayName, oauthDisplayName string
if rt.Sso.OIDC != nil {
oidcDisplayName = rt.Sso.OIDC.GetDisplayName()
}
if rt.Sso.CAS != nil {
casDisplayName = rt.Sso.CAS.GetDisplayName()
}
if rt.Sso.OAuth2 != nil {
oauthDisplayName = rt.Sso.OAuth2.GetDisplayName()
}
ginx.NewRender(c).Data(SsoConfigOutput{
OidcDisplayName: rt.Sso.OIDC.GetDisplayName(),
CasDisplayName: rt.Sso.CAS.GetDisplayName(),
OauthDisplayName: rt.Sso.OAuth2.GetDisplayName(),
OidcDisplayName: oidcDisplayName,
CasDisplayName: casDisplayName,
OauthDisplayName: oauthDisplayName,
}, nil)
}
@@ -520,8 +556,7 @@ func (rt *Router) ssoConfigUpdate(c *gin.Context) {
var config oidcx.Config
err := toml.Unmarshal([]byte(f.Content), &config)
ginx.Dangerous(err)
err = rt.Sso.OIDC.Reload(config)
rt.Sso.OIDC, err = oidcx.New(config)
ginx.Dangerous(err)
case "CAS":
var config cas.Config
@@ -537,3 +572,19 @@ func (rt *Router) ssoConfigUpdate(c *gin.Context) {
ginx.NewRender(c).Message(nil)
}
type RSAConfigOutput struct {
OpenRSA bool
RSAPublicKey string
}
func (rt *Router) rsaConfigGet(c *gin.Context) {
publicKey := ""
if len(rt.HTTP.RSA.RSAPublicKey) > 0 {
publicKey = base64.StdEncoding.EncodeToString(rt.HTTP.RSA.RSAPublicKey)
}
ginx.NewRender(c).Data(RSAConfigOutput{
OpenRSA: rt.HTTP.RSA.OpenRSA,
RSAPublicKey: publicKey,
}, nil)
}

View File

@@ -5,10 +5,12 @@ import (
"strings"
"time"
"github.com/ccfos/nightingale/v6/alert/common"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
// Return all, front-end search and paging
@@ -19,6 +21,22 @@ func (rt *Router) alertMuteGetsByBG(c *gin.Context) {
ginx.NewRender(c).Data(lst, err)
}
func (rt *Router) alertMuteGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
lst, err := models.AlertMuteGetsByBGIds(rt.Ctx, gids)
ginx.NewRender(c).Data(lst, err)
}
func (rt *Router) alertMuteGets(c *gin.Context) {
prods := strings.Fields(ginx.QueryStr(c, "prods", ""))
bgid := ginx.QueryInt64(c, "bgid", -1)
@@ -29,16 +47,41 @@ func (rt *Router) alertMuteGets(c *gin.Context) {
}
func (rt *Router) alertMuteAdd(c *gin.Context) {
var f models.AlertMute
ginx.BindJSON(c, &f)
username := c.MustGet("username").(string)
f.CreateBy = username
f.GroupId = ginx.UrlParamInt64(c, "id")
ginx.NewRender(c).Message(f.Add(rt.Ctx))
}
// Preview events (alert_cur_event) that match the mute strategy based on the following criteria:
// business group ID (group_id, group_id), product (prod, rule_prod),
// alert event severity (severities, severity), and event tags (tags, tags).
// For products of type not 'host', also consider the category (cate, cate) and datasource ID (datasource_ids, datasource_id).
func (rt *Router) alertMutePreview(c *gin.Context) {
//Generally the match of events would be less.
var f models.AlertMute
ginx.BindJSON(c, &f)
f.GroupId = ginx.UrlParamInt64(c, "id")
ginx.Dangerous(f.Verify()) //verify and parse tags json to ITags
events, err := models.AlertCurEventGetsFromAlertMute(rt.Ctx, &f)
ginx.Dangerous(err)
matchEvents := make([]*models.AlertCurEvent, 0, len(events))
for i := 0; i < len(events); i++ {
events[i].DB2Mem()
if common.MatchTags(events[i].TagsMap, f.ITags) {
matchEvents = append(matchEvents, events[i])
}
}
ginx.NewRender(c).Data(matchEvents, err)
}
func (rt *Router) alertMuteAddByService(c *gin.Context) {
var f models.AlertMute
ginx.BindJSON(c, &f)

View File

@@ -2,15 +2,19 @@ package router
import (
"encoding/json"
"fmt"
"strings"
"github.com/ccfos/nightingale/v6/alert/aconf"
"github.com/ccfos/nightingale/v6/alert/sender"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
"github.com/pelletier/go-toml/v2"
"github.com/ccfos/nightingale/v6/pkg/tplx"
"github.com/gin-gonic/gin"
"github.com/pelletier/go-toml/v2"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
func (rt *Router) webhookGets(c *gin.Context) {
@@ -41,8 +45,8 @@ func (rt *Router) webhookPuts(c *gin.Context) {
data, err := json.Marshal(webhooks)
ginx.Dangerous(err)
ginx.NewRender(c).Message(models.ConfigsSet(rt.Ctx, models.WEBHOOKKEY, string(data)))
username := c.MustGet("username").(string)
ginx.NewRender(c).Message(models.ConfigsSetWithUname(rt.Ctx, models.WEBHOOKKEY, string(data), username))
}
func (rt *Router) notifyScriptGet(c *gin.Context) {
@@ -65,8 +69,8 @@ func (rt *Router) notifyScriptPut(c *gin.Context) {
data, err := json.Marshal(notifyScript)
ginx.Dangerous(err)
ginx.NewRender(c).Message(models.ConfigsSet(rt.Ctx, models.NOTIFYSCRIPT, string(data)))
username := c.MustGet("username").(string)
ginx.NewRender(c).Message(models.ConfigsSetWithUname(rt.Ctx, models.NOTIFYSCRIPT, string(data), username))
}
func (rt *Router) notifyChannelGets(c *gin.Context) {
@@ -101,8 +105,8 @@ func (rt *Router) notifyChannelPuts(c *gin.Context) {
data, err := json.Marshal(notifyChannels)
ginx.Dangerous(err)
ginx.NewRender(c).Message(models.ConfigsSet(rt.Ctx, models.NOTIFYCHANNEL, string(data)))
username := c.MustGet("username").(string)
ginx.NewRender(c).Message(models.ConfigsSetWithUname(rt.Ctx, models.NOTIFYCHANNEL, string(data), username))
}
func (rt *Router) notifyContactGets(c *gin.Context) {
@@ -137,8 +141,8 @@ func (rt *Router) notifyContactPuts(c *gin.Context) {
data, err := json.Marshal(notifyContacts)
ginx.Dangerous(err)
ginx.NewRender(c).Message(models.ConfigsSet(rt.Ctx, models.NOTIFYCONTACT, string(data)))
username := c.MustGet("username").(string)
ginx.NewRender(c).Message(models.ConfigsSetWithUname(rt.Ctx, models.NOTIFYCONTACT, string(data), username))
}
func (rt *Router) notifyConfigGet(c *gin.Context) {
@@ -158,10 +162,12 @@ func (rt *Router) notifyConfigGet(c *gin.Context) {
func (rt *Router) notifyConfigPut(c *gin.Context) {
var f models.Configs
ginx.BindJSON(c, &f)
userVariableMap := rt.NotifyConfigCache.ConfigCache.Get()
text := tplx.ReplaceTemplateUseText(f.Ckey, f.Cval, userVariableMap)
switch f.Ckey {
case models.SMTP:
var smtp aconf.SMTPConfig
err := toml.Unmarshal([]byte(f.Cval), &smtp)
err := toml.Unmarshal([]byte(text), &smtp)
ginx.Dangerous(err)
case models.IBEX:
var ibex aconf.Ibex
@@ -170,24 +176,55 @@ func (rt *Router) notifyConfigPut(c *gin.Context) {
default:
ginx.Bomb(200, "key %s can not modify", f.Ckey)
}
err := models.ConfigsSet(rt.Ctx, f.Ckey, f.Cval)
if err != nil {
ginx.Bomb(200, err.Error())
}
username := c.MustGet("username").(string)
//insert or update build-in config
ginx.Dangerous(models.ConfigsSetWithUname(rt.Ctx, f.Ckey, f.Cval, username))
if f.Ckey == models.SMTP {
// 重置邮件发送器
var smtp aconf.SMTPConfig
err := toml.Unmarshal([]byte(f.Cval), &smtp)
ginx.Dangerous(err)
if smtp.Host == "" || smtp.Port == 0 {
ginx.Bomb(200, "smtp host or port can not be empty")
}
smtp, errSmtp := SmtpValidate(text)
ginx.Dangerous(errSmtp)
go sender.RestartEmailSender(smtp)
}
ginx.NewRender(c).Message(nil)
}
func SmtpValidate(text string) (aconf.SMTPConfig, error) {
var smtp aconf.SMTPConfig
var err error
err = toml.Unmarshal([]byte(text), &smtp)
if err != nil {
return smtp, err
}
if smtp.Host == "" || smtp.Port == 0 {
return smtp, fmt.Errorf("smtp host or port can not be empty")
}
return smtp, err
}
type form struct {
models.Configs
Email string `json:"email"`
}
// After configuring the aconf.SMTPConfig, users can choose to perform a test. In this test, the function attempts to send an email
func (rt *Router) attemptSendEmail(c *gin.Context) {
var f form
ginx.BindJSON(c, &f)
if f.Email = strings.TrimSpace(f.Email); f.Email == "" || !str.IsMail(f.Email) {
ginx.Bomb(200, "email(%s) invalid", f.Email)
}
if f.Ckey != models.SMTP {
ginx.Bomb(200, "config(%v) invalid", f)
}
userVariableMap := rt.NotifyConfigCache.ConfigCache.Get()
text := tplx.ReplaceTemplateUseText(f.Ckey, f.Cval, userVariableMap)
smtp, err := SmtpValidate(text)
ginx.Dangerous(err)
ginx.NewRender(c).Message(sender.SendEmail("Email test", "email content", []string{f.Email}, smtp))
}

View File

@@ -10,12 +10,25 @@ import (
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/tplx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
func (rt *Router) notifyTplGets(c *gin.Context) {
m := make(map[string]struct{})
for _, channel := range models.DefaultChannels {
m[channel] = struct{}{}
}
m[models.EmailSubject] = struct{}{}
lst, err := models.NotifyTplGets(rt.Ctx)
for i := 0; i < len(lst); i++ {
if _, exists := m[lst[i].Channel]; exists {
lst[i].BuiltIn = true
}
}
ginx.NewRender(c).Data(lst, err)
}
@@ -23,11 +36,7 @@ func (rt *Router) notifyTplGets(c *gin.Context) {
func (rt *Router) notifyTplUpdateContent(c *gin.Context) {
var f models.NotifyTpl
ginx.BindJSON(c, &f)
if err := templateValidate(f); err != nil {
ginx.NewRender(c).Message(err.Error())
return
}
ginx.Dangerous(templateValidate(f))
ginx.NewRender(c).Message(f.UpdateContent(rt.Ctx))
}
@@ -35,16 +44,28 @@ func (rt *Router) notifyTplUpdateContent(c *gin.Context) {
func (rt *Router) notifyTplUpdate(c *gin.Context) {
var f models.NotifyTpl
ginx.BindJSON(c, &f)
if err := templateValidate(f); err != nil {
ginx.NewRender(c).Message(err.Error())
return
}
ginx.Dangerous(templateValidate(f))
ginx.NewRender(c).Message(f.Update(rt.Ctx))
}
func templateValidate(f models.NotifyTpl) error {
if len(f.Channel) > 32 {
return fmt.Errorf("channel length should not exceed 32")
}
if str.Dangerous(f.Channel) {
return fmt.Errorf("channel should not contain dangerous characters")
}
if len(f.Name) > 255 {
return fmt.Errorf("name length should not exceed 255")
}
if str.Dangerous(f.Name) {
return fmt.Errorf("name should not contain dangerous characters")
}
if f.Content == "" {
return nil
}
@@ -65,10 +86,7 @@ func templateValidate(f models.NotifyTpl) error {
func (rt *Router) notifyTplPreview(c *gin.Context) {
var event models.AlertCurEvent
err := json.Unmarshal([]byte(cconf.EVENT_EXAMPLE), &event)
if err != nil {
ginx.NewRender(c).Message(err.Error())
return
}
ginx.Dangerous(err)
var f models.NotifyTpl
ginx.BindJSON(c, &f)
@@ -106,3 +124,25 @@ func (rt *Router) notifyTplPreview(c *gin.Context) {
ginx.NewRender(c).Data(ret, nil)
}
// add new notify template
func (rt *Router) notifyTplAdd(c *gin.Context) {
var f models.NotifyTpl
ginx.BindJSON(c, &f)
f.Channel = strings.TrimSpace(f.Channel)
ginx.Dangerous(templateValidate(f))
count, err := models.NotifyTplCountByChannel(rt.Ctx, f.Channel)
ginx.Dangerous(err)
if count != 0 {
ginx.Bomb(200, "Refuse to create duplicate channel(unique)")
}
ginx.NewRender(c).Message(f.Create(rt.Ctx))
}
// delete notify template, not allowed to delete the system defaults(models.DefaultChannels)
func (rt *Router) notifyTplDel(c *gin.Context) {
f := new(models.NotifyTpl)
id := ginx.UrlParamInt64(c, "id")
ginx.NewRender(c).Message(f.NotifyTplDelete(rt.Ctx, id))
}

View File

@@ -3,39 +3,51 @@ package router
import (
"context"
"crypto/tls"
"fmt"
"net"
"net/http"
"net/http/httputil"
"net/url"
"strings"
"sync"
"time"
pkgprom "github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/ccfos/nightingale/v6/prom"
"github.com/gin-gonic/gin"
"github.com/prometheus/common/model"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
type queryFormItem struct {
type QueryFormItem struct {
Start int64 `json:"start" binding:"required"`
End int64 `json:"end" binding:"required"`
Step int64 `json:"step" binding:"required"`
Query string `json:"query" binding:"required"`
}
type batchQueryForm struct {
type BatchQueryForm struct {
DatasourceId int64 `json:"datasource_id" binding:"required"`
Queries []queryFormItem `json:"queries" binding:"required"`
Queries []QueryFormItem `json:"queries" binding:"required"`
}
func (rt *Router) promBatchQueryRange(c *gin.Context) {
var f batchQueryForm
var f BatchQueryForm
ginx.Dangerous(c.BindJSON(&f))
cli := rt.PromClients.GetCli(f.DatasourceId)
lst, err := PromBatchQueryRange(rt.PromClients, f)
ginx.NewRender(c).Data(lst, err)
}
func PromBatchQueryRange(pc *prom.PromClientMap, f BatchQueryForm) ([]model.Value, error) {
var lst []model.Value
cli := pc.GetCli(f.DatasourceId)
if cli == nil {
return lst, fmt.Errorf("no such datasource id: %d", f.DatasourceId)
}
for _, item := range f.Queries {
r := pkgprom.Range{
Start: time.Unix(item.Start, 0),
@@ -44,15 +56,16 @@ func (rt *Router) promBatchQueryRange(c *gin.Context) {
}
resp, _, err := cli.QueryRange(context.Background(), item.Query, r)
ginx.Dangerous(err)
if err != nil {
return lst, err
}
lst = append(lst, resp)
}
ginx.NewRender(c).Data(lst, nil)
return lst, nil
}
type batchInstantForm struct {
type BatchInstantForm struct {
DatasourceId int64 `json:"datasource_id" binding:"required"`
Queries []InstantFormItem `json:"queries" binding:"required"`
}
@@ -63,21 +76,31 @@ type InstantFormItem struct {
}
func (rt *Router) promBatchQueryInstant(c *gin.Context) {
var f batchInstantForm
var f BatchInstantForm
ginx.Dangerous(c.BindJSON(&f))
cli := rt.PromClients.GetCli(f.DatasourceId)
lst, err := PromBatchQueryInstant(rt.PromClients, f)
ginx.NewRender(c).Data(lst, err)
}
func PromBatchQueryInstant(pc *prom.PromClientMap, f BatchInstantForm) ([]model.Value, error) {
var lst []model.Value
cli := pc.GetCli(f.DatasourceId)
if cli == nil {
logger.Warningf("no such datasource id: %d", f.DatasourceId)
return lst, fmt.Errorf("no such datasource id: %d", f.DatasourceId)
}
for _, item := range f.Queries {
resp, _, err := cli.Query(context.Background(), item.Query, time.Unix(item.Time, 0))
ginx.Dangerous(err)
if err != nil {
return lst, err
}
lst = append(lst, resp)
}
ginx.NewRender(c).Data(lst, nil)
return lst, nil
}
func (rt *Router) dsProxy(c *gin.Context) {
@@ -139,21 +162,77 @@ func (rt *Router) dsProxy(c *gin.Context) {
http.Error(w, err.Error(), http.StatusBadGateway)
}
transport := &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: ds.HTTPJson.TLS.SkipTlsVerify},
Proxy: http.ProxyFromEnvironment,
DialContext: (&net.Dialer{
Timeout: time.Duration(ds.HTTPJson.DialTimeout) * time.Millisecond,
}).DialContext,
ResponseHeaderTimeout: time.Duration(ds.HTTPJson.Timeout) * time.Millisecond,
MaxIdleConnsPerHost: ds.HTTPJson.MaxIdleConnsPerHost,
transport, has := transportGet(dsId, ds.UpdatedAt)
if !has {
transport = &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: ds.HTTPJson.TLS.SkipTlsVerify},
Proxy: http.ProxyFromEnvironment,
DialContext: (&net.Dialer{
Timeout: time.Duration(ds.HTTPJson.DialTimeout) * time.Millisecond,
}).DialContext,
ResponseHeaderTimeout: time.Duration(ds.HTTPJson.Timeout) * time.Millisecond,
MaxIdleConnsPerHost: ds.HTTPJson.MaxIdleConnsPerHost,
}
transportPut(dsId, ds.UpdatedAt, transport)
}
modifyResponse := func(r *http.Response) error {
if r.StatusCode == http.StatusUnauthorized {
logger.Warningf("proxy path:%s unauthorized access ", c.Request.URL.Path)
return fmt.Errorf("unauthorized access")
}
return nil
}
proxy := &httputil.ReverseProxy{
Director: director,
Transport: transport,
ErrorHandler: errFunc,
Director: director,
Transport: transport,
ErrorHandler: errFunc,
ModifyResponse: modifyResponse,
}
proxy.ServeHTTP(c.Writer, c.Request)
}
var (
transports = map[int64]http.RoundTripper{}
updatedAts = map[int64]int64{}
transportsLock = &sync.Mutex{}
)
func transportGet(dsid, newUpdatedAt int64) (http.RoundTripper, bool) {
transportsLock.Lock()
defer transportsLock.Unlock()
tran, has := transports[dsid]
if !has {
return nil, false
}
oldUpdateAt, has := updatedAts[dsid]
if !has {
oldtran := tran.(*http.Transport)
oldtran.CloseIdleConnections()
delete(transports, dsid)
return nil, false
}
if oldUpdateAt != newUpdatedAt {
oldtran := tran.(*http.Transport)
oldtran.CloseIdleConnections()
delete(transports, dsid)
delete(updatedAts, dsid)
return nil, false
}
return tran, has
}
func transportPut(dsid, updatedat int64, tran http.RoundTripper) {
transportsLock.Lock()
transports[dsid] = tran
updatedAts[dsid] = updatedat
transportsLock.Unlock()
}

View File

@@ -11,6 +11,7 @@ import (
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
func (rt *Router) recordingRuleGets(c *gin.Context) {
@@ -19,6 +20,21 @@ func (rt *Router) recordingRuleGets(c *gin.Context) {
ginx.NewRender(c).Data(ars, err)
}
func (rt *Router) recordingRuleGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
ars, err := models.RecordingRuleGetsByBGIds(rt.Ctx, gids)
ginx.NewRender(c).Data(ars, err)
}
func (rt *Router) recordingRuleGetsByService(c *gin.Context) {
ars, err := models.RecordingRuleEnabledGets(rt.Ctx)
ginx.NewRender(c).Data(ars, err)

View File

@@ -4,6 +4,7 @@ import (
"net/http"
"strings"
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
@@ -17,6 +18,15 @@ func (rt *Router) rolesGets(c *gin.Context) {
func (rt *Router) permsGets(c *gin.Context) {
user := c.MustGet("user").(*models.User)
if user.IsAdmin() {
var lst []string
for _, ops := range cconf.Operations.Ops {
lst = append(lst, ops.Ops...)
}
ginx.NewRender(c).Data(lst, nil)
return
}
lst, err := models.OperationsOfRole(rt.Ctx, strings.Fields(user.Roles))
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -3,9 +3,11 @@ package router
import (
"net/http"
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)
func (rt *Router) operationOfRole(c *gin.Context) {
@@ -16,6 +18,15 @@ func (rt *Router) operationOfRole(c *gin.Context) {
ginx.Bomb(http.StatusOK, "role not found")
}
if role.Name == "Admin" {
var lst []string
for _, ops := range cconf.Operations.Ops {
lst = append(lst, ops.Ops...)
}
ginx.NewRender(c).Data(lst, nil)
return
}
ops, err := models.OperationsOfRole(rt.Ctx, []string{role.Name})
ginx.NewRender(c).Data(ops, err)
}
@@ -39,5 +50,11 @@ func (rt *Router) roleBindOperation(c *gin.Context) {
}
func (rt *Router) operations(c *gin.Context) {
ginx.NewRender(c).Data(rt.Operations.Ops, nil)
var ops []cconf.Ops
for _, v := range rt.Operations.Ops {
v.Cname = i18n.Sprintf(c.GetHeader("X-Language"), v.Cname)
ops = append(ops, v)
}
ginx.NewRender(c).Data(ops, nil)
}

View File

@@ -28,6 +28,12 @@ func (rt *Router) serverHeartbeat(c *gin.Context) {
func (rt *Router) serversActive(c *gin.Context) {
datasourceId := ginx.QueryInt64(c, "dsid")
engineName := ginx.QueryStr(c, "engine_name", "")
if engineName != "" {
servers, err := models.AlertingEngineGetsInstances(rt.Ctx, "engine_cluster = ? and clock > ?", engineName, time.Now().Unix()-30)
ginx.NewRender(c).Data(servers, err)
return
}
servers, err := models.AlertingEngineGetsInstances(rt.Ctx, "datasource_id = ? and clock > ?", datasourceId, time.Now().Unix()-30)
ginx.NewRender(c).Data(servers, err)

View File

@@ -15,6 +15,7 @@ import (
"github.com/prometheus/common/model"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/str"
)
type TargetQuery struct {
@@ -42,15 +43,32 @@ func (rt *Router) targetGetsByHostFilter(c *gin.Context) {
}
func (rt *Router) targetGets(c *gin.Context) {
bgid := ginx.QueryInt64(c, "bgid", -1)
bgids := str.IdsInt64(ginx.QueryStr(c, "gids", ""), ",")
query := ginx.QueryStr(c, "query", "")
limit := ginx.QueryInt(c, "limit", 30)
downtime := ginx.QueryInt64(c, "downtime", 0)
dsIds := queryDatasourceIds(c)
total, err := models.TargetTotal(rt.Ctx, bgid, dsIds, query)
var err error
if len(bgids) == 0 {
user := c.MustGet("user").(*models.User)
if !user.IsAdmin() {
// 如果是非 admin 用户,全部对象的情况,找到用户有权限的业务组
userGroupIds, err := models.MyGroupIds(rt.Ctx, user.Id)
ginx.Dangerous(err)
bgids, err = models.BusiGroupIds(rt.Ctx, userGroupIds)
ginx.Dangerous(err)
// 将未分配业务组的对象也加入到列表中
bgids = append(bgids, 0)
}
}
total, err := models.TargetTotal(rt.Ctx, bgids, dsIds, query, downtime)
ginx.Dangerous(err)
list, err := models.TargetGets(rt.Ctx, bgid, dsIds, query, limit, ginx.Offset(c, limit))
list, err := models.TargetGets(rt.Ctx, bgids, dsIds, query, downtime, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
if err == nil {
@@ -61,6 +79,12 @@ func (rt *Router) targetGets(c *gin.Context) {
for i := 0; i < len(list); i++ {
ginx.Dangerous(list[i].FillGroup(rt.Ctx, cache))
keys = append(keys, models.WrapIdent(list[i].Ident))
if now.Unix()-list[i].UpdateAt < 60 {
list[i].TargetUp = 2
} else if now.Unix()-list[i].UpdateAt < 180 {
list[i].TargetUp = 1
}
}
if len(keys) > 0 {
@@ -80,10 +104,6 @@ func (rt *Router) targetGets(c *gin.Context) {
}
for i := 0; i < len(list); i++ {
if now.Unix()-list[i].UpdateAt < 120 {
list[i].TargetUp = 1
}
if meta, ok := metaMap[list[i].Ident]; ok {
list[i].FillMeta(meta)
} else {

View File

@@ -30,10 +30,46 @@ func (rt *Router) taskGets(c *gin.Context) {
beginTime := time.Now().Unix() - days*24*3600
total, err := models.TaskRecordTotal(rt.Ctx, bgid, beginTime, creator, query)
total, err := models.TaskRecordTotal(rt.Ctx, []int64{bgid}, beginTime, creator, query)
ginx.Dangerous(err)
list, err := models.TaskRecordGets(rt.Ctx, bgid, beginTime, creator, query, limit, ginx.Offset(c, limit))
list, err := models.TaskRecordGets(rt.Ctx, []int64{bgid}, beginTime, creator, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
ginx.NewRender(c).Data(gin.H{
"total": total,
"list": list,
}, nil)
}
func (rt *Router) taskGetsByGids(c *gin.Context) {
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
mine := ginx.QueryBool(c, "mine", false)
days := ginx.QueryInt64(c, "days", 7)
limit := ginx.QueryInt(c, "limit", 20)
query := ginx.QueryStr(c, "query", "")
user := c.MustGet("user").(*models.User)
creator := ""
if mine {
creator = user.Username
}
beginTime := time.Now().Unix() - days*24*3600
total, err := models.TaskRecordTotal(rt.Ctx, gids, beginTime, creator, query)
ginx.Dangerous(err)
list, err := models.TaskRecordGets(rt.Ctx, gids, beginTime, creator, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
ginx.NewRender(c).Data(gin.H{
@@ -92,6 +128,7 @@ func (f *taskForm) Verify() error {
if f.Script == "" {
return fmt.Errorf("arg(script) is required")
}
f.Script = strings.Replace(f.Script, "\r\n", "\n", -1)
if str.Dangerous(f.Args) {
return fmt.Errorf("arg(args) is dangerous")

View File

@@ -18,10 +18,36 @@ func (rt *Router) taskTplGets(c *gin.Context) {
limit := ginx.QueryInt(c, "limit", 20)
groupId := ginx.UrlParamInt64(c, "id")
total, err := models.TaskTplTotal(rt.Ctx, groupId, query)
total, err := models.TaskTplTotal(rt.Ctx, []int64{groupId}, query)
ginx.Dangerous(err)
list, err := models.TaskTplGets(rt.Ctx, groupId, query, limit, ginx.Offset(c, limit))
list, err := models.TaskTplGets(rt.Ctx, []int64{groupId}, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
ginx.NewRender(c).Data(gin.H{
"total": total,
"list": list,
}, nil)
}
func (rt *Router) taskTplGetsByGids(c *gin.Context) {
query := ginx.QueryStr(c, "query", "")
limit := ginx.QueryInt(c, "limit", 20)
gids := str.IdsInt64(ginx.QueryStr(c, "gids"), ",")
if len(gids) == 0 {
ginx.NewRender(c, http.StatusBadRequest).Message("arg(gids) is empty")
return
}
for _, gid := range gids {
rt.bgroCheck(c, gid)
}
total, err := models.TaskTplTotal(rt.Ctx, gids, query)
ginx.Dangerous(err)
list, err := models.TaskTplGets(rt.Ctx, gids, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
ginx.NewRender(c).Data(gin.H{
@@ -48,6 +74,19 @@ func (rt *Router) taskTplGet(c *gin.Context) {
}, err)
}
func (rt *Router) taskTplGetByService(c *gin.Context) {
tid := ginx.UrlParamInt64(c, "tid")
tpl, err := models.TaskTplGetById(rt.Ctx, tid)
ginx.Dangerous(err)
if tpl == nil {
ginx.Bomb(404, "no such task template")
}
ginx.NewRender(c).Data(tpl, err)
}
type taskTplForm struct {
Title string `json:"title" binding:"required"`
Batch int `json:"batch"`

View File

@@ -0,0 +1,117 @@
package router
import (
"net/http"
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
type databasesQueryForm struct {
Cate string `json:"cate" form:"cate"`
DatasourceId int64 `json:"datasource_id" form:"datasource_id"`
}
func (rt *Router) tdengineDatabases(c *gin.Context) {
var f databasesQueryForm
ginx.BindJSON(c, &f)
tdClient := rt.TdendgineClients.GetCli(f.DatasourceId)
if tdClient == nil {
ginx.NewRender(c, http.StatusNotFound).Message("No such datasource")
return
}
databases, err := tdClient.GetDatabases()
ginx.NewRender(c).Data(databases, err)
}
type tablesQueryForm struct {
Cate string `json:"cate"`
DatasourceId int64 `json:"datasource_id" `
Database string `json:"db"`
IsStable bool `json:"is_stable"`
}
// get tdengine tables
func (rt *Router) tdengineTables(c *gin.Context) {
var f tablesQueryForm
ginx.BindJSON(c, &f)
tdClient := rt.TdendgineClients.GetCli(f.DatasourceId)
if tdClient == nil {
ginx.NewRender(c, http.StatusNotFound).Message("No such datasource")
return
}
tables, err := tdClient.GetTables(f.Database, f.IsStable)
ginx.NewRender(c).Data(tables, err)
}
type columnsQueryForm struct {
Cate string `json:"cate"`
DatasourceId int64 `json:"datasource_id" `
Database string `json:"db"`
Table string `json:"table"`
}
// get tdengine columns
func (rt *Router) tdengineColumns(c *gin.Context) {
var f columnsQueryForm
ginx.BindJSON(c, &f)
tdClient := rt.TdendgineClients.GetCli(f.DatasourceId)
if tdClient == nil {
ginx.NewRender(c, http.StatusNotFound).Message("No such datasource")
return
}
columns, err := tdClient.GetColumns(f.Database, f.Table)
ginx.NewRender(c).Data(columns, err)
}
func (rt *Router) QueryData(c *gin.Context) {
var f models.QueryParam
ginx.BindJSON(c, &f)
var resp []*models.DataResp
var err error
tdClient := rt.TdendgineClients.GetCli(f.DatasourceId)
for _, q := range f.Querys {
datas, err := tdClient.Query(q)
ginx.Dangerous(err)
resp = append(resp, datas...)
}
ginx.NewRender(c).Data(resp, err)
}
func (rt *Router) QueryLog(c *gin.Context) {
var f models.QueryParam
ginx.BindJSON(c, &f)
tdClient := rt.TdendgineClients.GetCli(f.DatasourceId)
if len(f.Querys) == 0 {
ginx.Bomb(200, "querys is empty")
return
}
data, err := tdClient.QueryLog(f.Querys[0])
logger.Debugf("tdengine query:%s result: %+v", f.Querys[0], data)
ginx.NewRender(c).Data(data, err)
}
// query sql template
func (rt *Router) QuerySqlTemplate(c *gin.Context) {
cate := ginx.QueryStr(c, "cate")
m := make(map[string]string)
switch cate {
case models.TDENGINE:
m = cconf.TDengineSQLTpl
}
ginx.NewRender(c).Data(m, nil)
}

View File

@@ -11,6 +11,28 @@ import (
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) userBusiGroupsGets(c *gin.Context) {
userid := ginx.QueryInt64(c, "userid", 0)
username := ginx.QueryStr(c, "username", "")
if userid == 0 && username == "" {
ginx.Bomb(http.StatusBadRequest, "userid or username required")
}
var user *models.User
var err error
if userid > 0 {
user, err = models.UserGetById(rt.Ctx, userid)
} else {
user, err = models.UserGetByUsername(rt.Ctx, username)
}
ginx.Dangerous(err)
groups, err := user.BusiGroups(rt.Ctx, 10000, "")
ginx.NewRender(c).Data(groups, err)
}
func (rt *Router) userFindAll(c *gin.Context) {
list, err := models.UserGetAll(rt.Ctx)
ginx.NewRender(c).Data(list, err)

View File

@@ -0,0 +1,70 @@
package router
import (
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) userVariableConfigGets(context *gin.Context) {
userVariables, err := models.ConfigsGetUserVariable(rt.Ctx)
ginx.NewRender(context).Data(userVariables, err)
}
func (rt *Router) userVariableConfigAdd(context *gin.Context) {
var f models.Configs
ginx.BindJSON(context, &f)
f.Ckey = strings.TrimSpace(f.Ckey)
//insert external config. needs to make sure not plaintext for an encrypted type config
username := context.MustGet("username").(string)
now := time.Now().Unix()
f.CreateBy = username
f.UpdateBy = username
f.CreateAt = now
f.UpdateAt = now
ginx.NewRender(context).Message(models.ConfigsUserVariableInsert(rt.Ctx, f))
}
func (rt *Router) userVariableConfigPut(context *gin.Context) {
var f models.Configs
ginx.BindJSON(context, &f)
f.Id = ginx.UrlParamInt64(context, "id")
f.Ckey = strings.TrimSpace(f.Ckey)
f.UpdateBy = context.MustGet("username").(string)
f.UpdateAt = time.Now().Unix()
user := context.MustGet("user").(*models.User)
if !user.IsAdmin() && f.CreateBy != user.Username {
// only admin or creator can update
ginx.Bomb(403, "no permission")
}
ginx.NewRender(context).Message(models.ConfigsUserVariableUpdate(rt.Ctx, f))
}
func (rt *Router) userVariableConfigDel(context *gin.Context) {
id := ginx.UrlParamInt64(context, "id")
configs, err := models.ConfigGet(rt.Ctx, id)
ginx.Dangerous(err)
user := context.MustGet("user").(*models.User)
if !user.IsAdmin() && configs.CreateBy != user.Username {
// only admin or creator can delete
ginx.Bomb(403, "no permission")
}
if configs != nil && configs.External == models.ConfigExternal {
ginx.NewRender(context).Message(models.ConfigsDel(rt.Ctx, []int64{id}))
} else {
ginx.NewRender(context).Message(nil)
}
}
func (rt *Router) userVariableGetDecryptByService(context *gin.Context) {
decryptMap, decryptErr := models.ConfigUserVariableGetDecryptMap(rt.Ctx, rt.HTTP.RSA.RSAPrivateKey, rt.HTTP.RSA.RSAPassWord)
ginx.NewRender(context).Data(decryptMap, decryptErr)
}

View File

@@ -29,6 +29,8 @@ Port = 389
BaseDn = 'dc=example,dc=org'
BindUser = 'cn=manager,dc=example,dc=org'
BindPass = '*******'
# openldap format e.g. (&(uid=%s))
# AD format e.g. (&(sAMAccountName=%s))
AuthFilter = '(&(uid=%s))'
CoverAttributes = true
TLS = false
@@ -67,6 +69,7 @@ Email = 'email'
const CAS = `
Enable = false
SsoAddr = 'https://cas.example.com/cas/'
# LoginPath = ''
RedirectURL = 'http://127.0.0.1:18000/callback/cas'
DisplayName = 'CAS登录'
CoverAttributes = false

View File

@@ -18,9 +18,9 @@ func Upgrade(configFile string) error {
return err
}
ctx := ctx.NewContext(context.Background(), db, false)
ctx := ctx.NewContext(context.Background(), db, true)
for _, cluster := range config.Clusters {
count, err := models.GetDatasourcesCountBy(ctx, "", "", cluster.Name)
count, err := models.GetDatasourcesCountByName(ctx, cluster.Name)
if err != nil {
logger.Errorf("get datasource %s count error: %v", cluster.Name, err)
continue

View File

@@ -8,6 +8,13 @@ insert into `role_operation`(role_name, operation) values('Standard', '/trace/ex
insert into `role_operation`(role_name, operation) values('Standard', '/alert-rules-built-in');
insert into `role_operation`(role_name, operation) values('Standard', '/dashboards-built-in');
insert into `role_operation`(role_name, operation) values('Standard', '/trace/dependencies');
insert into `role_operation`(role_name, operation) values('Standard', '/help/servers');
insert into `role_operation`(role_name, operation) values('Standard', '/help/migrate');
insert into `role_operation`(role_name, operation) values('Admin', '/help/source');
insert into `role_operation`(role_name, operation) values('Admin', '/help/sso');
insert into `role_operation`(role_name, operation) values('Admin', '/help/notification-tpls');
insert into `role_operation`(role_name, operation) values('Admin', '/help/notification-settings');
alter table `board` add built_in tinyint(1) not null default 0 comment '0:false 1:true';
alter table `board` add hide tinyint(1) not null default 0 comment '0:false 1:true';

View File

@@ -17,7 +17,7 @@ import (
var (
showVersion = flag.Bool("version", false, "Show version.")
configDir = flag.String("configs", osx.GetEnv("N9E_CONFIGS", "etc"), "Specify configuration directory.(env:N9E_CONFIGS)")
configDir = flag.String("configs", osx.GetEnv("N9E_ALERT_CONFIGS", "etc"), "Specify configuration directory.(env:N9E_ALERT_CONFIGS)")
cryptoKey = flag.String("crypto-key", "", "Specify the secret key for configuration file field encryption.")
)

View File

@@ -12,6 +12,7 @@ import (
"github.com/ccfos/nightingale/v6/pkg/osx"
"github.com/ccfos/nightingale/v6/pkg/version"
"github.com/toolkits/pkg/net/tcpx"
"github.com/toolkits/pkg/runner"
)
@@ -31,6 +32,8 @@ func main() {
printEnv()
tcpx.WaitHosts()
cleanFunc, err := center.Initialize(*configDir, *cryptoKey)
if err != nil {
log.Fatalln("failed to initialize:", err)

81
cmd/edge/edge.go Normal file
View File

@@ -0,0 +1,81 @@
package main
import (
"context"
"errors"
"fmt"
"github.com/ccfos/nightingale/v6/alert"
"github.com/ccfos/nightingale/v6/alert/astats"
"github.com/ccfos/nightingale/v6/alert/process"
"github.com/ccfos/nightingale/v6/conf"
"github.com/ccfos/nightingale/v6/dumper"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/httpx"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/pushgw/writer"
"github.com/ccfos/nightingale/v6/tdengine"
alertrt "github.com/ccfos/nightingale/v6/alert/router"
pushgwrt "github.com/ccfos/nightingale/v6/pushgw/router"
)
func Initialize(configDir string, cryptoKey string) (func(), error) {
config, err := conf.InitConfig(configDir, cryptoKey)
if err != nil {
return nil, fmt.Errorf("failed to init config: %v", err)
}
logxClean, err := logx.Init(config.Log)
if err != nil {
return nil, err
}
//check CenterApi is default value
if len(config.CenterApi.Addrs) < 1 {
return nil, errors.New("failed to init config: the CenterApi configuration is missing")
}
ctx := ctx.NewContext(context.Background(), nil, false, config.CenterApi)
syncStats := memsto.NewSyncStats()
targetCache := memsto.NewTargetCache(ctx, syncStats, nil)
busiGroupCache := memsto.NewBusiGroupCache(ctx, syncStats)
idents := idents.New(ctx)
writers := writer.NewWriters(config.Pushgw)
pushgwRouter := pushgwrt.New(config.HTTP, config.Pushgw, targetCache, busiGroupCache, idents, writers, ctx)
r := httpx.GinEngine(config.Global.RunMode, config.HTTP)
pushgwRouter.Config(r)
if !config.Alert.Disable {
configCache := memsto.NewConfigCache(ctx, syncStats, nil, "")
alertStats := astats.NewSyncStats()
dsCache := memsto.NewDatasourceCache(ctx, syncStats)
alertMuteCache := memsto.NewAlertMuteCache(ctx, syncStats)
alertRuleCache := memsto.NewAlertRuleCache(ctx, syncStats)
notifyConfigCache := memsto.NewNotifyConfigCache(ctx, configCache)
userCache := memsto.NewUserCache(ctx, syncStats)
userGroupCache := memsto.NewUserGroupCache(ctx, syncStats)
promClients := prom.NewPromClient(ctx)
tdengineClients := tdengine.NewTdengineClient(ctx, config.Alert.Heartbeat)
externalProcessors := process.NewExternalProcessors()
alert.Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache,
alertRuleCache, notifyConfigCache, dsCache, ctx, promClients, tdengineClients, userCache, userGroupCache)
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
alertrtRouter.Config(r)
}
dumper.ConfigRouter(r)
httpClean := httpx.Init(config.HTTP, r)
return func() {
logxClean()
httpClean()
}, nil
}

68
cmd/edge/main.go Normal file
View File

@@ -0,0 +1,68 @@
package main
import (
"flag"
"fmt"
"log"
"os"
"os/signal"
"syscall"
"github.com/ccfos/nightingale/v6/pkg/osx"
"github.com/ccfos/nightingale/v6/pkg/version"
"github.com/toolkits/pkg/runner"
)
var (
showVersion = flag.Bool("version", false, "Show version.")
configDir = flag.String("configs", osx.GetEnv("N9E_EDGE_CONFIGS", "etc"), "Specify configuration directory.(env:N9E_EDGE_CONFIGS)")
cryptoKey = flag.String("crypto-key", "", "Specify the secret key for configuration file field encryption.")
)
func main() {
flag.Parse()
if *showVersion {
fmt.Println(version.Version)
os.Exit(0)
}
printEnv()
cleanFunc, err := Initialize(*configDir, *cryptoKey)
if err != nil {
log.Fatalln("failed to initialize:", err)
}
code := 1
sc := make(chan os.Signal, 1)
signal.Notify(sc, syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT)
EXIT:
for {
sig := <-sc
fmt.Println("received signal:", sig.String())
switch sig {
case syscall.SIGQUIT, syscall.SIGTERM, syscall.SIGINT:
code = 0
break EXIT
case syscall.SIGHUP:
// reload configuration?
default:
break EXIT
}
}
cleanFunc()
fmt.Println("process exited")
os.Exit(code)
}
func printEnv() {
runner.Init()
fmt.Println("runner.cwd:", runner.Cwd)
fmt.Println("runner.hostname:", runner.Hostname)
fmt.Println("runner.fd_limits:", runner.FdLimits())
fmt.Println("runner.vm_limits:", runner.VMLimits())
}

View File

@@ -17,7 +17,7 @@ import (
var (
showVersion = flag.Bool("version", false, "Show version.")
configDir = flag.String("configs", osx.GetEnv("N9E_CONFIGS", "etc"), "Specify configuration directory.(env:N9E_CONFIGS)")
configDir = flag.String("configs", osx.GetEnv("N9E_PUSHGW_CONFIGS", "etc"), "Specify configuration directory.(env:N9E_PUSHGW_CONFIGS)")
cryptoKey = flag.String("crypto-key", "", "Specify the secret key for configuration file field encryption.")
)

View File

@@ -14,8 +14,6 @@ import (
"github.com/ccfos/nightingale/v6/pkg/ormx"
"github.com/ccfos/nightingale/v6/pushgw/pconf"
"github.com/ccfos/nightingale/v6/storage"
"github.com/gin-gonic/gin"
)
type ConfigType struct {
@@ -32,8 +30,10 @@ type ConfigType struct {
}
type CenterApi struct {
Addrs []string
BasicAuth gin.Accounts
Addrs []string
BasicAuthUser string
BasicAuthPass string
Timeout int64
}
type GlobalConfig struct {
@@ -48,7 +48,7 @@ func InitConfig(configDir, cryptoKey string) (*ConfigType, error) {
}
config.Pushgw.PreCheck()
config.Alert.PreCheck()
config.Alert.PreCheck(configDir)
config.Center.PreCheck()
err := decryptConfig(config, cryptoKey)

View File

@@ -14,39 +14,22 @@ func decryptConfig(config *ConfigType, cryptoKey string) error {
config.DB.DSN = decryptDsn
for k := range config.HTTP.Alert.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.Alert.BasicAuth[k], cryptoKey)
for k := range config.HTTP.APIForService.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.APIForService.BasicAuth[k], cryptoKey)
if err != nil {
return fmt.Errorf("failed to decrypt http basic auth password: %s", err)
}
config.HTTP.Alert.BasicAuth[k] = decryptPwd
config.HTTP.APIForService.BasicAuth[k] = decryptPwd
}
for k := range config.HTTP.Pushgw.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.Pushgw.BasicAuth[k], cryptoKey)
for k := range config.HTTP.APIForAgent.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.APIForAgent.BasicAuth[k], cryptoKey)
if err != nil {
return fmt.Errorf("failed to decrypt http basic auth password: %s", err)
}
config.HTTP.Pushgw.BasicAuth[k] = decryptPwd
}
for k := range config.HTTP.Heartbeat.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.Heartbeat.BasicAuth[k], cryptoKey)
if err != nil {
return fmt.Errorf("failed to decrypt http basic auth password: %s", err)
}
config.HTTP.Heartbeat.BasicAuth[k] = decryptPwd
}
for k := range config.HTTP.Service.BasicAuth {
decryptPwd, err := secu.DealWithDecrypt(config.HTTP.Service.BasicAuth[k], cryptoKey)
if err != nil {
return fmt.Errorf("failed to decrypt http basic auth password: %s", err)
}
config.HTTP.Service.BasicAuth[k] = decryptPwd
config.HTTP.APIForAgent.BasicAuth[k] = decryptPwd
}
for i, v := range config.Pushgw.Writers {

View File

@@ -77,4 +77,3 @@ Committer 记录并公示于 **[COMMITTERS](https://github.com/ccfos/nightingale
2. 提问之前请先搜索 [Github Issues](https://github.com/ccfos/nightingale/issues "Github Issue")
3. 我们优先推荐通过提交 [Github Issue](https://github.com/ccfos/nightingale/issues "Github Issue") 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml "有问题点击这里") | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md "有需求建议点击这里")
最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg "UlricGO") 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题)。

BIN
doc/img/Nightingale_L_V.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 230 KiB

BIN
doc/img/n9e-arch-latest.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 215 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 877 KiB

View File

@@ -1,7 +1,5 @@
ibexetc
compose-host-network
compose-postgres
compose-bridge
initsql
mysqletc
n9eetc
prometc
build.sh
docker-compose.yaml

View File

@@ -5,8 +5,7 @@ WORKDIR /app
ADD n9e /app/
ADD etc /app/
ADD integrations /app/integrations/
ADD --chmod=755 https://github.com/ufoscout/docker-compose-wait/releases/download/2.11.0/wait_x86_64 /wait
RUN chmod +x /wait
RUN pip install requests
EXPOSE 17000

View File

@@ -1,4 +1,3 @@
FROM flashcatcloud/toolbox:v0.0.1 as toolbox
FROM --platform=$TARGETPLATFORM python:3-slim
@@ -6,7 +5,6 @@ WORKDIR /app
ADD n9e /app/
ADD etc /app/
ADD integrations /app/integrations/
COPY --chmod=755 --from=toolbox /toolbox/wait_aarch64 /wait
EXPOSE 17000

View File

@@ -0,0 +1,132 @@
version: "3.7"
networks:
nightingale:
driver: bridge
services:
mysql:
image: "mysql:8"
container_name: mysql
hostname: mysql
restart: always
environment:
TZ: Asia/Shanghai
MYSQL_ROOT_PASSWORD: 1234
volumes:
- ./mysqldata:/var/lib/mysql/
- ../initsql:/docker-entrypoint-initdb.d/
- ./etc-mysql/my.cnf:/etc/my.cnf
networks:
- nightingale
ports:
- "3306:3306"
redis:
image: "redis:6.2"
container_name: redis
hostname: redis
restart: always
environment:
TZ: Asia/Shanghai
networks:
- nightingale
ports:
- "6379:6379"
# prometheus:
# image: prom/prometheus
# container_name: prometheus
# hostname: prometheus
# restart: always
# environment:
# TZ: Asia/Shanghai
# volumes:
# - ./etc-prometheus:/etc/prometheus
# command:
# - "--config.file=/etc/prometheus/prometheus.yml"
# - "--storage.tsdb.path=/prometheus"
# - "--web.console.libraries=/usr/share/prometheus/console_libraries"
# - "--web.console.templates=/usr/share/prometheus/consoles"
# - "--enable-feature=remote-write-receiver"
# - "--query.lookback-delta=2m"
# networks:
# - nightingale
# ports:
# - "9090:9090"
victoriametrics:
image: victoriametrics/victoria-metrics:v1.79.12
container_name: victoriametrics
hostname: victoriametrics
restart: always
environment:
TZ: Asia/Shanghai
ports:
- "8428:8428"
networks:
- nightingale
command:
- "--loggerTimezone=Asia/Shanghai"
ibex:
image: flashcatcloud/ibex:v1.2.0
container_name: ibex
hostname: ibex
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
WAIT_HOSTS: mysql:3306
volumes:
- ./etc-ibex:/app/etc
networks:
- nightingale
ports:
- "10090:10090"
- "20090:20090"
depends_on:
- mysql
command: >
sh -c "/app/ibex server"
nightingale:
image: flashcatcloud/nightingale:latest
container_name: nightingale
hostname: nightingale
restart: always
environment:
GIN_MODE: release
TZ: Asia/Shanghai
WAIT_HOSTS: mysql:3306, redis:6379
volumes:
- ./etc-nightingale:/app/etc
networks:
- nightingale
ports:
- "17000:17000"
depends_on:
- mysql
- redis
- victoriametrics
command: >
sh -c "/app/n9e"
categraf:
image: "flashcatcloud/categraf:latest"
container_name: "categraf"
hostname: "categraf01"
restart: always
environment:
TZ: Asia/Shanghai
HOST_PROC: /hostfs/proc
HOST_SYS: /hostfs/sys
HOST_MOUNT_PREFIX: /hostfs
WAIT_HOSTS: nightingale:17000, ibex:20090
volumes:
- ./etc-categraf:/etc/categraf/conf
- /:/hostfs
networks:
- nightingale
depends_on:
- nightingale

View File

@@ -0,0 +1,83 @@
[global]
# whether print configs
print_configs = false
# add label(agent_hostname) to series
# "" -> auto detect hostname
# "xx" -> use specified string xx
# "$hostname" -> auto detect hostname
# "$ip" -> auto detect ip
# "$hostname-$ip" -> auto detect hostname and ip to replace the vars
hostname = "$HOSTNAME"
# will not add label(agent_hostname) if true
omit_hostname = false
# s | ms
precision = "ms"
# global collect interval
interval = 15
[global.labels]
source="categraf"
# region = "shanghai"
# env = "localhost"
[writer_opt]
# default: 2000
batch = 2000
# channel(as queue) size
chan_size = 10000
[[writers]]
url = "http://nightingale:17000/prometheus/v1/write"
# Basic auth username
basic_auth_user = ""
# Basic auth password
basic_auth_pass = ""
# timeout settings, unit: ms
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
[http]
enable = false
address = ":9100"
print_access = false
run_mode = "release"
[heartbeat]
enable = true
# report os version cpu.util mem.util metadata
url = "http://nightingale:17000/v1/n9e/heartbeat"
# interval, unit: s
interval = 10
# Basic auth username
basic_auth_user = ""
# Basic auth password
basic_auth_pass = ""
## Optional headers
# headers = ["X-From", "categraf", "X-Xyz", "abc"]
# timeout settings, unit: ms
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
[ibex]
enable = true
## ibex flush interval
interval = "1000ms"
## n9e ibex server rpc address
servers = ["ibex:20090"]
## temp script dir
meta_dir = "./meta"

View File

@@ -0,0 +1,10 @@
# # collect interval
# interval = 15
# # By default stats will be gathered for all mount points.
# # Set mount_points will restrict the stats to only the specified mount points.
mount_points = ["/"]
# Ignore mount points by filesystem type.
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs", "nsfs"]

View File

@@ -0,0 +1,4 @@
[[instances]]
urls = [
"http://nightingale:17000/metrics"
]

View File

@@ -0,0 +1,97 @@
# debug, release
RunMode = "release"
[Log]
# log write dir
Dir = "logs-server"
# log level: DEBUG INFO WARNING ERROR
Level = "DEBUG"
# stdout, stderr, file
Output = "stdout"
# # rotate by time
# KeepHours: 4
# # rotate by size
# RotateNum = 3
# # unit: MB
# RotateSize = 256
[HTTP]
Enable = true
# http listening address
Host = "0.0.0.0"
# http listening port
Port = 10090
# https cert file path
CertFile = ""
# https key file path
KeyFile = ""
# whether print access log
PrintAccessLog = true
# whether enable pprof
PProf = false
# http graceful shutdown timeout, unit: s
ShutdownTimeout = 30
# max content length: 64M
MaxContentLength = 67108864
# http server read timeout, unit: s
ReadTimeout = 20
# http server write timeout, unit: s
WriteTimeout = 40
# http server idle timeout, unit: s
IdleTimeout = 120
[BasicAuth]
# using when call apis
ibex = "ibex"
[RPC]
Listen = "0.0.0.0:20090"
[Heartbeat]
# auto detect if blank
IP = ""
# unit: ms
Interval = 1000
[Output]
# database | remote
ComeFrom = "database"
AgtdPort = 2090
[Gorm]
# enable debug mode or not
Debug = false
# mysql postgres
DBType = "mysql"
# unit: s
MaxLifetime = 7200
# max open connections
MaxOpenConns = 150
# max idle connections
MaxIdleConns = 50
# table prefix
TablePrefix = ""
[MySQL]
# mysql address host:port
Address = "mysql:3306"
# mysql username
User = "root"
# mysql password
Password = "1234"
# database name
DBName = "ibex"
# connection params
Parameters = "charset=utf8mb4&parseTime=True&loc=Local&allowNativePasswords=true"
[Postgres]
# pg address host:port
Address = "postgres:5432"
# pg user
User = "root"
# pg password
Password = "1234"
# database name
DBName = "ibex"
# ssl mode
SSLMode = "disable"

Some files were not shown because too many files have changed in this diff Show More