Compare commits

..

30 Commits

Author  SHA1  Message  Date
ning  f5fb52024b  update doc api  2026-03-13 12:36:53 +08:00
ning  04e9cd08da  update ai config  2026-03-12 16:57:39 +08:00
ning  1310b8a522  update ai config  2026-03-12 15:42:29 +08:00
ning  0d105e1f9d  update mcp  2026-03-11 19:21:53 +08:00
ning  77bca17970  refactor: optimize llm config  2026-03-11 16:43:16 +08:00
ning  3fb5f446be  update agent model  2026-03-11 15:36:41 +08:00
ning  f2384cc12b  update chat api  2026-03-10 17:04:40 +08:00
ning  72e16b25f3  refactor: update llm config  2026-03-09 20:19:14 +08:00
ning  59c85a8efb  update add skill  2026-03-06 22:03:29 +08:00
ning  f50f05ae01  update ai talk  2026-03-06 16:04:04 +08:00
ning  ef6676d3d6  update ai agent  2026-03-06 14:48:47 +08:00
ning  eacf1b650a  optimize talk  2026-03-05 19:49:57 +08:00
ning  7566b9b690  add llm  2026-03-05 18:49:00 +08:00
yuansheng  5e01e8e021  refactor: alert rule support Local in timezones (Made-with: Cursor)  2026-03-04 15:22:35 +08:00
ning  61c7bbd0d8  fix: doris timeout  2026-03-04 11:29:31 +08:00
ning  303ef3476e  fix: doris timeout  2026-03-04 11:16:40 +08:00
Yening Qin  b49ab44818  refactor: http support tracing (#3083)  2026-02-12 17:00:45 +08:00
yuansheng  4d37594a0a  feat: alert time tz support (#3081)  2026-02-11 19:15:09 +08:00
ning  5b941d2ce5  optimize event detail api  2026-02-11 11:45:54 +08:00
Yening Qin  932199fde1  refactor: add alert eval detail debug api (#3080)  2026-02-10 22:40:23 +08:00
ning  5d1636d1a5  fix: delete dup api  2026-02-10 20:13:29 +08:00
Yening Qin  6167eb3b13  refactor: event debug api (#3079)  2026-02-10 18:35:00 +08:00
huangjie  5beee98cde  refactor: feishu sso support default usergroup (#3078)  2026-02-10 17:49:18 +08:00
Yening Qin  c34c008080  update list api  2026-02-10 10:38:09 +08:00
laiwei  75218f9d5a  Revise README_zh.md for clarity and new features  2026-02-07 17:15:57 +08:00
        (Updated the description to emphasize the open-source monitoring alert management aspect and added information about the MCP-Server.)
laiwei  341f82ecde  Update README with MCP-Server launch details  2026-02-07 17:12:40 +08:00
        (Added information about the MCP-Server and its capabilities.)
liufuniu  a6056a5fab  fix: doris datasource equal (#3074)  2026-02-05 17:45:47 +08:00
liufuniu  01e8370882  refactor: update doris datasource (#3071)  2026-02-05 13:42:17 +08:00
ning  8b11e18754  refactor: optimize es query data  2026-02-05 12:10:41 +08:00
liufuniu  aa749065da  refactor: doris datasource add write user (#3067)  2026-02-04 17:27:46 +08:00
169 changed files with 13282 additions and 524 deletions


@@ -31,7 +31,9 @@
Nightingale is an open-source monitoring project that focuses on alerting. Similar to Grafana, Nightingale also connects with various existing data sources. However, while Grafana emphasizes visualization, Nightingale places greater emphasis on the alerting engine, as well as the processing and distribution of alarms.
> The Nightingale project was initially developed and open-sourced by DiDi.inc. On May 11, 2022, it was donated to the Open Source Development Committee of the China Computer Federation (CCF ODC).
> 💡 Nightingale has now officially launched the [MCP-Server](https://github.com/n9e/n9e-mcp-server/). This MCP Server enables AI assistants to interact with the Nightingale API using natural language, facilitating alert management, monitoring, and observability tasks.
>
> The Nightingale project was initially developed and open-sourced by DiDi.inc. On May 11, 2022, it was donated to the Open Source Development Committee of the China Computer Federation (CCF ODTC).
![](https://n9e.github.io/img/global/arch-bg.png)


@@ -3,7 +3,7 @@
<img src="doc/img/Nightingale_L_V.png" alt="nightingale - cloud native monitoring" width="100" /></a>
</p>
<p align="center">
<b>开源告警管理专家</b>
<b>开源监控告警管理专家</b>
</p>
<p align="center">
@@ -33,7 +33,8 @@
夜莺侧重于监控告警,类似于 Grafana 的数据源集成方式,夜莺也是对接多种既有的数据源,不过 Grafana 侧重于可视化,夜莺则是侧重于告警引擎、告警事件的处理和分发。
> 夜莺监控项目,最初由滴滴开发和开源,并于 2022 年 5 月 11 日捐赠予中国计算机学会开源发展技术委员会(CCF ODTC),为 CCF ODTC 成立后接受捐赠的第一个开源项目
> - 💡夜莺正式推出了 [MCP-Server](https://github.com/n9e/n9e-mcp-server/),此 MCP Server 允许 AI 助手通过自然语言与夜莺 API 交互,实现告警管理、监控和可观测性任务
> - 夜莺监控项目,最初由滴滴开发和开源,并于 2022 年 5 月 11 日捐赠予中国计算机学会开源发展技术委员会(CCF ODTC),为 CCF ODTC 成立后接受捐赠的第一个开源项目。
![](https://n9e.github.io/img/global/arch-bg.png)

aiagent/ai_agent.go (3338 lines)

File diff suppressed because it is too large

aiagent/builtin_tools.go (new file, 546 lines)

@@ -0,0 +1,546 @@
package aiagent
import (
"context"
"encoding/json"
"fmt"
"strings"
"time"
"github.com/ccfos/nightingale/v6/datasource"
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/toolkits/pkg/logger"
)
const (
// ToolTypeBuiltin is the built-in tool type
ToolTypeBuiltin = "builtin"
)
// =============================================================================
// Datasource getters (injectable, to ease testing)
// =============================================================================
// PromClientGetter returns a Prometheus client for a datasource id
type PromClientGetter func(dsId int64) prom.API
// SQLDatasourceGetter looks up a SQL datasource by type and id
type SQLDatasourceGetter func(dsType string, dsId int64) (datasource.Datasource, bool)
// Defaults use GlobalCache; swap via SetPromClientGetter/SetSQLDatasourceGetter
var (
getPromClientFunc PromClientGetter = defaultGetPromClient
getSQLDatasourceFunc SQLDatasourceGetter = defaultGetSQLDatasource
)
// SetPromClientGetter sets the Prometheus client getter (for tests)
func SetPromClientGetter(getter PromClientGetter) {
getPromClientFunc = getter
}
// SetSQLDatasourceGetter sets the SQL datasource getter (for tests)
func SetSQLDatasourceGetter(getter SQLDatasourceGetter) {
getSQLDatasourceFunc = getter
}
// ResetDatasourceGetters restores the default datasource getters
func ResetDatasourceGetters() {
getPromClientFunc = defaultGetPromClient
getSQLDatasourceFunc = defaultGetSQLDatasource
}
func defaultGetPromClient(dsId int64) prom.API {
// Default: no PromClient available. Use SetPromClientGetter to inject.
return nil
}
func defaultGetSQLDatasource(dsType string, dsId int64) (datasource.Datasource, bool) {
return dscache.DsCache.Get(dsType, dsId)
}
// BuiltinToolHandler handles a built-in tool invocation
type BuiltinToolHandler func(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error)
// BuiltinTool pairs a tool definition with its handler
type BuiltinTool struct {
Definition AgentTool
Handler BuiltinToolHandler
}
// builtinTools is the registry of built-in tools
var builtinTools = map[string]*BuiltinTool{
// Prometheus tools
"list_metrics": {
Definition: AgentTool{
Name: "list_metrics",
Description: "Search metric names in a Prometheus datasource; supports fuzzy keyword matching",
Type: ToolTypeBuiltin,
Parameters: []ToolParameter{
{Name: "keyword", Type: "string", Description: "keyword for fuzzy-matching metric names", Required: false},
{Name: "limit", Type: "integer", Description: "maximum number of results (default 30)", Required: false},
},
},
Handler: listMetrics,
},
"get_metric_labels": {
Definition: AgentTool{
Name: "get_metric_labels",
Description: "Get all label keys and candidate values of a Prometheus metric",
Type: ToolTypeBuiltin,
Parameters: []ToolParameter{
{Name: "metric", Type: "string", Description: "metric name", Required: true},
},
},
Handler: getMetricLabels,
},
// SQL datasource tools
"list_databases": {
Definition: AgentTool{
Name: "list_databases",
Description: "List all databases in a SQL datasource (MySQL/Doris/ClickHouse/PostgreSQL)",
Type: ToolTypeBuiltin,
Parameters: []ToolParameter{},
},
Handler: listDatabases,
},
"list_tables": {
Definition: AgentTool{
Name: "list_tables",
Description: "List all tables in the given database",
Type: ToolTypeBuiltin,
Parameters: []ToolParameter{
{Name: "database", Type: "string", Description: "database name", Required: true},
},
},
Handler: listTables,
},
"describe_table": {
Definition: AgentTool{
Name: "describe_table",
Description: "Get the table schema (column name, type, comment)",
Type: ToolTypeBuiltin,
Parameters: []ToolParameter{
{Name: "database", Type: "string", Description: "database name", Required: true},
{Name: "table", Type: "string", Description: "table name", Required: true},
},
},
Handler: describeTable,
},
}
// GetBuiltinToolDef returns a built-in tool definition by name
func GetBuiltinToolDef(name string) (AgentTool, bool) {
if tool, ok := builtinTools[name]; ok {
return tool.Definition, true
}
return AgentTool{}, false
}
// GetBuiltinToolDefs returns the definitions for the named built-in tools
func GetBuiltinToolDefs(names []string) []AgentTool {
var defs []AgentTool
for _, name := range names {
if def, ok := GetBuiltinToolDef(name); ok {
defs = append(defs, def)
}
}
return defs
}
// GetAllBuiltinToolDefs returns all built-in tool definitions
func GetAllBuiltinToolDefs() []AgentTool {
defs := make([]AgentTool, 0, len(builtinTools))
for _, tool := range builtinTools {
defs = append(defs, tool.Definition)
}
return defs
}
// ExecuteBuiltinTool executes a built-in tool by name.
// Returns (result, handled, error); handled reports whether the tool is built-in:
// true means it was handled here, false means the caller should keep looking.
func ExecuteBuiltinTool(ctx context.Context, name string, wfCtx *models.WorkflowContext, argsJSON string) (string, bool, error) {
tool, exists := builtinTools[name]
if !exists {
return "", false, nil
}
// Parse the arguments
var args map[string]interface{}
if argsJSON != "" {
if err := json.Unmarshal([]byte(argsJSON), &args); err != nil {
// Not JSON; fall back to treating the input as a plain string argument
args = map[string]interface{}{"input": argsJSON}
}
}
if args == nil {
args = make(map[string]interface{})
}
result, err := tool.Handler(ctx, wfCtx, args)
return result, true, err
}
// getDatasourceId reads datasource_id from wfCtx.Inputs
func getDatasourceId(wfCtx *models.WorkflowContext) int64 {
if wfCtx == nil || wfCtx.Inputs == nil {
return 0
}
var dsId int64
if dsIdStr, ok := wfCtx.Inputs["datasource_id"]; ok {
fmt.Sscanf(dsIdStr, "%d", &dsId)
}
return dsId
}
// getDatasourceType reads datasource_type from wfCtx.Inputs
func getDatasourceType(wfCtx *models.WorkflowContext) string {
if wfCtx == nil || wfCtx.Inputs == nil {
return ""
}
return wfCtx.Inputs["datasource_type"]
}
// =============================================================================
// Prometheus tool implementations
// =============================================================================
// listMetrics searches Prometheus metric names
func listMetrics(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error) {
dsId := getDatasourceId(wfCtx)
if dsId == 0 {
return "", fmt.Errorf("datasource_id not found in inputs")
}
keyword, _ := args["keyword"].(string)
limit := 30
if l, ok := args["limit"].(float64); ok && l > 0 {
limit = int(l)
}
// Fetch the Prometheus client
client := getPromClientFunc(dsId)
if client == nil {
return "", fmt.Errorf("prometheus datasource not found: %d", dsId)
}
// LabelValues on __name__ returns every metric name
values, _, err := client.LabelValues(ctx, "__name__", nil)
if err != nil {
return "", fmt.Errorf("failed to get metrics: %v", err)
}
// Filter by keyword and cap the result size
result := make([]string, 0)
keyword = strings.ToLower(keyword)
for _, v := range values {
m := string(v)
if keyword == "" || strings.Contains(strings.ToLower(m), keyword) {
result = append(result, m)
if len(result) >= limit {
break
}
}
}
logger.Debugf("list_metrics: found %d metrics (keyword=%s, limit=%d)", len(result), keyword, limit)
bytes, _ := json.Marshal(result)
return string(bytes), nil
}
// getMetricLabels returns the labels of a metric
func getMetricLabels(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error) {
dsId := getDatasourceId(wfCtx)
if dsId == 0 {
return "", fmt.Errorf("datasource_id not found in inputs")
}
metric, ok := args["metric"].(string)
if !ok || metric == "" {
return "", fmt.Errorf("metric parameter is required")
}
client := getPromClientFunc(dsId)
if client == nil {
return "", fmt.Errorf("prometheus datasource not found: %d", dsId)
}
// Use the Series API to fetch all series of the metric
endTime := time.Now()
startTime := endTime.Add(-1 * time.Hour)
series, _, err := client.Series(ctx, []string{metric}, startTime, endTime)
if err != nil {
return "", fmt.Errorf("failed to get metric series: %v", err)
}
// Aggregate label keys and values
labels := make(map[string][]string)
seen := make(map[string]map[string]bool)
for _, s := range series {
for k, v := range s {
key := string(k)
val := string(v)
if key == "__name__" {
continue
}
if seen[key] == nil {
seen[key] = make(map[string]bool)
}
if !seen[key][val] {
seen[key][val] = true
labels[key] = append(labels[key], val)
}
}
}
logger.Debugf("get_metric_labels: metric=%s, found %d labels", metric, len(labels))
bytes, _ := json.Marshal(labels)
return string(bytes), nil
}
// =============================================================================
// SQL datasource tool implementations
// =============================================================================
// SQLMetadataQuerier queries SQL metadata
type SQLMetadataQuerier interface {
ListDatabases(ctx context.Context) ([]string, error)
ListTables(ctx context.Context, database string) ([]string, error)
DescribeTable(ctx context.Context, database, table string) ([]map[string]interface{}, error)
}
// listDatabases lists the databases
func listDatabases(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error) {
dsId := getDatasourceId(wfCtx)
dsType := getDatasourceType(wfCtx)
if dsId == 0 {
return "", fmt.Errorf("datasource_id not found in inputs")
}
if dsType == "" {
return "", fmt.Errorf("datasource_type not found in inputs")
}
plug, exists := getSQLDatasourceFunc(dsType, dsId)
if !exists {
return "", fmt.Errorf("datasource not found: %s/%d", dsType, dsId)
}
// Build the query SQL
var sql string
switch dsType {
case "mysql", "doris", "ck", "clickhouse":
sql = "SHOW DATABASES"
case "pgsql", "postgresql":
sql = "SELECT datname FROM pg_database WHERE datistemplate = false"
default:
return "", fmt.Errorf("unsupported datasource type for list_databases: %s", dsType)
}
// Execute the query
query := map[string]interface{}{"sql": sql}
data, _, err := plug.QueryLog(ctx, query)
if err != nil {
return "", fmt.Errorf("failed to list databases: %v", err)
}
// Extract the database names
databases := extractColumnValues(data, dsType, "database")
logger.Debugf("list_databases: dsType=%s, found %d databases", dsType, len(databases))
bytes, _ := json.Marshal(databases)
return string(bytes), nil
}
// listTables lists the tables
func listTables(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error) {
dsId := getDatasourceId(wfCtx)
dsType := getDatasourceType(wfCtx)
if dsId == 0 {
return "", fmt.Errorf("datasource_id not found in inputs")
}
database, ok := args["database"].(string)
if !ok || database == "" {
return "", fmt.Errorf("database parameter is required")
}
plug, exists := getSQLDatasourceFunc(dsType, dsId)
if !exists {
return "", fmt.Errorf("datasource not found: %s/%d", dsType, dsId)
}
// Build the query SQL
var sql string
switch dsType {
case "mysql", "doris", "ck", "clickhouse":
sql = fmt.Sprintf("SHOW TABLES FROM `%s`", database)
case "pgsql", "postgresql":
sql = "SELECT tablename FROM pg_tables WHERE schemaname = 'public'"
default:
return "", fmt.Errorf("unsupported datasource type for list_tables: %s", dsType)
}
// Execute the query
query := map[string]interface{}{"sql": sql, "database": database}
data, _, err := plug.QueryLog(ctx, query)
if err != nil {
return "", fmt.Errorf("failed to list tables: %v", err)
}
// Extract the table names
tables := extractColumnValues(data, dsType, "table")
logger.Debugf("list_tables: dsType=%s, database=%s, found %d tables", dsType, database, len(tables))
bytes, _ := json.Marshal(tables)
return string(bytes), nil
}
// describeTable returns the table schema
func describeTable(ctx context.Context, wfCtx *models.WorkflowContext, args map[string]interface{}) (string, error) {
dsId := getDatasourceId(wfCtx)
dsType := getDatasourceType(wfCtx)
if dsId == 0 {
return "", fmt.Errorf("datasource_id not found in inputs")
}
database, ok := args["database"].(string)
if !ok || database == "" {
return "", fmt.Errorf("database parameter is required")
}
table, ok := args["table"].(string)
if !ok || table == "" {
return "", fmt.Errorf("table parameter is required")
}
plug, exists := getSQLDatasourceFunc(dsType, dsId)
if !exists {
return "", fmt.Errorf("datasource not found: %s/%d", dsType, dsId)
}
// Build the query SQL
var sql string
switch dsType {
case "mysql", "doris":
sql = fmt.Sprintf("DESCRIBE `%s`.`%s`", database, table)
case "ck", "clickhouse":
sql = fmt.Sprintf("DESCRIBE TABLE `%s`.`%s`", database, table)
case "pgsql", "postgresql":
sql = fmt.Sprintf(`SELECT column_name as "Field", data_type as "Type", is_nullable as "Null", column_default as "Default" FROM information_schema.columns WHERE table_schema = 'public' AND table_name = '%s'`, table)
default:
return "", fmt.Errorf("unsupported datasource type for describe_table: %s", dsType)
}
// Execute the query
query := map[string]interface{}{"sql": sql, "database": database}
data, _, err := plug.QueryLog(ctx, query)
if err != nil {
return "", fmt.Errorf("failed to describe table: %v", err)
}
// Convert to the unified column structure
columns := convertToColumnInfo(data, dsType)
logger.Debugf("describe_table: dsType=%s, table=%s.%s, found %d columns", dsType, database, table, len(columns))
bytes, _ := json.Marshal(columns)
return string(bytes), nil
}
// ColumnInfo describes a column
type ColumnInfo struct {
Name string `json:"name"`
Type string `json:"type"`
Comment string `json:"comment,omitempty"`
}
// extractColumnValues extracts a column's values from the query result rows
func extractColumnValues(data []interface{}, dsType string, columnType string) []string {
result := make([]string, 0)
for _, row := range data {
if rowMap, ok := row.(map[string]interface{}); ok {
// Try the possible column names for this datasource type
var value string
for _, key := range getPossibleColumnNames(dsType, columnType) {
if v, ok := rowMap[key]; ok {
if s, ok := v.(string); ok {
value = s
break
}
}
}
// MySQL/Doris name the SHOW TABLES column "Tables_in_<db>", which can
// never hit an exact key, so fall back to a prefix match
if value == "" && columnType == "table" {
for k, v := range rowMap {
if strings.HasPrefix(k, "Tables_in_") {
if s, ok := v.(string); ok {
value = s
break
}
}
}
}
if value != "" {
result = append(result, value)
}
}
}
return result
}
// getPossibleColumnNames returns candidate column names for exact-key lookup
func getPossibleColumnNames(dsType string, columnType string) []string {
switch columnType {
case "database":
return []string{"Database", "database", "datname", "name"}
case "table":
return []string{"Tables_in_", "table", "tablename", "name", "Name"}
default:
return []string{}
}
}
// convertToColumnInfo converts query results into the unified column format
func convertToColumnInfo(data []interface{}, dsType string) []ColumnInfo {
result := make([]ColumnInfo, 0)
for _, row := range data {
if rowMap, ok := row.(map[string]interface{}); ok {
col := ColumnInfo{}
// Extract the column name
for _, key := range []string{"Field", "field", "column_name", "name"} {
if v, ok := rowMap[key]; ok {
if s, ok := v.(string); ok {
col.Name = s
break
}
}
}
// Extract the column type
for _, key := range []string{"Type", "type", "data_type"} {
if v, ok := rowMap[key]; ok {
if s, ok := v.(string); ok {
col.Type = s
break
}
}
}
// Extract the comment (optional)
for _, key := range []string{"Comment", "comment", "column_comment"} {
if v, ok := rowMap[key]; ok {
if s, ok := v.(string); ok {
col.Comment = s
break
}
}
}
if col.Name != "" {
result = append(result, col)
}
}
}
return result
}

aiagent/llm/claude.go (new file, 376 lines)

@@ -0,0 +1,376 @@
package llm
import (
"bufio"
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
)
const (
DefaultClaudeURL = "https://api.anthropic.com/v1/messages"
ClaudeAPIVersion = "2023-06-01"
DefaultClaudeMaxTokens = 4096
)
// Claude implements the LLM interface for Anthropic Claude API
type Claude struct {
config *Config
client *http.Client
}
// NewClaude creates a new Claude provider
func NewClaude(cfg *Config, client *http.Client) (*Claude, error) {
if cfg.BaseURL == "" {
cfg.BaseURL = DefaultClaudeURL
}
return &Claude{
config: cfg,
client: client,
}, nil
}
func (c *Claude) Name() string {
return ProviderClaude
}
// Claude API request/response structures
type claudeRequest struct {
Model string `json:"model"`
Messages []claudeMessage `json:"messages"`
System string `json:"system,omitempty"`
MaxTokens int `json:"max_tokens"`
Temperature float64 `json:"temperature,omitempty"`
TopP float64 `json:"top_p,omitempty"`
Stop []string `json:"stop_sequences,omitempty"`
Stream bool `json:"stream,omitempty"`
Tools []claudeTool `json:"tools,omitempty"`
}
type claudeMessage struct {
Role string `json:"role"`
Content []claudeContentBlock `json:"content"`
}
type claudeContentBlock struct {
Type string `json:"type"`
Text string `json:"text,omitempty"`
ID string `json:"id,omitempty"`
Name string `json:"name,omitempty"`
Input any `json:"input,omitempty"`
ToolUseID string `json:"tool_use_id,omitempty"`
Content string `json:"content,omitempty"`
}
type claudeTool struct {
Name string `json:"name"`
Description string `json:"description"`
InputSchema map[string]interface{} `json:"input_schema"`
}
type claudeResponse struct {
ID string `json:"id"`
Type string `json:"type"`
Role string `json:"role"`
Content []claudeContentBlock `json:"content"`
Model string `json:"model"`
StopReason string `json:"stop_reason"`
StopSequence string `json:"stop_sequence,omitempty"`
Usage *struct {
InputTokens int `json:"input_tokens"`
OutputTokens int `json:"output_tokens"`
} `json:"usage,omitempty"`
Error *struct {
Type string `json:"type"`
Message string `json:"message"`
} `json:"error,omitempty"`
}
// Claude streaming event types
type claudeStreamEvent struct {
Type string `json:"type"`
Index int `json:"index,omitempty"`
ContentBlock *claudeContentBlock `json:"content_block,omitempty"`
Delta *claudeStreamDelta `json:"delta,omitempty"`
Message *claudeResponse `json:"message,omitempty"`
Usage *claudeStreamUsage `json:"usage,omitempty"`
}
type claudeStreamDelta struct {
Type string `json:"type"`
Text string `json:"text,omitempty"`
PartialJSON string `json:"partial_json,omitempty"`
StopReason string `json:"stop_reason,omitempty"`
}
type claudeStreamUsage struct {
OutputTokens int `json:"output_tokens"`
}
func (c *Claude) Generate(ctx context.Context, req *GenerateRequest) (*GenerateResponse, error) {
claudeReq := c.convertRequest(req)
claudeReq.Stream = false
respBody, err := c.doRequest(ctx, claudeReq)
if err != nil {
return nil, err
}
var claudeResp claudeResponse
if err := json.Unmarshal(respBody, &claudeResp); err != nil {
return nil, fmt.Errorf("failed to parse response: %w", err)
}
if claudeResp.Error != nil {
return nil, fmt.Errorf("Claude API error: %s", claudeResp.Error.Message)
}
return c.convertResponse(&claudeResp), nil
}
func (c *Claude) GenerateStream(ctx context.Context, req *GenerateRequest) (<-chan StreamChunk, error) {
claudeReq := c.convertRequest(req)
claudeReq.Stream = true
jsonData, err := json.Marshal(claudeReq)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", c.config.BaseURL, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
c.setHeaders(httpReq)
resp, err := c.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("failed to send request: %w", err)
}
if resp.StatusCode >= 400 {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
return nil, fmt.Errorf("Claude API error (status %d): %s", resp.StatusCode, string(body))
}
ch := make(chan StreamChunk, 100)
go c.streamResponse(ctx, resp, ch)
return ch, nil
}
func (c *Claude) streamResponse(ctx context.Context, resp *http.Response, ch chan<- StreamChunk) {
defer close(ch)
defer resp.Body.Close()
reader := bufio.NewReader(resp.Body)
var currentToolCall *ToolCall
for {
select {
case <-ctx.Done():
ch <- StreamChunk{Done: true, Error: ctx.Err()}
return
default:
}
line, err := reader.ReadString('\n')
if err != nil {
if err != io.EOF {
ch <- StreamChunk{Done: true, Error: err}
} else {
ch <- StreamChunk{Done: true}
}
return
}
line = strings.TrimSpace(line)
if line == "" || !strings.HasPrefix(line, "data: ") {
continue
}
data := strings.TrimPrefix(line, "data: ")
var event claudeStreamEvent
if err := json.Unmarshal([]byte(data), &event); err != nil {
continue
}
switch event.Type {
case "content_block_start":
if event.ContentBlock != nil && event.ContentBlock.Type == "tool_use" {
currentToolCall = &ToolCall{
ID: event.ContentBlock.ID,
Name: event.ContentBlock.Name,
}
}
case "content_block_delta":
if event.Delta != nil {
chunk := StreamChunk{}
switch event.Delta.Type {
case "text_delta":
chunk.Content = event.Delta.Text
case "input_json_delta":
if currentToolCall != nil {
currentToolCall.Arguments += event.Delta.PartialJSON
}
}
if chunk.Content != "" {
ch <- chunk
}
}
case "content_block_stop":
if currentToolCall != nil {
ch <- StreamChunk{
ToolCalls: []ToolCall{*currentToolCall},
}
currentToolCall = nil
}
case "message_delta":
if event.Delta != nil && event.Delta.StopReason != "" {
ch <- StreamChunk{
FinishReason: event.Delta.StopReason,
}
}
case "message_stop":
ch <- StreamChunk{Done: true}
return
case "error":
ch <- StreamChunk{Done: true, Error: fmt.Errorf("stream error: %s", data)}
return
}
}
}
func (c *Claude) convertRequest(req *GenerateRequest) *claudeRequest {
claudeReq := &claudeRequest{
Model: c.config.Model,
MaxTokens: req.MaxTokens,
Temperature: req.Temperature,
TopP: req.TopP,
Stop: req.Stop,
}
if claudeReq.MaxTokens <= 0 {
claudeReq.MaxTokens = DefaultClaudeMaxTokens
}
// Extract system message and convert other messages
for _, msg := range req.Messages {
if msg.Role == RoleSystem {
claudeReq.System = msg.Content
continue
}
// Claude uses content blocks instead of plain strings
claudeMsg := claudeMessage{
Role: msg.Role,
Content: []claudeContentBlock{
{Type: "text", Text: msg.Content},
},
}
claudeReq.Messages = append(claudeReq.Messages, claudeMsg)
}
// Convert tools
for _, tool := range req.Tools {
claudeReq.Tools = append(claudeReq.Tools, claudeTool{
Name: tool.Name,
Description: tool.Description,
InputSchema: tool.Parameters,
})
}
return claudeReq
}
func (c *Claude) convertResponse(resp *claudeResponse) *GenerateResponse {
result := &GenerateResponse{
FinishReason: resp.StopReason,
}
// Extract text content and tool calls
var textParts []string
for _, block := range resp.Content {
switch block.Type {
case "text":
textParts = append(textParts, block.Text)
case "tool_use":
inputJSON, _ := json.Marshal(block.Input)
result.ToolCalls = append(result.ToolCalls, ToolCall{
ID: block.ID,
Name: block.Name,
Arguments: string(inputJSON),
})
}
}
result.Content = strings.Join(textParts, "")
if resp.Usage != nil {
result.Usage = &Usage{
PromptTokens: resp.Usage.InputTokens,
CompletionTokens: resp.Usage.OutputTokens,
TotalTokens: resp.Usage.InputTokens + resp.Usage.OutputTokens,
}
}
return result
}
func (c *Claude) doRequest(ctx context.Context, req *claudeRequest) ([]byte, error) {
jsonData, err := json.Marshal(req)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", c.config.BaseURL, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
c.setHeaders(httpReq)
resp, err := c.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("failed to send request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("failed to read response: %w", err)
}
if resp.StatusCode >= 400 {
return nil, fmt.Errorf("Claude API error (status %d): %s", resp.StatusCode, string(body))
}
return body, nil
}
func (c *Claude) setHeaders(req *http.Request) {
req.Header.Set("Content-Type", "application/json")
req.Header.Set("anthropic-version", ClaudeAPIVersion)
if c.config.APIKey != "" {
req.Header.Set("x-api-key", c.config.APIKey)
}
for k, v := range c.config.Headers {
req.Header.Set(k, v)
}
}

aiagent/llm/gemini.go (new file, 376 lines)

@@ -0,0 +1,376 @@
package llm
import (
"bufio"
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
)
const (
DefaultGeminiURL = "https://generativelanguage.googleapis.com/v1beta/models"
)
// Gemini implements the LLM interface for Google Gemini API
type Gemini struct {
config *Config
client *http.Client
}
// NewGemini creates a new Gemini provider
func NewGemini(cfg *Config, client *http.Client) (*Gemini, error) {
if cfg.BaseURL == "" {
cfg.BaseURL = DefaultGeminiURL
}
return &Gemini{
config: cfg,
client: client,
}, nil
}
func (g *Gemini) Name() string {
return ProviderGemini
}
// Gemini API request/response structures
type geminiRequest struct {
Contents []geminiContent `json:"contents"`
SystemInstruction *geminiContent `json:"systemInstruction,omitempty"`
Tools []geminiTool `json:"tools,omitempty"`
GenerationConfig *geminiGenerationConfig `json:"generationConfig,omitempty"`
}
type geminiContent struct {
Role string `json:"role,omitempty"`
Parts []geminiPart `json:"parts"`
}
type geminiPart struct {
Text string `json:"text,omitempty"`
FunctionCall *geminiFunctionCall `json:"functionCall,omitempty"`
FunctionResponse *geminiFunctionResponse `json:"functionResponse,omitempty"`
}
type geminiFunctionCall struct {
Name string `json:"name"`
Args map[string]interface{} `json:"args"`
}
type geminiFunctionResponse struct {
Name string `json:"name"`
Response map[string]interface{} `json:"response"`
}
type geminiTool struct {
FunctionDeclarations []geminiFunctionDeclaration `json:"functionDeclarations,omitempty"`
}
type geminiFunctionDeclaration struct {
Name string `json:"name"`
Description string `json:"description"`
Parameters map[string]interface{} `json:"parameters,omitempty"`
}
type geminiGenerationConfig struct {
Temperature float64 `json:"temperature,omitempty"`
TopP float64 `json:"topP,omitempty"`
MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
StopSequences []string `json:"stopSequences,omitempty"`
}
type geminiResponse struct {
Candidates []struct {
Content geminiContent `json:"content"`
FinishReason string `json:"finishReason"`
SafetyRatings []struct {
Category string `json:"category"`
Probability string `json:"probability"`
} `json:"safetyRatings,omitempty"`
} `json:"candidates"`
UsageMetadata *struct {
PromptTokenCount int `json:"promptTokenCount"`
CandidatesTokenCount int `json:"candidatesTokenCount"`
TotalTokenCount int `json:"totalTokenCount"`
} `json:"usageMetadata,omitempty"`
Error *struct {
Code int `json:"code"`
Message string `json:"message"`
Status string `json:"status"`
} `json:"error,omitempty"`
}
func (g *Gemini) Generate(ctx context.Context, req *GenerateRequest) (*GenerateResponse, error) {
geminiReq := g.convertRequest(req)
url := g.buildURL(false)
respBody, err := g.doRequest(ctx, url, geminiReq)
if err != nil {
return nil, err
}
var geminiResp geminiResponse
if err := json.Unmarshal(respBody, &geminiResp); err != nil {
return nil, fmt.Errorf("failed to parse response: %w", err)
}
if geminiResp.Error != nil {
return nil, fmt.Errorf("Gemini API error: %s", geminiResp.Error.Message)
}
return g.convertResponse(&geminiResp), nil
}
func (g *Gemini) GenerateStream(ctx context.Context, req *GenerateRequest) (<-chan StreamChunk, error) {
geminiReq := g.convertRequest(req)
jsonData, err := json.Marshal(geminiReq)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
url := g.buildURL(true)
httpReq, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
g.setHeaders(httpReq)
resp, err := g.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("failed to send request: %w", err)
}
if resp.StatusCode >= 400 {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
return nil, fmt.Errorf("Gemini API error (status %d): %s", resp.StatusCode, string(body))
}
ch := make(chan StreamChunk, 100)
go g.streamResponse(ctx, resp, ch)
return ch, nil
}
func (g *Gemini) streamResponse(ctx context.Context, resp *http.Response, ch chan<- StreamChunk) {
defer close(ch)
defer resp.Body.Close()
reader := bufio.NewReader(resp.Body)
var buffer strings.Builder
for {
select {
case <-ctx.Done():
ch <- StreamChunk{Done: true, Error: ctx.Err()}
return
default:
}
line, err := reader.ReadString('\n')
if err != nil {
if err != io.EOF {
ch <- StreamChunk{Done: true, Error: err}
} else {
ch <- StreamChunk{Done: true}
}
return
}
line = strings.TrimSpace(line)
// Gemini streams JSON objects, accumulate until we have a complete one
if line == "" {
continue
}
// Handle SSE format if present
if strings.HasPrefix(line, "data: ") {
line = strings.TrimPrefix(line, "data: ")
}
buffer.WriteString(line)
// Try to parse accumulated JSON
var geminiResp geminiResponse
if err := json.Unmarshal([]byte(buffer.String()), &geminiResp); err != nil {
// Not complete yet, continue accumulating
continue
}
// Reset buffer for next response
buffer.Reset()
if len(geminiResp.Candidates) > 0 {
candidate := geminiResp.Candidates[0]
chunk := StreamChunk{
FinishReason: candidate.FinishReason,
}
for _, part := range candidate.Content.Parts {
if part.Text != "" {
chunk.Content += part.Text
}
if part.FunctionCall != nil {
argsJSON, _ := json.Marshal(part.FunctionCall.Args)
chunk.ToolCalls = append(chunk.ToolCalls, ToolCall{
Name: part.FunctionCall.Name,
Arguments: string(argsJSON),
})
}
}
ch <- chunk
if candidate.FinishReason != "" && candidate.FinishReason != "STOP" {
ch <- StreamChunk{Done: true}
return
}
}
}
}
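Review note: streamResponse above accumulates partial lines until the buffer parses as complete JSON. A minimal, self-contained sketch of that accumulate-and-retry technique (the input lines are illustrative, not the actual Gemini wire format):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseAccumulated appends lines to a buffer and emits each payload
// as soon as the accumulated text parses as complete JSON.
func parseAccumulated(lines []string) []string {
	var buf strings.Builder
	var out []string
	for _, line := range lines {
		buf.WriteString(line)
		var obj map[string]interface{}
		if err := json.Unmarshal([]byte(buf.String()), &obj); err != nil {
			continue // not complete yet, keep accumulating
		}
		out = append(out, buf.String())
		buf.Reset()
	}
	return out
}

func main() {
	// One JSON object split across three lines, then a complete one.
	lines := []string{`{"text":`, `"hel`, `lo"}`, `{"text":"world"}`}
	fmt.Println(parseAccumulated(lines))
}
```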
func (g *Gemini) convertRequest(req *GenerateRequest) *geminiRequest {
geminiReq := &geminiRequest{
GenerationConfig: &geminiGenerationConfig{
Temperature: req.Temperature,
TopP: req.TopP,
MaxOutputTokens: req.MaxTokens,
StopSequences: req.Stop,
},
}
// Convert messages
for _, msg := range req.Messages {
if msg.Role == RoleSystem {
geminiReq.SystemInstruction = &geminiContent{
Parts: []geminiPart{{Text: msg.Content}},
}
continue
}
// Map roles
role := msg.Role
if role == RoleAssistant {
role = "model"
}
geminiReq.Contents = append(geminiReq.Contents, geminiContent{
Role: role,
Parts: []geminiPart{{Text: msg.Content}},
})
}
// Convert tools
if len(req.Tools) > 0 {
var declarations []geminiFunctionDeclaration
for _, tool := range req.Tools {
declarations = append(declarations, geminiFunctionDeclaration{
Name: tool.Name,
Description: tool.Description,
Parameters: tool.Parameters,
})
}
geminiReq.Tools = []geminiTool{{FunctionDeclarations: declarations}}
}
return geminiReq
}
func (g *Gemini) convertResponse(resp *geminiResponse) *GenerateResponse {
result := &GenerateResponse{}
if len(resp.Candidates) > 0 {
candidate := resp.Candidates[0]
result.FinishReason = candidate.FinishReason
var textParts []string
for _, part := range candidate.Content.Parts {
if part.Text != "" {
textParts = append(textParts, part.Text)
}
if part.FunctionCall != nil {
argsJSON, _ := json.Marshal(part.FunctionCall.Args)
result.ToolCalls = append(result.ToolCalls, ToolCall{
Name: part.FunctionCall.Name,
Arguments: string(argsJSON),
})
}
}
result.Content = strings.Join(textParts, "")
}
if resp.UsageMetadata != nil {
result.Usage = &Usage{
PromptTokens: resp.UsageMetadata.PromptTokenCount,
CompletionTokens: resp.UsageMetadata.CandidatesTokenCount,
TotalTokens: resp.UsageMetadata.TotalTokenCount,
}
}
return result
}
func (g *Gemini) buildURL(stream bool) string {
action := "generateContent"
if stream {
action = "streamGenerateContent"
}
// Check if baseURL already contains the full path
if strings.Contains(g.config.BaseURL, ":generateContent") ||
strings.Contains(g.config.BaseURL, ":streamGenerateContent") {
return fmt.Sprintf("%s?key=%s", g.config.BaseURL, g.config.APIKey)
}
return fmt.Sprintf("%s/%s:%s?key=%s",
g.config.BaseURL,
g.config.Model,
action,
g.config.APIKey,
)
}
func (g *Gemini) doRequest(ctx context.Context, url string, req *geminiRequest) ([]byte, error) {
jsonData, err := json.Marshal(req)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
g.setHeaders(httpReq)
resp, err := g.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("failed to send request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("failed to read response: %w", err)
}
if resp.StatusCode >= 400 {
return nil, fmt.Errorf("Gemini API error (status %d): %s", resp.StatusCode, string(body))
}
return body, nil
}
func (g *Gemini) setHeaders(req *http.Request) {
req.Header.Set("Content-Type", "application/json")
for k, v := range g.config.Headers {
req.Header.Set(k, v)
}
}

aiagent/llm/helper.go (new file, +135 lines)

@@ -0,0 +1,135 @@
package llm
import (
"context"
"strings"
)
// Chat is a convenience function for simple chat completions
func Chat(ctx context.Context, llm LLM, messages []Message) (string, error) {
resp, err := llm.Generate(ctx, &GenerateRequest{
Messages: messages,
})
if err != nil {
return "", err
}
return resp.Content, nil
}
// ChatWithSystem is a convenience function for chat with a system prompt
func ChatWithSystem(ctx context.Context, llm LLM, systemPrompt string, userMessage string) (string, error) {
messages := []Message{
{Role: RoleSystem, Content: systemPrompt},
{Role: RoleUser, Content: userMessage},
}
return Chat(ctx, llm, messages)
}
// NewMessage creates a new message
func NewMessage(role, content string) Message {
return Message{Role: role, Content: content}
}
// SystemMessage creates a system message
func SystemMessage(content string) Message {
return Message{Role: RoleSystem, Content: content}
}
// UserMessage creates a user message
func UserMessage(content string) Message {
return Message{Role: RoleUser, Content: content}
}
// AssistantMessage creates an assistant message
func AssistantMessage(content string) Message {
return Message{Role: RoleAssistant, Content: content}
}
// DetectProvider attempts to detect the provider from the base URL
func DetectProvider(baseURL string) string {
baseURL = strings.ToLower(baseURL)
switch {
case strings.Contains(baseURL, "anthropic.com"):
return ProviderClaude
case strings.Contains(baseURL, "generativelanguage.googleapis.com"):
return ProviderGemini
case strings.Contains(baseURL, "aiplatform.googleapis.com"):
return ProviderVertex
case strings.Contains(baseURL, "bedrock"):
return ProviderBedrock
case strings.Contains(baseURL, "localhost:11434"):
return ProviderOllama
default:
// Default to OpenAI-compatible
return ProviderOpenAI
}
}
// DetectProviderFromModel attempts to detect the provider from the model name
func DetectProviderFromModel(model string) string {
model = strings.ToLower(model)
switch {
case strings.HasPrefix(model, "claude"):
return ProviderClaude
case strings.HasPrefix(model, "gemini"):
return ProviderGemini
case strings.HasPrefix(model, "gpt") || strings.HasPrefix(model, "o1") || strings.HasPrefix(model, "o3"):
return ProviderOpenAI
case strings.HasPrefix(model, "llama") || strings.HasPrefix(model, "mistral") || strings.HasPrefix(model, "qwen"):
return ProviderOllama
default:
return ProviderOpenAI
}
}
// BuildToolDefinition creates a tool definition with JSON schema parameters
func BuildToolDefinition(name, description string, properties map[string]interface{}, required []string) ToolDefinition {
params := map[string]interface{}{
"type": "object",
"properties": properties,
}
if len(required) > 0 {
params["required"] = required
}
return ToolDefinition{
Name: name,
Description: description,
Parameters: params,
}
}
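Review note: BuildToolDefinition wraps the properties in a JSON-schema object and omits `required` when empty. A self-contained sketch of the shape it produces (the `city` parameter is a made-up example):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildParams mirrors the schema assembly in BuildToolDefinition.
func buildParams(properties map[string]interface{}, required []string) map[string]interface{} {
	params := map[string]interface{}{
		"type":       "object",
		"properties": properties,
	}
	if len(required) > 0 {
		params["required"] = required
	}
	return params
}

func main() {
	params := buildParams(map[string]interface{}{
		"city": map[string]interface{}{"type": "string", "description": "City name"},
	}, []string{"city"})
	out, _ := json.Marshal(params)
	fmt.Println(string(out))
}
```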
// CollectStream collects all chunks from a stream into a single response
func CollectStream(ch <-chan StreamChunk) (*GenerateResponse, error) {
var content strings.Builder
var toolCalls []ToolCall
var finishReason string
var lastErr error
for chunk := range ch {
if chunk.Error != nil {
lastErr = chunk.Error
}
if chunk.Content != "" {
content.WriteString(chunk.Content)
}
if len(chunk.ToolCalls) > 0 {
toolCalls = append(toolCalls, chunk.ToolCalls...)
}
if chunk.FinishReason != "" {
finishReason = chunk.FinishReason
}
}
if lastErr != nil {
return nil, lastErr
}
return &GenerateResponse{
Content: content.String(),
ToolCalls: toolCalls,
FinishReason: finishReason,
}, nil
}
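Review note: CollectStream drains the channel to completion before returning, keeping only the last error. A self-contained sketch of the same drain-and-aggregate pattern with a locally defined chunk type:

```go
package main

import (
	"fmt"
	"strings"
)

type chunk struct {
	Content string
	Err     error
}

// collect drains ch, concatenating content and remembering the last error.
func collect(ch <-chan chunk) (string, error) {
	var sb strings.Builder
	var lastErr error
	for c := range ch {
		if c.Err != nil {
			lastErr = c.Err
		}
		sb.WriteString(c.Content)
	}
	if lastErr != nil {
		return "", lastErr
	}
	return sb.String(), nil
}

func main() {
	ch := make(chan chunk, 2)
	ch <- chunk{Content: "Hello, "}
	ch <- chunk{Content: "world"}
	close(ch)
	out, err := collect(ch)
	fmt.Println(out, err) // → Hello, world <nil>
}
```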

aiagent/llm/llm.go (new file, +193 lines)

@@ -0,0 +1,193 @@
// Package llm provides a unified interface for multiple LLM providers.
// Supports OpenAI-compatible APIs, Claude/Anthropic, and Gemini.
package llm
import (
"context"
"crypto/tls"
"fmt"
"net/http"
"net/url"
"time"
)
// Provider types
const (
ProviderOpenAI = "openai" // OpenAI and compatible APIs (Azure, vLLM, etc.)
ProviderClaude = "claude" // Anthropic Claude
ProviderGemini = "gemini" // Google Gemini
ProviderOllama = "ollama" // Ollama local models
ProviderBedrock = "bedrock" // AWS Bedrock
ProviderVertex = "vertex" // Google Vertex AI
)
// Role constants
const (
RoleSystem = "system"
RoleUser = "user"
RoleAssistant = "assistant"
)
// Message represents a chat message
type Message struct {
Role string `json:"role"`
Content string `json:"content"`
}
// ToolCall represents a tool/function call from the LLM
type ToolCall struct {
ID string `json:"id"`
Name string `json:"name"`
Arguments string `json:"arguments"`
}
// ToolDefinition defines a tool that the LLM can call
type ToolDefinition struct {
Name string `json:"name"`
Description string `json:"description"`
Parameters map[string]interface{} `json:"parameters,omitempty"`
}
// GenerateRequest is the unified request for LLM generation
type GenerateRequest struct {
Messages []Message `json:"messages"`
Tools []ToolDefinition `json:"tools,omitempty"`
MaxTokens int `json:"max_tokens,omitempty"`
Temperature float64 `json:"temperature,omitempty"`
TopP float64 `json:"top_p,omitempty"`
Stop []string `json:"stop,omitempty"`
Stream bool `json:"stream,omitempty"`
}
// GenerateResponse is the unified response from LLM generation
type GenerateResponse struct {
Content string `json:"content"`
ToolCalls []ToolCall `json:"tool_calls,omitempty"`
FinishReason string `json:"finish_reason"`
Usage *Usage `json:"usage,omitempty"`
}
// Usage represents token usage statistics
type Usage struct {
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
}
// StreamChunk represents a chunk in streaming response
type StreamChunk struct {
Content string `json:"content,omitempty"`
ToolCalls []ToolCall `json:"tool_calls,omitempty"`
FinishReason string `json:"finish_reason,omitempty"`
Done bool `json:"done"`
Error error `json:"error,omitempty"`
}
// LLM is the unified interface for all LLM providers
type LLM interface {
// Name returns the provider name
Name() string
// Generate sends a request to the LLM and returns the response
Generate(ctx context.Context, req *GenerateRequest) (*GenerateResponse, error)
// GenerateStream sends a request and returns a channel for streaming responses
GenerateStream(ctx context.Context, req *GenerateRequest) (<-chan StreamChunk, error)
}
// Config is the configuration for creating an LLM provider
type Config struct {
// Provider type: openai, claude, gemini, ollama, bedrock, vertex
Provider string `json:"provider"`
// API endpoint URL
BaseURL string `json:"base_url,omitempty"`
// API key or token
APIKey string `json:"api_key,omitempty"`
// Model name (e.g., "gpt-4", "claude-3-opus", "gemini-pro")
Model string `json:"model"`
// Additional headers for API requests
Headers map[string]string `json:"headers,omitempty"`
// HTTP timeout in milliseconds
Timeout int `json:"timeout,omitempty"`
// Skip SSL verification (for self-signed certs)
SkipSSLVerify bool `json:"skip_ssl_verify,omitempty"`
// HTTP proxy URL
Proxy string `json:"proxy,omitempty"`
// Provider-specific options
Options map[string]interface{} `json:"options,omitempty"`
}
// DefaultConfig returns a config with default values
func DefaultConfig() *Config {
return &Config{
Provider: ProviderOpenAI,
Timeout: 60000,
}
}
// New creates an LLM instance based on the config
func New(cfg *Config) (LLM, error) {
if cfg == nil {
cfg = DefaultConfig()
}
// Create HTTP client
client := createHTTPClient(cfg)
switch cfg.Provider {
case ProviderOpenAI, "":
return NewOpenAI(cfg, client)
case ProviderClaude:
return NewClaude(cfg, client)
case ProviderGemini:
return NewGemini(cfg, client)
case ProviderOllama:
// Ollama uses OpenAI-compatible API
if cfg.BaseURL == "" {
cfg.BaseURL = "http://localhost:11434/v1"
}
return NewOpenAI(cfg, client)
default:
return nil, fmt.Errorf("unsupported LLM provider: %s", cfg.Provider)
}
}
// createHTTPClient creates an HTTP client with the given config
func createHTTPClient(cfg *Config) *http.Client {
transport := &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: cfg.SkipSSLVerify,
},
}
if cfg.Proxy != "" {
if proxyURL, err := url.Parse(cfg.Proxy); err == nil {
transport.Proxy = http.ProxyURL(proxyURL)
}
}
timeout := cfg.Timeout
if timeout <= 0 {
timeout = 60000
}
return &http.Client{
Timeout: time.Duration(timeout) * time.Millisecond,
Transport: transport,
}
}
// ConvertMessages returns a defensive copy of the messages slice,
// so providers can adapt it to their own format without mutating the caller's slice
func ConvertMessages(messages []Message) []Message {
result := make([]Message, len(messages))
copy(result, messages)
return result
}

aiagent/llm/openai.go (new file, +416 lines)

@@ -0,0 +1,416 @@
package llm
import (
"bufio"
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"time"
)
const (
DefaultOpenAIURL = "https://api.openai.com/v1/chat/completions"
// Retry configuration
maxRetries = 3
initialRetryWait = 5 * time.Second // initial wait after a rate limit
maxRetryWait = 60 * time.Second // maximum wait between retries
)
// OpenAI implements the LLM interface for OpenAI and compatible APIs
type OpenAI struct {
config *Config
client *http.Client
}
// NewOpenAI creates a new OpenAI provider
func NewOpenAI(cfg *Config, client *http.Client) (*OpenAI, error) {
if cfg.BaseURL == "" {
cfg.BaseURL = DefaultOpenAIURL
}
return &OpenAI{
config: cfg,
client: client,
}, nil
}
func (o *OpenAI) Name() string {
return ProviderOpenAI
}
// OpenAI API request/response structures
type openAIRequest struct {
Model string `json:"model"`
Messages []openAIMessage `json:"messages"`
Tools []openAITool `json:"tools,omitempty"`
MaxTokens int `json:"max_tokens,omitempty"`
Temperature float64 `json:"temperature,omitempty"`
TopP float64 `json:"top_p,omitempty"`
Stop []string `json:"stop,omitempty"`
Stream bool `json:"stream,omitempty"`
}
type openAIMessage struct {
Role string `json:"role"`
Content string `json:"content,omitempty"`
ToolCalls []openAIToolCall `json:"tool_calls,omitempty"`
ToolCallID string `json:"tool_call_id,omitempty"`
}
type openAITool struct {
Type string `json:"type"`
Function openAIFunction `json:"function"`
}
type openAIFunction struct {
Name string `json:"name"`
Description string `json:"description"`
Parameters map[string]interface{} `json:"parameters,omitempty"`
}
type openAIToolCall struct {
ID string `json:"id"`
Type string `json:"type"`
Function struct {
Name string `json:"name"`
Arguments string `json:"arguments"`
} `json:"function"`
}
type openAIResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []struct {
Index int `json:"index"`
Message openAIMessage `json:"message"`
Delta openAIMessage `json:"delta"`
FinishReason string `json:"finish_reason"`
} `json:"choices"`
Usage *struct {
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
} `json:"usage,omitempty"`
Error *struct {
Message string `json:"message"`
Type string `json:"type"`
Code string `json:"code"`
} `json:"error,omitempty"`
}
func (o *OpenAI) Generate(ctx context.Context, req *GenerateRequest) (*GenerateResponse, error) {
// Convert to OpenAI format
openAIReq := o.convertRequest(req)
openAIReq.Stream = false
// Make request
respBody, err := o.doRequest(ctx, openAIReq)
if err != nil {
return nil, err
}
// Parse response
var openAIResp openAIResponse
if err := json.Unmarshal(respBody, &openAIResp); err != nil {
return nil, fmt.Errorf("failed to parse response: %w", err)
}
if openAIResp.Error != nil {
return nil, fmt.Errorf("OpenAI API error: %s", openAIResp.Error.Message)
}
if len(openAIResp.Choices) == 0 {
return nil, fmt.Errorf("no response from OpenAI")
}
// Convert to unified response
return o.convertResponse(&openAIResp), nil
}
// isRetryableStatus reports whether the HTTP status code is worth retrying
func isRetryableStatus(statusCode int) bool {
switch statusCode {
case 429, 500, 502, 503, 504:
return true
default:
return false
}
}
func (o *OpenAI) GenerateStream(ctx context.Context, req *GenerateRequest) (<-chan StreamChunk, error) {
// Convert to OpenAI format
openAIReq := o.convertRequest(req)
openAIReq.Stream = true
// Create request body
jsonData, err := json.Marshal(openAIReq)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
var resp *http.Response
var lastErr error
retryWait := initialRetryWait
// Retry loop
for attempt := 0; attempt <= maxRetries; attempt++ {
if attempt > 0 {
// Wait before retrying
select {
case <-ctx.Done():
return nil, ctx.Err()
case <-time.After(retryWait):
}
// Exponential backoff, capped at maxRetryWait
retryWait *= 2
if retryWait > maxRetryWait {
retryWait = maxRetryWait
}
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", o.config.BaseURL, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
o.setHeaders(httpReq)
// Make request
resp, err = o.client.Do(httpReq)
if err != nil {
lastErr = fmt.Errorf("failed to send request: %w", err)
continue // network error, retry
}
if resp.StatusCode >= 400 {
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
lastErr = fmt.Errorf("OpenAI API error (status %d): %s", resp.StatusCode, string(body))
// Check whether this status is retryable
if isRetryableStatus(resp.StatusCode) && attempt < maxRetries {
continue // retryable error, try again
}
return nil, lastErr
}
// Success, exit the retry loop
break
}
if resp == nil {
return nil, lastErr
}
// Create channel and start streaming
ch := make(chan StreamChunk, 100)
go o.streamResponse(ctx, resp, ch)
return ch, nil
}
func (o *OpenAI) streamResponse(ctx context.Context, resp *http.Response, ch chan<- StreamChunk) {
defer close(ch)
defer resp.Body.Close()
reader := bufio.NewReader(resp.Body)
for {
select {
case <-ctx.Done():
ch <- StreamChunk{Done: true, Error: ctx.Err()}
return
default:
}
line, err := reader.ReadString('\n')
if err != nil {
if err != io.EOF {
ch <- StreamChunk{Done: true, Error: err}
} else {
ch <- StreamChunk{Done: true}
}
return
}
line = strings.TrimSpace(line)
if line == "" {
continue
}
if !strings.HasPrefix(line, "data: ") {
continue
}
data := strings.TrimPrefix(line, "data: ")
if data == "[DONE]" {
ch <- StreamChunk{Done: true}
return
}
var streamResp openAIResponse
if err := json.Unmarshal([]byte(data), &streamResp); err != nil {
continue
}
if len(streamResp.Choices) > 0 {
delta := streamResp.Choices[0].Delta
chunk := StreamChunk{
Content: delta.Content,
FinishReason: streamResp.Choices[0].FinishReason,
}
// Handle tool calls in stream
if len(delta.ToolCalls) > 0 {
for _, tc := range delta.ToolCalls {
chunk.ToolCalls = append(chunk.ToolCalls, ToolCall{
ID: tc.ID,
Name: tc.Function.Name,
Arguments: tc.Function.Arguments,
})
}
}
ch <- chunk
}
}
}
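Review note: streamResponse above parses SSE lines by skipping blanks and non-`data:` lines, stopping at `[DONE]`, and JSON-decoding the rest. A self-contained sketch over a canned stream (the payload shape is simplified, not the full OpenAI chunk format):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// readSSE extracts the "content" field from each data: line until [DONE].
func readSSE(stream string) []string {
	var out []string
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data: ") {
			continue // skip blank lines and other fields
		}
		data := strings.TrimPrefix(line, "data: ")
		if data == "[DONE]" {
			break
		}
		var payload struct {
			Content string `json:"content"`
		}
		if err := json.Unmarshal([]byte(data), &payload); err != nil {
			continue // skip malformed chunks
		}
		out = append(out, payload.Content)
	}
	return out
}

func main() {
	stream := "data: {\"content\":\"Hel\"}\n\ndata: {\"content\":\"lo\"}\n\ndata: [DONE]\n"
	fmt.Println(strings.Join(readSSE(stream), "")) // → Hello
}
```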
func (o *OpenAI) convertRequest(req *GenerateRequest) *openAIRequest {
openAIReq := &openAIRequest{
Model: o.config.Model,
MaxTokens: req.MaxTokens,
Temperature: req.Temperature,
TopP: req.TopP,
Stop: req.Stop,
}
// Convert messages
for _, msg := range req.Messages {
openAIReq.Messages = append(openAIReq.Messages, openAIMessage{
Role: msg.Role,
Content: msg.Content,
})
}
// Convert tools
for _, tool := range req.Tools {
openAIReq.Tools = append(openAIReq.Tools, openAITool{
Type: "function",
Function: openAIFunction{
Name: tool.Name,
Description: tool.Description,
Parameters: tool.Parameters,
},
})
}
return openAIReq
}
func (o *OpenAI) convertResponse(resp *openAIResponse) *GenerateResponse {
result := &GenerateResponse{}
if len(resp.Choices) > 0 {
choice := resp.Choices[0]
result.Content = choice.Message.Content
result.FinishReason = choice.FinishReason
// Convert tool calls
for _, tc := range choice.Message.ToolCalls {
result.ToolCalls = append(result.ToolCalls, ToolCall{
ID: tc.ID,
Name: tc.Function.Name,
Arguments: tc.Function.Arguments,
})
}
}
if resp.Usage != nil {
result.Usage = &Usage{
PromptTokens: resp.Usage.PromptTokens,
CompletionTokens: resp.Usage.CompletionTokens,
TotalTokens: resp.Usage.TotalTokens,
}
}
return result
}
func (o *OpenAI) doRequest(ctx context.Context, req *openAIRequest) ([]byte, error) {
jsonData, err := json.Marshal(req)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %w", err)
}
var lastErr error
retryWait := initialRetryWait
// Retry loop
for attempt := 0; attempt <= maxRetries; attempt++ {
if attempt > 0 {
// Wait before retrying
select {
case <-ctx.Done():
return nil, ctx.Err()
case <-time.After(retryWait):
}
// Exponential backoff, capped at maxRetryWait
retryWait *= 2
if retryWait > maxRetryWait {
retryWait = maxRetryWait
}
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", o.config.BaseURL, bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
o.setHeaders(httpReq)
resp, err := o.client.Do(httpReq)
if err != nil {
lastErr = fmt.Errorf("failed to send request: %w", err)
continue // network error, retry
}
body, err := io.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
lastErr = fmt.Errorf("failed to read response: %w", err)
continue
}
if resp.StatusCode >= 400 {
lastErr = fmt.Errorf("OpenAI API error (status %d): %s", resp.StatusCode, string(body))
// Check whether this status is retryable
if isRetryableStatus(resp.StatusCode) && attempt < maxRetries {
continue
}
return nil, lastErr
}
return body, nil
}
return nil, lastErr
}
func (o *OpenAI) setHeaders(req *http.Request) {
req.Header.Set("Content-Type", "application/json")
if o.config.APIKey != "" {
req.Header.Set("Authorization", "Bearer "+o.config.APIKey)
}
for k, v := range o.config.Headers {
req.Header.Set(k, v)
}
}

aiagent/llm/prompt.go (new file, +133 lines)

@@ -0,0 +1,133 @@
package llm
import (
"fmt"
"runtime"
"strings"
"time"
)
// ToolInfo describes a tool for prompt construction
type ToolInfo struct {
Name string
Description string
Parameters []ToolParamInfo
}
// ToolParamInfo describes a single tool parameter
type ToolParamInfo struct {
Name string
Type string
Description string
Required bool
}
// PromptData holds data for prompt templates
type PromptData struct {
Platform string // operating system
Date string // current date
}
// BuildToolsSection builds the detailed tool description section
func BuildToolsSection(tools []ToolInfo) string {
if len(tools) == 0 {
return ""
}
var sb strings.Builder
sb.WriteString("## Available Tools\n\n")
for _, tool := range tools {
sb.WriteString(fmt.Sprintf("### %s\n", tool.Name))
sb.WriteString(fmt.Sprintf("%s\n", tool.Description))
if len(tool.Parameters) > 0 {
sb.WriteString("Parameters:\n")
for _, param := range tool.Parameters {
required := ""
if param.Required {
required = " (required)"
}
sb.WriteString(fmt.Sprintf("- %s (%s)%s: %s\n", param.Name, param.Type, required, param.Description))
}
}
sb.WriteString("\n")
}
return sb.String()
}
// BuildToolsListBrief builds a brief tool list (used in Plan mode)
func BuildToolsListBrief(tools []ToolInfo) string {
if len(tools) == 0 {
return ""
}
var sb strings.Builder
sb.WriteString("## Available Tools\n\n")
for _, tool := range tools {
sb.WriteString(fmt.Sprintf("- **%s**: %s\n", tool.Name, tool.Description))
}
return sb.String()
}
// BuildEnvSection builds the environment info section
func BuildEnvSection() string {
var sb strings.Builder
sb.WriteString("## Environment\n\n")
sb.WriteString(fmt.Sprintf("- Platform: %s\n", runtime.GOOS))
sb.WriteString(fmt.Sprintf("- Date: %s\n", time.Now().Format("2006-01-02")))
return sb.String()
}
// BuildSkillsSection builds the skill guidance section
func BuildSkillsSection(skillContents []string) string {
if len(skillContents) == 0 {
return ""
}
var sb strings.Builder
sb.WriteString("## 专项技能指导\n\n")
if len(skillContents) == 1 {
sb.WriteString("你已被加载以下专项技能,请参考技能中的流程:\n\n")
sb.WriteString(skillContents[0])
sb.WriteString("\n\n")
} else {
sb.WriteString("你已被加载以下专项技能,请参考技能中的流程来制定执行计划:\n\n")
for i, content := range skillContents {
sb.WriteString(fmt.Sprintf("### 技能 %d\n\n", i+1))
sb.WriteString(content)
sb.WriteString("\n\n")
}
}
return sb.String()
}
// BuildPreviousFindingsSection builds the previous-findings section
func BuildPreviousFindingsSection(findings []string) string {
if len(findings) == 0 {
return ""
}
var sb strings.Builder
sb.WriteString("## Previous Findings\n\n")
for _, finding := range findings {
sb.WriteString(fmt.Sprintf("- %s\n", finding))
}
sb.WriteString("\n")
return sb.String()
}
// BuildCurrentStepSection builds the current-step section
func BuildCurrentStepSection(goal, approach string) string {
var sb strings.Builder
sb.WriteString("## Current Step\n\n")
sb.WriteString(fmt.Sprintf("**Goal**: %s\n", goal))
sb.WriteString(fmt.Sprintf("**Approach**: %s\n\n", approach))
return sb.String()
}

aiagent/mcp.go (new file, +571 lines)

@@ -0,0 +1,571 @@
package aiagent
import (
"bytes"
"context"
"crypto/tls"
"encoding/json"
"fmt"
"io"
"net/http"
"net/url"
"os"
"os/exec"
"strings"
"sync"
"time"
"github.com/modelcontextprotocol/go-sdk/mcp"
"github.com/toolkits/pkg/logger"
)
const (
// MCP transport types
MCPTransportStdio = "stdio" // standard input/output transport
MCPTransportSSE = "sse" // HTTP Server-Sent Events transport
// Default timeouts (milliseconds)
DefaultMCPTimeout = 30000 // 30 seconds
DefaultMCPConnectTimeout = 10000 // 10 seconds
)
// MCPConfig holds MCP server configuration (used in AIAgentConfig)
type MCPConfig struct {
// List of MCP servers
Servers []MCPServerConfig `json:"servers"`
}
// MCPServerConfig configures a single MCP server
type MCPServerConfig struct {
// Server name (unique identifier)
Name string `json:"name"`
// Transport type: stdio or sse
Transport string `json:"transport"`
// === stdio transport options ===
Command string `json:"command,omitempty"` // command to launch
Args []string `json:"args,omitempty"` // command arguments
Env map[string]string `json:"env,omitempty"` // environment variables (${VAR} is expanded from the system environment)
// === SSE transport options ===
URL string `json:"url,omitempty"` // SSE server URL
Headers map[string]string `json:"headers,omitempty"` // request headers (${VAR} is expanded from the system environment)
SkipSSLVerify bool `json:"skip_ssl_verify,omitempty"` // skip SSL verification
// === Authentication options (SSE transport) ===
// Convenience auth settings; the matching header is set automatically
AuthType string `json:"auth_type,omitempty"` // auth type: bearer, api_key, basic
APIKey string `json:"api_key,omitempty"` // API key (${VAR} is expanded from the system environment)
Username string `json:"username,omitempty"` // Basic Auth username
Password string `json:"password,omitempty"` // Basic Auth password (${VAR} is expanded)
// Common options
Timeout int `json:"timeout,omitempty"` // tool call timeout (milliseconds)
ConnectTimeout int `json:"connect_timeout,omitempty"` // connect timeout (milliseconds)
}
// MCPToolConfig configures an MCP tool (used in AgentTool)
type MCPToolConfig struct {
// MCP server name (references an entry in MCPConfig.Servers)
ServerName string `json:"server_name"`
// Tool name (as reported by the MCP server)
ToolName string `json:"tool_name"`
}
// MCPTool is the internal representation of an MCP tool definition
type MCPTool struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
InputSchema map[string]interface{} `json:"inputSchema,omitempty"`
}
// MCPToolsCallResult is the result of a tool call
type MCPToolsCallResult struct {
Content []MCPContent `json:"content"`
IsError bool `json:"isError,omitempty"`
}
// MCPContent is a piece of content returned by a tool
type MCPContent struct {
Type string `json:"type"`
Text string `json:"text,omitempty"`
Data string `json:"data,omitempty"`
MimeType string `json:"mimeType,omitempty"`
}
// MCPClient is an MCP client (based on the official go-sdk)
type MCPClient struct {
config *MCPServerConfig
// SDK client and session (stdio transport)
client *mcp.Client
session *mcp.ClientSession
// SSE transport (the SDK does not yet ship an SSE client, so a custom implementation is retained)
httpClient *http.Client
sseURL string
// Shared state
mu sync.Mutex
initialized bool
tools []MCPTool // cached tool list
}
// expandEnvVars expands environment variable references in a string
func expandEnvVars(s string) string {
return os.ExpandEnv(s)
}
// NewMCPClient creates an MCP client
func NewMCPClient(config *MCPServerConfig) (*MCPClient, error) {
client := &MCPClient{
config: config,
}
return client, nil
}
// Connect connects to the MCP server
func (c *MCPClient) Connect(ctx context.Context) error {
c.mu.Lock()
defer c.mu.Unlock()
if c.initialized {
return nil
}
var err error
switch c.config.Transport {
case MCPTransportStdio:
err = c.connectStdio(ctx)
case MCPTransportSSE:
err = c.connectSSE(ctx)
default:
return fmt.Errorf("unsupported MCP transport: %s", c.config.Transport)
}
if err != nil {
return err
}
c.initialized = true
return nil
}
// connectStdio connects over stdio (using the official SDK)
func (c *MCPClient) connectStdio(ctx context.Context) error {
if c.config.Command == "" {
return fmt.Errorf("stdio transport requires command")
}
// Prepare environment variables
env := os.Environ()
for k, v := range c.config.Env {
expandedValue := expandEnvVars(v)
env = append(env, fmt.Sprintf("%s=%s", k, expandedValue))
}
// Create the exec.Cmd
cmd := exec.CommandContext(ctx, c.config.Command, c.config.Args...)
cmd.Env = env
// Use the official SDK's CommandTransport
transport := &mcp.CommandTransport{
Command: cmd,
}
// Create the MCP client
c.client = mcp.NewClient(
&mcp.Implementation{
Name: "nightingale-aiagent",
Version: "1.0.0",
},
nil,
)
// Connect and initialize (Connect performs the initialize handshake automatically)
session, err := c.client.Connect(ctx, transport, nil)
if err != nil {
return fmt.Errorf("failed to connect MCP client: %v", err)
}
c.session = session
logger.Infof("MCP stdio server started: %s", c.config.Name)
return nil
}
// connectSSE connects over SSE (custom implementation retained; the SDK does not yet support SSE clients)
func (c *MCPClient) connectSSE(ctx context.Context) error {
if c.config.URL == "" {
return fmt.Errorf("SSE transport requires URL")
}
// Create the HTTP client
transport := &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: c.config.SkipSSLVerify},
}
timeout := c.config.ConnectTimeout
if timeout <= 0 {
timeout = DefaultMCPConnectTimeout
}
c.httpClient = &http.Client{
Timeout: time.Duration(timeout) * time.Millisecond,
Transport: transport,
}
c.sseURL = c.config.URL
logger.Infof("MCP SSE client configured: %s", c.config.Name)
return nil
}
// ListTools fetches the tool list
func (c *MCPClient) ListTools(ctx context.Context) ([]MCPTool, error) {
c.mu.Lock()
if len(c.tools) > 0 {
tools := c.tools
c.mu.Unlock()
return tools, nil
}
c.mu.Unlock()
var tools []MCPTool
switch c.config.Transport {
case MCPTransportStdio:
// Use the official SDK
if c.session == nil {
return nil, fmt.Errorf("MCP session not initialized")
}
result, err := c.session.ListTools(ctx, nil)
if err != nil {
return nil, fmt.Errorf("failed to list tools: %v", err)
}
// Convert to the internal format
for _, tool := range result.Tools {
inputSchema := make(map[string]interface{})
if tool.InputSchema != nil {
// Convert the SDK's InputSchema to a map
schemaBytes, err := json.Marshal(tool.InputSchema)
if err == nil {
json.Unmarshal(schemaBytes, &inputSchema)
}
}
tools = append(tools, MCPTool{
Name: tool.Name,
Description: tool.Description,
InputSchema: inputSchema,
})
}
case MCPTransportSSE:
// Use the custom HTTP implementation
var err error
tools, err = c.listToolsSSE(ctx)
if err != nil {
return nil, err
}
}
c.mu.Lock()
c.tools = tools
c.mu.Unlock()
return tools, nil
}
// listToolsSSE fetches the tool list over SSE
func (c *MCPClient) listToolsSSE(ctx context.Context) ([]MCPTool, error) {
req := map[string]interface{}{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/list",
}
resp, err := c.sendSSERequest(ctx, req)
if err != nil {
return nil, err
}
resultBytes, _ := json.Marshal(resp["result"])
var result struct {
Tools []MCPTool `json:"tools"`
}
if err := json.Unmarshal(resultBytes, &result); err != nil {
return nil, fmt.Errorf("failed to parse tools list: %v", err)
}
return result.Tools, nil
}
// CallTool invokes a tool on the MCP server
func (c *MCPClient) CallTool(ctx context.Context, name string, arguments map[string]interface{}) (*MCPToolsCallResult, error) {
switch c.config.Transport {
case MCPTransportStdio:
return c.callToolStdio(ctx, name, arguments)
case MCPTransportSSE:
return c.callToolSSE(ctx, name, arguments)
default:
return nil, fmt.Errorf("unsupported transport: %s", c.config.Transport)
}
}
// callToolStdio invokes a tool over stdio (using the official SDK)
func (c *MCPClient) callToolStdio(ctx context.Context, name string, arguments map[string]interface{}) (*MCPToolsCallResult, error) {
if c.session == nil {
return nil, fmt.Errorf("MCP session not initialized")
}
// Invoke the tool
result, err := c.session.CallTool(ctx, &mcp.CallToolParams{
Name: name,
Arguments: arguments,
})
if err != nil {
return nil, fmt.Errorf("tool call failed: %v", err)
}
// Convert the result
mcpResult := &MCPToolsCallResult{
IsError: result.IsError,
}
for _, content := range result.Content {
mc := MCPContent{}
// Extract content based on the concrete type
// (named ct to avoid shadowing the receiver c)
switch ct := content.(type) {
case *mcp.TextContent:
mc.Type = "text"
mc.Text = ct.Text
case *mcp.ImageContent:
mc.Type = "image"
mc.Data = string(ct.Data)
mc.MimeType = ct.MIMEType
case *mcp.AudioContent:
mc.Type = "audio"
mc.Data = string(ct.Data)
mc.MimeType = ct.MIMEType
case *mcp.EmbeddedResource:
mc.Type = "resource"
if ct.Resource != nil {
if ct.Resource.Text != "" {
mc.Text = ct.Resource.Text
} else if ct.Resource.Blob != nil {
mc.Data = string(ct.Resource.Blob)
}
mc.MimeType = ct.Resource.MIMEType
}
case *mcp.ResourceLink:
mc.Type = "resource_link"
mc.Text = ct.URI
default:
// Fall back to JSON serialization to capture the content
if data, err := json.Marshal(content); err == nil {
mc.Type = "unknown"
mc.Text = string(data)
}
}
mcpResult.Content = append(mcpResult.Content, mc)
}
return mcpResult, nil
}
// callToolSSE invokes a tool over SSE.
func (c *MCPClient) callToolSSE(ctx context.Context, name string, arguments map[string]interface{}) (*MCPToolsCallResult, error) {
req := map[string]interface{}{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": map[string]interface{}{
"name": name,
"arguments": arguments,
},
}
resp, err := c.sendSSERequest(ctx, req)
if err != nil {
return nil, err
}
if errObj, ok := resp["error"].(map[string]interface{}); ok {
return nil, fmt.Errorf("MCP error: %v", errObj["message"])
}
resultBytes, _ := json.Marshal(resp["result"])
var result MCPToolsCallResult
if err := json.Unmarshal(resultBytes, &result); err != nil {
return nil, fmt.Errorf("failed to parse tool call result: %v", err)
}
return &result, nil
}
// setAuthHeaders sets authentication headers on the request.
func (c *MCPClient) setAuthHeaders(req *http.Request) {
cfg := c.config
if cfg.AuthType == "" && cfg.APIKey == "" {
return
}
apiKey := expandEnvVars(cfg.APIKey)
username := expandEnvVars(cfg.Username)
password := expandEnvVars(cfg.Password)
switch strings.ToLower(cfg.AuthType) {
case "bearer":
if apiKey != "" {
req.Header.Set("Authorization", "Bearer "+apiKey)
}
case "api_key", "apikey":
if apiKey != "" {
req.Header.Set("X-API-Key", apiKey)
}
case "basic":
if username != "" {
req.SetBasicAuth(username, password)
}
default:
if apiKey != "" {
req.Header.Set("Authorization", "Bearer "+apiKey)
}
}
}
// sendSSERequest sends a JSON-RPC request over HTTP.
func (c *MCPClient) sendSSERequest(ctx context.Context, req map[string]interface{}) (map[string]interface{}, error) {
baseURL := c.sseURL
if !strings.HasSuffix(baseURL, "/") {
baseURL += "/"
}
postURL := baseURL + "message"
if _, err := url.Parse(postURL); err != nil {
return nil, fmt.Errorf("invalid URL: %v", err)
}
data, err := json.Marshal(req)
if err != nil {
return nil, fmt.Errorf("failed to marshal request: %v", err)
}
httpReq, err := http.NewRequestWithContext(ctx, "POST", postURL, bytes.NewBuffer(data))
if err != nil {
return nil, fmt.Errorf("failed to create HTTP request: %v", err)
}
httpReq.Header.Set("Content-Type", "application/json")
c.setAuthHeaders(httpReq)
for k, v := range c.config.Headers {
httpReq.Header.Set(k, expandEnvVars(v))
}
resp, err := c.httpClient.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("HTTP request failed: %v", err)
}
defer resp.Body.Close()
if resp.StatusCode >= 400 {
body, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("HTTP error %d: %s", resp.StatusCode, string(body))
}
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("failed to read response: %v", err)
}
var result map[string]interface{}
if err := json.Unmarshal(body, &result); err != nil {
return nil, fmt.Errorf("failed to parse response: %v", err)
}
return result, nil
}
// Close closes the connection.
func (c *MCPClient) Close() error {
c.mu.Lock()
defer c.mu.Unlock()
if c.session != nil {
c.session.Close()
c.session = nil
}
c.client = nil
c.initialized = false
logger.Infof("MCP client closed: %s", c.config.Name)
return nil
}
// MCPClientManager manages MCP clients.
type MCPClientManager struct {
clients map[string]*MCPClient
mu sync.RWMutex
}
// NewMCPClientManager creates an MCP client manager.
func NewMCPClientManager() *MCPClientManager {
return &MCPClientManager{
clients: make(map[string]*MCPClient),
}
}
// GetOrCreateClient returns an existing MCP client, or creates and connects a new one.
func (m *MCPClientManager) GetOrCreateClient(ctx context.Context, config *MCPServerConfig) (*MCPClient, error) {
m.mu.RLock()
client, ok := m.clients[config.Name]
m.mu.RUnlock()
if ok {
return client, nil
}
m.mu.Lock()
defer m.mu.Unlock()
// Re-check under the write lock (double-checked locking).
if client, ok := m.clients[config.Name]; ok {
return client, nil
}
// Create a new client.
client, err := NewMCPClient(config)
if err != nil {
return nil, err
}
// Connect it.
if err := client.Connect(ctx); err != nil {
return nil, err
}
m.clients[config.Name] = client
return client, nil
}
// CloseAll closes all clients.
func (m *MCPClientManager) CloseAll() {
m.mu.Lock()
defer m.mu.Unlock()
for name, client := range m.clients {
if err := client.Close(); err != nil {
logger.Warningf("Failed to close MCP client %s: %v", name, err)
}
}
m.clients = make(map[string]*MCPClient)
}

aiagent/prompts/embed.go (new file)

@@ -0,0 +1,30 @@
package prompts
import (
_ "embed"
)
// ReactSystemPrompt is the system prompt for ReAct mode.
//
//go:embed react_system.md
var ReactSystemPrompt string
// PlanSystemPrompt is the planning-phase system prompt for Plan+ReAct mode.
//
//go:embed plan_system.md
var PlanSystemPrompt string
// StepExecutionPrompt is the step execution prompt.
//
//go:embed step_execution.md
var StepExecutionPrompt string
// SynthesisPrompt is the synthesis prompt.
//
//go:embed synthesis.md
var SynthesisPrompt string
// UserDefaultTemplate is the default user prompt template.
//
//go:embed user_default.md
var UserDefaultTemplate string


@@ -0,0 +1,65 @@
You are an intelligent AI Agent capable of analyzing tasks, creating execution plans, and solving complex problems.
Your role is to understand user requests and create structured, actionable execution plans.
## Core Capabilities
- **Alert Analysis**: Analyze alerts, investigate root causes, correlate events
- **Data Analysis**: Analyze batch data, identify patterns, generate insights
- **SQL Generation**: Convert natural language to SQL queries
- **General Problem Solving**: Break down complex tasks into actionable steps
## Planning Principles
1. **Understand First**: Carefully analyze what the user is asking for
2. **Identify Key Areas**: Determine which domains, systems, or aspects are involved
3. **Create Logical Steps**: Order steps by priority or logical sequence
4. **Be Specific**: Each step should have a clear goal and concrete approach
5. **Reference Tools**: Consider available tools when designing your approach
## Response Format
You must respond in the following JSON format:
```json
{
"task_summary": "Brief summary of the input/request",
"goal": "The overall goal of this task",
"focus_areas": ["area1", "area2", "area3"],
"steps": [
{
"step_number": 1,
"goal": "What to accomplish in this step",
"approach": "How to accomplish it (which tools/methods to use)"
},
{
"step_number": 2,
"goal": "...",
"approach": "..."
}
]
}
```
## Focus Areas by Task Type
**Alert/Incident Analysis:**
- Network: latency, packet loss, DNS resolution
- Database: query performance, connections, locks, replication
- Application: error rates, response times, resource usage
- Infrastructure: CPU, memory, disk I/O, network throughput
**Batch Alert Analysis:**
- Pattern recognition: common labels, time correlation
- Aggregation: group by severity, source, category
- Trend analysis: frequency, escalation patterns
**SQL Generation:**
- Schema understanding: tables, columns, relationships
- Query optimization: indexes, join strategies
- Data validation: constraints, data types
**General Analysis:**
- Data collection: gather relevant information
- Processing: transform, filter, aggregate
- Output: format results appropriately


@@ -0,0 +1,42 @@
You are an intelligent AI Agent capable of analyzing tasks, creating execution plans, and solving complex problems.
Your capabilities include but are not limited to:
- **Root Cause Analysis**: Analyze alerts, investigate incidents, identify root causes
- **Data Analysis**: Query and analyze metrics, logs, traces, and other data sources
- **SQL Generation**: Convert natural language queries to SQL statements
- **Information Synthesis**: Summarize and extract insights from complex data
- **Content Generation**: Generate titles, summaries, and structured reports
## Core Principles
1. **Systematic Analysis**: Gather sufficient information before making conclusions
2. **Evidence-Based**: Support conclusions with specific data from tool outputs
3. **Tool Efficiency**: Use tools wisely, avoid redundant calls
4. **Clear Communication**: Keep responses focused and actionable
5. **Adaptability**: Adjust your approach based on the task type
## Response Format
You must respond in the following format:
```
Thought: [Your reasoning about the current situation and what to do next]
Action: [The tool name to use, or 'Final Answer' if you have enough information]
Action Input: [The input to the action - for tools, provide JSON parameters; for Final Answer, provide your result]
```
## Task Guidelines
1. **Understand the request**: Carefully analyze what the user is asking for
2. **Choose appropriate tools**: Select tools that best fit the task requirements
3. **Iterate as needed**: Gather additional information if initial results are insufficient
4. **Validate results**: Verify your conclusions before providing the final answer
5. **Be concise**: Provide clear, well-structured responses
## Final Answer Requirements
Your Final Answer should:
- Directly address the user's request
- Be well-structured and easy to understand
- Include supporting evidence or reasoning when applicable
- Provide actionable recommendations if relevant


@@ -0,0 +1,35 @@
You are an intelligent AI Agent executing a specific step as part of a larger execution plan.
## Your Task
Focus on completing the current step efficiently and thoroughly. Use the available tools to gather information, process data, or generate results as needed to achieve the step's goal.
## Response Format
Respond in this format:
```
Thought: [Your reasoning about what to do for this step]
Action: [Tool name or 'Step Complete' when done]
Action Input: [Tool parameters as JSON, or step summary for 'Step Complete']
```
## Step Execution Guidelines
1. **Stay Focused**: Only work on the current step's goal
2. **Be Thorough**: Gather enough information to achieve the goal
3. **Document Progress**: Note important findings in your thoughts
4. **Know When to Stop**: Complete the step when you have sufficient results
5. **Handle Failures**: If a tool fails, try alternatives or note the limitation
## When to Mark Step Complete
Mark the step as complete when:
- You have achieved the step's goal
- You have gathered sufficient information or generated the required output
- Further work would be outside the step's scope
Your step summary should include:
- Key results or findings relevant to the step's goal
- Tools used and their outputs
- Any limitations or issues encountered


@@ -0,0 +1,37 @@
You are an intelligent AI Agent synthesizing results from multiple execution steps into a comprehensive final output.
## Your Task
Review all the results from the completed steps and provide a unified, well-structured response that addresses the original request.
## Response Guidelines
Based on the task type, structure your response appropriately:
**For Root Cause Analysis:**
- Summary of the root cause
- Supporting evidence from investigation
- Impact assessment
- Recommended actions
**For Data Analysis / SQL Generation:**
- Query results or generated SQL
- Key insights from the data
- Any caveats or limitations
**For Information Synthesis:**
- Structured summary of findings
- Key insights and patterns
- Relevant conclusions
**For Content Generation:**
- Generated content (title, summary, etc.)
- Alternative options if applicable
## Synthesis Principles
1. **Integrate Results**: Combine findings from all completed steps coherently
2. **Prioritize Relevance**: Focus on the most important information
3. **Be Structured**: Organize output in a clear, logical format
4. **Be Concise**: Avoid unnecessary verbosity while ensuring completeness
5. **Address the Request**: Ensure the final output directly answers the original task


@@ -0,0 +1,7 @@
## Alert Information
{{.AlertContent}}
## Analysis Request
Please analyze this alert and identify the root cause. Provide evidence-based conclusions and actionable recommendations.

aiagent/skill.go (new file)

@@ -0,0 +1,573 @@
package aiagent
import (
"bufio"
"context"
"encoding/json"
"fmt"
"os"
"path/filepath"
"strings"
"sync"
"time"
"github.com/toolkits/pkg/logger"
"gopkg.in/yaml.v3"
)
const (
// SkillFileName is the skill entry file name
SkillFileName = "SKILL.md"
// SkillToolsDir is the per-skill tools directory name
SkillToolsDir = "skill_tools"
// DefaultMaxSkills is the default maximum number of skills the LLM may select
DefaultMaxSkills = 2
)
// SkillConfig is the skill configuration (used in AIAgentConfig).
// The skills directory path is set via the global config Plus.AIAgentSkillsPath.
type SkillConfig struct {
// Skill selection precedence: SkillNames > LLM selection > DefaultSkills
AutoSelect bool `json:"auto_select,omitempty"` // let the LLM pick skills automatically (default: true)
SkillNames []string `json:"skill_names,omitempty"` // explicitly named skills (manual mode)
MaxSkills int `json:"max_skills,omitempty"` // max skills the LLM may select (default: 2)
DefaultSkills []string `json:"default_skills,omitempty"` // fallback skills when the LLM cannot choose
}
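For illustration, a hypothetical JSON fragment matching the struct tags above, relying on LLM auto-selection with a fallback skill (the skill name is made up):

```json
{
  "auto_select": true,
  "max_skills": 2,
  "default_skills": ["general-analysis"]
}
```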
// SkillMetadata is skill metadata (Level 1 - always kept in memory)
type SkillMetadata struct {
// Core fields (consistent with Anthropic's official spec)
Name string `yaml:"name" json:"name"`
Description string `yaml:"description" json:"description"`
// Optional extension fields
RecommendedTools []string `yaml:"recommended_tools,omitempty" json:"recommended_tools,omitempty"`
BuiltinTools []string `yaml:"builtin_tools,omitempty" json:"builtin_tools,omitempty"` // builtin tool list
// Internal fields
Path string `json:"-"` // skill directory path
LoadedAt time.Time `json:"-"` // load time
}
// SkillContent is skill content (Level 2 - loaded on match)
type SkillContent struct {
Metadata *SkillMetadata `json:"metadata"`
MainContent string `json:"main_content"` // SKILL.md body
}
// SkillTool is a skill-specific tool (Level 3 - loaded on demand)
type SkillTool struct {
Name string `yaml:"name" json:"name"` // tool name
Type string `yaml:"type" json:"type"` // handler type (annotation_qd, script, callback, etc.)
Description string `yaml:"description" json:"description"` // tool description
Config map[string]interface{} `yaml:"config" json:"config"` // handler configuration
// Parameter definitions (optional)
Parameters []ToolParameter `yaml:"parameters,omitempty" json:"parameters,omitempty"`
}
// SkillResources holds skill extension resources (Level 3 - loaded on demand)
type SkillResources struct {
SkillTools map[string]*SkillTool `json:"skill_tools"` // tool name -> tool definition
References map[string]string `json:"references"` // reference file contents
}
// SkillRegistry is the skill registry.
type SkillRegistry struct {
skillsPath string // skills directory path
skills map[string]*SkillMetadata // name -> metadata
contentCache map[string]*SkillContent // name -> content cache
toolsCache map[string]map[string]*SkillTool // skillName -> toolName -> tool
mu sync.RWMutex
}
// NewSkillRegistry creates a skill registry.
func NewSkillRegistry(skillsPath string) *SkillRegistry {
registry := &SkillRegistry{
skillsPath: skillsPath,
skills: make(map[string]*SkillMetadata),
contentCache: make(map[string]*SkillContent),
toolsCache: make(map[string]map[string]*SkillTool),
}
// Load all skill metadata up front.
if err := registry.loadAllMetadata(); err != nil {
logger.Warningf("Failed to load skill metadata: %v", err)
}
return registry
}
// loadAllMetadata loads metadata for all skills (Level 1).
func (r *SkillRegistry) loadAllMetadata() error {
if r.skillsPath == "" {
return nil
}
// Check that the directory exists.
if _, err := os.Stat(r.skillsPath); os.IsNotExist(err) {
logger.Debugf("Skills directory does not exist: %s", r.skillsPath)
return nil
}
// Walk the skills directory.
entries, err := os.ReadDir(r.skillsPath)
if err != nil {
return fmt.Errorf("failed to read skills directory: %v", err)
}
r.mu.Lock()
defer r.mu.Unlock()
for _, entry := range entries {
if !entry.IsDir() {
continue
}
skillPath := filepath.Join(r.skillsPath, entry.Name())
skillFile := filepath.Join(skillPath, SkillFileName)
// Check that SKILL.md exists.
if _, err := os.Stat(skillFile); os.IsNotExist(err) {
continue
}
// Load the metadata.
metadata, err := r.loadMetadataFromFile(skillFile)
if err != nil {
logger.Warningf("Failed to load skill metadata from %s: %v", skillFile, err)
continue
}
metadata.Path = skillPath
metadata.LoadedAt = time.Now()
r.skills[metadata.Name] = metadata
logger.Debugf("Loaded skill metadata: %s from %s", metadata.Name, skillPath)
}
logger.Infof("Loaded %d skills from %s", len(r.skills), r.skillsPath)
return nil
}
// loadMetadataFromFile loads metadata from a SKILL.md file.
func (r *SkillRegistry) loadMetadataFromFile(filePath string) (*SkillMetadata, error) {
file, err := os.Open(filePath)
if err != nil {
return nil, fmt.Errorf("failed to open file: %v", err)
}
defer file.Close()
// Parse the YAML frontmatter.
scanner := bufio.NewScanner(file)
var inFrontmatter bool
var frontmatterLines []string
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
} else {
// end of frontmatter
break
}
}
if inFrontmatter {
frontmatterLines = append(frontmatterLines, line)
}
}
if err := scanner.Err(); err != nil {
return nil, fmt.Errorf("failed to scan file: %v", err)
}
if len(frontmatterLines) == 0 {
return nil, fmt.Errorf("no frontmatter found in %s", filePath)
}
// Parse the YAML.
frontmatter := strings.Join(frontmatterLines, "\n")
var metadata SkillMetadata
if err := yaml.Unmarshal([]byte(frontmatter), &metadata); err != nil {
return nil, fmt.Errorf("failed to parse frontmatter: %v", err)
}
if metadata.Name == "" {
return nil, fmt.Errorf("skill name is required in frontmatter")
}
return &metadata, nil
}
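A minimal SKILL.md that this parser accepts might look like the following (the skill name, description, and tool names are hypothetical; only `name` is mandatory per the check above):

```markdown
---
name: alert-triage
description: Investigate alert root causes using metrics and logs.
builtin_tools:
  - get_alert_events
---

## Workflow
1. Pull the alert's labels and annotations.
2. Query related metrics around the trigger time.
```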
// GetByName returns skill metadata by name.
func (r *SkillRegistry) GetByName(name string) *SkillMetadata {
r.mu.RLock()
defer r.mu.RUnlock()
return r.skills[name]
}
// ListAll lists all skill metadata.
func (r *SkillRegistry) ListAll() []*SkillMetadata {
r.mu.RLock()
defer r.mu.RUnlock()
result := make([]*SkillMetadata, 0, len(r.skills))
for _, metadata := range r.skills {
result = append(result, metadata)
}
return result
}
// LoadContent loads skill content (Level 2).
func (r *SkillRegistry) LoadContent(metadata *SkillMetadata) (*SkillContent, error) {
if metadata == nil {
return nil, fmt.Errorf("metadata is nil")
}
// Check the cache.
r.mu.RLock()
if cached, ok := r.contentCache[metadata.Name]; ok {
r.mu.RUnlock()
return cached, nil
}
r.mu.RUnlock()
// Load the content.
skillFile := filepath.Join(metadata.Path, SkillFileName)
content, err := r.loadContentFromFile(skillFile)
if err != nil {
return nil, err
}
skillContent := &SkillContent{
Metadata: metadata,
MainContent: content,
}
// Cache it.
r.mu.Lock()
r.contentCache[metadata.Name] = skillContent
r.mu.Unlock()
return skillContent, nil
}
// loadContentFromFile loads the body content from a SKILL.md file.
func (r *SkillRegistry) loadContentFromFile(filePath string) (string, error) {
file, err := os.Open(filePath)
if err != nil {
return "", fmt.Errorf("failed to open file: %v", err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
var inFrontmatter bool
var frontmatterEnded bool
var contentLines []string
for scanner.Scan() {
line := scanner.Text()
if strings.TrimSpace(line) == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
} else {
frontmatterEnded = true
continue
}
}
if frontmatterEnded {
contentLines = append(contentLines, line)
}
}
if err := scanner.Err(); err != nil {
return "", fmt.Errorf("failed to scan file: %v", err)
}
return strings.TrimSpace(strings.Join(contentLines, "\n")), nil
}
// LoadSkillTool loads a single skill_tool (Level 3 - full configuration).
func (r *SkillRegistry) LoadSkillTool(skillName, toolName string) (*SkillTool, error) {
// Check the cache.
r.mu.RLock()
if skillTools, ok := r.toolsCache[skillName]; ok {
if tool, ok := skillTools[toolName]; ok {
r.mu.RUnlock()
return tool, nil
}
}
r.mu.RUnlock()
// Fetch the skill metadata.
metadata := r.GetByName(skillName)
if metadata == nil {
return nil, fmt.Errorf("skill '%s' not found", skillName)
}
// Load the tool.
toolFile := filepath.Join(metadata.Path, SkillToolsDir, toolName+".yaml")
tool, err := r.loadToolFromFile(toolFile)
if err != nil {
return nil, err
}
// Cache it.
r.mu.Lock()
if r.toolsCache[skillName] == nil {
r.toolsCache[skillName] = make(map[string]*SkillTool)
}
r.toolsCache[skillName][toolName] = tool
r.mu.Unlock()
return tool, nil
}
// LoadSkillToolDescription loads a skill_tool's description (lightweight: reads only name and description).
func (r *SkillRegistry) LoadSkillToolDescription(skillName, toolName string) (string, error) {
// Check the cache: if the full tool is already loaded, return its description directly.
r.mu.RLock()
if skillTools, ok := r.toolsCache[skillName]; ok {
if tool, ok := skillTools[toolName]; ok {
r.mu.RUnlock()
return tool.Description, nil
}
}
r.mu.RUnlock()
// Fetch the skill metadata.
metadata := r.GetByName(skillName)
if metadata == nil {
return "", fmt.Errorf("skill '%s' not found", skillName)
}
// Load only the description; don't cache the full tool, to preserve lazy loading.
toolFile := filepath.Join(metadata.Path, SkillToolsDir, toolName+".yaml")
tool, err := r.loadToolFromFile(toolFile)
if err != nil {
return "", err
}
return tool.Description, nil
}
// LoadAllSkillToolDescriptions loads descriptions for every skill_tool in a skill's directory.
func (r *SkillRegistry) LoadAllSkillToolDescriptions(skillName string) (map[string]string, error) {
metadata := r.GetByName(skillName)
if metadata == nil {
return nil, fmt.Errorf("skill '%s' not found", skillName)
}
toolsDir := filepath.Join(metadata.Path, SkillToolsDir)
// Check that the directory exists.
if _, err := os.Stat(toolsDir); os.IsNotExist(err) {
return make(map[string]string), nil
}
// Walk the skill_tools directory.
entries, err := os.ReadDir(toolsDir)
if err != nil {
return nil, fmt.Errorf("failed to read skill_tools directory: %v", err)
}
descriptions := make(map[string]string)
for _, entry := range entries {
if entry.IsDir() {
continue
}
// Only process .yaml/.yml files.
name := entry.Name()
if !strings.HasSuffix(name, ".yaml") && !strings.HasSuffix(name, ".yml") {
continue
}
toolFile := filepath.Join(toolsDir, name)
tool, err := r.loadToolFromFile(toolFile)
if err != nil {
logger.Warningf("Failed to load skill tool %s: %v", toolFile, err)
continue
}
descriptions[tool.Name] = tool.Description
}
return descriptions, nil
}
// loadToolFromFile loads a tool definition from a file.
func (r *SkillRegistry) loadToolFromFile(filePath string) (*SkillTool, error) {
data, err := os.ReadFile(filePath)
if err != nil {
return nil, fmt.Errorf("failed to read tool file: %v", err)
}
var tool SkillTool
if err := yaml.Unmarshal(data, &tool); err != nil {
return nil, fmt.Errorf("failed to parse tool file: %v", err)
}
return &tool, nil
}
// LoadReference loads a reference file (Level 3).
func (r *SkillRegistry) LoadReference(metadata *SkillMetadata, refName string) (string, error) {
if metadata == nil {
return "", fmt.Errorf("metadata is nil")
}
refFile := filepath.Join(metadata.Path, refName)
data, err := os.ReadFile(refFile)
if err != nil {
return "", fmt.Errorf("failed to read reference file: %v", err)
}
return string(data), nil
}
// Reload reloads all skill metadata.
func (r *SkillRegistry) Reload() error {
r.mu.Lock()
r.skills = make(map[string]*SkillMetadata)
r.contentCache = make(map[string]*SkillContent)
r.toolsCache = make(map[string]map[string]*SkillTool)
r.mu.Unlock()
return r.loadAllMetadata()
}
// SkillSelector selects skills for a task.
type SkillSelector interface {
// SelectMultiple asks the LLM to pick the most suitable skills for the task (may select several)
SelectMultiple(ctx context.Context, taskContext string, availableSkills []*SkillMetadata, maxSkills int) ([]*SkillMetadata, error)
}
// LLMSkillSelector is an LLM-backed skill selector.
type LLMSkillSelector struct {
llmCaller func(ctx context.Context, messages []ChatMessage) (string, error)
}
// NewLLMSkillSelector creates an LLM skill selector.
func NewLLMSkillSelector(llmCaller func(ctx context.Context, messages []ChatMessage) (string, error)) *LLMSkillSelector {
return &LLMSkillSelector{
llmCaller: llmCaller,
}
}
// SelectMultiple selects skills via the LLM.
func (s *LLMSkillSelector) SelectMultiple(ctx context.Context, taskContext string, availableSkills []*SkillMetadata, maxSkills int) ([]*SkillMetadata, error) {
if len(availableSkills) == 0 {
return nil, nil
}
if maxSkills <= 0 {
maxSkills = DefaultMaxSkills
}
// Build the prompt.
systemPrompt := s.buildSelectionPrompt(availableSkills, maxSkills)
messages := []ChatMessage{
{Role: "system", Content: systemPrompt},
{Role: "user", Content: taskContext},
}
// Call the LLM.
response, err := s.llmCaller(ctx, messages)
if err != nil {
return nil, fmt.Errorf("LLM call failed: %v", err)
}
// Parse the response.
selectedNames := s.parseSelectionResponse(response)
if len(selectedNames) == 0 {
return nil, nil
}
// Cap the count.
if len(selectedNames) > maxSkills {
selectedNames = selectedNames[:maxSkills]
}
// Map names back to SkillMetadata.
skillMap := make(map[string]*SkillMetadata)
for _, skill := range availableSkills {
skillMap[skill.Name] = skill
}
var result []*SkillMetadata
for _, name := range selectedNames {
if skill, ok := skillMap[name]; ok {
result = append(result, skill)
}
}
return result, nil
}
// buildSelectionPrompt builds the skill selection prompt.
func (s *LLMSkillSelector) buildSelectionPrompt(availableSkills []*SkillMetadata, maxSkills int) string {
var sb strings.Builder
sb.WriteString(fmt.Sprintf(`You are a skill selector. Based on the task context below, choose the most suitable skills (select 1-%d).
## Available Skills
`, maxSkills))
for i, skill := range availableSkills {
sb.WriteString(fmt.Sprintf("%d. **%s**\n", i+1, skill.Name))
sb.WriteString(fmt.Sprintf(" %s\n\n", skill.Description))
}
sb.WriteString(`## Output Format
Return the selected skill names as a JSON array, for example:
` + "```json\n" + `["skill-name-1", "skill-name-2"]
` + "```" + `
## Selection Principles
1. Pick the skills most relevant to the task
2. If the task spans multiple domains, you may pick several skills
3. Prefer more specific, more specialized skills
4. If no skill fits, return an empty array []
Return the JSON array of skill names:`)
return sb.String()
}
// parseSelectionResponse parses the LLM's skill selection response.
func (s *LLMSkillSelector) parseSelectionResponse(response string) []string {
// Try to extract the array from a JSON code block.
response = strings.TrimSpace(response)
// Locate the JSON array.
start := strings.Index(response, "[")
end := strings.LastIndex(response, "]")
if start < 0 || end <= start {
return nil
}
jsonStr := response[start : end+1]
var skillNames []string
if err := json.Unmarshal([]byte(jsonStr), &skillNames); err != nil {
logger.Warningf("Failed to parse skill selection response: %v", err)
return nil
}
return skillNames
}
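The bracket-scan above tolerates prose or code fences around the array: it takes the span from the first `[` to the last `]` and unmarshals it. A self-contained sketch of the same extraction (the function name and sample response are illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// extractSkillNames finds the outermost JSON array in an LLM response
// (which may be wrapped in a ```json fence or surrounding prose) and
// unmarshals it into a string slice; it returns nil on any failure.
func extractSkillNames(response string) []string {
	response = strings.TrimSpace(response)
	start := strings.Index(response, "[")
	end := strings.LastIndex(response, "]")
	if start < 0 || end <= start {
		return nil
	}
	var names []string
	if err := json.Unmarshal([]byte(response[start:end+1]), &names); err != nil {
		return nil
	}
	return names
}

func main() {
	resp := "Selected skills:\n```json\n[\"sql-generation\", \"alert-triage\"]\n```"
	fmt.Println(extractSkillNames(resp)) // [sql-generation alert-triage]
}
```

Note this heuristic would also swallow a stray `]` after the array; the real parser accepts that trade-off in exchange for not needing to strip code fences explicitly.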


@@ -79,7 +79,7 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
r := httpx.GinEngine(config.Global.RunMode, config.HTTP,
configCvalCache.PrintBodyPaths, configCvalCache.PrintAccessLog)
rt := router.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
rt := router.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors, config.Log.Dir)
if config.Ibex.Enable {
ibex.ServerStart(false, nil, redis, config.HTTP.APIForService.BasicAuth, config.Alert.Heartbeat, &config.CenterApi, r, nil, config.Ibex, config.HTTP.Port)


@@ -8,7 +8,6 @@ import (
"time"
"github.com/ccfos/nightingale/v6/alert/aconf"
"github.com/ccfos/nightingale/v6/alert/common"
"github.com/ccfos/nightingale/v6/alert/queue"
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
@@ -99,12 +98,12 @@ func (e *Consumer) consumeOne(event *models.AlertCurEvent) {
e.dispatch.Astats.CounterAlertsTotal.WithLabelValues(event.Cluster, eventType, event.GroupName).Inc()
if err := event.ParseRule("rule_name"); err != nil {
logger.Warningf("ruleid:%d failed to parse rule name: %v", event.RuleId, err)
logger.Warningf("alert_eval_%d datasource_%d failed to parse rule name: %v", event.RuleId, event.DatasourceId, err)
event.RuleName = fmt.Sprintf("failed to parse rule name: %v", err)
}
if err := event.ParseRule("annotations"); err != nil {
logger.Warningf("ruleid:%d failed to parse annotations: %v", event.RuleId, err)
logger.Warningf("alert_eval_%d datasource_%d failed to parse annotations: %v", event.RuleId, event.DatasourceId, err)
event.Annotations = fmt.Sprintf("failed to parse annotations: %v", err)
event.AnnotationsJSON["error"] = event.Annotations
}
@@ -112,7 +111,7 @@ func (e *Consumer) consumeOne(event *models.AlertCurEvent) {
e.queryRecoveryVal(event)
if err := event.ParseRule("rule_note"); err != nil {
logger.Warningf("ruleid:%d failed to parse rule note: %v", event.RuleId, err)
logger.Warningf("alert_eval_%d datasource_%d failed to parse rule note: %v", event.RuleId, event.DatasourceId, err)
event.RuleNote = fmt.Sprintf("failed to parse rule note: %v", err)
}
@@ -131,7 +130,7 @@ func (e *Consumer) persist(event *models.AlertCurEvent) {
var err error
event.Id, err = poster.PostByUrlsWithResp[int64](e.ctx, "/v1/n9e/event-persist", event)
if err != nil {
logger.Errorf("event:%+v persist err:%v", event, err)
logger.Errorf("event:%s persist err:%v", event.Hash, err)
e.dispatch.Astats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", event.DatasourceId), "persist_event", event.GroupName, fmt.Sprintf("%v", event.RuleId)).Inc()
}
return
@@ -139,7 +138,7 @@ func (e *Consumer) persist(event *models.AlertCurEvent) {
err := models.EventPersist(e.ctx, event)
if err != nil {
logger.Errorf("event%+v persist err:%v", event, err)
logger.Errorf("event:%s persist err:%v", event.Hash, err)
e.dispatch.Astats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", event.DatasourceId), "persist_event", event.GroupName, fmt.Sprintf("%v", event.RuleId)).Inc()
}
}
@@ -157,12 +156,12 @@ func (e *Consumer) queryRecoveryVal(event *models.AlertCurEvent) {
promql = strings.TrimSpace(promql)
if promql == "" {
logger.Warningf("rule_eval:%s promql is blank", getKey(event))
logger.Warningf("alert_eval_%d datasource_%d promql is blank", event.RuleId, event.DatasourceId)
return
}
if e.promClients.IsNil(event.DatasourceId) {
logger.Warningf("rule_eval:%s error reader client is nil", getKey(event))
logger.Warningf("alert_eval_%d datasource_%d error reader client is nil", event.RuleId, event.DatasourceId)
return
}
@@ -171,7 +170,7 @@ func (e *Consumer) queryRecoveryVal(event *models.AlertCurEvent) {
var warnings promsdk.Warnings
value, warnings, err := readerClient.Query(e.ctx.Ctx, promql, time.Now())
if err != nil {
logger.Errorf("rule_eval:%s promql:%s, error:%v", getKey(event), promql, err)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, error:%v", event.RuleId, event.DatasourceId, promql, err)
event.AnnotationsJSON["recovery_promql_error"] = fmt.Sprintf("promql:%s error:%v", promql, err)
b, err := json.Marshal(event.AnnotationsJSON)
@@ -185,12 +184,12 @@ func (e *Consumer) queryRecoveryVal(event *models.AlertCurEvent) {
}
if len(warnings) > 0 {
logger.Errorf("rule_eval:%s promql:%s, warnings:%v", getKey(event), promql, warnings)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, warnings:%v", event.RuleId, event.DatasourceId, promql, warnings)
}
anomalyPoints := models.ConvertAnomalyPoints(value)
if len(anomalyPoints) == 0 {
logger.Warningf("rule_eval:%s promql:%s, result is empty", getKey(event), promql)
logger.Warningf("alert_eval_%d datasource_%d promql:%s, result is empty", event.RuleId, event.DatasourceId, promql)
event.AnnotationsJSON["recovery_promql_error"] = fmt.Sprintf("promql:%s error:%s", promql, "result is empty")
} else {
event.AnnotationsJSON["recovery_value"] = fmt.Sprintf("%v", anomalyPoints[0].Value)
@@ -205,6 +204,3 @@ func (e *Consumer) queryRecoveryVal(event *models.AlertCurEvent) {
}
}
func getKey(event *models.AlertCurEvent) string {
return common.RuleKey(event.DatasourceId, event.RuleId)
}


@@ -171,7 +171,7 @@ func (e *Dispatch) HandleEventWithNotifyRule(eventOrigin *models.AlertCurEvent)
// Deep-copy the event to avoid concurrent-modification conflicts
eventCopy := eventOrigin.DeepCopy()
logger.Infof("notify rule ids: %v, event: %+v", notifyRuleId, eventCopy)
logger.Infof("notify rule ids: %v, event: %s", notifyRuleId, eventCopy.Hash)
notifyRule := e.notifyRuleCache.Get(notifyRuleId)
if notifyRule == nil {
continue
@@ -185,7 +185,7 @@ func (e *Dispatch) HandleEventWithNotifyRule(eventOrigin *models.AlertCurEvent)
eventCopy = HandleEventPipeline(notifyRule.PipelineConfigs, eventOrigin, eventCopy, e.eventProcessorCache, e.ctx, notifyRuleId, "notify_rule")
if ShouldSkipNotify(e.ctx, eventCopy, notifyRuleId) {
logger.Infof("notify_id: %d, event:%+v, should skip notify", notifyRuleId, eventCopy)
logger.Infof("notify_id: %d, event:%s, should skip notify", notifyRuleId, eventCopy.Hash)
continue
}
@@ -193,7 +193,7 @@ func (e *Dispatch) HandleEventWithNotifyRule(eventOrigin *models.AlertCurEvent)
for i := range notifyRule.NotifyConfigs {
err := NotifyRuleMatchCheck(&notifyRule.NotifyConfigs[i], eventCopy)
if err != nil {
logger.Errorf("notify_id: %d, event:%+v, channel_id:%d, template_id: %d, notify_config:%+v, err:%v", notifyRuleId, eventCopy, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID, notifyRule.NotifyConfigs[i], err)
logger.Errorf("notify_id: %d, event:%s, channel_id:%d, template_id: %d, notify_config:%+v, err:%v", notifyRuleId, eventCopy.Hash, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID, notifyRule.NotifyConfigs[i], err)
continue
}
@@ -201,12 +201,12 @@ func (e *Dispatch) HandleEventWithNotifyRule(eventOrigin *models.AlertCurEvent)
messageTemplate := e.messageTemplateCache.Get(notifyRule.NotifyConfigs[i].TemplateID)
if notifyChannel == nil {
sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{eventCopy}, notifyRuleId, fmt.Sprintf("notify_channel_id:%d", notifyRule.NotifyConfigs[i].ChannelID), "", "", errors.New("notify_channel not found"))
logger.Warningf("notify_id: %d, event:%+v, channel_id:%d, template_id: %d, notify_channel not found", notifyRuleId, eventCopy, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID)
logger.Warningf("notify_id: %d, event:%s, channel_id:%d, template_id: %d, notify_channel not found", notifyRuleId, eventCopy.Hash, notifyRule.NotifyConfigs[i].ChannelID, notifyRule.NotifyConfigs[i].TemplateID)
continue
}
if notifyChannel.RequestType != "flashduty" && notifyChannel.RequestType != "pagerduty" && messageTemplate == nil {
logger.Warningf("notify_id: %d, channel_name: %v, event:%+v, template_id: %d, message_template not found", notifyRuleId, notifyChannel.Ident, eventCopy, notifyRule.NotifyConfigs[i].TemplateID)
logger.Warningf("notify_id: %d, channel_name: %v, event:%s, template_id: %d, message_template not found", notifyRuleId, notifyChannel.Ident, eventCopy.Hash, notifyRule.NotifyConfigs[i].TemplateID)
sender.NotifyRecord(e.ctx, []*models.AlertCurEvent{eventCopy}, notifyRuleId, notifyChannel.Name, "", "", errors.New("message_template not found"))
continue
@@ -241,12 +241,12 @@ func HandleEventPipeline(pipelineConfigs []models.PipelineConfig, eventOrigin, e
eventPipeline := eventProcessorCache.Get(pipelineConfig.PipelineId)
if eventPipeline == nil {
logger.Warningf("processor_by_%s_id:%d pipeline_id:%d, event pipeline not found, event: %+v", from, id, pipelineConfig.PipelineId, event)
logger.Warningf("processor_by_%s_id:%d pipeline_id:%d, event pipeline not found, event: %s", from, id, pipelineConfig.PipelineId, event.Hash)
continue
}
if !PipelineApplicable(eventPipeline, event) {
logger.Debugf("processor_by_%s_id:%d pipeline_id:%d, event pipeline not applicable, event: %+v", from, id, pipelineConfig.PipelineId, event)
logger.Debugf("processor_by_%s_id:%d pipeline_id:%d, event pipeline not applicable, event: %s", from, id, pipelineConfig.PipelineId, event.Hash)
continue
}
@@ -263,7 +263,7 @@ func HandleEventPipeline(pipelineConfigs []models.PipelineConfig, eventOrigin, e
}
if resultEvent == nil {
logger.Infof("processor_by_%s_id:%d pipeline_id:%d, event dropped, event: %+v", from, id, pipelineConfig.PipelineId, eventOrigin)
logger.Infof("processor_by_%s_id:%d pipeline_id:%d, event dropped, event: %s", from, id, pipelineConfig.PipelineId, eventOrigin.Hash)
if from == "notify_rule" {
sender.NotifyRecord(ctx, []*models.AlertCurEvent{eventOrigin}, id, "", "", result.Message, fmt.Errorf("processor_by_%s_id:%d pipeline_id:%d, drop by pipeline", from, id, pipelineConfig.PipelineId))
}
@@ -301,7 +301,7 @@ func PipelineApplicable(pipeline *models.EventPipeline, event *models.AlertCurEv
tagFilters, err := models.ParseTagFilter(labelFiltersCopy)
if err != nil {
logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%+v pipeline:%+v", err, event, pipeline)
logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%s pipeline:%+v", err, event.Hash, pipeline)
return false
}
tagMatch = common.MatchTags(event.TagsMap, tagFilters)
@@ -315,7 +315,7 @@ func PipelineApplicable(pipeline *models.EventPipeline, event *models.AlertCurEv
tagFilters, err := models.ParseTagFilter(attrFiltersCopy)
if err != nil {
logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%+v pipeline:%+v err:%v", tagFilters, event, pipeline, err)
logger.Errorf("pipeline applicable failed to parse tag filter: %v event:%s pipeline:%+v err:%v", tagFilters, event.Hash, pipeline, err)
return false
}
@@ -405,7 +405,7 @@ func NotifyRuleMatchCheck(notifyConfig *models.NotifyConfig, event *models.Alert
tagFilters, err := models.ParseTagFilter(labelKeysCopy)
if err != nil {
logger.Errorf("notify send failed to parse tag filter: %v event:%+v notify_config:%+v", err, event, notifyConfig)
logger.Errorf("notify send failed to parse tag filter: %v event:%s notify_config:%+v", err, event.Hash, notifyConfig)
return fmt.Errorf("failed to parse tag filter: %v", err)
}
tagMatch = common.MatchTags(event.TagsMap, tagFilters)
@@ -423,7 +423,7 @@ func NotifyRuleMatchCheck(notifyConfig *models.NotifyConfig, event *models.Alert
tagFilters, err := models.ParseTagFilter(attributesCopy)
if err != nil {
logger.Errorf("notify send failed to parse tag filter: %v event:%+v notify_config:%+v err:%v", tagFilters, event, notifyConfig, err)
logger.Errorf("notify send failed to parse tag filter: %v event:%s notify_config:%+v err:%v", tagFilters, event.Hash, notifyConfig, err)
return fmt.Errorf("failed to parse tag filter: %v", err)
}
@@ -434,7 +434,7 @@ func NotifyRuleMatchCheck(notifyConfig *models.NotifyConfig, event *models.Alert
return fmt.Errorf("event attributes not match attributes filter")
}
logger.Infof("notify send timeMatch:%v severityMatch:%v tagMatch:%v attributesMatch:%v event:%+v notify_config:%+v", timeMatch, severityMatch, tagMatch, attributesMatch, event, notifyConfig)
logger.Infof("notify send timeMatch:%v severityMatch:%v tagMatch:%v attributesMatch:%v event:%s notify_config:%+v", timeMatch, severityMatch, tagMatch, attributesMatch, event.Hash, notifyConfig)
return nil
}
@@ -547,7 +547,7 @@ func SendNotifyRuleMessage(ctx *ctx.Context, userCache *memsto.UserCacheType, us
start := time.Now()
respBody, err := notifyChannel.SendFlashDuty(events, flashDutyChannelIDs[i], notifyChannelCache.GetHttpClient(notifyChannel.ID))
respBody = fmt.Sprintf("send_time: %s duration: %d ms %s", time.Now().Format("2006-01-02 15:04:05"), time.Since(start).Milliseconds(), respBody)
logger.Infof("duty_sender notify_id: %d, channel_name: %v, event:%+v, IntegrationUrl: %v dutychannel_id: %v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0], notifyChannel.RequestConfig.FlashDutyRequestConfig.IntegrationUrl, flashDutyChannelIDs[i], respBody, err)
logger.Infof("duty_sender notify_id: %d, channel_name: %v, event:%s, IntegrationUrl: %v dutychannel_id: %v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0].Hash, notifyChannel.RequestConfig.FlashDutyRequestConfig.IntegrationUrl, flashDutyChannelIDs[i], respBody, err)
sender.NotifyRecord(ctx, events, notifyRuleId, notifyChannel.Name, strconv.FormatInt(flashDutyChannelIDs[i], 10), respBody, err)
}
@@ -556,7 +556,7 @@ func SendNotifyRuleMessage(ctx *ctx.Context, userCache *memsto.UserCacheType, us
start := time.Now()
respBody, err := notifyChannel.SendPagerDuty(events, routingKey, siteInfo.SiteUrl, notifyChannelCache.GetHttpClient(notifyChannel.ID))
respBody = fmt.Sprintf("send_time: %s duration: %d ms %s", time.Now().Format("2006-01-02 15:04:05"), time.Since(start).Milliseconds(), respBody)
logger.Infof("pagerduty_sender notify_id: %d, channel_name: %v, event:%+v, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0], respBody, err)
logger.Infof("pagerduty_sender notify_id: %d, channel_name: %v, event:%s, respBody: %v, err: %v", notifyRuleId, notifyChannel.Name, events[0].Hash, respBody, err)
sender.NotifyRecord(ctx, events, notifyRuleId, notifyChannel.Name, "", respBody, err)
}
@@ -587,10 +587,10 @@ func SendNotifyRuleMessage(ctx *ctx.Context, userCache *memsto.UserCacheType, us
start := time.Now()
target, res, err := notifyChannel.SendScript(events, tplContent, customParams, sendtos)
res = fmt.Sprintf("send_time: %s duration: %d ms %s", time.Now().Format("2006-01-02 15:04:05"), time.Since(start).Milliseconds(), res)
logger.Infof("script_sender notify_id: %d, channel_name: %v, event:%+v, tplContent:%s, customParams:%v, target:%s, res:%s, err:%v", notifyRuleId, notifyChannel.Name, events[0], tplContent, customParams, target, res, err)
logger.Infof("script_sender notify_id: %d, channel_name: %v, event:%s, tplContent:%s, customParams:%v, target:%s, res:%s, err:%v", notifyRuleId, notifyChannel.Name, events[0].Hash, tplContent, customParams, target, res, err)
sender.NotifyRecord(ctx, events, notifyRuleId, notifyChannel.Name, target, res, err)
default:
logger.Warningf("notify_id: %d, channel_name: %v, event:%+v send type not found", notifyRuleId, notifyChannel.Name, events[0])
logger.Warningf("notify_id: %d, channel_name: %v, event:%s send type not found", notifyRuleId, notifyChannel.Name, events[0].Hash)
}
}
@@ -734,7 +734,7 @@ func (e *Dispatch) Send(rule *models.AlertRule, event *models.AlertCurEvent, not
event = msgCtx.Events[0]
}
logger.Debugf("send to channel:%s event:%+v users:%+v", channel, event, msgCtx.Users)
logger.Debugf("send to channel:%s event:%s users:%+v", channel, event.Hash, msgCtx.Users)
s.Send(msgCtx)
}
}

View File

@@ -18,11 +18,11 @@ func LogEvent(event *models.AlertCurEvent, location string, err ...error) {
}
logger.Infof(
"event(%s %s) %s: rule_id=%d sub_id:%d notify_rule_ids:%v cluster:%s %v%s@%d last_eval_time:%d %s",
"alert_eval_%d event(%s %s) %s: sub_id:%d notify_rule_ids:%v cluster:%s %v%s@%d last_eval_time:%d %s",
event.RuleId,
event.Hash,
status,
location,
event.RuleId,
event.SubRuleId,
event.NotifyRuleIds,
event.Cluster,

View File

@@ -101,17 +101,17 @@ func (s *Scheduler) syncAlertRules() {
}
ds := s.datasourceCache.GetById(dsId)
if ds == nil {
logger.Debugf("datasource %d not found", dsId)
logger.Debugf("alert_eval_%d datasource %d not found", rule.Id, dsId)
continue
}
if ds.PluginType != ruleType {
logger.Debugf("datasource %d category is %s not %s", dsId, ds.PluginType, ruleType)
logger.Debugf("alert_eval_%d datasource %d category is %s not %s", rule.Id, dsId, ds.PluginType, ruleType)
continue
}
if ds.Status != "enabled" {
logger.Debugf("datasource %d status is %s", dsId, ds.Status)
logger.Debugf("alert_eval_%d datasource %d status is %s", rule.Id, dsId, ds.Status)
continue
}
processor := process.NewProcessor(s.aconf.Heartbeat.EngineName, rule, dsId, s.alertRuleCache, s.targetCache, s.targetsOfAlertRuleCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.ctx, s.stats)
@@ -134,12 +134,12 @@ func (s *Scheduler) syncAlertRules() {
for _, dsId := range dsIds {
ds := s.datasourceCache.GetById(dsId)
if ds == nil {
logger.Debugf("datasource %d not found", dsId)
logger.Debugf("alert_eval_%d datasource %d not found", rule.Id, dsId)
continue
}
if ds.Status != "enabled" {
logger.Debugf("datasource %d status is %s", dsId, ds.Status)
logger.Debugf("alert_eval_%d datasource %d status is %s", rule.Id, dsId, ds.Status)
continue
}
processor := process.NewProcessor(s.aconf.Heartbeat.EngineName, rule, dsId, s.alertRuleCache, s.targetCache, s.targetsOfAlertRuleCache, s.busiGroupCache, s.alertMuteCache, s.datasourceCache, s.ctx, s.stats)

View File

@@ -109,7 +109,7 @@ func NewAlertRuleWorker(rule *models.AlertRule, datasourceId int64, Processor *p
})
if err != nil {
logger.Errorf("alert rule %s add cron pattern error: %v", arw.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d add cron pattern error: %v", arw.Rule.Id, arw.DatasourceId, err)
}
Processor.ScheduleEntry = arw.Scheduler.Entry(entryID)
@@ -152,9 +152,9 @@ func (arw *AlertRuleWorker) Eval() {
defer func() {
if len(message) == 0 {
logger.Infof("rule_eval:%s finished, duration:%v", arw.Key(), time.Since(begin))
logger.Infof("alert_eval_%d datasource_%d finished, duration:%v", arw.Rule.Id, arw.DatasourceId, time.Since(begin))
} else {
logger.Warningf("rule_eval:%s finished, duration:%v, message:%s", arw.Key(), time.Since(begin), message)
logger.Warningf("alert_eval_%d datasource_%d finished, duration:%v, message:%s", arw.Rule.Id, arw.DatasourceId, time.Since(begin), message)
}
}()
@@ -236,7 +236,7 @@ func (arw *AlertRuleWorker) Eval() {
}
func (arw *AlertRuleWorker) Stop() {
logger.Infof("rule_eval:%s stopped", arw.Key())
logger.Infof("alert_eval_%d datasource_%d stopped", arw.Rule.Id, arw.DatasourceId)
close(arw.Quit)
c := arw.Scheduler.Stop()
<-c.Done()
@@ -252,7 +252,7 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
var rule *models.PromRuleConfig
if err := json.Unmarshal([]byte(ruleConfig), &rule); err != nil {
logger.Errorf("rule_eval:%s rule_config:%s, error:%v", arw.Key(), ruleConfig, err)
logger.Errorf("alert_eval_%d datasource_%d rule_config:%s, error:%v", arw.Rule.Id, arw.DatasourceId, ruleConfig, err)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -263,7 +263,7 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
}
if rule == nil {
logger.Errorf("rule_eval:%s rule_config:%s, error:rule is nil", arw.Key(), ruleConfig)
logger.Errorf("alert_eval_%d datasource_%d rule_config:%s, error:rule is nil", arw.Rule.Id, arw.DatasourceId, ruleConfig)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -278,7 +278,7 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
readerClient := arw.PromClients.GetCli(arw.DatasourceId)
if readerClient == nil {
logger.Warningf("rule_eval:%s error reader client is nil", arw.Key())
logger.Warningf("alert_eval_%d datasource_%d error reader client is nil", arw.Rule.Id, arw.DatasourceId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_CLIENT, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -314,13 +314,13 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
// No variables
promql := strings.TrimSpace(query.PromQl)
if promql == "" {
logger.Warningf("rule_eval:%s promql is blank", arw.Key())
logger.Warningf("alert_eval_%d datasource_%d promql is blank", arw.Rule.Id, arw.DatasourceId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), CHECK_QUERY, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
continue
}
if arw.PromClients.IsNil(arw.DatasourceId) {
logger.Warningf("rule_eval:%s error reader client is nil", arw.Key())
logger.Warningf("alert_eval_%d datasource_%d error reader client is nil", arw.Rule.Id, arw.DatasourceId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_CLIENT, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
continue
}
@@ -329,7 +329,7 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
value, warnings, err := readerClient.Query(context.Background(), promql, time.Now())
if err != nil {
logger.Errorf("rule_eval:%s promql:%s, error:%v", arw.Key(), promql, err)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, error:%v", arw.Rule.Id, arw.DatasourceId, promql, err)
arw.Processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId)).Inc()
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
@@ -341,12 +341,12 @@ func (arw *AlertRuleWorker) GetPromAnomalyPoint(ruleConfig string) ([]models.Ano
}
if len(warnings) > 0 {
logger.Errorf("rule_eval:%s promql:%s, warnings:%v", arw.Key(), promql, warnings)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, warnings:%v", arw.Rule.Id, arw.DatasourceId, promql, warnings)
arw.Processor.Stats.CounterQueryDataErrorTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId)).Inc()
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
}
logger.Infof("rule_eval:%s query:%+v, value:%v", arw.Key(), query, value)
logger.Infof("alert_eval_%d datasource_%d query:%+v, value:%v", arw.Rule.Id, arw.DatasourceId, query, value)
points := models.ConvertAnomalyPoints(value)
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -440,14 +440,14 @@ func (arw *AlertRuleWorker) VarFillingAfterQuery(query models.PromQuery, readerC
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
value, _, err := readerClient.Query(context.Background(), curQuery, time.Now())
if err != nil {
logger.Errorf("rule_eval:%s, promql:%s, error:%v", arw.Key(), curQuery, err)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, error:%v", arw.Rule.Id, arw.DatasourceId, curQuery, err)
continue
}
seqVals := getSamples(value)
// Get all combinations of the parameter variables
paramPermutation, err := arw.getParamPermutation(param, ParamKeys, varToLabel, query.PromQl, readerClient)
if err != nil {
logger.Errorf("rule_eval:%s, paramPermutation error:%v", arw.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d paramPermutation error:%v", arw.Rule.Id, arw.DatasourceId, err)
continue
}
// Determine which parameter values meet the condition
@@ -580,14 +580,14 @@ func (arw *AlertRuleWorker) getParamPermutation(paramVal map[string]models.Param
case "host":
hostIdents, err := arw.getHostIdents(paramQuery)
if err != nil {
logger.Errorf("rule_eval:%s, fail to get host idents, error:%v", arw.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d fail to get host idents, error:%v", arw.Rule.Id, arw.DatasourceId, err)
break
}
params = hostIdents
case "device":
deviceIdents, err := arw.getDeviceIdents(paramQuery)
if err != nil {
logger.Errorf("rule_eval:%s, fail to get device idents, error:%v", arw.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d fail to get device idents, error:%v", arw.Rule.Id, arw.DatasourceId, err)
break
}
params = deviceIdents
@@ -596,12 +596,12 @@ func (arw *AlertRuleWorker) getParamPermutation(paramVal map[string]models.Param
var query []string
err := json.Unmarshal(q, &query)
if err != nil {
logger.Errorf("query:%s fail to unmarshalling into string slice, error:%v", paramQuery.Query, err)
logger.Errorf("alert_eval_%d datasource_%d query:%s failed to unmarshal into string slice, error:%v", arw.Rule.Id, arw.DatasourceId, paramQuery.Query, err)
}
if len(query) == 0 {
paramsKeyAllLabel, err := getParamKeyAllLabel(varToLabel[paramKey], originPromql, readerClient, arw.DatasourceId, arw.Rule.Id, arw.Processor.Stats)
if err != nil {
logger.Errorf("rule_eval:%s, fail to getParamKeyAllLabel, error:%v query:%s", arw.Key(), err, paramQuery.Query)
logger.Errorf("alert_eval_%d datasource_%d fail to getParamKeyAllLabel, error:%v query:%s", arw.Rule.Id, arw.DatasourceId, err, paramQuery.Query)
}
params = paramsKeyAllLabel
} else {
@@ -615,7 +615,7 @@ func (arw *AlertRuleWorker) getParamPermutation(paramVal map[string]models.Param
return nil, fmt.Errorf("param key: %s, params is empty", paramKey)
}
logger.Infof("rule_eval:%s paramKey: %s, params: %v", arw.Key(), paramKey, params)
logger.Infof("alert_eval_%d datasource_%d paramKey: %s, params: %v", arw.Rule.Id, arw.DatasourceId, paramKey, params)
paramMap[paramKey] = params
}
@@ -766,7 +766,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
var rule *models.HostRuleConfig
if err := json.Unmarshal([]byte(ruleConfig), &rule); err != nil {
logger.Errorf("rule_eval:%s rule_config:%s, error:%v", arw.Key(), ruleConfig, err)
logger.Errorf("alert_eval_%d datasource_%d rule_config:%s, error:%v", arw.Rule.Id, arw.DatasourceId, ruleConfig, err)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -777,7 +777,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
}
if rule == nil {
logger.Errorf("rule_eval:%s rule_config:%s, error:rule is nil", arw.Key(), ruleConfig)
logger.Errorf("alert_eval_%d datasource_%d rule_config:%s, error:rule is nil", arw.Rule.Id, arw.DatasourceId, ruleConfig)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -800,7 +800,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
// If this is the central node, also add hosts that have stopped reporting data (those with an empty engineName) to targets
missEngineIdents, exists = arw.Processor.TargetsOfAlertRuleCache.Get("", arw.Rule.Id)
if !exists {
logger.Debugf("rule_eval:%s targets not found engineName:%s", arw.Key(), arw.Processor.EngineName)
logger.Debugf("alert_eval_%d datasource_%d targets not found engineName:%s", arw.Rule.Id, arw.DatasourceId, arw.Processor.EngineName)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
}
}
@@ -808,7 +808,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
engineIdents, exists = arw.Processor.TargetsOfAlertRuleCache.Get(arw.Processor.EngineName, arw.Rule.Id)
if !exists {
logger.Warningf("rule_eval:%s targets not found engineName:%s", arw.Key(), arw.Processor.EngineName)
logger.Warningf("alert_eval_%d datasource_%d targets not found engineName:%s", arw.Rule.Id, arw.DatasourceId, arw.Processor.EngineName)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
}
idents = append(idents, engineIdents...)
@@ -835,7 +835,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
"",
).Set(float64(len(missTargets)))
logger.Debugf("rule_eval:%s missTargets:%v", arw.Key(), missTargets)
logger.Debugf("alert_eval_%d datasource_%d missTargets:%v", arw.Rule.Id, arw.DatasourceId, missTargets)
targets := arw.Processor.TargetCache.Gets(missTargets)
for _, target := range targets {
m := make(map[string]string)
@@ -854,7 +854,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
"",
).Set(0)
logger.Warningf("rule_eval:%s targets not found", arw.Key())
logger.Warningf("alert_eval_%d datasource_%d targets not found", arw.Rule.Id, arw.DatasourceId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
continue
}
@@ -885,7 +885,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
}
}
logger.Debugf("rule_eval:%s offsetIdents:%v", arw.Key(), offsetIdents)
logger.Debugf("alert_eval_%d datasource_%d offsetIdents:%v", arw.Rule.Id, arw.DatasourceId, offsetIdents)
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
@@ -912,7 +912,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
"",
).Set(0)
logger.Warningf("rule_eval:%s targets not found", arw.Key())
logger.Warningf("alert_eval_%d datasource_%d targets not found", arw.Rule.Id, arw.DatasourceId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), QUERY_DATA, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
continue
}
@@ -924,7 +924,7 @@ func (arw *AlertRuleWorker) GetHostAnomalyPoint(ruleConfig string) ([]models.Ano
missTargets = append(missTargets, ident)
}
}
logger.Debugf("rule_eval:%s missTargets:%v", arw.Key(), missTargets)
logger.Debugf("alert_eval_%d datasource_%d missTargets:%v", arw.Rule.Id, arw.DatasourceId, missTargets)
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
fmt.Sprintf("%v", arw.Processor.DatasourceId()),
@@ -1120,7 +1120,7 @@ func ProcessJoins(ruleId int64, trigger models.Trigger, seriesTagIndexes map[str
// Join conditions exist; merge sequentially, condition by condition
if len(seriesTagIndexes) < len(trigger.Joins)+1 {
logger.Errorf("rule_eval rid:%d queries' count: %d not match join condition's count: %d", ruleId, len(seriesTagIndexes), len(trigger.Joins))
logger.Errorf("alert_eval_%d queries' count: %d does not match join conditions' count: %d", ruleId, len(seriesTagIndexes), len(trigger.Joins))
return nil
}
@@ -1156,7 +1156,7 @@ func ProcessJoins(ruleId int64, trigger models.Trigger, seriesTagIndexes map[str
lastRehashed = exclude(curRehashed, lastRehashed)
last = flatten(lastRehashed)
default:
logger.Warningf("rule_eval rid:%d join type:%s not support", ruleId, trigger.Joins[i].JoinType)
logger.Warningf("alert_eval_%d join type:%s not supported", ruleId, trigger.Joins[i].JoinType)
}
}
return last
@@ -1276,7 +1276,7 @@ func (arw *AlertRuleWorker) VarFillingBeforeQuery(query models.PromQuery, reader
// Get all combinations of the parameter variables
paramPermutation, err := arw.getParamPermutation(param, ParamKeys, varToLabel, query.PromQl, readerClient)
if err != nil {
logger.Errorf("rule_eval:%s, paramPermutation error:%v", arw.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d paramPermutation error:%v", arw.Rule.Id, arw.DatasourceId, err)
continue
}
@@ -1304,10 +1304,10 @@ func (arw *AlertRuleWorker) VarFillingBeforeQuery(query models.PromQuery, reader
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", arw.Rule.Id)).Inc()
value, _, err := readerClient.Query(context.Background(), promql, time.Now())
if err != nil {
logger.Errorf("rule_eval:%s, promql:%s, error:%v", arw.Key(), promql, err)
logger.Errorf("alert_eval_%d datasource_%d promql:%s, error:%v", arw.Rule.Id, arw.DatasourceId, promql, err)
return
}
logger.Infof("rule_eval:%s, promql:%s, value:%+v", arw.Key(), promql, value)
logger.Infof("alert_eval_%d datasource_%d promql:%s, value:%+v", arw.Rule.Id, arw.DatasourceId, promql, value)
points := models.ConvertAnomalyPoints(value)
if len(points) == 0 {
@@ -1446,7 +1446,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
recoverPoints := []models.AnomalyPoint{}
ruleConfig := strings.TrimSpace(rule.RuleConfig)
if ruleConfig == "" {
logger.Warningf("rule_eval:%d ruleConfig is blank", rule.Id)
logger.Warningf("alert_eval_%d datasource_%d ruleConfig is blank", rule.Id, dsId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -1454,15 +1454,15 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
"",
).Set(0)
return points, recoverPoints, fmt.Errorf("rule_eval:%d ruleConfig is blank", rule.Id)
return points, recoverPoints, fmt.Errorf("alert_eval_%d datasource_%d ruleConfig is blank", rule.Id, dsId)
}
var ruleQuery models.RuleQuery
err := json.Unmarshal([]byte(ruleConfig), &ruleQuery)
if err != nil {
logger.Warningf("rule_eval:%d promql parse error:%s", rule.Id, err.Error())
logger.Warningf("alert_eval_%d datasource_%d promql parse error:%s", rule.Id, dsId, err.Error())
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_RULE_CONFIG, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
return points, recoverPoints, fmt.Errorf("rule_eval:%d promql parse error:%s", rule.Id, err.Error())
return points, recoverPoints, fmt.Errorf("alert_eval_%d datasource_%d promql parse error:%s", rule.Id, dsId, err.Error())
}
arw.Inhibit = ruleQuery.Inhibit
@@ -1474,7 +1474,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
plug, exists := dscache.DsCache.Get(rule.Cate, dsId)
if !exists {
logger.Warningf("rule_eval rid:%d datasource:%d not exists", rule.Id, dsId)
logger.Warningf("alert_eval_%d datasource_%d not exists", rule.Id, dsId)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_CLIENT, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
@@ -1483,11 +1483,11 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
fmt.Sprintf("%v", i),
).Set(-2)
return points, recoverPoints, fmt.Errorf("rule_eval:%d datasource:%d not exists", rule.Id, dsId)
return points, recoverPoints, fmt.Errorf("alert_eval_%d datasource_%d not exists", rule.Id, dsId)
}
if err = ExecuteQueryTemplate(rule.Cate, query, nil); err != nil {
logger.Warningf("rule_eval rid:%d execute query template error: %v", rule.Id, err)
logger.Warningf("alert_eval_%d datasource_%d execute query template error: %v", rule.Id, dsId, err)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), EXEC_TEMPLATE, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -1500,7 +1500,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
series, err := plug.QueryData(ctx, query)
arw.Processor.Stats.CounterQueryDataTotal.WithLabelValues(fmt.Sprintf("%d", arw.DatasourceId), fmt.Sprintf("%d", rule.Id)).Inc()
if err != nil {
logger.Warningf("rule_eval rid:%d query data error: %v", rule.Id, err)
logger.Warningf("alert_eval_%d datasource_%d query data error: %v", rule.Id, dsId, err)
arw.Processor.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", arw.Processor.DatasourceId()), GET_CLIENT, arw.Processor.BusiGroupCache.GetNameByBusiGroupId(arw.Rule.GroupId), fmt.Sprintf("%v", arw.Rule.Id)).Inc()
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
fmt.Sprintf("%v", arw.Rule.Id),
@@ -1508,7 +1508,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
fmt.Sprintf("%v", i),
).Set(-1)
return points, recoverPoints, fmt.Errorf("rule_eval:%d query data error: %v", rule.Id, err)
return points, recoverPoints, fmt.Errorf("alert_eval_%d datasource_%d query data error: %v", rule.Id, dsId, err)
}
arw.Processor.Stats.GaugeQuerySeriesCount.WithLabelValues(
@@ -1518,7 +1518,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
).Set(float64(len(series)))
// This log line is important: it captures the live values used for alert evaluation
logger.Infof("rule_eval rid:%d req:%+v resp:%v", rule.Id, query, series)
logger.Infof("alert_eval_%d datasource_%d req:%+v resp:%v", rule.Id, dsId, query, series)
for i := 0; i < len(series); i++ {
seriesHash := hash.GetHash(series[i].Metric, series[i].Ref)
tagHash := hash.GetTagHash(series[i].Metric)
@@ -1532,7 +1532,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
}
ref, err := GetQueryRef(query)
if err != nil {
logger.Warningf("rule_eval rid:%d query:%+v get ref error:%s", rule.Id, query, err.Error())
logger.Warningf("alert_eval_%d datasource_%d query:%+v get ref error:%s", rule.Id, dsId, query, err.Error())
continue
}
seriesTagIndexes[ref] = seriesTagIndex
@@ -1542,7 +1542,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
for _, query := range ruleQuery.Queries {
ref, unit, err := GetQueryRefAndUnit(query)
if err != nil {
logger.Warningf("rule_eval rid:%d query:%+v get ref and unit error:%s", rule.Id, query, err.Error())
logger.Warningf("alert_eval_%d datasource_%d query:%+v get ref and unit error:%s", rule.Id, dsId, query, err.Error())
continue
}
unitMap[ref] = unit
@@ -1565,12 +1565,12 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
for _, seriesHash := range seriesHash {
series, exists := seriesStore[seriesHash]
if !exists {
logger.Warningf("rule_eval rid:%d series:%+v not found", rule.Id, series)
logger.Warningf("alert_eval_%d datasource_%d series:%+v not found", rule.Id, dsId, series)
continue
}
t, v, exists := series.Last()
if !exists {
logger.Warningf("rule_eval rid:%d series:%+v value not found", rule.Id, series)
logger.Warningf("alert_eval_%d datasource_%d series:%+v value not found", rule.Id, dsId, series)
continue
}
@@ -1601,12 +1601,12 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
ts = int64(t)
sample = series
value = v
logger.Infof("rule_eval rid:%d origin series labels:%+v", rule.Id, series.Metric)
logger.Infof("alert_eval_%d datasource_%d origin series labels:%+v", rule.Id, dsId, series.Metric)
}
isTriggered := parser.CalcWithRid(trigger.Exp, m, rule.Id)
// This log line matters: it captures the live values behind the alert decision
logger.Infof("rule_eval rid:%d trigger:%+v exp:%s res:%v m:%v", rule.Id, trigger, trigger.Exp, isTriggered, m)
logger.Infof("alert_eval_%d datasource_%d trigger:%+v exp:%s res:%v m:%v", rule.Id, dsId, trigger, trigger.Exp, isTriggered, m)
var values string
for k, v := range m {
@@ -1679,7 +1679,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
// Check whether resolve_after has elapsed
if now-int64(lastTs) > int64(ruleQuery.NodataTrigger.ResolveAfter) {
logger.Infof("rule_eval rid:%d series:%+v resolve after %d seconds now:%d lastTs:%d", rule.Id, lastSeries, ruleQuery.NodataTrigger.ResolveAfter, now, int64(lastTs))
logger.Infof("alert_eval_%d datasource_%d series:%+v resolve after %d seconds now:%d lastTs:%d", rule.Id, dsId, lastSeries, ruleQuery.NodataTrigger.ResolveAfter, now, int64(lastTs))
delete(arw.LastSeriesStore, hash)
continue
}
@@ -1700,7 +1700,7 @@ func (arw *AlertRuleWorker) GetAnomalyPoint(rule *models.AlertRule, dsId int64)
TriggerType: models.TriggerTypeNodata,
}
points = append(points, point)
logger.Infof("rule_eval rid:%d nodata point:%+v", rule.Id, point)
logger.Infof("alert_eval_%d datasource_%d nodata point:%+v", rule.Id, dsId, point)
}
}
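The nodata bookkeeping above reduces to one comparison against `resolve_after`; a minimal sketch, where `resolveAfterExpired` is our own name for the check the real code inlines:

```go
package main

import "fmt"

// resolveAfterExpired reports whether a nodata alert should auto-resolve:
// once the last seen point is older than resolveAfter seconds, the series
// is dropped from the store. The function name is ours; the real code
// inlines this comparison inside GetAnomalyPoint.
func resolveAfterExpired(now, lastTs, resolveAfter int64) bool {
	return now-lastTs > resolveAfter
}

func main() {
	now := int64(1000)
	fmt.Println(resolveAfterExpired(now, 400, 300)) // 600s of silence > 300s window
	fmt.Println(resolveAfterExpired(now, 900, 300)) // only 100s of silence
}
```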

View File

@@ -41,8 +41,28 @@ func IsMuted(rule *models.AlertRule, event *models.AlertCurEvent, targetCache *m
// TimeSpanMuteStrategy filters by the rule's configured effective time spans: alerts raised outside those spans are suppressed, i.e. muted
// The time range is left-closed, right-open; the default range is 00:00-24:00
// If the rule configures a timezone, the time check runs in that timezone; if the timezone is empty, the system timezone is used
func TimeSpanMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent) bool {
tm := time.Unix(event.TriggerTime, 0)
// Determine which timezone to use
var targetLoc *time.Location
var err error
timezone := rule.TimeZone
if timezone == "" {
// Timezone is empty: use the system timezone (preserves the original behavior)
targetLoc = time.Local
} else {
// Load the timezone configured on the rule
targetLoc, err = time.LoadLocation(timezone)
if err != nil {
// If the timezone fails to load, log the error and fall back to the system timezone
logger.Warningf("Failed to load timezone %s for rule %d, using system timezone: %v", timezone, rule.Id, err)
targetLoc = time.Local
}
}
// Convert the trigger time into the target timezone
tm := time.Unix(event.TriggerTime, 0).In(targetLoc)
triggerTime := tm.Format("15:04")
triggerWeek := strconv.Itoa(int(tm.Weekday()))
@@ -102,7 +122,7 @@ func IdentNotExistsMuteStrategy(rule *models.AlertRule, event *models.AlertCurEv
// For target_up alerts whose ident no longer exists, filter the event out directly
// This check is admittedly crude, but there is no better approach for now
if !exists && strings.Contains(rule.PromQl, "target_up") {
logger.Debugf("[%s] mute: rule_eval:%d cluster:%s ident:%s", "IdentNotExistsMuteStrategy", rule.Id, event.Cluster, ident)
logger.Debugf("alert_eval_%d [IdentNotExistsMuteStrategy] mute: cluster:%s ident:%s", rule.Id, event.Cluster, ident)
return true
}
return false
@@ -124,7 +144,7 @@ func BgNotMatchMuteStrategy(rule *models.AlertRule, event *models.AlertCurEvent,
// For alert events that carry an ident, check whether the ident's busi-group matches the rule's busi-group
// If the rule is scoped to take effect only in its own busi-group, machines from other busi-groups must not trigger it
if exists && !target.MatchGroupId(rule.GroupId) {
logger.Debugf("[%s] mute: rule_eval:%d cluster:%s", "BgNotMatchMuteStrategy", rule.Id, event.Cluster)
logger.Debugf("alert_eval_%d [BgNotMatchMuteStrategy] mute: cluster:%s", rule.Id, event.Cluster)
return true
}
return false

View File

@@ -55,7 +55,7 @@ func (c *EventDropConfig) Process(ctx *ctx.Context, wfCtx *models.WorkflowContex
logger.Infof("processor eventdrop result: %v", result)
if result == "true" {
wfCtx.Event = nil
logger.Infof("processor eventdrop drop event: %v", event)
logger.Infof("processor eventdrop drop event: %s", event.Hash)
return wfCtx, "drop event success", nil
}

View File

@@ -131,7 +131,7 @@ func (p *Processor) Handle(anomalyPoints []models.AnomalyPoint, from string, inh
p.inhibit = inhibit
cachedRule := p.alertRuleCache.Get(p.rule.Id)
if cachedRule == nil {
logger.Warningf("process handle error: rule not found %+v rule_id:%d maybe rule has been deleted", anomalyPoints, p.rule.Id)
logger.Warningf("alert_eval_%d datasource_%d handle error: rule not found, maybe rule has been deleted, anomalyPoints:%+v", p.rule.Id, p.datasourceId, anomalyPoints)
p.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", p.DatasourceId()), "handle_event", p.BusiGroupCache.GetNameByBusiGroupId(p.rule.GroupId), fmt.Sprintf("%v", p.rule.Id)).Inc()
return
}
@@ -156,14 +156,14 @@ func (p *Processor) Handle(anomalyPoints []models.AnomalyPoint, from string, inh
eventCopy := event.DeepCopy()
event = dispatch.HandleEventPipeline(cachedRule.PipelineConfigs, eventCopy, event, dispatch.EventProcessorCache, p.ctx, cachedRule.Id, "alert_rule")
if event == nil {
logger.Infof("rule_eval:%s is muted drop by pipeline event:%v", p.Key(), eventCopy)
logger.Infof("alert_eval_%d datasource_%d is muted drop by pipeline event:%s", p.rule.Id, p.datasourceId, eventCopy.Hash)
continue
}
// event mute
isMuted, detail, muteId := mute.IsMuted(cachedRule, event, p.TargetCache, p.alertMuteCache)
if isMuted {
logger.Infof("rule_eval:%s is muted, detail:%s event:%v", p.Key(), detail, event)
logger.Infof("alert_eval_%d datasource_%d is muted, detail:%s event:%s", p.rule.Id, p.datasourceId, detail, event.Hash)
p.Stats.CounterMuteTotal.WithLabelValues(
fmt.Sprintf("%v", event.GroupName),
fmt.Sprintf("%v", p.rule.Id),
@@ -174,7 +174,7 @@ func (p *Processor) Handle(anomalyPoints []models.AnomalyPoint, from string, inh
}
if dispatch.EventMuteHook(event) {
logger.Infof("rule_eval:%s is muted by hook event:%v", p.Key(), event)
logger.Infof("alert_eval_%d datasource_%d is muted by hook event:%s", p.rule.Id, p.datasourceId, event.Hash)
p.Stats.CounterMuteTotal.WithLabelValues(
fmt.Sprintf("%v", event.GroupName),
fmt.Sprintf("%v", p.rule.Id),
@@ -247,7 +247,7 @@ func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, no
if err := json.Unmarshal([]byte(p.rule.Annotations), &event.AnnotationsJSON); err != nil {
event.AnnotationsJSON = make(map[string]string) // fall back to an empty map when parsing fails
logger.Warningf("unmarshal annotations json failed: %v, rule: %d", err, p.rule.Id)
logger.Warningf("alert_eval_%d datasource_%d unmarshal annotations json failed: %v", p.rule.Id, p.datasourceId, err)
}
if event.TriggerValues != "" && strings.Count(event.TriggerValues, "$") > 1 {
@@ -272,7 +272,7 @@ func (p *Processor) BuildEvent(anomalyPoint models.AnomalyPoint, from string, no
pt.GroupNames = p.BusiGroupCache.GetNamesByBusiGroupIds(pt.GroupIds)
event.Target = pt
} else {
logger.Infof("fill event target error, ident: %s doesn't exist in cache.", event.TargetIdent)
logger.Infof("alert_eval_%d datasource_%d fill event target error, ident: %s doesn't exist in cache.", p.rule.Id, p.datasourceId, event.TargetIdent)
}
}
@@ -371,19 +371,19 @@ func (p *Processor) RecoverSingle(byRecover bool, hash string, now int64, value
lastPendingEvent, has := p.pendingsUseByRecover.Get(hash)
if !has {
// No anomaly point was ever produced, so there is nothing to recover
logger.Debugf("rule_eval:%s event:%v do not has pending event, not recover", p.Key(), event)
logger.Debugf("alert_eval_%d datasource_%d event:%s has no pending event, not recovering", p.rule.Id, p.datasourceId, event.Hash)
return
}
if now-lastPendingEvent.LastEvalTime < cachedRule.RecoverDuration {
logger.Debugf("rule_eval:%s event:%v not recover", p.Key(), event)
logger.Debugf("alert_eval_%d datasource_%d event:%s not recover", p.rule.Id, p.datasourceId, event.Hash)
return
}
}
// If a recover condition is configured, the event cannot recover here; recovery must come from a recoverPoint
if event.RecoverConfig.JudgeType != models.Origin && !byRecover {
logger.Debugf("rule_eval:%s event:%v not recover", p.Key(), event)
logger.Debugf("alert_eval_%d datasource_%d event:%s not recover", p.rule.Id, p.datasourceId, event.Hash)
return
}
@@ -460,7 +460,7 @@ func (p *Processor) handleEvent(events []*models.AlertCurEvent) {
func (p *Processor) inhibitEvent(events []*models.AlertCurEvent, highSeverity int) {
for _, event := range events {
if p.inhibit && event.Severity > highSeverity {
logger.Debugf("rule_eval:%s event:%+v inhibit highSeverity:%d", p.Key(), event, highSeverity)
logger.Debugf("alert_eval_%d datasource_%d event:%s inhibit highSeverity:%d", p.rule.Id, p.datasourceId, event.Hash, highSeverity)
continue
}
p.fireEvent(event)
@@ -476,7 +476,7 @@ func (p *Processor) fireEvent(event *models.AlertCurEvent) {
message := "unknown"
defer func() {
logger.Infof("rule_eval:%s event-hash-%s %s", p.Key(), event.Hash, message)
logger.Infof("alert_eval_%d datasource_%d event-hash-%s %s", p.rule.Id, p.datasourceId, event.Hash, message)
}()
if fired, has := p.fires.Get(event.Hash); has {
@@ -527,7 +527,7 @@ func (p *Processor) pushEventToQueue(e *models.AlertCurEvent) {
dispatch.LogEvent(e, "push_queue")
if !queue.EventQueue.PushFront(e) {
logger.Warningf("event_push_queue: queue is full, event:%+v", e)
logger.Warningf("alert_eval_%d datasource_%d event_push_queue: queue is full, event:%s", p.rule.Id, p.datasourceId, e.Hash)
p.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", p.DatasourceId()), "push_event_queue", p.BusiGroupCache.GetNameByBusiGroupId(p.rule.GroupId), fmt.Sprintf("%v", p.rule.Id)).Inc()
}
}
@@ -538,7 +538,7 @@ func (p *Processor) RecoverAlertCurEventFromDb() {
curEvents, err := models.AlertCurEventGetByRuleIdAndDsId(p.ctx, p.rule.Id, p.datasourceId)
if err != nil {
logger.Errorf("recover event from db for rule:%s failed, err:%s", p.Key(), err)
logger.Errorf("alert_eval_%d datasource_%d recover event from db failed, err:%s", p.rule.Id, p.datasourceId, err)
p.Stats.CounterRuleEvalErrorTotal.WithLabelValues(fmt.Sprintf("%v", p.DatasourceId()), "get_recover_event", p.BusiGroupCache.GetNameByBusiGroupId(p.rule.GroupId), fmt.Sprintf("%v", p.rule.Id)).Inc()
p.fires = NewAlertCurEventMap(nil)
return

View File

@@ -22,10 +22,11 @@ type Router struct {
AlertStats *astats.Stats
Ctx *ctx.Context
ExternalProcessors *process.ExternalProcessorsType
LogDir string
}
func New(httpConfig httpx.Config, alert aconf.Alert, amc *memsto.AlertMuteCacheType, tc *memsto.TargetCacheType, bgc *memsto.BusiGroupCacheType,
astats *astats.Stats, ctx *ctx.Context, externalProcessors *process.ExternalProcessorsType) *Router {
astats *astats.Stats, ctx *ctx.Context, externalProcessors *process.ExternalProcessorsType, logDir string) *Router {
return &Router{
HTTP: httpConfig,
Alert: alert,
@@ -35,6 +36,7 @@ func New(httpConfig httpx.Config, alert aconf.Alert, amc *memsto.AlertMuteCacheT
AlertStats: astats,
Ctx: ctx,
ExternalProcessors: externalProcessors,
LogDir: logDir,
}
}
@@ -50,6 +52,9 @@ func (rt *Router) Config(r *gin.Engine) {
service.POST("/event", rt.pushEventToQueue)
service.POST("/event-persist", rt.eventPersist)
service.POST("/make-event", rt.makeEvent)
service.GET("/event-detail/:hash", rt.eventDetail)
service.GET("/alert-eval-detail/:id", rt.alertEvalDetail)
service.GET("/trace-logs/:traceid", rt.traceLogs)
}
func Render(c *gin.Context, data, msg interface{}) {

View File

@@ -0,0 +1,28 @@
package router
import (
"fmt"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
)
func (rt *Router) alertEvalDetail(c *gin.Context) {
id := ginx.UrlParamStr(c, "id")
if !loggrep.IsValidRuleID(id) {
ginx.Bomb(200, "invalid rule id format")
}
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
keyword := fmt.Sprintf("alert_eval_%s", id)
logs, err := loggrep.GrepLogDir(rt.LogDir, keyword)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}

View File

@@ -13,9 +13,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/queue"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/poster"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
@@ -75,7 +75,7 @@ func (rt *Router) pushEventToQueue(c *gin.Context) {
dispatch.LogEvent(event, "http_push_queue")
if !queue.EventQueue.PushFront(event) {
msg := fmt.Sprintf("event:%+v push_queue err: queue is full", event)
msg := fmt.Sprintf("event:%s push_queue err: queue is full", event.Hash)
logger.Warningf(msg)
ginx.Bomb(200, msg)
}
@@ -105,21 +105,21 @@ func (rt *Router) makeEvent(c *gin.Context) {
for i := 0; i < len(events); i++ {
node, err := naming.DatasourceHashRing.GetNode(strconv.FormatInt(events[i].DatasourceId, 10), fmt.Sprintf("%d", events[i].RuleId))
if err != nil {
logger.Warningf("event:%+v get node err:%v", events[i], err)
logger.Warningf("event(rule_id=%d ds_id=%d) get node err:%v", events[i].RuleId, events[i].DatasourceId, err)
ginx.Bomb(200, "event node not exists")
}
if node != rt.Alert.Heartbeat.Endpoint {
err := forwardEvent(events[i], node)
if err != nil {
logger.Warningf("event:%+v forward err:%v", events[i], err)
logger.Warningf("event(rule_id=%d ds_id=%d) forward err:%v", events[i].RuleId, events[i].DatasourceId, err)
ginx.Bomb(200, "event forward error")
}
continue
}
ruleWorker, exists := rt.ExternalProcessors.GetExternalAlertRule(events[i].DatasourceId, events[i].RuleId)
logger.Debugf("handle event:%+v exists:%v", events[i], exists)
logger.Debugf("handle event(rule_id=%d ds_id=%d) exists:%v", events[i].RuleId, events[i].DatasourceId, exists)
if !exists {
ginx.Bomb(200, "rule not exists")
}
@@ -143,6 +143,6 @@ func forwardEvent(event *eventForm, instance string) error {
if err != nil {
return err
}
logger.Infof("forward event: result=succ url=%s code=%d event:%v response=%s", ur, code, event, string(res))
logger.Infof("forward event: result=succ url=%s code=%d rule_id=%d response=%s", ur, code, event.RuleId, string(res))
return nil
}

View File

@@ -0,0 +1,27 @@
package router
import (
"fmt"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
)
func (rt *Router) eventDetail(c *gin.Context) {
hash := ginx.UrlParamStr(c, "hash")
if !loggrep.IsValidHash(hash) {
ginx.Bomb(200, "invalid hash format")
}
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
logs, err := loggrep.GrepLogDir(rt.LogDir, hash)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}

View File

@@ -0,0 +1,28 @@
package router
import (
"fmt"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/gin-gonic/gin"
)
func (rt *Router) traceLogs(c *gin.Context) {
traceId := ginx.UrlParamStr(c, "traceid")
if !loggrep.IsValidTraceID(traceId) {
ginx.Bomb(200, "invalid trace id format")
}
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
keyword := "trace_id=" + traceId
logs, err := loggrep.GrepLatestLogFiles(rt.LogDir, keyword)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}

View File

@@ -205,6 +205,6 @@ func PushCallbackEvent(ctx *ctx.Context, webhook *models.Webhook, event *models.
succ := queue.eventQueue.Push(event)
if !succ {
logger.Warningf("Write channel(%s) full, current channel size: %d event:%v", webhook.Url, queue.eventQueue.Len(), event)
logger.Warningf("Write channel(%s) full, current channel size: %d event:%s", webhook.Url, queue.eventQueue.Len(), event.Hash)
}
}

View File

@@ -30,14 +30,14 @@ type IbexCallBacker struct {
func (c *IbexCallBacker) CallBack(ctx CallBackContext) {
if len(ctx.CallBackURL) == 0 || len(ctx.Events) == 0 {
logger.Warningf("event_callback_ibex: url or events is empty, url: %s, events: %+v", ctx.CallBackURL, ctx.Events)
logger.Warningf("event_callback_ibex: url or events is empty, url: %s", ctx.CallBackURL)
return
}
event := ctx.Events[0]
if event.IsRecovered {
logger.Infof("event_callback_ibex: event is recovered, event: %+v", event)
logger.Infof("event_callback_ibex: event is recovered, event: %s", event.Hash)
return
}
@@ -45,9 +45,9 @@ func (c *IbexCallBacker) CallBack(ctx CallBackContext) {
}
func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.AlertCurEvent) {
logger.Infof("event_callback_ibex: url: %s, event: %+v", url, event)
logger.Infof("event_callback_ibex: url: %s, event: %s", url, event.Hash)
if imodels.DB() == nil && ctx.IsCenter {
logger.Warningf("event_callback_ibex: db is nil, event: %+v", event)
logger.Warningf("event_callback_ibex: db is nil, event: %s", event.Hash)
return
}
@@ -66,7 +66,7 @@ func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.
id, err := strconv.ParseInt(idstr, 10, 64)
if err != nil {
logger.Errorf("event_callback_ibex: failed to parse url: %s event: %+v", url, event)
logger.Errorf("event_callback_ibex: failed to parse url: %s event: %s", url, event.Hash)
return
}
@@ -82,7 +82,7 @@ func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.
}
if host == "" {
logger.Errorf("event_callback_ibex: failed to get host, id: %d, event: %+v", id, event)
logger.Errorf("event_callback_ibex: failed to get host, id: %d, event: %s", id, event.Hash)
return
}
@@ -92,11 +92,11 @@ func (c *IbexCallBacker) handleIbex(ctx *ctx.Context, url string, event *models.
func CallIbex(ctx *ctx.Context, id int64, host string,
taskTplCache *memsto.TaskTplCache, targetCache *memsto.TargetCacheType,
userCache *memsto.UserCacheType, event *models.AlertCurEvent, args string) (int64, error) {
logger.Infof("event_callback_ibex: id: %d, host: %s, args: %s, event: %+v", id, host, args, event)
logger.Infof("event_callback_ibex: id: %d, host: %s, args: %s, event: %s", id, host, args, event.Hash)
tpl := taskTplCache.Get(id)
if tpl == nil {
err := fmt.Errorf("event_callback_ibex: no such tpl(%d), event: %+v", id, event)
err := fmt.Errorf("event_callback_ibex: no such tpl(%d), event: %s", id, event.Hash)
logger.Errorf("%s", err)
return 0, err
}
@@ -104,13 +104,13 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
// Verify permission via the (tpl.GroupId, host, account) triple
can, err := CanDoIbex(tpl.UpdateBy, tpl, host, targetCache, userCache)
if err != nil {
err = fmt.Errorf("event_callback_ibex: check perm fail: %v, event: %+v", err, event)
err = fmt.Errorf("event_callback_ibex: check perm fail: %v, event: %s", err, event.Hash)
logger.Errorf("%s", err)
return 0, err
}
if !can {
err = fmt.Errorf("event_callback_ibex: user(%s) no permission, event: %+v", tpl.UpdateBy, event)
err = fmt.Errorf("event_callback_ibex: user(%s) no permission, event: %s", tpl.UpdateBy, event.Hash)
logger.Errorf("%s", err)
return 0, err
}
@@ -136,7 +136,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
tags, err := json.Marshal(tagsMap)
if err != nil {
err = fmt.Errorf("event_callback_ibex: failed to marshal tags to json: %v, event: %+v", tagsMap, event)
err = fmt.Errorf("event_callback_ibex: failed to marshal tags to json: %v, event: %s", tagsMap, event.Hash)
logger.Errorf("%s", err)
return 0, err
}
@@ -164,7 +164,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
id, err = TaskAdd(in, tpl.UpdateBy, ctx.IsCenter)
if err != nil {
err = fmt.Errorf("event_callback_ibex: call ibex fail: %v, event: %+v", err, event)
err = fmt.Errorf("event_callback_ibex: call ibex fail: %v, event: %s", err, event.Hash)
logger.Errorf("%s", err)
return 0, err
}
@@ -187,7 +187,7 @@ func CallIbex(ctx *ctx.Context, id int64, host string,
}
if err = record.Add(ctx); err != nil {
err = fmt.Errorf("event_callback_ibex: persist task_record fail: %v, event: %+v", err, event)
err = fmt.Errorf("event_callback_ibex: persist task_record fail: %v, event: %s", err, event.Hash)
logger.Errorf("%s", err)
return id, err
}

View File

@@ -72,7 +72,7 @@ func sendWebhook(webhook *models.Webhook, event interface{}, stats *astats.Stats
}
bs, err := json.Marshal(event)
if err != nil {
logger.Errorf("%s alertingWebhook failed to marshal event:%+v err:%v", channel, event, err)
logger.Errorf("%s alertingWebhook failed to marshal event err:%v", channel, err)
return false, "", err
}
@@ -145,7 +145,7 @@ func SingleSendWebhooks(ctx *ctx.Context, webhooks map[string]*models.Webhook, e
func BatchSendWebhooks(ctx *ctx.Context, webhooks map[string]*models.Webhook, event *models.AlertCurEvent, stats *astats.Stats) {
for _, conf := range webhooks {
logger.Infof("push event:%+v to queue:%v", event, conf)
logger.Infof("push event:%s to queue:%v", event.Hash, conf)
PushEvent(ctx, conf, event, stats)
}
}
@@ -183,7 +183,7 @@ func PushEvent(ctx *ctx.Context, webhook *models.Webhook, event *models.AlertCur
succ := queue.eventQueue.Push(event)
if !succ {
stats.AlertNotifyErrorTotal.WithLabelValues("push_event_queue").Inc()
logger.Warningf("Write channel(%s) full, current channel size: %d event:%v", webhook.Url, queue.eventQueue.Len(), event)
logger.Warningf("Write channel(%s) full, current channel size: %d event:%s", webhook.Url, queue.eventQueue.Len(), event.Hash)
}
}

View File

@@ -21,6 +21,12 @@ type Center struct {
CleanPipelineExecutionDay int
MigrateBusiGroupLabel bool
RSA httpx.RSAConfig
AIAgent AIAgent
}
type AIAgent struct {
Enable bool `toml:"Enable"`
SkillsPath string `toml:"SkillsPath"`
}
type Plugin struct {

View File

@@ -300,6 +300,14 @@ ops:
cname: View Alerting Engines
- name: /system/version
cname: View Product Version
- name: /ai-config/agents
cname: AI Config - Agents
- name: /ai-config/llm-configs
cname: AI Config - LLM Configs
- name: /ai-config/skills
cname: AI Config - Skills
- name: /ai-config/mcp-servers
cname: AI Config - MCP Servers
`
)

View File

@@ -136,10 +136,10 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
go cron.CleanNotifyRecord(ctx, config.Center.CleanNotifyRecordDay)
go cron.CleanPipelineExecution(ctx, config.Center.CleanPipelineExecutionDay)
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors, config.Log.Dir)
centerRouter := centerrt.New(config.HTTP, config.Center, config.Alert, config.Ibex,
cconf.Operations, dsCache, notifyConfigCache, promClients,
redis, sso, ctx, metas, idents, targetCache, userCache, userGroupCache, userTokenCache)
redis, sso, ctx, metas, idents, targetCache, userCache, userGroupCache, userTokenCache, config.Log.Dir)
pushgwRouter := pushgwrt.New(config.HTTP, config.Pushgw, config.Alert, targetCache, busiGroupCache, idents, metas, writers, ctx)
r := httpx.GinEngine(config.Global.RunMode, config.HTTP, configCvalCache.PrintBodyPaths, configCvalCache.PrintAccessLog)

View File

@@ -24,11 +24,11 @@ import (
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/storage"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"gorm.io/gorm"
"github.com/gin-gonic/gin"
"github.com/rakyll/statik/fs"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/runner"
)
@@ -51,6 +51,7 @@ type Router struct {
UserGroupCache *memsto.UserGroupCacheType
UserTokenCache *memsto.UserTokenCacheType
Ctx *ctx.Context
LogDir string
HeartbeatHook HeartbeatHookFunc
TargetDeleteHook models.TargetDeleteHookFunc
@@ -61,7 +62,7 @@ func New(httpConfig httpx.Config, center cconf.Center, alert aconf.Alert, ibex c
operations cconf.Operation, ds *memsto.DatasourceCacheType, ncc *memsto.NotifyConfigCacheType,
pc *prom.PromClientMap, redis storage.Redis,
sso *sso.SsoClient, ctx *ctx.Context, metaSet *metas.Set, idents *idents.Set,
tc *memsto.TargetCacheType, uc *memsto.UserCacheType, ugc *memsto.UserGroupCacheType, utc *memsto.UserTokenCacheType) *Router {
tc *memsto.TargetCacheType, uc *memsto.UserCacheType, ugc *memsto.UserGroupCacheType, utc *memsto.UserTokenCacheType, logDir string) *Router {
return &Router{
HTTP: httpConfig,
Center: center,
@@ -80,6 +81,7 @@ func New(httpConfig httpx.Config, center cconf.Center, alert aconf.Alert, ibex c
UserGroupCache: ugc,
UserTokenCache: utc,
Ctx: ctx,
LogDir: logDir,
HeartbeatHook: func(ident string) map[string]interface{} { return nil },
TargetDeleteHook: func(tx *gorm.DB, idents []string) error { return nil },
AlertRuleModifyHook: func(ar *models.AlertRule) {},
@@ -368,6 +370,7 @@ func (rt *Router) Config(r *gin.Engine) {
// pages.GET("/alert-rules/builtin/alerts-cates", rt.auth(), rt.user(), rt.builtinAlertCateGets)
// pages.GET("/alert-rules/builtin/list", rt.auth(), rt.user(), rt.builtinAlertRules)
pages.GET("/alert-rules/callbacks", rt.auth(), rt.user(), rt.alertRuleCallbacks)
pages.GET("/timezones", rt.auth(), rt.user(), rt.timezonesGet)
pages.GET("/busi-groups/alert-rules", rt.auth(), rt.user(), rt.perm("/alert-rules"), rt.alertRuleGetsByGids)
pages.GET("/busi-group/:id/alert-rules", rt.auth(), rt.user(), rt.perm("/alert-rules"), rt.alertRuleGets)
@@ -416,6 +419,9 @@ func (rt *Router) Config(r *gin.Engine) {
pages.GET("/alert-cur-event/:eid", rt.alertCurEventGet)
pages.GET("/alert-his-event/:eid", rt.alertHisEventGet)
pages.GET("/event-notify-records/:eid", rt.notificationRecordList)
pages.GET("/event-detail/:hash", rt.eventDetailPage)
pages.GET("/alert-eval-detail/:id", rt.alertEvalDetailPage)
pages.GET("/trace-logs/:traceid", rt.traceLogsPage)
// card logic
pages.GET("/alert-cur-events/list", rt.auth(), rt.user(), rt.alertCurEventsList)
@@ -514,6 +520,50 @@ func (rt *Router) Config(r *gin.Engine) {
pages.PUT("/config", rt.auth(), rt.admin(), rt.configPutByKey)
pages.GET("/site-info", rt.siteInfo)
// AI Config management
pages.GET("/ai-agents", rt.auth(), rt.admin(), rt.aiAgentGets)
pages.GET("/ai-agent/:id", rt.auth(), rt.admin(), rt.aiAgentGet)
pages.POST("/ai-agents", rt.auth(), rt.admin(), rt.aiAgentAdd)
pages.PUT("/ai-agent/:id", rt.auth(), rt.admin(), rt.aiAgentPut)
pages.DELETE("/ai-agent/:id", rt.auth(), rt.admin(), rt.aiAgentDel)
pages.GET("/ai-llm-configs", rt.auth(), rt.admin(), rt.aiLLMConfigGets)
pages.GET("/ai-llm-config/:id", rt.auth(), rt.admin(), rt.aiLLMConfigGet)
pages.POST("/ai-llm-configs", rt.auth(), rt.admin(), rt.aiLLMConfigAdd)
pages.PUT("/ai-llm-config/:id", rt.auth(), rt.admin(), rt.aiLLMConfigPut)
pages.DELETE("/ai-llm-config/:id", rt.auth(), rt.admin(), rt.aiLLMConfigDel)
pages.POST("/ai-llm-config/test", rt.auth(), rt.admin(), rt.aiLLMConfigTest)
pages.GET("/ai-skills", rt.auth(), rt.admin(), rt.aiSkillGets)
pages.GET("/ai-skill/:id", rt.auth(), rt.admin(), rt.aiSkillGet)
pages.POST("/ai-skills", rt.auth(), rt.admin(), rt.aiSkillAdd)
pages.PUT("/ai-skill/:id", rt.auth(), rt.admin(), rt.aiSkillPut)
pages.DELETE("/ai-skill/:id", rt.auth(), rt.admin(), rt.aiSkillDel)
pages.POST("/ai-skills/import", rt.auth(), rt.admin(), rt.aiSkillImport)
pages.POST("/ai-skill/:id/files", rt.auth(), rt.admin(), rt.aiSkillFileAdd)
pages.GET("/ai-skill-file/:fileId", rt.auth(), rt.admin(), rt.aiSkillFileGet)
pages.DELETE("/ai-skill-file/:fileId", rt.auth(), rt.admin(), rt.aiSkillFileDel)
pages.GET("/mcp-servers", rt.auth(), rt.admin(), rt.mcpServerGets)
pages.GET("/mcp-server/:id", rt.auth(), rt.admin(), rt.mcpServerGet)
pages.POST("/mcp-servers", rt.auth(), rt.admin(), rt.mcpServerAdd)
pages.PUT("/mcp-server/:id", rt.auth(), rt.admin(), rt.mcpServerPut)
pages.DELETE("/mcp-server/:id", rt.auth(), rt.admin(), rt.mcpServerDel)
pages.POST("/ai-agent/:id/test", rt.auth(), rt.admin(), rt.aiAgentTest)
pages.POST("/mcp-server/test", rt.auth(), rt.admin(), rt.mcpServerTest)
pages.GET("/mcp-server/:id/tools", rt.auth(), rt.admin(), rt.mcpServerTools)
// AI Conversations
pages.GET("/ai-conversations", rt.auth(), rt.user(), rt.aiConversationGets)
pages.POST("/ai-conversations", rt.auth(), rt.user(), rt.aiConversationAdd)
pages.GET("/ai-conversation/:id", rt.auth(), rt.user(), rt.aiConversationGet)
pages.PUT("/ai-conversation/:id", rt.auth(), rt.user(), rt.aiConversationPut)
pages.DELETE("/ai-conversation/:id", rt.auth(), rt.user(), rt.aiConversationDel)
pages.POST("/ai-conversation/:id/messages", rt.auth(), rt.user(), rt.aiConversationMessageAdd)
// AI chat (SSE), dispatches by action_key
pages.POST("/ai-chat", rt.auth(), rt.user(), rt.aiChat)
// source token routes
pages.POST("/source-token", rt.auth(), rt.user(), rt.sourceTokenAdd)

View File

@@ -0,0 +1,747 @@
package router
import (
"bytes"
"crypto/tls"
"encoding/json"
"fmt"
"io"
"net/http"
"net/url"
"path/filepath"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"gopkg.in/yaml.v3"
)
// ========================
// AI Agent handlers
// ========================
func (rt *Router) aiAgentGets(c *gin.Context) {
lst, err := models.AIAgentGets(rt.Ctx)
ginx.Dangerous(err)
ginx.NewRender(c).Data(lst, nil)
}
func (rt *Router) aiAgentGet(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIAgentGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai agent not found")
}
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) aiAgentAdd(c *gin.Context) {
var obj models.AIAgent
ginx.BindJSON(c, &obj)
ginx.Dangerous(obj.Verify())
me := c.MustGet("user").(*models.User)
ginx.Dangerous(obj.Create(rt.Ctx, me.Username))
ginx.NewRender(c).Data(obj.Id, nil)
}
func (rt *Router) aiAgentPut(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIAgentGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai agent not found")
}
var ref models.AIAgent
ginx.BindJSON(c, &ref)
ginx.Dangerous(ref.Verify())
me := c.MustGet("user").(*models.User)
ginx.NewRender(c).Message(obj.Update(rt.Ctx, me.Username, ref))
}
func (rt *Router) aiAgentDel(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIAgentGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai agent not found")
}
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
// ========================
// AI Skill handlers
// ========================
func (rt *Router) aiSkillGets(c *gin.Context) {
search := ginx.QueryStr(c, "search", "")
lst, err := models.AISkillGets(rt.Ctx, search)
ginx.Dangerous(err)
ginx.NewRender(c).Data(lst, nil)
}
func (rt *Router) aiSkillGet(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AISkillGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai skill not found")
}
// Include associated files (without content)
files, err := models.AISkillFileGets(rt.Ctx, id)
ginx.Dangerous(err)
obj.Files = files
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) aiSkillAdd(c *gin.Context) {
var obj models.AISkill
ginx.BindJSON(c, &obj)
ginx.Dangerous(obj.Verify())
me := c.MustGet("user").(*models.User)
obj.CreatedBy = me.Username
obj.UpdatedBy = me.Username
ginx.Dangerous(obj.Create(rt.Ctx))
ginx.NewRender(c).Data(obj.Id, nil)
}
func (rt *Router) aiSkillPut(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AISkillGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai skill not found")
}
var ref models.AISkill
ginx.BindJSON(c, &ref)
ginx.Dangerous(ref.Verify())
me := c.MustGet("user").(*models.User)
ref.UpdatedBy = me.Username
ginx.NewRender(c).Message(obj.Update(rt.Ctx, ref))
}
func (rt *Router) aiSkillDel(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AISkillGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai skill not found")
}
// Cascade delete skill files
ginx.Dangerous(models.AISkillFileDeleteBySkillId(rt.Ctx, id))
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
func (rt *Router) aiSkillImport(c *gin.Context) {
file, header, err := c.Request.FormFile("file")
ginx.Dangerous(err)
defer file.Close()
ext := strings.ToLower(filepath.Ext(header.Filename))
if ext != ".md" {
ginx.Bomb(http.StatusBadRequest, "only .md files are supported")
}
content, err := io.ReadAll(file)
ginx.Dangerous(err)
meta, instructions := parseSkillMarkdown(string(content), header.Filename, ext)
me := c.MustGet("user").(*models.User)
skill := models.AISkill{
Name: meta.Name,
Description: meta.Description,
Instructions: instructions,
License: meta.License,
Compatibility: meta.Compatibility,
Metadata: meta.Metadata,
AllowedTools: meta.AllowedTools,
CreatedBy: me.Username,
UpdatedBy: me.Username,
}
ginx.Dangerous(skill.Create(rt.Ctx))
ginx.NewRender(c).Data(skill.Id, nil)
}
// parseSkillMarkdown parses a SKILL.md file with optional YAML frontmatter.
// Frontmatter format:
//
// ---
// name: my-skill
// description: what this skill does
// ---
// # Actual instructions content...
type skillFrontmatter struct {
Name string `yaml:"name"`
Description string `yaml:"description"`
License string `yaml:"license"`
Compatibility string `yaml:"compatibility"`
Metadata map[string]string `yaml:"metadata"`
AllowedTools string `yaml:"allowed-tools"`
}
func parseSkillMarkdown(content, filename, ext string) (meta skillFrontmatter, instructions string) {
text := strings.TrimSpace(content)
// Try to parse YAML frontmatter (between --- delimiters)
if strings.HasPrefix(text, "---") {
endIdx := strings.Index(text[3:], "\n---")
if endIdx >= 0 {
frontmatter := text[3 : 3+endIdx]
body := strings.TrimSpace(text[3+endIdx+4:]) // skip past closing ---
if yaml.Unmarshal([]byte(frontmatter), &meta) == nil && meta.Name != "" {
return meta, body
}
}
}
// No valid frontmatter, fallback: filename as name, entire content as instructions
meta.Name = strings.TrimSuffix(filename, ext)
return meta, content
}
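The frontmatter split above hinges on byte offsets: the opening `---` is 3 bytes, and the closing delimiter is matched as `\n---` (4 bytes), hence `text[3+endIdx+4:]`. A minimal sketch of just that delimiter arithmetic, with the YAML decoding stubbed out so it needs only the standard library:

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates optional "---"-delimited frontmatter from the
// body, mirroring the index arithmetic in parseSkillMarkdown. It returns the
// raw frontmatter text; the real code hands that to yaml.Unmarshal.
func splitFrontmatter(content string) (frontmatter, body string, ok bool) {
	text := strings.TrimSpace(content)
	if !strings.HasPrefix(text, "---") {
		return "", content, false
	}
	endIdx := strings.Index(text[3:], "\n---")
	if endIdx < 0 {
		return "", content, false // unterminated frontmatter: treat whole file as body
	}
	frontmatter = text[3 : 3+endIdx]
	body = strings.TrimSpace(text[3+endIdx+4:]) // skip past the closing "\n---"
	return frontmatter, body, true
}

func main() {
	doc := "---\nname: my-skill\ndescription: demo\n---\n# Instructions\nDo the thing."
	fm, body, ok := splitFrontmatter(doc)
	fmt.Printf("ok=%v\nfrontmatter=%q\nbody=%q\n", ok, fm, body)
}
```

Note the fallback path matches the handler: when no valid frontmatter is found, the entire content becomes the instructions and the filename supplies the name.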
// ========================
// AI Skill File handlers
// ========================
func (rt *Router) aiSkillFileAdd(c *gin.Context) {
skillId := ginx.UrlParamInt64(c, "id")
// Verify skill exists
skill, err := models.AISkillGetById(rt.Ctx, skillId)
ginx.Dangerous(err)
if skill == nil {
ginx.Bomb(http.StatusNotFound, "ai skill not found")
}
file, header, err := c.Request.FormFile("file")
ginx.Dangerous(err)
defer file.Close()
// Validate file extension
ext := strings.ToLower(filepath.Ext(header.Filename))
allowed := map[string]bool{".md": true, ".txt": true, ".json": true, ".yaml": true, ".yml": true, ".csv": true}
if !allowed[ext] {
ginx.Bomb(http.StatusBadRequest, "file type not allowed, only .md/.txt/.json/.yaml/.yml/.csv")
}
// Validate file size (2MB max)
if header.Size > 2*1024*1024 {
ginx.Bomb(http.StatusBadRequest, "file size exceeds 2MB limit")
}
content, err := io.ReadAll(file)
ginx.Dangerous(err)
me := c.MustGet("user").(*models.User)
skillFile := models.AISkillFile{
SkillId: skillId,
Name: header.Filename,
Content: string(content),
CreatedBy: me.Username,
}
ginx.Dangerous(skillFile.Create(rt.Ctx))
ginx.NewRender(c).Data(skillFile.Id, nil)
}
func (rt *Router) aiSkillFileGet(c *gin.Context) {
fileId := ginx.UrlParamInt64(c, "fileId")
obj, err := models.AISkillFileGetById(rt.Ctx, fileId)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "file not found")
}
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) aiSkillFileDel(c *gin.Context) {
fileId := ginx.UrlParamInt64(c, "fileId")
obj, err := models.AISkillFileGetById(rt.Ctx, fileId)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "file not found")
}
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
// ========================
// MCP Server handlers
// ========================
func (rt *Router) mcpServerGets(c *gin.Context) {
lst, err := models.MCPServerGets(rt.Ctx)
ginx.Dangerous(err)
ginx.NewRender(c).Data(lst, nil)
}
func (rt *Router) mcpServerGet(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.MCPServerGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "mcp server not found")
}
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) mcpServerAdd(c *gin.Context) {
var obj models.MCPServer
ginx.BindJSON(c, &obj)
ginx.Dangerous(obj.Verify())
me := c.MustGet("user").(*models.User)
obj.CreatedBy = me.Username
obj.UpdatedBy = me.Username
ginx.Dangerous(obj.Create(rt.Ctx))
ginx.NewRender(c).Data(obj.Id, nil)
}
func (rt *Router) mcpServerPut(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.MCPServerGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "mcp server not found")
}
var ref models.MCPServer
ginx.BindJSON(c, &ref)
ginx.Dangerous(ref.Verify())
me := c.MustGet("user").(*models.User)
ref.UpdatedBy = me.Username
ginx.NewRender(c).Message(obj.Update(rt.Ctx, ref))
}
func (rt *Router) mcpServerDel(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.MCPServerGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "mcp server not found")
}
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
// ========================
// AI LLM Config handlers
// ========================
func (rt *Router) aiLLMConfigGets(c *gin.Context) {
lst, err := models.AILLMConfigGets(rt.Ctx)
ginx.Dangerous(err)
ginx.NewRender(c).Data(lst, nil)
}
func (rt *Router) aiLLMConfigGet(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AILLMConfigGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai llm config not found")
}
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) aiLLMConfigAdd(c *gin.Context) {
var obj models.AILLMConfig
ginx.BindJSON(c, &obj)
ginx.Dangerous(obj.Verify())
me := c.MustGet("user").(*models.User)
ginx.Dangerous(obj.Create(rt.Ctx, me.Username))
ginx.NewRender(c).Data(obj.Id, nil)
}
func (rt *Router) aiLLMConfigPut(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AILLMConfigGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai llm config not found")
}
var ref models.AILLMConfig
ginx.BindJSON(c, &ref)
ginx.Dangerous(ref.Verify())
me := c.MustGet("user").(*models.User)
ginx.NewRender(c).Message(obj.Update(rt.Ctx, me.Username, ref))
}
func (rt *Router) aiLLMConfigDel(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AILLMConfigGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "ai llm config not found")
}
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
func (rt *Router) aiLLMConfigTest(c *gin.Context) {
var body struct {
APIType string `json:"api_type"`
APIURL string `json:"api_url"`
APIKey string `json:"api_key"`
Model string `json:"model"`
ExtraConfig models.LLMExtraConfig `json:"extra_config"`
}
ginx.BindJSON(c, &body)
if body.APIType == "" || body.APIURL == "" || body.APIKey == "" || body.Model == "" {
ginx.Bomb(http.StatusBadRequest, "api_type, api_url, api_key, model are required")
}
obj := &models.AILLMConfig{
APIType: body.APIType,
APIURL: body.APIURL,
APIKey: body.APIKey,
Model: body.Model,
ExtraConfig: body.ExtraConfig,
}
start := time.Now()
testErr := testAIAgent(obj)
durationMs := time.Since(start).Milliseconds()
result := gin.H{
"success": testErr == nil,
"duration_ms": durationMs,
}
if testErr != nil {
result["error"] = testErr.Error()
}
ginx.NewRender(c).Data(result, nil)
}
// ========================
// AI Agent test
// ========================
func (rt *Router) aiAgentTest(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
agent, err := models.AIAgentGetById(rt.Ctx, id)
ginx.Dangerous(err)
if agent == nil {
ginx.Bomb(http.StatusNotFound, "ai agent not found")
}
llmCfg, err := models.AILLMConfigGetById(rt.Ctx, agent.LLMConfigId)
ginx.Dangerous(err)
if llmCfg == nil {
ginx.Bomb(http.StatusBadRequest, "referenced LLM config not found")
}
start := time.Now()
testErr := testAIAgent(llmCfg)
durationMs := time.Since(start).Milliseconds()
result := gin.H{
"success": testErr == nil,
"duration_ms": durationMs,
}
if testErr != nil {
result["error"] = testErr.Error()
}
ginx.NewRender(c).Data(result, nil)
}
func testAIAgent(p *models.AILLMConfig) error {
extra := p.ExtraConfig
// Build HTTP client with ExtraConfig settings
timeout := 30 * time.Second
if extra.TimeoutSeconds > 0 {
timeout = time.Duration(extra.TimeoutSeconds) * time.Second
}
transport := &http.Transport{}
if extra.SkipTLSVerify {
transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
}
if extra.Proxy != "" {
if proxyURL, err := url.Parse(extra.Proxy); err == nil {
transport.Proxy = http.ProxyURL(proxyURL)
}
}
client := &http.Client{Timeout: timeout, Transport: transport}
var reqURL string
var reqBody []byte
hdrs := map[string]string{"Content-Type": "application/json"}
switch p.APIType {
case "openai":
base := strings.TrimRight(p.APIURL, "/")
if strings.HasSuffix(base, "/chat/completions") {
reqURL = base
} else {
reqURL = base + "/chat/completions"
}
reqBody, _ = json.Marshal(map[string]interface{}{
"model": p.Model,
"messages": []map[string]string{{"role": "user", "content": "Hi"}},
"max_tokens": 5,
})
hdrs["Authorization"] = "Bearer " + p.APIKey
case "claude":
reqURL = strings.TrimRight(p.APIURL, "/") + "/v1/messages"
reqBody, _ = json.Marshal(map[string]interface{}{
"model": p.Model,
"messages": []map[string]string{{"role": "user", "content": "Hi"}},
"max_tokens": 5,
})
hdrs["x-api-key"] = p.APIKey
hdrs["anthropic-version"] = "2023-06-01"
case "gemini":
reqURL = strings.TrimRight(p.APIURL, "/") + "/v1beta/models/" + p.Model + ":generateContent?key=" + p.APIKey
reqBody, _ = json.Marshal(map[string]interface{}{
"contents": []map[string]interface{}{
{"parts": []map[string]string{{"text": "Hi"}}},
},
})
default:
return fmt.Errorf("unsupported api_type: %s", p.APIType)
}
req, err := http.NewRequest("POST", reqURL, bytes.NewReader(reqBody))
if err != nil {
return err
}
for k, v := range hdrs {
req.Header.Set(k, v)
}
// Apply custom headers from ExtraConfig
for k, v := range extra.CustomHeaders {
req.Header.Set(k, v)
}
resp, err := client.Do(req)
if err != nil {
return err
}
defer resp.Body.Close()
if resp.StatusCode >= 400 {
body, _ := io.ReadAll(resp.Body)
if len(body) > 500 {
body = body[:500]
}
return fmt.Errorf("HTTP %d: %s", resp.StatusCode, string(body))
}
return nil
}
// ========================
// MCP Server test & tools
// ========================
func (rt *Router) mcpServerTest(c *gin.Context) {
var body struct {
URL string `json:"url"`
Headers map[string]string `json:"headers"`
}
ginx.BindJSON(c, &body)
if body.URL == "" {
ginx.Bomb(http.StatusBadRequest, "url is required")
}
obj := &models.MCPServer{
URL: body.URL,
Headers: body.Headers,
}
start := time.Now()
tools, testErr := listMCPTools(obj)
durationMs := time.Since(start).Milliseconds()
result := gin.H{
"success": testErr == nil,
"duration_ms": durationMs,
"tool_count": len(tools),
}
if testErr != nil {
result["error"] = testErr.Error()
}
ginx.NewRender(c).Data(result, nil)
}
func (rt *Router) mcpServerTools(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.MCPServerGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "mcp server not found")
}
tools, err := listMCPTools(obj)
ginx.Dangerous(err)
ginx.NewRender(c).Data(tools, nil)
}
type mcpTool struct {
Name string `json:"name"`
Description string `json:"description"`
}
func listMCPTools(s *models.MCPServer) ([]mcpTool, error) {
client := &http.Client{Timeout: 30 * time.Second}
hdrs := s.Headers
// Step 1: Initialize
_, initSessionID, err := sendMCPRPC(client, s.URL, hdrs, "", 1, "initialize", map[string]interface{}{
"protocolVersion": "2024-11-05",
"capabilities": map[string]interface{}{},
"clientInfo": map[string]interface{}{"name": "nightingale", "version": "1.0.0"},
})
if err != nil {
return nil, fmt.Errorf("initialize: %v", err)
}
// Send initialized notification
sendMCPRPC(client, s.URL, hdrs, initSessionID, 0, "notifications/initialized", map[string]interface{}{})
// Step 2: List tools
toolsResp, _, err := sendMCPRPC(client, s.URL, hdrs, initSessionID, 2, "tools/list", map[string]interface{}{})
if err != nil {
return nil, fmt.Errorf("tools/list: %v", err)
}
if toolsResp == nil || toolsResp.Result == nil {
return []mcpTool{}, nil
}
toolsRaw, ok := toolsResp.Result["tools"]
if !ok {
return []mcpTool{}, nil
}
toolsJSON, _ := json.Marshal(toolsRaw)
var tools []mcpTool
if err := json.Unmarshal(toolsJSON, &tools); err != nil {
return nil, fmt.Errorf("decode tools: %v", err)
}
return tools, nil
}
type jsonRPCResponse struct {
JSONRPC string `json:"jsonrpc"`
ID interface{} `json:"id"`
Result map[string]interface{} `json:"result"`
Error *jsonRPCError `json:"error"`
}
type jsonRPCError struct {
Code int `json:"code"`
Message string `json:"message"`
}
func sendMCPRPC(client *http.Client, serverURL string, hdrs map[string]string, sessionID string, id int, method string, params interface{}) (*jsonRPCResponse, string, error) {
body := map[string]interface{}{
"jsonrpc": "2.0",
"method": method,
"params": params,
}
if id > 0 {
body["id"] = id
}
reqBody, _ := json.Marshal(body)
req, err := http.NewRequest("POST", serverURL, bytes.NewReader(reqBody))
if err != nil {
return nil, "", err
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Accept", "application/json, text/event-stream")
if sessionID != "" {
req.Header.Set("Mcp-Session-Id", sessionID)
}
for k, v := range hdrs {
req.Header.Set(k, v)
}
resp, err := client.Do(req)
if err != nil {
return nil, "", err
}
defer resp.Body.Close()
newSessionID := resp.Header.Get("Mcp-Session-Id")
if newSessionID == "" {
newSessionID = sessionID
}
// Notification (no id) - no response body expected
if id <= 0 {
return nil, newSessionID, nil
}
if resp.StatusCode >= 400 {
respBody, _ := io.ReadAll(resp.Body)
if len(respBody) > 500 {
respBody = respBody[:500]
}
return nil, newSessionID, fmt.Errorf("HTTP %d: %s", resp.StatusCode, string(respBody))
}
respBody, err := io.ReadAll(resp.Body)
if err != nil {
return nil, newSessionID, err
}
// Handle SSE response
contentType := resp.Header.Get("Content-Type")
if strings.Contains(contentType, "text/event-stream") {
for _, line := range strings.Split(string(respBody), "\n") {
if strings.HasPrefix(line, "data: ") {
data := strings.TrimPrefix(line, "data: ")
var rpcResp jsonRPCResponse
if json.Unmarshal([]byte(data), &rpcResp) == nil && (rpcResp.Result != nil || rpcResp.Error != nil) {
if rpcResp.Error != nil {
return &rpcResp, newSessionID, fmt.Errorf("RPC error %d: %s", rpcResp.Error.Code, rpcResp.Error.Message)
}
return &rpcResp, newSessionID, nil
}
}
}
return nil, newSessionID, fmt.Errorf("no valid JSON-RPC response in SSE stream")
}
// Handle JSON response
var rpcResp jsonRPCResponse
if err := json.Unmarshal(respBody, &rpcResp); err != nil {
if len(respBody) > 200 {
respBody = respBody[:200]
}
return nil, newSessionID, fmt.Errorf("invalid response: %s", string(respBody))
}
if rpcResp.Error != nil {
return &rpcResp, newSessionID, fmt.Errorf("RPC error %d: %s", rpcResp.Error.Code, rpcResp.Error.Message)
}
return &rpcResp, newSessionID, nil
}
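Streamable-HTTP MCP servers may answer a POST with either a plain JSON body or a one-shot SSE body, which is why sendMCPRPC checks the Content-Type and scans for `data: ` lines. A self-contained sketch of that SSE branch:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rpcResp is a trimmed-down JSON-RPC response, matching the fields
// sendMCPRPC cares about.
type rpcResp struct {
	Result map[string]interface{} `json:"result"`
	Error  *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// firstRPCFromSSE scans an event-stream body for "data: " lines and returns
// the first one that decodes to a response carrying a result or an error.
func firstRPCFromSSE(body string) (*rpcResp, bool) {
	for _, line := range strings.Split(body, "\n") {
		if !strings.HasPrefix(line, "data: ") {
			continue
		}
		var r rpcResp
		if json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &r) == nil &&
			(r.Result != nil || r.Error != nil) {
			return &r, true
		}
	}
	return nil, false
}

func main() {
	stream := "event: message\ndata: {\"jsonrpc\":\"2.0\",\"id\":2,\"result\":{\"tools\":[]}}\n\n"
	r, ok := firstRPCFromSSE(stream)
	fmt.Println(ok, r.Result != nil) // true true
}
```

Comment lines and other SSE fields (`event:`, `retry:`) simply fail the prefix check and are skipped, so only the JSON payload is decoded.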


@@ -0,0 +1,114 @@
package router
import (
"net/http"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
)
func (rt *Router) aiConversationGets(c *gin.Context) {
me := c.MustGet("user").(*models.User)
lst, err := models.AIConversationGetsByUserId(rt.Ctx, me.Id)
ginx.Dangerous(err)
ginx.NewRender(c).Data(lst, nil)
}
func (rt *Router) aiConversationAdd(c *gin.Context) {
var obj models.AIConversation
ginx.BindJSON(c, &obj)
me := c.MustGet("user").(*models.User)
obj.UserId = me.Id
ginx.Dangerous(obj.Verify())
ginx.Dangerous(obj.Create(rt.Ctx))
ginx.NewRender(c).Data(obj, nil)
}
func (rt *Router) aiConversationGet(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIConversationGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "conversation not found")
}
me := c.MustGet("user").(*models.User)
if obj.UserId != me.Id {
ginx.Bomb(http.StatusForbidden, "forbidden")
}
messages, err := models.AIConversationMessageGetsByConversationId(rt.Ctx, id)
ginx.Dangerous(err)
ginx.NewRender(c).Data(gin.H{
"conversation": obj,
"messages": messages,
}, nil)
}
func (rt *Router) aiConversationPut(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIConversationGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "conversation not found")
}
me := c.MustGet("user").(*models.User)
if obj.UserId != me.Id {
ginx.Bomb(http.StatusForbidden, "forbidden")
}
var body struct {
Title string `json:"title"`
}
ginx.BindJSON(c, &body)
ginx.NewRender(c).Message(obj.Update(rt.Ctx, body.Title))
}
func (rt *Router) aiConversationDel(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIConversationGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "conversation not found")
}
me := c.MustGet("user").(*models.User)
if obj.UserId != me.Id {
ginx.Bomb(http.StatusForbidden, "forbidden")
}
ginx.NewRender(c).Message(obj.Delete(rt.Ctx))
}
func (rt *Router) aiConversationMessageAdd(c *gin.Context) {
id := ginx.UrlParamInt64(c, "id")
obj, err := models.AIConversationGetById(rt.Ctx, id)
ginx.Dangerous(err)
if obj == nil {
ginx.Bomb(http.StatusNotFound, "conversation not found")
}
me := c.MustGet("user").(*models.User)
if obj.UserId != me.Id {
ginx.Bomb(http.StatusForbidden, "forbidden")
}
var msgs []models.AIConversationMessage
ginx.BindJSON(c, &msgs)
for i := range msgs {
msgs[i].ConversationId = id
ginx.Dangerous(msgs[i].Create(rt.Ctx))
}
// Update conversation timestamp
obj.UpdateTime(rt.Ctx)
ginx.NewRender(c).Message(nil)
}


@@ -0,0 +1,345 @@
package router
import (
"encoding/json"
"fmt"
"io"
"net/http"
"time"
"github.com/ccfos/nightingale/v6/aiagent"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/logger"
)
// AIChatRequest is the generic chat request dispatched by action_key.
type AIChatRequest struct {
ActionKey string `json:"action_key"` // e.g. "query_generator"
UserInput string `json:"user_input"`
History []aiagent.ChatMessage `json:"history,omitempty"`
Context map[string]interface{} `json:"context,omitempty"` // action-specific params
}
// actionHandler defines how each action_key is processed.
type actionHandler struct {
useCase string // maps to AIAgent.UseCase for finding the right agent config
validate func(req *AIChatRequest) error
selectTools func(req *AIChatRequest) []string
buildPrompt func(req *AIChatRequest) string
buildInputs func(req *AIChatRequest) map[string]string
}
var actionRegistry = map[string]*actionHandler{
"query_generator": {
useCase: "chat",
validate: validateQueryGenerator,
selectTools: selectQueryGeneratorTools,
buildPrompt: buildQueryGeneratorPrompt,
buildInputs: buildQueryGeneratorInputs,
},
}
// --- query_generator action ---
func ctxStr(ctx map[string]interface{}, key string) string {
if v, ok := ctx[key]; ok {
if s, ok := v.(string); ok {
return s
}
}
return ""
}
func ctxInt64(ctx map[string]interface{}, key string) int64 {
if v, ok := ctx[key]; ok {
switch n := v.(type) {
case float64:
return int64(n)
case int64:
return n
case json.Number:
i, _ := n.Int64()
return i
}
}
return 0
}
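ctxInt64 handles three numeric representations because encoding/json decodes every JSON number in a map[string]interface{} into float64 by default; json.Number only appears when the map was produced by a decoder with UseNumber() enabled. A quick demonstration:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// jsonNumberKind reports the dynamic type a JSON number takes after decoding
// into map[string]interface{}, with or without Decoder.UseNumber().
func jsonNumberKind(raw, key string, useNumber bool) string {
	var ctx map[string]interface{}
	dec := json.NewDecoder(bytes.NewReader([]byte(raw)))
	if useNumber {
		dec.UseNumber()
	}
	if err := dec.Decode(&ctx); err != nil {
		return "error"
	}
	switch ctx[key].(type) {
	case float64:
		return "float64"
	case json.Number:
		return "json.Number"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(jsonNumberKind(`{"datasource_id": 42}`, "datasource_id", false)) // float64
	fmt.Println(jsonNumberKind(`{"datasource_id": 42}`, "datasource_id", true))  // json.Number
}
```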
func validateQueryGenerator(req *AIChatRequest) error {
dsType := ctxStr(req.Context, "datasource_type")
dsID := ctxInt64(req.Context, "datasource_id")
if dsType == "" {
return fmt.Errorf("context.datasource_type is required")
}
if dsID == 0 {
return fmt.Errorf("context.datasource_id is required")
}
return nil
}
func selectQueryGeneratorTools(req *AIChatRequest) []string {
dsType := ctxStr(req.Context, "datasource_type")
switch dsType {
case "prometheus":
return []string{"list_metrics", "get_metric_labels"}
case "mysql", "doris", "ck", "clickhouse", "pgsql", "postgresql":
return []string{"list_databases", "list_tables", "describe_table"}
default:
return nil
}
}
func buildQueryGeneratorPrompt(req *AIChatRequest) string {
dsType := ctxStr(req.Context, "datasource_type")
dbName := ctxStr(req.Context, "database_name")
tableName := ctxStr(req.Context, "table_name")
switch dsType {
case "prometheus":
return fmt.Sprintf(`You are a PromQL expert. The user wants to query Prometheus metrics.
User request: %s
Please use the available tools to explore the metrics and generate the correct PromQL query.
- First use list_metrics to find relevant metrics
- Then use get_metric_labels to understand the label structure
- Finally provide the PromQL query as your Final Answer
Your Final Answer MUST be a valid JSON object with these fields:
{"query": "<the PromQL query>", "explanation": "<brief explanation in the user's language>"}`, req.UserInput)
default: // SQL-based datasources
dbContext := ""
if dbName != "" {
dbContext += fmt.Sprintf("\nTarget database: %s", dbName)
}
if tableName != "" {
dbContext += fmt.Sprintf("\nTarget table: %s", tableName)
}
return fmt.Sprintf(`You are a SQL expert for %s databases. The user wants to query data.
%s
User request: %s
Please use the available tools to explore the database schema and generate the correct SQL query.
- Use list_databases to see available databases
- Use list_tables to see tables in the target database
- Use describe_table to understand the table structure
- Finally provide the SQL query as your Final Answer
Your Final Answer MUST be a valid JSON object with these fields:
{"query": "<the SQL query>", "explanation": "<brief explanation in the user's language>"}`, dsType, dbContext, req.UserInput)
}
}
func buildQueryGeneratorInputs(req *AIChatRequest) map[string]string {
inputs := map[string]string{
"user_input": req.UserInput,
}
for _, key := range []string{"datasource_type", "datasource_id", "database_name", "table_name"} {
if v := ctxStr(req.Context, key); v != "" {
inputs[key] = v
}
}
// datasource_id may be a number in JSON
if inputs["datasource_id"] == "" {
if id := ctxInt64(req.Context, "datasource_id"); id > 0 {
inputs["datasource_id"] = fmt.Sprintf("%d", id)
}
}
return inputs
}
// --- generic handler ---
func (rt *Router) aiChat(c *gin.Context) {
if !rt.Center.AIAgent.Enable {
ginx.Bomb(http.StatusServiceUnavailable, "AI Agent is not enabled")
return
}
var req AIChatRequest
ginx.BindJSON(c, &req)
if req.UserInput == "" {
ginx.Bomb(http.StatusBadRequest, "user_input is required")
return
}
if req.ActionKey == "" {
ginx.Bomb(http.StatusBadRequest, "action_key is required")
return
}
if req.Context == nil {
req.Context = make(map[string]interface{})
}
handler, ok := actionRegistry[req.ActionKey]
if !ok {
ginx.Bomb(http.StatusBadRequest, "unsupported action_key: %s", req.ActionKey)
return
}
logger.Infof("[AIChat] action=%s, user_input=%q", req.ActionKey, truncStr(req.UserInput, 100))
// Action-specific validation
if handler.validate != nil {
if err := handler.validate(&req); err != nil {
ginx.Bomb(http.StatusBadRequest, err.Error())
return
}
}
// Find AI agent by use_case
agent, err := models.AIAgentGetByUseCase(rt.Ctx, handler.useCase)
if err != nil || agent == nil {
ginx.Bomb(http.StatusBadRequest, "no AI agent configured for use_case=%s", handler.useCase)
return
}
// Resolve LLM config
llmCfg, err := models.AILLMConfigGetById(rt.Ctx, agent.LLMConfigId)
if err != nil || llmCfg == nil {
ginx.Bomb(http.StatusBadRequest, "referenced LLM config not found")
return
}
agent.LLMConfig = llmCfg
// Select tools
var tools []aiagent.AgentTool
if handler.selectTools != nil {
toolNames := handler.selectTools(&req)
if toolNames != nil {
tools = aiagent.GetBuiltinToolDefs(toolNames)
}
}
// Parse extra config; the agent timeout is expressed in milliseconds
extraConfig := llmCfg.ExtraConfig
timeout := 120000 // default: 120s
if extraConfig.TimeoutSeconds > 0 {
timeout = extraConfig.TimeoutSeconds * 1000
}
// Build prompt
userPrompt := ""
if handler.buildPrompt != nil {
userPrompt = handler.buildPrompt(&req)
}
// Build workflow inputs
inputs := map[string]string{"user_input": req.UserInput}
if handler.buildInputs != nil {
inputs = handler.buildInputs(&req)
}
// Create agent
agentCfg := aiagent.NewAgent(&aiagent.AIAgentConfig{
Provider: llmCfg.APIType,
LLMURL: llmCfg.APIURL,
Model: llmCfg.Model,
APIKey: llmCfg.APIKey,
Headers: extraConfig.CustomHeaders,
AgentMode: aiagent.AgentModeReAct,
Tools: tools,
Timeout: timeout,
Stream: true,
UserPromptTemplate: userPrompt,
SkipSSLVerify: extraConfig.SkipTLSVerify,
Proxy: extraConfig.Proxy,
Temperature: extraConfig.Temperature,
MaxTokens: extraConfig.MaxTokens,
})
// Inject PromClient getter
aiagent.SetPromClientGetter(func(dsId int64) prom.API {
return rt.PromClients.GetCli(dsId)
})
// Streaming setup
streamChan := make(chan *models.StreamChunk, 100)
wfCtx := &models.WorkflowContext{
Stream: true,
StreamChan: streamChan,
Inputs: inputs,
}
c.Header("Content-Type", "text/event-stream")
c.Header("Cache-Control", "no-cache")
c.Header("Connection", "keep-alive")
c.Header("X-Accel-Buffering", "no")
startTime := time.Now()
go func() {
defer func() {
if r := recover(); r != nil {
logger.Errorf("[AIChat] PANIC in agent goroutine: %v", r)
streamChan <- &models.StreamChunk{
Type: models.StreamTypeError,
Content: fmt.Sprintf("internal error: %v", r),
Done: true,
Timestamp: time.Now().UnixMilli(),
}
close(streamChan)
}
}()
_, _, err := agentCfg.Process(rt.Ctx, wfCtx)
if err != nil {
logger.Errorf("[AIChat] agent Process error: %v", err)
}
}()
// Stream SSE events
var accumulatedMessage string
c.Stream(func(w io.Writer) bool {
chunk, ok := <-streamChan
if !ok {
return false
}
data, _ := json.Marshal(chunk)
if chunk.Type == models.StreamTypeText || chunk.Type == models.StreamTypeThinking {
if chunk.Delta != "" {
accumulatedMessage += chunk.Delta
} else if chunk.Content != "" {
accumulatedMessage += chunk.Content
}
}
if chunk.Type == models.StreamTypeError {
fmt.Fprintf(w, "event: error\ndata: %s\n\n", data)
c.Writer.Flush()
return false
}
if chunk.Done || chunk.Type == models.StreamTypeDone {
doneData := map[string]interface{}{
"type": "done",
"duration_ms": time.Since(startTime).Milliseconds(),
"message": accumulatedMessage,
"response": chunk.Content,
}
finalData, _ := json.Marshal(doneData)
fmt.Fprintf(w, "event: done\ndata: %s\n\n", finalData)
c.Writer.Flush()
return false
}
fmt.Fprintf(w, "event: chunk\ndata: %s\n\n", data)
c.Writer.Flush()
return true
})
}
// truncStr truncates s to at most maxLen bytes; a multi-byte rune at the
// boundary may be split, which is acceptable for log output.
func truncStr(s string, maxLen int) string {
if len(s) <= maxLen {
return s
}
return s[:maxLen] + "..."
}


@@ -4,9 +4,9 @@ import (
"net/http"
"github.com/ccfos/nightingale/v6/models"
+"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
-"github.com/toolkits/pkg/ginx"
)
// no param


@@ -10,9 +10,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/strx"
+"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
-"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)


@@ -0,0 +1,168 @@
package router
import (
"encoding/json"
"fmt"
"io"
"net/http"
"sort"
"strconv"
"strings"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
)
// alertEvalDetailPage renders an HTML log viewer page for alert rule evaluation logs.
func (rt *Router) alertEvalDetailPage(c *gin.Context) {
id := ginx.UrlParamStr(c, "id")
if !loggrep.IsValidRuleID(id) {
c.String(http.StatusBadRequest, "invalid rule id format")
return
}
logs, instance, err := rt.getAlertEvalLogs(id)
if err != nil {
c.String(http.StatusInternalServerError, "Error: %v", err)
return
}
c.Header("Content-Type", "text/html; charset=utf-8")
err = loggrep.RenderAlertEvalHTML(c.Writer, loggrep.AlertEvalPageData{
RuleID: id,
Instance: instance,
Logs: logs,
Total: len(logs),
})
if err != nil {
c.String(http.StatusInternalServerError, "render error: %v", err)
}
}
// alertEvalDetailJSON returns JSON for alert rule evaluation logs.
func (rt *Router) alertEvalDetailJSON(c *gin.Context) {
id := ginx.UrlParamStr(c, "id")
if !loggrep.IsValidRuleID(id) {
ginx.Bomb(200, "invalid rule id format")
}
logs, instance, err := rt.getAlertEvalLogs(id)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}
// getAlertEvalLogs resolves the target instance(s) and retrieves alert eval logs.
func (rt *Router) getAlertEvalLogs(id string) ([]string, string, error) {
ruleId, _ := strconv.ParseInt(id, 10, 64)
rule, err := models.AlertRuleGetById(rt.Ctx, ruleId)
if err != nil {
return nil, "", err
}
if rule == nil {
return nil, "", fmt.Errorf("no such alert rule")
}
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
keyword := fmt.Sprintf("alert_eval_%s", id)
// Get datasource IDs for this rule
dsIds := rt.DatasourceCache.GetIDsByDsCateAndQueries(rule.Cate, rule.DatasourceQueries)
if len(dsIds) == 0 {
// No datasources found (e.g. host rule), try local grep
logs, err := loggrep.GrepLogDir(rt.LogDir, keyword)
return logs, instance, err
}
// Find unique target nodes via hash ring, with DB fallback
nodeSet := make(map[string]struct{})
for _, dsId := range dsIds {
node, err := rt.getNodeForDatasource(dsId, id)
if err != nil {
continue
}
nodeSet[node] = struct{}{}
}
if len(nodeSet) == 0 {
// Hash ring not ready, grep locally
logs, err := loggrep.GrepLogDir(rt.LogDir, keyword)
return logs, instance, err
}
// Collect logs from all target nodes
var allLogs []string
var instances []string
for node := range nodeSet {
if node == instance {
logs, err := loggrep.GrepLogDir(rt.LogDir, keyword)
if err == nil {
allLogs = append(allLogs, logs...)
instances = append(instances, node)
}
} else {
logs, nodeAddr, err := rt.forwardAlertEvalDetail(node, id)
if err == nil {
allLogs = append(allLogs, logs...)
instances = append(instances, nodeAddr)
}
}
}
// Sort logs by timestamp descending
sort.Slice(allLogs, func(i, j int) bool {
return allLogs[i] > allLogs[j]
})
if len(allLogs) > loggrep.MaxLogLines {
allLogs = allLogs[:loggrep.MaxLogLines]
}
return allLogs, strings.Join(instances, ", "), nil
}
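The descending sort.Slice in getAlertEvalLogs compares whole log lines as strings; that only amounts to a time sort because each line begins with a fixed-width, zero-padded timestamp, so lexicographic order matches chronological order. A sketch (the timestamp format below is an assumption for illustration):

```go
package main

import (
	"fmt"
	"sort"
)

// sortLogsDesc orders log lines newest-first by plain string comparison,
// which matches chronological order when every line starts with a
// zero-padded timestamp prefix.
func sortLogsDesc(logs []string) []string {
	sort.Slice(logs, func(i, j int) bool { return logs[i] > logs[j] })
	return logs
}

func main() {
	logs := sortLogsDesc([]string{
		"2026-03-01 10:00:00 alert_eval_42 ok",
		"2026-03-02 09:00:00 alert_eval_42 ok",
		"2026-03-01 23:59:59 alert_eval_42 fail",
	})
	fmt.Println(logs[0][:19]) // 2026-03-02 09:00:00
}
```

This is also why merging lines grepped from multiple nodes before sorting is safe: no per-node ordering needs to be preserved.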
func (rt *Router) forwardAlertEvalDetail(node, id string) ([]string, string, error) {
url := fmt.Sprintf("http://%s/v1/n9e/alert-eval-detail/%s", node, id)
req, err := http.NewRequest("GET", url, nil)
if err != nil {
return nil, node, err
}
// Use the first configured basic-auth credential for the internal API
for user, pass := range rt.HTTP.APIForService.BasicAuth {
req.SetBasicAuth(user, pass)
break
}
client := &http.Client{Timeout: 15 * time.Second}
resp, err := client.Do(req)
if err != nil {
return nil, node, fmt.Errorf("forward to %s failed: %v", node, err)
}
defer resp.Body.Close()
body, err := io.ReadAll(io.LimitReader(resp.Body, 10*1024*1024)) // 10MB limit
if err != nil {
return nil, node, err
}
var result struct {
Dat loggrep.EventDetailResp `json:"dat"`
Err string `json:"err"`
}
if err := json.Unmarshal(body, &result); err != nil {
return nil, node, err
}
if result.Err != "" {
return nil, node, fmt.Errorf("%s", result.Err)
}
return result.Dat.Logs, result.Dat.Instance, nil
}


@@ -8,9 +8,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
+"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
-"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"golang.org/x/exp/slices"
)

View File

@@ -13,6 +13,7 @@ import (
"github.com/ccfos/nightingale/v6/alert/mute"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pushgw/pconf"
"github.com/ccfos/nightingale/v6/pushgw/writer"
@@ -21,7 +22,6 @@ import (
"github.com/jinzhu/copier"
"github.com/pkg/errors"
"github.com/prometheus/prometheus/prompb"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)
@@ -36,6 +36,7 @@ func (rt *Router) alertRuleGets(c *gin.Context) {
for i := 0; i < len(ars); i++ {
ars[i].FillNotifyGroups(rt.Ctx, cache)
}
models.FillUpdateByNicknames(rt.Ctx, ars)
}
ginx.NewRender(c).Data(ars, err)
}
@@ -76,7 +77,6 @@ func (rt *Router) alertRuleGetsByGids(c *gin.Context) {
if err == nil {
cache := make(map[int64]*models.UserGroup)
rids := make([]int64, 0, len(ars))
names := make([]string, 0, len(ars))
for i := 0; i < len(ars); i++ {
ars[i].FillNotifyGroups(rt.Ctx, cache)
@@ -85,7 +85,6 @@ func (rt *Router) alertRuleGetsByGids(c *gin.Context) {
}
rids = append(rids, ars[i].Id)
names = append(names, ars[i].UpdateBy)
}
stime, etime := GetAlertCueEventTimeRange(c)
@@ -96,14 +95,7 @@ func (rt *Router) alertRuleGetsByGids(c *gin.Context) {
}
}
users := models.UserMapGet(rt.Ctx, "username in (?)", names)
if users != nil {
for i := 0; i < len(ars); i++ {
if user, exist := users[ars[i].UpdateBy]; exist {
ars[i].UpdateByNickname = user.Nickname
}
}
}
models.FillUpdateByNicknames(rt.Ctx, ars)
}
ginx.NewRender(c).Data(ars, err)
}
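Several hunks in this change delete a per-handler loop that built a user map and copied nicknames, replacing it with a single `models.FillUpdateByNicknames` call. A plausible shape for such a shared helper (the interface and method names here are assumptions for illustration, not the project's actual API):

```go
package main

import "fmt"

// user is a minimal stand-in for models.User.
type user struct{ Username, Nickname string }

// updateByCarrier abstracts any record with an UpdateBy field that can
// receive a nickname; the real models.FillUpdateByNicknames presumably
// does something similar across rules, boards, configs, etc.
type updateByCarrier interface {
	GetUpdateBy() string
	SetUpdateByNickname(string)
}

type alertRule struct{ UpdateBy, UpdateByNickname string }

func (r *alertRule) GetUpdateBy() string          { return r.UpdateBy }
func (r *alertRule) SetUpdateByNickname(n string) { r.UpdateByNickname = n }

// fillUpdateByNicknames does one batched lookup for all records instead
// of repeating the map-building loop in every handler.
func fillUpdateByNicknames(users map[string]user, items []updateByCarrier) {
	for _, it := range items {
		if u, ok := users[it.GetUpdateBy()]; ok {
			it.SetUpdateByNickname(u.Nickname)
		}
	}
}

func main() {
	users := map[string]user{"ning": {"ning", "Ning"}}
	r := &alertRule{UpdateBy: "ning"}
	fillUpdateByNicknames(users, []updateByCarrier{r})
	fmt.Println(r.UpdateByNickname) // Ning
}
```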
@@ -135,6 +127,7 @@ func (rt *Router) alertRulesGetByService(c *gin.Context) {
ars[i].DatasourceIdsJson = rt.DatasourceCache.GetIDsByDsCateAndQueries(ars[i].Cate, ars[i].DatasourceQueries)
}
}
models.FillUpdateByNicknames(rt.Ctx, ars)
}
ginx.NewRender(c).Data(ars, err)
}
@@ -889,3 +882,28 @@ func (rt *Router) batchAlertRuleClone(c *gin.Context) {
ginx.NewRender(c).Data(reterr, nil)
}
func (rt *Router) timezonesGet(c *gin.Context) {
// Return a list of common timezones, deduplicated by UTC offset
// (one representative timezone kept per offset).
timezones := []string{
"Local",
"UTC",
"Asia/Shanghai", // UTC+8 (also covers Asia/Hong_Kong, Asia/Singapore, etc.)
"Asia/Tokyo", // UTC+9 (also covers Asia/Seoul, etc.)
"Asia/Dubai", // UTC+4
"Asia/Kolkata", // UTC+5:30
"Asia/Bangkok", // UTC+7 (also covers Asia/Jakarta, etc.)
"Europe/London", // UTC+0 (same offset as UTC)
"Europe/Paris", // UTC+1 (also covers Europe/Berlin, Europe/Rome, Europe/Madrid, etc.)
"Europe/Moscow", // UTC+3
"America/New_York", // UTC-5 (also covers America/Toronto, etc.)
"America/Chicago", // UTC-6 (also covers America/Mexico_City, etc.)
"America/Denver", // UTC-7
"America/Los_Angeles", // UTC-8
"America/Sao_Paulo", // UTC-3
"Australia/Sydney", // UTC+10 (also covers Australia/Melbourne, etc.)
"Pacific/Auckland", // UTC+12
}
ginx.NewRender(c).Data(timezones, nil)
}
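Clients consuming this endpoint can resolve the returned names with `time.LoadLocation`, which handles the two special entries directly: "UTC" and "Local" never touch the tz database, while IANA names like "Asia/Shanghai" require tzdata to be present on the host. A small sketch:

```go
package main

import (
	"fmt"
	"time"
)

// resolveTZ loads a timezone name as returned by the timezones API.
// time.LoadLocation treats "Local" and "UTC" specially (no tzdata
// lookup), so those two entries work on any host; IANA names depend
// on the system's tz database.
func resolveTZ(name string) (*time.Location, error) {
	return time.LoadLocation(name)
}

func main() {
	for _, name := range []string{"Local", "UTC", "Asia/Shanghai"} {
		loc, err := resolveTZ(name)
		if err != nil {
			fmt.Println(name, "->", err)
			continue
		}
		fmt.Println(name, "->", loc.String())
	}
}
```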

View File

@@ -9,9 +9,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/common"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)
@@ -30,6 +30,7 @@ func (rt *Router) alertSubscribeGets(c *gin.Context) {
ginx.Dangerous(lst[i].FillDatasourceIds(rt.Ctx))
ginx.Dangerous(lst[i].DB2FE())
}
models.FillUpdateByNicknames(rt.Ctx, lst)
ginx.NewRender(c).Data(lst, err)
}
@@ -66,6 +67,7 @@ func (rt *Router) alertSubscribeGetsByGids(c *gin.Context) {
ginx.Dangerous(lst[i].FillDatasourceIds(rt.Ctx))
ginx.Dangerous(lst[i].DB2FE())
}
models.FillUpdateByNicknames(rt.Ctx, lst)
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)
@@ -260,6 +260,9 @@ func (rt *Router) boardGets(c *gin.Context) {
query := ginx.QueryStr(c, "query", "")
boards, err := models.BoardGetsByGroupId(rt.Ctx, bgid, query)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, boards)
}
ginx.NewRender(c).Data(boards, err)
}
@@ -273,6 +276,9 @@ func (rt *Router) publicBoardGets(c *gin.Context) {
ginx.Dangerous(err)
boards, err := models.BoardGets(rt.Ctx, "", "public=1 and (public_cate in (?) or id in (?))", []int64{0, 1}, boardIds)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, boards)
}
ginx.NewRender(c).Data(boards, err)
}
@@ -312,6 +318,7 @@ func (rt *Router) boardGetsByGids(c *gin.Context) {
boards[i].Bgids = ids
}
}
models.FillUpdateByNicknames(rt.Ctx, boards)
ginx.NewRender(c).Data(boards, err)
}

View File

@@ -8,10 +8,10 @@ import (
"strings"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/file"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/runner"
)

View File

@@ -5,9 +5,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"gorm.io/gorm"
)

View File

@@ -3,8 +3,8 @@ package router
import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) metricFilterGets(c *gin.Context) {
@@ -27,6 +27,8 @@ func (rt *Router) metricFilterGets(c *gin.Context) {
}
}
models.FillUpdateByNicknames(rt.Ctx, arr)
ginx.NewRender(c).Data(arr, err)
}

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/center/integration"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)

View File

@@ -9,8 +9,8 @@ import (
"github.com/BurntSushi/toml"
"github.com/ccfos/nightingale/v6/center/integration"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)

View File

@@ -5,9 +5,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
@@ -119,6 +119,9 @@ func (rt *Router) busiGroupGets(c *gin.Context) {
if len(lst) == 0 {
lst = []models.BusiGroup{}
}
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -5,9 +5,9 @@ import (
"time"
"github.com/ccfos/nightingale/v6/storage"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
captcha "github.com/mojocn/base64Captcha"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -5,9 +5,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) chartShareGets(c *gin.Context) {

View File

@@ -4,9 +4,9 @@ import (
"encoding/json"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) notifyChannelsGets(c *gin.Context) {

View File

@@ -4,9 +4,9 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
const EMBEDDEDDASHBOARD = "embedded-dashboards"
@@ -15,6 +15,9 @@ func (rt *Router) configsGet(c *gin.Context) {
prefix := ginx.QueryStr(c, "prefix", "")
limit := ginx.QueryInt(c, "limit", 10)
configs, err := models.ConfigsGets(rt.Ctx, prefix, limit, ginx.Offset(c, limit))
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, configs)
}
ginx.NewRender(c).Data(configs, err)
}

View File

@@ -2,9 +2,9 @@ package router
import (
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
type confPropCrypto struct {

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func checkAnnotationPermission(c *gin.Context, ctx *ctx.Context, dashboardId int64) {

View File

@@ -15,8 +15,8 @@ import (
"github.com/ccfos/nightingale/v6/datasource/opensearch"
"github.com/ccfos/nightingale/v6/dskit/clickhouse"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
"github.com/toolkits/pkg/logger"
)

View File

@@ -6,10 +6,10 @@ import (
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/dskit/types"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
func (rt *Router) ShowDatabases(c *gin.Context) {
@@ -18,7 +18,7 @@ func (rt *Router) ShowDatabases(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
@@ -48,7 +48,7 @@ func (rt *Router) ShowTables(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
@@ -78,7 +78,7 @@ func (rt *Router) DescribeTable(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
// Accept only one input parameter

View File

@@ -5,14 +5,15 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) embeddedProductGets(c *gin.Context) {
products, err := models.EmbeddedProductGets(rt.Ctx)
ginx.Dangerous(err)
models.FillUpdateByNicknames(rt.Ctx, products)
// Get the list of group IDs the current user can access
me := c.MustGet("user").(*models.User)

View File

@@ -3,10 +3,10 @@ package router
import (
"github.com/ccfos/nightingale/v6/datasource/es"
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
type IndexReq struct {
@@ -34,7 +34,7 @@ func (rt *Router) QueryIndices(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
@@ -50,7 +50,7 @@ func (rt *Router) QueryFields(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
@@ -66,7 +66,7 @@ func (rt *Router) QueryESVariable(c *gin.Context) {
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(c.Request.Context(), "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}

View File

@@ -5,8 +5,8 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
// Create an ES index pattern
@@ -69,6 +69,10 @@ func (rt *Router) esIndexPatternGetList(c *gin.Context) {
lst, err = models.EsIndexPatternGets(rt.Ctx, "")
}
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -0,0 +1,149 @@
package router
import (
"encoding/json"
"fmt"
"io"
"net/http"
"strconv"
"time"
"github.com/ccfos/nightingale/v6/alert/naming"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
)
// eventDetailPage renders an HTML log viewer page (for pages group).
func (rt *Router) eventDetailPage(c *gin.Context) {
hash := ginx.UrlParamStr(c, "hash")
if !loggrep.IsValidHash(hash) {
c.String(http.StatusBadRequest, "invalid hash format")
return
}
logs, instance, err := rt.getEventLogs(hash)
if err != nil {
c.String(http.StatusInternalServerError, "Error: %v", err)
return
}
c.Header("Content-Type", "text/html; charset=utf-8")
err = loggrep.RenderHTML(c.Writer, loggrep.PageData{
Hash: hash,
Instance: instance,
Logs: logs,
Total: len(logs),
})
if err != nil {
c.String(http.StatusInternalServerError, "render error: %v", err)
}
}
// eventDetailJSON returns JSON (for service group).
func (rt *Router) eventDetailJSON(c *gin.Context) {
hash := ginx.UrlParamStr(c, "hash")
if !loggrep.IsValidHash(hash) {
ginx.Bomb(200, "invalid hash format")
}
logs, instance, err := rt.getEventLogs(hash)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}
// getNodeForDatasource returns the alert engine instance responsible for the given
// datasource and primary key. It first checks the local hashring, and falls back
// to querying the database for active instances if the hashring is empty
// (e.g. when the datasource belongs to another engine cluster).
func (rt *Router) getNodeForDatasource(datasourceId int64, pk string) (string, error) {
dsIdStr := strconv.FormatInt(datasourceId, 10)
node, err := naming.DatasourceHashRing.GetNode(dsIdStr, pk)
if err == nil {
return node, nil
}
// Hashring is empty for this datasource (likely belongs to another engine cluster).
// Query the DB for active instances.
servers, dbErr := models.AlertingEngineGetsInstances(rt.Ctx,
"datasource_id = ? and clock > ?",
datasourceId, time.Now().Unix()-30)
if dbErr != nil {
return "", dbErr
}
if len(servers) == 0 {
return "", fmt.Errorf("no active instances for datasource %d", datasourceId)
}
ring := naming.NewConsistentHashRing(int32(naming.NodeReplicas), servers)
return ring.Get(pk)
}
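The fallback above rebuilds a consistent-hash ring from the active instances so the key lands on the same node that the remote engine cluster's own ring would pick. A generic stdlib-only sketch of that mapping (the project uses its internal `naming.NewConsistentHashRing`; this is only an illustration of the technique):

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// ring is a minimal consistent-hash ring: each node is hashed onto the
// circle at `replicas` virtual points, and a key maps to the first node
// point at or after the key's own hash.
type ring struct {
	points []uint32
	owner  map[uint32]string
}

func newRing(replicas int, nodes []string) *ring {
	r := &ring{owner: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			h := crc32.ChecksumIEEE([]byte(n + "#" + strconv.Itoa(i)))
			r.points = append(r.points, h)
			r.owner[h] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

func (r *ring) get(key string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the circle
	}
	return r.owner[r.points[i]]
}

func main() {
	r := newRing(100, []string{"10.0.0.1:17000", "10.0.0.2:17000"})
	// The same rule id always maps to the same node, which is what lets
	// a router rebuild the ring locally and still agree with its peers.
	fmt.Println(r.get("rule-42"))
}
```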
// getEventLogs resolves the target instance and retrieves logs.
func (rt *Router) getEventLogs(hash string) ([]string, string, error) {
event, err := models.AlertHisEventGetByHash(rt.Ctx, hash)
if err != nil {
return nil, "", err
}
if event == nil {
return nil, "", fmt.Errorf("no such alert event")
}
ruleId := strconv.FormatInt(event.RuleId, 10)
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
node, err := rt.getNodeForDatasource(event.DatasourceId, ruleId)
if err != nil || node == instance {
// hashring not ready or target is self, handle locally
logs, err := loggrep.GrepLogDir(rt.LogDir, hash)
return logs, instance, err
}
// forward to the target alert instance
return rt.forwardEventDetail(node, hash)
}
func (rt *Router) forwardEventDetail(node, hash string) ([]string, string, error) {
url := fmt.Sprintf("http://%s/v1/n9e/event-detail/%s", node, hash)
req, err := http.NewRequest("GET", url, nil)
if err != nil {
return nil, node, err
}
for user, pass := range rt.HTTP.APIForService.BasicAuth {
req.SetBasicAuth(user, pass)
break
}
client := &http.Client{Timeout: 15 * time.Second}
resp, err := client.Do(req)
if err != nil {
return nil, node, fmt.Errorf("forward to %s failed: %v", node, err)
}
defer resp.Body.Close()
body, err := io.ReadAll(io.LimitReader(resp.Body, 10*1024*1024)) // 10MB limit
if err != nil {
return nil, node, err
}
var result struct {
Dat loggrep.EventDetailResp `json:"dat"`
Err string `json:"err"`
}
if err := json.Unmarshal(body, &result); err != nil {
return nil, node, err
}
if result.Err != "" {
return nil, node, fmt.Errorf("%s", result.Err)
}
return result.Dat.Logs, result.Dat.Instance, nil
}

View File

@@ -8,10 +8,10 @@ import (
"github.com/ccfos/nightingale/v6/alert/pipeline/engine"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
"github.com/toolkits/pkg/logger"
)
@@ -35,6 +35,7 @@ func (rt *Router) eventPipelinesList(c *gin.Context) {
// Compatibility handling: auto-fill workflow fields
pipeline.FillWorkflowFields()
}
models.FillUpdateByNicknames(rt.Ctx, pipelines)
gids, err := models.MyGroupIdsMap(rt.Ctx, me.Id)
ginx.Dangerous(err)

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
const defaultLimit = 300

View File

@@ -15,9 +15,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -14,16 +14,16 @@ import (
"github.com/ccfos/nightingale/v6/pkg/dingtalk"
"github.com/ccfos/nightingale/v6/pkg/feishu"
"github.com/ccfos/nightingale/v6/pkg/ldapx"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/ccfos/nightingale/v6/pkg/oauth2x"
"github.com/ccfos/nightingale/v6/pkg/oidcx"
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/dgrijalva/jwt-go"
"github.com/gin-gonic/gin"
"github.com/pelletier/go-toml/v2"
"github.com/pkg/errors"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"gorm.io/gorm"
)
@@ -37,7 +37,9 @@ type loginForm struct {
func (rt *Router) loginPost(c *gin.Context) {
var f loginForm
ginx.BindJSON(c, &f)
logger.Infof("username:%s login from:%s", f.Username, c.ClientIP())
rctx := c.Request.Context()
logx.Infof(rctx, "username:%s login from:%s", f.Username, c.ClientIP())
if rt.HTTP.ShowCaptcha.Enable {
if !CaptchaVerify(f.Captchaid, f.Verifyvalue) {
@@ -50,23 +52,25 @@ func (rt *Router) loginPost(c *gin.Context) {
if rt.HTTP.RSA.OpenRSA {
decPassWord, err := secu.Decrypt(f.Password, rt.HTTP.RSA.RSAPrivateKey, rt.HTTP.RSA.RSAPassWord)
if err != nil {
logger.Errorf("RSA Decrypt failed: %v username: %s", err, f.Username)
logx.Errorf(rctx, "RSA Decrypt failed: %v username: %s", err, f.Username)
ginx.NewRender(c).Message(err)
return
}
authPassWord = decPassWord
}
reqCtx := rt.Ctx.WithContext(rctx)
var user *models.User
var err error
lc := rt.Sso.LDAP.Copy()
if lc.Enable {
user, err = ldapx.LdapLogin(rt.Ctx, f.Username, authPassWord, lc.DefaultRoles, lc.DefaultTeams, lc)
user, err = ldapx.LdapLogin(reqCtx, f.Username, authPassWord, lc.DefaultRoles, lc.DefaultTeams, lc)
if err != nil {
logger.Debugf("ldap login failed: %v username: %s", err, f.Username)
logx.Debugf(rctx, "ldap login failed: %v username: %s", err, f.Username)
var errLoginInN9e error
// fall back to native n9e login as a last resort
if user, errLoginInN9e = models.PassLogin(rt.Ctx, rt.Redis, f.Username, authPassWord); errLoginInN9e != nil {
if user, errLoginInN9e = models.PassLogin(reqCtx, rt.Redis, f.Username, authPassWord); errLoginInN9e != nil {
ginx.NewRender(c).Message("ldap login failed: %v; n9e login failed: %v", err, errLoginInN9e)
return
}
@@ -74,7 +78,7 @@ func (rt *Router) loginPost(c *gin.Context) {
user.RolesLst = strings.Fields(user.Roles)
}
} else {
user, err = models.PassLogin(rt.Ctx, rt.Redis, f.Username, authPassWord)
user, err = models.PassLogin(reqCtx, rt.Redis, f.Username, authPassWord)
ginx.Dangerous(err)
}
@@ -98,7 +102,8 @@ func (rt *Router) loginPost(c *gin.Context) {
}
func (rt *Router) logoutPost(c *gin.Context) {
logger.Infof("username:%s logout from:%s", c.GetString("username"), c.ClientIP())
rctx := c.Request.Context()
logx.Infof(rctx, "username:%s logout from:%s", c.GetString("username"), c.ClientIP())
metadata, err := rt.extractTokenMetadata(c.Request)
if err != nil {
ginx.NewRender(c, http.StatusBadRequest).Message("failed to parse jwt token")
@@ -117,7 +122,7 @@ func (rt *Router) logoutPost(c *gin.Context) {
// Fetch the user's id_token
idToken, err := rt.fetchIdToken(c.Request.Context(), user.Id)
if err != nil {
logger.Debugf("fetch id_token failed: %v, user_id: %d", err, user.Id)
logx.Debugf(rctx, "fetch id_token failed: %v, user_id: %d", err, user.Id)
idToken = "" // use an empty string if the fetch fails
}
@@ -220,7 +225,7 @@ func (rt *Router) refreshPost(c *gin.Context) {
// Note: this does not fetch a new id_token; it only extends the TTL of the existing id_token in Redis
if idToken, err := rt.fetchIdToken(c.Request.Context(), userid); err == nil && idToken != "" {
if err := rt.saveIdToken(c.Request.Context(), userid, idToken); err != nil {
logger.Debugf("refresh id_token ttl failed: %v, user_id: %d", err, userid)
logx.Debugf(c.Request.Context(), "refresh id_token ttl failed: %v, user_id: %d", err, userid)
}
}
@@ -271,12 +276,13 @@ type CallbackOutput struct {
}
func (rt *Router) loginCallback(c *gin.Context) {
rctx := c.Request.Context()
code := ginx.QueryStr(c, "code", "")
state := ginx.QueryStr(c, "state", "")
ret, err := rt.Sso.OIDC.Callback(rt.Redis, c.Request.Context(), code, state)
ret, err := rt.Sso.OIDC.Callback(rt.Redis, rctx, code, state)
if err != nil {
logger.Errorf("sso_callback fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
logx.Errorf(rctx, "sso_callback fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
ginx.NewRender(c).Data(CallbackOutput{}, err)
return
}
@@ -299,7 +305,7 @@ func (rt *Router) loginCallback(c *gin.Context) {
for _, gid := range rt.Sso.OIDC.DefaultTeams {
err = models.UserGroupMemberAdd(rt.Ctx, gid, user.Id)
if err != nil {
logger.Errorf("user:%v UserGroupMemberAdd: %s", user, err)
logx.Errorf(rctx, "user:%v UserGroupMemberAdd: %s", user, err)
}
}
}
@@ -309,12 +315,12 @@ func (rt *Router) loginCallback(c *gin.Context) {
userIdentity := fmt.Sprintf("%d-%s", user.Id, user.Username)
ts, err := rt.createTokens(rt.HTTP.JWTAuth.SigningKey, userIdentity)
ginx.Dangerous(err)
ginx.Dangerous(rt.createAuth(c.Request.Context(), userIdentity, ts))
ginx.Dangerous(rt.createAuth(rctx, userIdentity, ts))
// Save the id_token to Redis for use at logout
if ret.IdToken != "" {
if err := rt.saveIdToken(c.Request.Context(), user.Id, ret.IdToken); err != nil {
logger.Errorf("save id_token failed: %v, user_id: %d", err, user.Id)
if err := rt.saveIdToken(rctx, user.Id, ret.IdToken); err != nil {
logx.Errorf(rctx, "save id_token failed: %v, user_id: %d", err, user.Id)
}
}
@@ -355,7 +361,7 @@ func (rt *Router) loginRedirectCas(c *gin.Context) {
}
if !rt.Sso.CAS.Enable {
logger.Error("cas is not enable")
logx.Errorf(c.Request.Context(), "cas is not enable")
ginx.NewRender(c).Data("", nil)
return
}
@@ -370,17 +376,18 @@ func (rt *Router) loginRedirectCas(c *gin.Context) {
}
func (rt *Router) loginCallbackCas(c *gin.Context) {
rctx := c.Request.Context()
ticket := ginx.QueryStr(c, "ticket", "")
state := ginx.QueryStr(c, "state", "")
ret, err := rt.Sso.CAS.ValidateServiceTicket(c.Request.Context(), ticket, state, rt.Redis)
ret, err := rt.Sso.CAS.ValidateServiceTicket(rctx, ticket, state, rt.Redis)
if err != nil {
logger.Errorf("ValidateServiceTicket: %s", err)
logx.Errorf(rctx, "ValidateServiceTicket: %s", err)
ginx.NewRender(c).Data("", err)
return
}
user, err := models.UserGet(rt.Ctx, "username=?", ret.Username)
if err != nil {
logger.Errorf("UserGet: %s", err)
logx.Errorf(rctx, "UserGet: %s", err)
}
ginx.Dangerous(err)
if user != nil {
@@ -399,10 +406,10 @@ func (rt *Router) loginCallbackCas(c *gin.Context) {
userIdentity := fmt.Sprintf("%d-%s", user.Id, user.Username)
ts, err := rt.createTokens(rt.HTTP.JWTAuth.SigningKey, userIdentity)
if err != nil {
logger.Errorf("createTokens: %s", err)
logx.Errorf(rctx, "createTokens: %s", err)
}
ginx.Dangerous(err)
ginx.Dangerous(rt.createAuth(c.Request.Context(), userIdentity, ts))
ginx.Dangerous(rt.createAuth(rctx, userIdentity, ts))
redirect := "/"
if ret.Redirect != "/login" {
@@ -475,12 +482,13 @@ func (rt *Router) loginRedirectDingTalk(c *gin.Context) {
}
func (rt *Router) loginCallbackDingTalk(c *gin.Context) {
rctx := c.Request.Context()
code := ginx.QueryStr(c, "code", "")
state := ginx.QueryStr(c, "state", "")
ret, err := rt.Sso.DingTalk.Callback(rt.Redis, c.Request.Context(), code, state)
ret, err := rt.Sso.DingTalk.Callback(rt.Redis, rctx, code, state)
if err != nil {
logger.Errorf("sso_callback DingTalk fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
logx.Errorf(rctx, "sso_callback DingTalk fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
ginx.NewRender(c).Data(CallbackOutput{}, err)
return
}
@@ -550,12 +558,13 @@ func (rt *Router) loginRedirectFeiShu(c *gin.Context) {
}
func (rt *Router) loginCallbackFeiShu(c *gin.Context) {
rctx := c.Request.Context()
code := ginx.QueryStr(c, "code", "")
state := ginx.QueryStr(c, "state", "")
ret, err := rt.Sso.FeiShu.Callback(rt.Redis, c.Request.Context(), code, state)
ret, err := rt.Sso.FeiShu.Callback(rt.Redis, rctx, code, state)
if err != nil {
logger.Errorf("sso_callback FeiShu fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
logx.Errorf(rctx, "sso_callback FeiShu fail. code:%s, state:%s, get ret: %+v. error: %v", code, state, ret, err)
ginx.NewRender(c).Data(CallbackOutput{}, err)
return
}
@@ -571,12 +580,22 @@ func (rt *Router) loginCallbackFeiShu(c *gin.Context) {
} else {
user = new(models.User)
defaultRoles := []string{}
defaultUserGroups := []int64{}
if rt.Sso.FeiShu != nil && rt.Sso.FeiShu.FeiShuConfig != nil {
defaultRoles = rt.Sso.FeiShu.FeiShuConfig.DefaultRoles
defaultUserGroups = rt.Sso.FeiShu.FeiShuConfig.DefaultUserGroups
}
user.FullSsoFields(feishu.SsoTypeName, ret.Username, ret.Nickname, ret.Phone, ret.Email, defaultRoles)
// create user from feishu
ginx.Dangerous(user.Add(rt.Ctx))
if len(defaultUserGroups) > 0 {
err = user.AddToUserGroups(rt.Ctx, defaultUserGroups)
if err != nil {
logx.Errorf(rctx, "sso feishu add user group error %v %v", ret, err)
}
}
}
// set user login state
@@ -600,12 +619,13 @@ func (rt *Router) loginCallbackFeiShu(c *gin.Context) {
}
func (rt *Router) loginCallbackOAuth(c *gin.Context) {
rctx := c.Request.Context()
code := ginx.QueryStr(c, "code", "")
state := ginx.QueryStr(c, "state", "")
ret, err := rt.Sso.OAuth2.Callback(rt.Redis, c.Request.Context(), code, state)
ret, err := rt.Sso.OAuth2.Callback(rt.Redis, rctx, code, state)
if err != nil {
logger.Debugf("sso.callback() get ret %+v error %v", ret, err)
logx.Debugf(rctx, "sso.callback() get ret %+v error %v", ret, err)
ginx.NewRender(c).Data(CallbackOutput{}, err)
return
}

View File

@@ -12,10 +12,10 @@ import (
"github.com/ccfos/nightingale/v6/pkg/slice"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/tplx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) messageTemplatesAdd(c *gin.Context) {
@@ -154,6 +154,7 @@ func (rt *Router) messageTemplatesGet(c *gin.Context) {
lst, err := models.MessageTemplatesGetBy(rt.Ctx, notifyChannelIdents)
ginx.Dangerous(err)
models.FillUpdateByNicknames(rt.Ctx, lst)
if me.IsAdmin() {
ginx.NewRender(c).Data(lst, nil)

View File

@@ -2,9 +2,9 @@ package router
import (
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) metricsDescGetFile(c *gin.Context) {

View File

@@ -4,9 +4,9 @@ import (
"net/http"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
// no param

View File

@@ -9,9 +9,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/mute"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)
@@ -22,6 +22,9 @@ func (rt *Router) alertMuteGetsByBG(c *gin.Context) {
query := ginx.QueryStr(c, "query", "")
expired := ginx.QueryInt(c, "expired", -1)
lst, err := models.AlertMuteGets(rt.Ctx, prods, bgid, -1, expired, query)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}
@@ -47,6 +50,9 @@ func (rt *Router) alertMuteGetsByGids(c *gin.Context) {
}
lst, err := models.AlertMuteGetsByBGIds(rt.Ctx, gids)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}
@@ -58,6 +64,9 @@ func (rt *Router) alertMuteGets(c *gin.Context) {
disabled := ginx.QueryInt(c, "disabled", -1)
expired := ginx.QueryInt(c, "expired", -1)
lst, err := models.AlertMuteGets(rt.Ctx, prods, bgid, disabled, expired, query)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -11,11 +11,11 @@ import (
"github.com/ccfos/nightingale/v6/center/cstats"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/golang-jwt/jwt"
"github.com/google/uuid"
"github.com/toolkits/pkg/ginx"
)
const (

View File

@@ -6,9 +6,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/sender"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -11,8 +11,8 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) notifyChannelsAdd(c *gin.Context) {
@@ -118,6 +118,9 @@ func (rt *Router) notifyChannelGetBy(c *gin.Context) {
func (rt *Router) notifyChannelsGet(c *gin.Context) {
lst, err := models.NotifyChannelsGet(rt.Ctx, "", nil)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -10,10 +10,10 @@ import (
"github.com/ccfos/nightingale/v6/memsto"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/tplx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/pelletier/go-toml/v2"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)

View File

@@ -10,9 +10,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ctx"
"github.com/ccfos/nightingale/v6/pkg/slice"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
@@ -118,6 +118,7 @@ func (rt *Router) notifyRulesGet(c *gin.Context) {
lst, err := models.NotifyRulesGet(rt.Ctx, "", nil)
ginx.Dangerous(err)
models.FillUpdateByNicknames(rt.Ctx, lst)
if me.IsAdmin() {
ginx.NewRender(c).Data(lst, nil)
return
@@ -221,7 +222,7 @@ func SendNotifyChannelMessage(ctx *ctx.Context, userCache *memsto.UserCacheType,
return "", fmt.Errorf("failed to send flashduty notify: %v", err)
}
}
logger.Infof("channel_name: %v, event:%+v, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0], tplContent, customParams, resp, err)
logger.Infof("channel_name: %v, event:%s, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0].Hash, tplContent, customParams, resp, err)
return resp, nil
case "pagerduty":
client, err := models.GetHTTPClient(notifyChannel)
@@ -235,7 +236,7 @@ func SendNotifyChannelMessage(ctx *ctx.Context, userCache *memsto.UserCacheType,
return "", fmt.Errorf("failed to send pagerduty notify: %v", err)
}
}
logger.Infof("channel_name: %v, event:%+v, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0], tplContent, customParams, resp, err)
logger.Infof("channel_name: %v, event:%s, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0].Hash, tplContent, customParams, resp, err)
return resp, nil
case "http":
client, err := models.GetHTTPClient(notifyChannel)
@@ -253,7 +254,7 @@ func SendNotifyChannelMessage(ctx *ctx.Context, userCache *memsto.UserCacheType,
if dispatch.NeedBatchContacts(notifyChannel.RequestConfig.HTTPRequestConfig) || len(sendtos) == 0 {
resp, err = notifyChannel.SendHTTP(events, tplContent, customParams, sendtos, client)
logger.Infof("channel_name: %v, event:%+v, sendtos:%+v, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0], sendtos, tplContent, customParams, resp, err)
logger.Infof("channel_name: %v, event:%s, sendtos:%+v, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0].Hash, sendtos, tplContent, customParams, resp, err)
if err != nil {
return "", fmt.Errorf("failed to send http notify: %v", err)
}
@@ -261,7 +262,7 @@ func SendNotifyChannelMessage(ctx *ctx.Context, userCache *memsto.UserCacheType,
} else {
for i := range sendtos {
resp, err = notifyChannel.SendHTTP(events, tplContent, customParams, []string{sendtos[i]}, client)
logger.Infof("channel_name: %v, event:%+v, tplContent:%s, customParams:%v, sendto:%+v, respBody: %v, err: %v", notifyChannel.Name, events[0], tplContent, customParams, sendtos[i], resp, err)
logger.Infof("channel_name: %v, event:%s, tplContent:%s, customParams:%v, sendto:%+v, respBody: %v, err: %v", notifyChannel.Name, events[0].Hash, tplContent, customParams, sendtos[i], resp, err)
if err != nil {
return "", fmt.Errorf("failed to send http notify: %v", err)
}
@@ -280,7 +281,7 @@ func SendNotifyChannelMessage(ctx *ctx.Context, userCache *memsto.UserCacheType,
return resp, nil
case "script":
resp, _, err := notifyChannel.SendScript(events, tplContent, customParams, sendtos)
logger.Infof("channel_name: %v, event:%+v, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0], tplContent, customParams, resp, err)
logger.Infof("channel_name: %v, event:%s, tplContent:%s, customParams:%v, respBody: %v, err: %v", notifyChannel.Name, events[0].Hash, tplContent, customParams, resp, err)
return resp, err
default:
logger.Errorf("unsupported request type: %v", notifyChannel.RequestType)
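The http case above either sends once with the full recipient list (batch mode, or when there are no recipients) or loops sending to one recipient at a time. A minimal standalone sketch of that dispatch decision, with a stubbed sender in place of `notifyChannel.SendHTTP` (all names here are hypothetical):

```go
package main

import "fmt"

// dispatch mirrors the branch above: batch mode (or an empty
// recipient list) means a single call with everything; otherwise
// one call per recipient. send stands in for SendHTTP.
func dispatch(batch bool, sendtos []string, send func(tos []string)) {
	if batch || len(sendtos) == 0 {
		send(sendtos)
		return
	}
	for i := range sendtos {
		send([]string{sendtos[i]})
	}
}

func main() {
	calls := 0
	rec := func(tos []string) { calls++; fmt.Println(len(tos)) }
	dispatch(true, []string{"a", "b"}, rec)  // one call, 2 recipients
	dispatch(false, []string{"a", "b"}, rec) // two calls, 1 recipient each
	fmt.Println("calls:", calls)
}
```

In the real code each per-recipient call also logs and can return early on error; the sketch keeps only the branching shape.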

View File

@@ -11,9 +11,9 @@ import (
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/tplx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/str"
)
@@ -25,11 +25,14 @@ func (rt *Router) notifyTplGets(c *gin.Context) {
m[models.EmailSubject] = struct{}{}
lst, err := models.NotifyTplGets(rt.Ctx)
ginx.Dangerous(err)
for i := 0; i < len(lst); i++ {
if _, exists := m[lst[i].Channel]; exists {
lst[i].BuiltIn = true
}
}
models.FillUpdateByNicknames(rt.Ctx, lst)
ginx.NewRender(c).Data(lst, err)
}
@@ -200,6 +203,9 @@ func (rt *Router) messageTemplateGets(c *gin.Context) {
ident := ginx.QueryStr(c, "ident", "")
tpls, err := models.MessageTemplateGets(rt.Ctx, id, name, ident)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, tpls)
}
ginx.NewRender(c).Data(tpls, err)
}
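The notifyTplGets hunk above marks built-in templates by probing a `map[string]struct{}` set. A minimal standalone sketch of that idiom (the `Tpl` type here is a hypothetical stand-in for the model struct):

```go
package main

import "fmt"

// Tpl keeps only the fields the hunk above touches: a channel name
// plus a BuiltIn flag filled in at read time.
type Tpl struct {
	Channel string
	BuiltIn bool
}

// markBuiltIn flips BuiltIn for every template whose channel is in
// the set. map[string]struct{} is Go's zero-byte set idiom: the
// value carries no data, only membership matters.
func markBuiltIn(lst []Tpl, builtins map[string]struct{}) {
	for i := 0; i < len(lst); i++ {
		if _, exists := builtins[lst[i].Channel]; exists {
			lst[i].BuiltIn = true
		}
	}
}

func main() {
	set := map[string]struct{}{"email-subject": {}}
	lst := []Tpl{{Channel: "email-subject"}, {Channel: "webhook"}}
	markBuiltIn(lst, set)
	fmt.Println(lst[0].BuiltIn, lst[1].BuiltIn) // true false
}
```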

View File

@@ -3,9 +3,9 @@ package router
import (
"github.com/ccfos/nightingale/v6/datasource/opensearch"
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -12,12 +12,13 @@ import (
"sync"
"time"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/ccfos/nightingale/v6/pkg/poster"
pkgprom "github.com/ccfos/nightingale/v6/pkg/prom"
"github.com/ccfos/nightingale/v6/prom"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/prometheus/common/model"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"github.com/toolkits/pkg/net/httplib"
)
@@ -38,15 +39,16 @@ func (rt *Router) promBatchQueryRange(c *gin.Context) {
var f BatchQueryForm
ginx.Dangerous(c.BindJSON(&f))
lst, err := PromBatchQueryRange(rt.PromClients, f)
lst, err := PromBatchQueryRange(c.Request.Context(), rt.PromClients, f)
ginx.NewRender(c).Data(lst, err)
}
func PromBatchQueryRange(pc *prom.PromClientMap, f BatchQueryForm) ([]model.Value, error) {
func PromBatchQueryRange(ctx context.Context, pc *prom.PromClientMap, f BatchQueryForm) ([]model.Value, error) {
var lst []model.Value
cli := pc.GetCli(f.DatasourceId)
if cli == nil {
logx.Warningf(ctx, "no such datasource id: %d", f.DatasourceId)
return lst, fmt.Errorf("no such datasource id: %d", f.DatasourceId)
}
@@ -57,8 +59,9 @@ func PromBatchQueryRange(pc *prom.PromClientMap, f BatchQueryForm) ([]model.Valu
Step: time.Duration(item.Step) * time.Second,
}
resp, _, err := cli.QueryRange(context.Background(), item.Query, r)
resp, _, err := cli.QueryRange(ctx, item.Query, r)
if err != nil {
logx.Warningf(ctx, "query range error: query:%s err:%v", item.Query, err)
return lst, err
}
@@ -81,22 +84,23 @@ func (rt *Router) promBatchQueryInstant(c *gin.Context) {
var f BatchInstantForm
ginx.Dangerous(c.BindJSON(&f))
lst, err := PromBatchQueryInstant(rt.PromClients, f)
lst, err := PromBatchQueryInstant(c.Request.Context(), rt.PromClients, f)
ginx.NewRender(c).Data(lst, err)
}
func PromBatchQueryInstant(pc *prom.PromClientMap, f BatchInstantForm) ([]model.Value, error) {
func PromBatchQueryInstant(ctx context.Context, pc *prom.PromClientMap, f BatchInstantForm) ([]model.Value, error) {
var lst []model.Value
cli := pc.GetCli(f.DatasourceId)
if cli == nil {
logger.Warningf("no such datasource id: %d", f.DatasourceId)
logx.Warningf(ctx, "no such datasource id: %d", f.DatasourceId)
return lst, fmt.Errorf("no such datasource id: %d", f.DatasourceId)
}
for _, item := range f.Queries {
resp, _, err := cli.Query(context.Background(), item.Query, time.Unix(item.Time, 0))
resp, _, err := cli.Query(ctx, item.Query, time.Unix(item.Time, 0))
if err != nil {
logx.Warningf(ctx, "query instant error: query:%s err:%v", item.Query, err)
return lst, err
}
@@ -189,7 +193,7 @@ func (rt *Router) dsProxy(c *gin.Context) {
modifyResponse := func(r *http.Response) error {
if r.StatusCode == http.StatusUnauthorized {
logger.Warningf("proxy path:%s unauthorized access ", c.Request.URL.Path)
logx.Warningf(c.Request.Context(), "proxy path:%s unauthorized access ", c.Request.URL.Path)
return fmt.Errorf("unauthorized access")
}

View File

@@ -8,9 +8,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/eval"
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/logx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
type CheckDsPermFunc func(c *gin.Context, dsId int64, cate string, q interface{}) bool
@@ -47,6 +47,7 @@ func QueryLogBatchConcurrently(anonymousAccess bool, ctx *gin.Context, f QueryFr
var mu sync.Mutex
var wg sync.WaitGroup
var errs []error
rctx := ctx.Request.Context()
for _, q := range f.Queries {
if !anonymousAccess && !CheckDsPerm(ctx, q.Did, q.DsCate, q) {
@@ -55,14 +56,14 @@ func QueryLogBatchConcurrently(anonymousAccess bool, ctx *gin.Context, f QueryFr
plug, exists := dscache.DsCache.Get(q.DsCate, q.Did)
if !exists {
logger.Warningf("cluster:%d not exists query:%+v", q.Did, q)
logx.Warningf(rctx, "cluster:%d not exists query:%+v", q.Did, q)
return LogResp{}, fmt.Errorf("cluster not exists")
}
// Render the Query template according to the datasource type
err := eval.ExecuteQueryTemplate(q.DsCate, q.Query, nil)
if err != nil {
logger.Warningf("query template execute error: %v", err)
logx.Warningf(rctx, "query template execute error: %v", err)
return LogResp{}, fmt.Errorf("query template execute error: %v", err)
}
@@ -70,12 +71,12 @@ func QueryLogBatchConcurrently(anonymousAccess bool, ctx *gin.Context, f QueryFr
go func(query Query) {
defer wg.Done()
data, total, err := plug.QueryLog(ctx.Request.Context(), query.Query)
data, total, err := plug.QueryLog(rctx, query.Query)
mu.Lock()
defer mu.Unlock()
if err != nil {
errMsg := fmt.Sprintf("query data error: %v query:%v\n ", err, query)
logger.Warningf(errMsg)
logx.Warningf(rctx, "%s", errMsg)
errs = append(errs, err)
return
}
@@ -121,6 +122,7 @@ func QueryDataConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Quer
var mu sync.Mutex
var wg sync.WaitGroup
var errs []error
rctx := ctx.Request.Context()
for _, q := range f.Queries {
if !anonymousAccess && !CheckDsPerm(ctx, f.DatasourceId, f.Cate, q) {
@@ -129,7 +131,7 @@ func QueryDataConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Quer
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(rctx, "cluster:%d not exists", f.DatasourceId)
return nil, fmt.Errorf("cluster not exists")
}
@@ -137,16 +139,16 @@ func QueryDataConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Quer
go func(query interface{}) {
defer wg.Done()
data, err := plug.QueryData(ctx.Request.Context(), query)
data, err := plug.QueryData(rctx, query)
if err != nil {
logger.Warningf("query data error: req:%+v err:%v", query, err)
logx.Warningf(rctx, "query data error: req:%+v err:%v", query, err)
mu.Lock()
errs = append(errs, err)
mu.Unlock()
return
}
logger.Debugf("query data: req:%+v resp:%+v", query, data)
logx.Debugf(rctx, "query data: req:%+v resp:%+v", query, data)
mu.Lock()
resp = append(resp, data...)
mu.Unlock()
@@ -192,6 +194,7 @@ func QueryLogConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Query
var mu sync.Mutex
var wg sync.WaitGroup
var errs []error
rctx := ctx.Request.Context()
for _, q := range f.Queries {
if !anonymousAccess && !CheckDsPerm(ctx, f.DatasourceId, f.Cate, q) {
@@ -200,7 +203,7 @@ func QueryLogConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Query
plug, exists := dscache.DsCache.Get(f.Cate, f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists query:%+v", f.DatasourceId, f)
logx.Warningf(rctx, "cluster:%d not exists query:%+v", f.DatasourceId, f)
return LogResp{}, fmt.Errorf("cluster not exists")
}
@@ -208,11 +211,11 @@ func QueryLogConcurrently(anonymousAccess bool, ctx *gin.Context, f models.Query
go func(query interface{}) {
defer wg.Done()
data, total, err := plug.QueryLog(ctx.Request.Context(), query)
logger.Debugf("query log: req:%+v resp:%+v", query, data)
data, total, err := plug.QueryLog(rctx, query)
logx.Debugf(rctx, "query log: req:%+v resp:%+v", query, data)
if err != nil {
errMsg := fmt.Sprintf("query data error: %v query:%v\n ", err, query)
logger.Warningf(errMsg)
logx.Warningf(rctx, "%s", errMsg)
mu.Lock()
errs = append(errs, err)
mu.Unlock()
@@ -250,6 +253,7 @@ func (rt *Router) QueryLogV2(c *gin.Context) {
func (rt *Router) QueryLog(c *gin.Context) {
var f models.QueryParam
ginx.BindJSON(c, &f)
rctx := c.Request.Context()
var resp []interface{}
for _, q := range f.Queries {
@@ -259,13 +263,13 @@ func (rt *Router) QueryLog(c *gin.Context) {
plug, exists := dscache.DsCache.Get("elasticsearch", f.DatasourceId)
if !exists {
logger.Warningf("cluster:%d not exists", f.DatasourceId)
logx.Warningf(rctx, "cluster:%d not exists", f.DatasourceId)
ginx.Bomb(200, "cluster not exists")
}
data, _, err := plug.QueryLog(c.Request.Context(), q)
data, _, err := plug.QueryLog(rctx, q)
if err != nil {
logger.Warningf("query data error: %v", err)
logx.Warningf(rctx, "query data error: %v", err)
ginx.Bomb(200, "err:%v", err)
continue
}
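QueryDataConcurrently and QueryLogConcurrently above share one shape: fan queries out in goroutines, merge results and errors under a mutex, and wait on a WaitGroup before returning. A stripped-down sketch of that pattern (the query function is a stand-in for `plug.QueryData`):

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut runs query once per input in its own goroutine, collects
// results and errors under a mutex, and waits for every goroutine
// before returning. The loop variable is passed as an argument so
// each goroutine captures its own copy.
func fanOut(inputs []int, query func(int) (int, error)) ([]int, []error) {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		resp []int
		errs []error
	)
	for _, in := range inputs {
		wg.Add(1)
		go func(q int) {
			defer wg.Done()
			data, err := query(q)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, err)
				return
			}
			resp = append(resp, data)
		}(in)
	}
	wg.Wait()
	return resp, errs
}

func main() {
	resp, errs := fanOut([]int{1, 2, 3}, func(q int) (int, error) { return q * q, nil })
	fmt.Println(len(resp), len(errs)) // 3 0
}
```

Note that result order is nondeterministic, which is fine here because the real code appends heterogeneous per-query payloads.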

View File

@@ -7,14 +7,17 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) recordingRuleGets(c *gin.Context) {
busiGroupId := ginx.UrlParamInt64(c, "id")
ars, err := models.RecordingRuleGets(rt.Ctx, busiGroupId)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, ars)
}
ginx.NewRender(c).Data(ars, err)
}
@@ -39,6 +42,9 @@ func (rt *Router) recordingRuleGetsByGids(c *gin.Context) {
}
ars, err := models.RecordingRuleGetsByBGIds(rt.Ctx, gids)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, ars)
}
ginx.NewRender(c).Data(ars, err)
}

View File

@@ -6,9 +6,9 @@ import (
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) rolesGets(c *gin.Context) {

View File

@@ -5,8 +5,8 @@ import (
"github.com/ccfos/nightingale/v6/center/cconf"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)

View File

@@ -5,9 +5,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/slice"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) savedViewGets(c *gin.Context) {
@@ -20,6 +20,7 @@ func (rt *Router) savedViewGets(c *gin.Context) {
ginx.NewRender(c).Data(nil, err)
return
}
models.FillUpdateByNicknames(rt.Ctx, lst)
userGids, err := models.MyGroupIds(rt.Ctx, me.Id)
if err != nil {

View File

@@ -5,10 +5,10 @@ import (
"github.com/ccfos/nightingale/v6/pkg/flashduty"
"github.com/ccfos/nightingale/v6/pkg/ormx"
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/google/uuid"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -4,9 +4,9 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) serversGet(c *gin.Context) {

View File

@@ -5,10 +5,10 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/google/uuid"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
// sourceTokenAdd generates a new source token

View File

@@ -13,10 +13,10 @@ import (
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pushgw/idents"
"github.com/ccfos/nightingale/v6/storage"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/prometheus/common/model"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/alert/sender"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
)

View File

@@ -8,9 +8,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/i18n"
"github.com/toolkits/pkg/str"
)
@@ -25,6 +25,7 @@ func (rt *Router) taskTplGets(c *gin.Context) {
list, err := models.TaskTplGets(rt.Ctx, []int64{groupId}, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
models.FillUpdateByNicknames(rt.Ctx, list)
ginx.NewRender(c).Data(gin.H{
"total": total,
@@ -60,6 +61,7 @@ func (rt *Router) taskTplGetsByGids(c *gin.Context) {
list, err := models.TaskTplGets(rt.Ctx, gids, query, limit, ginx.Offset(c, limit))
ginx.Dangerous(err)
models.FillUpdateByNicknames(rt.Ctx, list)
ginx.NewRender(c).Data(gin.H{
"total": total,

View File

@@ -8,8 +8,8 @@ import (
"github.com/ccfos/nightingale/v6/datasource/tdengine"
"github.com/ccfos/nightingale/v6/dscache"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
type databasesQueryForm struct {

View File

@@ -0,0 +1,136 @@
package router
import (
"encoding/json"
"fmt"
"io"
"net/http"
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/ccfos/nightingale/v6/pkg/loggrep"
"github.com/toolkits/pkg/logger"
"github.com/gin-gonic/gin"
)
// traceLogsPage renders an HTML log viewer page for trace logs.
func (rt *Router) traceLogsPage(c *gin.Context) {
traceId := ginx.UrlParamStr(c, "traceid")
if !loggrep.IsValidTraceID(traceId) {
c.String(http.StatusBadRequest, "invalid trace id format")
return
}
logs, instance, err := rt.getTraceLogs(traceId)
if err != nil {
c.String(http.StatusInternalServerError, "Error: %v", err)
return
}
c.Header("Content-Type", "text/html; charset=utf-8")
err = loggrep.RenderTraceLogsHTML(c.Writer, loggrep.TraceLogsPageData{
TraceID: traceId,
Instance: instance,
Logs: logs,
Total: len(logs),
})
if err != nil {
c.String(http.StatusInternalServerError, "render error: %v", err)
}
}
// traceLogsJSON returns JSON for trace logs.
func (rt *Router) traceLogsJSON(c *gin.Context) {
traceId := ginx.UrlParamStr(c, "traceid")
if !loggrep.IsValidTraceID(traceId) {
ginx.Bomb(200, "invalid trace id format")
}
logs, instance, err := rt.getTraceLogs(traceId)
ginx.Dangerous(err)
ginx.NewRender(c).Data(loggrep.EventDetailResp{
Logs: logs,
Instance: instance,
}, nil)
}
// getTraceLogs finds the same-engine instances and queries each one
// until trace logs are found. Trace logs belong to a single instance.
func (rt *Router) getTraceLogs(traceId string) ([]string, string, error) {
keyword := "trace_id=" + traceId
instance := fmt.Sprintf("%s:%d", rt.Alert.Heartbeat.IP, rt.HTTP.Port)
engineName := rt.Alert.Heartbeat.EngineName
// try local first
logs, err := loggrep.GrepLatestLogFiles(rt.LogDir, keyword)
if err == nil && len(logs) > 0 {
return logs, instance, nil
}
// find all instances with the same engineName
servers, err := models.AlertingEngineGetsInstances(rt.Ctx,
"engine_cluster = ? and clock > ?",
engineName, time.Now().Unix()-30)
if err != nil {
return nil, "", err
}
// loop through remote instances until we find logs
for _, node := range servers {
if node == instance {
continue // already tried local
}
logs, nodeAddr, err := rt.forwardTraceLogs(node, traceId)
if err != nil {
logger.Errorf("forwardTraceLogs failed: %v", err)
continue
}
if len(logs) > 0 {
return logs, nodeAddr, nil
}
}
return nil, instance, nil
}
func (rt *Router) forwardTraceLogs(node, traceId string) ([]string, string, error) {
url := fmt.Sprintf("http://%s/v1/n9e/trace-logs/%s", node, traceId)
req, err := http.NewRequest("GET", url, nil)
if err != nil {
return nil, node, err
}
for user, pass := range rt.HTTP.APIForService.BasicAuth {
req.SetBasicAuth(user, pass)
break
}
client := &http.Client{Timeout: 15 * time.Second}
resp, err := client.Do(req)
if err != nil {
return nil, node, fmt.Errorf("forward to %s failed: %v", node, err)
}
defer resp.Body.Close()
body, err := io.ReadAll(io.LimitReader(resp.Body, 10*1024*1024))
if err != nil {
return nil, node, err
}
var result struct {
Dat loggrep.EventDetailResp `json:"dat"`
Err string `json:"err"`
}
if err := json.Unmarshal(body, &result); err != nil {
return nil, node, err
}
if result.Err != "" {
return nil, node, fmt.Errorf("%s", result.Err)
}
return result.Dat.Logs, result.Dat.Instance, nil
}
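forwardTraceLogs above unwraps the standard `{"dat": ..., "err": ...}` render envelope that `ginx.NewRender(c).Data(...)` produces on the remote side. A self-contained sketch of that decode step; the `eventDetail` struct and its JSON field names are assumptions standing in for `loggrep.EventDetailResp`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventDetail is a local stand-in for loggrep.EventDetailResp;
// the json tags here are assumed for the sketch.
type eventDetail struct {
	Logs     []string `json:"logs"`
	Instance string   `json:"instance"`
}

// decodeEnvelope unwraps the {"dat": ..., "err": ...} response
// body: a non-empty err field is surfaced as a Go error, otherwise
// the payload fields are returned.
func decodeEnvelope(body []byte) ([]string, string, error) {
	var result struct {
		Dat eventDetail `json:"dat"`
		Err string      `json:"err"`
	}
	if err := json.Unmarshal(body, &result); err != nil {
		return nil, "", err
	}
	if result.Err != "" {
		return nil, "", fmt.Errorf("%s", result.Err)
	}
	return result.Dat.Logs, result.Dat.Instance, nil
}

func main() {
	body := []byte(`{"dat":{"logs":["a","b"],"instance":"10.0.0.1:17000"},"err":""}`)
	logs, inst, err := decodeEnvelope(body)
	fmt.Println(len(logs), inst, err)
}
```

The real handler additionally caps the body with `io.LimitReader` before decoding, which guards against an oversized or malicious peer response.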

View File

@@ -9,9 +9,9 @@ import (
"github.com/ccfos/nightingale/v6/pkg/flashduty"
"github.com/ccfos/nightingale/v6/pkg/ormx"
"github.com/ccfos/nightingale/v6/pkg/secu"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
"gorm.io/gorm"
)

View File

@@ -7,9 +7,9 @@ import (
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/flashduty"
"github.com/ccfos/nightingale/v6/pkg/strx"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
"github.com/toolkits/pkg/logger"
)
@@ -27,6 +27,9 @@ func (rt *Router) userGroupGets(c *gin.Context) {
me := c.MustGet("user").(*models.User)
lst, err := me.UserGroups(rt.Ctx, limit, query)
if err == nil {
models.FillUpdateByNicknames(rt.Ctx, lst)
}
ginx.NewRender(c).Data(lst, err)
}

View File

@@ -5,9 +5,9 @@ import (
"time"
"github.com/ccfos/nightingale/v6/models"
"github.com/ccfos/nightingale/v6/pkg/ginx"
"github.com/gin-gonic/gin"
"github.com/toolkits/pkg/ginx"
)
func (rt *Router) userVariableConfigGets(context *gin.Context) {

View File

@@ -87,7 +87,7 @@ func Initialize(configDir string, cryptoKey string) (func(), error) {
alert.Start(config.Alert, config.Pushgw, syncStats, alertStats, externalProcessors, targetCache, busiGroupCache, alertMuteCache,
alertRuleCache, notifyConfigCache, taskTplsCache, dsCache, ctx, promClients, userCache, userGroupCache, notifyRuleCache, notifyChannelCache, messageTemplateCache, configCvalCache)
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors)
alertrtRouter := alertrt.New(config.HTTP, config.Alert, alertMuteCache, targetCache, busiGroupCache, alertStats, ctx, externalProcessors, config.Log.Dir)
alertrtRouter.Config(r)

Some files were not shown because too many files have changed in this diff Show More