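I recently introduced skills into a project, moving all the rules that previously had to be spelled out in the system prompt into the SKILL.md file of the corresponding skill.
That said, AI tooling is iterating so fast that the Spring AI Alibaba documentation cannot keep up, and the framework itself has quite a few bugs.
Local functional testing went fine, but after deploying to the server the log showed Skill not found: Skill not found: lot-query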
INFO 1 --- [oundedElastic-1] c.a.c.ai.graph.agent.node.AgentLlmNode : [ThreadId 07f735f0-fd80-49b8-b4e5-7e90cfc2eb35] Agent ??? reasoning round 0 streaming output: [ToolCall[id=call_03da1a646a1a410bb82cf6, type=function, name=read_skill, arguments={"skill_name": "lot-query"}]]
2026-03-17T03:01:57.046330027Z 2026-03-17T11:01:57.046+08:00 INFO 1 --- [oundedElastic-1] c.a.c.ai.graph.agent.node.AgentToolNode : [ThreadId 07f735f0-fd80-49b8-b4e5-7e90cfc2eb35] Agent ??? acting with 1 tools.
2026-03-17T03:01:57.048982723Z 2026-03-17T11:01:57.048+08:00 INFO 1 --- [oundedElastic-1] c.a.c.ai.graph.agent.node.AgentToolNode : [ThreadId 07f735f0-fd80-49b8-b4e5-7e90cfc2eb35] Agent ??? acting, executing tool read_skill.
2026-03-17T03:01:57.050245886Z 2026-03-17T11:01:57.050+08:00 WARN 1 --- [oundedElastic-1] c.a.c.a.g.a.hook.skills.ReadSkillTool : **Skill not found: Skill not found: lot-query**
2026-03-17T03:01:57.051826871Z 2026-03-17T11:01:57.051+08:00 INFO 1 --- [oundedElastic-1] c.a.c.ai.graph.agent.node.AgentToolNode : [ThreadId 07f735f0-fd80-49b8-b4e5-7e90cfc2eb35] Agent ??? acting, tool read_skill finished
2026-03-17T03:01:57.053390410Z 2026-03-17T11:01:57.052+08:00 INFO 1 --- [oundedElastic-1] c.a.c.ai.graph.agent.node.AgentToolNode : [ThreadId 07f735f0-fd80-49b8-b4e5-7e90cfc2eb35] Agent ??? acting returned: ToolResponseMessage{responses=[ToolResponse[id=call_03da1a646a1a410bb82cf6, name=read_skill, responseData="Error: Skill not found: lot-query"]], messageType=TOOL, metadata={messageType=TOOL}}
It turns out that the framework's ClasspathSkillRegistry has a bug in the Spring Boot fat jar scenario: it fails to load the SKILL.md files.
- The packaged jar was confirmed to contain the skill files, for example:
BOOT-INF/classes/skills/cpe-operation/SKILL.md
BOOT-INF/classes/skills/lot-operation/SKILL.md
BOOT-INF/classes/skills/lot-query/SKILL.md
BOOT-INF/classes/skills/true-north-router/SKILL.md
- But the application startup log shows:
TrueNorth skill registry loaded 0 skills.
TrueNorth skill registry is empty at startup.
- Then, when the model calls read_skill("lot-query"), it gets:
Skill not found: lot-query
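A quick way to narrow down a symptom like this is to check how a skill file actually resolves at runtime. The sketch below is a hypothetical helper (SkillResourceProbe is not part of Spring AI Alibaba) that uses only the JDK. It reports the URL protocol of a classpath resource and confirms that it can be read as a stream. When running from an IDE the protocol is typically file, while inside a packaged jar it is jar, and loading code that treats the resource as a java.io.File only works in the first case. Whether that is the exact cause inside ClasspathSkillRegistry is not established here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class SkillResourceProbe {

    /** Describes how a classpath resource resolves: missing, or protocol plus size in bytes. */
    public static String describe(ClassLoader loader, String resourcePath) throws IOException {
        URL url = loader.getResource(resourcePath);
        if (url == null) {
            return "missing: " + resourcePath;
        }
        // Reading through the URL stream works for both file: and jar: resources.
        try (InputStream in = url.openStream()) {
            int bytes = in.readAllBytes().length;
            return url.getProtocol() + ": " + resourcePath + " (" + bytes + " bytes)";
        }
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // The path is taken from the jar listing above, relative to BOOT-INF/classes.
        System.out.println(describe(loader, "skills/lot-query/SKILL.md"));
    }
}
```

If the probe reports the file as present and readable on the server while the registry still loads 0 skills, the problem lies in how the registry discovers the files, not in the packaging.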
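I used AI to analyze the source code. AI makes learning a lot more efficient: before, I had to dig through the source by hand and draw flowcharts to work out the call relationships.
Now I first have the AI analyze the source and output various structure diagrams to quickly map out the framework's backbone,
then ask targeted questions and have it point to the relevant source locations, so the details are clear at a glance.
I submitted the AI's analysis as an issue: https://github.com/alibaba/spring-ai-alibaba/issues/4426
Then I had the AI implement a new SkillRegistry to work around the problem.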
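For illustration, here is a minimal sketch of what jar-safe skill discovery can look like. It is not the code from this post or from the issue, and it does not implement the framework's SkillRegistry interface: JarSafeSkillLoader and its load method are hypothetical names, and only JDK classes are used. It finds every <root>/<name>/SKILL.md on the classpath, walking the directory when the root resolves to a file: URL and enumerating jar entries when it resolves to a jar: URL. It assumes the jar contains a directory entry for the root, which build tools usually write.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

public class JarSafeSkillLoader {

    private static final String SKILL_FILE = "SKILL.md";

    /** Returns skill name -> SKILL.md content for every root/name/SKILL.md found on the classpath. */
    public static Map<String, String> load(ClassLoader loader, String root) throws IOException {
        Map<String, String> skills = new TreeMap<>();
        Enumeration<URL> roots = loader.getResources(root);
        while (roots.hasMoreElements()) {
            URL url = roots.nextElement();
            if ("file".equals(url.getProtocol())) {
                loadFromDirectory(url, skills);   // exploded classes, e.g. when run from an IDE
            } else if ("jar".equals(url.getProtocol())) {
                loadFromJar(url, skills);         // packaged jar
            }
        }
        return skills;
    }

    private static void loadFromDirectory(URL rootUrl, Map<String, String> skills) throws IOException {
        Path rootDir;
        try {
            rootDir = Path.of(rootUrl.toURI());
        } catch (URISyntaxException e) {
            throw new IOException("Bad classpath URL: " + rootUrl, e);
        }
        try (Stream<Path> children = Files.list(rootDir)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                Path skillFile = child.resolve(SKILL_FILE);
                if (Files.isRegularFile(skillFile)) {
                    skills.putIfAbsent(child.getFileName().toString(), Files.readString(skillFile));
                }
            }
        }
    }

    private static void loadFromJar(URL rootUrl, Map<String, String> skills) throws IOException {
        URLConnection raw = rootUrl.openConnection();
        if (!(raw instanceof JarURLConnection)) {
            return; // unknown handler: skip rather than guess
        }
        JarURLConnection conn = (JarURLConnection) raw;
        conn.setUseCaches(false); // get a private JarFile so closing it cannot affect the class loader
        String prefix = conn.getEntryName() == null ? "" : conn.getEntryName();
        if (!prefix.isEmpty() && !prefix.endsWith("/")) {
            prefix += "/";
        }
        try (JarFile jar = conn.getJarFile()) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(prefix)) {
                    continue;
                }
                // Keep only entries shaped like <prefix><skill-name>/SKILL.md
                String[] parts = name.substring(prefix.length()).split("/");
                if (parts.length == 2 && SKILL_FILE.equals(parts[1])) {
                    try (InputStream in = jar.getInputStream(entry)) {
                        skills.putIfAbsent(parts[0], new String(in.readAllBytes(), StandardCharsets.UTF_8));
                    }
                }
            }
        }
    }
}
```

A call such as JarSafeSkillLoader.load(Thread.currentThread().getContextClassLoader(), "skills") returns a map keyed by skill name, which a custom registry could then serve to read_skill. In a Spring application, the same idea is usually expressed with Spring's resource pattern resolver and a classpath*: pattern, reading each match through its input stream instead of as a file.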