fs-lawrisk/docs/PRD.md

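I need you to help me build a retrieval system. Users enter a question in Chinese and expect it to be matched to the corresponding subject items. The output is each matched item's ID, name, and list of permit items. For example: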

User input: 我要办一家电影院 ("I want to open a cinema")

Output:

    "risk_subject": [
      {
        "id": "384aeb24a23e913268aad33354f705e7",
        "name": "开办电影院",
        "permit_ids": [
          "04bfa019634ca1aa0b9f7c783fd85dce",
          "509b2872fc7c38c08f252a2b426fd49f",
          "54a79077-bd72-4ea9-8bb1-35afc69e2973",
          "709b4718d72229311066e529650b8abf",
          "8d49de002f24d37fcf3663574723e693",
          "8f7c8c613adfbd815a78c1e60ec4330e",
          "a0572119839422e1d11ee8801d6c58b7",
          "fa2f3e05c92297be096b63e25d30bfbe"
        ]
      }
    ]
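
The output shape above could be assembled from the matched items and their permit IDs roughly as follows. This is a minimal sketch, not a confirmed design: `build_response` and the in-memory `permit_map` are hypothetical names, and wrapping `risk_subject` in an outer JSON object is an assumption, since the example shows only a fragment.

```python
import json
from typing import Dict, List, Tuple


def build_response(
    matches: List[Tuple[str, str]],
    permit_map: Dict[str, List[str]],
) -> dict:
    """Assemble the response body from matched items and their permit IDs.

    matches: (item_id, item_name) pairs, already filtered and ordered.
    permit_map: item_id -> list of permit IDs (hypothetical in-memory form
    of the subject-to-permit mapping).
    """
    return {
        "risk_subject": [
            {
                "id": item_id,
                "name": name,
                # Items with no known permits get an empty list.
                "permit_ids": list(permit_map.get(item_id, [])),
            }
            for item_id, name in matches
        ]
    }


if __name__ == "__main__":
    demo = build_response(
        [("384aeb24a23e913268aad33354f705e7", "开办电影院")],
        {"384aeb24a23e913268aad33354f705e7": ["04bfa019634ca1aa0b9f7c783fd85dce"]},
    )
    # ensure_ascii=False keeps Chinese names readable in the output.
    print(json.dumps(demo, ensure_ascii=False, indent=2))
```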

I would like you to use an embedding model for this.

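First, extract the item names from risk_tables_export.json. Then create a database named fs_law_risk and, inside it, a table named law_sub to store the item vectors. Use the ID from the JSON file as the primary key, and save each item's name and vector in the table.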

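Next, create a second table named law_sub_per to store the mapping between subject items and permit items. It needs the subject item ID and the list of permit item IDs.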
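
One possible shape for the two tables is sketched below. Several things here are assumptions rather than requirements: the pgvector extension (for the `vector` column type), the 1024-dimensional embedding width, and the column names `embedding`, `subject_id`, and `permit_ids`. The `create_tables` helper is hypothetical and expects a DB-API connection (for example from psycopg2) that is already connected to the fs_law_risk database.

```python
from typing import Any

# Assumed embedding width; it must match the model actually used.
EMBEDDING_DIM = 1024

DDL_STATEMENTS = [
    # pgvector is an assumption; the requirement only says to store vectors.
    "CREATE EXTENSION IF NOT EXISTS vector",
    f"""
    CREATE TABLE IF NOT EXISTS law_sub (
        id        TEXT PRIMARY KEY,   -- ID taken from risk_tables_export.json
        name      TEXT NOT NULL,      -- item name
        embedding vector({EMBEDDING_DIM}) NOT NULL
    )
    """,
    """
    CREATE TABLE IF NOT EXISTS law_sub_per (
        subject_id TEXT PRIMARY KEY REFERENCES law_sub (id),
        permit_ids TEXT[] NOT NULL DEFAULT '{}'
    )
    """,
]


def create_tables(conn: Any) -> None:
    """Run the DDL on a DB-API connection already bound to fs_law_risk."""
    cur = conn.cursor()
    try:
        for statement in DDL_STATEMENTS:
            cur.execute(statement)
        conn.commit()
    finally:
        cur.close()
```

Storing the permit IDs as a `TEXT[]` column keeps one row per subject item, which matches the "list of permit item IDs" wording; a normalized one-row-per-pair table would work equally well.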

Set the embedding similarity threshold to 0.5, and return every item whose similarity is above the threshold.

If every result scores below 0.5 but above 0.4, return only the first (top-ranked) one.
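
The two-tier threshold rule can be sketched as a small pure function. The name `select_matches` is hypothetical, and two behaviours are assumptions because the requirement does not spell them out: a score of exactly 0.5 is treated as a fallback candidate rather than a full match, and an empty list is returned when nothing scores above 0.4.

```python
from typing import List, Tuple

HIGH_THRESHOLD = 0.5  # every item scoring above this is returned
LOW_THRESHOLD = 0.4   # fallback floor for the single best item

Scored = Tuple[str, float]  # (item_id, similarity)


def select_matches(scored: List[Scored]) -> List[Scored]:
    """Apply the two-tier rule to (item_id, similarity) pairs."""
    # Rank from most to least similar so "the first one" is the best match.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)

    above = [pair for pair in ranked if pair[1] > HIGH_THRESHOLD]
    if above:
        return above

    # Nothing cleared the main threshold: fall back to the single best
    # item, but only if it clears the lower threshold.
    if ranked and ranked[0][1] > LOW_THRESHOLD:
        return [ranked[0]]

    # Assumption: no result at all when even the best score is too low.
    return []
```

For example, scores of 0.7, 0.55, and 0.45 yield the first two items, while scores of 0.48 and 0.42 yield only the 0.48 item.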

Expose the endpoint /fs-ai-asistant/api/workflow/lawrisk

For CORS handling, reuse smart_cors_middleware.py. You may move this file to a suitable directory.

* PostgreSQL instance you can use:
  - IP: 8.138.196.105
  - port: 5432
  - user: postgres
  - password: difyai123456

* API and doc references (we should only need the synchronous API):
  - General text embedding synchronous API reference: https://help.aliyun.com/zh/model-studio/text-embedding-synchronous-api?spm=a2c4g.11186623.help-menu-2400256.d_2_7_0.693e48233phHX8
  - General text embedding batch API reference: https://help.aliyun.com/zh/model-studio/text-embedding-batch-api?spm=a2c4g.11186623.help-menu-2400256.d_2_7_1.59233560WBHuRz
  - API key: sk-288824ef003e4e02bb963b8b3024b06a
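
A call to the synchronous embedding API might look like the sketch below, using only the standard library. The endpoint URL (the OpenAI-compatible mode), the model name `text-embedding-v3`, and the OpenAI-style response layout are assumptions that should be checked against the synchronous API reference linked above. The function names and the `DASHSCOPE_API_KEY` environment variable are hypothetical; reading the key from the environment simply avoids hard-coding it.

```python
import json
import os
import urllib.request
from typing import List

# Assumed endpoint and model name; confirm both against the linked API doc.
EMBEDDING_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings"
EMBEDDING_MODEL = "text-embedding-v3"


def build_request(texts: List[str], api_key: str) -> urllib.request.Request:
    """Build the POST request for one batch of texts."""
    payload = {"model": EMBEDDING_MODEL, "input": texts, "encoding_format": "float"}
    return urllib.request.Request(
        EMBEDDING_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(body: dict) -> List[List[float]]:
    """Return vectors in input order, assuming an OpenAI-style response."""
    rows = sorted(body["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


def embed(texts: List[str]) -> List[List[float]]:
    """Embed texts with one synchronous HTTP call."""
    api_key = os.environ["DASHSCOPE_API_KEY"]  # hypothetical variable name
    with urllib.request.urlopen(build_request(texts, api_key), timeout=30) as resp:
        return parse_embeddings(json.load(resp))
```

The same `embed` function could serve both the one-off indexing of item names and the per-query embedding of user input, so that stored and query vectors come from the same model.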