feat: restore AGENTS.md and CLAUDE.md to root directory

- AGENTS.md: Development guidelines and coding standards
- CLAUDE.md: Detailed Claude Code development guide
- Both files are important project documentation that should be in root
- Files also exist in docs/guides/ for documentation organization
parent b236021993
commit e03bfec12c
@ -0,0 +1,16 @@
# Repository Guidelines
## You must reply to me in Chinese
## Project Structure & Module Organization
`app.py` boots the Flask service and loads blueprints from `lawrisk/`, whose subpackages mirror the request flow: `api/` (v1/v2 routes), `services/` (retrieval logic, PostgreSQL access, DashScope orchestration), `middleware/`, and `utils/` for ingestion/export helpers. Static assets for manual QA live in `static/`, while `data/` stores exported JSON/SQL snapshots used by the ingestion scripts. Formal specs, API references, and prior agent notes are in `docs/`. Python tests (pytest) belong under `tests/`; favor mirroring the module paths (e.g., `tests/api/test_v2.py`).

## Build, Test, and Development Commands
Install dependencies with `pip install -r requirements.txt` from a Python 3.11+ shell. Launch the server via `python app.py`, which binds to `http://localhost:8000` and exposes `/healthz`. Run the lightweight regression suite with `pytest`. Format and lint before submitting using `black .` and `ruff check .`; both read `pyproject.toml` defaults so no extra flags are needed. For manual API smoke tests, open `static/v2_tester.html` in a browser or send `curl` requests as shown in `README.md`.

## Coding Style & Naming Conventions
Use Black’s default 88-column formatting and Ruff-compatible imports; never hand-wrap lines differently. Modules, packages, and files are lowercase with underscores (`lawrisk_service.py`). Public functions favor descriptive verbs (`fetch_risk_snapshot`). Prefer dataclass-like dicts for payloads, and keep API responses snake_case to match existing endpoints. Add concise docstrings for service-layer functions that hit external systems. Configuration should be read through `lawrisk.utils.env_loader` rather than `os.environ` directly.

## Testing Guidelines
Pytest is the standard; each new feature should include a focused unit test plus, when feasible, a service-level test hitting the Flask test client. Name tests after the behavior under check (`test_query_returns_ranked_risks`). Use fixtures for sample licenses/risk tables to keep tests deterministic. Aim to cover error paths (missing env vars, DashScope failures) because they are the common regressions. Run `pytest -q` locally before opening a PR, and attach logs when failures are environment-specific.

## Commit & Pull Request Guidelines
Follow the existing Conventional Commit style (`feat: …`, `fix: …`, `chore: …`). Limit commits to one logical change and reference the affected domain in the subject when possible (e.g., `feat: enhance v2 risk scoring`). PRs should describe intent, summarize testing (`pytest`, manual curl, etc.), and link to any tracking issue or checklist. Include screenshots or JSON snippets when adjusting responses so reviewers can verify payload diffs quickly.
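
The subject-line convention above can be checked mechanically. The following is a minimal sketch, not part of the repository: the function name and the regular expression are hypothetical, and the type list is limited to the three prefixes named above (`feat`, `fix`, `chore`); extend it if the project uses more.

```python
import re

# Hypothetical pattern: <type>[(optional-scope)]: <non-empty description>.
# Only the three types mentioned in the guideline are accepted here.
CONVENTIONAL_SUBJECT = re.compile(r"^(feat|fix|chore)(\([a-z0-9_-]+\))?: \S.*$")


def is_conventional_subject(subject):
    """Return True if a commit subject follows the `type: description` shape."""
    return bool(CONVENTIONAL_SUBJECT.match(subject))


if __name__ == "__main__":
    for line in ("feat: enhance v2 risk scoring", "Enhance scoring"):
        print(line, "->", is_conventional_subject(line))
```

A check like this could run in a local commit hook, but that wiring is left out because the repository does not describe one.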
@ -0,0 +1,681 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

---

# LawRisk Backend - Claude Code Analysis

## Project Overview

**LawRisk** is a Flask-based Python backend service that provides intelligent legal compliance risk retrieval for business licensing and permit requirements. It uses vector embeddings and LLM-based matching to help users find relevant permits, licenses, and associated legal risks based on natural language queries (in Chinese).

**Python Version Requirement**: Python 3.10+ (uses PEP 604 union types like `str | None`)

### Key Features
- **Semantic Search**: Uses Aliyun DashScope embeddings (text-embedding-v4) to find similar legal topics
- **LLM-Powered Matching**: Qwen (qwen-plus-latest) for intelligent subject selection
- **Two Database Architecture**:
  - `fs_law_risk`: Vector embeddings and subject-permit mappings
  - `licensing_risks`: Structured permit and risk data with regions, themes, and compliance information
- **RESTful APIs**: Clean REST endpoints for V1 (legacy) and V2 (enhanced) search
- **CORS Enabled**: Built-in CORS middleware for frontend integration

---

## Architecture & Project Structure

### Core Framework & Libraries
- **Framework**: Flask (Python web framework)
- **Database Driver**: pg8000 (PostgreSQL adapter)
- **Vector Embeddings**: Aliyun DashScope OpenAI-compatible API
- **LLM**: Qwen via DashScope (qwen-plus-latest)
- **Dependencies**: Minimal footprint: Flask and pg8000, plus `concurrent.futures` from the standard library

### Directory Structure
```
市监局-lawRisk-backend/
├── app.py                          # Flask application entry point
├── requirements.txt                # Python dependencies
├── .env                            # Environment configuration
├── lawrisk/                        # Main application package
│   ├── __init__.py
│   ├── api/                        # API route handlers
│   │   ├── v1.py                   # V1 API (legacy)
│   │   └── v2.py                   # V2 API (current)
│   ├── services/                   # Business logic layer
│   │   ├── lawrisk_service.py      # Core search & embeddings
│   │   ├── lawrisk_v2_service.py   # V2 enhanced service
│   │   └── licensing_repo.py       # Data repository
│   ├── middleware/                 # HTTP middleware
│   │   └── smart_cors_middleware.py
│   └── utils/                      # Utility functions
│       ├── env_loader.py
│       ├── export_risk_json.py
│       └── ingest_lawrisk.py
├── static/                         # Static assets
│   └── v2_tester.html              # Web-based API tester
├── tests/                          # Test suite (planned)
├── data/                           # Data files
│   ├── risk_tables_export.json
│   └── licensing_risks_dump.sql
└── docs/                           # Documentation
    ├── PRD.md
    ├── API.md
    ├── V2_API文档.md
    ├── AGENTS.md
    ├── DB_GUIDE.md
    └── CLAUDE.md
```

---

## Quick Reference

### Most Common Commands
```bash
# Run the application
python app.py

# Export data from database
python lawrisk/utils/export_risk_json.py

# Ingest data with embeddings (requires DASHSCOPE_API_KEY)
python lawrisk/utils/ingest_lawrisk.py

# Format and lint code
black .
ruff check .

# Test locally via browser
# Open static/v2_tester.html after starting the app
```

### Key Files
- `app.py` - Flask application entry point
- `lawrisk/` - Main application package
  - `api/v1.py` - V1 API routes (legacy)
  - `api/v2.py` - V2 API routes (current)
  - `services/lawrisk_service.py` - Core search & embeddings
  - `services/lawrisk_v2_service.py` - V2 enhanced service
  - `services/licensing_repo.py` - Data repository
  - `middleware/smart_cors_middleware.py` - CORS middleware
  - `utils/` - Utility functions
- `static/v2_tester.html` - Web-based API testing interface
- `requirements.txt` - Python dependencies
- `.env` - Environment configuration

---

## Development Workflow

### Initial Setup
```bash
# 1. Create virtual environment (Windows PowerShell)
python -m venv .venv
.venv\Scripts\activate

# 2. Install dependencies
pip install Flask pg8000 black ruff pytest

# 3. Configure environment
# Edit .env with your database credentials and DashScope API key
```

### Virtual Environment Activation
```powershell
# Windows PowerShell
.venv\Scripts\activate

# Windows CMD
.venv\Scripts\activate.bat

# Git Bash (Windows)
source .venv/Scripts/activate
```

### Common Commands

#### Run the Application
```bash
# Development mode
python app.py

# Custom port
PORT=8000 python app.py

# With debug logging
FLASK_DEBUG=1 python app.py
```

#### Data Management
```bash
# Export data from fs_law_risk database to JSON
python lawrisk/utils/export_risk_json.py

# Ingest data with embeddings into database
python lawrisk/utils/ingest_lawrisk.py
# Requires DASHSCOPE_API_KEY in .env
```

#### Code Quality
```bash
# Format code with Black (100 char line length)
black .

# Lint with Ruff
ruff check .

# Format and lint specific file
black lawrisk/services/lawrisk_v2_service.py
ruff check lawrisk/services/lawrisk_v2_service.py

# Run tests (when added)
pytest -q

# Run tests with coverage
pytest --cov=lawrisk
```

### Data Management Commands
```bash
# Export data from fs_law_risk database to JSON
# Output: data/risk_tables_export.json
python lawrisk/utils/export_risk_json.py

# Ingest data with embeddings into database
# Requires DASHSCOPE_API_KEY in .env
python lawrisk/utils/ingest_lawrisk.py

# Verify exported data
ls -lh data/
head -50 data/risk_tables_export.json
```

#### Database Operations
```bash
# Connect to PostgreSQL
psql -h 8.138.196.105 -U postgres -d fs_law_risk

# Connect to licensing_risks database
psql -h 8.138.196.105 -U postgres -d licensing_risks
```

---

## API Endpoints

### V1 API (Legacy)
- **Path**: `/fs-ai-asistant/api/workflow/lawrisk`
- **Methods**: GET, POST
- **Mode**: `llm` (default) or `embed`
- **Input**: `query` (user question)
- **Output**: Simple array of matching subjects with permit IDs

### V2 API (Current/Recommended)
- **Base Path**: `/fs-ai-asistant/api/workflow/lawrisk/v2`
- **Methods**: GET, POST
- **Features**:
  - Structured results with regions, themes, permits, and risks
  - Optional region filtering
  - Debug mode with detailed execution info
  - Direct permit matching by name

#### V2 Sub-endpoints

1. **Search Endpoint**
   - Path: `/fs-ai-asistant/api/workflow/lawrisk/v2`
   - Parameters:
     - `query` (required): User question
     - `region` (optional): Filter by region (市级, 禅城区, etc.)
     - `debug` (optional): Enable debug output (1/true/yes/on)
     - `top` (optional): Number of recommendations (default: 5)

2. **Regions List**
   - Path: `/fs-ai-asistant/api/workflow/lawrisk/v2/regions`
   - Method: GET
   - Returns: All available regions for filtering

3. **Get Permits**
   - Path: `/fs-ai-asistant/api/workflow/lawrisk/getPermits`
   - Method: GET, POST
   - Input: `region` (region ID or name)
   - Returns: All permits for a specific region
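
Because the search parameters usually carry Chinese text, it helps to let a library percent-encode the query string instead of assembling it by hand. The sketch below builds a GET URL for the search endpoint from the parameters listed above. The helper name `build_v2_search_url` is hypothetical, and the base URL is an assumption based on the local port mentioned elsewhere in this guide.

```python
from urllib.parse import urlencode

# Path copied from the endpoint list; everything else here is illustrative.
V2_SEARCH_PATH = "/fs-ai-asistant/api/workflow/lawrisk/v2"


def build_v2_search_url(base_url, query, region=None, debug=False, top=None):
    """Return a GET URL for the V2 search endpoint with encoded parameters."""
    params = {"query": query}
    if region:
        params["region"] = region
    if debug:
        params["debug"] = "1"  # the docs list 1/true/yes/on as accepted values
    if top is not None:
        params["top"] = str(top)
    return base_url.rstrip("/") + V2_SEARCH_PATH + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumes the app is running locally on port 8000.
    print(build_v2_search_url("http://localhost:8000", "电影院", region="市级", debug=True))
```

The resulting URL can be passed to `curl` or opened in a browser; the response shape is not shown here because it depends on the running service.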

### Health Check
- Path: `/healthz`
- Method: GET
- Returns: `{"status": "ok"}`

---

## Database Schema

### Database 1: fs_law_risk
Used for vector embeddings and semantic search.

#### Tables
- **`law_sub`**: Subject matter with embeddings
  - `id` (TEXT, PK): Subject ID
  - `name` (TEXT): Subject name
  - `vector` (JSONB): Embedding vector

- **`law_sub_per`**: Subject-permit mappings
  - `sub_id` (TEXT, PK): Subject ID
  - `per_ids` (JSONB): Array of permit IDs

- **`law_per`**: Permit information
  - `id` (TEXT, PK): Permit ID
  - `name` (TEXT): Permit name
  - `risk_ids` (JSONB): Array of risk IDs

### Database 2: licensing_risks
Used for structured compliance data.

#### Tables
- **`regions`**: Administrative areas
  - `id` (PK), `name` (unique)

- **`business_scopes`**: Business scope definitions
  - `id` (PK), `description`

- **`region_scopes`**: Region-scope mappings

- **`themes`**: Legal themes/subjects
  - `id` (PK), `name`

- **`region_themes`**: Region-theme mappings

- **`permits`**: License/permit items
  - `id` (PK), `name`

- **`region_theme_permits`**: Tripartite linkage

- **`risks`**: Risk information
  - `id` (PK), `risk_content`, `legal_basis`, `document_no`, `summary`

- **`region_permit_risks`**: Risk associations

---

## Configuration

### Environment Variables (.env)

#### DashScope (Embeddings & LLM)
```
DASHSCOPE_API_KEY=<your-dashscope-api-key>
DASHSCOPE_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
DASHSCOPE_EMBED_MODEL=text-embedding-v4
DASHSCOPE_EMBED_DIM=1024
DASHSCOPE_MAX_BATCH=10
DASHSCOPE_CHAT_MODEL=qwen-plus-latest
```

#### PostgreSQL Configuration
```
# fs_law_risk database
PG_HOST=8.138.196.105
PG_PORT=5432
PG_USER=postgres
PG_PASSWORD=<your-db-password>
PG_DATABASE=fs_law_risk
PG_ADMIN_DB=postgres

# licensing_risks database
LIC_PG_HOST=8.138.196.105
LIC_PG_PORT=5432
LIC_PG_USER=postgres
LIC_PG_PASSWORD=<your-db-password>
LIC_PG_DATABASE=licensing_risks
```

#### Application Settings
```
FLASK_ENV=development

# Search thresholds (tunable)
LAWRISK_RETURN_IF_GE=0.7
LAWRISK_FALLBACK_GT=0.5
```
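
To illustrate how a `.env` file in the format above might be consumed, here is a minimal standard-library sketch. It is not the project's `lawrisk/utils/env_loader.py`, whose behavior is not shown in this guide; the function names and the rule that existing environment variables take precedence are assumptions made for the example.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env", override=False):
    """Copy values from a .env file into os.environ (existing values win by default)."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values


def get_float(name, default):
    """Read a float setting such as LAWRISK_RETURN_IF_GE, with a fallback."""
    raw = os.environ.get(name)
    return float(raw) if raw not in (None, "") else default
```

With this sketch, `get_float("LAWRISK_RETURN_IF_GE", 0.7)` would return the configured threshold after `load_env_file()` has run, or `0.7` if the variable is absent.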

---

## Testing

### Current Testing Status
- **No dedicated test suite** in the repository
- pytest framework is recommended but not configured
- Manual testing available via `static/v2_tester.html`

### Testing Framework & Guidelines
When adding tests, use:
- **Framework**: pytest with Flask test client
- **Dependencies**: pytest-cov for coverage reporting
- **Focus Areas**:
  - API endpoints (V1 and V2)
  - Middleware behavior (CORS)
  - Database operations
  - LLM selection logic
  - Region filtering
  - Checkpoint operations (create, list, restore, delete)

**Recommended test structure:**
```
tests/
├── conftest.py              # Shared fixtures and test configuration
├── test_api_v1.py           # V1 API endpoint tests
├── test_api_v2.py           # V2 API endpoint tests
├── test_admin_endpoints.py  # Admin endpoint tests
├── test_search_service.py   # Core search logic
├── test_licensing_repo.py   # Database repository
└── test_cors_middleware.py  # CORS middleware behavior

# Test naming: test_*.py (pytest discovery pattern)
# Test functions: test_<feature>_<scenario>()
```

**Testing CORS Middleware** (from AGENTS.md):
- Origin matching: wildcard, exact, subdomains
- Preflight OPTIONS handling
- X-CORS-Decision header behavior
- NGINX_CORS_MODE functionality
- Environment variable configurations (ALLOWED_ORIGINS, CORS_STRICT, etc.)

**Example test case:**
```python
from app import app  # assumes app.py exposes the Flask instance as `app`


def test_v2_search_with_debug():
    client = app.test_client()
    response = client.post(
        '/fs-ai-asistant/api/workflow/lawrisk/v2',
        data={'query': '电影院', 'debug': '1'},
    )
    assert response.status_code == 200
    data = response.get_json()['data']
    assert 'debug' in data
    assert 'executionTime' in data
```

### Testing Commands
```bash
# Run all tests
pytest

# Run with coverage (requires pytest-cov)
pytest --cov=lawrisk

# Run specific test file
pytest tests/test_api_v2.py -v
```

### Manual Testing with V2 Tester
1. Start the application: `python app.py` (defaults to port 8000)
2. Open `static/v2_tester.html` in your browser
3. Test queries like:
   - "我要办一家电影院"
   - "开办旅馆需要哪些许可"
   - "公共场所卫生许可"
   - Query with region filter: "电影院&region=市级&debug=1"

The tester provides a simple UI to experiment with the V2 API and view debug information.

---

## Coding Standards & Best Practices

### Python Style Guidelines
- **Indentation**: 4 spaces (no tabs)
- **Encoding**: UTF-8 for all source files
- **Naming Conventions**:
  - Functions/variables: `snake_case`
  - Constants: `SCREAMING_SNAKE_CASE`
  - Classes: `PascalCase`
- **Type Hints**: Prefer type hints for all public functions
- **Code Formatting**: Use `black` with 100-character line length
- **Linting**: Use `ruff` with default rules

### Code Quality Guidelines
- Keep functions small and side-effect free
- Prefer pure functions where possible
- Document complex logic with comments
- Use type hints from `typing` module
- Handle errors gracefully with appropriate logging

### Documentation Files
- **PRD.md** (docs/PRD.md) - Product Requirements Document
  - Business logic and requirements specification
  - Feature specifications and user stories

- **API.md** (docs/API.md) - V1 API documentation
  - Legacy API endpoints and usage
  - Request/response formats

- **V2_API文档.md** (docs/V2_API文档.md) - Detailed V2 API documentation
  - Enhanced API specification
  - Admin endpoints and checkpoint operations
  - Request/response examples

- **AGENTS.md** (docs/AGENTS.md) - Development guidelines
  - Repository structure and module organization
  - Coding style and naming conventions
  - Commit and PR guidelines
  - Security and configuration tips

- **DB_GUIDE.md** (docs/DB_GUIDE.md) - Database reference
  - Schema reference for both databases
  - Query examples and optimization tips

- **README.md** (README.md) - Project overview and quick start

---

## Key Components Deep Dive

### 1. app.py - Application Entry Point (`app.py`)
- Creates Flask app with CORS enabled
- Registers all API routes (v1_bp, v2_bp)
- Database initialization and schema checks on startup
- Health check endpoint at `/healthz`
- Logs all registered routes on startup
- Error handling doesn't block app startup (errors surface on first request)

### 2. lawrisk_service.py - Core Search Logic (`lawrisk/services/lawrisk_service.py`)
- **EmbeddingClient**: DashScope API integration for vector embeddings
- **ChatClient**: Qwen LLM interaction for intelligent subject selection
- Database helpers using pg8000 for PostgreSQL
- Search algorithms:
  - Embedding-based cosine similarity search
  - LLM-based subject matching using `qwen-plus-latest`
- Similarity threshold management (tunable via LAWRISK_RETURN_IF_GE, LAWRISK_FALLBACK_GT)
- Concurrent execution support with ThreadPoolExecutor
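
The embedding-based search and the two thresholds can be sketched in plain Python. This is an illustration only, not the code in `lawrisk_service.py`: the function names are hypothetical, and the bucketing rule (scores at or above `LAWRISK_RETURN_IF_GE` are returned directly, scores above `LAWRISK_FALLBACK_GT` but below it are kept as fallback candidates) is an assumption inferred from the variable names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_subjects(query_vec, subjects, return_if_ge=0.7, fallback_gt=0.5):
    """Score (subject_id, vector) pairs and bucket them by the two thresholds."""
    scored = sorted(
        ((sid, cosine_similarity(query_vec, vec)) for sid, vec in subjects),
        key=lambda pair: pair[1],
        reverse=True,
    )
    confident = [(sid, s) for sid, s in scored if s >= return_if_ge]
    fallback = [(sid, s) for sid, s in scored if fallback_gt < s < return_if_ge]
    return confident, fallback
```

In this sketch the subject vectors would come from the `vector` column of `law_sub` after JSON decoding, and the query vector from the embedding API; both steps are omitted here.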

### 3. lawrisk_v2_service.py - Enhanced API (`lawrisk/services/lawrisk_v2_service.py`)
- Structured response formatting with regions, themes, permits, and risks
- Region filtering logic with normalization
- Direct permit matching by name
- Markdown formatting for legal text
- Complex query execution pipeline
- Helper functions:
  - `_compose_prompt()`: Builds natural-language prompts from structured data
  - `_normalize_region_filter()`: Normalizes region filters for matching

### 4. licensing_repo.py - Data Repository (`lawrisk/services/licensing_repo.py`)
- Separate database connection for `licensing_risks` database
- Multi-table join query optimization
- Legal text formatting helpers
- Chinese legal document pattern matching
- Checkpoint management:
  - `create_checkpoint()`: Database backup functionality
  - `list_checkpoints()`: List available backups
  - `restore_checkpoint()`: Restore from checkpoint (DANGEROUS!)
  - `delete_checkpoint()`: Remove old checkpoints

### 5. smart_cors_middleware.py - Reusable CORS (`lawrisk/middleware/smart_cors_middleware.py`)
- Wildcard and exact origin matching
- Subdomain support with flexible patterns
- Preflight OPTIONS handling
- NGINX integration mode (NGINX_CORS_MODE)
- Debug and logging features (CORS_DEBUG)
- Environment variable support:
  - ALLOWED_ORIGINS, CORS_STRICT, CORS_DEBUG
  - NGINX_CORS_MODE, CORS_MAX_AGE, CORS_EXPOSE_HEADERS
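
Origin matching of the kind listed above (exact, wildcard, subdomain patterns) can be approximated with the standard library. The sketch below is not the middleware's implementation; the function names and the use of shell-style patterns such as `https://*.example.org` are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def parse_allowed_origins(raw):
    """Split a comma-separated ALLOWED_ORIGINS value into trimmed, non-empty patterns."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def is_origin_allowed(origin, patterns):
    """True if origin equals a pattern or matches a shell-style wildcard pattern."""
    if not origin:
        return False
    for pattern in patterns:
        if pattern == "*" or pattern == origin:
            return True
        # fnmatchcase is case-sensitive; "*" matches any run of characters.
        if "*" in pattern and fnmatchcase(origin, pattern):
            return True
    return False
```

Note that `*` in a shell-style pattern also matches dots, so `https://*.example.org` accepts nested subdomains such as `https://a.b.example.org`. A stricter policy would need a tighter pattern language.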

### 6. v2.py API Routes (`lawrisk/api/v2.py`)
- **Public endpoints**: `/v2`, `/v2/regions`, `/getPermits`
- **Admin endpoints**: `/admin/test`, `/admin/regions`, `/admin/themes`, `/admin/permits`, `/admin/checkpoints`
- Parameter extraction supporting GET, POST, JSON, and form data
- Concurrent execution using ThreadPoolExecutor (max_workers=2)
- Structured error responses with consistent format
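
Parameter extraction across query string, form data, and JSON bodies can be expressed as a small framework-independent helper. This is a sketch under assumptions, not the code in `v2.py`: the function names are hypothetical, and the lookup order (JSON body, then form data, then query string) is a choice made for the example.

```python
TRUTHY = {"1", "true", "yes", "on"}


def is_truthy(value):
    """Interpret flag values such as debug=1/true/yes/on, case-insensitively."""
    if value is None:
        return False
    return str(value).strip().lower() in TRUTHY


def pick_param(name, query_args=None, form=None, json_body=None, default=None):
    """Return the first non-empty value for name from JSON, form, then query args."""
    for source in (json_body, form, query_args):
        if isinstance(source, dict):
            value = source.get(name)
            if value not in (None, ""):
                return value
    return default
```

Inside a Flask view this could be called as `pick_param("query", request.args, request.form, request.get_json(silent=True))`, since Flask's argument and form containers behave like dictionaries.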

### 7. Utility Scripts
- **env_loader.py**: Environment variable loading from .env file
- **export_risk_json.py**: PostgreSQL data export utility (outputs to data/risk_tables_export.json)
- **ingest_lawrisk.py**: Data ingestion with embeddings (requires DASHSCOPE_API_KEY)

---

## Security & Best Practices

### Security Guidelines
- **NEVER hardcode secrets** in source code
- All credentials must be in `.env` file or environment variables
- API keys (DASHSCOPE_API_KEY) and database passwords must be externalized
- Admin endpoints (`/admin/*`) should be protected in production
- Database checkpoint restore is a **DANGEROUS OPERATION** and should be restricted

### Configuration Best Practices
- Use `.env` file for all configuration (database, API keys, thresholds)
- Environment variables supported by CORS middleware:
  - ALLOWED_ORIGINS: Comma-separated list of allowed origins
  - CORS_STRICT: Enable strict origin checking
  - CORS_DEBUG: Enable debug logging
  - NGINX_CORS_MODE: Enable NGINX integration
  - CORS_MAX_AGE: Preflight cache duration
  - CORS_EXPOSE_HEADERS: Headers to expose to browsers
- Regularly backup databases using checkpoint system
- Monitor DashScope API quota and rate limits

## Troubleshooting Guide

### Common Issues

#### Database Connection Errors
**Symptom**: `pg8000.dbapi.Error` when starting the app
**Solutions**:
- Check `.env` file exists with correct PostgreSQL credentials
- Verify network connectivity to the database server
- Ensure PostgreSQL server is running and accessible
- Check database names: `fs_law_risk` and `licensing_risks`
- Verify PG_HOST, PG_PORT, PG_USER, PG_PASSWORD are correct

#### Missing Environment Variables
**Symptom**: Key errors, default values being used, or API failures
**Solutions**:
- Create `.env` file from the template (see Configuration section)
- Ensure `DASHSCOPE_API_KEY` is set for embedding/chat features
- Verify all required PG_* and LIC_* environment variables
- Check that pg8000 is installed: `pip install "pg8000>=1.30.0"` (quote the requirement so the shell does not treat `>` as a redirect)

#### LLM/Embedding API Errors
**Symptom**: API authentication failures, timeout errors, or embedding errors
**Solutions**:
- Verify `DASHSCOPE_API_KEY` is valid and has sufficient quota
- Check `DASHSCOPE_BASE_URL` matches: `https://dashscope.aliyuncs.com/compatible-mode/v1`
- Ensure network access to DashScope API servers
- Review API rate limits and batch sizes (DASHSCOPE_MAX_BATCH)
- Test API key: `curl -H "Authorization: Bearer $DASHSCOPE_API_KEY" "$DASHSCOPE_BASE_URL/models"`
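
When reviewing batch sizes, it can help to see how a list of texts would be split to respect a limit such as `DASHSCOPE_MAX_BATCH=10`. The helper below is a generic sketch; the name `chunked` is hypothetical and the actual batching code in the ingestion script is not shown in this guide.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each holding at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    texts = [f"subject-{i}" for i in range(23)]
    for batch in chunked(texts, 10):
        # Each batch would be sent as one embedding request.
        print(len(batch))
```

Keeping each request at or below the configured batch size avoids request-size rejections, at the cost of more round trips.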

#### Empty Search Results
**Symptom**: API returns empty `risk_subject` array
**Solutions**:
- Check if database tables are populated:
  ```sql
  SELECT COUNT(*) FROM fs_law_risk.law_sub;
  SELECT COUNT(*) FROM fs_law_risk.law_sub_per;
  ```
- Try `debug=1` parameter to see detailed execution info
- Verify similarity thresholds in `.env`:
  - `LAWRISK_RETURN_IF_GE=0.7` (return if similarity >= 0.7)
  - `LAWRISK_FALLBACK_GT=0.5` (fallback if similarity > 0.5)
- Test with known queries like "我要办一家电影院" or "开办旅馆"
- Check if embeddings exist: `SELECT id FROM fs_law_risk.law_sub LIMIT 5;`

#### Port Already in Use
**Symptom**: `OSError: [WinError 10048] Only one usage of each socket address`
**Solutions**:
- Change port: `PORT=8001 python app.py` (Git Bash) or `$env:PORT = "8001"; python app.py` (PowerShell)
- Kill existing process using the port:
  ```powershell
  netstat -ano | findstr :8000
  taskkill /PID <PID> /F
  ```

#### Data Export/Import Issues
**Symptom**: export_risk_json.py or ingest_lawrisk.py fails
**Solutions**:
- Verify PostgreSQL credentials in `.env` match database access
- Ensure export script isn't writing outside the repository
- For ingestion, confirm DASHSCOPE_API_KEY is valid
- Check data files exist: `data/risk_tables_export.json`
- Run export first, then ingestion if needed

### Health Check Commands
```bash
# Basic health check
curl http://localhost:8000/healthz

# Check registered routes (see app startup logs)
python app.py 2>&1 | grep "Registered routes"

# Test V2 API with debug
curl -X POST "http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/v2" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "query=我要办一家电影院&debug=1"

# Test regions endpoint
curl http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/v2/regions

# Test admin endpoints
curl http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/admin/test
```

### Database Verification
Verify database content with queries from `DB_GUIDE.md`:
```sql
-- Check subject count
SELECT COUNT(*) FROM fs_law_risk.law_sub;

-- Check region-theme pairs
SELECT COUNT(*) FROM licensing_risks.region_themes;

-- Check if embeddings exist
SELECT id, name FROM fs_law_risk.law_sub LIMIT 5;

-- List available regions
SELECT * FROM licensing_risks.regions ORDER BY name;
```

### Debug Mode
Enable debug logging to troubleshoot issues:
```bash
# Enable Flask debug mode
FLASK_DEBUG=1 python app.py

# Enable CORS debug mode
CORS_DEBUG=1 python app.py

# Check app logs for registered routes and errors
# Logs are printed to console when starting the app
```

### Health Checks
- **Basic health**: `GET /healthz` → `{"status": "ok"}`
- **V2 regions**: `GET /fs-ai-asistant/api/workflow/lawrisk/v2/regions`
- Check logs for registered routes on app startup