feat: restore AGENTS.md and CLAUDE.md to root directory

- AGENTS.md: Development guidelines and coding standards
- CLAUDE.md: Detailed Claude Code development guide
- Both files are important project documentation that should be in root
- Files also exist in docs/guides/ for documentation organization
2025-11-18 17:24:51 +08:00 · 2025-11-18 17:24:51 +08:00 · e03bfec12c
parent b236021993
commit e03bfec12c
2 changed files with 697 additions and 0 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,16 @@
+# Repository Guidelines
+## Always reply to me in Chinese
+## Project Structure & Module Organization
+`app.py` boots the Flask service and loads blueprints from `lawrisk/`, whose subpackages mirror the request flow: `api/` (v1/v2 routes), `services/` (retrieval logic, PostgreSQL access, DashScope orchestration), `middleware/`, and `utils/` for ingestion/export helpers. Static assets for manual QA live in `static/`, while `data/` stores exported JSON/SQL snapshots used by the ingestion scripts. Formal specs, API references, and prior agent notes are in `docs/`. Python tests (pytest) belong under `tests/`; favor mirroring the module paths (e.g., `tests/api/test_v2.py`).
+
+## Build, Test, and Development Commands
+Install dependencies with `pip install -r requirements.txt` from a Python 3.11+ shell. Launch the server via `python app.py`, which binds to `http://localhost:8000` and exposes `/healthz`. Run the lightweight regression suite with `pytest`. Format and lint before submitting using `black .` and `ruff check .`; both read `pyproject.toml` defaults so no extra flags are needed. For manual API smoke tests, open `static/v2_tester.html` in a browser or send `curl` requests as shown in `README.md`.
+
+## Coding Style & Naming Conventions
+Use Black’s default 88-column formatting and Ruff-compatible imports; never hand-wrap lines differently. Modules, packages, and files are lowercase with underscores (`lawrisk_service.py`). Public functions favor descriptive verbs (`fetch_risk_snapshot`). Prefer dataclass-like dicts for payloads, and keep API responses snake_case to match existing endpoints. Add concise docstrings for service-layer functions that hit external systems. Configuration should be read through `lawrisk.utils.env_loader` rather than `os.environ` directly.
+
+## Testing Guidelines
+Pytest is the standard; each new feature should include a focused unit test plus, when feasible, a service-level test hitting the Flask test client. Name tests after the behavior under check (`test_query_returns_ranked_risks`). Use fixtures for sample licenses/risk tables to keep tests deterministic. Aim to cover error paths (missing env vars, DashScope failures) because they are the common regressions. Run `pytest -q` locally before opening a PR, and attach logs when failures are environment-specific.
+
+## Commit & Pull Request Guidelines
+Follow the existing Conventional Commit style (`feat: …`, `fix: …`, `chore: …`). Limit commits to one logical change and reference the affected domain in the subject when possible (e.g., `feat: enhance v2 risk scoring`). PRs should describe intent, summarize testing (`pytest`, manual curl, etc.), and link to any tracking issue or checklist. Include screenshots or JSON snippets when adjusting responses so reviewers can verify payload diffs quickly.
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,681 @@
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+---
+
+# LawRisk Backend - Claude Code Analysis
+
+## Project Overview
+
+**LawRisk** is a Flask-based Python backend service that provides intelligent legal compliance risk retrieval for business licensing and permit requirements. It uses vector embeddings and LLM-based matching to help users find relevant permits, licenses, and associated legal risks based on natural language queries (in Chinese).
+
+**Python Version Requirement**: Python 3.10+ (uses PEP 604 union types like `str | None`)
+
+### Key Features
+- **Semantic Search**: Uses Aliyun DashScope embeddings (text-embedding-v4) to find similar legal topics
+- **LLM-Powered Matching**: Qwen (qwen-plus-latest) for intelligent subject selection
+- **Two-Database Architecture**:
+  - `fs_law_risk`: Vector embeddings and subject-permit mappings
+  - `licensing_risks`: Structured permit and risk data with regions, themes, and compliance information
+- **RESTful APIs**: Clean REST endpoints for V1 (legacy) and V2 (enhanced) search
+- **CORS Enabled**: Built-in CORS middleware for frontend integration
+
+---
+
+## Architecture & Project Structure
+
+### Core Framework & Libraries
+- **Framework**: Flask (Python web framework)
+- **Database Driver**: pg8000 (PostgreSQL adapter)
+- **Vector Embeddings**: Aliyun DashScope OpenAI-compatible API
+- **LLM**: Qwen via DashScope (qwen-plus-latest)
+- **Dependencies**: Minimal footprint: Flask and pg8000, plus the standard-library `concurrent.futures`
+
+### Directory Structure
+```
+市监局-lawRisk-backend/
+├── app.py                          # Flask application entry point
+├── requirements.txt                # Python dependencies
+├── .env                            # Environment configuration
+├── lawrisk/                        # Main application package
+│   ├── __init__.py
+│   ├── api/                        # API route handlers
+│   │   ├── v1.py                   # V1 API (legacy)
+│   │   └── v2.py                   # V2 API (current)
+│   ├── services/                   # Business logic layer
+│   │   ├── lawrisk_service.py      # Core search & embeddings
+│   │   ├── lawrisk_v2_service.py   # V2 enhanced service
+│   │   └── licensing_repo.py       # Data repository
+│   ├── middleware/                 # HTTP middleware
+│   │   └── smart_cors_middleware.py
+│   └── utils/                      # Utility functions
+│       ├── env_loader.py
+│       ├── export_risk_json.py
+│       └── ingest_lawrisk.py
+├── static/                         # Static assets
+│   └── v2_tester.html              # Web-based API tester
+├── tests/                          # Test suite (planned)
+├── data/                           # Data files
+│   ├── risk_tables_export.json
+│   └── licensing_risks_dump.sql
+└── docs/                           # Documentation
+    ├── PRD.md
+    ├── API.md
+    ├── V2_API文档.md
+    ├── AGENTS.md
+    ├── DB_GUIDE.md
+    └── CLAUDE.md
+```
+
+---
+
+## Quick Reference
+
+### Most Common Commands
+```bash
+# Run the application
+python app.py
+
+# Export data from database
+python lawrisk/utils/export_risk_json.py
+
+# Ingest data with embeddings (requires DASHSCOPE_API_KEY)
+python lawrisk/utils/ingest_lawrisk.py
+
+# Format and lint code
+black .
+ruff check .
+
+# Test locally via browser
+# Open static/v2_tester.html after starting the app
+```
+
+### Key Files
+- `app.py` - Flask application entry point
+- `lawrisk/` - Main application package
+  - `api/v1.py` - V1 API routes (legacy)
+  - `api/v2.py` - V2 API routes (current)
+  - `services/lawrisk_service.py` - Core search & embeddings
+  - `services/lawrisk_v2_service.py` - V2 enhanced service
+  - `services/licensing_repo.py` - Data repository
+  - `middleware/smart_cors_middleware.py` - CORS middleware
+  - `utils/` - Utility functions
+- `static/v2_tester.html` - Web-based API testing interface
+- `requirements.txt` - Python dependencies
+- `.env` - Environment configuration
+
+---
+
+## Development Workflow
+
+### Initial Setup
+```bash
+# 1. Create virtual environment (Windows PowerShell)
+python -m venv .venv
+.venv\Scripts\activate
+
+# 2. Install dependencies (runtime from requirements.txt, plus dev tools)
+pip install -r requirements.txt
+pip install black ruff pytest pytest-cov
+
+# 3. Configure environment
+# Edit .env with your database credentials and DashScope API key
+```
+
+### Virtual Environment Activation
+```powershell
+# Windows PowerShell
+.venv\Scripts\activate
+
+# Windows CMD
+.venv\Scripts\activate.bat
+
+# Git Bash (Windows)
+source .venv/Scripts/activate
+```
+
+### Common Commands
+
+#### Run the Application
+```bash
+# Development mode
+python app.py
+
+# Custom port
+PORT=8000 python app.py
+
+# With debug logging
+FLASK_DEBUG=1 python app.py
+```
+
+#### Data Management
+```bash
+# Export data from fs_law_risk database to JSON
+python lawrisk/utils/export_risk_json.py
+
+# Ingest data with embeddings into database
+# Requires DASHSCOPE_API_KEY in .env
+python lawrisk/utils/ingest_lawrisk.py
+```
+
+#### Code Quality
+```bash
+# Format code with Black (100 char line length)
+black .
+
+# Lint with Ruff
+ruff check .
+
+# Format and lint specific file
+black lawrisk/services/lawrisk_v2_service.py
+ruff check lawrisk/services/lawrisk_v2_service.py
+
+# Run tests (when added)
+pytest -q
+
+# Run tests with coverage
+pytest --cov=lawrisk
+```
+
+### Data Management Commands
+```bash
+# Export data from fs_law_risk database to JSON
+# Output: data/risk_tables_export.json
+python lawrisk/utils/export_risk_json.py
+
+# Ingest data with embeddings into database
+# Requires DASHSCOPE_API_KEY in .env
+python lawrisk/utils/ingest_lawrisk.py
+
+# Verify exported data
+ls -lh data/
+cat data/risk_tables_export.json | head -50
+```
+
+#### Database Operations
+```bash
+# Connect to PostgreSQL
+psql -h 8.138.196.105 -U postgres -d fs_law_risk
+
+# Connect to licensing_risks database
+psql -h 8.138.196.105 -U postgres -d licensing_risks
+```
+
+---
+
+## API Endpoints
+
+### V1 API (Legacy)
+- **Path**: `/fs-ai-asistant/api/workflow/lawrisk`
+- **Methods**: GET, POST
+- **Mode**: `llm` (default) or `embed`
+- **Input**: `query` (user question)
+- **Output**: Simple array of matching subjects with permit IDs
+
+### V2 API (Current/Recommended)
+- **Base Path**: `/fs-ai-asistant/api/workflow/lawrisk/v2`
+- **Methods**: GET, POST
+- **Features**:
+  - Structured results with regions, themes, permits, and risks
+  - Optional region filtering
+  - Debug mode with detailed execution info
+  - Direct permit matching by name
+
+#### V2 Sub-endpoints
+
+1. **Search Endpoint**
+   - Path: `/fs-ai-asistant/api/workflow/lawrisk/v2`
+   - Parameters:
+     - `query` (required): User question
+     - `region` (optional): Filter by region (市级, 禅城区, etc.)
+     - `debug` (optional): Enable debug output (1/true/yes/on)
+     - `top` (optional): Number of recommendations (default: 5)
+
+2. **Regions List**
+   - Path: `/fs-ai-asistant/api/workflow/lawrisk/v2/regions`
+   - Method: GET
+   - Returns: All available regions for filtering
+
+3. **Get Permits**
+   - Path: `/fs-ai-asistant/api/workflow/lawrisk/getPermits`
+   - Methods: GET, POST
+   - Input: `region` (region ID or name)
+   - Returns: All permits for a specific region
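The search parameters listed above can be exercised from a short script. The sketch below builds a V2 search URL and interprets the `debug` flag using the truthy spellings from the parameter list (1/true/yes/on). `build_search_url` and `is_debug_enabled` are hypothetical helper names, the host and port assume the local default, and case-insensitive matching is an assumption; the service's own parsing may differ.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed local base URL; adjust host/port for your deployment.
BASE = "http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/v2"
TRUTHY = {"1", "true", "yes", "on"}


def is_debug_enabled(value):
    """Return True for the truthy spellings documented for `debug` (assumed case-insensitive)."""
    return value is not None and str(value).strip().lower() in TRUTHY


def build_search_url(query, region=None, debug=False, top=5):
    """Build a GET URL for the V2 search endpoint (hypothetical helper)."""
    params = {"query": query, "top": top}
    if region:
        params["region"] = region
    if debug:
        params["debug"] = "1"
    # urlencode percent-encodes non-ASCII values as UTF-8.
    return f"{BASE}?{urlencode(params)}"


url = build_search_url("电影院", region="市级", debug=True)
print(url)
```

Percent-encoding the query keeps Chinese text intact regardless of the shell or HTTP client used to send it.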
+
+### Health Check
+- Path: `/healthz`
+- Method: GET
+- Returns: `{"status": "ok"}`
+
+---
+
+## Database Schema
+
+### Database 1: fs_law_risk
+Used for vector embeddings and semantic search.
+
+#### Tables
+- **`law_sub`**: Subject matter with embeddings
+  - `id` (TEXT, PK): Subject ID
+  - `name` (TEXT): Subject name
+  - `vector` (JSONB): Embedding vector
+
+- **`law_sub_per`**: Subject-permit mappings
+  - `sub_id` (TEXT, PK): Subject ID
+  - `per_ids` (JSONB): Array of permit IDs
+
+- **`law_per`**: Permit information
+  - `id` (TEXT, PK): Permit ID
+  - `name` (TEXT): Permit name
+  - `risk_ids` (JSONB): Array of risk IDs
+
+### Database 2: licensing_risks
+Used for structured compliance data.
+
+#### Tables
+- **`regions`**: Administrative areas
+  - `id` (PK), `name` (unique)
+
+- **`business_scopes`**: Business scope definitions
+  - `id` (PK), `description`
+
+- **`region_scopes`**: Region-scope mappings
+
+- **`themes`**: Legal themes/subjects
+  - `id` (PK), `name`
+
+- **`region_themes`**: Region-theme mappings
+
+- **`permits`**: License/permit items
+  - `id` (PK), `name`
+
+- **`region_theme_permits`**: Tripartite linkage
+
+- **`risks`**: Risk information
+  - `id` (PK), `risk_content`, `legal_basis`, `document_no`, `summary`
+
+- **`region_permit_risks`**: Risk associations
+
+---
+
+## Configuration
+
+### Environment Variables (.env)
+
+#### DashScope (Embeddings & LLM)
+```
+DASHSCOPE_API_KEY=<your-dashscope-api-key>
+DASHSCOPE_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
+DASHSCOPE_EMBED_MODEL=text-embedding-v4
+DASHSCOPE_EMBED_DIM=1024
+DASHSCOPE_MAX_BATCH=10
+DASHSCOPE_CHAT_MODEL=qwen-plus-latest
+```
+
+#### PostgreSQL Configuration
+```
+# fs_law_risk database
+PG_HOST=8.138.196.105
+PG_PORT=5432
+PG_USER=postgres
+PG_PASSWORD=<your-database-password>
+PG_DATABASE=fs_law_risk
+PG_ADMIN_DB=postgres
+
+# licensing_risks database
+LIC_PG_HOST=8.138.196.105
+LIC_PG_PORT=5432
+LIC_PG_USER=postgres
+LIC_PG_PASSWORD=<your-database-password>
+LIC_PG_DATABASE=licensing_risks
+```
+
+#### Application Settings
+```
+FLASK_ENV=development
+
+# Search thresholds (tunable)
+LAWRISK_RETURN_IF_GE=0.7
+LAWRISK_FALLBACK_GT=0.5
+```
+
+---
+
+## Testing
+
+### Current Testing Status
+- **No dedicated test suite** in the repository
+- pytest framework is recommended but not configured
+- Manual testing available via `static/v2_tester.html`
+
+### Testing Framework & Guidelines
+When adding tests, use:
+- **Framework**: pytest with Flask test client
+- **Dependencies**: pytest-cov for coverage reporting
+- **Focus Areas**:
+  - API endpoints (V1 and V2)
+  - Middleware behavior (CORS)
+  - Database operations
+  - LLM selection logic
+  - Region filtering
+  - Checkpoint operations (create, list, restore, delete)
+
+**Recommended test structure:**
+```
+tests/
+├── conftest.py                  # Shared fixtures and test configuration
+├── test_api_v1.py               # V1 API endpoint tests
+├── test_api_v2.py               # V2 API endpoint tests
+├── test_admin_endpoints.py      # Admin endpoint tests
+├── test_search_service.py       # Core search logic
+├── test_licensing_repo.py       # Database repository
+└── test_cors_middleware.py      # CORS middleware behavior
+
+# Test naming: test_*.py (pytest discovery pattern)
+# Test functions: test_<feature>_<scenario>()
+```
+
+**Testing CORS Middleware** (from AGENTS.md):
+- Origin matching: wildcard, exact, subdomains
+- Preflight OPTIONS handling
+- X-CORS-Decision header behavior
+- NGINX_CORS_MODE functionality
+- Environment variable configurations (ALLOWED_ORIGINS, CORS_STRICT, etc.)
+
+**Example test case:**
+```python
+from app import app  # assumes app.py exposes the Flask instance as `app`
+
+
+def test_v2_search_with_debug():
+    client = app.test_client()
+    response = client.post('/fs-ai-asistant/api/workflow/lawrisk/v2',
+                          data={'query': '电影院', 'debug': '1'})
+    assert response.status_code == 200
+    data = response.get_json()['data']
+    assert 'debug' in data
+    assert 'executionTime' in data
+```
+
+### Testing Commands
+```bash
+# Run all tests
+pytest
+
+# Run with coverage
+pytest --cov=lawrisk
+
+# Run specific test file
+pytest tests/test_api_v2.py -v
+```
+
+### Manual Testing with V2 Tester
+1. Start the application: `python app.py` (defaults to port 8000)
+2. Open `static/v2_tester.html` in your browser
+3. Test queries like:
+   - "我要办一家电影院"
+   - "开办旅馆需要哪些许可"
+   - "公共场所卫生许可"
+   - Query with region filter: "电影院&region=市级&debug=1"
+
+The tester provides a simple UI to experiment with the V2 API and view debug information.
+
+---
+
+## Coding Standards & Best Practices
+
+### Python Style Guidelines
+- **Indentation**: 4 spaces (no tabs)
+- **Encoding**: UTF-8 for all source files
+- **Naming Conventions**:
+  - Functions/variables: `snake_case`
+  - Constants: `SCREAMING_SNAKE_CASE`
+  - Classes: `PascalCase`
+- **Type Hints**: Prefer type hints for all public functions
+- **Code Formatting**: Use `black` with 100-character line length
+- **Linting**: Use `ruff` with default rules
+
+### Code Quality Guidelines
+- Keep functions small and side-effect free
+- Prefer pure functions where possible
+- Document complex logic with comments
+- Use type hints from `typing` module
+- Handle errors gracefully with appropriate logging
+
+### Documentation Files
+- **PRD.md** (docs/PRD.md) - Product Requirements Document
+  - Business logic and requirements specification
+  - Feature specifications and user stories
+
+- **API.md** (docs/API.md) - V1 API documentation
+  - Legacy API endpoints and usage
+  - Request/response formats
+
+- **V2_API文档.md** (docs/V2_API文档.md) - Detailed V2 API documentation
+  - Enhanced API specification
+  - Admin endpoints and checkpoint operations
+  - Request/response examples
+
+- **AGENTS.md** (docs/AGENTS.md) - Development guidelines
+  - Repository structure and module organization
+  - Coding style and naming conventions
+  - Commit and PR guidelines
+  - Security and configuration tips
+
+- **DB_GUIDE.md** (docs/DB_GUIDE.md) - Database reference
+  - Schema reference for both databases
+  - Query examples and optimization tips
+
+- **README.md** (README.md) - Project overview and quick start
+
+---
+
+## Key Components Deep Dive
+
+### 1. app.py - Application Entry Point (`app.py`)
+- Creates Flask app with CORS enabled
+- Registers all API routes (v1_bp, v2_bp)
+- Database initialization and schema checks on startup
+- Health check endpoint at `/healthz`
+- Logs all registered routes on startup
+- Error handling doesn't block app startup (errors surface on first request)
+
+### 2. lawrisk_service.py - Core Search Logic (`lawrisk/services/lawrisk_service.py`)
+- **EmbeddingClient**: DashScope API integration for vector embeddings
+- **ChatClient**: Qwen LLM interaction for intelligent subject selection
+- Database helpers using pg8000 for PostgreSQL
+- Search algorithms:
+  - Embedding-based cosine similarity search
+  - LLM-based subject matching using `qwen-plus-latest`
+- Similarity threshold management (tunable via LAWRISK_RETURN_IF_GE, LAWRISK_FALLBACK_GT)
+- Concurrent execution support with ThreadPoolExecutor
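The embedding search and the two thresholds can be illustrated with a small, dependency-free sketch. It assumes subjects arrive as `(id, vector)` pairs and that the thresholds behave as the variable names suggest: return every match scoring at least `LAWRISK_RETURN_IF_GE`, otherwise fall back to matches scoring above `LAWRISK_FALLBACK_GT`. `cosine_similarity` and `select_subjects` are hypothetical names, not the service's actual functions.

```python
import math
import os

# Defaults mirror the documented .env values; both are tunable.
RETURN_IF_GE = float(os.environ.get("LAWRISK_RETURN_IF_GE", "0.7"))
FALLBACK_GT = float(os.environ.get("LAWRISK_FALLBACK_GT", "0.5"))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_subjects(query_vec, subjects, return_if_ge=RETURN_IF_GE, fallback_gt=FALLBACK_GT):
    """Rank (id, vector) pairs by similarity and apply the two-tier threshold (assumed semantics)."""
    scored = sorted(
        ((cosine_similarity(query_vec, vec), sid) for sid, vec in subjects),
        reverse=True,
    )
    strong = [(sid, score) for score, sid in scored if score >= return_if_ge]
    if strong:
        return strong
    # No confident match: fall back to weaker candidates above the lower bound.
    return [(sid, score) for score, sid in scored if score > fallback_gt]
```

Lowering `LAWRISK_FALLBACK_GT` widens the fallback tier, which is one lever to try when queries return nothing.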
+
+### 3. lawrisk_v2_service.py - Enhanced API (`lawrisk/services/lawrisk_v2_service.py`)
+- Structured response formatting with regions, themes, permits, and risks
+- Region filtering logic with normalization
+- Direct permit matching by name
+- Markdown formatting for legal text
+- Complex query execution pipeline
+- Helper functions:
+  - `_compose_prompt()`: Builds natural-language prompts from structured data
+  - `_normalize_region_filter()`: Normalizes region filters for matching
+
+### 4. licensing_repo.py - Data Repository (`lawrisk/services/licensing_repo.py`)
+- Separate database connection for `licensing_risks` database
+- Multi-table join query optimization
+- Legal text formatting helpers
+- Chinese legal document pattern matching
+- Checkpoint management:
+  - `create_checkpoint()`: Database backup functionality
+  - `list_checkpoints()`: List available backups
+  - `restore_checkpoint()`: Restore from checkpoint (DANGEROUS!)
+  - `delete_checkpoint()`: Remove old checkpoints
+
+### 5. smart_cors_middleware.py - Reusable CORS (`lawrisk/middleware/smart_cors_middleware.py`)
+- Wildcard and exact origin matching
+- Subdomain support with flexible patterns
+- Preflight OPTIONS handling
+- NGINX integration mode (NGINX_CORS_MODE)
+- Debug and logging features (CORS_DEBUG)
+- Environment variable support:
+  - ALLOWED_ORIGINS, CORS_STRICT, CORS_DEBUG
+  - NGINX_CORS_MODE, CORS_MAX_AGE, CORS_EXPOSE_HEADERS
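Wildcard, exact, and subdomain origin matching can be sketched with the standard-library `fnmatch` module. This is an illustration only: it assumes `ALLOWED_ORIGINS` is a comma-separated list whose entries may contain `*`, and `parse_allowed_origins` and `is_origin_allowed` are hypothetical names rather than the middleware's real API.

```python
from fnmatch import fnmatchcase


def parse_allowed_origins(raw):
    """Split a comma-separated ALLOWED_ORIGINS value into clean patterns."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def is_origin_allowed(origin, allowed):
    """Return True if origin matches any pattern: '*', an exact origin, or a glob such as https://*.example.com."""
    for pattern in allowed:
        # fnmatchcase is case-sensitive and treats '*' as "any run of characters".
        if pattern == "*" or fnmatchcase(origin, pattern):
            return True
    return False


allowed = parse_allowed_origins("https://example.com, https://*.example.com")
print(is_origin_allowed("https://app.example.com", allowed))
```

Because `*` matches any characters, a pattern like `https://*.example.com` also accepts nested subdomains; a stricter deployment might prefer an explicit list.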
+
+### 6. v2.py API Routes (`lawrisk/api/v2.py`)
+- **Public endpoints**: `/v2`, `/v2/regions`, `/getPermits`
+- **Admin endpoints**: `/admin/test`, `/admin/regions`, `/admin/themes`, `/admin/permits`, `/admin/checkpoints`
+- Parameter extraction supporting GET, POST, JSON, and form data
+- Concurrent execution using ThreadPoolExecutor (max_workers=2)
+- Structured error responses with consistent format
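The two-worker concurrency pattern might look like the sketch below, which runs two independent lookups in parallel and merges their results. The two lookup functions are hypothetical stand-ins; which calls the real route actually runs concurrently is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor


def search_by_embedding(query):
    """Hypothetical stand-in for the embedding/LLM subject search."""
    return [f"subject-for:{query}"]


def match_permits_by_name(query):
    """Hypothetical stand-in for direct permit matching by name."""
    return [f"permit-for:{query}"]


def run_search(query):
    """Run both lookups concurrently and combine their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        subjects_future = pool.submit(search_by_embedding, query)
        permits_future = pool.submit(match_permits_by_name, query)
        # .result() blocks until each task finishes and re-raises its exception, if any.
        return {
            "subjects": subjects_future.result(),
            "permits": permits_future.result(),
        }


print(run_search("cinema"))
```

Threads suit this workload because both lookups are I/O-bound (HTTP and database calls), so the total latency approaches that of the slower call.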
+
+### 7. Utility Scripts
+- **env_loader.py**: Environment variable loading from .env file
+- **export_risk_json.py**: PostgreSQL data export utility (outputs to data/risk_tables_export.json)
+- **ingest_lawrisk.py**: Data ingestion with embeddings (requires DASHSCOPE_API_KEY)
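Loading a `.env` file can be sketched in a few lines of standard-library Python. This is not the code in `env_loader.py`; `parse_env_lines` and `load_env` are hypothetical names, and the rules shown (skip comments and blank lines, strip surrounding quotes, do not override existing variables unless asked) are assumptions about a typical loader.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks, comments, and lines without '='."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path=".env", override=False):
    """Load variables from path into os.environ; existing variables win unless override=True."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env_lines(handle)
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Letting real environment variables take precedence means a deployment can override `.env` defaults without editing the file.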
+
+---
+
+## Security & Best Practices
+
+### Security Guidelines
+- **NEVER hardcode secrets** in source code
+- All credentials must be in `.env` file or environment variables
+- API keys (DASHSCOPE_API_KEY) and database passwords must be externalized
+- Admin endpoints (`/admin/*`) should be protected in production
+- Database checkpoint restore is a **DANGEROUS OPERATION** and should be restricted
+
+### Configuration Best Practices
+- Use `.env` file for all configuration (database, API keys, thresholds)
+- Environment variables supported by CORS middleware:
+  - ALLOWED_ORIGINS: Comma-separated list of allowed origins
+  - CORS_STRICT: Enable strict origin checking
+  - CORS_DEBUG: Enable debug logging
+  - NGINX_CORS_MODE: Enable NGINX integration
+  - CORS_MAX_AGE: Preflight cache duration
+  - CORS_EXPOSE_HEADERS: Headers to expose to browsers
+- Regularly backup databases using checkpoint system
+- Monitor DashScope API quota and rate limits
+
+---
+
+## Troubleshooting Guide
+
+### Common Issues
+
+#### Database Connection Errors
+**Symptom**: `pg8000.dbapi.Error` when starting the app
+**Solutions**:
+- Check `.env` file exists with correct PostgreSQL credentials
+- Verify network connectivity to the database server
+- Ensure PostgreSQL server is running and accessible
+- Check database names: `fs_law_risk` and `licensing_risks`
+- Verify PG_HOST, PG_PORT, PG_USER, PG_PASSWORD are correct
+
+#### Missing Environment Variables
+**Symptom**: Key errors, default values being used, or API failures
+**Solutions**:
+- Create `.env` file from the template (see Configuration section)
+- Ensure `DASHSCOPE_API_KEY` is set for embedding/chat features
+- Verify all required PG_* and LIC_* environment variables
+- Check that pg8000 is installed: `pip install "pg8000>=1.30.0"` (quote the specifier so the shell does not treat `>` as a redirect)
+
+#### LLM/Embedding API Errors
+**Symptom**: API authentication failures, timeout errors, or embedding errors
+**Solutions**:
+- Verify `DASHSCOPE_API_KEY` is valid and has sufficient quota
+- Check `DASHSCOPE_BASE_URL` matches: `https://dashscope.aliyuncs.com/compatible-mode/v1`
+- Ensure network access to DashScope API servers
+- Review API rate limits and batch sizes (DASHSCOPE_MAX_BATCH)
+- Test API key: `curl -H "Authorization: Bearer $DASHSCOPE_API_KEY" "$DASHSCOPE_BASE_URL/models"`
+
+#### Empty Search Results
+**Symptom**: API returns empty `risk_subject` array
+**Solutions**:
+- Check if database tables are populated:
+  ```sql
+  -- Run while connected to the fs_law_risk database (assumes the default schema)
+  SELECT COUNT(*) FROM law_sub;
+  SELECT COUNT(*) FROM law_sub_per;
+  ```
+- Try `debug=1` parameter to see detailed execution info
+- Verify similarity thresholds in `.env`:
+  - `LAWRISK_RETURN_IF_GE=0.7` (return if similarity >= 0.7)
+  - `LAWRISK_FALLBACK_GT=0.5` (fallback if similarity > 0.5)
+- Test with known queries like "我要办一家电影院" or "开办旅馆"
+- Check if embeddings exist (connected to `fs_law_risk`): `SELECT COUNT(*) FROM law_sub WHERE vector IS NOT NULL;`
+
+#### Port Already in Use
+**Symptom**: `OSError: [WinError 10048] Only one usage of each socket address` (Windows)
+**Solutions**:
+- Change port: `PORT=8001 python app.py` (bash) or `$env:PORT = "8001"; python app.py` (PowerShell)
+- Kill existing process using the port:
+  ```powershell
+  netstat -ano | findstr :8000
+  taskkill /PID <PID> /F
+  ```
+
+#### Data Export/Import Issues
+**Symptom**: export_risk_json.py or ingest_lawrisk.py fails
+**Solutions**:
+- Verify PostgreSQL credentials in `.env` match database access
+- Ensure export script isn't writing outside the repository
+- For ingestion, confirm DASHSCOPE_API_KEY is valid
+- Check data files exist: `data/risk_tables_export.json`
+- Run export first, then ingestion if needed
+
+### Health Check Commands
+```bash
+# Basic health check
+curl http://localhost:8000/healthz
+
+# Check registered routes (see app startup logs)
+python app.py 2>&1 | grep "Registered routes"
+
+# Test V2 API with debug
+curl -X POST "http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/v2" \
+  --data-urlencode "query=我要办一家电影院" \
+  --data-urlencode "debug=1"
+
+# Test regions endpoint
+curl http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/v2/regions
+
+# Test admin endpoints
+curl http://localhost:8000/fs-ai-asistant/api/workflow/lawrisk/admin/test
+```
+
+### Database Verification
+Verify database content with queries from `DB_GUIDE.md`:
+```sql
+-- In the fs_law_risk database (psql: \c fs_law_risk)
+-- Check subject count
+SELECT COUNT(*) FROM law_sub;
+
+-- Check how many subjects have an embedding stored
+SELECT COUNT(*) FROM law_sub WHERE vector IS NOT NULL;
+
+-- In the licensing_risks database (psql: \c licensing_risks)
+-- Check region-theme pairs
+SELECT COUNT(*) FROM region_themes;
+
+-- List available regions
+SELECT * FROM regions ORDER BY name;
+```
+
+### Debug Mode
+Enable debug logging to troubleshoot issues:
+```bash
+# Enable Flask debug mode
+FLASK_DEBUG=1 python app.py
+
+# Enable CORS debug mode
+CORS_DEBUG=1 python app.py
+
+# Check app logs for registered routes and errors
+# Logs are printed to console when starting the app
+```
+
+### Health Checks
+- **Basic health**: `GET /healthz` → `{"status": "ok"}`
+- **V2 regions**: `GET /fs-ai-asistant/api/workflow/lawrisk/v2/regions`
+- Check logs for registered routes on app startup