Commit Graph

3 Commits

Author SHA1 Message Date
黄仁欢 8b416e9f5a feat: integrate PaddleOCRVL for seal text recognition
- Add PaddleOCRVL as optional OCR model for seal text recognition
  - New parameter: --ocr-model {ppocr_v5,paddleocr_vl}
  - PaddleOCRVL achieves 100% accuracy on test cases (vs 84% for PP-OCRv5)
  - Backward compatible: defaults to PP-OCRv5

- Fix CMA recognition regression
  - Ensure ocr_engine is always initialized for CMA extraction
  - PaddleOCRVL only used for seal text, not CMA recognition

- Add comprehensive integration guide
  - PADDLEOCRVL_INTEGRATION.md with usage examples
  - test_paddleocr_vl_quick.py for validation

Implementation details:
- run_ocr_recognition_vl(): New function for PaddleOCRVL recognition
- extract_seals_and_institutions(): Enhanced with OCR model selection
- Automatic fallback to PP-OCRv5 if PaddleOCRVL unavailable
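
A minimal sketch of the model selection and fallback described above, not the code from this commit. Only the flag name `--ocr-model`, its two choices, and the PP-OCRv5 default come from the message; `build_parser`, `vl_backend_available`, and `resolve_seal_ocr_model` are hypothetical names, and checking for an importable `paddleocr` package is an assumed stand-in for whatever availability check the real code performs.

```python
import argparse
import importlib.util

# Choices and default as stated in the commit message.
OCR_MODELS = ("ppocr_v5", "paddleocr_vl")
DEFAULT_OCR_MODEL = "ppocr_v5"


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser exposing the --ocr-model option (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Seal text recognition (sketch)")
    parser.add_argument(
        "--ocr-model",
        choices=OCR_MODELS,
        default=DEFAULT_OCR_MODEL,  # backward compatible: PP-OCRv5 unless requested
        help="OCR model used for seal text recognition only",
    )
    return parser


def vl_backend_available(module_name: str = "paddleocr") -> bool:
    """Assumption: treat the backend as available if the package can be located."""
    return importlib.util.find_spec(module_name) is not None


def resolve_seal_ocr_model(requested: str, vl_available: bool) -> str:
    """Return the model to use for seal text, falling back to PP-OCRv5."""
    if requested == "paddleocr_vl" and not vl_available:
        return DEFAULT_OCR_MODEL
    return requested


if __name__ == "__main__":
    args = build_parser().parse_args()
    model = resolve_seal_ocr_model(args.ocr_model, vl_backend_available())
    print(f"seal text OCR model: {model}")
```

In this sketch the resolved model would apply to seal text only; per the message, CMA extraction keeps using the always-initialized `ocr_engine` regardless of the flag.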

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-07 14:03:10 +08:00
黄仁欢 2c8ab7379c Temporary save (WIP) 2026-02-05 13:57:22 +08:00
黄仁欢 68b6881c5a feat: implement RBAC with Sa-Token, institution switch, and backend integration tests 2026-01-28 16:15:09 +08:00