Inspection and Testing Report Recognition
黄仁欢 4bd46b6f0c docs(test): add comprehensive documentation for batch testing script
Added three key documentation files:

1. TEST_ACCURACY_BATCH_README.md
   - Complete usage guide for test_accuracy_batch_full.py
   - Command-line parameters reference
   - 4 usage scenarios (quick, high-accuracy, fast, single-PDF)
   - Troubleshooting guide
   - Performance optimization tips
   - Best practices and examples

2. TEST_ACCURACY_BATCH_DEPENDENCIES.md
   - Detailed dependency analysis
   - Required files and directory structure
   - Python library dependencies
   - File size statistics
   - Dependency relationship diagram
   - Common dependency issues and solutions

3. CLEANUP_PLAN.md
   - File categorization (keep, archive, delete)
   - Step-by-step cleanup instructions
   - Archive directory structure proposal
   - Three cleanup approaches (conservative, aggressive, phased)
   - Cleanup automation script

Features:
- Comprehensive parameter reference tables
- Real-world usage examples
- Performance comparison charts
- Quick reference commands
- Development guidelines

Target audience:
- New developers joining the project
- QA team running batch tests
- DevOps engineers deploying the system

Related:
- test_accuracy_batch_full.py (v1.2.0)
- PADDLEOCRVL_TIMEOUT_FIX_SUMMARY.md
- IMPLEMENTATION_SUMMARY.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 14:32:04 +08:00
data Temporary save 2026-02-05 13:57:22 +08:00
report_viz Temporary save 2026-02-05 13:57:22 +08:00
scripts Temporary save 2026-02-05 13:57:22 +08:00
src Checkpoint before ONNX migration 2026-02-09 09:43:28 +08:00
temp_classpath Temporary save 2026-02-05 13:57:22 +08:00
.gitignore Checkpoint before ONNX migration 2026-02-09 09:43:28 +08:00
BUILD_REPORT.md feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00
CLEANUP_PLAN.md docs(test): add comprehensive documentation for batch testing script 2026-03-03 14:32:04 +08:00
COMPREHENSIVE_REPORT.md feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00
DJL_UPGRADE_ATTEMPT_REPORT.md feat(djl): attempt upgrade to DJL 0.27.0 to fix PaddlePaddle crashes 2026-02-09 00:04:40 +08:00
IMPLEMENTATION_SUMMARY.md feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00
INTEGRATION_GUIDE.md feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00
INTEGRATION_TEST_REPORT.md feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00
ManualTest.java Temporary save 2026-02-05 13:57:22 +08:00
PADDLEOCRVL_INTEGRATION.md feat: integrate PaddleOCRVL for seal text recognition 2026-02-07 14:03:10 +08:00
README.md feat: implement RBAC with Sa-Token, institution switch, and backend integration tests 2026-01-28 16:15:09 +08:00
TEST_ACCURACY_BATCH_DEPENDENCIES.md docs(test): add comprehensive documentation for batch testing script 2026-03-03 14:32:04 +08:00
TEST_ACCURACY_BATCH_README.md docs(test): add comprehensive documentation for batch testing script 2026-03-03 14:32:04 +08:00
cma_extraction_template_primary.py fix(cma): implement robust CMA code extraction with fallback mechanism 2026-02-16 14:16:34 +08:00
jar_paths.txt Temporary save 2026-02-05 13:57:22 +08:00
pom.xml Checkpoint before ONNX migration 2026-02-09 09:43:28 +08:00
reply.md Temporary save 2026-02-05 13:57:22 +08:00
res.json Temporary save 2026-02-05 13:57:22 +08:00
run_reference_test.bat Temporary save 2026-02-05 13:57:22 +08:00
run_test.bat Temporary save 2026-02-05 13:57:22 +08:00
run_test_v2.bat Temporary save 2026-02-05 13:57:22 +08:00
run_viz_report.bat Temporary save 2026-02-05 13:57:22 +08:00
settings.xml Checkpoint before ONNX migration 2026-02-09 09:43:28 +08:00
test_accuracy_batch_full.py feat(ocr): add PaddleOCRVL timeout protection and improve OCR accuracy 2026-03-03 14:26:46 +08:00
test_paddleocr_vl_quick.py feat: integrate PaddleOCRVL for seal text recognition 2026-02-07 14:03:10 +08:00
v_verify_logic.py Temporary save 2026-02-05 13:57:22 +08:00
测试结果汇总.txt feat(ocr): integrate Python test script improvements for 85% parity 2026-02-08 15:22:50 +08:00

README.md

Report Detection Backend

Java-based backend system for automated report validation and comparison using OCR.

Technology Stack

  • Core: Java 8 (Spring Boot 2.7.18)
  • Security: Sa-Token (RBAC, Session Management)
  • OCR Engine: PaddleOCR (via DJL - Deep Java Library)
  • Database: PostgreSQL (with Dynamic Datasource support)
  • Build Tool: Maven

Features

  • RBAC Implementation: Multi-role support (ADMIN, AUDITOR, USER); role names are standardized to uppercase.
  • Sa-Token Security: Annotation-based permission checks and secure login.
  • Auditor Context Switch: Specialized feature for Auditors to switch between institutional views.
  • PDF Processing: Automatic conversion of PDF reports to images for OCR analysis.
  • Automated Verification: Integration tests using H2 in-memory database.
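
The uppercase role standardization mentioned above could look roughly like the following. This is a minimal sketch using only the JDK, not the project's actual code: the class name `RoleNames`, the `Role` enum, and the `normalize` method are hypothetical names chosen for illustration, and the real backend presumably wires role checks through Sa-Token instead.

```java
import java.util.Locale;

// Hypothetical helper (not from the repository): maps raw role strings
// to the three roles the README lists, regardless of input case.
public class RoleNames {

    /** The roles named in the Features list. */
    public enum Role { ADMIN, AUDITOR, USER }

    /**
     * Trims the input, converts it to uppercase, and resolves it to a Role.
     * Throws IllegalArgumentException for null or unknown role names.
     */
    public static Role normalize(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("role must not be null");
        }
        // Locale.ROOT avoids locale-specific case mappings (e.g. Turkish dotless i).
        return Role.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(normalize(" auditor ")); // prints AUDITOR
    }
}
```

Normalizing at a single entry point like this means that stored values such as `admin` or `Admin` resolve to the same role before any permission check runs.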

Getting Started

Prerequisites

  • JDK 8 or 17
  • Maven 3.6+
  • PostgreSQL (optional for local dev if using H2 profile)

Run the Application

mvn clean package
java -jar target/report-detect-backend-1.0.0.jar

Run Tests

mvn test -Dtest=SecurityRBACVerificationTest

Security Configuration

Default accounts created on initialization:

  • admin / 123456 (ADMIN)
  • auditor / 123456 (AUDITOR)
  • user / 123456 (USER)