report-detect

Inspection and testing report recognition


黄仁欢 0d760ee656 fix(ocr): remove multiprocessing to fix Windows Queue synchronization issue	2026-03-05 09:52:45 +08:00

PROBLEM:
- Institution names were successfully extracted by the PaddleOCRVL subprocess
- But the main process received an empty result due to Windows multiprocessing Queue delay
- Result: API returned an empty institutions array despite successful OCR extraction

ROOT CAUSE:
- Used multiprocessing.Process with Queue for inter-process communication
- On Windows, Queue has a synchronization delay when process.join() returns
- Subprocess put data in the Queue, but the main process called get_nowait() too early
- Result: data loss even though the subprocess succeeded

SOLUTION:
- Remove multiprocessing entirely
- Direct call to vl_pipeline.predict() in the main process
- No Queue synchronization issues
- Simpler code (150 lines → 100 lines)
- Faster execution (no subprocess overhead)

TESTING:
- Tested with 1.pdf: CMA 20211901583 extracted (99.91% confidence)
- Institution extracted: 深圳市中多质量检验认证有限公司 (15 chars)
- Flask API returns populated institutions array
- Java backend successfully saves to database
- End-to-end integration verified

CHANGES:
- test_accuracy_batch_full.py: run_ocr_recognition_vl() function
- Removed: multiprocessing.Process, Queue, subprocess wrapper
- Added: direct call to vl_pipeline.predict()
- Simplified error handling and result parsing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
archive	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
data	Temporary save	2026-02-05 13:57:22 +08:00
report_viz	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
scripts	Temporary save	2026-02-05 13:57:22 +08:00
src	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
.gitignore	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
CLEANUP_COMPLETE.md	docs(cleanup): add cleanup completion report	2026-03-03 14:35:50 +08:00
CLEANUP_PLAN.md	docs(test): add comprehensive documentation for batch testing script	2026-03-03 14:32:04 +08:00
IMPLEMENTATION_SUMMARY.md	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
TEST_ACCURACY_BATCH_DEPENDENCIES.md	docs(test): add comprehensive documentation for batch testing script	2026-03-03 14:32:04 +08:00
TEST_ACCURACY_BATCH_README.md	docs(test): add comprehensive documentation for batch testing script	2026-03-03 14:32:04 +08:00
cma_extraction_final.py	feat(cma): add CMA extraction module fallback implementation	2026-03-03 14:51:58 +08:00
cma_extraction_template_primary.py	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
pom.xml	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
settings.xml	chore(project): conservative cleanup - archive temp scripts and old docs	2026-03-03 14:35:06 +08:00
test_accuracy_batch_full.py	fix(ocr): remove multiprocessing to fix Windows Queue synchronization issue	2026-03-05 09:52:45 +08:00