<html><head><meta charset="utf-8"></head><body style="font-family: sans-serif; padding: 20px; background: #fdfdfd;">
<h1>Integrated Workflow: PaddleX Layout Analysis + OCR</h1>

<!-- CMA Code Extraction Section -->
<div style="background: white; padding: 20px; border-radius: 8px; box-shadow: 0 2px 10px rgba(0,0,0,0.05); margin-bottom: 40px;">
<h3 style="color: #2e7d32;">CMA Code Extraction (Full-page OCR + Position Filtering)</h3>
<p><strong>Method:</strong> Full-page OCR with position-based filtering (top-right area priority)</p>
<p><strong>Algorithm:</strong> Extract all text → Filter by position → Regex match → Score candidates</p>
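<p>The four algorithm steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the code that produced this report: the <code>TextLine</code> record, the top-half position filter, the 12-digit length rule (inferred from the code shown in this report) and the 0.25 corner bonus are all assumptions made for the example.</p>

```python
import re
from dataclasses import dataclass


@dataclass
class TextLine:
    """One OCR text line (hypothetical record, not a PaddleX type)."""
    text: str     # recognized text
    score: float  # OCR confidence in [0, 1]
    x: float      # box center, pixels from the left edge
    y: float      # box center, pixels from the top edge


def extract_cma_code(lines, page_width, page_height, code_length=12):
    """Return (code, rank, line) for the best candidate, or None if nothing matches."""
    best = None
    for line in lines:  # step 1: `lines` holds all text from full-page OCR
        # Step 2: filter by position (assumed rule: keep the top half of the page).
        if line.y > 0.5 * page_height:
            continue
        # Step 3: regex match; keep digit runs of the assumed code length.
        for run in re.findall(r"\d+", line.text):
            if len(run) != code_length:
                continue
            # Step 4: score; OCR confidence plus a bonus for being near the top-right corner.
            right = line.x / page_width
            top = 1.0 - line.y / page_height
            rank = line.score + 0.25 * (right + top)
            if best is None or rank > best[1]:
                best = (run, rank, line)
    return best
```

<p>Calling <code>extract_cma_code(lines, page_width, page_height)</code> with the full-page OCR lines returns the best-ranked digit run together with its rank and source line, or <code>None</code> when no line passes both filters.</p>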
<div style="margin-top: 20px;">
<h4 style="color: #1b5e20;">Extracted CMA Code</h4>
<p style="font-size: 32px; font-weight: bold; color: #2e7d32; margin: 10px 0;">
202319017008
</p>
<p style="color: #666;">Confidence: 99.93%</p>
<p style="font-size: 14px; color: #888;">Raw Text: "202319017008"</p>
<p style="font-size: 14px; color: #888;">Position: (376, 411)</p>
</div>

<div style="margin-top: 20px;">
<p style="margin: 5px 0;"><strong>Detection Visualization:</strong></p>
<img src="cma_detection_fullpage.png" style="max-width: 100%; border: 2px solid #4caf50; border-radius: 4px;">
</div>

</div>
<!-- Document Layout Detection Section -->
<div style="background: white; padding: 20px; border-radius: 8px; box-shadow: 0 2px 10px rgba(0,0,0,0.05); margin-bottom: 40px;">
<h2>1. Document Layout Detection (PaddleX PP-DocLayout-L)</h2>
<p>File: 关于中检测试技术(广东)集团有限公司检验检测资质的调查取证函(局长件)_pages11-14.pdf | Detected Regions: 21</p>
<img src="doc_layout_viz.png" style="max-width: 100%; border: 1px solid #999;">
</div>
<!-- Seal Extraction Section -->
<div>
<h2>2. Refined Seal Extraction, Unwarping &amp; OCR Recognition</h2>

<div style="margin-bottom: 40px; border-bottom: 2px solid #eee; padding-bottom: 20px;">
<h3>Seal Area #0</h3>
<div style="display: flex; gap: 20px; flex-wrap: wrap;">
<div style="background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">Detection Overlay</p>
<img src="seal_marked_0.png" style="max-height: 350px;">
</div>
<div style="flex-grow:1; background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">Unwarped Image</p>
<img src="seal_unwarp_0.png" style="max-width: 100%; border: 1px solid #ddd;">
</div>
<div style="flex-grow:1; background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">OCR Recognition Result</p>

<p style="font-size: 18px; font-weight: bold; color: #2e7d32;">
江西省润华教育装备集团有限公司
</p>
<p style="color: #666;">Confidence: 92.02%</p>

</div>
</div>
</div>
<div style="margin-bottom: 40px; border-bottom: 2px solid #eee; padding-bottom: 20px;">
<h3>Seal Area #1</h3>
<div style="display: flex; gap: 20px; flex-wrap: wrap;">
<div style="background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">Detection Overlay</p>
<img src="seal_marked_1.png" style="max-height: 350px;">
</div>
<div style="flex-grow:1; background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">Unwarped Image</p>
<img src="seal_unwarp_1.png" style="max-width: 100%; border: 1px solid #ddd;">
</div>
<div style="flex-grow:1; background:white; padding:10px; border-radius:4px; box-shadow: 0 1px 3px rgba(0,0,0,0.1);">
<p style="margin-top:0;">OCR Recognition Result</p>

<p style="font-size: 18px; font-weight: bold; color: #2e7d32;">
中检广东)集务限公司
</p>
<p style="color: #666;">Confidence: 79.85%</p>

</div>
</div>
</div>

</div>
<div style="background: #f5f5f5; padding: 15px; border-radius: 4px; margin-top: 20px;">
<h3>OCR Results Summary (JSON)</h3>
<pre style="background: white; padding: 10px; border-radius: 4px; overflow-x: auto;">[
  {
    "seal_index": 0,
    "text": "江西省润华教育装备集团有限公司",
    "score": 0.9202076196670532,
    "success": true
  },
  {
    "seal_index": 1,
    "text": "中检广东)集务限公司",
    "score": 0.7985407114028931,
    "success": true
  }
]</pre>
</div>
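<p>A downstream script could turn the raw scores in the summary above into the percentages shown in this report and flag weak results for manual review. The sketch below is illustrative only: <code>review_seals</code> and the 0.85 threshold are hypothetical, and the JSON literal simply repeats the two entries above.</p>

```python
import json

# The two entries from the OCR results summary, copied verbatim.
SUMMARY = """[
  {"seal_index": 0, "text": "江西省润华教育装备集团有限公司",
   "score": 0.9202076196670532, "success": true},
  {"seal_index": 1, "text": "中检广东)集务限公司",
   "score": 0.7985407114028931, "success": true}
]"""


def review_seals(summary_json, min_score=0.85):
    """Format each score as a percentage and flag entries that need a manual check."""
    rows = []
    for item in json.loads(summary_json):
        # Flag failed recognitions and scores under the (assumed) threshold.
        needs_review = (not item["success"]) or min_score > item["score"]
        rows.append({
            "seal_index": item["seal_index"],
            "text": item["text"],
            "confidence": f"{item['score']:.2%}",
            "needs_review": needs_review,
        })
    return rows


rows = review_seals(SUMMARY)
```

<p>With this threshold, seal 1 (79.85%) is flagged while seal 0 (92.02%) passes. The recognized text of seal 1 also contains an unmatched closing parenthesis, which suggests that characters were dropped during recognition.</p>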
</body></html>