Bug 56734 (hOCR)

Summary: Add support for hOCR format as input to Writer
Product: LibreOffice
Reporter: Callegar <sergio.callegari>
Component: filters and storage
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW ---
Severity: enhancement
CC: bfo.bugmail, bugs
Priority: medium    
Version: 3.6.3.2 release   
Hardware: All   
OS: All   
Whiteboard: BSA
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 108254    

Description Callegar 2012-11-04 10:06:55 UTC
Namely, automatic conversion from hOCR, the HTML-derived format for representing OCR data, to ODT.

This would allow tighter integration with OCR tools such as Tesseract or Cuneiform.
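
As a rough illustration of the input side of such a conversion, the sketch below pulls the recognized words and their bounding boxes out of an hOCR fragment using only the Python standard library. The sample markup and the names `SAMPLE_HOCR`, `HocrWordParser`, `parse_bbox` and `parse_hocr` are made up for this example; the `ocrx_word` class and the `bbox x0 y0 x1 y1` property stored in the `title` attribute come from the hOCR format. It is only a sketch of reading the data, not a proposal for how a Writer import filter should be implemented.

```python
from html.parser import HTMLParser

# Hypothetical sample input: one page, one line, two words.
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 600 800'>
  <span class='ocr_line' title='bbox 10 20 300 50'>
    <span class='ocrx_word' title='bbox 10 20 90 50; x_wconf 95'>Hello</span>
    <span class='ocrx_word' title='bbox 100 20 200 50; x_wconf 91'>world</span>
  </span>
</div>
"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    # Properties in the title attribute are separated by semicolons.
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs for every element of class ocrx_word.

    Assumption for this sketch: word elements are <span>s that do not
    contain nested <span>s.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._in_word and tag == "span":
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def parse_hocr(markup):
    """Parse hOCR markup and return a list of (word, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(markup)
    parser.close()
    return parser.words


words = parse_hocr(SAMPLE_HOCR)
for text, bbox in words:
    print(text, bbox)
```

An importer would still have to map these words and coordinates onto paragraphs or positioned text in the document, which is the part this request is actually about.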
Comment 1 bfoman (inactive) 2012-11-04 12:51:05 UTC
Enhancement request.
Comment 2 Roman Eisele 2012-11-22 07:19:10 UTC
For an introduction to the hOCR format (with additional links), see e.g.
   http://en.wikipedia.org/wiki/hOCR

A valid enhancement request, therefore set Status to NEW.

The Component should be either “Writer” or “filters and storage”.