Description: Some websites provide files in HWP format that LibreOffice can't open. Most of them were exported by later versions of Hancom Office. For example, the following page has an appendix, which you can view by clicking [별표] 대통령 표장도 ("[Appendix] Diagram of the Presidential Emblem"). https://law.go.kr/LSW/admRulInfoP.do?admRulSeq=2100000191838#AJAX In the same place there is a blue icon that lets you download an HWP file, which LibreOffice is unable to open.
Steps to Reproduce: N/A
Actual Results: N/A
Expected Results: N/A
Reproducible: Always
User Profile Reset: No
Additional Info: N/A
Created attachment 199164: HWP file from the same source
Hancom Office file formats are proprietary, but they have a high market share in South Korea for Hangul tabular word processing, as well as for other office suite documents.

Further LibreOffice import filter work on the legacy binary .HWP formats is probably out of the question. There are continuing issues with the legacy OOo hwpfilter (pre-1997 HWP formats) in bug 105405, and open requests in bug 128907 and bug 144747, while the pyhwp approach suggested in bug 90513 looks to have stalled at HWP 5.

The current XML-based suite formats, notably .HWPX, seem a more reasonable target for an import filter project. Hancom appears to have moved the suite formats to XML to improve interoperability with MS Office products, deemphasizing the binary .HWP (compare the MS Office binary file formats with the MS Office OOXML formats).

A parsing library under a 'Document Liberation' project might be the best approach, especially if Hancom wanted to support it, but something written directly for a LibreOffice Writer import filter to use might also be well received.
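For triage, it may help to check which container a given download actually uses. The following is only a rough sketch, not part of any existing filter: it assumes that binary HWP 5.x files are OLE2 compound files and that .HWPX files are ZIP packages, and it looks only at the leading magic bytes. The function names and return labels are made up for illustration, and a match identifies the container only, not the Hancom format itself (many other formats also use OLE2 or ZIP).

```python
# Sketch: classify a file by its container signature only.
# Assumption: HWP 5.x is stored as an OLE2 compound file, HWPX as a ZIP package.
# The labels returned here are hypothetical and chosen for illustration.

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 Compound File signature
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header signature


def sniff_container(data: bytes) -> str:
    """Return 'ole2', 'zip', or 'unknown' based on the leading bytes."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"   # candidate for binary .HWP (5.x)
    if data.startswith(ZIP_MAGIC):
        return "zip"    # candidate for XML-based .HWPX
    return "unknown"    # e.g. older pre-OLE2 HWP or something else entirely


def sniff_file(path: str) -> str:
    """Read just enough of the file to check the signature."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

Running `sniff_file()` on the attachment would show whether the file in question is a binary .HWP candidate or a ZIP-based .HWPX candidate before deciding which of the bugs above it belongs with.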
OWPML is a published Korean ICT standard, KS X 6101:2024, viewable here: https://standard.go.kr/streamdocs/view/sd;streamdocsId=72059330878256386 It is in Korean, so it will need to be run through a gloss/translation service.