Bug 165215 - Implement import filter support for Hancom Office .HWPX XML document format (OWPML - KS X 6101:2024 filter)
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 25.2.0.3 release (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL: https://events.documentfoundation.org...
Whiteboard:
Keywords:
Depends on:
Blocks: Format-Filters CJK-Korean
Reported: 2025-02-12 09:45 UTC by Volga
Modified: 2025-02-12 17:08 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
HWP file from the same source (1.73 MB, application/x-ole-storage)
2025-02-12 09:47 UTC, Volga

Description Volga 2025-02-12 09:45:59 UTC
Description:
Some websites provide files in HWP format that LibreOffice cannot open. Most of them were exported by later versions of Hancom Office.

For example, the following page has an appendix, which you can view by clicking [별표] 대통령 표장도 ("[Appendix] Presidential emblem diagram").
https://law.go.kr/LSW/admRulInfoP.do?admRulSeq=2100000191838#AJAX

In the same place there is a blue icon that lets you download an HWP file, which LibreOffice is unable to open.

Steps to Reproduce:
N/A

Actual Results:
N/A

Expected Results:
N/A


Reproducible: Always


User Profile Reset: No

Additional Info:
N/A
Comment 1 Volga 2025-02-12 09:47:07 UTC
Created attachment 199164
HWP file from the same source
Comment 2 V Stuart Foote 2025-02-12 16:47:53 UTC
Hancom Office file formats are proprietary, but they have a high market share for Hangul tabular word processing in South Korea, as well as for other office suite documents.

Their legacy .HWP binary formats are probably out of the question for further LibreOffice import filter work. There are continued issues with the legacy OOo hwpfilter (pre-1997 HWP formats) in bug 105405, and open requests in bug 128907 and bug 144747, while the pyhwp approach suggested in bug 90513 looks to have stalled at HWP 5.

But the current XML-based suite formats, notably .HWPX, seem a more reasonable import filter project. Hancom appears to have moved the suite formats to XML to improve interoperability with MS Office products, de-emphasizing the binary .HWP (think MS Office binary file formats vs. MS Office OOXML formats).

A parsing library under the Document Liberation Project might be the best approach, especially if Hancom wanted to support it, but something that a LibreOffice Writer import filter could use directly would perhaps also be well received.
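
As a rough illustration of the binary vs. XML split described above, here is a minimal sketch of how an import filter's type detection might tell the two containers apart. It assumes that .HWPX is a ZIP package holding XML parts (not confirmed anywhere in this report), while the OLE signature matches the application/x-ole-storage type of the attached HWP file. The function names and the sample member names are hypothetical, and nothing here parses OWPML itself.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header signature


def sniff_hancom_container(data: bytes) -> str:
    """Classify bytes by container type only; no content is parsed."""
    if data.startswith(OLE_MAGIC):
        # Consistent with binary .HWP 5.x (application/x-ole-storage)
        return "ole"
    if data.startswith(ZIP_MAGIC) and zipfile.is_zipfile(io.BytesIO(data)):
        # Assumption: .HWPX is a ZIP package of XML parts
        return "zip"
    return "unknown"


def list_xml_parts(data: bytes) -> list[str]:
    """Return the sorted names of the XML members in a ZIP package."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".xml"))
```

A real filter would go on to check the package's declared media type and root elements against KS X 6101 rather than rely on the container signature alone, since many unrelated formats are also ZIP or OLE based.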
Comment 3 V Stuart Foote 2025-02-12 17:08:11 UTC
OWPML is a published Korean ICT standard, KS X 6101:2024, viewable here:

https://standard.go.kr/streamdocs/view/sd;streamdocsId=72059330878256386

It is in Korean, so it will need to be run through a gloss/translation service.