Bug 128907 - Support for Hancom Office File Format (HWP)
Summary: Support for Hancom Office File Format (HWP)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 108848
Depends on:
Blocks: Format-Filters
Reported: 2019-11-20 06:19 UTC by PGM
Modified: 2023-10-28 22:11 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
sample hwp file. (223.50 KB, application/haansofthwp)
2019-11-20 06:20 UTC, PGM
sample cell file. (12.97 KB, application/haansoftcell)
2019-11-20 06:20 UTC, PGM
sample show file. (45.70 KB, application/haansoftshow)
2019-11-20 06:21 UTC, PGM
cell's screenshot in HanCell. (118.47 KB, image/png)
2019-11-20 06:22 UTC, PGM
cell's screenshot in Calc. (112.12 KB, image/png)
2019-11-20 06:22 UTC, PGM
hwp's screenshot in HanWord. (138.67 KB, image/png)
2019-11-20 06:23 UTC, PGM
hwp's screenshot in Writer. (102.04 KB, image/png)
2019-11-20 06:23 UTC, PGM
show's screenshot in HanShow. (167.23 KB, image/png)
2019-11-20 06:23 UTC, PGM
show's screenshot in Impress. (155.92 KB, image/png)
2019-11-20 06:24 UTC, PGM

Description PGM 2019-11-20 06:19:44 UTC
Description:
This extends the improvement request in bug 108848.

Hancom Office is an office suite that is widely used in Korea, much like MS Office. It has three file formats in total: hwp, cell, and show.

hwp: counterpart of a doc file in MS Word
cell: counterpart of an xls file in MS Excel
show: counterpart of a pptx file in MS PowerPoint

Hancom Office is compatible with MS Office in each case. Currently, only the hwp file cannot be opened in LibreOffice.

I request that LibreOffice be fully compatible with Hancom Office.

I have attached a sample file for each of the three Hancom Office file formats.

Steps to Reproduce:
Open the file (hwp, cell, show).
hwp : Writer
Cell: Calc
show : Impress

Actual Results:
(tested in LibreOffice 6.2.4.2)

hwp in Writer : The contents of the file are broken.

cell in Calc : The file opens normally,
but the font size of the chart legend differs.

show in Impress : The file opens normally,
but the arrangement of the text in shapes differs.

Expected Results:
none.


Reproducible: Always


User Profile Reset: No



Additional Info:
I also uploaded screenshots from Hancom Office and from LibreOffice.
Comment 1 PGM 2019-11-20 06:20:27 UTC
Created attachment 155955 [details]
sample hwp file.
Comment 2 PGM 2019-11-20 06:20:55 UTC Comment hidden (off-topic)
Comment 3 PGM 2019-11-20 06:21:16 UTC Comment hidden (off-topic)
Comment 4 PGM 2019-11-20 06:22:04 UTC Comment hidden (off-topic)
Comment 5 PGM 2019-11-20 06:22:40 UTC Comment hidden (off-topic)
Comment 6 PGM 2019-11-20 06:23:07 UTC
Created attachment 155960 [details]
hwp's screenshot in HanWord.
Comment 7 PGM 2019-11-20 06:23:30 UTC
Created attachment 155961 [details]
hwp's screenshot in Writer.
Comment 8 PGM 2019-11-20 06:23:58 UTC Comment hidden (off-topic)
Comment 9 PGM 2019-11-20 06:24:26 UTC Comment hidden (off-topic)
Comment 10 Mike Kaganski 2019-11-20 07:18:08 UTC
It is unclear how this is different from tdf#108848 that you mention.

Just as in the latter, you note that "hwp" files (that are proprietary binary format in a "compound" file) are not read by LO. Have you found a documentation for it, like MS has created for its DOC [1], or do you offer a sponsorship to fund the reverse engineering?

You confirm that "cell" files (that are simply renamed XLSX) and "show" files (that are renamed PPTX) are opened in LO. Of course, we have issues related to XLSX (tdf#108897) and PPTX (tdf#108226) support. Is your request "to fix all those bugs (and also all unknown)"? Do you realize that full compatibility is not possible even in theory, because the two suites are built on different principles/document models; and that even where compatibility is possible, bugs happen, and need to be filed and fixed one-by-one?

[1] https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-doc/ccd7b486-7881-484c-a137-51170af7cc22
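
The distinction drawn above, between a "compound" binary hwp file and cell/show files that are renamed XLSX/PPTX, can be checked from the first bytes of each file. The following is a minimal sketch, not part of LibreOffice: it assumes the sample files follow the containers described in this comment, and the names `sniff_container` and `sniff_file` are hypothetical. It relies only on the well-known signatures of OLE2 compound files and ZIP archives.

```python
# Hypothetical helper: guess the container type of a file from its header.
# Assumption (from the comment above): .hwp is an OLE2 "compound" file,
# while .cell and .show are ZIP-based packages like XLSX and PPTX.

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header signature


def sniff_container(header: bytes) -> str:
    """Classify a file by the first bytes of its content."""
    if header.startswith(OLE_MAGIC):
        return "ole-compound"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify its container."""
    with open(path, "rb") as f:
        return sniff_container(f.read(8))
```

For example, `sniff_file("sample.hwp")` (a placeholder path) would be expected to return `"ole-compound"` if the file is indeed a compound document. A result of `"unknown"` only means the file uses neither container; it says nothing about the format inside.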
Comment 11 PGM 2019-11-20 08:39:10 UTC
(In reply to Mike Kaganski from comment #10)
> It is unclear how this is different from tdf#108848 that you mention.
> 
> Just as in the latter, you note that "hwp" files (that are proprietary
> binary format in a "compound" file) are not read by LO. Have you found a
> documentation for it, like MS has created for its DOC [1], or do you offer a
> sponsorship to fund the reverse engineering?
> 
> You confirm that "cell" files (that are simply renamed XLSX) and "show"
> files (that are renamed PPTX) are opened in LO. Of course, we have issues
> related to XLSX (tdf#108897) and PPTX (tdf#108226) support. Is your request
> "to fix all those bugs (and also all unknown)"? Do you realize that full
> compatibility is not possible even in theory, because the two suites are
> built on different principles/document models; and that even where
> compatibility is possible, bugs happen, and need to be filed and fixed
> one-by-one?
> 
> [1]
> https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-doc/
> ccd7b486-7881-484c-a137-51170af7cc22

#1 It is unclear how this is different from tdf#108848 that you mention.
>> I thought this report had the request for hwp file support in common with tdf#108848.
tdf#108848 was about a problem on the Android platform. I mentioned it because I see this report as extending that request from Android to PC, where the problem shows up differently.

#2 Just as in the latter, you note that "hwp" files (that are proprietary
binary format in a "compound" file) are not read by LO. Have you found a
documentation for it, like MS has created for its DOC [1], or do you offer a
sponsorship to fund the reverse engineering?
>> https://www.hancom.com/etc/hwpDownload.do 
Searching Hancom's website, I found that Hancom published the specification of its own HWP file format on June 29, 2010.

Let me translate the first paragraph on the page.

Hancom unveiled its binary format, HWP, and its markup language, HWPML, on June 29, 2010. The published formats cover the earlier versions HWP 2.x/3.x and HWP 2002 through HWP 2014 and HWP 2018. In addition, in October 2014, Hancom released document, formula, and chart specifications to supplement and distribute the HWP 5.0 specs. Using the file format disclosure documents, anyone can create a variety of derived works.
OWPML, an open document format for HWP, is supported by the ".owpml" extension from HWP 2018 and the subversion is supported with the ".hwpx" extension from HWP 2010.

The hwp file format specification is written in Korean; I found no English-language version.

I also translated the last paragraph.

Hancom strongly supports the openness and standardization of document formats. Hancom has provided the HWP 97 format free of charge and has also released the document format for HWPML, the XML format for HWP 2002-2010 documents. It actively takes part in committees concerned with open document standards and codes, and works for the standardization and openness of file formats. In addition, Hancom Office supports PDF/A, the standard document format for long-term retention of records, and actively supports importing and saving the international standard formats ISO ODF and OOXML.
This document is provided to anyone who wishes to access it. Anyone who wishes to copy, distribute, publish, or use the content of this document, beyond viewing it, must fully recognize and agree to Hancom's copyright. Distribution is limited to the original or copies with all content unmodified. The original and copies must contain the latest version of Hancom's specifications.
Hancom may also actively exercise its rights against anyone who acquires a separate exclusive right based on results obtained from the HWP document file (.hwp) disclosure document and tries to exercise it against Hancom. All copyright in products developed by referring to this document belongs to the individual or group that developed them. However, the statement "This product has been developed by referring to Hancom's HWP document file (.hwp) disclosure document" must be written in all user interfaces, manuals, help, and sources within the product; it may be omitted only where the product has no such component.
Hancom does not warrant the accuracy or integrity of the information contained in this document or of any product that an individual or group develops by referring to it.

In conclusion, I feel strongly that LibreOffice needs a Korean developer. I'm not a professional developer yet.

#3 You confirm that "cell" files (that are simply renamed XLSX) and "show"
files (that are renamed PPTX) are opened in LO. Of course, we have issues
related to XLSX (tdf#108897) and PPTX (tdf#108226) support. Is your request
"to fix all those bugs (and also all unknown)"? Do you realize that full
compatibility is not possible even in theory, because the two suites are
built on different principles/document models; and that even where
compatibility is possible, bugs happen, and need to be filed and fixed
one-by-one?

>> After renaming the two files, they do appear to be the same as XLSX and PPTX.
My main request is support for hwp files. I mentioned the small differences in the files that open normally only in case they could be fixed as well. As you said, asking for small problems in files that open normally to be fixed here is an unreasonable request.
This report may need to be changed to a request for HWP format support in LO Writer on the PC platform.



P.S. I'm sorry if my answer is hard to understand; I wrote it with a translator.
Comment 12 PGM 2019-11-21 19:10:35 UTC
Changed the summary and status.
Comment 13 Mike Kaganski 2019-11-22 06:28:11 UTC
Please don't confirm your own bug reports. That should be done independently by other people.

However, I confirm this is a valid request, different from tdf#108848 (which was specific for Android viewer, as mentioned in comment 11).
Comment 14 kellnerp@earthlink.net 2022-01-01 00:52:59 UTC
I have filed bug 144747 because LO displays garbage when opening hwp files, which it purportedly supports. That bug report and this one are after essentially the same thing: being able to open hwp (Hangul Word) documents in LO. Hangul Word is used widely within the Korean government and academia, as well as in offices and privately. The inability to open, write, and edit this file format keeps a large majority of Korean speakers from communicating with the rest of the world. Since the number of Korean speakers is similar to, if not greater than, the number of French speakers in the world, the ability to read and write hwp files seems needed.

For me, in practical terms, when I get correspondence from a Korean speaker I usually have to ask them to convert hwp to docx since Hangul Word can do that.
Comment 15 Michael Weghorn 2023-01-19 07:00:42 UTC
*** Bug 108848 has been marked as a duplicate of this bug. ***
Comment 16 iyagi 2023-10-28 22:11:19 UTC
I installed the latest version, 7.6.2.1.
It is still unable to read the hwp format.

https://www.hancom.com/etc/hwpDownload.do
Is the information there incorrect?

Other websites offer conversions.
Why can't LibreOffice read it?