Bug 149621 - PDF documentation files with RC4 encryption for editing require password to open
Summary: PDF documentation files with RC4 encryption for editing require password to open
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): Inherited From OOo
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 113780
Depends on:
Blocks: Password-Protected
 
Reported: 2022-06-19 18:13 UTC by S.B.
Modified: 2022-07-06 13:51 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments

Description S.B. 2022-06-19 18:13:15 UTC
Description:
I downloaded a motherboard manual from ASUS for printing, and LibreOffice refuses to open it without a "password". Clicking "OK" or "Cancel" returns an I/O error. The file is perfectly accessible through Microsoft Edge with no password requirement. This is terribly frustrating. I need it printed through a program that can print in booklet format, and Edge cannot do this.

Steps to Reproduce:
1. Download a motherboard manual from ASUS.
2. Try to open it with LibreOffice Writer.
3. A password prompt appears. Neither "OK", "Cancel", nor checking "read only" allows access.

Actual Results:
No access to file.

Expected Results:
Document should at least open in read only mode.


Reproducible: Always


User Profile Reset: No


OpenGL enabled: Yes

Additional Info:
The Help option does not provide information on this issue.
Comment 1 Julien Nabet 2022-06-19 18:55:06 UTC Comment hidden (obsolete)
Comment 3 Julien Nabet 2022-06-20 17:32:16 UTC
Thank you for the feedback.
I confirm I can reproduce the problem too on a Debian x86-64 PC with master sources updated today.
Comment 4 Julien Nabet 2022-06-20 19:52:21 UTC
In checkEncryption, we go into "if( o_rIsEncrypted )" block
(see https://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/wrapper/wrapper.cxx?r=f71606c9#921).

That's because after line 918
o_rIsEncrypted = pPDFFile->isEncrypted()
"o_rIsEncrypted" is true.

I searched a bit for why isEncrypted() returns true.
This method is defined as:
1058  bool PDFFile::isEncrypted() const
1059  {
1060      return impl_getData()->m_bIsEncrypted;
1061  }

see https://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/pdfparse/pdfentries.cxx?r=776a1b9b&mo=33301&fi=1058#1058

Then, searching for what sets "m_bIsEncrypted" to true, I found it here:
PDFFile::impl_getData()
(see https://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/pdfparse/pdfentries.cxx?r=776a1b9b#1280).
1337 : m_pData->m_bIsEncrypted = true;

To reach this point, LO passed these checks:
1311              PDFDict::Map::iterator enc =
1312                  pTrailer->m_pDict->m_aMap.find( "Encrypt" );
1313              if( enc != pTrailer->m_pDict->m_aMap.end() )
=> there's indeed an "Encrypt" string in the pdf.

1328  PDFDict::Map::iterator filter = pDict->m_aMap.find( "Filter" );
...
1335                      if( filter != pDict->m_aMap.end() )
=> there's indeed a "Filter" string in the pdf.

Just opening the file with Vim at line 85:
trailer^M<</Size 11313/Prev 7344600/XRefStm 4936/Root 11232 0 R/Encrypt 11231 0 R/Info 1123 0 R/ID[<2DADBFE19AE011E2866A0016CB391DB2><C1BCA3469B3B11E2BB700016CB391DB2>]>>^Mstartxref^M0^M%%EOF^M                                    ^M11312 0 obj<</Length 2800/C 4232/E 4200/Filter/FlateDecode/I 4259/L 4216/O 4184/S 3803/T 4017>>stream^M

Now I don't know why other apps (e.g. Gimp) don't ask for anything, and the encryption part of the PDF spec isn't easy to understand (at least for me).
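
The first check described above (a trailer dictionary that carries an "Encrypt" key) can be illustrated with a small standalone sketch. This is not the LibreOffice parser: IndirectRef and findEncryptRef are hypothetical names, and the function only does a naive text scan for "/Encrypt <obj> <gen> R". It shows that the presence of the key says nothing yet about whether a password is actually needed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type: an indirect reference such as "11231 0 R".
struct IndirectRef {
    long objectNumber;
    long generation;
};

// Naive scan of raw trailer text for "/Encrypt <obj> <gen> R".
// Returns false when the key is absent or is not followed by an indirect
// reference (for example an inline dictionary, which this sketch ignores).
bool findEncryptRef(const std::string& trailer, IndirectRef& out)
{
    const std::string key = "/Encrypt";
    const std::size_t keyPos = trailer.find(key);
    if (keyPos == std::string::npos)
        return false;

    std::size_t pos = keyPos + key.size();
    auto skipSpace = [&]() {
        while (pos < trailer.size() &&
               std::isspace(static_cast<unsigned char>(trailer[pos])))
            ++pos;
    };
    // Reads one unsigned decimal number; fails if no digit is found.
    auto readNumber = [&](long& value) -> bool {
        skipSpace();
        const std::size_t start = pos;
        value = 0;
        while (pos < trailer.size() &&
               std::isdigit(static_cast<unsigned char>(trailer[pos]))) {
            value = value * 10 + (trailer[pos] - '0');
            ++pos;
        }
        return pos > start;
    };

    IndirectRef ref{};
    if (!readNumber(ref.objectNumber) || !readNumber(ref.generation))
        return false;
    skipSpace();
    if (pos >= trailer.size() || trailer[pos] != 'R')
        return false;
    out = ref;
    return true;
}
```

Applied to the trailer quoted above, this would report object 11231, generation 0, which is the encryption dictionary a reader then has to interpret before deciding whether to prompt.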
Comment 5 S.B. 2022-06-21 08:00:53 UTC
Unfortunately it has been a really long time since I last programmed anything and my language of choice was Delphi (Turbo Pascal).  It would take me a long time to get up to speed on this by myself.
My thoughts are that "encryption" should not necessarily mean that a file is unreadable without a password, only that it may have access levels based on editor and viewer permissions. If LibreOffice does not distinguish between the permission types and levels of access, lumps everything under "encrypted", and requires a password at every level instead of letting an access level be chosen (which should then request a password based on access rights), then something has not been coded properly.

Here the encryption basically prevents editing the contents without editing rights. If the file allows general read access, then decryption of the contents for read-only purposes should be automatic (Gimp's behaviour?), while editing should require the editor's password (e.g. when someone tries to type in the document, a dialogue should come up requesting the editor's password *OR* a notice should indicate that the file is "read only").

Without actually seeing the PDF specifications, I am going to assume that permissions should be like this:

Open (editable)
Read Only (Viewable but not editable without editor password)
Private (Not viewable without read password but editable once viewable)
Private Read Only (Not viewable without Read password Not editable without editor password)

Anything Read Only or above will contain encryption.

So maybe there is a section in the specification on access permissions?
Comment 6 S.B. 2022-06-21 08:21:57 UTC
Basically the flow should go something like this...

Open File
Check for encryption
If encryption is True, determine access permissions
Parse based upon permission Type(s)
e.g.
Type One: Read Only - Decrypt file but deny editing
Type Two: Read Only, Private Edit - Decrypt file but require Edit password upon edit attempt.
Type Three: Private - Require Read Password to decrypt allow edit
Type Four: Private, Private Edit - Require Read Password, Require edit password upon edit attempt.
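
A rough sketch of this flow in C++ follows. It maps the four types onto the PDF standard security handler, which distinguishes a user (open) password from an owner (permissions) password and stores permission flags in the /P entry. All names (Permissions, decodePermissions, OpenAction, decideOpenAction) are hypothetical, and the flag emptyUserPasswordWorks is assumed to be computed elsewhere by the actual password check. This is not how LibreOffice currently behaves.

```cpp
#include <cassert>
#include <cstdint>

// A few of the permission flags stored in /P (bit positions are 1-based,
// counted from the low-order bit).
struct Permissions {
    bool print;
    bool modify;
    bool copy;
    bool annotate;
};

Permissions decodePermissions(std::int32_t p)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(p);
    Permissions perms;
    perms.print    = (bits & (1u << 2)) != 0;  // bit 3: print
    perms.modify   = (bits & (1u << 3)) != 0;  // bit 4: modify contents
    perms.copy     = (bits & (1u << 4)) != 0;  // bit 5: copy or extract
    perms.annotate = (bits & (1u << 5)) != 0;  // bit 6: add or modify annotations
    return perms;
}

enum class OpenAction { OpenEditable, OpenReadOnly, AskForPassword };

// Assumption: emptyUserPasswordWorks is the result of trying the empty
// user password against the encryption dictionary.
OpenAction decideOpenAction(bool encrypted, bool emptyUserPasswordWorks,
                            std::int32_t p)
{
    if (!encrypted)
        return OpenAction::OpenEditable;
    if (!emptyUserPasswordWorks)
        return OpenAction::AskForPassword;   // Types Three and Four
    return decodePermissions(p).modify
               ? OpenAction::OpenEditable
               : OpenAction::OpenReadOnly;   // Types One and Two
}
```

For example, a /P value of -44 has the print and copy bits set but not the modify bit, so a file that accepts the empty user password would open read-only under this sketch instead of prompting.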
Comment 7 Julien Nabet 2022-06-21 20:14:30 UTC
Sorry, my comment wasn't meant for you specifically but for LO devs. But of course, if you know how to code and would like to get involved, don't hesitate to contribute! :-)
If you're interested, this link may help:
https://wiki.documentfoundation.org/Development/GetInvolved
LO is mainly (at least 95%) coded in C++.

As for the PDF spec, you can find it here:
https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf
This is the Adobe version; there is also an ISO version, but you must pay for it, if I read the forums correctly.
There seem to be some slight differences between them, but I'm not sure they are relevant here.
Comment 8 S.B. 2022-06-22 04:09:38 UTC Comment hidden (obsolete)
Comment 9 Julien Nabet 2022-06-22 16:51:39 UTC
(In reply to S.B. from comment #8)
> That's OK :-)  The top of page 59 in the specification guide may be a clue
> as to where the failure is occurring (detection of the default user password
> which should trigger an automatic decryption).  Perhaps check the routine in
> LibreOffice that reads the "padding string" to make sure it is being read
> correctly?

I don't have the patience to understand the whole mechanism, but now that we have a code pointer and a spec, I suppose someone (who knows coding) may give it a try.
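
For whoever picks this up, here is a minimal sketch of the "padding string" step mentioned in the quoted comment, as described for the standard security handler in the PDF spec: the password is truncated or padded to exactly 32 bytes, so an empty user password becomes the padding string itself. The names kPadding and padPassword are hypothetical, and the later key-derivation steps (MD5, RC4) are deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// The 32-byte padding string used by the PDF standard security handler.
const std::array<std::uint8_t, 32> kPadding = {{
    0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
    0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
    0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
    0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A
}};

// Take up to 32 bytes of the password, then fill the remainder with the
// leading bytes of the padding string.
std::array<std::uint8_t, 32> padPassword(const std::string& password)
{
    std::array<std::uint8_t, 32> out;
    const std::size_t n = password.size() < 32 ? password.size() : 32;
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>(password[i]);
    for (std::size_t i = n; i < 32; ++i)
        out[i] = kPadding[i - n];
    return out;
}
```

A reader that tries padPassword("") first, before showing any dialog, would cover the case where only an owner password was set.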
Comment 10 S.B. 2022-06-23 03:03:13 UTC Comment hidden (no-value)
Comment 11 Timur 2022-07-06 10:21:28 UTC
This is a duplicate of bug 113780. A search is needed before reporting and confirming.
Now that this mistake has been made, I'll mark the duplicate the other way around, because of the info here.
Comment 12 Timur 2022-07-06 10:23:01 UTC
*** Bug 113780 has been marked as a duplicate of this bug. ***
Comment 13 Michael Warner 2022-07-06 13:51:07 UTC
I marked Bug 55425 as See Also, because these are different bugs but related, and I expect the proposed solution to 55425 will also resolve this.