Bug 116412 - Bundled python fails to import bz2 on Windows
Summary: Bundled python fails to import bz2 on Windows
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 6.0.2.1 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:24.2.0
Keywords:
Depends on:
Blocks: Macro-Python
Reported: 2018-03-15 02:36 UTC by Takeshi Abe
Modified: 2024-02-10 16:44 UTC (History)

See Also:
Crash report or crash signature:


Description Takeshi Abe 2018-03-15 02:36:07 UTC
Description:
Importing bz2 module of Python fails with the following error message:
ImportError: No module named '_bz2'

Steps to Reproduce:
1. Start Command Prompt
2. > cd "C:\Program Files\LibreOffice\program"
3. > python
4. >>> import bz2

Actual Results:  
Got an ImportError:
---
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\LibreOffice\program\python-core-3.5.4\lib\bz2.py", line 22, in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
ImportError: No module named '_bz2'
>>>
---

Expected Results:
Module bz2 is imported successfully.
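
The traceback shows that bz2.py itself is found and only the compiled _bz2 module is missing. A minimal sketch for checking this from the bundled interpreter (has_module is a hypothetical helper name, not something from this report):

```python
import importlib.util

def has_module(name):
    """Return True if the import system can locate `name` (without importing it)."""
    return importlib.util.find_spec(name) is not None

# bz2.py is the pure-Python wrapper; _bz2 is the compiled extension it needs.
for name in ("bz2", "_bz2"):
    print(name, "found" if has_module(name) else "missing")
```

On an affected installation one would expect bz2 to be reported as found and _bz2 as missing.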


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 6.0.2.1 (x64)
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 8; OS: Windows 10.0; UI render: default; 
Locale: en-US (en_US); Calc: group

On the other hand, importing bz2 succeeds with python of macOS version of LibO 6.0.2.


User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Comment 1 Buovjaga 2018-03-16 14:07:57 UTC
Repro.

Version: 6.0.2.1 (x64)
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: fi-FI (fi_FI); Calc: group
Comment 2 QA Administrators 2019-03-17 03:51:38 UTC Comment hidden (noise)
Comment 3 Takeshi Abe 2019-03-18 00:57:23 UTC
Still reproducible with:

Version: 6.2.1.2 (x64)
Build ID: 7bcb35dc3024a62dea0caee87020152d1ee96e71
CPU threads: 8; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded
Comment 4 Kim 2020-06-30 09:44:29 UTC
This problem persists when importing bz2:


Python 3.7.7 (default, Jun 24 2020, 22:27:58) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\LibreOffice\program\python-core-3.7.7\lib\bz2.py", line 19, in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
ModuleNotFoundError: No module named '_bz2'


I found this when trying to use Pandas.


The problem is unique to the version of Python included with LibreOffice.



Additional info
Version: 6.4.5.2 (x64)
Build ID: a726b36747cf2001e06b58ad5db1aa3a9a1872d6
CPU threads: 12; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: en-GB (en_GB); UI-Language: en-GB
Calc: CL
Comment 5 elmau 2020-07-11 14:49:41 UTC
Python 3.7.7 (default, Jul  3 2020, 17:23:00) [MSC v.1925 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
>>>

1) Install Python 3.7.7
2) Copy from: C:\Program Files\Python37\DLLs\_bz2.pyd
3) Paste in: C:\Program Files\LibreOffice\program\python-core-3.7.7\lib

But... I don't understand why so many Python libraries are trimmed from LibreOffice for Windows, sqlite3 for example :(
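
The copy step above can be scripted. This is only a sketch under assumptions: copy_extension is a hypothetical helper, and the .pyd has to come from a Python of the same minor version and architecture as the one bundled with LibreOffice.

```python
import shutil
from pathlib import Path

def copy_extension(src_dir, dst_dir, filename="_bz2.pyd"):
    """Copy one compiled module from src_dir into dst_dir and return the new path."""
    src = Path(src_dir) / filename
    if not src.is_file():
        raise FileNotFoundError(src)
    # copy2 returns the path of the file it created inside dst_dir.
    return Path(shutil.copy2(src, dst_dir))

# Example call with the paths from the steps above
# (writing under "Program Files" typically needs administrator rights):
# copy_extension(r"C:\Program Files\Python37\DLLs",
#                r"C:\Program Files\LibreOffice\program\python-core-3.7.7\lib")
```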
Comment 6 elmau 2020-09-16 19:43:16 UTC
Still reproducible with:

ArchLinux
LibreOffice appImage
Version: 7.0.1.2
Build ID: 7cbcfc562f6eb6708b5ff7d7397325de9e764452
CPU threads: 8; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Python 3.7.7 (default, Aug 27 2020, 22:20:34) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/.mount_LibreOMUTeE3/opt/libreoffice7.0/program/python-core-3.7.7/lib/bz2.py", line 19, in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
ModuleNotFoundError: No module named '_bz2'
>>>
Comment 7 Thomas Viehmann 2020-11-19 09:35:12 UTC
I think I found the reason here, at least for the Linux side:
The Python bundled with LibreOffice doesn't include an "arch triple" when looking for binary (.so) modules: a typical wheel ships foo.cpython-37m-x86_64-linux-gnu.so, but LibreOffice only looks for foo.cpython-3.7m.so.

I'll look at making a patch.
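
For anyone who wants to check which file-name suffixes a given interpreter accepts for compiled modules, a short sketch using only the standard library (the output depends on the build, so none is shown here):

```python
import importlib.machinery
import sysconfig

# File-name suffixes this interpreter accepts for compiled extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)

# The ABI tag baked into the build; may be None on some platforms.
print(sysconfig.get_config_var("SOABI"))
```

Running this in both the bundled Python and a system Python shows whether the two agree on the suffix.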
Comment 8 Thomas Viehmann 2020-11-19 11:03:29 UTC
This isn't the bug I met, so I'll take it back.
The root cause here is that LibreOffice drops C extension modules from the standard library. One remedy could be to put them into whatever extension needs them (but it's a mess).

There is another aspect that external modules are not as easy, but this isn't the core of this bug report, I guess, so I'll not hijack it.
Comment 9 Julien Nabet 2022-01-22 16:12:46 UTC
I could make "import bz2" work with these changes:
diff --git a/external/python3/ExternalPackage_python3.mk b/external/python3/ExternalPackage_python3.mk
index faddf06fc36a..cadb3bf3e9f8 100644
--- a/external/python3/ExternalPackage_python3.mk
+++ b/external/python3/ExternalPackage_python3.mk
@@ -64,6 +64,7 @@ $(eval $(call gb_ExternalPackage_add_files,python3,$(LIBO_BIN_FOLDER)/python-cor
        LO_lib/_bisect.$(python3_EXTENSION_MODULE_SUFFIX).so \
        LO_lib/_blake2.$(python3_EXTENSION_MODULE_SUFFIX).so \
        LO_lib/cmath.$(python3_EXTENSION_MODULE_SUFFIX).so \
+       LO_lib/_bz2.$(python3_EXTENSION_MODULE_SUFFIX).so \
        LO_lib/_codecs_cn.$(python3_EXTENSION_MODULE_SUFFIX).so \
        LO_lib/_codecs_hk.$(python3_EXTENSION_MODULE_SUFFIX).so \
        LO_lib/_codecs_iso2022.$(python3_EXTENSION_MODULE_SUFFIX).so \
diff --git a/external/python3/UnpackedTarball_python3.mk b/external/python3/UnpackedTarball_python3.mk
index 31b6a166e6ae..68490e2e9fee 100644
--- a/external/python3/UnpackedTarball_python3.mk
+++ b/external/python3/UnpackedTarball_python3.mk
@@ -23,7 +23,6 @@ $(eval $(call gb_UnpackedTarball_add_patches,python3,\
        external/python3/python-3.3.0-darwin.patch.1 \
        external/python3/python-3.8-msvc-sdk.patch.1 \
        external/python3/python-3.7.6-msvc-ssl.patch.1 \
-       external/python3/python-3.5.4-msvc-disable.patch.1 \
        external/python3/ubsan.patch.0 \
        external/python3/python-3.5.tweak.strip.soabi.patch \
        external/python3/darwin.patch.0 \

This last part removes the patch that disables a lot of Python libraries; in other words, it enables them.

I suppose shipping the minimum allows lighter packages, but if Python is provided for macro scripting, I can understand the need for a full Python, not just its core.
Also, if people have already installed Python, it's a pity that we force them to include it again in LO.
=> In brief, I don't have a strong opinion about shipping only the minimum libs in the Python embedded in LO.

Perhaps the starting point would be: what was the goal of including Python in the first place? Macro scripting? LO QA tests in Python instead of Java? Something else?
Could people provide pros and cons, or should this perhaps be discussed in the ESC?

Jan-Marek/Michael/David: any thoughts here?
Comment 10 Jan-Marek Glogowski 2022-06-15 11:51:30 UTC
Python UNO is a shared library that uses CPython memory functions, which have different signatures between CPython versions. LO ships just a single Python UNO shared lib, built for our own Python. Linux distros build LO against the system Python, so they don't have this restricted-modules problem.

I don't have any opinion on the disabled modules. Probably someone wanted to keep a low(er) security profile? Maybe additional dependencies would be needed?

And the LO UI unit tests use Python and it's needed for the build, but can be disabled for scripting.

Implementation options:

1. Ship additional Python UNO shared libraries so external Python can be used
   It's probably a lot of work to implement in gbuild; IMHO "impossible"

2. Switch UNO CPython symbol lookup from compile to runtime (dlsym, etc.)
   This should generally be possible, is much easier and in theory should be a long-weekend project, provided the needed functions aren't inlined. Then a user could simply install any CPython and use it for scripting, avoiding the problem altogether.

And you need to find somebody who actually cares about this ;-)
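
To illustrate what option 2 means by runtime symbol lookup, here is a sketch in Python rather than in the C++ of Python UNO. It uses ctypes.pythonapi, which resolves exported CPython symbols by name when they are first accessed, much as dlsym or GetProcAddress would. has_symbol is a hypothetical helper, and none of this is LibreOffice code.

```python
import ctypes

# Resolve an exported CPython function by name at runtime instead of link time.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p
get_version.argtypes = []
print(get_version().decode())

def has_symbol(name):
    """Return True if the running CPython exports a symbol called `name`."""
    return hasattr(ctypes.pythonapi, name)

# Inlined functions and macros are not exported, so a lookup for them fails;
# that is the caveat mentioned in option 2.
print(has_symbol("PyMem_Malloc"))
```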
Comment 11 Mike Kaganski 2022-08-16 07:48:16 UTC
(In reply to Julien Nabet from comment #9)
> Perhaps the starting point would be: what was the goal of including Python in the first place?

Note that some core functionality (other than scripting) depends on Python. At least:
1. Spell checking
2. NatNum (and ordinal/cardinal numbering in lists)
3. Sending mail in mail merge

See bug 144902, where a Python malfunction crashes LibreOffice.
Comment 12 flywire 2022-08-16 12:25:13 UTC
I looked at https://docs.python.org/3/extending/embedding.html last year regarding installing pip. In my view, just including https://bootstrap.pypa.io/get-pip.py in the Windows installation would make the biggest difference. That said, I recall pip is not designed to work on embedded Python, though it seems to, and there is no guarantee it won't fail at some stage.

Is the distribution size really such an issue these days? An alternative would be to remove Python completely and require users to install a full system Python if they want to use it (i.e. do no more than check that the user has installed it). This could be an issue in corporate environments with firewalls etc.
Comment 13 Mike Kaganski 2022-08-17 07:07:42 UTC
(In reply to flywire from comment #12)
> Is the distribution size really such an issue these days?

FTR: the full 64-bit Python installer from https://www.python.org/downloads/windows/ is around 27.5 MB. I can't directly check the size that Python core takes in our MSI, but compressing content of program/python-core-3.8.12 using 7-zip (using ZIP ultra compression) gives around 6 MB. Not a big deal *IMO* - but there are calls like tdf#97991.
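
The size estimate can be reproduced roughly without 7-zip. A sketch, assuming Python's built-in deflate at level 9 is an acceptable stand-in for 7-zip's ZIP ultra setting (the numbers will differ somewhat); zipped_size is a hypothetical helper:

```python
import io
import zipfile
from pathlib import Path

def zipped_size(root, level=9):
    """Return the size in bytes of an in-memory ZIP of every file under root."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=level) as zf:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                # Store paths relative to root so the archive layout is stable.
                zf.write(path, path.relative_to(root).as_posix())
    return len(buf.getvalue())

# Example call (path taken from the comment above, relative to the install dir):
# print(zipped_size("program/python-core-3.8.12") / 1e6, "MB")
```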

My personal take would be to raise requests such as "include get-pip in the distribution", "include bz2 in the distribution", etc. one by one, *by bringing the topic to ESC* as the need arises, because each package is indeed an attack surface for the project to consider and maintain.

> An alternative would be to completely remove python and require the user to
> install a full system python

Just no. Please see above.
Comment 14 vibrationoflife 2022-08-18 15:19:41 UTC
I am the author of OOO Development Tools (https://github.com/Amourspirit/python_ooo_dev_tools).

I had hoped that any pypi.org package could be installed into LO. This is not the case. I have found it hit and miss which packages will work, and it differs between Windows and Linux. Any package with dependencies not included in the embedded Python simply fails.

For instance, there are some wonderful GUI packages for Python that simply cannot be installed in LO's embedded Python due to its limitations.

I wrote a tool just to help developers (myself included) who are writing Python scripts for LO (https://github.com/Amourspirit/python_lo_dev_search).
Sadly, this tool can't be installed in an environment set up for LO on Windows because there is no sqlite3 support (a limitation of the embedded Python).

In my documentation I have to show users how to hack a virtual environment on Windows and Linux, with Windows being more of an ugly hack, to get an environment set up for developing more complex scripts (https://python-ooo-dev-tools.readthedocs.io/en/latest/).

I have been considering writing some sort of custom installer to allow a macro to install its own dependencies. I have let this idea float around in my head for several months as I work on other parts. I am leaning towards not developing this sort of installer due to the complexities of embedded Python and the differences between the Windows and Linux implementations of Python.

Personally, I see Python as the way forward for LO scripting, in part because of the vast number of supporting libraries that could be installed if LO shipped with a full version of Python.

In short, I am advocating that LO ship with full Python.
Comment 15 Mike Kaganski 2022-08-18 16:42:28 UTC
FTR: https://git.cuates.net/elmau/zaz-pip
Comment 16 flywire 2022-08-28 01:43:52 UTC
(In reply to Mike Kaganski from comment #13)
> (In reply to flywire from comment #12)
> > An alternative would be to completely remove python and require the user to
> > install a full system python
> 
> Just no.
The model of users installing dependency packages is already adopted, eg https://books.libreoffice.org/en/GS73/GS7308-GettingStartedWithBase.html#toc5
> Some Base features... require that a Java Runtime Environment (JRE) is installed.
Comment 17 elmau 2022-09-06 17:42:52 UTC
Still reproducible with:

Version: 7.4.0.3 (x64) / LibreOffice Community
Build ID: f85e47c08ddd19c015c0114a68350214f7066f5a
CPU threads: 16; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: es-MX (es_MX); UI: es-ES
Calc: threaded
Comment 18 Xisco Faulí 2022-09-07 10:47:51 UTC
I created a simple patch that stops disabling bz2 in external/python3/python-3.5.4-msvc-disable.patch.1 -> https://gerrit.libreoffice.org/c/core/+/139580
However, _bz2.vcxproj has some additional dependencies:

xisco@xisco:~/cpython/PCbuild$ git grep bz2 .
_bz2.vcxproj:    <RootNamespace>bz2</RootNamespace>
_bz2.vcxproj:      <AdditionalIncludeDirectories>$(bz2Dir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
_bz2.vcxproj:    <ClCompile Include="..\Modules\_bz2module.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\blocksort.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\bzlib.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\compress.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\crctable.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\decompress.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\huffman.c" />
_bz2.vcxproj:    <ClCompile Include="$(bz2Dir)\randtable.c" />
_bz2.vcxproj:    <ClInclude Include="$(bz2Dir)\bzlib.h" />
_bz2.vcxproj:    <ClInclude Include="$(bz2Dir)\bzlib_private.h" />

which I don't know how to add to the Python-3.8.12.tar.xz file.
Comment 19 Commit Notification 2023-07-30 07:20:44 UTC
Taichi Haradaguchi committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5e3510dbb62229cfb01da371d39ecc27b0d44880

tdf#116412: include bz2 in internal python

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 20 elmau 2023-08-04 17:04:45 UTC
Still reproducible with:

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 2b0b4ddc8bd8fdd4cd689300620fe4621d7533b7
CPU threads: 16; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-MX (es_MX); UI: es-ES
Calc: threaded
Comment 21 Mike Kaganski 2023-08-05 03:08:08 UTC
More specifically, this is the error in the current master:

APSO python console [LibreOfficeDev]
3.8.17 (default, Aug  2 2023, 07:20:18) [MSC v.1936 64 bit (AMD64)]
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "C:\lo\build\instdir\program\uno.py", line 346, in _uno_import
    return _builtin_import(name, *optargs, **kwargs)
  File "C:\lo\build\instdir\program\python-core-3.8.17\lib\bz2.py", line 19, in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
  File "C:\lo\build\instdir\program\uno.py", line 425, in _uno_import
    raise uno_import_exc
  File "C:\lo\build\instdir\program\uno.py", line 346, in _uno_import
    return _builtin_import(name, *optargs, **kwargs)
ImportError: No module named '_bz2' (or '_bz2.BZ2Compressor' is unknown)
Comment 22 Commit Notification 2023-08-07 03:59:33 UTC
Taichi Haradaguchi committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5cd48b4969d25400cc6634fb64706a763528ec65

Revert "tdf#116412: include bz2 in internal python"

It will be available in 24.2.0.

Comment 23 Commit Notification 2023-10-14 06:53:31 UTC
Taichi Haradaguchi committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/c72d5d787f7a3024f2108d6d6e192b158fb144ed

tdf#116412 include bz2 module in internal python

It will be available in 24.2.0.

Comment 24 elmau 2023-10-16 20:45:45 UTC
Fixed in:

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 2b8b6ced7c67e6a56f06b02e92f0555a796f3b16
CPU threads: 16; OS: Linux 6.5; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded


Many thanks.
Comment 25 vibrationoflife 2023-10-29 17:31:17 UTC
I downloaded the latest nightly and the issue still persists.
Cannot import _bz2.

Pandas requires _bz2, and I just wrote an extension for Pandas.
https://extensions.libreoffice.org/en/extensions/show/41998

My solution is to figure out which version of embedded Python the user's LibreOffice has and, from that, download the matching embedded Python while the extension is installing. I then extract _bz2.pyd from the downloaded embedded version and copy it to a known site-packages folder, making it available to LibreOffice Python.

This solution is a fix to get _bz2 into LibreOffice, but not the most practical one.
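
A rough sketch of the download-and-extract idea described above. This is not the extension's actual code: embed_url and extract_member are hypothetical names, and the URL pattern for python.org's Windows embeddable packages is an assumption that does not hold for every release (python.org may publish no Windows binaries for security-only releases).

```python
import io
import sys
import zipfile
from pathlib import Path

def embed_url(version, arch="amd64"):
    """Build the download URL of a Windows embeddable package (assumed pattern)."""
    return f"https://www.python.org/ftp/python/{version}/python-{version}-embed-{arch}.zip"

def extract_member(zip_bytes, member, dest_dir):
    """Write one file from an in-memory ZIP archive into dest_dir."""
    dest = Path(dest_dir) / member
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        dest.write_bytes(zf.read(member))
    return dest

# Version of the interpreter running this code, as major.minor.micro.
version = ".".join(map(str, sys.version_info[:3]))
print(embed_url(version))
```

The archive bytes could then be fetched with urllib.request.urlopen(embed_url(version)).read() and passed to extract_member(data, "_bz2.pyd", target_dir), where target_dir is whichever site-packages folder LibreOffice's Python searches.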

I also just wrote an extension adding sqlite3 back into LibreOffice (Windows) using the same technique.
https://extensions.libreoffice.org/en/extensions/show/41999

Why do we not just add the full embedded Python into LibreOffice or, even better if possible, full Python?
Comment 26 Julien Nabet 2023-10-29 17:36:13 UTC
(In reply to vibrationoflife from comment #25)
> I downloaded the latest nightly and the issue still persists.
> Cannot import _bz2.
>...
Did you download a nightly build from 24.2 branch? (so from https://dev-builds.libreoffice.org/daily/master/)

Indeed, the other branches are not fixed; e.g. downloading a nightly build from the 7.6 branch won't help.
Comment 27 elmau 2023-10-29 17:57:08 UTC
Fixed in:

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 7fff4e2ca6739928f72e5f0d2eb5820823916769
CPU threads: 16; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-MX (es_MX); UI: es-ES
Calc: threaded


APSO python console [LibreOfficeDev]
3.8.18 (default, Oct 28 2023, 05:10:31) [MSC v.1929 64 bit (AMD64)]
Type "help", "copyright", "credits" or "license" for more information.
>>> import bz2
>>>
Comment 28 vibrationoflife 2023-10-29 21:28:29 UTC
(In reply to Julien Nabet from comment #26)
> (In reply to vibrationoflife from comment #25)
> > I downloaded the latest nightly and the issue still persists.
> > Cannot import _bz2.
> >...
> Did you download a nightly build from 24.2 branch? (so from
> https://dev-builds.libreoffice.org/daily/master/)
> 
> Indeed, the other branches are not fixed; e.g. downloading a nightly build
> from the 7.6 branch won't help.

It seems I downloaded the wrong nightly.
I just tried again and it seems to be working:

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 7fff4e2ca6739928f72e5f0d2eb5820823916769
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded

My extension handles this nicely.
From the log file: DEBUG - _bz2 is already installed. Skipping _bz2 install:  C:\Program Files\LibreOfficeDev 24\program\python-core-3.8.18\lib\_bz2.pyd

This makes my extension backwards compatible to LO 7.0.

BTW, it will still not be possible to pip install Pandas on Mac even with the _bz2 fix.
There are CPython naming issues. See:
https://github.com/Amourspirit/python-libreoffice-pip/wiki/pyproject.toml#sym_link_cpython

I managed to fix this issue in my extension by creating symbolic links.
Comment 29 Julien Nabet 2023-10-30 07:47:49 UTC
(In reply to vibrationoflife from comment #28)
> ...
> I managed to fix this issue in my extension by creating symbolic links.

Perhaps you may be interested in contributing to LibreOffice directly?
If yes, you can start with https://wiki.documentfoundation.org/Development/GetInvolved.