Created attachment 57890 [details] Patch to revert to 3.4.5 unicode type checking It appears that some of the Python 3 compatibility changes (I believe from a09ce46818fd4d5e08b3af9a478501cd8ef5b4fe) break compatibility with the included 2.6.1 interpreter. It looks like the bug was present in any 3.5.x release but I was affected by the the python uno issues in 3.5.0 (bug 45696) so this is the first 3.5 series libreoffice I've run the script with. I've attached a small patch essentially returning to the 3.4.5 tests that corrects the issue, with the trade-off of no longer being Python 3 compatible. For example, the Char class in uno.py in 3.5.1RC1 has: # @param value pass a Unicode string with length 1 def __init__(self,value): assert isinstance(value, str) assert len(value) == 1 self.value=value Now, in Python 3, "str" will work for a unicode string, but not in Python 2.6.1. I ran into this with an existing Writer script that that copies the DecimalChar from an existing tab stop to a new tab stop. Even just referencing the prior DecimalChar value fails at this point, since it is a unicode string. So you can no longer retrieve or round trip Char values, for example. In 3.4.5 this code tested against "unicode" (or in some cases the tuple of str and unicode). The Char class also has similar failure points in its __eq__ method as does Bool (the latter when checking for literal true/false strings). In general, the switch from unicode to string doesn't appear safe with Python 2.x, especially as the Libreoffice internals are returning unicode. From looking at the changeset reference above, it seems that these issues are localized to uno.py. I'm also not sure if all needed changes were covered anyway, since ByteString is still using "str" but I don't think that's accurate for Python 3. I'm not positive of the best way to be compatible with 2.x and 3.x with a single source easily, as neither unicode nor basestring is in 3. Perhaps just defining a dummy unicode class in Py 3 and then always checking either str or unicode. Or perhaps more cleanly, just separating out the classes that need to care into separate modules that are imported into the uno wrapper namespace appropriately. But even without an immediately solution for Py3, I'd think that if given the choice at the moment, of the two, compatibility with the included Python 2.6.1 interpreter should be given preference. -- David
Created attachment 58037 [details] patch should work both with Python 2 and 3
hi David, thanks for your patch, but it seems to me that it would break Python 3, because that has no "unicode" type. i've tried to change it in a way that it works with both Python 2 and 3, see attachment. tested with a little program on the bundled Python 2 (but i think my program never used the broken features since it worked before); please tell me if this fixes the problem for you.
Yes, agreed, as submitted the patch just reverted to LO 3.4.5 behavior but wasn't compatible with Python 3. With respect to the new patch, it is a little more permissive under Python 2 than LO 3.4.5 was, in that it will let non-unicode strings through Char's constructor whereas before it wouldn't. Not sure how big a deal that is since the extension module re-checks. Actually, having been poking at the source since opening this issue, given the checks occurring within the extension module, it's not entirely clear to me why the type checking in needed in the uno.py wrapper in the first place. It seems like PyChar2Unicode (in pyuno_type.cxx) is doing exactly the same checks in C, so another alternative would be to just remove the assertion entirely. -- David
Created attachment 58044 [details] Updated Python 3 compatible patch I did some testing with the second patch (and with no checks at all) and found I can pass through regular (non-Unicode) strings in Python 2 without the errors I expected, and the internal "value" field remains a "str", so I'm not exactly sure what is happening at the uno C++ layer without further investigation. It's probably more conservative to keep the same type checks (enforce only unicode on construction) for now as existed in 3.4.5. The attached patch leaves the checks the same as in 3.4.5/3.5.0, but is Python 3 compatible by aliasing unicode to str in that case. (It's essentially the same as my first patch but with the unicode definition) -- David
David Bolen committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=4634cbc237239da771e0f6a81f78171ecec726ba fdo#46859: adapt string type checks to work with both Python 2 and 3
@mst: The commit suggests this is RESOLVED/FIXED, please update if that is not a misconception.
David Bolen committed a patch related to this issue. It has been pushed to "libreoffice-3-5": http://cgit.freedesktop.org/libreoffice/core/commit/?id=fc290187f08981c734d1f2d3f6649c94e3ac6f99&g=libreoffice-3-5 fdo#46859: adapt string type checks to work with both Python 2 and 3 It will be available in LibreOffice 3.5.2.