Bug 138109 - Create and use a dedicated Length C++ type
Summary: Create and use a dedicated Length C++ type
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium normal
Assignee: Tomaz Vajngerl
URL:
Whiteboard:
Keywords:
: 166877 (view as bug list)
Depends on:
Blocks: Dev-related
  Show dependency treegraph
 
Reported: 2020-11-10 10:20 UTC by Mike Kaganski
Modified: 2025-10-24 02:00 UTC (History)
9 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Kaganski 2020-11-10 10:20:57 UTC
We use length measures intensively throughout the codebase. We use millimeters, inches, hundreds of mm, cm, m, twips, points, EMUs, and many others. We convert those in all directions.

I suggest to create a dedicated class (in tools?) that would represent generic length; consistently using such class in the code would allow to minimize rounding errors from repeated conversions (there should ideally be only two conversions: when a length is read from a file, and when it's output), and prevent confusion when a number (representing a length) is received from some function, and it's hard to know what units should be used with the number.

It possibly should use some fixed internal unit for length, and some fixed type for the value. It should provide convenience static and member functions for conversion to/from different units, which would also provide overloads for different resulting typed (esp. to differentiate floating-point results vs integer results, with rounding for the latter). It should also provide user-defined literals for simplifying definitions of things like 2_mm or 25_EMU. It needs to allow usual arithmetical operations.

Subject to discussion, I propose to use mm100 as the class' base units, and double as the type for the stored value. That allows to represent any whole number of mm100s up to 2^53 (i.e., up to 90 071 992 547 meters); it will also allow to unambiguously represent any reasonable whole value in EMUs (greater than 20 million meters).

The task would be first to create such a class, and start using it; then gradually make more and more code use it, passing it around, and avoid conversions except in places of input and output.
Comment 1 Tomaz Vajngerl 2020-11-12 09:20:45 UTC
I already worked on something like [1]. The difference to your suggestion is that the base are EMUS (it's what they were created for - common denominator between imperial and metric) stored as int64 (but there is also a double version, still using EMU as the base).
 

[1] https://cgit.freedesktop.org/libreoffice/core/commit/?h=private/tvajngerl/staging&id=21ab2d10b05cae42b465bff3420705360c49a81a
Comment 2 Mike Kaganski 2020-11-12 09:24:34 UTC
(In reply to Tomaz Vajngerl from comment #1)
> I already worked on something like [1].

Oh. I must had overlooked or not knew about that ... and given that that wasn't merged into master, that turned out more problematic that I imagined - so the "easyhack" idea seem wrong here.
Comment 3 Tomaz Vajngerl 2020-11-12 09:34:56 UTC
I was still experimenting with this outside of LO, so I didn't yet push it anywhere, but as you filled a bug, I added it into LO and pushed it to my staging repository.
Comment 4 Mike Kaganski 2020-11-12 09:45:34 UTC
(In reply to Tomaz Vajngerl from comment #3)

Nice! (Unfortunately I can't comment somewhere in gerrit, so commenting here: I suppose that value() there could better be renamed to _emu(), to correspond to other getters there?)
Comment 5 Mike Kaganski 2020-11-12 09:49:21 UTC
(In reply to Mike Kaganski from comment #4)

... and I'd also suggest making ctor taking the number private, and introduce emu() static function constructing it, so that just any creation of the length would explicitly tell which units are used?
Comment 6 Roman Kuznetsov 2021-04-13 11:36:16 UTC
Let's set it as NEW
Comment 7 Telesto 2025-06-06 13:45:45 UTC
*** Bug 166877 has been marked as a duplicate of this bug. ***
Comment 8 nuke 2025-10-17 16:09:33 UTC
Is anyone working on this bug? Or maybe it's already been fixed? I'd be interested in doing so.
Comment 9 Buovjaga 2025-10-17 16:23:22 UTC
(In reply to nuke from comment #8)
> Is anyone working on this bug? Or maybe it's already been fixed? I'd be
> interested in doing so.

The report is not in assigned status, so you can work on it.
Comment 10 nuke 2025-10-20 12:21:51 UTC
Am I correct in understanding that the functions for converting length measures already exist and are actively used in the code, just like the enum class for all length measures already exists in o3tl, and that the goal is specifically to create a class for the length measures?
I was just a little confused when I saw all the functions and templates in tools and o3tl
Comment 11 Mike Kaganski 2025-10-20 12:36:07 UTC
(In reply to nuke from comment #10)

Exactly. The class must contain a single numeric field for its value; and its purpose is to simplify, unify and guarantee correctness of all unit conversions. The user-defined literals mentioned in comment 0 are one such improvement.

Currently, I suggest to use a 64-bit integer, and standardize on EMUs as the base unit.
Comment 12 nuke 2025-10-20 14:09:03 UTC
(In reply to Mike Kaganski from comment #11)
> (In reply to nuke from comment #10)

> Currently, I suggest to use a 64-bit integer, and standardize on EMUs as the
> base unit.

Yes, I was thinking in same direction. I did have some doubts about converting EMUs to kilometers (the number seemed ridiculously large even for int64), but after doing the math, those worries disappeared :)
Comment 13 Tomaz Vajngerl 2025-10-21 02:36:16 UTC
The implementation for this length type is pretty much done including a great number of tests and its corresponding range2D (rect), size and tuple classes (can't do anything if those aren't provided as well). It lives in the branch private/tvajngerl/length [1] and mainly in commit [2] - it's just that it used in anything I think that is too useful yet, so it not in master. My plan was to use it somewhere for something first and then introduce it into master, but anything that I try to change opens a can of worms :) 

But if the classes itself would be useful already I can prepare the patches to get this into master. 

[1] https://cgit.freedesktop.org/libreoffice/core/log/?h=private/tvajngerl/length
[2] https://cgit.freedesktop.org/libreoffice/core/commit/?h=private/tvajngerl/length&id=cbbb2fb7230ea1d9fff74461f872660eb7b80204
Comment 14 Tomaz Vajngerl 2025-10-21 02:52:35 UTC
BTW even with it being 64-bit number and it seems enough, it's not enough in practice for performing some calculations, however I don't think we should switch internally to double just for that. 

I have tried to use it instead of tools:Rectangle in SdrObject and this is where the can of worms opened, where it's not easy to incorporate it because SdrObject can in some case use 100th of mm (Calc, Draw, Impress) and in other cases twips (Writer). So I'm now thinking that the better approach is top-down to change Writer to use this when it's using SdrObjects, so we get unit neutral on the facade and can also run SdrObjects always as using 100th mmm internally and only then change SdrObject to use the new length unit.
Comment 15 nuke 2025-10-21 10:21:55 UTC
I've Got acquainted with your work - it's great. In that case, should I assignee this task to you?
Comment 16 Tomaz Vajngerl 2025-10-24 02:00:35 UTC
It can be assigned to me, but I guess more importantly it shouldn't be an EasyHack.