Description: LibreOffice's interface is sluggish when resizing, scrolling and during other operations when used under macOS 10.13/14/15 with a 10-bit 5K display. The attached patch will improve the rendering performance by setting NSWindow's color depth limit. diff --git a/vcl/inc/osx/salframeview.h b/vcl/inc/osx/salframeview.h index 36537f3db958..e3a1269f2ccc 100644 --- a/vcl/inc/osx/salframeview.h +++ b/vcl/inc/osx/salframeview.h @@ -79,6 +79,7 @@ enum class SalEvent; // #i102807# used by magnify event handler NSTimeInterval mfLastMagnifyTime; float mfMagnifyDeltaSum; + BOOL mbColorDepthIsSet; } +(void)unsetMouseFrame: (AquaSalFrame*)pFrame; -(id)initWithSalFrame: (AquaSalFrame*)pFrame; diff --git a/vcl/osx/salframeview.mm b/vcl/osx/salframeview.mm index d26617eb3d15..21a52b39d4e5 100644 --- a/vcl/osx/salframeview.mm +++ b/vcl/osx/salframeview.mm @@ -465,6 +465,7 @@ -(id)initWithSalFrame: (AquaSalFrame*)pFrame mpReferenceWrapper = nil; mpMouseEventListener = nil; mpLastSuperEvent = nil; + mbColorDepthIsSet = NO; } mfLastMagnifyTime = 0.0; @@ -510,6 +511,13 @@ -(BOOL)isOpaque -(void)drawRect: (NSRect)aRect { + if (!mbColorDepthIsSet) + { + [mpFrame->getNSWindow() setDynamicDepthLimit:NO]; + [mpFrame->getNSWindow() setDepthLimit:NSWindowDepthTwentyfourBitRGB]; + mbColorDepthIsSet = YES; + } + AquaSalInstance *pInstance = GetSalData()->mpInstance; assert(pInstance); if (!pInstance) Actual Results: Sluggish interface Expected Results: Smooth resizing and scrolling. Reproducible: Always User Profile Reset: No Additional Info: Set window color depth limit.
The improved patch: diff --git a/vcl/osx/salframeview.mm b/vcl/osx/salframeview.mm index d26617eb3d15..4647807718ec 100644 --- a/vcl/osx/salframeview.mm +++ b/vcl/osx/salframeview.mm @@ -203,6 +203,9 @@ -(id)initWithSalFrame: (AquaSalFrame*)pFrame // Disable window restoration until we support it directly [pNSWindow setRestorable: NO]; + [pNSWindow setDynamicDepthLimit: NO]; + [pNSWindow setDepthLimit: NSWindowDepthTwentyfourBitRGB]; + return static_cast<SalFrameWindow *>(pNSWindow); } There are still large amount of pixel format conversions from agbr32 to rgba32 even under the display is in SRGB color space.
Another patch to create a bitmap with the color space of the main display, thus slightly improves rendering performance. But there are still layers of other color spaces, so there are still pixel format conversions. diff --git a/vcl/osx/saldata.cxx b/vcl/osx/saldata.cxx index 5235f657f8ca..4f00e268e711 100644 --- a/vcl/osx/saldata.cxx +++ b/vcl/osx/saldata.cxx @@ -51,7 +51,6 @@ SalData::SalData() mpFirstPrinter( nullptr ), mpFontList( nullptr ), mpStatusItem( nil ), - mxRGBSpace( CGColorSpaceCreateWithName(kCGColorSpaceSRGB) ), mxGraySpace( CGColorSpaceCreateWithName(kCGColorSpaceGenericGrayGamma2_2) ), maCursors(), mbIsScrollbarDoubleMax( false ), @@ -66,6 +65,10 @@ SalData::SalData() maCursors.fill( INVALID_CURSOR_PTR ); if( s_aAutoReleaseKey == nullptr ) s_aAutoReleaseKey = osl_createThreadKey( releasePool ); + + CGDirectDisplayID did = CGMainDisplayID(); + CGColorSpaceRef cs = CGDisplayCopyColorSpace(did); + mxRGBSpace = cs; } SalData::~SalData()
@Meeks / Tor A potential contributor + a (partial) patch proposal here for the perf issue with Retina displays
Technical a duplicate but lets set it to new for now
Thanks, Leo, very interesting! Will investigate.
Tor Lillqvist committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/2cd47f53b4c7bd300c210eaa466e13adc832c9b5 tdf#137468: Add debug output function for CGColorSpaceRef It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Leo, do you have any way to objectively measure the impact of these patches, out do you just estimate by eyeball? (or is the difference so very obvious if one just knows what kind of document to use and what to do while editing it?) I submitted the first patch on your behalf as https://gerrit.libreoffice.org/c/core/+/104476 .
Do I understand correctly that a colour space is only concerned with what colour channels are used, what the (XYZ) colours of the primaries are, what the whitepoint is, and such issues, and the number of bits per colour component is not as such related to the colour space used? (Although in practice, the sRGB colour space is typically used with 8 bits per RGB channel, and wider gamut colour spaces are used with 10 bits per channel or floating point components even?)
Submitted the second patch as https://gerrit.libreoffice.org/c/core/+/104477 .
(In reply to Tor Lillqvist from comment #7) > Leo, do you have any way to objectively measure the impact of these patches, > out do you just estimate by eyeball? (or is the difference so very obvious > if one just knows what kind of document to use and what to do while editing > it?) > > I submitted the first patch on your behalf as > https://gerrit.libreoffice.org/c/core/+/104476 . With iStat Menus' CPU/GPU dropdown menu: Original: 5-7 fps (almost unusable) Patched: 12-18 fps (acceptable, but not smooth) when the content is pure text. Quartz Debug also gives the FPS number, but since it updates the dock icon frequently, the number is usually much higher than the actual, especially when the FPS is very low. I am still trying to find where the pixel format conversions are introduced. AquaSalGraphics::copyArea and AquaSalGraphics::copyBits are the hottest spots, they call CGContextDrawLayerInRect and CGContextDrawLayerAtPoint and those 2 functions call argb32_image_mark then rgba32_sample_rgba32 is called.
Leo, will the iStat Menus in the App Store display the frame rate? Or just the one outside the App Store? I am a bit careful in installing random "useful" binaries.
(In reply to Tor Lillqvist from comment #11) > Leo, will the iStat Menus in the App Store display the frame rate? Or just > the one outside the App Store? I am a bit careful in installing random > "useful" binaries. I use iStat Menus just for convenience. You don't need install it. Get the fps number by the following code. --- [1.c] --- #include <stdint.h> #include <stdio.h> #include <unistd.h> typedef int CGSConnectionID; extern CGSConnectionID _CGSDefaultConnection(void); extern int32_t CGSGetPerformanceData(CGSConnectionID cid, float *outFPS, float *unk, float *unk2, float *unk3); int main(int argc, char **argv) { CGSConnectionID cid = _CGSDefaultConnection(); for (;;) { float fps, f1, f2, f3; CGSGetPerformanceData(cid, &fps, &f1, &f2, &f3); printf("fps=%f\n", fps); sleep(1); } } --- Compile it with: cc -o 1 1.c -framework CoreGraphics
Wow, thanks.
Nice work Leo. I really wonder though whether just switching to use Skia to do our rendering here wouldn't solve the performance problems, switch to an optimized backend and allow a huge amount of helpful code-sharing. Lubos - any pointers / skeleton patch for enabling / testing Skia on Mac ? Leo - a chunk of the work there is (I suspect) testing, and tweaking =) if you're interested.
(In reply to Michael Meeks from comment #14) I like they idea of shared code; including ability of using Vulkan. Bit tired of testing all those backends (even though MacOS backend doing pretty OK) While being hell of topic here.. what's the reason for separate GTK3/KDE5 backends? Seeing GTK3 bugs in they bugtracker on a pretty regular basis. Those are nicely maintained, but require quite some resources.
(In reply to Telesto from comment #15) > While being hell of topic here.. what's the reason for separate GTK3/KDE5 > backends? Seeing GTK3 bugs in they bugtracker on a pretty regular basis. > Those are nicely maintained, but require quite some resources. The reason is integration into the desktop.
I would be prepared to commit both patches https://gerrit.libreoffice.org/c/core/+/104476 and https://gerrit.libreoffice.org/c/core/+/104477 . It would be nice to get the changes into the nightly builds. Or, Leo, do you think we should wait for more investigation first? (If it turns out that either or both need to be reverted, that can be trivially done later.)
(In reply to Michael Meeks from comment #14) > Nice work Leo. I really wonder though whether just switching to use Skia to > do our rendering here wouldn't solve the performance problems, switch to an > optimized backend and allow a huge amount of helpful code-sharing. > > Lubos - any pointers / skeleton patch for enabling / testing Skia on Mac ? I'm not familiar with the quartz VCL backend, but I doubt there's a simple patch for making it use Skia, the backend would need to get ported to use Skia APIs (or use pluggable implementations the same way Windows and 'gen' backends do).
(In reply to Tor Lillqvist from comment #17) > I would be prepared to commit both patches > https://gerrit.libreoffice.org/c/core/+/104476 and > https://gerrit.libreoffice.org/c/core/+/104477 . It would be nice to get the > changes into the nightly builds. Or, Leo, do you think we should wait for > more investigation first? > > (If it turns out that either or both need to be reverted, that can be > trivially done later.) I think the two patches can be submitted.
(In reply to Luboš Luňák from comment #18) > I'm not familiar with the quartz VCL backend, but I doubt there's a simple > patch for making it use Skia, the backend would need to get ported to use > Skia APIs (or use pluggable implementations the same way Windows and 'gen' > backends do). To elaborate more on this, in case somebody would like to implement Skia support for Mac: The Skia support on Windows (and on Linux with the 'gen' VCL backend) works by having the WinSalGraphics class forward all the graphic calls to SalGraphicsImpl base class that actually performs them (see e.g. WinSalGraphics::drawLine()). And Skia use is done by having WinSkiaSalGraphicsImpl class be the actual class that performs the drawing. The rest of the platform-specific code such as window handling is still done by WinSalGraphics and other classes from the 'win' backend. So Skia support on Mac would presumably require changing the 'quartz' VCL backend to work similarly. I.e. make AquaSalGraphics forward drawing calls to SalGraphicsImpl implementation, move the current drawing code to AquaSalGraphicsImpl (or whatever it would be called), and then create AquaSkiaSalGraphicsImpl, which would inherit from SkiaSalGraphicsImpl.
Leo, even if the patches are small, counting lines of code, could you please send a license statement to the development mailing list, as described in https://wiki.documentfoundation.org/Development/Developers#Example_Statement , thanks.
Leo Wang committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/c2fa92b09b069cae2693ddb56143984f06cb9dc2 tdf#137468: Restrict window depth to 24 bit RGB It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Leo Wang committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/cb48b7205cc6cdcf6741bc430266481a8b6b4769 tdf#137468: Use the colour space of the main display It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Tor Lillqvist committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/86d4419c6c4961c8b473decccea99338c176ffe1 Revert "tdf#137468: Use the colour space of the main display" It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Leo, sadly the colour space change had to be reverted, see above. Do you think using CGColorSpaceCreateDeviceRGB() instead of CGDisplayCopyColorSpace(CGMainDisplayID()) would also help for the performance? That doesn't make the unit test fail. But on the other hand, CGColorSpaceCreateDeviceRGB() is weird. What does it actually do? The docs say that it returns "a device-dependent RGB color space", but not say for what device.
(In reply to Tor Lillqvist from comment #25) > Leo, sadly the colour space change had to be reverted, see above. > > Do you think using CGColorSpaceCreateDeviceRGB() instead of > CGDisplayCopyColorSpace(CGMainDisplayID()) would also help for the > performance? That doesn't make the unit test fail. But on the other hand, > CGColorSpaceCreateDeviceRGB() is weird. What does it actually do? The docs > say that it returns "a device-dependent RGB color space", but not say for > what device. CGDisplayCopyColorSpace(CGMainDisplayID()) brings performance improvement in my 5K 10-bit display.
But as it causes a unit test to fail, we can't really use it as such. Either the test is wrong and needs to be fixed, or code elsewhere in LibreOffice needs to be updated to match the changed RGB colour space.
(In reply to Tor Lillqvist from comment #27) > But as it causes a unit test to fail, we can't really use it as such. Either > the test is wrong and needs to be fixed, or code elsewhere in LibreOffice > needs to be updated to match the changed RGB colour space. I saw the failed test (BitmapRenderTest). I think the test needs to be improved so that it is aware of the color value shift due to the nature of color space difference. <html> <table bgcolor=#22ff55> <tr> <td width=500>aaa</td> </tr> </table> </html> This makes a quick demo: the background color of the table cell is #22ff55. In Digital Color Meter, with Safari 14: - sRGB: #23ff55 - native value: #79fb6e - generic RGB: #2dff44 (ff is clipped) - Adobe RGB: #93ff64 with Chrome 87: - sRGB: #23ff55 - native value: #22ff55 - generic RGB: #2cff44 (ff is clipped) - Adobe RGB: #93ff64 If a bitmap is created with other color space then SRGB, the RGB color value of a pixel extracted with SRGB color space is different vs the original value.
Currently the hottest spot in the downstream of the copyArea and copyBits calls is: 7 CoreGraphics 2781.0 ripc_DrawLayer 6 CoreGraphics 2777.0 ripc_DrawImage 5 CoreGraphics 2774.0 ripc_RenderImage 4 CoreGraphics 2774.0 RIPLayerBltImage 3 CoreGraphics 2774.0 ripl_Mark 2 CoreGraphics 2774.0 ARGB32_image 1 CoreGraphics 2773.0 rgba32_image_mark 0 CoreGraphics 1985.0 rgba32_sample_rgba32 where the bottom rgba32_sample_rgba32 is not optimized with SIMD. If the following modification is made: diff --git a/vcl/quartz/salgdiutils.cxx b/vcl/quartz/salgdiutils.cxx index 426aea29dc78..3ca31c2a3a7b 100644 --- a/vcl/quartz/salgdiutils.cxx +++ b/vcl/quartz/salgdiutils.cxx @@ -134,7 +134,7 @@ bool AquaSalGraphics::CheckContext() const int nBytesPerRow = (nBitmapDepth * nScaledWidth) / 8; void* pRawData = std::malloc(nBytesPerRow * nScaledHeight); - const int nFlags = kCGImageAlphaNoneSkipFirst; + const int nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; CGContextHolder aContextHolder(CGBitmapContextCreate( pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); i.e., to create a bitmap with the host byte order, at least in my Mac, all calls to rgba32_sample_rgba32 are avoided. Here is it: 8 CoreGraphics 2009.0 CGContextDrawLayerAtPoint 7 CoreGraphics 2009.0 ripc_DrawLayer 6 CoreGraphics 2004.0 ripc_DrawImage 5 CoreGraphics 2002.0 ripc_RenderImage 4 CoreGraphics 2002.0 RIPLayerBltImage 3 CoreGraphics 2002.0 ripl_Mark 2 CoreGraphics 2002.0 argb32_image 1 CoreGraphics 2002.0 CGSBlend8888toRGBA8888 0 vImage 2001.0 vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_avx512 This brings almost 30% performance boost in my Mac Pro 2019 (2.5G Intel Xeon W), there are optimizations for older CPUs, found in vImage (macOS 10.15.7): 0000000000663030 t _vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_avx 000000000076b660 t _vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_avx2 0000000000017560 t _vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_avx512 0000000000583740 t _vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_sse4_1 00000000003aa2b0 t _vPremultipliedAlphaBlendWithPermute_RGBA8888_CV_vec I don't have other Macs, but I think it is safe to say the series of functions should be faster than the unoptimized rgba32_sample_rgba32 function. This also depends on the CGDisplayCopyColorSpace(CGMainDisplayID()) modification.
Thanks for all the effort Leo! Sounds promising (not that I'm able to asses code ;-). I think will make a group of people really happy: bug 113104
(In reply to Telesto from comment #30) > Thanks for all the effort Leo! Sounds promising (not that I'm able to asses > code ;-). I think will make a group of people really happy: bug 113104 I don't have other Macs, so please let me know what happens on non-10-bit displays.
> I think the test needs to be improved so that it is aware of the color value shift due to the nature of color space difference. Maybe, but surely also other code all over the place (slight exaggeration) would need to be adjusted, too, if the colour space used to display LO's windows were a device-dependent, potentially wide gamut, colour space. As it is (now with the change reverted), everything is sRGB.
(In reply to Tor Lillqvist from comment #32) > > I think the test needs to be improved so that it is aware of the color value shift due to the nature of color space difference. > > Maybe, but surely also other code all over the place (slight exaggeration) > would need to be adjusted, too, if the colour space used to display LO's > windows were a device-dependent, potentially wide gamut, colour space. As it > is (now with the change reverted), everything is sRGB. Currently the test unit creates a bitmap, draws a point with specific color, gets the pixel's color value and compares against the original value. This only works when there is only one color space. Maybe we can add a "hidden" preference item to let the user choose? Without a complete rewrite of the renderer, it is very difficult to make use of the GPU: all bitmap drawing works have to be done with CPU, and it is hard to predict what color space CGLayer determines to blend all the layers to generate the final bitmap. PS: with all the 3 patches to "consistentize" the color spaces, on my Mac LibreOffice retrieves 20-25fps for text, spreadsheets and simple presentations.
But it isn't only about the test. LibreOffice itself must also continue working and show colours as before. If we start claiming to the OS that the windows and bitmaps we use are in a wide-gamut colour space, but still use RGB values in them that are in sRGB, won't the colours look all wrong? If a document uses an 8 bits per channel RGB colour (0, 255, 0), that means the colour is the green primary of sRGB (which is the implicit default colour space), doesn't it? And if that colour then is used as such in a pixmap claimed to be in the display's native wide-gamut colour space, and drawn to a window that also uses the same wide-gamut colourspace, it will be shown as a much different green, as the green primary of the wide-gamut colour space. Or am I missing something?
Not read it in a long time; but assuming the OS/X backend does like the gtk3 and headless backends - and keeps a back-buffer for the whole window around as well as tracking damage to do incremental rendering - possibly switching the back-buffer to the display format might help subsequent blitting of that ... just a thought.
The following patch which works with the color depth limit patch provides speed improvements while keeping the color space of the internal bitmap the same as before (SRGB). It creates a CGImage of the color space of the current display when painting the layer to the screen context. diff --git a/vcl/inc/quartz/salgdi.h b/vcl/inc/quartz/salgdi.h index 8058b68378b6..165e704dbeb9 100644 --- a/vcl/inc/quartz/salgdi.h +++ b/vcl/inc/quartz/salgdi.h @@ -134,6 +134,7 @@ class AquaSalGraphics : public SalGraphics { CGLayerHolder maLayer; // Quartz graphics layer CGContextHolder maContextHolder; // Quartz drawing context + CGContextHolder maCSContextHolder; // Quartz drawing context considering the color space XorEmulation* mpXorEmulation; int mnXorMode; // 0: off 1: on 2: invert only diff --git a/vcl/osx/saldata.cxx b/vcl/osx/saldata.cxx index 14124e4b00f8..66b4e6a4bed9 100644 --- a/vcl/osx/saldata.cxx +++ b/vcl/osx/saldata.cxx @@ -52,7 +52,8 @@ SalData::SalData() mpFirstPrinter( nullptr ), mpFontList( nullptr ), mpStatusItem( nil ), - mxRGBSpace( CGDisplayCopyColorSpace( CGMainDisplayID() ) ), + mxRGBSpace( CGColorSpaceCreateWithName(kCGColorSpaceSRGB) ), + //mxRGBSpace( CGDisplayCopyColorSpace( CGMainDisplayID() ) ), mxGraySpace( CGColorSpaceCreateWithName(kCGColorSpaceGenericGrayGamma2_2) ), maCursors(), mbIsScrollbarDoubleMax( false ), diff --git a/vcl/quartz/salgdiutils.cxx b/vcl/quartz/salgdiutils.cxx index 426aea29dc78..3521dac35def 100644 --- a/vcl/quartz/salgdiutils.cxx +++ b/vcl/quartz/salgdiutils.cxx @@ -120,6 +120,7 @@ bool AquaSalGraphics::CheckContext() CGContextRelease(maContextHolder.get()); } maContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); maLayer.set(nullptr); } @@ -134,13 +135,18 @@ bool AquaSalGraphics::CheckContext() const int nBytesPerRow = (nBitmapDepth * nScaledWidth) / 8; void* pRawData = std::malloc(nBytesPerRow * nScaledHeight); - const int nFlags = kCGImageAlphaNoneSkipFirst; + int nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; CGContextHolder aContextHolder(CGBitmapContextCreate( pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); maLayer.set(CGLayerCreateWithContext(aContextHolder.get(), aLayerSize, nullptr)); maLayer.setScale(fScale); + pRawData = std::malloc(nBytesPerRow * nScaledHeight); + nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; + maCSContextHolder.set(CGBitmapContextCreate( + pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + CGContextRef xDrawContext = CGLayerGetContext(maLayer.get()); maContextHolder = xDrawContext; @@ -217,8 +223,13 @@ void AquaSalGraphics::UpdateWindow( NSRect& ) const CGSize aSize = maLayer.getSizePoints(); const CGRect aRect = CGRectMake(0, 0, aSize.width, aSize.height); - - CGContextDrawLayerInRect(rCGContextHolder.get(), aRect, maLayer.get()); + const CGRect xRect = { CGPointZero, maLayer.getSizePixels() }; + CGContextDrawLayerInRect(maCSContextHolder.get(), xRect, maLayer.get()); + CGImageRef img = CGBitmapContextCreateImage(maCSContextHolder.get()); + CGImageRef displayColorSpaceImage = CGImageCreateCopyWithColorSpace(img, CGDisplayCopyColorSpace(CGMainDisplayID())); + CGContextDrawImage(rCGContextHolder.get(), aRect, displayColorSpaceImage); + CGImageRelease(img); + CGImageRelease(displayColorSpaceImage); rCGContextHolder.restoreState(); }
Corrected patch. :-( --- diff --git a/vcl/inc/quartz/salgdi.h b/vcl/inc/quartz/salgdi.h index 8058b68378b6..165e704dbeb9 100644 --- a/vcl/inc/quartz/salgdi.h +++ b/vcl/inc/quartz/salgdi.h @@ -134,6 +134,7 @@ class AquaSalGraphics : public SalGraphics { CGLayerHolder maLayer; // Quartz graphics layer CGContextHolder maContextHolder; // Quartz drawing context + CGContextHolder maCSContextHolder; // Quartz drawing context considering the color space XorEmulation* mpXorEmulation; int mnXorMode; // 0: off 1: on 2: invert only diff --git a/vcl/quartz/salgdiutils.cxx b/vcl/quartz/salgdiutils.cxx index 426aea29dc78..3521dac35def 100644 --- a/vcl/quartz/salgdiutils.cxx +++ b/vcl/quartz/salgdiutils.cxx @@ -120,6 +120,7 @@ bool AquaSalGraphics::CheckContext() CGContextRelease(maContextHolder.get()); } maContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); maLayer.set(nullptr); } @@ -134,13 +135,18 @@ bool AquaSalGraphics::CheckContext() const int nBytesPerRow = (nBitmapDepth * nScaledWidth) / 8; void* pRawData = std::malloc(nBytesPerRow * nScaledHeight); - const int nFlags = kCGImageAlphaNoneSkipFirst; + int nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; CGContextHolder aContextHolder(CGBitmapContextCreate( pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); maLayer.set(CGLayerCreateWithContext(aContextHolder.get(), aLayerSize, nullptr)); maLayer.setScale(fScale); + pRawData = std::malloc(nBytesPerRow * nScaledHeight); + nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; + maCSContextHolder.set(CGBitmapContextCreate( + pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + CGContextRef xDrawContext = CGLayerGetContext(maLayer.get()); maContextHolder = xDrawContext; @@ -217,8 +223,13 @@ void AquaSalGraphics::UpdateWindow( NSRect& ) const CGSize aSize = maLayer.getSizePoints(); const CGRect aRect = CGRectMake(0, 0, aSize.width, aSize.height); - - CGContextDrawLayerInRect(rCGContextHolder.get(), aRect, maLayer.get()); + const CGRect xRect = { CGPointZero, maLayer.getSizePixels() }; + CGContextDrawLayerInRect(maCSContextHolder.get(), xRect, maLayer.get()); + CGImageRef img = CGBitmapContextCreateImage(maCSContextHolder.get()); + CGImageRef displayColorSpaceImage = CGImageCreateCopyWithColorSpace(img, CGDisplayCopyColorSpace(CGMainDisplayID())); + CGContextDrawImage(rCGContextHolder.get(), aRect, displayColorSpaceImage); + CGImageRelease(img); + CGImageRelease(displayColorSpaceImage); rCGContextHolder.restoreState(); }
Further improved patch based on the color depth limit patch. 1. Improved drawing performance on 10-bit displays by avoiding multiple color space conversions. 2. Make all drawing context bitmaps have the same color space and byte order of the host system. 3. Fixed serious CG-related memory leak (existed no later than version 7.0) when changing the window size. diff --git a/vcl/inc/quartz/salgdi.h b/vcl/inc/quartz/salgdi.h index 8058b68378b6..0aaf71f0f839 100644 --- a/vcl/inc/quartz/salgdi.h +++ b/vcl/inc/quartz/salgdi.h @@ -134,6 +134,8 @@ class AquaSalGraphics : public SalGraphics { CGLayerHolder maLayer; // Quartz graphics layer CGContextHolder maContextHolder; // Quartz drawing context + CGContextHolder maBGContextHolder; // Quartz drawing context for CGLayer + CGContextHolder maCSContextHolder; // Quartz drawing context considering the color space XorEmulation* mpXorEmulation; int mnXorMode; // 0: off 1: on 2: invert only diff --git a/vcl/quartz/salgdiutils.cxx b/vcl/quartz/salgdiutils.cxx index 426aea29dc78..1bd3063358bf 100644 --- a/vcl/quartz/salgdiutils.cxx +++ b/vcl/quartz/salgdiutils.cxx @@ -69,11 +69,26 @@ void AquaSalGraphics::SetPrinterGraphics( CGContextRef xContext, long nDPIX, lon void AquaSalGraphics::InvalidateContext() { UnsetState(); + + CGContextRelease(maContextHolder.get()); + CGContextRelease(maBGContextHolder.get()); + CGContextRelease(maCSContextHolder.get()); + maContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); + maBGContextHolder.set(nullptr); } void AquaSalGraphics::UnsetState() { + if (maBGContextHolder.isSet()) + { + CGContextRelease(maBGContextHolder.get()); + } + if (maCSContextHolder.isSet()) + { + CGContextRelease(maCSContextHolder.get()); + } if (maContextHolder.isSet()) { maContextHolder.restoreState(); @@ -119,7 +134,12 @@ bool AquaSalGraphics::CheckContext() { CGContextRelease(maContextHolder.get()); } + CGContextRelease(maBGContextHolder.get()); + CGContextRelease(maCSContextHolder.get()); + maContextHolder.set(nullptr); + maBGContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); maLayer.set(nullptr); } @@ -133,14 +153,17 @@ bool AquaSalGraphics::CheckContext() const CGSize aLayerSize = { static_cast<CGFloat>(nScaledWidth), static_cast<CGFloat>(nScaledHeight) }; const int nBytesPerRow = (nBitmapDepth * nScaledWidth) / 8; - void* pRawData = std::malloc(nBytesPerRow * nScaledHeight); - const int nFlags = kCGImageAlphaNoneSkipFirst; - CGContextHolder aContextHolder(CGBitmapContextCreate( - pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + int nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; + maBGContextHolder.set(CGBitmapContextCreate( + NULL, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); - maLayer.set(CGLayerCreateWithContext(aContextHolder.get(), aLayerSize, nullptr)); + maLayer.set(CGLayerCreateWithContext(maBGContextHolder.get(), aLayerSize, nullptr)); maLayer.setScale(fScale); + nFlags = kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Host; + maCSContextHolder.set(CGBitmapContextCreate( + NULL, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + CGContextRef xDrawContext = CGLayerGetContext(maLayer.get()); maContextHolder = xDrawContext; @@ -217,8 +240,13 @@ void AquaSalGraphics::UpdateWindow( NSRect& ) const CGSize aSize = maLayer.getSizePoints(); const CGRect aRect = CGRectMake(0, 0, aSize.width, aSize.height); - - CGContextDrawLayerInRect(rCGContextHolder.get(), aRect, maLayer.get()); + const CGRect xRect = { CGPointZero, maLayer.getSizePixels() }; + CGContextDrawLayerInRect(maCSContextHolder.get(), xRect, maLayer.get()); + CGImageRef img = CGBitmapContextCreateImage(maCSContextHolder.get()); + CGImageRef displayColorSpaceImage = CGImageCreateCopyWithColorSpace(img, CGDisplayCopyColorSpace(CGMainDisplayID())); + CGContextDrawImage(rCGContextHolder.get(), aRect, displayColorSpaceImage); + CGImageRelease(img); + CGImageRelease(displayColorSpaceImage); rCGContextHolder.restoreState(); }
The patch in comment #38 replaces the ones in comment #36 and #37, right? Will try and submit to Gerrit for you.
(In reply to Tor Lillqvist from comment #39) > The patch in comment #38 replaces the ones in comment #36 and #37, right? > Will try and submit to Gerrit for you. Yes.
See https://gerrit.libreoffice.org/c/core/+/104756 I cleaned up the last bit of the patch (and a preceding line) to be more consistent (initialise both CGRect structs there the same way), and to not use the "x" prefix for something that is not a UNO reference.
Leo, thanks for this, nice work. But I notice we are introducing an extra copy into the hot path - what is this going to do on low end macs, where the display is likely to be 8-bit RGB already? Do we not need some kind of check for > 8 bit before doing the extra copy?
I am not sure whether any actual significant copy of data is generated or not. Isn't CGIMage a very "lazy" class and operations on a such don't necessarily do any work or allocate any significant amounts of data unless absolutely needed? Only profiling can tell.
As far as I know, most Mac displays are color-managed, even very old models of notebook and workstation, which means the color space conversion is unavoidable. Also, color space conversions can be performed between two same bit-per-component formats (e.g. 8-bit SRGB and 8-bit calibrated SRGB).
Patch dated 2020-10-27. 1. Code style adjustments according to comments on Gerrit. 2. Small performance improvements: - Create the CGImage based on window's color space instead of the display's. - Set blend mode to kCGBlendModeCopy when copying bitmaps. ---------------------------------------- diff --git a/vcl/inc/quartz/salgdi.h b/vcl/inc/quartz/salgdi.h index 8058b68378b6..0aaf71f0f839 100644 --- a/vcl/inc/quartz/salgdi.h +++ b/vcl/inc/quartz/salgdi.h @@ -134,6 +134,8 @@ class AquaSalGraphics : public SalGraphics { CGLayerHolder maLayer; // Quartz graphics layer CGContextHolder maContextHolder; // Quartz drawing context + CGContextHolder maBGContextHolder; // Quartz drawing context for CGLayer + CGContextHolder maCSContextHolder; // Quartz drawing context considering the color space XorEmulation* mpXorEmulation; int mnXorMode; // 0: off 1: on 2: invert only diff --git a/vcl/quartz/salgdicommon.cxx b/vcl/quartz/salgdicommon.cxx index 7f96124f96ac..7c7dcac7898f 100644 --- a/vcl/quartz/salgdicommon.cxx +++ b/vcl/quartz/salgdicommon.cxx @@ -426,11 +426,14 @@ void AquaSalGraphics::copyArea( long nDstX, long nDstY,long nSrcX, long nSrcY, CGContextScaleCTM( xSrcContext, +1, -1 ); aSrcPoint.y = (nScaledSourceY + nScaledSourceHeight) - (mnHeight * fScale); } + CGContextSetBlendMode(xSrcContext, kCGBlendModeCopy); + CGContextDrawLayerAtPoint(xSrcContext, aSrcPoint, maLayer.get()); } // draw at new destination const CGRect aTargetRect = CGRectMake(nScaledTargetX, nScaledTargetY, nScaledSourceWidth, nScaledSourceHeight); + CGContextSetBlendMode(xCopyContext, kCGBlendModeCopy); CGContextDrawLayerInRect(xCopyContext, aTargetRect, sSourceLayerHolder.get()); maContextHolder.restoreState(); diff --git a/vcl/quartz/salgdiutils.cxx b/vcl/quartz/salgdiutils.cxx index 426aea29dc78..57953e536796 100644 --- a/vcl/quartz/salgdiutils.cxx +++ b/vcl/quartz/salgdiutils.cxx @@ -69,11 +69,28 @@ void AquaSalGraphics::SetPrinterGraphics( CGContextRef xContext, long nDPIX, lon void AquaSalGraphics::InvalidateContext() { UnsetState(); + + CGContextRelease(maContextHolder.get()); + CGContextRelease(maBGContextHolder.get()); + CGContextRelease(maCSContextHolder.get()); + maContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); + maBGContextHolder.set(nullptr); } void AquaSalGraphics::UnsetState() { + if (maBGContextHolder.isSet()) + { + CGContextRelease(maBGContextHolder.get()); + maBGContextHolder.set(nullptr); + } + if (maCSContextHolder.isSet()) + { + CGContextRelease(maCSContextHolder.get()); + maBGContextHolder.set(nullptr); + } if (maContextHolder.isSet()) { maContextHolder.restoreState(); @@ -119,7 +136,12 @@ bool AquaSalGraphics::CheckContext() { CGContextRelease(maContextHolder.get()); } + CGContextRelease(maBGContextHolder.get()); + CGContextRelease(maCSContextHolder.get()); + maContextHolder.set(nullptr); + maBGContextHolder.set(nullptr); + maCSContextHolder.set(nullptr); maLayer.set(nullptr); } @@ -133,14 +155,17 @@ bool AquaSalGraphics::CheckContext() const CGSize aLayerSize = { static_cast<CGFloat>(nScaledWidth), static_cast<CGFloat>(nScaledHeight) }; const int nBytesPerRow = (nBitmapDepth * nScaledWidth) / 8; - void* pRawData = std::malloc(nBytesPerRow * nScaledHeight); - const int nFlags = kCGImageAlphaNoneSkipFirst; - CGContextHolder aContextHolder(CGBitmapContextCreate( - pRawData, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + int nFlags = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Host; + maBGContextHolder.set(CGBitmapContextCreate( + NULL, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); - maLayer.set(CGLayerCreateWithContext(aContextHolder.get(), aLayerSize, nullptr)); + maLayer.set(CGLayerCreateWithContext(maBGContextHolder.get(), aLayerSize, nullptr)); maLayer.setScale(fScale); + nFlags = kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Host; + maCSContextHolder.set(CGBitmapContextCreate( + NULL, nScaledWidth, nScaledHeight, 8, nBytesPerRow, GetSalData()->mxRGBSpace, nFlags)); + CGContextRef xDrawContext = CGLayerGetContext(maLayer.get()); maContextHolder = xDrawContext; @@ -217,8 +242,17 @@ void AquaSalGraphics::UpdateWindow( NSRect& ) const CGSize aSize = maLayer.getSizePoints(); const CGRect aRect = CGRectMake(0, 0, aSize.width, aSize.height); + const CGRect aRectPoints = { CGPointZero, maLayer.getSizePixels() }; + CGContextSetBlendMode(maCSContextHolder.get(), kCGBlendModeCopy); + CGContextDrawLayerInRect(maCSContextHolder.get(), aRectPoints, maLayer.get()); + + CGImageRef img = CGBitmapContextCreateImage(maCSContextHolder.get()); + CGImageRef displayColorSpaceImage = CGImageCreateCopyWithColorSpace(img, [[mpFrame->getNSWindow() colorSpace] CGColorSpace]); + CGContextSetBlendMode(rCGContextHolder.get(), kCGBlendModeCopy); + CGContextDrawImage(rCGContextHolder.get(), aRect, displayColorSpaceImage); - CGContextDrawLayerInRect(rCGContextHolder.get(), aRect, maLayer.get()); + CGImageRelease(img); + CGImageRelease(displayColorSpaceImage); rCGContextHolder.restoreState(); }
Leo, could you please start submitting your patches through Gerrit? That is what everybody else uses... I have done it for you a couple of times, but it is rather tedious.
(In reply to Tor Lillqvist from comment #46) > Leo, could you please start submitting your patches through Gerrit? That is > what everybody else uses... I have done it for you a couple of times, but it > is rather tedious. I have created my Gerrit account, but failed to `git push` (403).
Leo, come hang out at #libreoffice-dev on freenode IRC, and we can help you
They commit has been pushed https://cgit.freedesktop.org/libreoffice/core/commit/?id=87964eb39e2668f80bcbf503d9a3b55a7f86ce28 Somehow without notification.. probably because of the missing TDF bug number in the commit title
I forgot, would someone please backport this to 7.0 branch. I can't confirm if it work, though (non-Retina) but I assume it does :-)
(In reply to Telesto from comment #49) > They commit has been pushed > https://cgit.freedesktop.org/libreoffice/core/commit/ > ?id=87964eb39e2668f80bcbf503d9a3b55a7f86ce28 > > Somehow without notification.. probably because of the missing TDF bug > number in the commit title Hi Telesto, Before it's backported I would like to hear from someone who could reproduce this issue to see if things have improved.
As I have just posted in the related bug report 113104 I can report that the fix is definitely working and improved performance significantly. The sooner this is backported the better as LibreOffice is unusable on macOS at the moment. See https://bugs.documentfoundation.org/show_bug.cgi?id=113104#c31 for FPS and details.
(In reply to Telesto from comment #49) > They commit has been pushed > https://cgit.freedesktop.org/libreoffice/core/commit/ > ?id=87964eb39e2668f80bcbf503d9a3b55a7f86ce28 > > Somehow without notification.. probably because of the missing TDF bug > number in the commit title Backported to libreoffice-7-0 in https://cgit.freedesktop.org/libreoffice/core/commit/?h=libreoffice-7-0&id=ab436ca98b11a6c72e695169662ad3b5a314fa9c
*** Bug 113104 has been marked as a duplicate of this bug. ***
The bugfix does not seem to be part of 7.0.4.2 -- or it is not working. Could somebody please update on when we can hope for improvement on this? Working with LO on a Retina Mac is *really* a pain. I have a few Macs of various ages (some Retina, some not) and gladly volunteer for QA.
(In reply to birnbach@posteo.de from comment #55) > The bugfix does not seem to be part of 7.0.4.2 -- or it is not working. > > Could somebody please update on when we can hope for improvement on this? > Working with LO on a Retina Mac is *really* a pain. > > I have a few Macs of various ages (some Retina, some not) and gladly > volunteer for QA. So in any workflow. Not specific related to images? There pretty big issue in that area (solved with 7.0.5). Any change to test against Master? https://dev-builds.libreoffice.org/daily/master/current.html
No notable change in 10.0.5.2. I find that slow display is even a problem even when working with text if the file is large or the user very quick. That's on a 27" Retina 5K iMac with 3 GHz 6-Core Intel Core i5 from 2019 running 10.15.7. With NC this machine is easily outperformed by my 10 years old MacBook Air 1,7 GHz Intel Core i5 from 2011 without Retina display running 10.13.4 As NC is practically not usable on a state of the art Retina Mac I propose changing bug severity from normal to major. Proposing assistance in bug documentation and testing again.
Draw performance is significantly improved in 7.1.2.2, we are back in the 'usable' area. Working with more complex objects is still kind of lagging, though, and some more efficiency would be quite welcome. Could somebody please kindly check if Leo Wang's patch is part of the mentioned release? I could find no mentioning of speed related changes in the Release Notes and the bug is still open.
I haven't checked the source but I can confirm that the current 7.1 release is about as fast as the Develoepr Beta that I ran my initial tests with back in November of 2020 (see comment 52). I get around 10-12 FPS with pre-patched versions and 25-27 FPS with the Developer Beta, 7.0.5, and 7.1.3 downloaded from libreoffice.org as well as LibreOffice Vanilla 7.1.2 downloaded from the Mac App Store. After discovering that unlike the libreoffice.org release the Mac App Store release is already a Universal Binary I ran the same tests on my low-end MacBook Air M1 as well, just for sh*ts and giggles. Lo and behold, my M1 Air runs the universal binary at a buttery smooth 50+ FPS. Which is 50% faster than the Intel binary (tried Dev and 7.1.3, both of which yielded around 35 FPS) or any version of LibreOffice on my Intel iMac, which seems to peak at around 25-27 FPS.
Hi Leo, Tor, Should this issue be closed as RESOLVED FIXED ?
I hate to spoil the party but shouldn't we make sure unnecessary pixel copying is safely disabled in all versions before we close the bug? The M1 framerates are phenomenal but there is a lot of Intel Macs out there. (In reply to Martin Jungowski from comment #59) > I haven't checked the source but I can confirm that the current 7.1 release > is about as fast as the Develoepr Beta that I ran my initial tests with back > in November of 2020 (see comment 52). I get around 10-12 FPS with > pre-patched versions and 25-27 FPS with the Developer Beta, 7.0.5, and 7.1.3 > downloaded from libreoffice.org as well as LibreOffice Vanilla 7.1.2 > downloaded from the Mac App Store. > > After discovering that unlike the libreoffice.org release the Mac App Store > release is already a Universal Binary I ran the same tests on my low-end > MacBook Air M1 as well, just for sh*ts and giggles. Lo and behold, my M1 Air > runs the universal binary at a buttery smooth 50+ FPS. Which is 50% faster > than the Intel binary (tried Dev and 7.1.3, both of which yielded around 35 > FPS) or any version of LibreOffice on my Intel iMac, which seems to peak at > around 25-27 FPS.
(In reply to Xisco Faulí from comment #60) > Hi Leo, Tor, > Should this issue be closed as RESOLVED FIXED ? I guess so..