16-bit Unicode string literals

Background: Our ASCII string handling is slow and inefficient. C++0x introduces "UTF-16 string literals" (Wikipedia); see also icu/source/common/unicode/unistr.h and icu/source/common/unicode/platform.h for hackery for other platforms. Mozilla even uses -fshort-wchar and L"string literals", though the standard approach is better. To implement this, we should add a SAL_STRING_STATIC_FLAG to create rtl_uStrings with, and instrument rtl_uString_assign to deep copy these when necessary.

Skills: build, C++
Some of the work is done.

SAL_STRING_STATIC_FLAG is defined in core/sal/rtl/source/strimp.hxx:
    #define SAL_STRING_STATIC_FLAG 0x40000000

SAL_STRING_STATIC_FLAG is used in core/sal/rtl/source/strimp.hxx:
    #define SAL_STRING_IS_STATIC(a) ((a)->refCount & SAL_STRING_STATIC_FLAG)

SAL_STRING_STATIC_FLAG is used in core/sal/rtl/source/ustring.cxx in the initializer for the static rtl_uString.
SAL_STRING_STATIC_FLAG is used in core/sal/rtl/source/string.cxx in the initializer for the static rtl_String.

SAL_STRING_IS_STATIC is used directly and indirectly in many of the methods defined in /core/sal/rtl/source/strtmpl.cxx.
SAL_STRING_IS_STATIC is also used (trivially) in /core/sal/rtl/source/hash.cxx.

I think rtl_uString_assign is defined in strtmpl.cxx, in the code for:
    void SAL_CALL IMPL_RTL_STRINGNAME( assign )( IMPL_RTL_STRINGDATA** ppThis, IMPL_RTL_STRINGDATA* pStr )
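To make the role of the flag concrete, here is a minimal toy model of how a static bit in the reference count can let acquire, release and assign skip refcounting for static strings. This is only a sketch: ToyUString, TOY_STATIC_FLAG and all toy* functions are made-up names for illustration and are not the real sal code; only the flag value 0x40000000 and the shape of the test are taken from strimp.hxx as quoted above. The fixed 16-element buffer is a simplification as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real definitions live in sal/rtl/source.
constexpr std::int32_t TOY_STATIC_FLAG = 0x40000000;
constexpr std::int32_t TOY_CAPACITY = 16;

struct ToyUString {
    std::int32_t refCount;          // top bits double as flags
    std::int32_t length;
    char16_t buffer[TOY_CAPACITY];  // fixed size to keep the toy simple
};

// Same test as SAL_STRING_IS_STATIC, written as a function.
inline bool toyIsStatic(const ToyUString* s) {
    return (s->refCount & TOY_STATIC_FLAG) != 0;
}

// Static strings are never counted, so acquire is a no-op for them.
inline void toyAcquire(ToyUString* s) {
    if (!toyIsStatic(s)) ++s->refCount;
}

// Static strings are never freed; heap strings die at refCount 0.
inline void toyRelease(ToyUString* s) {
    if (toyIsStatic(s)) return;
    if (--s->refCount == 0) delete s;
}

// Heap-allocate a counted string (truncated to the toy capacity).
inline ToyUString* toyNew(const char16_t* text, std::int32_t len) {
    if (len > TOY_CAPACITY - 1) len = TOY_CAPACITY - 1;
    ToyUString* s = new ToyUString{1, len, {}};
    for (std::int32_t i = 0; i < len; ++i) s->buffer[i] = text[i];
    return s;
}

// Assign in the style of the assign template: take the new string,
// drop the old one. A deep-copying variant would branch on
// toyIsStatic(pStr) here instead of sharing the pointer.
inline void toyAssign(ToyUString** ppThis, ToyUString* pStr) {
    toyAcquire(pStr);
    if (*ppThis) toyRelease(*ppThis);
    *ppThis = pStr;
}

// A string living in static storage, flagged so it is never freed.
static ToyUString toyStaticHello = { TOY_STATIC_FLAG | 1, 5, u"hello" };
```

In this toy, assigning toyStaticHello shares the pointer and leaves its refCount untouched, while heap strings are counted as usual. Whether the real assign should share or deep copy a static string is exactly the open question of this bug.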
Deleted "Easyhack" from summary
This is most probably not doable without a macro. Since rtl_uString allocates the string as a part of itself, the extra space would need to be allocated as well for each literal, with the string data inside it, but even inline functions, templates and whatnot don't seem to be able to do that. With a macro it's doable with something along the lines of

    #define OUStringLiteral( str ) \
        ( \
        ([]() -> OUString { static const rtl_uString_sized< sizeof( str ) > data = { SAL_STRING_STATIC_FLAG|1, sizeof( str ) - 1, u"" str }; return OUString( &data ); })() \
        )

but that pretty much means putting RTL_CONSTASCII_USTRINGPARAM back everywhere :(. That's kinda lame, after all the work to remove it, and it would be good to first check whether the uglification is actually worth the gain.
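For anyone who wants to try the idea outside the LibreOffice tree, here is a self-contained sketch of the same trick. It is not working sal code: SizedUString stands in for the rtl_uString_sized mentioned in the comment above, ToyOUString stands in for OUString, and TOY_STATIC_FLAG and TOY_OUSTRING_LITERAL are likewise hypothetical names. It needs C++11 for the lambda and the u"" prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for SAL_STRING_STATIC_FLAG.
constexpr std::int32_t TOY_STATIC_FLAG = 0x40000000;

// Header plus inline character data, sized per literal
// (N counts the terminating NUL, as sizeof on a literal does).
template <std::size_t N>
struct SizedUString {
    std::int32_t refCount;
    std::int32_t length;
    char16_t buffer[N];
};

// Toy wrapper that only records what it was given; the real OUString
// would hold a pointer to the rtl_uString instead.
class ToyOUString {
public:
    template <std::size_t N>
    explicit ToyOUString(const SizedUString<N>* d)
        : refCount_(d->refCount), length_(d->length), chars_(d->buffer) {}

    std::int32_t length() const { return length_; }
    const char16_t* chars() const { return chars_; }
    bool isStatic() const { return (refCount_ & TOY_STATIC_FLAG) != 0; }

private:
    std::int32_t refCount_;
    std::int32_t length_;
    const char16_t* chars_;
};

// Each expansion creates its own lambda, so each literal gets its own
// static data block; u"" str concatenates into a UTF-16 literal.
#define TOY_OUSTRING_LITERAL(str)                                   \
    ([]() -> ToyOUString {                                          \
        static const SizedUString<sizeof(str)> data = {             \
            TOY_STATIC_FLAG | 1,                                    \
            static_cast<std::int32_t>(sizeof(str) - 1),             \
            u"" str };                                              \
        return ToyOUString(&data);                                  \
    })()
```

The macro only accepts plain narrow string literals, since both sizeof( str ) and the u"" concatenation depend on that. That is the same restriction RTL_CONSTASCII_USTRINGPARAM had, which is the uglification the comment above is worried about.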
Yeah, I'd forgotten about this easy hack. It was indeed intended as an attempt at adapting RTL_CONSTASCII_USTRINGPARAM. Let's drop this easy hack after all.
Migrating Whiteboard tags to Keywords: (EasyHack) [NinjaEdit]