=== accelerate Perl installer/builder ===
'''Background:''' For now we use solenv/bin/make_installer.pl, with Perl modules in solenv/bin/modules/installer, to handle the production of an installer or of data to go into packages. Unfortunately it is slow even on a fast machine. The code was written without the benefit of a deep understanding of Perl, which makes it less efficient, and it also likes to spawn processes where native code would be better. We need to (in general) retain this heap of Perl in order to create MSI files on Windows.
The Perl code also suffers from overuse of copy-paste. Lots of subroutines have been simply copied with small changes when similar code is needed for two related tasks, instead of adding a parameter and using conditional statements. In many cases the (as such typically quite useless) comment preceding a subroutine has also been copy-pasted, but not changed at all, thus being totally misleading for the copy...
Some of the Perl is dead slow; there is a [http://users.freedesktop.org/~michael/make_installer.txt profile of this on Windows]. It would be great to do some refactoring and/or optimisation to improve this.
'''Skills:''' Perl, building
Deleted "Easyhack" from summary
I'm new here. I'd like to start contributing and this looks like a good place for me to start - I've done a lot of perl hacking. There hasn't been an update to this bug for a while - is work still needed?
Hi Bob - yes, plenty more work is needed :-) in particular make_installer.pl -still- seems very slow (to me) for what it does.
It seems to spend a ton of time messing around inside .zip archives (very slowly) and extracting a load of (big) cruft to /tmp/ while it is making the install set that seems (to me) redundant.
It'd be lovely to try to avoid that - which requires some structural work. There are others thinking about / cleaning up and improving that code though so perhaps good to poke Tim Retout - who is hacking there.
Looking forward to your work :-)
Yesterday I profiled make_installer.pl from the master branch on Windows with NYTProf v4. All languages (109) were configured. The result is here:
I hope it will be useful, because not many people are hacking on Windows, and some design problems become obvious only when many languages are configured.
Thanks Andras, this is very useful. There are two particularly painful places:
- translate_idtfile, called 8 x 109 times for a total of 4106s, translating the same eight files into 109 languages.
There's a lot of substitution going on - there are only 660 lines total in those files, but because each language is processed independently, a lot of the work gets repeated. Maybe if we processed all languages together, we could avoid that.
- create_defaultdir_directorynames, called 109 times for a total of 2211s, which is spending a lot of time converting names into "8+3" form.
Perl on w32 has a function Win32::GetShortPathName() for building the 8+3 form of an *existing* path - i.e. it's meant to be the filesystem's responsibility to generate this. So while there is a hashref-based "make_eight_three_conform_with_hash" (which could still be improved upon performance-wise), I think we should ask why we're even doing this.
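For the first point (translate_idtfile), processing all languages together could look roughly like the sketch below. It is written in Python purely as a language-neutral illustration, not as the installer's Perl. The placeholder syntax (`OOO_...` tokens), the function name `translate_all_languages`, and the data layout are all assumptions made for the example. The point is only that each template line is scanned for placeholders once, and just the substitution itself is repeated per language.

```python
import re

# Hypothetical placeholder syntax: tokens such as OOO_TITLE in the template.
PLACEHOLDER = re.compile(r"OOO_[A-Z0-9_]+")


def translate_all_languages(template_lines, translations):
    """Translate one template for every language in a single pass.

    template_lines: list of str, the untranslated table rows.
    translations: dict mapping language code -> dict of placeholder -> text.
    Returns a dict mapping language code -> list of translated lines.
    """
    # Scan each line once and remember which placeholders it contains.
    scanned = [(line, PLACEHOLDER.findall(line)) for line in template_lines]
    result = {lang: [] for lang in translations}
    for line, keys in scanned:
        for lang, table in translations.items():
            if not keys:
                # Lines without placeholders are shared unchanged.
                result[lang].append(line)
                continue
            # Unknown placeholders are left untouched rather than dropped.
            result[lang].append(
                PLACEHOLDER.sub(lambda m: table.get(m.group(0), m.group(0)), line)
            )
    return result
```

Whether this helps in practice depends on how much of the measured time is spent scanning versus substituting, which the profile would have to confirm.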
(In reply to comment #3)
> Hi Bob - yes, plenty more work is needed :-) in particular
> make_installer.pl -still- seems very slow (to me) for what it does.
> It seems to spend a ton of time messing around inside .zip archives (very
> slowly) and extracting a load of (big) cruft to /tmp/ while it is making the
> install set that seems (to me) redundant.
> It'd be lovely to try to avoid that - which requires some structural work.
> There are others thinking about / cleaning up and improving that code though
> so perhaps good to poke Tim Retout - who is hacking there.
> Looking forward to your work :-)
Sorry, I thought I was CC'd on this ticket! I didn't notice these comments until today.
Can someone tell me how we invoke make_installer.pl from a fresh git checkout? My plan is to have two copies - one unmodified and one that I'll hack on based on the NYTProf output. That way I can verify that the files generated by my copy are identical to those generated by the unmodified script.
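One way to do that comparison is to hash every file in both output trees and report the differences. A minimal sketch in Python, assuming the two runs write to separate directories; the function names and the directory arguments are hypothetical. Byte-identical output is itself an assumption: files that embed timestamps or generated identifiers may differ between runs even when nothing is wrong.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def compare_trees(reference_root, candidate_root):
    """Return sorted relative paths that are missing, extra, or changed."""
    ref = tree_digests(reference_root)
    cand = tree_digests(candidate_root)
    missing = sorted(set(ref) - set(cand))
    extra = sorted(set(cand) - set(ref))
    changed = sorted(p for p in set(ref) & set(cand) if ref[p] != cand[p])
    return missing, extra, changed
```

Three empty lists mean the modified script reproduced the reference output exactly.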
(In reply to comment #5)
Unfortunately short filenames are mandatory according to the documentation (see e.g.: http://msdn.microsoft.com/en-us/library/windows/desktop/aa368590%28v=vs.85%29.aspx). But I noticed that although the Directory table (Director.idt) is localizable in theory, in fact we do not use localized directory names. So we do not have to produce 109 identical copies of Director.idt, which currently costs 109 calls to create_defaultdir_directorynames.
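Since the directory names are not localized, the table could be built once and shared by every language. A rough Python sketch of that idea follows. `build_directory_table` is a hypothetical stand-in for the expensive step (create_defaultdir_directorynames in the real Perl code) and its body is a toy: the naive truncation below does not reproduce real 8.3 short-name rules or collision handling.

```python
def build_directory_table(directories):
    """Hypothetical stand-in for the expensive, language-independent step."""
    build_directory_table.calls += 1
    # Toy short name: first 8 characters, upper-cased. Real 8.3 rules differ.
    return [name[:8].upper() + "|" + name for name in directories]


# Call counter, kept only to show how often the expensive step runs.
build_directory_table.calls = 0


def tables_for_languages(directories, languages):
    """Build the table once and share the same rows across all languages."""
    shared = build_directory_table(directories)
    return {lang: shared for lang in languages}
```

With this shape the expensive function runs once per build rather than once per configured language.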
(In reply to comment #6)
> Can someone tell me the how we invoke make_installer.pl from a fresh git
make_installer.pl is invoked from instsetoo_native/util/makefile.mk. You need to build LibreOffice first, then you can test make_installer.pl. It is not usable from a fresh git checkout, because it tries to package binaries that need to be built first.
Andras Timar committed a patch related to this issue.
It has been pushed to "master":
use utf-8 instead of legacy code pages in all msi tables (related: fdo#39595)
The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
Affected users are encouraged to test the fix and report feedback.
adding LibreOffice developer list as CC to unresolved EasyHacks for better visibility.
see e.g. http://nabble.documentfoundation.org/minutes-of-ESC-call-td4076214.html for details
Removing EasyHack for now as it's been around since 2011-06.
@Michael: Maybe just close this one and possibly open a new one with crisp and current hints?
I guess this is old enough to be obsolete, and we had some great work done here - not least the instdir bits by Michael Stahl =)
Migrating Whiteboard tags to Keywords: (DifficultyBeginner SkillScript )