Stuff Michael Meeks is doing
Today we release LibreOffice 5.0.0, a new foundation for ongoing work over the next months and years. It also has a fine suite of new features for people to enjoy - you can read and enjoy all the great news about the user visible features from so many great hackers, but there are, as always, many contributors whose work is primarily behind the scenes, and a lot of work that is more technical than user-facing. That work is, of course, still vitally important to the project. It can be hard to extract those from around eleven thousand commits since LibreOffice 4.4 was branched, so let me try to expand:
One of the largest areas of work in LibreOffice 5.0 is in VCL, the graphics toolkit LibreOffice uses for all the widgets and rendering. In 5.0 we modernized and improved several aspects of it, bringing them into line with other cross-platform toolkits.
This is a rather major change that landed in 5.0, and is a vital under-pinning to the ongoing attempts to make VCL and LibreOffice more efficient and performant, thanks to Jennifer Liebel and Tobias Madl (interview). The essential problem with our previous approach was that deciding what to do next in LibreOffice (eg. should I do some more word-counting ? or process some deferred window re-sizing work ? or re-paint a window's contents ?) was governed by a rather arbitrary set of millisecond timeouts, eg. 30ms for a re-paint, 50ms for a re-size. This was not only race prone, but also horribly inefficient, there being no solid basis to these pseudo-random numbers.
Thankfully in LibreOffice 5.0 we have a new 'idle' concept that prioritizes tasks we want to get completed and allows them to be executed in order at top speed. This combined with Jan Holesovsky (Collabora)'s work to ensure we can queue sub 10ms timeouts on Windows means we finally have a reasonably useful mainloop.
This has also helped us to find some power-draining bad behavior that was previously less visible - since frequently executed (say every 30ms) shortish tasks that wastefully woke the CPU without making any progress, now cause a 100% CPU spike - and can be addressed. Thanks to Ashod Nakashian for attacking several of these.
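To make the idea concrete, here is a minimal sketch of a priority-ordered idle queue. It is not VCL's actual scheduler: the IdlePriority values, the IdleQueue class and its method names are all hypothetical, chosen only to show tasks running in priority order with no fixed millisecond delay.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical priorities; a lower value runs first. Not VCL's real enum.
enum class IdlePriority { Resize = 0, Repaint = 1, WordCount = 2 };

struct IdleTask {
    IdlePriority priority;
    std::size_t sequence;          // keeps FIFO order within one priority
    std::function<void()> work;
};

// Orders the heap so the most urgent, oldest task ends up on top.
struct LaterFirst {
    bool operator()(const IdleTask& a, const IdleTask& b) const {
        if (a.priority != b.priority)
            return a.priority > b.priority;
        return a.sequence > b.sequence;
    }
};

class IdleQueue {
public:
    void schedule(IdlePriority priority, std::function<void()> work) {
        assert(work && "an idle task must be callable");
        tasks_.push(IdleTask{priority, next_++, std::move(work)});
    }
    // Run everything that is pending, most urgent first, with no fixed delay.
    void runPending() {
        while (!tasks_.empty()) {
            IdleTask task = tasks_.top();
            tasks_.pop();
            task.work();
        }
    }
private:
    std::priority_queue<IdleTask, std::vector<IdleTask>, LaterFirst> tasks_;
    std::size_t next_ = 0;
};

// Schedules three tasks out of order and records the order they run in.
std::vector<std::string> demoOrder() {
    std::vector<std::string> log;
    IdleQueue queue;
    queue.schedule(IdlePriority::WordCount, [&log] { log.push_back("word-count"); });
    queue.schedule(IdlePriority::Repaint, [&log] { log.push_back("repaint"); });
    queue.schedule(IdlePriority::Resize, [&log] { log.push_back("resize"); });
    queue.runPending();
    return log;
}
```

The point of the sketch is that ordering comes from an explicit priority rather than from whichever timeout happens to fire first.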
For much of its lifetime, VCL widget lifecycle was a bit of a mystery, even to VCL itself. Widgets could be heap allocated, stack allocated, or be members of other widgets. If heap allocated they could be wrapped in various flavours of shared pointers. As such, predicting when a widget would be destroyed, and/or following its lifecycle across the code, was non-trivial. Inside VCL we often used dog-tags: special listeners that would turn null when an object was destroyed, to try to avoid referencing an object involved in several back-to-back callbacks. Unfortunately this support was rather incomplete, and lots of code would end up deferring deleting heap allocated widgets until idle in an attempt to avoid problems.
In an attempt to solve all of this mess, we now have a single
smart pointer type: VclPtr
to reference-count all Window
(and OutputDevice) sub-classes, which are now always heap allocated.
This gives a consistent lifecycle mechanism, which is even
documented.
We moved to a 'dispose' mechanism to break reference-cycles, replacing
the previous explicit or implicit 'delete' mechanism, and have made lots
of methods safe to call even on disposed widgets. This should, in the
end, provide a predictable lifecycle and much less fragile destruction
code paths, making it easier to safely re-factor code. In the meantime
we continue to iron out problems. Thanks to Noel Grandin (Peralex)
for his invaluable help to me with this work, and to Caolán McNamara
(RedHat) and Julien Nabet among others for helping to fix up some
of the aftermath. It is hoped that (ultimately) nearly all long-lived
VCL types will use a similar lifecycle mechanism. This work was made
possible by Caolán's huge re-factor to use VclBuilder for all dialogs.
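The 'dispose' idea can be illustrated with a toy example. This is only a sketch under loud assumptions: std::shared_ptr stands in for VclPtr, and the Widget class with its addChild, setBuddy and dispose methods is hypothetical, not the real VCL Window API. It shows a reference cycle that plain reference counting would never free, and how an explicit dispose step breaks it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy widget, not VCL's Window: std::shared_ptr stands in for VclPtr here.
class Widget {
public:
    // A parent holds a strong reference to each child ...
    void addChild(const std::shared_ptr<Widget>& child) { children_.push_back(child); }
    // ... and, to create a cycle on purpose, a child may point back at its parent.
    void setBuddy(const std::shared_ptr<Widget>& buddy) { buddy_ = buddy; }

    // dispose() drops every reference this widget holds, breaking cycles.
    // It is safe to call more than once.
    void dispose() {
        if (disposed_) return;
        disposed_ = true;
        // Take the children out first so re-entrant calls see a consistent state.
        std::vector<std::shared_ptr<Widget>> children;
        children.swap(children_);
        for (auto& child : children) child->dispose();
        buddy_.reset();
    }
    bool isDisposed() const { return disposed_; }
    // Methods stay callable after dispose(); they just report an empty widget.
    std::size_t childCount() const { return children_.size(); }
private:
    std::vector<std::shared_ptr<Widget>> children_;
    std::shared_ptr<Widget> buddy_;
    bool disposed_ = false;
};

// Returns true if both widgets were freed once the external references went away.
bool demoCycleIsBroken() {
    std::weak_ptr<Widget> watchParent, watchChild;
    {
        auto parent = std::make_shared<Widget>();
        auto child = std::make_shared<Widget>();
        parent->addChild(child);
        child->setBuddy(parent);   // reference cycle
        watchParent = parent;
        watchChild = child;
        parent->dispose();         // break the cycle explicitly
        assert(child->isDisposed() && child->childCount() == 0);
    }
    return watchParent.expired() && watchChild.expired();
}
```

Without the call to dispose(), the two widgets in this sketch would keep each other alive after the enclosing scope ends.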
A bold attempt to switch the code-base from immediate rendering to deferred rendering was initiated. LibreOffice previously rendered what is seen on the screen in one of two ways: either immediately, ie. when you press an 'A' it tries to nail the pixels for 'A' immediately to the screen; or via a very deferred (30+ms delay) idle rendering callback.
This situation is really non-ideal for modern rendering hardware and APIs, where we want to ensure the scene is fully and perfectly painted as a whole before showing it on-screen. Happily with the new idle handling work, there is no longer a hard-coded delay before deferred rendering can occur; so we started the task of removing immediate rendering, and replacing it with deferred rendering. This means replacing explicit rendering calls with area invalidation to queue this area for later re-rendering. In many cases this can remove any visible flickering and other intermediate rendering artifacts as the UI refreshes. Many thanks to Tomaž Vajngerl (Collabora), Miklos Vajna (Collabora), with help and fixes from Krisztian Pinter, Noel Grandin (Peralex), Jan Holesovsky (Collabora), Caolán McNamara (RedHat), Laszlo Nemeth (Collabora).
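As a rough illustration of invalidation replacing explicit rendering calls, consider the sketch below. The Rect and Canvas types and their methods are hypothetical stand-ins rather than VCL classes: callers only mark areas as stale, and a single paint pass, run later from an idle handler, covers the accumulated area.

```cpp
#include <algorithm>
#include <cassert>

struct Rect {
    int left, top, right, bottom;   // right/bottom are exclusive
    bool empty() const { return left >= right || top >= bottom; }
};

// Smallest rectangle covering both inputs.
inline Rect unite(const Rect& a, const Rect& b) {
    if (a.empty()) return b;
    if (b.empty()) return a;
    return Rect{std::min(a.left, b.left), std::min(a.top, b.top),
                std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

class Canvas {
public:
    // Instead of drawing right away, remember which area is stale.
    void invalidate(const Rect& area) { dirty_ = unite(dirty_, area); }

    // Called from an idle handler: paint the accumulated area once.
    // Returns true if anything was painted.
    bool paintIfNeeded() {
        if (dirty_.empty()) return false;
        lastPainted_ = dirty_;
        ++paintCount_;
        dirty_ = Rect{0, 0, 0, 0};
        return true;
    }
    int paintCount() const { return paintCount_; }
    Rect lastPainted() const { return lastPainted_; }
private:
    Rect dirty_{0, 0, 0, 0};
    Rect lastPainted_{0, 0, 0, 0};
    int paintCount_ = 0;
};

// Three invalidations in a row still result in a single paint.
inline int demoPaintCount() {
    Canvas canvas;
    canvas.invalidate(Rect{0, 0, 10, 10});
    canvas.invalidate(Rect{5, 5, 20, 20});
    canvas.invalidate(Rect{2, 2, 4, 4});
    bool painted = canvas.paintIfNeeded();
    assert(painted);
    return canvas.paintCount();
}
```

A real toolkit would track a region rather than one bounding rectangle; the single rectangle keeps the sketch short.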
A very rough, initial gtk3 port was hacked together long ago by yours truly to prototype LibreOffice online via gdk-broadway. However, thanks go to Caolán McNamara (RedHat), who has done 80% of the hard work to finish this, giving us a polished and complete VCL backend for gtk3. His blog entry focuses on the importance of this for running LibreOffice natively under wayland: the previous gtk2 backend was heavily tied to raw X11 rendering, while the new gtk3 backend uses CPU rendering via the VCL headless backend, of which more below.
The OpenGL rendering backend also significantly matured in this version, allowing us to talk directly to the hardware to accelerate much of our rendering, with large numbers of bug fixes and improvements. Many thanks to Louis-Francis Ratté-Boulianne (Collabora), Markus Mohrhard, Luboš Luňák (Collabora), Tomaž Vajngerl (Collabora), Jan Holesovsky (Collabora), Tor Lillqvist (Collabora), Chris Sherlock and others. It is hoped that, with the ongoing bug-fixing here, this can be enabled by default as a late feature, after suitable review, for LibreOffice 5.0.1 or at the outside 5.0.2.
LibreOfficeKit provides an easy way to re-use the rendering, file-format and now editing core from LibreOffice. In the last six months it has gone from being primarily useful for file format conversion, to being the foundation of LibreOffice on Android, and Online.
LibreOfficeKit re-uses our headless
rendering
backend, which allows us to render documents without underlying OS
assistance, ie. without X11, Windows, OS/X etc. A number of
performance and other rendering fixes were implemented here as
part of the gtk3 and online work (headless rendering is also used
on Android while our GL backend is maturing for that platform).
Thanks to Caolán McNamara (RedHat) and Michael Meeks (Collabora).
Android editing builds on top of the LibreOfficeKit editing features, and provides the user with the Android equivalent of the gtktiledviewer feature list, like native cursor, text and graphic selection, resizing and more. Thanks to The Document Foundation & their generous donors, these significant API extensions and core work were done by Miklos Vajna, Tor Lillqvist, Andrzej Hunt, Siqi Liu, Mihai Varga, Tomaž Vajngerl and Jan Holesovsky all of Collabora, as well as work from Pranav Kant (GSOC), and cleanups from Stephan Bergmann (RedHat).
LibreOfficeKit (alongside an adapted leaflet) is the basis for the new work targeting LibreOffice at the Cloud; check out the code and a presentation. Huge amounts of tangled heavy lifting here were done thanks to: Tor Lillqvist, Mihai Varga, Jan Holesovsky, Henry Castro and Miklos Vajna, all of Collabora. With thanks to IceWarp for funding this work.
LibreOfficeKit provides a nice simple, clean API for loading and
saving (ie. converting) documents. Thanks to Laszlo Nemeth (Collabora) and Mihai Varga (Collabora)
we now have a new filter attribute: SkipImages
to allow a
significant acceleration for the use-case of converting any file type
to HTML. This is really useful for re-using the wide range of LibreOffice
filters to do document text indexing - giving a very significant speedup
for large and complex documents. Another vital win here was to avoid
doing an accurate word-count before export (for document statistics).
Document conversion to text with this option should be significantly
quicker for certain documents.
With increasing template use in headers, compile times have taken a
turn for the slower. Thanks to Michael Stahl (Red Hat), who created a
nice bin/includebloat script, we can locate the largest and
most problematic headers to be removed. As an example dropping
boost/utility.hpp from several places removes ~830Mb of
boost/preprocessor/seq/fold_left.hpp pre-processing.
The 5.0 release debuts a Win64 build - with many thanks to David Ostrovsky (CIB) with help from Thorsten Behrens (CIB), Norbert Thiebaud, Stephan Bergmann (RedHat) and others fixing and cleaning up a number of nasty platform-specific corner-cases across the suite. While we have had many 64bit platforms for years, the Windows LLP64 model can create issues.
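For readers unfamiliar with the LLP64 pitfall: on 64-bit Windows, long stays 32 bits wide while pointers and std::size_t are 64 bits, whereas on LP64 platforms (64-bit Linux and Mac) long is 64 bits. The helper names below are hypothetical and only illustrate the class of problem, not any particular fix made in LibreOffice.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// True on LP64, false on LLP64: code that stores pointers or sizes in a long
// only works where this holds.
constexpr bool longHoldsPointer() { return sizeof(long) >= sizeof(void*); }

// Risky: silently truncates on LLP64 once the count exceeds 32 bits.
inline long countAsLong(std::size_t n) { return static_cast<long>(n); }

// Safer: the width is the same on every 64-bit platform.
inline std::int64_t countAsInt64(std::size_t n) { return static_cast<std::int64_t>(n); }

static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is exactly 64 bits");
static_assert(sizeof(std::uintptr_t) >= sizeof(void*), "uintptr_t can hold a pointer");

// Small values behave the same either way; the difference only shows up
// for values that do not fit in 32 bits.
inline bool demoWidths() {
    assert(countAsLong(42) == 42);
    assert(countAsInt64(42) == 42);
    return sizeof(std::int64_t) * CHAR_BIT == 64;
}
```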
Work is ongoing around code quality in many areas, with 120
or so cppcheck fixes thanks to Caolán McNamara (RedHat),
Michael Weghorn, Julien Nabet, Noel Grandin (Peralex), and others,
along with the daily commits to build without any compile
warnings (-Werror -Wall -Wextra etc.) on many platforms,
with thanks primarily to Tor Lillqvist (Collabora) and Caolán
McNamara (Red Hat). This category of problems however is
shrinking thanks to the increasing use of CI.
Having hit nearly zero coverity issues, Caolán McNamara (RedHat) (with some help from others) does an awesome job of keeping the count at (or nearly at) zero each week, with ~360 commits this cycle. We routinely have a few new issues in each build and fix a few others, the total being currently two issues (of 6+ million lines analyzed). Hopefully keeping the numbers at zero is a reasonably achievable goal.
The company OOO "Program Verification Systems" develops the PVS-Studio static analysis tool and made results of a one-time analysis run available to LibreOffice developers. Dozens of reported issues were fixed by Caolán McNamara (RedHat), Michael Stahl (RedHat), David Tardon (RedHat), and Markus Mohrhard. You can read more about that (with cartoon) here.
With the new TDF donor funded crash-testing hardware, combined with a significant effort from Caolán McNamara (RedHat), Michael Stahl (RedHat), Markus Mohrhard and several others, we have got the number of (paranoid) assertions and/or crashes on import of our significant bugzilla document corpus (of 75k+ dodgy bug documents) down to effectively zero. It's wonderful to be able to catch commits that cause regressions here and nail them within days on master, before they have a chance to escape into the user-base.
Ongoing work here is to compile the crash-testing binaries with Address Sanitizer as well as starting to fuzz various document types and expanding the set of input file-types.
We have continued to add to our clang compiler plugins; a quick git grep
for 'Registration' in compilerplugins
shows that we've gone from
38 to 59 in the last six months (double the growth of last release). These are
used to check for all manner of nasty gotchas, and also to automatically re-write
various problematic bits of code. Many are run automatically by tinderboxes to
catch badness. Thanks to:
Stephan Bergmann (Red Hat) and Noel Grandin (Peralex) for their hard work on these checkers this cycle.
The new plugins do all sorts of things, and usually come complete with a set of relevant fixes for the underlying code; here are some examples:
Simplifying a ? false : true to !a.
Catching C-style casts such as class Foo; Foo *pFoo = (Foo *)pBaa; ie. when a type is incomplete. These should really be safer static_casts. Also we detected and removed un-necessary casts to make the code easier to understand.
Using = delete to entail further compiler optimizations and warnings.
Other sets of cleanups were also clang assisted, such as Noel's attack on cleaning up, making consistent and nicely scoping our enumerations; Stephan's drive to detect and remove implicit bool conversion, switching many inline methods from sal_Bool (really an unsigned char) to a true 'bool' wherever possible; and several other helpful plugins.
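The kinds of pattern mentioned in the plugin examples can be shown in a few lines. This is an illustrative sketch only: the class and function names are hypothetical and none of it is taken from the LibreOffice code-base or from the plugins themselves.

```cpp
#include <cassert>
#include <type_traits>

// 1. Bool simplification: "a ? false : true" is just "!a".
inline bool isHiddenBefore(bool visible) { return visible ? false : true; }
inline bool isHiddenAfter(bool visible) { return !visible; }

// 2. Casts: with complete types, static_cast lets the compiler check the
//    relationship between the classes and adjust the pointer correctly.
struct Base { virtual ~Base() {} int baseValue = 1; };
struct Extra { virtual ~Extra() {} int extraValue = 2; };
struct Derived : Extra, Base { int derivedValue = 3; };

inline Derived* toDerived(Base* pBase) { return static_cast<Derived*>(pBase); }

// 3. "= delete": copying is rejected at compile time with a clear error.
class Handle {
public:
    explicit Handle(int id) : id_(id) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    int id() const { return id_; }
private:
    int id_;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "Handle must not be copyable");

// Round-trips a pointer through the Base sub-object and back.
inline int demoCast() {
    Derived d;
    Base* pBase = &d;
    Derived* pDerived = toDerived(pBase);
    assert(pDerived == &d);
    return pDerived->derivedValue;
}
```

With multiple inheritance, as here, the Base sub-object does not sit at the start of Derived, so the cast has to adjust the pointer; a C-style cast on an incomplete type cannot know that.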
We also built and executed more unit tests with LibreOffice 5.0 to avoid regressions as we change the code. Grepping for the relevant TEST and ASSERT macros we continue to grow the number of unit tests:
Contributors to the qa/ directories include: Miklos Vajna (Collabora), Markus Mohrhard,
Caolán McNamara (RedHat), Stephan Bergmann (RedHat), Noel Grandin (Peralex),
Michael Meeks (Collabora), Michael Stahl (RedHat), Zolnai Tamás,
Tor Lillqvist (Collabora), Bjoern Michaelsen (Canonical), Eike Rathke (RedHat),
Takeshi Abe, Andras Timar (Collabora), Priyanka Gaikwad (Synerzip).
While we have had a subset of unit tests that we run at compile time
on Windows, our larger battery of make check tests has been
hindered by strange thread-affine behavior on Windows related to handling
various Window and event resources. Thanks to various locking and
inter-thread messaging fixes from Michael Stahl (RedHat) and Stephan
Bergmann (RedHat) we now have far more robust and reliable
unit testing on Windows.
One metric we watch in the ESC call is who is in the top ten in the freedesktop Weekly bug summary. Here is a list of the people who have appeared more than five times in the weekly list of top bug closers in order of frequency of appearance: Adolfo Jayme, Beluga, Caolán McNamara (RedHat), raal, Julien Nabet, Jean-Baptiste Faure, Markus Mohrhard, m.a.riosv, Gordo, V Stuart Foote, Eike Rathke (RedHat), Andras Timar (Collabora), Alex Thurgood, Yousuf (Jay) Philips, Miklos Vajna (Collabora), Joel Madero, Cor Nouws, Michael Stahl (RedHat), Michael Meeks (Collabora), Matthew Francis, David Tardon (Redhat), tommy27, Timur, Robinson Tryon (qubit) (TDF). And thanks to the many others that helped to close and triage so many bugs for this release.
Thanks to Norbert Thiebaud we now have some rather excellent Jenkins / CI integration with gerrit, to allow us to test-build all incoming patches across our three major platforms. Using CI to test patches before pushing them to master has become another valuable tool to increase the quality of master (and thus its accessibility to casual builders), and to allow those without access to Windows & Mac devices to check their code builds there. Thanks to ByteMark and TDF donors we hope to have even more, fast hardware to throw at the CI build farm soon, making this an even more attractive route to test submitted code. With over 25,000 builds from 13 build slaves since the beginning of the year (which compares favourably with the around 11,000 commits), it is hoped that with enough hardware we can compile and run tests vs. all incoming commits in future without introducing excessive latency.
Also for the next development cycle we have enabled tests beyond
those run during compile. We enable a slew of extra assertions in a
dbgutil
build and run make check
at least on Linux
to apply a much larger set of extra tests to each individual commit.
In this cycle we expanded the great Bi(nary)Bisect(ion) repositories, which contain thousands of compressed pre-built binaries to allow end-users to quickly track a regression down to almost a single commit, even long after the date it was introduced. They now include Mac and Windows builds for the 5.0 epoch (ie. the range from the 4.4 branch to 5.0 branching). The 5.1 epoch is being built and refreshed reasonably regularly. Many thanks to Norbert Thiebaud, Matthew Jay Francis & Robinson Tryon (qubit) (TDF).
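The reason bisection scales to thousands of builds is that each test halves the remaining range. Here is a minimal sketch of the search itself; the function name and the isBad callback are hypothetical, standing in for "install this pre-built binary and try to reproduce the bug", and it assumes that once a regression appears every later build shows it too.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Finds the index of the first "bad" build in [0, count), assuming build 0 is
// good, build count-1 is bad, and every build after the first bad one is bad.
std::size_t firstBadBuild(std::size_t count,
                          const std::function<bool(std::size_t)>& isBad,
                          std::size_t* testsRun = nullptr) {
    assert(count >= 2);
    std::size_t good = 0;          // known good
    std::size_t bad = count - 1;   // known bad
    std::size_t tests = 0;
    while (bad - good > 1) {
        std::size_t mid = good + (bad - good) / 2;
        ++tests;
        if (isBad(mid))
            bad = mid;             // regression is at mid or earlier
        else
            good = mid;            // regression is after mid
    }
    if (testsRun) *testsRun = tests;
    return bad;
}
```

For a range of 1000 builds this needs at most ten manual tests to pin down the first bad build.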
Code that is dirty should be cleaned up - so we did a lot of that left & right:
In the 5.0 release we started to move more aggressively to the
subset of C++11 we can now use with our updated compiler baselines:
features such as variadic templates, simpler initializations, and more.
Work also involved removing deprecated std:: functions such
as std::ptr_fun, and using std::any_of & std::none_of
and other newer constructs such as auto. Thanks go to many
hackers cleaning the code
including Stephan Bergmann (RedHat), Takeshi Abe, Nathan Yee,
Bjoern Michaelsen (Canonical) and others.
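As a small illustration of this style of cleanup (the helper names are hypothetical and not from the LibreOffice sources), a lambda can take the place of a std::ptr_fun wrapper, and std::any_of / std::none_of can replace hand-written search loops:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// True if the string contains only whitespace (an empty string counts as blank).
inline bool isBlank(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}

inline bool anyBlank(const std::vector<std::string>& lines) {
    // A lambda replaces the old std::ptr_fun adaptor.
    return std::any_of(lines.begin(), lines.end(),
                       [](const std::string& s) { return isBlank(s); });
}

inline bool noneBlank(const std::vector<std::string>& lines) {
    return std::none_of(lines.begin(), lines.end(), isBlank);
}

// 'auto' saves spelling out the iterator type.
inline std::size_t firstBlankIndex(const std::vector<std::string>& lines) {
    auto it = std::find_if(lines.begin(), lines.end(), isBlank);
    assert(it >= lines.begin() && it <= lines.end());
    return static_cast<std::size_t>(it - lines.begin());
}
```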
Thanks to Maxim Monastirsky we saved many hundreds of lines of duplicate code from the framework, by creating nice generic controllers that could be controlled via small, clean XML configuration descriptions - great to see such cleanups.
A number of legacy structures in LibreOffice have used 16bit indices, and stored / serialized these to various structures for many years. This can cause problems with very large mail merges, such as those in use at Munich City. Thanks to Katarina Behrens (CIB), Writer in 5.0 allows more than 64k of: Page Descriptions, Sections and Style Names.
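To see why a 16bit index becomes a problem past 64k items, here is a tiny sketch; the helper names are hypothetical. Narrowing a larger index to 16 bits wraps modulo 65536, so two distinct items end up sharing one index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// A 16-bit index can address at most 65536 distinct entries (0..65535).
constexpr std::size_t kMaxIndex16 = std::numeric_limits<std::uint16_t>::max();

// Narrowing a larger index to 16 bits wraps modulo 65536 ...
inline std::uint16_t narrowIndex(std::size_t index) {
    return static_cast<std::uint16_t>(index);
}

// ... so two distinct entries can end up with the same stored index.
inline bool collides(std::size_t a, std::size_t b) {
    return a != b && narrowIndex(a) == narrowIndex(b);
}

// Entry 65536 lands on the same stored index as entry 0.
inline bool demoWrapAround() {
    assert(narrowIndex(kMaxIndex16) == 65535);
    return collides(0, kMaxIndex16 + 1);
}
```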
We continued to make progress translating German code comments, but somehow the
last ~5000 lines of comment persistently appear to defy translation.
Answers by E-mail postcard from German speakers much appreciated. Many
thanks to: Michael Weghorn, Michael Jaumann (Munich), Daniel Sikeler
(Munich), Albert Thuswaldner, Christian M. Heller, Philipp Weissenbacher.
There are now only the following eight modules left to do:
include, reportdesign, rsc, sc, sfx2, stoc, svx, sw.
A systematic set of improvements to our usage of the std:: containers has
been going on through the code. Things like avoiding inheritance from
std::vector, changing std::deque to std::vector, and starting
to use the newer C++ constructs for iteration like
for (auto& it : aTheContainer) { ... }. There are many people to credit
here, thanks to
Stephan Bergmann (Red Hat), Takeshi Abe, Tor Lillqvist (Collabora),
Caolán McNamara (Red Hat), Michaël Lefèvre, and many others.
Thanks to Bjoern Michaelsen (Canonical) we have had a few key, long desired Writer cleanups in 5.0. These include:
Work on SwClient/SwModify, also adding a test harness to clarify its interface. Ultimately, the goal is to move away from this implementation towards one of the more modern implementations we use elsewhere. This work should help find a migration path later.
Work on sw::Ring, and adding tests to clarify its interface.
The resourcemodel building block of writerfilter (that handles Writer's DOCX and RTF import in LibreOffice) was basically a bucket of old and unused stuff. The few still needed pieces from it are now moved into the relevant mapper/tokenizer/filter parts, and the rest is now removed. You can read more detail thanks to Miklos Vajna (Collabora).
We had a number of other wins that are somewhat difficult to categorize, but well worth noting:
MS Office 2007 has an unhelpfully different set of default values for many of its attributes, ie. the same XML (with an attribute omitted) can produce different results in Office 2007 and later versions. Clearly this is a little irritating. Thanks to Markus Mohrhard for adding some infrastructure (and a set of fixes) for known problematic attributes in this regard. This should improve our interoperability with the zoo of documents out there.
Thanks to TDF's donors and Jacobo Aragunde Pérez (Igalia) we implemented an abstract file-system API for Android - to allow arbitrary file-system backends to be plugged in (in a separate thread). An example OwnCloud backend was implemented to show-case this.
Thanks to Matthew Nicholls we removed a couple of thousand
lines of redundant wrappers in svx's dbtoolsclient, which
was duplicated elsewhere in connectivity. Great to see this
much cruft leave the code-base.
I hope you get the idea that more developers continue to find a home at LibreOffice and work together to complete some rather significant work both under the hood, and also on the surface. If you want to get involved there are plenty of great people to meet and work alongside. As you can see individuals make a huge impact to the diversity of LibreOffice (the colour legends on the right should be read left to right, top to bottom, which maps to top down in the chart):
And also in terms of diversity of code commits, we love to see the unaffiliated volunteers' contribution by volume, though clearly the volume and balance changes with the season, release cycle, and volunteers' vacation / business plans:
Naturally we maintain a list of small, bite-sized tasks which you can use to get involved at our Easy Hacks page, with simple build / setup instructions. It is extremely easy to build LibreOffice, each easy-hack should have code pointers and be a nicely self contained task that is easy to solve. In addition some of them are really nice-to-have features or performance improvements. Please do consider getting stuck in with something.
Another thing that really helps is running pre-release builds and reporting bugs: just grab and install a pre-release and you're ready to contribute alongside the rest of the development team.
LibreOffice 5.0 is a great new foundation for building the next series of releases which will incrementally improve not only features, but also the foundation of the Free Software office suite. It is of course not perfect yet; this is the first in a long series of monthly 5.0.x releases, and six monthly 5.x releases, which will bring a stream of bug fixes and quality improvements over the next months and years.
I hope you enjoy LibreOffice 5.0.0. Thanks for reading, don't forget to check out the user visible feature page, and thank you for supporting LibreOffice.
Raw data for many of the above graphs is available.
My content in this blog and associated images / data under
images/
and data/
directories are (usually)
created by me and (unless obviously labelled otherwise) are licensed under
the public domain, and/or if that doesn't float your boat a CC0
license. I encourage linking back (of course) to help people decide for
themselves, in context, in the battle for ideas, and I love fixes /
improvements / corrections by private mail.
In case it's not painfully obvious: the reflections reflected here are my own; mine, all mine ! and don't reflect the views of Collabora, SUSE, Novell, The Document Foundation, Spaghetti Hurlers (International), or anyone else. It's also important to realise that I'm not in on the Swedish Conspiracy. Occasionally people ask for formal photos for conferences or fun.
Michael Meeks (michael.meeks@collabora.com)