So I've just finished writing the xliffmerge script, a direct analog of msgmerge@gettext and pomerge@translate-toolkit, but made specifically for XLIFF, i.e. it takes advantage of what XLIFF offers, like saving information about each template update, specifying the phase in which each unit was last modified, and so on.
I was surprised to find out that translate-toolkit parses XML files into its own internal representation, which means it loses everything it doesn't support.
So to do it in a nice way, I chose to work directly with the DOM representation of the XLIFF file (using QDomDocument and friends). In Lokalize, the XliffStorage class is just a wrapper around QDomDocument.
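To illustrate why editing the DOM in place keeps unsupported data intact, here is a minimal sketch. It uses Python's standard xml.dom.minidom as a stand-in for QDomDocument so that it runs without Qt. The sample XLIFF string and the set_target() helper are hypothetical and are not taken from xliffmerge.py.

```python
from xml.dom import minidom

# Hypothetical sample file: one unit carrying a phase-name attribute and a note.
XLIFF = (
    '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    '<file original="app.pot" source-language="en" target-language="uk" '
    'datatype="plaintext">'
    '<body>'
    '<trans-unit id="1" phase-name="review-1">'
    '<source>Open</source><target>Old</target>'
    '<note from="reviewer">check context</note>'
    '</trans-unit>'
    '</body></file></xliff>'
)


def set_target(doc, unit_id, text):
    """Replace the target text of one unit, touching nothing else in the tree."""
    for unit in doc.getElementsByTagName("trans-unit"):
        if unit.getAttribute("id") == unit_id:
            target = unit.getElementsByTagName("target")[0]
            # Drop the old content and put the new text in its place.
            while target.firstChild:
                target.removeChild(target.firstChild)
            target.appendChild(doc.createTextNode(text))
            return True
    return False


doc = minidom.parseString(XLIFF)
set_target(doc, "1", "New")
result = doc.documentElement.toxml()
print(result)
```

Only the target node is changed here. The note and the phase-name attribute come back out of the serializer because the code never had to understand them in the first place.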
I enabled whitespace preservation, but in some cases it still added extra whitespace, which means it modified user-editable text. The cases were clear: when there is no character data between tags, it 'formats' them by inserting a newline character plus indent spaces. To override this behaviour I just insert empty text nodes between such tags. You can see the code at the end of xliffmerge.py (fixWhiteSpace*()).
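The empty-text-node trick can be sketched roughly like this. This is not the actual fixWhiteSpace*() code: fix_whitespace() is a hypothetical name, and xml.dom.minidom is again used instead of QDomDocument so the sketch runs without Qt. minidom's toxml() does not reformat anything, so this only shows the tree walk and where the empty nodes go, not the effect on Qt's serializer.

```python
from xml.dom import minidom


def fix_whitespace(node):
    """Put an empty text node wherever two tags touch with no text between them."""
    doc = node.ownerDocument
    prev_was_tag = True  # the opening tag of `node` itself counts as a tag
    child = node.firstChild
    while child is not None:
        is_elem = child.nodeType == child.ELEMENT_NODE
        if is_elem:
            if prev_was_tag:
                # No character data between the previous tag and this one.
                node.insertBefore(doc.createTextNode(""), child)
            fix_whitespace(child)
        prev_was_tag = is_elem
        child = child.nextSibling
    # Same situation between the last child tag and the closing tag of `node`.
    last = node.lastChild
    if last is not None and last.nodeType == last.ELEMENT_NODE:
        node.appendChild(doc.createTextNode(""))


# Hypothetical target content: two inline tags with nothing between them.
doc = minidom.parseString('<target><g id="1">a</g><g id="2">b</g></target>')
root = doc.documentElement
before = root.toxml()
fix_whitespace(root)
after = root.toxml()
kinds = [c.nodeType for c in root.childNodes]
```

In this sketch the function would be called on the source and target elements only, since those hold the user-editable text.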