Apache OpenOffice (AOO) Bugzilla – Issue 125179
Opening complex docx document takes several minutes (but succeeds)
Last modified: 2017-05-20 10:35:14 UTC
Opening the attached docx file (the current F# draft language specification) initially appears to hang, with the soffice.bin process using 100% of one processor core. It eventually succeeds (taking just short of 4 minutes on my Core i7 system, loading the document from SSD, with 32GB RAM). If Writer was launched by double-clicking the file then no GUI becomes visible (and if Writer was already open then it remains unresponsive) until this process is complete. The conversion result seems faultless. There is no problem saving to *.odt or *.doc format, and printing also works fine. The document has 301 pages (in original format) and many tables. Is a delay of this order to be expected? If so, some sort of UI (perhaps with a progress bar) would be helpful.
Created attachment 83623: F# 3.1 draft language specification. Source: http://fsharp.org/specs/language-spec/
OK (though a little slow) with 3.4.1 and 4.0.1.
Regression in 4.1.0 and the nightly build (AOO420m1(Build:9800)- Rev. 1605918, 2014-06-27_04:11:21 - Rev. 1605944).
@Tim: if this is a blocker for your daily work, try reverting to the previous version, 4.0.1.
(In reply to Tim Baigent from comment #0)
> Is a delay of this order to be expected? If so, some sort of UI (perhaps
> with a progress bar) would be helpful.

The lack of a progress bar seems to be another bug with the filter, because the document in Issue 125055 shows a progress bar (it is loaded with a different filter).
@Ariel, Thanks for giving your attention to this. (Any faster and you would have responded before I submitted the report!) It's not a blocker for me personally since I can usually avoid docx. Thanks again.
@Ariel: Can you figure out whether this performance decrease has the same root cause as issue 125055?
@Tim: Could you try the recent developer snapshot build that was created for early testing of our planned 4.1.1 release? You can find the snapshot builds at https://cwiki.apache.org/confluence/display/OOOUSERS/Development+Snapshot+Builds
@Oliver, With 4.1.1 M1 I get a slightly different result: opening the file by double-clicking it in Windows Explorer, I now immediately get an OpenOffice splash screen, with a wait cursor. Otherwise the behaviour is the same: the soffice.bin process uses one full core, and the OpenOffice GUI with the translated document appears after 3 min 50 s.
Tim
(In reply to Oliver-Rainer Wittmann from comment #5)
> @Ariel:
> Can you figure out, if this performance decrease has the same root cause as
> issue 125055?

It is reproducible with

AOO411m1(Build:9770) - Rev. 1603804 2014-06-19 10:08 - Linux x86_64
AOO420m1(Build:9800)- Rev. 1605918 2014-06-27_04:11:21 - Rev. 1605944

so the fix for issue 125055 doesn't seem to solve this one.
With AOO411m1(Build:9770) - Rev. 1604099 2014-06-30_07:13:14-Rev.1606633 on Linux-32 with 4GB RAM, I gave up after 4 minutes: no document had appeared, and AOO was essentially hung.
Took a look using a trunk debug version and VerySleepy. I do not have enough expertise in Writer, but wanted to check for a possibly easy-to-find bottleneck. Findings:
- SwXBookmark::attach takes a lot of time (deactivated to get past it)
- SwTable::CheckConsistency() takes a lot of time; maybe it should only be called after importing (?)
- Only in trunk: SwFmt::GetBackground is used to decide whether to call SetCompletePaint(); this could be avoided by first checking whether it is already set
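The third finding can be illustrated with a minimal sketch, assuming the expensive part is the background lookup and that a complete-paint flag, once set, stays set. The `Format` and `Frame` classes and their methods are hypothetical stand-ins written for this illustration; they are not the actual Writer classes.

```python
class Format:
    """Hypothetical stand-in for a format whose background lookup is costly."""

    def __init__(self, has_background):
        self._has_background = has_background
        self.lookups = 0  # counts how often the expensive lookup ran

    def get_background(self):
        self.lookups += 1
        return self._has_background


class Frame:
    """Hypothetical stand-in for a layout frame with a complete-paint flag."""

    def __init__(self, fmt):
        self.fmt = fmt
        self.complete_paint = False

    def invalidate_unguarded(self):
        # Always performs the expensive lookup, even if the flag is already set.
        if self.fmt.get_background():
            self.complete_paint = True

    def invalidate_guarded(self):
        # Checks the cheap flag first; the lookup runs only while it is unset.
        if not self.complete_paint and self.fmt.get_background():
            self.complete_paint = True


if __name__ == "__main__":
    unguarded = Frame(Format(True))
    guarded = Frame(Format(True))
    for _ in range(1000):
        unguarded.invalidate_unguarded()
        guarded.invalidate_guarded()
    print(unguarded.fmt.lookups, guarded.fmt.lookups)
```

Note that the guard only helps for frames that do have a background: if the lookup keeps returning nothing, the flag is never set and both variants do the same work.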
One more: after repagination, every action triggers a huge number of assertions from SwTxtINetFmt::GetCharFmt(): "<SwTxtINetFmt::GetCharFmt()> - missing character format at hyperlink attribute". This looks like something that should be corrected at import time (?)
Taking over to have a closer look.
I have already figured out that the *.docx import of overlapping bookmarks has been broken since AOO 4.1.0 - see issue 125215.
For the in-place editing of Input Fields, some new data structures were introduced in the mark manager, which also cause a certain performance decrease.
(In reply to Oliver-Rainer Wittmann from comment #13)
> I have already figured out that the *.docx import of overlapping bookmarks
> is broken since AOO 4.1.0 - see issue 125215
>
> For the in-place editing of Input Fields some new data structures are
> introduced at the mark manager which also cause a certain performance
> decrease.

The fixes for these two observations, provided with issue 125215, resolved the observed performance decrease.
Great. What was the problem? (Verbose details very welcome.) In which builds is it fixed?
(In reply to Tim Baigent from comment #15)
> Great.
>
> What was the problem? (Verbose details very welcome.)
>

I think three issues caused the performance decrease:
(a) The one already fixed with issue 125055. All mark containers were sorted each time text was inserted into a paragraph at a position where a mark ends. The sorting is needed to fix issue 124338, but only if another mark starts at the insertion position.
(b) The import of bookmarks from *.docx documents was completely broken. This caused the insertion of hundreds of wrong bookmarks without a corresponding bookmark name. The method that finds a unique name is not efficient, but it does not have to be, because it should not be triggered that often.
(c) An additional mark container had been introduced for the enhancement of annotations/comments on text ranges, but the cost of this container does not justify its benefits. It was only used in two use cases, which could also work with the existing containers.

> In which builds is it fixed?

On trunk: the next buildbot builds from trunk should include the fixes.
For the planned 4.1.1: the fixes will not be included in the announced milestone 2, but the next milestone will contain them.
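Points (a) and (b) can be sketched in a simplified model. Everything here is an assumption made for illustration: the `Mark` type, the two `needs_sort_*` predicates, and the naive `make_unique_name` probing scheme are hypothetical and are not taken from the AOO mark manager, whose actual naming method may differ. The sketch only shows why the narrower sort condition fires less often, and why generating names for many unnamed bookmarks gets expensive with a linear search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    name: str
    start: int
    end: int


def needs_sort_old(marks, pos):
    """(a) before the fix: re-sort whenever some mark ends at pos."""
    return any(m.end == pos for m in marks)


def needs_sort_new(marks, pos):
    """(a) after the fix: re-sort only if, additionally, another mark starts at pos."""
    return any(
        a.end == pos and b.start == pos and a is not b
        for a in marks
        for b in marks
    )


def make_unique_name(existing, base="Bookmark"):
    """(b) naive probing: try base1, base2, ... until a name is free.

    Returns the name and the number of candidates that were tried.
    """
    n = 1
    while f"{base}{n}" in existing:
        n += 1
    return f"{base}{n}", n


def import_unnamed_marks(count):
    """Total candidates tried when `count` unnamed marks each need a fresh name."""
    existing = set()
    total = 0
    for _ in range(count):
        name, probes = make_unique_name(existing)
        existing.add(name)
        total += probes
    return total


if __name__ == "__main__":
    marks = [Mark("a", 0, 5), Mark("b", 5, 9), Mark("c", 2, 7)]
    positions = (5, 7, 9)
    print(sum(needs_sort_old(marks, p) for p in positions))  # sorts, old condition
    print(sum(needs_sort_new(marks, p) for p in positions))  # sorts, new condition
    print(import_unnamed_marks(100))  # grows quadratically with the mark count
```

In this model the probing cost is harmless for a handful of bookmarks, which matches the remark that the method does not have to be efficient; it only becomes a problem when a broken import creates hundreds of unnamed marks.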
(In reply to Oliver-Rainer Wittmann from comment #16) Thanks for satisfying my curiosity. It's very interesting to get some feel for how these things are tackled. I've been very impressed with the process.