Apache OpenOffice (AOO) Bugzilla – Issue 114482
Creating new Hebrew document in Word 95 format produces corrupted file (Attached)
Last modified: 2017-12-04 23:18:16 UTC
Hello,
I managed to reproduce this bug twice on my Fedora 13/x86_64 machine.
Steps:
1. Open a new OOWriter document.
2. Write a Hebrew document.
3. Save the file as a Word/2K .doc file.
4. Open the file using Writer.
- Gilboa
Created attachment 71667 [details] Corrupted file.
When trying to "save as" your document in OOo 3.3, the filter listbox proposed Word 95, which indicates you saved it as Word 95 and not Word 2000. Furthermore, I couldn't notice a change in format when saving on SuSE with the native OOo build. Thus:
- Either you chose the wrong filter while saving
- or you are using a non-native (from your distribution) OOo version which shows Word 2000 but saves to Word 95 (?).
Anyway, not reproducible, i.e. invalid.
Closed
Forgive me for reopening the issue. It seems that I was mistaken, and I did in fact save the document as Word 95. Nevertheless, I believe you misunderstood the bug report.
Take document A. (Minrav-tiles-initial.odt)
Save it as Word 2K. (Minrav-tiles-initial-2000.doc)
Save it as Word 95. (Minrav-tiles-initial-95.doc)
Even assuming that you cannot read Hebrew, you should be able to notice that the Word 95 version is completely corrupted and uses the wrong character set. Now, it's entirely possible that Word 95 doesn't currently support UTF-8, hence breaking non-English text completely. In such a case, saving any Unicode document in this format should be disabled. If Word 95 does currently support UTF-8, this is a bug.
- Gilboa
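To illustrate the character-set hypothesis above, here is a minimal Python sketch. It is not the OOo export filter. It assumes, purely for illustration, that the legacy format stores text in a single 8-bit Windows code page, with cp1255 (Hebrew) as the matching one and cp1252 (Western) standing in for a wrongly chosen one. The helper name `encode_legacy` is hypothetical.

```python
# Sketch only: models the suspected failure mode, not the real export code.
HEBREW_SAMPLE = "שלום עולם"  # "Hello world" in Hebrew


def encode_legacy(text: str, codepage: str) -> bytes:
    """Encode text into an 8-bit code page (hypothetical helper).

    Characters the code page cannot represent are replaced by '?',
    which is how text gets lost rather than merely mislabeled.
    """
    return text.encode(codepage, errors="replace")


# Assumption: cp1255 is the code page that matches the document language.
good = encode_legacy(HEBREW_SAMPLE, "cp1255")
# Assumption: cp1252 stands in for a code page picked without regard to language.
bad = encode_legacy(HEBREW_SAMPLE, "cp1252")

print(good.decode("cp1255") == HEBREW_SAMPLE)  # text survives the round trip
print(bad.decode("cp1252"))                    # Hebrew letters are gone
```

If the export behaves like the second case, the Hebrew text cannot be recovered from the saved file at all, which would match the "completely corrupted" result.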
Created attachment 71671 [details] Document - original
Created attachment 71672 [details] Document - Word2K
Created attachment 71673 [details] Document - Word95, corrupted.
Reproducible on OpenOffice.org 3.2.1 OOO320m19 (Build:9505). The corruption happens when exporting directly from ODT to Word 95. So the Word 2K file only shows that this is not a general ODT -> DOC problem.
This only happens when exporting to Word 95. Changing summary accordingly. @hbrinkm: MS Office XP seems to be able to load and save Hebrew to Word 95 format. Hebrew text is lost when saving to Word 95 in OOo. I don't know how much effort we can put into fixing this for a 15-year-old file format...
This also happens with the Word 6 format, not only Word 95. And the problem is in the export: I could open my old documents, which were created 15 years ago. @es: it might be an old file format, but it might be important for archives.
Created attachment 72104 [details] screen shot of the problem in 3.2.1
Created attachment 72105 [details] screen shot of the problem in 3.3RC1
There seems to be an improvement regarding this bug in 3.3RC1. The Word 95 and Word 6 files now look like they use the wrong encoding instead of being completely corrupted. I've attached two screenshots to better illustrate the change.
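The difference between "wrong encoding" and "completely corrupted" matters for diagnosis, because a wrong-encoding result is usually reversible. A minimal Python sketch of that distinction follows. The code pages are assumptions chosen for illustration (cp1255 for the stored bytes, cp1252 for the misinterpretation), and nothing here is taken from the actual 3.3RC1 filter.

```python
# Sketch only: shows why text read with the wrong code page can still be recovered.
SAMPLE = "שלום"  # "Hello" in Hebrew

# Assumption: the file holds the text as cp1255 (Windows Hebrew) bytes.
stored = SAMPLE.encode("cp1255")

# Assumption: the reader interprets the same bytes as cp1252 (Windows Western).
# The result is readable-looking but wrong text (accented Latin letters).
misread = stored.decode("cp1252")

# Because no bytes were dropped, reversing the wrong step restores the original.
recovered = misread.encode("cp1252").decode("cp1255")

print(misread != SAMPLE)    # the displayed text is wrong
print(recovered == SAMPLE)  # but the information is still in the file
```

If the files from 3.3RC1 can be repaired this way, the bytes written are probably fine and only the code page recorded or assumed on load is wrong.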
This issue affects not only Hebrew, but also Arabic and Persian (AR + FA), and all East Asian languages (Japanese, Korean, Chinese...): export to Word 6 / Word 95 format causes the text to be unreadable. (Note that the original file needs to be closed before loading the exported file.)
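A quick way to see why all of these scripts could be hit by the same defect is to check which 8-bit or double-byte code page can hold each one. The Python sketch below is only an illustration. The sample strings, the script-to-code-page pairing, and the helper `can_round_trip` are assumptions made for this example and are not taken from the exporter.

```python
# Sketch only: table of sample texts and the Windows code page assumed for each.
SAMPLES = {
    "Hebrew": ("שלום", "cp1255"),
    "Arabic": ("سلام", "cp1256"),
    "Japanese": ("こんにちは", "cp932"),
    "Korean": ("안녕하세요", "cp949"),
    "Chinese": ("你好", "cp936"),
}


def can_round_trip(text: str, codepage: str) -> bool:
    """Return True if text survives encode/decode in codepage (hypothetical helper)."""
    try:
        return text.encode(codepage).decode(codepage) == text
    except UnicodeEncodeError:
        # The code page has no mapping for at least one character.
        return False


for script, (text, codepage) in SAMPLES.items():
    own = can_round_trip(text, codepage)
    western = can_round_trip(text, "cp1252")
    print(f"{script}: {codepage}={own}, cp1252={western}")
```

Each sample survives in its own code page and fails in the Western one, so an export that ignores the language-specific code page would break every script listed above in the same way.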
Created attachment 75362 [details] orig odt file
Created attachment 75363 [details] the file after exporting to word 6 format (same result for w95), 3.3-M15 (native)
Verified in v4.1.3, Mac OS: Created and saved a file as "Word 95 (doc)"; when the file is reopened, the characters are corrupt. Suggestion for a quick fix: remove the 95 format altogether, as the 97/2000/XP (doc) format works fine.
Raised priority to P2 and requested blocker status. If Writer can't save and open simple text files in all its stated formats, then I think it's a showstopper. Either remove what isn't supported well (Word 95), or fix it.
Reset assignee to the default "issues@openoffice.apache.org".
We have 4.1.4-dev packages available... Can we get confirmation that the bug still exists in HEAD of aoo-414?
Created attachment 86289 [details] File from AOO 4.1.4
Confirmed in 4.1.4.
Steps:
1. Opened the original ODT document.
2. Saved it as a Word 95 document.
3. Closed AOO.
4. Opened the newly created Word 95 document in AOO.
Result: it appeared to be the same as the original corrupted document.