Issue 125495 - Awkward Chinese (ZH-TW) numbering suffix when importing RTF document
Summary: Awkward Chinese (ZH-TW) numbering suffix when importing RTF document
Status: RESOLVED FIXED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: 4.1.0
Hardware: All All
Importance: P3 Normal
Target Milestone: 4.1.14
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-08-23 14:11 UTC by Mark Hung
Modified: 2023-01-08 11:39 UTC (History)
CC List: 6 users

See Also:
Issue Type: PATCH
Latest Confirmation in: 4.1.13
Developer Difficulty: ---


Attachments
Illustration for the issue (43.44 KB, image/png)
2014-08-23 14:11 UTC, Mark Hung
Patch to fix Chinese numbering suffix issue in RTF parser (1.18 KB, patch)
2014-08-23 14:18 UTC, Mark Hung
Sample test case. (55.88 KB, application/msword)
2014-08-23 14:18 UTC, Mark Hung
Screenshot for the effect of the patch in Linux. (50.00 KB, image/png)
2014-08-26 13:35 UTC, Mark Hung
Screenshot of AOO 4.2.0 on Windows (26.69 KB, image/png)
2023-01-05 21:45 UTC, Matthias Seidel
Screenshot of AOO 4.1.14-dev on Windows (25.81 KB, image/png)
2023-01-07 14:19 UTC, Matthias Seidel

Description Mark Hung 2014-08-23 14:11:00 UTC
Created attachment 83880 [details]
Illustration for the issue

Create different types of Traditional Chinese numbered lists in MSO 2010 (ZH-TW). When the file is opened with OpenOffice, the Chinese numbering suffixes are displayed as awkward, garbled characters.
Comment 1 Mark Hung 2014-08-23 14:18:13 UTC
Created attachment 83881 [details]
Patch to fix Chinese numbering suffix issue in RTF parser


Because code page options such as \ansicpg950 appear after the first opening brace '{', the first parser state is pushed onto the stack before the correct encoding has been set. When a later state is popped, the encoding is restored from that stale state and falls back to the default, even though \ansicpg950 has already been seen. As a consequence, multibyte string conversion for text tokens is wrong. The updated code overwrites the encoding of the state on top of the stack; I have verified that it works in my environment.

In theory, all multibyte character encodings (not only ZH-TW) are affected.
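
The mechanism described above can be modelled with a small toy parser. This is only a sketch under stated assumptions, not the svtools RTF parser: the names parse, rtf_escape, DEFAULT_ENCODING and the patched flag are hypothetical, latin-1 stands in for the default encoding, and the restore-from-top-of-stack behaviour on '}' is inferred from the description rather than taken from the patch.

```python
import re

# Assumed stand-in for the parser's default source encoding.
DEFAULT_ENCODING = "latin-1"

# Toy tokenizer: only \ansicpgN, \'hh hex escapes and group braces.
TOKEN = re.compile(rb"\\ansicpg(\d+) ?|\\'([0-9a-fA-F]{2})|([{}])")


def rtf_escape(text: str, encoding: str) -> bytes:
    """Encode text as RTF \\'hh escapes in the given code page."""
    return b"".join(b"\\'%02x" % byte for byte in text.encode(encoding))


def parse(data: bytes, patched: bool) -> str:
    """Decode the text of a toy RTF stream (hypothetical model)."""
    encoding = DEFAULT_ENCODING
    stack = []             # one saved encoding per open group
    pending = bytearray()  # raw bytes waiting to be decoded
    out = []

    def flush():
        # Decode buffered bytes with the encoding currently in effect.
        if pending:
            out.append(pending.decode(encoding, errors="replace"))
            pending.clear()

    for match in TOKEN.finditer(data):
        codepage, hex_byte, brace = match.groups()
        if hex_byte is not None:
            pending.append(int(hex_byte, 16))
            continue
        flush()  # decode before the parser state changes
        if brace == b"{":
            # The state is saved as it is when the group opens.
            stack.append(encoding)
        elif brace == b"}":
            if stack:
                stack.pop()
            if stack:
                # Assumed behaviour: restore from the new top of the stack.
                encoding = stack[-1]
        else:
            encoding = "cp" + codepage.decode("ascii")
            if patched and stack:
                # Idea of the fix: also overwrite the saved top state.
                stack[-1] = encoding
    flush()
    return "".join(out)


# Outer group sets the code page; the suffix follows a nested group.
sample = (b"{\\ansicpg950 {" + rtf_escape("一", "cp950") + b"}"
          + rtf_escape("、", "cp950") + b"}")
```

With this model, parse(sample, patched=False) decodes the suffix after the nested group with the default encoding and produces garbled characters, while parse(sample, patched=True) returns "一、".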
Comment 2 Mark Hung 2014-08-23 14:18:50 UTC
Created attachment 83882 [details]
Sample test case.
Comment 3 Steve Yin 2014-08-26 10:34:05 UTC
(In reply to Mark Hung from comment #2)
> Created attachment 83882 [details]
> Sample test case.

This patch does not work with the sample. Can you check it?
Comment 4 Mark Hung 2014-08-26 13:35:50 UTC
Created attachment 83889 [details]
Screenshot for the effect of the patch in Linux.

The screenshot was taken in my environment, an Ubuntu 12 VM.
Note that this screenshot also includes the fix for the Chinese numbering issue 125400, which is independent of this issue.

The output of uname -a is as follows:
Linux ubuntu 3.11.0-15-generic #25~precise1-Ubuntu SMP Thu Jan 30 17:39:31 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux.

I haven't set up a Windows environment yet, so I can't verify this patch on Windows.

Have you applied the patch successfully? I just found that I made the diff under the svtools module instead of main or the top directory.
Which environment are you testing in?
Comment 5 Steve Yin 2014-08-27 06:32:05 UTC
(In reply to Mark Hung from comment #4)

Confirmed. I was misled by the first screenshot. I tested the sample on Win 7 Pro 64bit and Mac OS 10.7.5.

The patch works for RTF/DOC import/export, but not for DOCX import.
Comment 6 Steve Yin 2014-08-27 09:59:27 UTC
(In reply to Steve Yin from comment #5)

For this issue only, the patch works. For the docx format, it should be fixed by i125400.
Comment 7 Oliver-Rainer Wittmann 2014-08-27 11:01:02 UTC
First, I think we can change the issue status to confirmed as we are already reviewing the applied patch.

Second, I will also have a look at the patch.

Third, some questions:
- Does the patch also fix the RTF export?
- Does the patch also fix the WW8 (*.doc) import/export as Steve mentioned?
Comment 8 Mark Hung 2014-08-27 14:50:26 UTC
(In reply to Oliver-Rainer Wittmann from comment #7)
> First, I think we can change the issue status to confirmed as we are already
> reviewing the applied patch.
> 
> Second, I will also have a look at the patch
> 
> Third, some questions:
> - Does the patch also fix the RTF export?
No. It is for import, not export.

> - Does the patch also fix the WW8 (*.doc) import/export as Steve mentioned?
No. It has nothing to do with WW8 as far as I know.
Comment 9 Mark Hung 2014-11-22 09:20:04 UTC
Hi Developers,

What's happening?
Is there anyone working on verifying or merging the patch?
Do I need to improve anything or provide any information in order to get it merged?
Comment 10 Steve Yin 2014-11-28 13:50:15 UTC
Hi Mark,

Any update for this issue?
Comment 11 Mark Hung 2014-11-28 14:13:03 UTC
Hi Steve,

No update. My patch uploaded on Aug. 23 is OK. Please merge the patch if the solution is acceptable for you.
Comment 12 Steve Yin 2014-11-28 15:34:59 UTC
(In reply to Mark Hung from comment #11)
> Hi Steve,
> 
> No update. My patch uploaded on Aug. 23 is OK. Please merge the patch if the
> solution is acceptable for you.

It's OK. The patch was delivered.
Comment 13 SVN Robot 2014-11-28 15:45:45 UTC
"steve_y" committed SVN revision 1642312 into trunk:
Issue 125495 - Awkward Chinese (ZH-TW) numbering suffix when importing RTF do...
Comment 14 Kay 2015-09-07 22:29:39 UTC
We could use some additional testing on Windows, I think.
Comment 15 Matthias Seidel 2023-01-05 21:23:09 UTC
This patch was committed to trunk and is therefore now in AOO42X.

But it was never committed to AOO41X (although the Target Milestone was 4.1.2).

Reopening and setting Target Milestone to 4.2.0 now.

We should test if this can be merged to AOO41X.
Comment 16 Matthias Seidel 2023-01-05 21:45:45 UTC
Created attachment 87161 [details]
Screenshot of AOO 4.2.0 on Windows

Looks Good To Me!
Comment 17 Matthias Seidel 2023-01-07 14:19:26 UTC
Created attachment 87162 [details]
Screenshot of AOO 4.1.14-dev on Windows
Comment 18 Matthias Seidel 2023-01-07 14:26:27 UTC
Applying this to AOO4114-dev does fix the numbering suffix, but not the numbering itself.

Looks like the patch for issue 125400 could also be needed.
Comment 19 Matthias Seidel 2023-01-08 11:39:47 UTC
Cherry-picked for AOO41X with:
https://github.com/apache/openoffice/commit/93ce685fe327053ff7f749df54963ad0c4a7c3d3