Issue 38719 - Want CVS-friendly file-format
Summary: Want CVS-friendly file-format
Status: UNCONFIRMED
Alias: None
Product: xml
Classification: Code
Component: external filters
Version: OOo 1.1.3
Hardware: All All
Importance: P3 Trivial with 2 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-09 09:22 UTC by toralf
Modified: 2013-02-07 22:33 UTC
CC List: 2 users

See Also:
Issue Type: FEATURE
Latest Confirmation in: ---
Developer Difficulty: ---


Description toralf 2004-12-09 09:22:30 UTC
I thought I might add an RFE related to a recent thread of mine on the mailing
lists:

In some cases I want to check OOo documents into a CVS archive. I know the
document format has built-in version handling, but sometimes I wish to integrate
revision control of OOo docs with other data. For instance, I may want to
retrieve matching versions of source code and the OOo docs that document it. Also,
I've grown so used to CVS that I'm pretty nervous about files that aren't in
the archive...

Now, OOo docs may indeed be checked into CVS, but you have to use the "binary"
format, so a full copy of the file will probably be appended on every
check-in, and you can forget about getting useful diff listings or using "Tags" etc.

In other words, a "more textual" format would be nice. There is "Flat XML", of
course, but unfortunately, using that doesn't make much of a difference. The
problem is that the XML code is written out as a single line.
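
To illustrate why the single-line encoding defeats line-based tools such as CVS diff, here is a minimal Python sketch. It is not part of OOo; the sample markup and the helper names `changed_lines` and `pretty` are made up for illustration. It lists the changed lines of a unified diff for two revisions, first stored as single-line XML and then re-indented with the standard `xml.dom.minidom` module.

```python
import difflib
from xml.dom import minidom

# Two made-up revisions of a tiny document, each stored on a single line.
OLD = "<doc><p>Hello</p><p>World</p><p>Bye</p></doc>"
NEW = "<doc><p>Hello</p><p>There</p><p>Bye</p></doc>"


def changed_lines(old_text, new_text):
    """Return the added/removed lines of a unified diff, without headers."""
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(), lineterm=""
        )
    )
    # The first two entries are the "---"/"+++" file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


def pretty(xml_text):
    """Re-indent XML so that each element starts on its own line."""
    return minidom.parseString(xml_text).toprettyxml(indent="  ")


if __name__ == "__main__":
    # Single-line form: the whole document is reported as removed and re-added.
    print("\n".join(changed_lines(OLD, NEW)))
    # Indented form: only the paragraph that actually changed is reported.
    print("\n".join(changed_lines(pretty(OLD), pretty(NEW))))
```

With the single-line form, any edit replaces the only line in the file, so the diff carries the full document twice. Once every element sits on its own line, the diff shrinks to the edited paragraph.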
Comment 1 jogi 2004-12-09 10:30:47 UTC
RFE
Comment 2 ace_dent 2008-05-16 02:12:40 UTC
OpenOffice.org Issue Tracker - Feedback Request.

The Issue you raised is currently 'Unconfirmed' pending review, but has not been
updated within the last 3 years. Please consider re-testing with one of the
latest versions of OOo, as the problem(s) may have already been addressed.
Either use the recent stable version: http://download.openoffice.org/index.html
or consider trying the new OOo 3 BETA (still in testing):
http://download.openoffice.org/3.0beta/
 
Please report back the outcome so this Issue may be Closed or Progressed as
necessary - otherwise it may be Resolved as Invalid in the future. You may also
wish to search for (and note) any duplicates of this Issue that may have
advanced further by checking the Issue Tracker:
http://www.openoffice.org/issues/query.cgi
 
Many thanks,
Andrew
 
Cleaning-up and Closing old Issues as part of:
~ The Grand Bug Squash, pre v3 ~
http://marketing.openoffice.org/3.0/announcementbeta.html
Comment 3 xquery 2009-03-19 13:51:55 UTC
As a workaround, I can suggest implementing an "as is" XSLT filter.

Both the import and the export filter just copy the source XML, so you can
easily save and open files "as is".

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xs"
      version="2.0">
    <!-- indent="yes" allows the processor to add line breaks and indentation -->
    <xsl:output indent="yes" method="xml"/>

    <!-- Identity template: copy every node and its attributes unchanged -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
Comment 4 alvarezp2000 2009-03-20 07:48:34 UTC
Though plain XML is worth trying, it doesn't really solve the problem. I'm 
currently using .fodt files and it kinda works, except that useless information 
ends up in the .fodt and defeats the purpose of the RCS.

I'll try giving some examples taken from a blank document like this: creating a 
new Writer document and saving it as a .fodt file, loading it again without 
"Load user settings" and "Load printer settings", resaving and filtering it 
through "tidy -utf8 -q -xml -i -w 0" for easier analysis.

Compatibility-related options are included by default when they should not be: 
"Options > Writer > Compatibility > Consider wrapping style when positioning 
Objects" is being included in the document as "ConsiderTextWrapOnObjPos". It is a 
compatibility-only option that should really not appear in the file unless it 
is explicitly set to true or false. Why? Because if the file is loaded in an 
old-enough version of OOo, the option will not be understood at all by the old 
Writer, and if it is loaded into a future version of OOo, that version should 
detect the OOo version from the Generator label and --only then-- set the option. 
This applies to other compatibility-related options as well.

CurrentDatabaseDataSource: it is a useless empty-string value set at creation 
time and included in the file.

AllowPrintJobCancel: I must confess I don't know for sure what this does, but 
it sounds like a setting that should be in the user settings on the PC instead of 
the document. The reason is that this is printer-dependent, not 
document-dependent.

initial-creator: isn't this some info that only certain users will be 
interested in, and besides, by default it should NOT include any personal 
information in the document, unless explicitly requested? Also, this breaks RCS 
in the sense that this information should be stored in the RCS.
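
As a stopgap, the settings named above could be stripped from the flat XML before check-in, for example from a filter script run by the revision control system. The following is a minimal Python sketch under stated assumptions: the function name `strip_volatile_items` and the `VOLATILE` set are hypothetical, only `config:config-item` elements are handled, and the namespace URI is the one OpenDocument uses for its settings section.

```python
import xml.etree.ElementTree as ET

# Namespace of the settings section in OpenDocument files.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"
# Keep the usual "config" prefix when the tree is serialized again.
ET.register_namespace("config", CONFIG_NS)

# Names taken from this comment; extend the set as needed (assumption).
VOLATILE = {"CurrentDatabaseDataSource", "AllowPrintJobCancel"}


def strip_volatile_items(xml_text, names=VOLATILE):
    """Return xml_text with the named config:config-item elements removed."""
    root = ET.fromstring(xml_text)
    item_tag = "{%s}config-item" % CONFIG_NS
    name_attr = "{%s}name" % CONFIG_NS
    # ElementTree has no parent pointers, so visit every element as a parent.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == item_tag and child.get(name_attr) in names:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree rewrites any namespace prefix that has not been registered (as `ns0`, `ns1`, and so on), so a real filter for complete .fodt files would have to register the other OpenDocument prefixes as well, or use a different XML library.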


Now, for the following format settings, there might be a good reason; I just 
fail to see how it is useful to include them before they are even used or applied.

style:font-face-style, maybe used by the subsequent default outline styles? Or 
is it the alias setting for fonts?

text:outline-level-style: why is it that all levels, though unused, are 
included by default in the absolutely empty file?


Now, most of the above I can just filter out automatically with some 
RCS (like Git). However, what really breaks RCS in general is the following:

I imported a Word document and saved it as .fodt, filtered it through tidy and 
saved the result as "version a". I loaded the .fodt and saved it again without 
any change, not even to the view or anything else, then filtered it through 
tidy and saved the result as "version b". A diff between both versions shows the 
following:

1. All xml:id values were rewritten.
2. For some reason, it included a soft-page-break before some paragraphs.
3. It rewrote some style:name values.
4. The new save included some PrinterSetup info that the first .fodt export 
didn't.

I repeated the .fodt load-and-save procedure. Still, I found some 
differences (besides the ViewArea/Visible settings):

1. style:paragraph-properties was changed in a paragraph from 
style:writing-mode="page" to style:writing-mode="lr-tb".
2. More style:style style:name="P8" renaming.

This last part, rather than describing an RFE, describes a bug that needs 
fixing.
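
Until the save output is stable, the rewritten xml:id values from the first list could at least be masked before two saves are compared. The following Python sketch shows the idea; the helper name `mask_xml_ids` and the placeholder value are made up, and a regular expression is only a rough stand-in for real XML processing. It hides the noise for diffing and does not fix the underlying bug.

```python
import re

# Matches xml:id="..." or xml:id='...' up to the matching closing quote.
XML_ID = re.compile(r"""\bxml:id=(["']).*?\1""")


def mask_xml_ids(text):
    """Replace every xml:id value with a fixed placeholder."""
    return XML_ID.sub('xml:id="MASKED"', text)


if __name__ == "__main__":
    # Made-up sample: the same paragraph as written by two consecutive saves.
    first = '<text:p xml:id="id123" text:style-name="P1">x</text:p>'
    second = '<text:p xml:id="id987" text:style-name="P1">x</text:p>'
    # Only the generated id differs, so the masked forms compare equal.
    print(mask_xml_ids(first) == mask_xml_ids(second))
```

Attributes that refer to such ids, and the renamed automatic styles such as "P8", would need similar treatment, which is one more reason the re-save instability is better fixed at the source.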

Thank you for your attention to this long comment.
Comment 5 bettina.haberer 2010-05-21 14:46:18 UTC
To make the issues easier to find via "requirements", I am reassigning the issues
currently owned by me to the owner "requirements".