Apache OpenOffice (AOO) Bugzilla – Issue 35255
Add Esperanto locale entry to i18npool/source/localedata/data/
Last modified: 2013-08-07 15:00:08 UTC
Hi Pavel, There appear to be changes needed to i18npool/source/localedata/data/ 1) adding an .xml file 2) updating the makefile 3) and possibly updating the .map's (although technically Esperanto has no country, it might be an interesting idea to add it to all of the maps). In any event, the last time I saw this format was in issues 35152 and 35153, in which you said it wouldn't be needed in 2.0. I can't build Esperanto with m55 (long story), so it's worth asking you how I generate the XML file to contribute the patch (and then test at the next dev build). I say this because they all look generated from some script I haven't yet located. I'm hoping this is another 'wontfix'. :-) Thanks again, Joey
locale -> Eike. k0fcc: the description of locale is available at http://l10n.openoffice.org/
It would be quite difficult to add locale data for Esperanto, as there is no country whatsoever associated with it. There's no cultural data, no currency, no calendar, nor any defined way in which numbers, dates and so on should be represented; please correct me if I'm wrong on this. So why should we add any data and what data should we use? Inventing locale data IMHO is not the way to go. @Pavel: it's quite senseless to assign such an issue to me and target it to OOo2.0 if there is no locale data attached to the issue. Retargeting now to "not determined", but given the concerns expressed above it might as well soon become a "wontfix". @k0fcc: whether you are able to build for Esperanto is not related to the presence of locale data. And no, the .xml files are not generated by some script (well, only in the beginning they were), they are all carefully hand crafted. Eike
Thanks Eike... Esperanto in fact does have some locale data. Dates for example: yyyy-mm-dd if it's all in figures, so today is 2004-10-11. If it's a single-digit month or day. If it's written out in full, then it's la 11-a de oktobro 2004 (with no capital letter for the month). I was planning on adding this to the calendar scheduling wizard for example.
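The two date forms described in the comment above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the month list is the ordinary Esperanto month names, and the zero-padding of the numeric form follows ISO 8601, which is an assumption because the comment does not spell it out. Nothing here reflects how OOo itself formats dates.

```python
from datetime import date

# Esperanto month names, written without a capital letter as noted above.
MONATOJ = [
    "januaro", "februaro", "marto", "aprilo", "majo", "junio",
    "julio", "aŭgusto", "septembro", "oktobro", "novembro", "decembro",
]


def numeric_date(d: date) -> str:
    """All-figures form, year-month-day (zero-padded per ISO 8601; assumed)."""
    return d.isoformat()


def long_date(d: date) -> str:
    """Written-out form, e.g. 'la 11-a de oktobro 2004'."""
    return f"la {d.day}-a de {MONATOJ[d.month - 1]} {d.year}"


if __name__ == "__main__":
    today = date(2004, 10, 11)
    print(numeric_date(today))
    print(long_date(today))
```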
Hi Joey, (I guess I'm right in assuming that here k0fcc is Joey Stanford by real name?) That YYYY-MM-DD is one example. What about all the other parts that make up locale data, like decimal and group separators, date and time separators, and so on, and all the other number formats related data? What about the currency, which one to use as a default? I'm reassigning this issue to you. If you have a locale data set ready of which you think that it serves the need of an Esperanto community, please attach it here and reassign this issue to me then. Please have the data comply with i18npool/source/localedata/data/locale.dtd (please read also the comments therein) and take i18npool/source/localedata/data/en_US.xml or similar as a sample template. Thanks Eike
Tim is working this.
Reassigning to Tim so he sees it in his issue queue.
Tim is actually trying to find someone who'll work on this! Naive question: aren't a lot of the things we're talking about (currency symbol, time format, etc.) defined at OS level? And if so, isn't there a way to adopt those settings on the fly, either at installation or even at program launch? In the meantime, I'll ask our dev list if someone's keen to do this. Tim
Tim, OOo currently doesn't inherit values from the OS other than the locale itself. The locale data is defined completely within OOo. Cross-platform differences are too big. But even if the settings were inherited from the OS: so far I've seen no OS offering settings for Esperanto, except Linux. In the meantime ICU has started to support an Esperanto locale, maybe we can borrow some settings from them, take a look at http://oss.software.ibm.com/cgi-bin/icu/lx/ Eike
Tim, FYI.. I've had to use eo_EO (i.e. eo for the ISO code and EO for the country code) as part of the lingu testing because the lingu subsystem needs to bind the spellchecker with a country code. Pavel has previously indicated that we use only eo (and in fact when Pavel made the changes for me he used eo only) in the main OpenOffice program. My only advice is to consider the use of eo_EO on the locale entry and we'll try to sort out the bugs during testing. Joey
Hi Tim, Joey, as Martin pointed out on the releases list we are coming closer to 2.0 Beta. http://www.openoffice.org/servlets/ReadMsg?list=releases&msgNo=8258 Is your work making progress and would be ready for 2.0 as the target milestone indicates? Thanks, Stefan
I suspect this isn't going to happen for 2.0. It's also not required for 2.0 therefore I've changed the target to later.
The locale files are named according to their language and the country respectively, e.g. en_GB.xml, sv_SE.xml. In the list of locale files there is only one that does not include the country name: eu.xml. I assume that for political reasons the Basques don't want to mention their country. The part of that file which concerns the country has empty elements. According to the documentation XLocaleData the country part is mandatory. IMHO, empty elements are a kludge and I expect that they may cause problems in use. Strangely the currency is defined (Peseta and Euro) so the country can be inferred to be Spain. A better example for our purposes is hu_HU.xml for Hungarian and Hungary. It contains the country name (in English!) and the currency Forint. It seems to me that for Esperanto we need a separate file for each country, because the country and currency data are mandatory. So for me, for example, a file eo_NZ.xml will need a section about my country (New Zealand) and a section about the currency of NZ. The currency can be redirected thus: <LC_CURRENCY ref="en_NZ"/> so that it will pick up the definition in the file en_NZ.xml. Is there a way of doing this automagically? If so we would need only one file. The file name referred to in the currency element must exist in the list of locale files and, I suppose, must exist on the user's computer. I have put together an initial version of eo_NZ.xml. Other Esperanto users can easily change the value of the currency ref from "en_NZ" to say "en_US" or "hu_HU" according to their own requirements. I have of course included data on number and date formats, the calendar, days, months etc. I used the Swedish file as the model for eo_NZ.xml because the date formats in it are close to those of the ISO.
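The ref redirection mentioned in the comment above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two XML strings are heavily trimmed stand-ins rather than real locale files, the function name is hypothetical, and the actual lookup inside OOo is done by its own code, not like this. It only shows the idea that a section carrying a ref attribute is taken from the named locale, and why that locale's file has to be present.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-ins for locale files (not the real data).
LOCALE_FILES = {
    "en_NZ": (
        "<Locale><LC_CURRENCY><Currency>"
        "<CurrencyID>NZD</CurrencyID>"
        "</Currency></LC_CURRENCY></Locale>"
    ),
    "eo_NZ": '<Locale><LC_CURRENCY ref="en_NZ"/></Locale>',
}


def resolve_section(locale, tag, seen=None):
    """Return the element for `tag`, following ref="..." to other locales."""
    if seen is None:
        seen = set()
    if locale in seen:
        raise ValueError(f"circular ref involving {locale}")
    seen.add(locale)
    if locale not in LOCALE_FILES:
        # The referenced file must exist, as the comment above points out.
        raise KeyError(f"no locale file for {locale}")
    section = ET.fromstring(LOCALE_FILES[locale]).find(tag)
    if section is None:
        raise KeyError(f"{locale} has no {tag} section")
    ref = section.get("ref")
    return resolve_section(ref, tag, seen) if ref else section


if __name__ == "__main__":
    currency = resolve_section("eo_NZ", "LC_CURRENCY")
    print(currency.find("Currency/CurrencyID").text)
```

Changing the ref value in the eo_NZ entry to another key of the dictionary is the equivalent of the per-user edit suggested in the comment.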
Created attachment 22493 [details] locale file for esperanto in NZ
Ok Don, you sold me. It's massively cumbersome to create all of those XML files though. If you want to take a crack at this I'll reassign this task to you. ...Joey
Viranaso,
> According to the documentation XLocaleData the country part is
> mandatory.
What exactly are you referring to? According to the locale.dtd the element is mandatory, but nevertheless it can be empty. The XLocaleData documentation doesn't say anything about this.
> IMHO, empty elements are a kludge and I expect that they may cause
> problems in use.
Of course they are a kludge, after all it is named locale data, and not language data. But the only problem so far that I'm aware of, mentioned above, where an absent country code caused problems with the spellchecker, was fixed a long time ago (see issue 6133 about Latin) by introducing the word ANY as a special country code in the dictionary.lst file. The corresponding entry for Esperanto in dictionary.lst would be
DICT eo ANY eo
> It seems to me that for Esperanto we need a separate file for each
> country, because the country and currency data are mandatory.
No, please don't create locale data for different countries, I would not include them. Please bear in mind that each locale needs a MS-LangID assigned, and also must have a string resource entry in langtab.src to be selectable from the language listbox. We will not bloat that with dozens of Esperanto entries. Create one eo.xml, and for the currency use, like in ia.xml (Interlingua), Euro, for example, as that probably is most widely used with Esperanto, I guess. A default currency other than Euro that overrides the definition may always be selected by the user under menu Tools.Options.LanguageSettings.Languages.
Eike
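As a rough illustration of how the ANY country code in the dictionary.lst entry above could behave, here is a minimal Python sketch. The field layout (DICT, language, country, base name) is read off the single entry quoted in the comment, and the parsing and matching functions are hypothetical; the real lookup in the lingu subsystem may differ.

```python
def parse_dictionary_lst(text):
    """Collect (language, country, basename) tuples from DICT lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Anything that is not a four-field DICT line is skipped.
        if len(fields) == 4 and fields[0] == "DICT":
            _, lang, country, basename = fields
            entries.append((lang, country, basename))
    return entries


def matches(entry, lang, country):
    """An entry applies if the language matches and its country is ANY or equal."""
    entry_lang, entry_country, _ = entry
    return entry_lang == lang and entry_country in ("ANY", country)


if __name__ == "__main__":
    entries = parse_dictionary_lst("DICT eo ANY eo\n")
    # The Esperanto entry applies whatever country the document locale carries.
    print(matches(entries[0], "eo", "NZ"), matches(entries[0], "eo", ""))
```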
Thanks Eike for your clarification. This puts us back to my original idea of just creating one XML file with GMT, Euro, etc. as the standards for now. I'm reassigning to Don since he's been so kind as to accept this task.
Don is graciously trying to figure out the best course of action for us.
Created attachment 22802 [details] Generic locale file for Esperanto.
I have created a generic locale file for Esperanto. Please note the following features:
1. It has no country associated with it, therefore the name of the file is eo.xml and the data for country is blank: <Country> <CountryID></CountryID> <DefaultName></DefaultName> </Country>
2. The currency is set to Euro since that is the currency used by the Central Office of the world Esperanto organisation "Universala Esperanto-Asocio".
3. Date formats are primarily in ISO format YYYY-MM-DD. Other formats are modelled on those in sv_SE. The formatting of decimal numbers is that traditionally used in Esperanto writing. The days of the week, names of months, eras etc. are in Esperanto.
4. The collation and search properties I have set to ignore case.
5. The numbering levels and outline numbering levels I have linked to those in en_US since most of the locales do that. I don't know what the consequences of that are, but I guess that it will do for a start.
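A quick way to see that the blank country block in point 1 is still well-formed XML is to parse it. The sketch below uses Python's standard xml.etree.ElementTree on the snippet exactly as quoted; the helper name is hypothetical. Note that this checks well-formedness only, it does not validate against locale.dtd.

```python
import xml.etree.ElementTree as ET

# The country block exactly as quoted in point 1 above.
COUNTRY_SNIPPET = (
    "<Country> <CountryID></CountryID> <DefaultName></DefaultName> </Country>"
)


def is_blank(elem):
    """True if the element has no text content (ElementTree gives None here)."""
    return (elem.text or "").strip() == ""


country = ET.fromstring(COUNTRY_SNIPPET)
# Both children are present, so the structure is kept, but they carry no data.
blank_children = [child.tag for child in country if is_blank(child)]

if __name__ == "__main__":
    print(blank_children)
```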
Eike, Sorry about this, I forgot to reassign this to you to look at. Donald was able to craft an eo.xml file for locale data back in February. Please review and advise whether this will work. It does NOT need to be in the first release of 2.0.
Hi Joey, Thank you for contributing. But before we can proceed we need to check whether the creator of the file signed the JCA and is listed at http://www.openoffice.org/copyright/copyrightapproved.html, so what is the full name of Donald "viranaso"? As for the data: you should make sure that it at least validates against the current locale.dtd. Other things that jumped to my eye: The LC_INDEX section lists only A-Z, aren't there other letters in Esperanto that are not to be simply appended to an index list by their Unicode order? The FollowPageWord elements aren't translated, p. and pp. are the English abbreviations. And for curiosity: did someone perform the non-product tests mentioned in locale.dtd? Thanks Eike
"viranaso" is Donald Rogers, who has submitted a JCA and has had write access to our CVS tree for a few months now, with which he's been doing a fine job of improving the website of our subdomain eo.ooo.org. I'm assuming he's the "Donald Evan Rogers" listed on http://www.openoffice.org/issues/show_bug.cgi?id=35255 but I'll ask him by copy of this message to confirm this. Donald: would you also be able to look into Eike's other queries too please? Thanks in advance. Tim
Sorry, pasted wrong link into previous message. Obviously meant http://www.openoffice.org/copyright/copyrightapproved.html Tim
Ah, ok. I'm scheduling this for OOo2.0.1
I feel at a disadvantage here since I am new to locale programming and I have not been involved in OOo programming. I was just trying to help to get things going by putting together most of the Esperanto bits needed.
Eike> As for the data: you should make sure that it at least validates against the
> current locale.dtd.
The only copy of locale.dtd that I could find does not contain the element LC_INDEX. After removal of the LC_INDEX section, eo.xml is valid.
> Other things that jumped to my eye: The LC_INDEX section
> lists only A-Z, aren't there other letters in Esperanto that are not to be
> simply appended to an index list by their Unicode order?
None of the other locales use this element. Should we leave it out? The documentation at translate.sourceforge.net/doc/locales-openoffice.html does not show how to set this up. The Esperanto alphabetical order is abcĉdefgĝhĥijĵklmnoprsŝtuŭvz plus upper case.
> The FollowPageWord
> elements aren't translated, p. and pp. are the English abbreviations.
I don't know of any standard abbreviation for "paĝoj", so I left it as "pp."
> And for
> curiosity: did someone perform the non-product tests mentioned in locale.dtd?
No. I am not familiar with these tests. Presumably one has to get a spreadsheet program to read the eo.xml file?
Donald (Evan) Rogers
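The difference between Unicode order and the Esperanto alphabetical order given above can be shown with a short Python sketch. The alphabet string is taken from the comment; the function names, the example words, and the choice to sort foreign letters (q, w, x, y) after z are assumptions for illustration and say nothing about how OOo's collator or the LC_INDEX data actually work.

```python
# Alphabet in the order quoted in the comment above.
ALFABETO = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANGO = {letter: position for position, letter in enumerate(ALFABETO)}


def sort_key(word):
    """Map each letter to its alphabet position; unknown characters go last."""
    return [RANGO.get(ch, len(ALFABETO) + ord(ch)) for ch in word.lower()]


def index_keys():
    """Upper-case index letters, one per letter of the alphabet."""
    return [letter.upper() for letter in ALFABETO]


if __name__ == "__main__":
    vortoj = ["ĉevalo", "dento", "cent"]
    print(sorted(vortoj))                # Unicode order puts ĉ after d
    print(sorted(vortoj, key=sort_key))  # alphabet order puts ĉ right after c
    print("".join(index_keys()))
```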
Hi Donald,
> The only copy of locale.dtd that I could find does not contain the element
> LC_INDEX. After removal of the LC_INDEX section, eo.xml is valid.
Then you're working on the wrong (an older, presumably OOo1.x) branch of the source tree. Please use a recent source tree, which is either HEAD for current, or, probably preferable, a dedicated milestone such as SRC680_m97.
> None of the other locales use this element. Should we leave it out?
Same as above, old branch. Please add the data.
> The documentation at
> translate.sourceforge.net/doc/locales-openoffice.html does not show
> how to set this up.
Yes, it does. It should get updated with a newer version from http://www.khmeros.info/tools/openoffice_locale_.htm however.
> The Esperanto alphabetical order is ... plus upper case.
Please use that order then. It isn't necessary to list lower and upper case separately unless they are treated differently, just use the upper case letters.
>> The FollowPageWord elements aren't translated, p. and pp. are the
>> English abbreviations.
>
> I don't know of any standard abbreviation for "paĝoj", so I left it
> as "pp."
Ok, I just wanted to mention it.
>> And for
>> curiosity: did someone perform the non-product tests mentioned in locale.dtd?
>
> No. I am not familiar with these tests. Presumably one has to get a spreadsheet
> program to read the eo.xml file?
You'd need a non-product build of the OOo application to do these tests, which you probably don't have available. A newer milestone, once CWS localedata4 is integrated, will also allow running the tests in a product build. See also http://l10n.openoffice.org/servlets/ReadMsg?list=dev&msgNo=5252
Thanks Eike
Created attachment 25886 [details] Amended Locale element; LC_INDEX, IndexKey to include accented letters
Looks better now. Detailed investigation and running checks will tell ;-) Thanks Eike
Retargeting to OOo2.0, will add this to CWS localedata5.
On branch cws_src680_localedata5:
i18npool/source/localedata/localedata.cxx 1.30.8.1
i18npool/source/localedata/data/eo.xml 1.1.2.1
i18npool/source/localedata/data/localedata_others.map 1.6.8.1
i18npool/source/localedata/data/makefile.mk 1.26.8.1
Note that I changed the following, see revision checked in:
- In LC_FORMAT added replaceFrom="[CURRENCY]" replaceTo="[$...]" attributes to actually display the currency symbol in the currency number format codes.
- Replaced the plain ordinary space group (AKA thousand) separator with a non-breaking space (U+00A0) in definition and number formats, which prevents line breaks in numbers.
- In time format codes replaced ATM/PTM (being the abbreviation in Esperanto) with the keyword AM/PM, which gets replaced with the definition "atm" respectively "ptm" upon display. Otherwise a literal ATx/PTx is displayed, with x being some month number.
- Removed empty <DefaultName/> elements from <FormatElement> just to smoothen the file.
- Beautified with indentation (2 spaces ;-)
Btw, I found it a bit confusing that the TimeSeparator is a '.' dot, the only locale not having ':' as the TimeSeparator was Italian it_IT so far.. are you sure about that? Also the ICU lists ':' as the separator, see http://www.ibm.com/software/globalization/icu/demo/locales/?d_=en&_=eo If this should be changed to a ':' please mention in this issue ASAP, so I can correct it in the same childworkspace. If '.' is correct then please file a bug against the ICU and give some reference, online or books, to strengthen your claim. See http://www.jtcsv.com/cgibin/locale-bugs
Thanks Eike
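The effect of the non-breaking space group separator mentioned in the list above can be sketched in a few lines of Python. The function name is hypothetical and the sketch covers whole numbers only; it does not touch the decimal separator or reproduce OOo's number formatter.

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0


def group_thousands(n: int, sep: str = NBSP) -> str:
    """Group digits in threes, using `sep` instead of a plain space."""
    # Python's ',' format spec inserts commas; swap them for the separator.
    return f"{n:,}".replace(",", sep)


if __name__ == "__main__":
    text = group_thousands(1234567)
    # Looks like "1 234 567", but a layout engine will not break the line
    # inside the number because the gaps are U+00A0, not U+0020.
    print(text, [hex(ord(ch)) for ch in text if not ch.isdigit()])
```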
I'm certainly not aware of any great attachment to `.' instead of `:' as a time separator in the Esperanto world. If everyone else uses a colon, that sounds fine to me. No strong preference either way. All your other changes sound eminently sensible too. Thanks for your help with this Eike. Tim
i18npool/source/localedata/data/eo.xml 1.1.2.2 Changed TimeSeparator to ':', adapted number format codes. This also results in potentially less confusion during input about what is recognized as time, as people are used to the ':' separator.
Ready for QA. re-open issue and reassign to oc@openoffice.org
reassign to oc@openoffice.org
reset resolution to FIXED
verified in internal build cws_localedata5
closed because fix available in OOo1.9m121