Apache OpenOffice (AOO) Bugzilla – Issue 23541
Index from concordance file to ignore optional hyphens in doc
Last modified: 2021-11-01 21:16:54 UTC
Action: generating alphabetical indexes from a concordance file. Enhancement: it would be nice if optional hyphens in words were ignored when matching words against the concordance file.
Reassigned to BH
If one uses optional hyphens [-] in index entries, each variant is also treated as a different entry. For example, the following would generate four entries in the index:
concordance
con[-]cordance
concor[-]dance
con[-]cor[-]dance
It would be better if all optional hyphens [-] could be ignored when comparing index entries. Perhaps ordinary hyphens should be ignored too, though this is debatable. Perhaps the following should be treated as a single entry (anti-semitism)?
anti-semitism
antisemitism
anti[-]semitism
antisemit[-]ism
anti-semit[-]ism
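The comparison rule requested here can be sketched in a few lines of Python. This is only an illustration of the idea, not how Writer builds its index: the helper name index_key and its ignore_plain_hyphens flag are hypothetical, and the sketch assumes an optional hyphen is either written with the [-] notation used in this report or stored as the Unicode soft hyphen U+00AD.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode soft (optional) hyphen; assumed storage form
MARKER = "[-]"          # notation this report uses for an optional hyphen


def index_key(term, ignore_plain_hyphens=False):
    """Return the key used to compare index entries (hypothetical helper)."""
    # Optional hyphens never take part in the comparison.
    key = term.replace(MARKER, "").replace(SOFT_HYPHEN, "")
    # Ignoring ordinary hyphens is the debatable part, so it is opt-in.
    if ignore_plain_hyphens:
        key = key.replace("-", "")
    return key


entries = ["concordance", "con[-]cordance", "concor[-]dance", "con[-]cor[-]dance"]
unique_keys = {index_key(e) for e in entries}
```

With this rule the four spellings of "concordance" collapse to a single key, and the anti-semitism variants collapse to one key only when ignore_plain_hyphens is set.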
Created attachment 53695: Test Case with Generated Index
This issue still affects release 2.4. Optional hyphens should be ignored for indexing purposes, though it would be useful to keep them in the generated index so that long words still break at the desired place there. Non-breaking hyphens and regular hyphens should be treated as the same character. Optionally, hyphenated and unhyphenated terms that are otherwise identical could be combined under a single index entry, i.e. for indexing purposes anti-semitism = antisemitism = Anti-semitism. Whichever spelling was used first would take precedence as the index entry.
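A rough Python sketch of the behaviour proposed in this comment follows. It is an assumption-laden illustration rather than Writer's implementation: match_key, build_index and the merge_variants flag are hypothetical names, optional hyphens are assumed to be U+00AD and non-breaking hyphens U+2011, and folding case under the optional merge is inferred from the "Anti-semitism" example.

```python
SOFT_HYPHEN = "\u00ad"          # optional hyphen (assumed to be U+00AD)
NON_BREAKING_HYPHEN = "\u2011"  # non-breaking hyphen (assumed to be U+2011)


def match_key(term, merge_variants=False):
    """Key used only for matching; the displayed spelling is left untouched."""
    key = term.replace(SOFT_HYPHEN, "")          # optional hyphens are ignored
    key = key.replace(NON_BREAKING_HYPHEN, "-")  # same as a regular hyphen
    if merge_variants:
        # Optional rule: hyphenated, unhyphenated and differently cased
        # spellings share one entry.
        key = key.replace("-", "").casefold()
    return key


def build_index(occurrences, merge_variants=False):
    """occurrences: (term, page) pairs in document order."""
    index = {}
    for term, page in occurrences:
        key = match_key(term, merge_variants)
        if key not in index:
            # The first spelling seen becomes the entry text, including any
            # optional hyphens, so long words can still break in the index.
            index[key] = (term, [])
        pages = index[key][1]
        if page not in pages:
            pages.append(page)
    return list(index.values())


occurrences = [
    ("anti-semitism", 3),
    ("antisemitism", 7),
    ("Anti\u2011semitism", 9),
    ("anti\u00adsemitism", 12),
]
merged = build_index(occurrences, merge_variants=True)
separate = build_index(occurrences)
```

Here merged holds one entry under the first spelling, "anti-semitism", with all four pages, while separate keeps three entries because only the optional hyphen is ignored by default.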
To make these issues easier to find by searching for "requirements", I have reassigned the issues currently owned by me to the owner "requirements".