Bug ID | 917030 |
---|---|
Summary | idzebra-2.0: Issue with ICU token processing |
Classification | openSUSE |
Product | openSUSE Distribution |
Version | 13.2 |
Hardware | Other |
URL | http://git.indexdata.com/?p=idzebra.git;a=commit;h=704fd190292cb771df94553b0ed6f9f4b71660a6 |
OS | Other |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Maintenance |
Assignee | ke@suse.com |
Reporter | dcook@prosentient.com.au |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |

Dear Karl,

Bjørn Lie mentioned that you were the maintainer for idzebra-2.0 and that I should assign this bug to you. He also mentioned that he's already prepared an update for the devel repo at https://build.opensuse.org/request/show/284830.

On to the actual bug report:

Zebra 2.0.59 has an issue where search queries involving hyphens are tokenised, but only the first token is used for searching. So a search for "Mont-Royal" will actually just be a search for "Mont". Or a search for "up-to-date" will just be a search for "up". This is the case even when trying to use ICU transformation/transliteration rules to remove the hyphen before tokenising.

I reported the bug to Indexdata on February 4th and they fixed it and released version 2.0.60 on February 7th with this fixed.

Here is the link to the relevant git commit: http://git.indexdata.com/?p=idzebra.git;a=commit;h=704fd190292cb771df94553b0ed6f9f4b71660a6

and the relevant news patch: http://git.indexdata.com/?p=idzebra.git;a=commitdiff;h=b51184e7cf9eabd2c609f50f721d6568351fbc33

I've already tested the fix on a Debian system using the Debian packages that Indexdata provides, and it works great.

Please let me know if you need any information on reproducing the bug or whatever else.

Thanks.