LetoDMS Community Forum
Indexer iconv problem string conversion



Indexer iconv problem string conversion - AlSchedl - 07-27-2012

Hello out there,

The Lucene indexer (for full-text search) gives me the following notice on several uploaded documents:

PHP Notice: iconv(): Detected an illegal character in input string in ..../letodms/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php on line 58

Consistent with that, the word list (viewed via the full-text index info from the Admin menu) shows that the transliteration (iconv ASCII//TRANSLIT) of German umlauts, e.g.

ä -> ae, ö -> oe, ü -> ue, ß -> ss

sometimes works "as designed" but sometimes fails, truncating words in the index word list, e.g. "bersetzen" instead of "uebersetzen", which is what would be expected if the German "ü" were correctly transliterated to "ue".
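
To illustrate the symptom, here is a minimal Python sketch (not the Zend code itself; the table and both function names are made up for this example). It contrasts the expected German transliteration with what happens if the umlaut is simply dropped, which would produce exactly the truncated form seen in the word list.

```python
# Hypothetical illustration only: this is NOT how Zend_Search_Lucene or iconv
# is implemented. It just reproduces the two outcomes described above.

# The four mappings listed in this post.
GERMAN_TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def transliterate(word: str) -> str:
    """Replace each mapped character, leave everything else unchanged."""
    return "".join(GERMAN_TRANSLIT.get(ch, ch) for ch in word)


def drop_non_ascii(word: str) -> str:
    """Simulate a failed conversion where non-ASCII characters are lost."""
    return word.encode("ascii", errors="ignore").decode("ascii")


if __name__ == "__main__":
    print(transliterate("übersetzen"))   # expected index term
    print(drop_non_ascii("übersetzen"))  # truncated term as seen in the word list
```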

How can I fix this behaviour?

Best regards,
Alex


RE: Indexer iconv problem string conversion - steinm - 07-28-2012

(07-27-2012, 01:29 PM)AlSchedl Wrote: How can I fix this behaviour?

Can you provide an example document to reproduce the error?
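
One possible cause, which is only an assumption until there is a sample document, is that the extracted text contains bytes that are not valid in the encoding the indexer assumes, for example ISO-8859-1 bytes in text treated as UTF-8. A minimal Python sketch for checking the bytes of a document's extracted text (the helper name is hypothetical, and UTF-8 as the assumed encoding is a guess):

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    Assumption: the text is supposed to be UTF-8. If the indexer is
    configured for another encoding, check against that one instead.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


if __name__ == "__main__":
    good = "übersetzen".encode("utf-8")      # ü stored as the two bytes C3 BC
    bad = "übersetzen".encode("iso-8859-1")  # ü stored as the single byte FC

    print(first_invalid_utf8_offset(good))
    print(first_invalid_utf8_offset(bad))
```

If the check reports an offset for one of the affected documents, the bytes around that position would be a good thing to post along with the file.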

Uwe