@NamnaR

How do I fix the errors in wordforms.txt?

Good day!
I'm seeing some very strange warnings in searchd.log:
1. duplicate wordform found - overridden
2. all source tokens are stopwords
3. no destination token found

Examples of these warnings:
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': no destination token found (wordform='gee > GE', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': all source tokens are stopwords (wordform='je > GE', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': all source tokens are stopwords (wordform='же > GE', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': all source tokens are stopwords (wordform='жи > GE', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': all source tokens are stopwords (wordform='жэ > GE', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.

[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': no destination token found (wordform='джепи > GP', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': no destination token found (wordform='джипи > GP', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.
[Sat Oct 27 23:22:12.418 2018] [27077] WARNING: index 'bitrix': no destination token found (wordform='джыпи > GP', file='/etc/sphinxsearch/wordforms/wordforms.txt'). IGNORED.


[Sat Oct 27 23:24:34.833 2018] [27077] WARNING: index 'bitrix': duplicate wordform found - overridden ( current='defendere > Defender Pilot', old='defender pilot > defender pilot pilot' ). Fix your wordforms file '/etc/sphinxsearch/wordforms/wordforms.txt'.
[Sat Oct 27 23:24:34.833 2018] [27077] WARNING: index 'bitrix': duplicate wordform found - overridden ( current='defendr > Defender Pilot', old='defender pilot > defender pilot pilot' ). Fix your wordforms file '/etc/sphinxsearch/wordforms/wordforms.txt'.

The index section of sphinx.conf:
index bitrix
{
    #main settings
        source= bitrix
        type = rt
        path = /var/lib/sphinxsearch/data/bitrix
        #docinfo = inline 
	wordforms = /etc/sphinxsearch/wordforms/wordforms.txt
        #exceptions = /etc/sphinxsearch/exceptions/exceptions.txt
    #choose appropriate type of morphology to use
        #morphology = lemmatize_ru_all, lemmatize_en_all, lemmatize_de_all, stem_enru
        #morphology = lemmatize_ru_all, lemmatize_en_all
        morphology = stem_enru, soundex
    #these settings are used by bitrix:search.title component
        prefix_fields = title
        infix_fields=
        #min_prefix_len = 2

        rt_mem_limit = 512M
        ondisk_attrs = 1
       
        #min_prefix_len = 3
        min_word_len = 3
        #min_infix_len = 1
        min_stemming_len =3 

        expand_keywords = 1
        index_exact_words = 1
      
         
        #enable_star = 1
    #all fields must be defined exactly as followed
        rt_field = title
        rt_field = body
        rt_attr_uint = module_id
        rt_attr_string = module
        rt_attr_uint = item_id
        rt_attr_string = item
        rt_attr_uint = param1_id
        rt_attr_string = param1
        rt_attr_uint = param2_id
        rt_attr_string = param2
        rt_attr_timestamp = date_change
        rt_attr_timestamp = date_to
        rt_attr_timestamp = date_from
        rt_attr_uint = custom_rank
        rt_attr_multi = tags
        rt_attr_multi = right
        rt_attr_multi = site
        rt_attr_multi = param
    #depends on settings of your site
        # uncomment for single byte character set
       #charset_type = sbcs
       # uncomment for UTF character set
       # charset_type = utf-8
       charset_table = 0..9, A..Z->a..z, x->U+0445, c->U+0441, _, a..z, \
    U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+0435, U+451->U+0435
	blend_chars = U+002C, U+2010, U+2012, U+2013, U+2014, U+2044, U+002F, U+002D, U+2d, /
}


Could you please tell me how to get rid of these warnings for wordforms.txt?
Thanks in advance!
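
One pattern stands out in the log excerpts above: every rejected wordform has a two-character token on the side being complained about (`je`, `же`, `GE`, `GP`), while the config sets `min_word_len = 3`. A plausible reading, not verified against the Sphinx source, is that such tokens are dropped the same way stopwords are, leaving the wordform with no usable source or destination. The sketch below is a hypothetical helper (the name `lint_wordforms` and its return shape are illustrative, not part of Sphinx) that lists lines in a wordforms file whose source or destination tokens are all shorter than a given length, plus sources that literally occur more than once. It splits on whitespace only, so it ignores `charset_table`, `blend_chars`, and morphology, and it may not catch duplicates like the `defendere` / `defender pilot` pair in the log.

```python
from collections import defaultdict


def lint_wordforms(lines, min_word_len=3):
    """Return (short_source, short_dest, duplicates) for wordforms lines.

    Hypothetical helper, not part of Sphinx. It only approximates what
    searchd does: tokens are split on whitespace and compared by length.
    """
    short_source, short_dest = [], []
    seen = defaultdict(list)  # lowercased source -> line numbers
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Accept both "src > dst" and "src => dst" separators.
        sep = "=>" if "=>" in line else ">"
        if sep not in line:
            continue  # not a mapping line
        src, dst = (part.strip() for part in line.split(sep, 1))
        src_tokens, dst_tokens = src.split(), dst.split()
        if all(len(t) < min_word_len for t in src_tokens):
            short_source.append((lineno, line))
        elif all(len(t) < min_word_len for t in dst_tokens):
            short_dest.append((lineno, line))
        seen[src.lower()].append(lineno)
    duplicates = {s: nums for s, nums in seen.items() if len(nums) > 1}
    return short_source, short_dest, duplicates


if __name__ == "__main__":
    # Path taken from the config above; adjust if the file lives elsewhere.
    path = "/etc/sphinxsearch/wordforms/wordforms.txt"
    with open(path, encoding="utf-8") as fh:
        short_source, short_dest, duplicates = lint_wordforms(fh)
    for lineno, line in short_source:
        print(f"{lineno}: source too short: {line}")
    for lineno, line in short_dest:
        print(f"{lineno}: destination too short: {line}")
    for source, linenos in duplicates.items():
        print(f"duplicate source {source!r} on lines {linenos}")
```

If the flagged lines match the ones in searchd.log, the length hypothesis is probably right, and the options would be to lower `min_word_len` or to map to destinations of at least three characters.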