Author: CMScom



Text normalization uses Python's ``unicodedata`` module.
Full-width digits and Latin letters are converted to their half-width equivalents.
Half-width Katakana is converted to its full-width equivalent.
As a result, all of these character variants are recognized as the same characters.
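
The conversions described above match what Unicode NFKC normalization produces, so the behavior can be sketched with ``unicodedata.normalize``. This is only a minimal illustration and not the product's actual code: the choice of the NFKC form and the helper name ``normalize_text`` are assumptions.

```python
import unicodedata


def normalize_text(text):
    """Fold width variants together (hypothetical helper, assumes NFKC)."""
    # NFKC maps full-width digits and Latin letters to half-width ASCII,
    # and half-width Katakana to full-width Katakana.
    return unicodedata.normalize("NFKC", text)


print(normalize_text("Ｐｌｏｎｅ３"))  # full-width letters and a digit
print(normalize_text("ｶﾀｶﾅ"))  # half-width Katakana
```

With this kind of folding applied both at indexing time and at query time, a search typed in one width variant can match text stored in another.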

Language Specifications:

- Chinese

 - No spaces between words
 - Uses only Kanji (Chinese) characters
 - Processed with a bigram (2-gram) model

- Japanese

 - No space between words
 - Combination of Kanji (Chinese), Katakana, and Hiragana characters

- Korean

 - Words are separated by spaces, but a word may have a particle attached
 - Combination of Korean alphabet and Kanji (Chinese) characters
 - Korean alphabet and Kanji (Chinese) characters are distinguished from each other and processed with a bigram (2-gram) model

- Thai

 - No space between words
 - This language is very difficult to handle computationally
 - Vowels and consonants are encoded as separate Unicode characters, which makes it difficult to recognize them as a single word
 - However, it may be possible to handle Thai text with a bigram (2-gram) model

- Other languages (Including English)

 - Words are separated by spaces

 - Each word is indexed as a whole
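
The splitting rules listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the splitter shipped with this product: the character ranges in ``CJK`` and the function names ``bigrams`` and ``split_text`` are hypothetical, and Thai is left out.

```python
import re

# Assumption: a deliberately narrow set of ranges covering Hiragana,
# Katakana, CJK ideographs, and Hangul syllables.
CJK = "\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7a3"
CJK_RUN = re.compile("[%s]+" % CJK)
# A token is either a run of CJK characters or a run of other non-space text.
TOKEN = re.compile("[%s]+|[^\\s%s]+" % (CJK, CJK))


def bigrams(run):
    """Return the overlapping 2-character slices of a run."""
    if len(run) < 2:
        return [run]
    return [run[i:i + 2] for i in range(len(run) - 1)]


def split_text(text):
    """Bigram CJK runs and keep space-separated words whole."""
    terms = []
    for token in TOKEN.findall(text):
        if CJK_RUN.fullmatch(token):
            terms.extend(bigrams(token))
        else:
            # Punctuation is not stripped in this sketch.
            terms.append(token)
    return terms


print(split_text("Plone search"))
print(split_text("日本語の検索"))
```

Because the bigrams overlap, any query of two or more characters that occurs in the original text also produces bigrams that occur in the index, which is what makes the approach workable for languages written without spaces.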


- Source Code

  Since no documentation is available on how to develop a word splitter, we referred to the source code of other splitters. We still have a number of open questions. If you have any further information, please feel free to let us know.

- Hotfix to Plone 3.0 source code

  Because the Plone 3.x catalog configuration file, catalog.xml, has no mechanism for overwriting an existing index, we developed a hotfix that adds an XML attribute. We took this approach because we believe the Plone 3 XML definition mechanism is simple and clear. We appreciate any comments.

Use zc.buildout

- Add ``Products.BigramSplitter`` to the list of eggs to install, e.g.::

    eggs =
        Products.BigramSplitter

- Tell the ``plone.recipe.zope2instance`` recipe to install a ZCML slug::

    recipe = plone.recipe.zope2instance
    zcml =
        Products.BigramSplitter

- Re-run buildout, e.g. with::

    $ ./bin/buildout

- Restart Zope.
- In the Plone site setup, go to Add-on Products and install the product with Quick install.

Old style

- Untar the downloaded file, then copy it to the 'Products' directory of your Plone instance.
- Restart Zope.
- In the Plone site setup, go to Add-on Products and install the product with Quick install.

- Requires Plone 3.0.x or higher

- License: see docs/LICENSE.txt

- CMScom

 - Manabu Terada  e-mail :
 - Mikio Hokari
 - Naoki Nakanishi
 - Naotaka Hotta
 - Takashi Nagai


1.0 (2010-12-06)

* Added uninstall script

1.0b4 (2010-06-07)

* Fixed missing skin folder name

1.0b3 (2010-03-20)

* Added keyword highlighting (JavaScript)

1.0a2 (2010-01-29)

* Fixed handling of full-width spaces in AND searches

1.0a1 (2009-12-05)

* Initial release