Author: Dieter Maurer



``dm.incrementalsearch`` is an efficient low level search execution engine.
Its primary purpose is to incrementally perform searches involving
``and``, ``or`` and ``not`` queries over ``ZODB.BTrees``.

Incrementally means here
that hits are determined one at a time: the first hit, then the second hit,
then the third hit, etc. Therefore, the first few hits can be determined
extremely fast. But even if all hits need to be determined, the
incremental execution of subqueries can lead to speedups of several orders
for some query types (especially those dominated by specific ``and`` queries).

Queries involving large ``or`` subqueries are difficult to optimize
in the standard way. But often they can be replaced by incremental
filtering. With this technique, a subquery is removed from the
original search, the modified search executed and the result filtered
by the removed subquery. ``incrementalsearch`` supports incremental filtering
and thereby can again gain serveral orders of speedup for otherwise
difficult to treat query types.

The primary concept is that of an ``ISearch`` (incremental search).
This is conceptionally a sorted list, computed incrementally
(or lazily). The elements of this list are the ``ISearch``'s hits.
The ``ISearch``'s ``keytype`` determines the type of the list
elements. Currently supported are ``OBJECT`` (comparable Python objects),
``INT`` (Python 32 bit integers) and ``LONG`` (Python 64 bit integers).

Example usage
``incrementalsearch`` is rarely used directly but usually indirectly
via a higher level search engine such as ``Products.AdvancedQuery``.
Let's nevertheless look at some examples.

Assume we want to find elements available in each of a sequence
of integer BTrees ``seq_btrees``. I.e. we are interested in elements contained
in the intersection of these BTrees.

>>> from dm.incrementalsearch import IBTree, IAnd_int, Unspecified, \

We transform the BTrees into incrementally searchable objects by
applying the ``IBTree`` operator (Incremental BTree). Then
we combine them by ``IAnd_int`` and indicate that no more searches
will be added by calling ``complete``. This causes several optimizations
to take place.

>>> isearch = IAnd_int(*map(IBTree, seq_btrees))
>>> isearch.complete()

An ``ISearch`` has attributes ``value`` (the current value) and ``classification``
(it indicates whether the current value is a hit and whether there may be
more values) and methods ``advanceFrom`` and ``advanceTo`` to move forward
in the search. To determine the first hit, we can use:

>>> cl = isearch.advanceFrom(Unspecified, Unspecified)
>>> if cl != AT_END:
...     first_hit = isearch.value

The next hit is determined by

>>> cl = isearch.advanceFrom(isearch.value, Unspecified)
>>> if cl != AT_END:
...     next_hit = isearch.value

If we know that hits at or below 1000 are irrelevant, we can use

>>> cl = isearch.advanceFrom(1000, Unspecified)
>>> if cl != AT_END:
...     hit_above_1000 = isearch.value

We can also indicate that we are interested only in hits below
some upper value.

>>> cl = isearch.advanceFrom(2000, 10000)
>>> if cl == INLIST_SUCCESS:
...     hit_above_2000_below_10000 = isearch.value

We can use the ``asSet`` method to obtain all hits
as a (BTree) set. ``asSet`` should only be called on fresh, i.e. (as yet)
unadvanced isearches (we may remove this restriction later, if there
is a need to).

>>> isearch = IAnd_int(*map(IBTree, seq_btrees))
>>> isearch.complete()
>>> bset = isearch.asSet()

Isearches also support iteration.

>>> for hit in isearch:
...   ...

``ISearch`` constructors

The primary ``ISearch`` constructors are

  turns a ``ZODB.BTrees`` object (tree, bucket, set or treeset) into
  an ``ISearch``.
  combines ``ISearches`` by an ``and``
  combines ``ISearches`` by an ``or``
  negates an ``ISearch``
  a filtering ``ISearch``
  the empty ``ISearch``
  a base class to define your own ``ISearch`` es