zc.blist

HomePage: UNKNOWN

Author: Gary Poster

Download: https://pypi.python.org/packages/source/z/zc.blist/zc.blist-1.0b2.tar.gz

        ~~~~~~~~
zc.blist
~~~~~~~~

.. contents::

========
Overview
========

The sequence in this package has a list-like API, but stores its values in
individual buckets. This means that, for small changes in large sequences, the
sequence could be a big win. For instance, an ordered BTree-based container
might want to store order in a sequence, so that moves only cause a bucket or
two--around 50 strings or less--to be rewritten in the database, rather than
the entire contents (which might be thousands of strings, for instance).

If the sequence is most often completely rearranged, the complexity of the code
in this package is not desirable.  It only makes sense if changes most
frequently are fairly small.

One downside is that reading and writing is more work than with a normal list.
If this were to actually gain traction, perhaps writing some or all of it in C
would be helpful.  However, it still seems pretty snappy.

Another downside is the corollary of the bucket advantage listed initially:
with more persistent objects, iterating over it will fill a lot of ZODB's
object cache (which is based on the number of objects cached, rather than the
size). Consider specifying a big object cache if you are using these to store a
lot of data and are frequently iterating or changing.

These sequences return slices as iterators, and add some helpful iteration
methods.  It adds a ``copy`` method that provides a cheap copy of the blist
that shares all buckets and indexes until a write happens, at which point it
copies and mutates the affected indexes and buckets.

We'll take a glance at how these differences work, and then describe the
implementation's basic mechanism, and close with a brief discussion of
performance characteristics in the abstract.

==============================
Differences from Python's List
==============================

Slices are Iterators
====================

This doesn't need much discussion.  Getting slices of all sorts returns
iterators.

    >>> from zc.blist import BList
    >>> l = BList(range(1000))
    >>> l[345:351] # doctest: +ELLIPSIS
    <generator object at ...>
    >>> list(l[345:351])
    [345, 346, 347, 348, 349, 350]

    >>> l[351:345:-1] # doctest: +ELLIPSIS
    <generator object at ...>
    >>> list(l[351:345:-1])
    [351, 350, 349, 348, 347, 346]

    >>> l[345:351:2] # doctest: +ELLIPSIS
    <generator object at ...>
    >>> list(l[345:351:2])
    [345, 347, 349]

Additional Iteration Methods
============================

``iterReversed`` lets you iterate over the list in reverse order, efficiently,
with a given start point.  It is used for slices that proceed with a step of
-1.

    >>> i = l.iterReversed()
    >>> i.next()
    999
    >>> i.next()
    998
    >>> list(i)[-10:]
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

``iterSlice`` lets you iterate over the list with a slice.  It is equivalent to
using a slice with __getitem__.

    >>> i = l.iterSlice(345, 351, 2)
    >>> i # doctest: +ELLIPSIS
    <generator object at ...>
    >>> list(i)
    [345, 347, 349]

Cheap ``copy``
==============

The ``copy`` method produces a cheap copy of the given blist.  All buckets
and indexes are shared until a change is made to either side.  Copies can
safely be made of other copies.

    >>> c = l.copy()
    >>> l == c
    True
    >>> list(c) == list(l)
    True
    >>> del c[10:]
    >>> list(l) == range(1000)
    True
    >>> list(c)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> l == c
    False
    >>> c2 = c.copy()
    >>> c2 == c
    True
    >>> list(c2)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

=========
Mechanism
=========

In its implementation, the sequence is an adapted B+ tree. Indexes are keys, but
each bucket or branch starts at 0. For instance, a perfectly-balanced bucket
sequence with 16 items, and a limit of 3 entries in a bucket or branch, would
have "keys" like this. In the diagram, the top three rows are indexes, and the
bottom row consists of buckets::

        0           8
     0     4     0     4
    0 2   0 2   0 2   0 2
   01 01 01 01 01 01 01 01

So, for instance, you would get the value a