Textual tokens identified by an NLP pipeline and marked up with
metadata from automatic taggers and possibly also Labels from
humans.

Attributes:
 - token_num: zero-based index into the stream of tokens from a document
 - token: actual token string; must always be a UTF-8 encoded byte string, not a
unicode string, because Thrift stores it as 8-bit bytes.
 - offsets: offsets into the original data (see Offset.content_form)
 - sentence_pos: zero-based index into the sentence, which is used for dependency …
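The attribute list above can be illustrated with a plain Python stand-in. The real `Token` is a Thrift-generated struct in `streamcorpus.ttypes`, so this sketch only mirrors the documented field names; the class itself and its constructor signature are assumptions, not the library's API.

```python
from dataclasses import dataclass

@dataclass
class TokenSketch:
    """Hypothetical stand-in for streamcorpus.ttypes.Token, for illustration only."""
    token_num: int     # zero-based index into the document's token stream
    token: bytes       # UTF-8 encoded bytes, not a unicode str (Thrift stores 8-bit)
    sentence_pos: int  # zero-based index within the containing sentence

# Thrift requires 8-bit strings, so unicode text must be encoded before storage.
t = TokenSketch(token_num=0, token="café".encode("utf-8"), sentence_pos=0)
print(t.token.decode("utf-8"))
```

Note the round-trip: the token is stored as UTF-8 bytes and must be decoded back to unicode for display, which is exactly why the docstring warns against storing a unicode string directly.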

src/s/t/streamcorpus-0.3.30/src/streamcorpus/dump.py   streamcorpus
import collections
from streamcorpus._chunk import Chunk as _Chunk
from streamcorpus.ttypes import OffsetType, Token, EntityType, MentionType
 
from streamcorpus.ttypes import StreamItem as StreamItem_v0_3_0

src/s/t/streamcorpus-dev-0.3.23.dev30/src/streamcorpus/dump.py   streamcorpus-dev
import collections
from streamcorpus._chunk import Chunk
from streamcorpus.ttypes import OffsetType, Token, EntityType, MentionType
 
from streamcorpus.ttypes import StreamItem as StreamItem_v0_3_0