codec.base module

This module contains base classes/interfaces for “codec” objects.

Classes

class whoosh.codec.base.Codec
class whoosh.codec.base.PerDocumentWriter
class whoosh.codec.base.FieldWriter
class whoosh.codec.base.PostingsWriter
written()

Returns True if this object has already written to disk.

class whoosh.codec.base.TermsReader
class whoosh.codec.base.PerDocumentReader
all_doc_ids()

Returns an iterator of all (undeleted) document IDs in the reader.

class whoosh.codec.base.Segment(indexname)

Do not instantiate this object directly. It is used by the Index object to hold information about a segment. A list of objects of this class is pickled as part of the TOC file.

The TOC file stores a minimal amount of information, mostly a list of Segment objects. The segments are the actual inverted indexes. Having multiple segments allows quick incremental indexing: create a new segment for the new documents, and have the index overlay the new segment on the previous ones when reading and searching. “Optimizing” the index combines the contents of the existing segments into a single segment, removing any deleted documents along the way.
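The segment model described above can be sketched with a toy in-memory example. This is an illustration only, not Whoosh's implementation: plain dictionaries stand in for segments, and the function names (search, add_documents, optimize) and the use of globally unique document IDs are assumptions made for the sketch.

```python
# Toy model only: plain dicts stand in for segments.
# This is not Whoosh's on-disk format or API.

def search(segments, term):
    """Return live document IDs matching `term`, reading across all segments."""
    hits = []
    for seg in segments:
        for doc_id in seg["postings"].get(term, []):
            if doc_id not in seg["deleted"]:
                hits.append(doc_id)
    return sorted(hits)


def add_documents(segments, docs):
    """Index new documents by appending a fresh segment; older ones are untouched."""
    postings = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            postings.setdefault(term, []).append(doc_id)
    segments.append({"postings": postings, "deleted": set()})


def optimize(segments):
    """Merge every segment into one, dropping deleted documents on the way."""
    merged = {}
    for seg in segments:
        for term, doc_ids in seg["postings"].items():
            live = [d for d in doc_ids if d not in seg["deleted"]]
            if live:
                merged.setdefault(term, []).extend(live)
    return [{"postings": merged, "deleted": set()}]


segments = []
add_documents(segments, {0: "fast incremental indexing", 1: "segment overlay"})
add_documents(segments, {2: "incremental segment merge"})  # second, newer segment
segments[0]["deleted"].add(1)  # mark document 1 as deleted; nothing is rewritten

before = search(segments, "segment")   # reads across both segments
optimized = optimize(segments)         # one merged segment, deletions purged
after = search(optimized, "segment")
```

Searching gives the same result before and after optimizing; what changes is the number of segments that must be consulted and whether the deleted document's data is still stored.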

create_file(storage, ext, **kwargs)

Convenience method to create a new file in the given storage named with this segment’s ID and the given extension. Any keyword arguments are passed to the storage’s create_file method.

delete_document(docnum, delete=True)

Deletes the given document number. The document is not actually removed from the index until the index is optimized.

Parameters:
  • docnum – The document number to delete.
  • delete – If False, this undeletes a deleted document.
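As a rough sketch of the mark-as-deleted behaviour, the toy class below records deletions in a set instead of removing data. ToySegment and its attributes are hypothetical stand-ins, not Whoosh's Segment class, and the handling of repeated deletions here (silently ignored) is an assumption of the sketch.

```python
class ToySegment:
    """Hypothetical stand-in for a segment; not Whoosh's Segment class."""

    def __init__(self, doc_total):
        self.doc_total = doc_total  # number of documents ever written
        self.deleted = set()        # document numbers marked as deleted

    def delete_document(self, docnum, delete=True):
        """Mark `docnum` as deleted, or undelete it when delete=False."""
        if not 0 <= docnum < self.doc_total:
            raise IndexError(docnum)
        if delete:
            self.deleted.add(docnum)
        else:
            self.deleted.discard(docnum)

    def is_deleted(self, docnum):
        return docnum in self.deleted


seg = ToySegment(doc_total=3)
seg.delete_document(1)                # document 1 is only marked, not removed
was_deleted = seg.is_deleted(1)
seg.delete_document(1, delete=False)  # undelete the same document
is_restored = not seg.is_deleted(1)
```

Because the stored data is untouched until optimization, undeleting is just a matter of clearing the mark.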
deleted_count()

Returns the total number of deleted documents in this segment.

doc_count()

Returns the number of (undeleted) documents in this segment.

doc_count_all()

Returns the total number of documents in this segment, whether deleted or undeleted.
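The three counting methods are related by simple arithmetic: the undeleted count is the total count minus the deleted count. The sketch below illustrates that relationship with a hypothetical SegmentCounts class; it is not how Whoosh stores these values.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentCounts:
    """Hypothetical bookkeeping for one segment (illustration only)."""

    total: int                                 # documents ever written
    deleted: set = field(default_factory=set)  # document numbers marked deleted

    def doc_count_all(self):
        # Every document, deleted or not.
        return self.total

    def deleted_count(self):
        return len(self.deleted)

    def doc_count(self):
        # Only the documents that are still live.
        return self.doc_count_all() - self.deleted_count()


counts = SegmentCounts(total=5, deleted={1, 3})
```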

has_deletions()

Returns True if any documents in this segment are deleted.

is_deleted(docnum)

Returns True if the given document number is deleted.

open_file(storage, ext, **kwargs)

Convenience method to open a file in the given storage named with this segment’s ID and the given extension. Any keyword arguments are passed to the storage’s open_file method.
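The following sketch shows the idea behind the create_file and open_file convenience methods: the segment builds a file name from its own ID plus the extension, then delegates to the storage object. ToyStorage, ToySegment, the make_filename helper, the exact file name format, and the ID and extension values used are all assumptions made for illustration, not Whoosh's actual classes.

```python
import io


class _SavingBuffer(io.BytesIO):
    """In-memory file that copies its contents into the storage on close."""

    def __init__(self, files, name):
        super().__init__()
        self._files = files
        self._name = name

    def close(self):
        if not self.closed:
            self._files[self._name] = self.getvalue()
        super().close()


class ToyStorage:
    """Hypothetical storage backed by a dict of file name -> bytes."""

    def __init__(self):
        self.files = {}

    def create_file(self, name, **kwargs):
        return _SavingBuffer(self.files, name)

    def open_file(self, name, **kwargs):
        return io.BytesIO(self.files[name])


class ToySegment:
    """Hypothetical segment that names its files after its own ID."""

    def __init__(self, segment_id):
        self.segment_id = segment_id

    def make_filename(self, ext):
        # Assumed format: segment ID followed directly by the extension.
        return f"{self.segment_id}{ext}"

    def create_file(self, storage, ext, **kwargs):
        return storage.create_file(self.make_filename(ext), **kwargs)

    def open_file(self, storage, ext, **kwargs):
        return storage.open_file(self.make_filename(ext), **kwargs)


storage = ToyStorage()
seg = ToySegment("MAIN_abc123")  # illustrative ID

out = seg.create_file(storage, ".trm")  # illustrative extension
out.write(b"terms")
out.close()

data = seg.open_file(storage, ".trm").read()
```

Callers only pass an extension, so every file belonging to a segment shares the same name prefix and the naming rule lives in one place.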