Span

class

A slice from a Doc object.

Span.__init__ method

Create a Span object from the slice doc[start : end].

Name	Description
`doc`	The parent document. Doc
`start`	The index of the first token of the span. int
`end`	The index of the first token after the span. int
`label`	A label to attach to the span, e.g. for named entities. Union[str, int]
`vector`	A meaning representation of the span. numpy.ndarray[ndim=1, dtype=float32]
`vector_norm`	The L2 norm of the span’s vector representation. float
`kb_id`	A knowledge base ID to attach to the span, e.g. for named entities. Union[str, int]
`span_id`	An ID to associate with the span. Union[str, int]
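A minimal sketch of constructing a span directly. It assumes a blank English pipeline (tokenizer only, no trained model); the label `"EXAMPLE"` is a hypothetical value chosen for illustration.

```python
import spacy
from spacy.tokens import Span

# A blank pipeline only tokenizes, so no trained model is required.
nlp = spacy.blank("en")
doc = nlp("Give it back! He pleaded.")

# Covers tokens 1, 2 and 3; the end index is exclusive.
span = Span(doc, 1, 4, label="EXAMPLE")
tokens = [token.text for token in span]
label = span.label_
```

Slicing the doc with `doc[1:4]` yields the same tokens, without a label.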

Span.__getitem__ method

Get a Token object.

Name	Description
`i`	The index of the token within the span. int
RETURNS	The token at `span[i]`. Token

Get a Span object.

Name	Description
`start_end`	The slice of the span to get. Tuple[int, int]
RETURNS	The span at `span[start : end]`. Span
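A short sketch of both indexing forms, assuming a blank English pipeline. Indices are relative to the span, not to the parent doc.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]

# An integer index returns a Token.
token = span[1]
# A slice returns a new Span over the same parent doc.
sub_span = span[1:3]
```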

Span.__iter__ method

Iterate over Token objects.

Name	Description
YIELDS	A `Token` object. Token

Span.__len__ method

Get the number of tokens in the span.

Name	Description
RETURNS	The number of tokens in the span. int
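Iteration and length can be sketched together, again assuming a blank English pipeline.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]

# Iterating a span yields its Token objects in order.
texts = [token.text for token in span]
# len() counts tokens, not characters.
n_tokens = len(span)
```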

Span.set_extension classmethod

Define a custom attribute on the Span which becomes available via Span._. For details, see the documentation on custom attributes.

Name	Description
`name`	Name of the attribute to set by the extension. For example, `"my_attr"` will be available as `span._.my_attr`. str
`default`	Optional default value of the attribute if no getter or method is defined. Optional[Any]
`method`	Set a custom method on the object, for example `span._.compare(other_span)`. Optional[Callable[[Span, …], Any]]
`getter`	Getter function that takes the object and returns an attribute value. Is called when the user accesses the `._` attribute. Optional[Callable[[Span], Any]]
`setter`	Setter function that takes the `Span` and a value, and modifies the object. Is called when the user writes to the `Span._` attribute. Optional[Callable[[Span, Any], None]]
`force`	Force overwriting existing attribute. bool
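A minimal sketch of registering a getter-based and a default-based extension. The attribute names `has_city` and `note` and the `CITIES` tuple are hypothetical; `force=True` is used only so the snippet can be re-run without raising on an existing registration.

```python
import spacy
from spacy.tokens import Span

CITIES = ("New York", "Paris", "Berlin")  # hypothetical lookup data


def has_city(span):
    # Getter: computed every time span._.has_city is read.
    return any(city in span.text for city in CITIES)


Span.set_extension("has_city", getter=has_city, force=True)
# Default-based attribute: readable and writable per span.
Span.set_extension("note", default=None, force=True)

nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn")
span = doc[1:4]
found_city = span._.has_city
span._.note = "reviewed"
note = span._.note
```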

Span.get_extension classmethod

Look up a previously registered extension by name. Returns a 4-tuple (default, method, getter, setter) if the extension is registered. Raises a KeyError otherwise.

Name	Description
`name`	Name of the extension. str
RETURNS	A `(default, method, getter, setter)` tuple of the extension. Tuple[Optional[Any], Optional[Callable], Optional[Callable], Optional[Callable]]

Span.has_extension classmethod

Check whether an extension has been registered on the Span class.

Name	Description
`name`	Name of the extension to check. str
RETURNS	Whether the extension has been registered. bool

Span.remove_extension classmethod

Remove a previously registered extension.

Name	Description
`name`	Name of the extension. str
RETURNS	A `(default, method, getter, setter)` tuple of the removed extension. Tuple[Optional[Any], Optional[Callable], Optional[Callable], Optional[Callable]]
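The lookup, check and removal calls can be sketched in one round trip. The extension name `is_city` is a hypothetical example.

```python
from spacy.tokens import Span

Span.set_extension("is_city", default=False, force=True)

# Both calls return the (default, method, getter, setter) tuple.
extension = Span.get_extension("is_city")
registered_before = Span.has_extension("is_city")
removed = Span.remove_extension("is_city")
registered_after = Span.has_extension("is_city")
```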

Span.char_span method

Create a Span object from the slice span.text[start:end]. Returns None if the character indices don’t map to a valid span.

Name	Description
`start`	The index of the first character of the span. int
`end`	The index of the first character after the span. int
`label`	A label to attach to the span, e.g. for named entities. Union[int, str]
`kb_id`	An ID from a knowledge base to capture the meaning of a named entity. Union[int, str]
`vector`	A meaning representation of the span. numpy.ndarray[ndim=1, dtype=float32]
`id`	Unused. Union[int, str]
`alignment_mode` v3.5.1	How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"contract"` (span of all tokens completely within the character span), `"expand"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. str
`span_id` v3.5.1	An identifier to associate with the span. Union[int, str]
RETURNS	The newly constructed object or `None`. Optional[Span]
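A short sketch, assuming a blank English pipeline and the default `"strict"` alignment. Character offsets are relative to the span's text, not the doc's.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("I like New York")
parent = doc[1:4]  # "like New York"

# Characters 5..13 of "like New York" cover "New York" exactly.
city = parent.char_span(5, 13, label="GPE")
# Offsets that end mid-token do not align with token boundaries.
misaligned = parent.char_span(5, 12)
```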

Span.similarity method (needs model)

Make a semantic similarity estimate. The default estimate is cosine similarity using an average of word vectors.

Name	Description
`other`	The object to compare with. By default, accepts `Doc`, `Span`, `Token` and `Lexeme` objects. Union[Doc,Span,Token,Lexeme]
RETURNS	A scalar similarity score. Higher is more similar. float
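A sketch of the default estimate using hand-made vectors instead of a trained model. The 2-dimensional `toy_vectors` are hypothetical values chosen so the result is easy to check by hand; a real pipeline would ship its own vectors.

```python
import numpy
import spacy

nlp = spacy.blank("en")

# Hypothetical toy vectors, for illustration only.
toy_vectors = {
    "green": [1.0, 0.0],
    "apples": [1.0, 1.0],
    "red": [0.0, 1.0],
    "oranges": [1.0, 1.0],
}
nlp.vocab.reset_vectors(width=2)
for word, values in toy_vectors.items():
    nlp.vocab.set_vector(word, numpy.asarray(values, dtype="float32"))

doc = nlp("green apples and red oranges")
green_apples = doc[:2]   # averaged vector: [1.0, 0.5]
red_oranges = doc[3:]    # averaged vector: [0.5, 1.0]
score = green_apples.similarity(red_oranges)
```

With these vectors the cosine is 1.0 / 1.25 = 0.8.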

Span.get_lca_matrix method

Calculates the lowest common ancestor matrix for a given Span. Returns an LCA matrix containing the integer index of the ancestor, or -1 if no common ancestor is found, e.g. if the span excludes a necessary ancestor.

Name	Description
RETURNS	The lowest common ancestor matrix of the `Span`. numpy.ndarray[ndim=2, dtype=int32]
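A sketch using a hand-written dependency parse, so no trained parser is needed. The `heads` and `deps` values are assumptions made for this example, not model output. Matrix indices are relative to the span.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
words = ["I", "like", "New", "York", "in", "Autumn"]
# Assumed parse: absolute index of each token's head.
heads = [1, 1, 3, 1, 3, 4]
deps = ["nsubj", "ROOT", "compound", "dobj", "prep", "pobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

span = doc[1:4]  # "like New York"
matrix = span.get_lca_matrix()
```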

Span.to_array method

Given a list of M attribute IDs, export the tokens to a numpy ndarray of shape (N, M), where N is the length of the span in tokens. The values will be 64-bit unsigned integers.

Name	Description
`attr_ids`	A list of attributes (int IDs or string names) or a single attribute (int ID or string name). Union[int, str, List[Union[int, str]]]
RETURNS	The exported attributes as a numpy array. Union[numpy.ndarray[ndim=2, dtype=uint64],numpy.ndarray[ndim=1, dtype=uint64]]
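A minimal sketch with two lexical attributes that are available without a trained model. It passes integer IDs from `spacy.attrs`.

```python
import spacy
from spacy.attrs import IS_ALPHA, LOWER

nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn.")
span = doc[1:4]  # "like New York"

# One row per token in the span, one column per attribute.
array = span.to_array([LOWER, IS_ALPHA])
```

String-valued attributes such as `LOWER` are exported as hash values, which can be resolved through `nlp.vocab.strings`.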

Span.ents property (needs model)

The named entities that fall completely within the span. Returns a tuple of Span objects.

Name	Description
RETURNS	Entities in the span, one `Span` per entity. Tuple[Span, …]
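A sketch that sets an entity by hand instead of running a trained entity recognizer. The `"GPE"` annotation is an assumption for illustration. Only entities that lie completely inside the span are returned.

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn")
# Manually annotate "New York" as an entity.
doc.ents = [Span(doc, 2, 4, label="GPE")]

# "like New York in" fully contains the entity.
inside = [(ent.text, ent.label_) for ent in doc[1:5].ents]
# "I like New" cuts the entity in half, so nothing is returned.
cut_off = [ent.text for ent in doc[0:3].ents]
```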

Span.noun_chunks property (needs model)

Iterate over the base noun phrases in the span. Yields base noun-phrase Span objects, if the document has been syntactically parsed. A base noun phrase, or “NP chunk”, is a noun phrase that does not permit other NPs to be nested within it – so no NP-level coordination, no prepositional phrases, and no relative clauses.

If the noun_chunk syntax iterator has not been implemented for the given language, a NotImplementedError is raised.

Name	Description
YIELDS	Noun chunks in the span. Span

Span.as_doc method

Create a new Doc object corresponding to the Span, with a copy of the data.

When calling this on many spans from the same doc, passing in a precomputed array representation of the doc using the array_head and array args can save time.

Name	Description
`copy_user_data`	Whether or not to copy the original doc’s user data. bool
`array_head`	Precomputed array attributes (headers) of the original doc, as generated by `Doc._get_array_attrs()`. Tuple
`array`	Precomputed array version of the original doc as generated by `Doc.to_array`. numpy.ndarray
RETURNS	A `Doc` object of the `Span`’s content. Doc
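A short sketch, assuming a blank English pipeline. The new doc is independent of the original and contains only the span's tokens.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn.")
span = doc[2:4]

# Copies the span's data into a standalone Doc.
new_doc = span.as_doc()
new_tokens = [token.text for token in new_doc]
```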

Span.root property (needs model)

The token with the shortest path to the root of the sentence (or the root itself). If multiple tokens are equally high in the tree, the first token is taken.

Name	Description
RETURNS	The root token. Token
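A sketch with a hand-written parse, so no trained parser is needed. The `heads` and `deps` values are assumptions for this example.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
words = ["I", "like", "New", "York", "in", "Autumn"]
heads = [1, 1, 3, 1, 3, 4]  # assumed parse, absolute head indices
deps = ["nsubj", "ROOT", "compound", "dobj", "prep", "pobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

# "York" heads "New" and attaches outside the span, so it is the span root.
phrase_root = doc[2:4].root.text
# A span containing the sentence root returns that root.
sentence_root = doc[0:6].root.text
```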

Span.conjuncts property (needs model)

A tuple of tokens coordinated to span.root.

Name	Description
RETURNS	The coordinated tokens. Tuple[Token, …]
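A sketch using a hand-written parse in which "oranges" is attached to "apples" with the `conj` relation. The annotation is an assumption for illustration.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
words = ["I", "like", "apples", "and", "oranges"]
heads = [1, 1, 1, 2, 2]  # assumed parse, absolute head indices
deps = ["nsubj", "ROOT", "dobj", "cc", "conj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

# Conjuncts are resolved through the span's root token.
apples_conjuncts = [t.text for t in doc[2:3].conjuncts]
oranges_conjuncts = [t.text for t in doc[4:5].conjuncts]
```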

Span.lefts property (needs model)

Tokens that are to the left of the span, whose heads are within the span.

Name	Description
YIELDS	A left-child of a token of the span. Token

Span.rights property (needs model)

Tokens that are to the right of the span, whose heads are within the span.

Name	Description
YIELDS	A right-child of a token of the span. Token

Span.n_lefts property (needs model)

The number of tokens that are to the left of the span, whose heads are within the span.

Name	Description
RETURNS	The number of left-child tokens. int

Span.n_rights property (needs model)

The number of tokens that are to the right of the span, whose heads are within the span.

Name	Description
RETURNS	The number of right-child tokens. int
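The left and right children, and their counts, can be sketched with a hand-written parse. The `heads` and `deps` values are assumptions for this example; here "in" is attached to "York".

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
words = ["I", "like", "New", "York", "in", "Autumn"]
heads = [1, 1, 3, 1, 3, 4]  # assumed parse, absolute head indices
deps = ["nsubj", "ROOT", "compound", "dobj", "prep", "pobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

york_in_autumn = doc[3:6]
new_york = doc[2:4]

# "New" sits left of the span and is headed by "York" inside it.
lefts = [t.text for t in york_in_autumn.lefts]
# "in" sits right of the span and is headed by "York" inside it.
rights = [t.text for t in new_york.rights]
n_lefts = york_in_autumn.n_lefts
n_rights = new_york.n_rights
```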

Span.subtree property (needs model)

Tokens within the span and tokens which descend from them.

Name	Description
YIELDS	A token within the span, or a descendant from it. Token
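A sketch with a hand-written parse (an assumption for this example), showing that the subtree extends beyond the span to the descendants of its tokens.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
words = ["I", "like", "New", "York", "in", "Autumn"]
heads = [1, 1, 3, 1, 3, 4]  # assumed parse, absolute head indices
deps = ["nsubj", "ROOT", "compound", "dobj", "prep", "pobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

# "in" and "Autumn" descend from "York", so they are included.
subtree = [t.text for t in doc[2:4].subtree]
```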

Span.has_vector property (needs model)

A boolean value indicating whether a word vector is associated with the object.

Name	Description
RETURNS	Whether the span has vector data attached. bool

Span.vector property (needs model)

A real-valued meaning representation. Defaults to an average of the token vectors.

Name	Description
RETURNS	A 1-dimensional array representing the span’s vector. numpy.ndarray[ndim=1, dtype=float32]

Span.vector_norm property (needs model)

The L2 norm of the span’s vector representation.

Name	Description
RETURNS	The L2 norm of the vector representation. float
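A sketch of the three vector properties using hand-made vectors instead of a trained model. The 2-dimensional values are hypothetical and chosen so the average and norm are easy to verify.

```python
import numpy
import spacy

nlp = spacy.blank("en")

# Hypothetical toy vectors, for illustration only.
nlp.vocab.reset_vectors(width=2)
nlp.vocab.set_vector("apples", numpy.asarray([3.0, 0.0], dtype="float32"))
nlp.vocab.set_vector("oranges", numpy.asarray([3.0, 8.0], dtype="float32"))

span = nlp("apples oranges")[0:2]
has_vector = span.has_vector
vector = span.vector       # average of the two token vectors
norm = span.vector_norm    # L2 norm of that average

# A span whose tokens have no vectors reports False.
unknown_has_vector = nlp("pears")[0:1].has_vector
```

Here the averaged vector is (3, 4), whose L2 norm is 5.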

Span.sent property (needs model)

The sentence span that this span is a part of. This property is only available when sentence boundaries have been set on the document by the parser, senter, sentencizer or some custom function. It will raise an error otherwise.

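If the span happens to cross sentence boundaries, only the first sentence will be returned. If it is required that the sentence always includes the full span, the result can be adjusted by extending the sentence’s end to at least the span’s end, i.e. by slicing the doc from sent.start to max(sent.end, span.end).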
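A sketch of both cases. It assumes sentence boundaries set by the rule-based `sentencizer` on a blank English pipeline rather than by a trained parser.

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")  # sets sentence boundaries on punctuation
doc = nlp("Give it back! He pleaded.")

# A span inside one sentence returns that sentence.
inner_sentence = doc[1:3].sent.text

# A span crossing a boundary: widen the sentence to cover the full span.
span = doc[2:6]  # "back! He pleaded"
sent = span.sent
full = doc[sent.start : max(sent.end, span.end)]
```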

Name	Description
RETURNS	The sentence span that this span is a part of. Span

Span.sents property (v3.2.1, needs model)

Returns a generator over the sentences the span belongs to. This property is only available when sentence boundaries have been set on the document by the parser, senter, sentencizer or some custom function. It will raise an error otherwise.

If the span happens to cross sentence boundaries, all sentences the span overlaps with will be returned.

Name	Description
RETURNS	A generator yielding the sentences this `Span` is a part of. Iterable[Span]
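A minimal sketch, assuming spaCy v3.2.1 or later and sentence boundaries set by the rule-based `sentencizer`. The property is a generator, so it is materialized with `list()` before counting.

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
doc = nlp("Give it back! He pleaded.")

# "back! He" overlaps both sentences.
span = doc[2:5]
sentences = list(span.sents)
first_sentence = sentences[0].text
```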

Attributes

Name	Description
`doc`	The parent document. Doc
`tensor`	The span’s slice of the parent `Doc`’s tensor. numpy.ndarray
`start`	The token offset for the start of the span. int
`end`	The token offset for the end of the span. int
`start_char`	The character offset for the start of the span. int
`end_char`	The character offset for the end of the span. int
`text`	A string representation of the span text. str
`text_with_ws`	The text content of the span with a trailing whitespace character if the last token has one. str
`orth`	ID of the verbatim text content. int
`orth_`	Verbatim text content (identical to `Span.text`). Exists mostly for consistency with the other attributes. str
`label`	The hash value of the span’s label. int
`label_`	The span’s label. str
`lemma_`	The span’s lemma. Equivalent to `"".join(token.lemma_ + token.whitespace_ for token in span).strip()`. str
`kb_id`	The hash value of the knowledge base ID referred to by the span. int
`kb_id_`	The knowledge base ID referred to by the span. str
`ent_id`	The hash value of the named entity the root token is an instance of. int
`ent_id_`	The string ID of the named entity the root token is an instance of. str
`id`	The hash value of the span’s ID. int
`id_`	The span’s ID. str
`sentiment`	A scalar value indicating the positivity or negativity of the span. float
`_`	User space for adding custom attribute extensions. Underscore
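A short sketch of the offset, text and label attributes, assuming a blank English pipeline. The `"GPE"` label is set by hand for illustration.

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn")
span = Span(doc, 2, 4, label="GPE")

token_offsets = (span.start, span.end)           # token indices in the doc
char_offsets = (span.start_char, span.end_char)  # character indices in doc.text
text = span.text
text_with_ws = span.text_with_ws  # keeps the trailing space after "York"
label_hash = span.label           # integer hash of the label string
label_text = span.label_
```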
