Legacy functions and architectures
The spacy-legacy package includes outdated registered functions and
architectures. It is installed automatically as a dependency of spaCy, and
provides backwards compatibility for archived functions that may still be
used in projects.
You can find the detailed documentation of each such legacy function on this page.
Architectures
These functions are available from @spacy.registry.architectures.
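As a rough illustration of how a legacy architecture is referenced by its registered name, the sketch below parses a config-style fragment with Python's standard configparser. This is an assumption made to keep the example dependency-free: spaCy reads configs with its own config system, and the section name and the values shown (width = 96 and so on) are illustrative only.

```python
import configparser

# Illustrative config fragment; the section name and values are assumptions.
CONFIG_TEXT = """
[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v1"
width = 96
window_size = 1
maxout_pieces = 3
depth = 4
"""

# Disable interpolation so the text is read literally.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

section = parser["components.tok2vec.model.encode"]
# The registered name is stored as a quoted string in the config.
name = section["@architectures"].strip('"')
width = section.getint("width")
print(name, width)
```

The version suffix in the name (.v1) is what selects the legacy function rather than its current replacement.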
spacy.Tok2Vec.v1
The spacy.Tok2Vec.v1 architecture expected an encode model of type
Model[Floats2d, Floats2d], such as spacy.MaxoutWindowEncoder.v1 or
spacy.MishWindowEncoder.v1.
Construct a tok2vec model out of two subnetworks: one for embedding and one for encoding. See the “Embed, Encode, Attend, Predict” blog post for background.
Name | Description |
---|---|
embed | Embed tokens into context-independent word vector representations. For example, CharacterEmbed or MultiHashEmbed. Model[List[Doc], List[Floats2d]] |
encode | Encode context into the embeddings, using an architecture such as a CNN, BiLSTM or transformer. For example, MaxoutWindowEncoder.v1. Model[Floats2d,Floats2d] |
CREATES | The model using the architecture. Model[List[Doc], List[Floats2d]] |
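The type mismatch between embed (which yields one matrix per document) and a v1-style encode (which takes a single matrix) can be pictured with the plain-Python sketch below. It is not the library's code: Matrix stands in for Floats2d, apply_flat_encoder and double are hypothetical helpers, and any padding the real layer may insert between documents is omitted.

```python
from typing import Callable, List

Matrix = List[List[float]]  # stand-in for Floats2d: one row per token


def apply_flat_encoder(
    encode: Callable[[Matrix], Matrix], docs: List[Matrix]
) -> List[Matrix]:
    """Run a Matrix -> Matrix encoder over a batch of per-document matrices."""
    lengths = [len(doc) for doc in docs]
    # Concatenate all token rows into one matrix, as a v1-style encoder expects.
    flat = [row for doc in docs for row in doc]
    encoded = encode(flat)
    # Split the encoded rows back into one matrix per document.
    out, start = [], 0
    for n in lengths:
        out.append(encoded[start:start + n])
        start += n
    return out


def double(matrix: Matrix) -> Matrix:
    # Hypothetical encoder used only for this demo.
    return [[2 * x for x in row] for row in matrix]


batch = [[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]]
result = apply_flat_encoder(double, batch)
print(result)
```

A v2-style encoder, typed Model[List[Floats2d], List[Floats2d]], would accept the batch directly and make this wrapping step unnecessary.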
spacy.MaxoutWindowEncoder.v1
The spacy.MaxoutWindowEncoder.v1 architecture produced a model of type
Model[Floats2d, Floats2d]. Since spacy.MaxoutWindowEncoder.v2, this has been
changed to output type Model[List[Floats2d], List[Floats2d]].
Encode context using convolutions with maxout activation, layer normalization and residual connections.
Name | Description |
---|---|
width | The input and output width. These are required to be the same, to allow residual connections. This value will be determined by the width of the inputs. Recommended values are between 64 and 300. int |
window_size | The number of words to concatenate around each token to construct the convolution. Recommended value is 1. int |
maxout_pieces | The number of maxout pieces to use. Recommended values are 2 or 3. int |
depth | The number of convolutional layers. Recommended value is 4. int |
CREATES | The model using the architecture. Model[Floats2d,Floats2d] |
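To make the window_size parameter concrete, here is a minimal sketch of concatenating each token vector with its neighbours. It assumes zero vectors at the sequence edges and a left-to-right ordering of neighbours; expand_window here is a local illustrative function, not the library's implementation.

```python
from typing import List

Vector = List[float]


def expand_window(tokens: List[Vector], window_size: int = 1) -> List[Vector]:
    """Concatenate each token vector with its neighbours on both sides."""
    width = len(tokens[0]) if tokens else 0
    pad = [0.0] * width  # assumed zero vector for positions outside the sequence
    out = []
    for i in range(len(tokens)):
        row: Vector = []
        for offset in range(-window_size, window_size + 1):
            j = i + offset
            row.extend(tokens[j] if 0 <= j < len(tokens) else pad)
        out.append(row)
    return out


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
windows = expand_window(tokens, window_size=1)
print(windows)
```

With window_size=1, each row grows from width to 3 * width before the convolution maps it back to width, which is what lets the residual connection add input and output together.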
spacy.MishWindowEncoder.v1
The spacy.MishWindowEncoder.v1 architecture produced a model of type
Model[Floats2d, Floats2d]. Since spacy.MishWindowEncoder.v2, this has been
changed to output type Model[List[Floats2d], List[Floats2d]].
Encode context using convolutions with Mish activation, layer normalization
and residual connections.
Name | Description |
---|---|
width | The input and output width. These are required to be the same, to allow residual connections. This value will be determined by the width of the inputs. Recommended values are between 64 and 300. int |
window_size | The number of words to concatenate around each token to construct the convolution. Recommended value is 1. int |
depth | The number of convolutional layers. Recommended value is 4. int |
CREATES | The model using the architecture. Model[Floats2d,Floats2d] |
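For readers unfamiliar with the Mish activation, it is commonly defined as x * tanh(softplus(x)). The scalar sketch below uses only the standard library and is meant as an illustration of the function itself, not of how the layer is implemented.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


values = [mish(x) for x in (-2.0, 0.0, 2.0)]
print(values)
```

Unlike a hard threshold, Mish is smooth and lets small negative values pass through slightly attenuated.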
spacy.HashEmbedCNN.v1
Identical to spacy.HashEmbedCNN.v2, except that it uses spacy.StaticVectors.v1
if vectors are included.
spacy.MultiHashEmbed.v1
Identical to spacy.MultiHashEmbed.v2, except that it uses
spacy.StaticVectors.v1 if vectors are included.
spacy.CharacterEmbed.v1
Identical to spacy.CharacterEmbed.v2, except that it uses
spacy.StaticVectors.v1 if vectors are included.
spacy.TextCatEnsemble.v1
The spacy.TextCatEnsemble.v1 architecture built an internal tok2vec and
linear_model. Since spacy.TextCatEnsemble.v2, this has been refactored so
that the TextCatEnsemble takes these two sublayers as input.
Stacked ensemble of a bag-of-words model and a neural network model. The neural network has an internal CNN Tok2Vec layer and uses attention.
Name | Description |
---|---|
exclusive_classes | Whether or not categories are mutually exclusive. bool |
pretrained_vectors | Whether or not pretrained vectors will be used in addition to the feature vectors. bool |
width | Output dimension of the feature encoding step. int |
embed_size | Input dimension of the feature encoding step. int |
conv_depth | Depth of the tok2vec layer. int |
window_size | The number of contextual vectors to concatenate from the left and from the right. int |
ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. int |
dropout | The dropout rate. float |
nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. Optional[int] |
CREATES | The model using the architecture. Model[List[Doc],Floats2d] |
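The effect of ngram_size can be sketched in a few lines of plain Python. The helper ngrams_up_to is hypothetical and works on string tokens for readability; the actual model extracts its features differently, so treat this only as a picture of which n-gram lengths are included.

```python
from typing import List, Tuple


def ngrams_up_to(tokens: List[str], ngram_size: int) -> List[Tuple[str, ...]]:
    """Return all n-grams of length 1..ngram_size, shortest first."""
    features = []
    for n in range(1, ngram_size + 1):
        for i in range(len(tokens) - n + 1):
            features.append(tuple(tokens[i:i + n]))
    return features


features = ngrams_up_to(["a", "b", "c"], ngram_size=3)
print(features)
```

Raising ngram_size adds longer n-grams on top of the shorter ones; it never replaces them.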
spacy.TextCatCNN.v1
Since spacy.TextCatCNN.v2, this architecture has become resizable, which means
that you can add labels to a previously trained textcat. TextCatCNN v1 did not
yet support that. TextCatCNN has been replaced by the more general
TextCatReduce layer. TextCatCNN is identical to TextCatReduce with
use_reduce_mean=true, use_reduce_first=false, use_reduce_last=false and
use_reduce_max=false.
A neural network model where token vectors are calculated using a CNN. The vectors are mean pooled and used as features in a feed-forward network. This architecture is usually less accurate than the ensemble, but runs faster.
Name | Description |
---|---|
exclusive_classes | Whether or not categories are mutually exclusive. bool |
tok2vec | The tok2vec layer of the model. Model |
nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. Optional[int] |
CREATES | The model using the architecture. Model[List[Doc],Floats2d] |
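Mean pooling, the reduction this architecture applies to the token vectors, can be sketched as a column-wise average. The function name reduce_mean below is a local illustrative helper operating on plain lists, not the library layer.

```python
from typing import List


def reduce_mean(token_vectors: List[List[float]]) -> List[float]:
    """Average token vectors column-wise into one document vector."""
    if not token_vectors:
        return []
    n = len(token_vectors)
    return [sum(col) / n for col in zip(*token_vectors)]


doc_vector = reduce_mean([[1.0, 2.0], [3.0, 6.0]])
print(doc_vector)
```

The result has the same width as a single token vector regardless of document length, which is what allows it to feed a fixed-size feed-forward network.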
spacy.TextCatCNN.v2
A neural network model where token vectors are calculated using a CNN. The vectors are mean pooled and used as features in a feed-forward network. This architecture is usually less accurate than the ensemble, but runs faster.
TextCatCNN has been replaced by the more general TextCatReduce layer.
TextCatCNN is identical to TextCatReduce with use_reduce_mean=true,
use_reduce_first=false, use_reduce_last=false and use_reduce_max=false.
Name | Description |
---|---|
exclusive_classes | Whether or not categories are mutually exclusive. bool |
tok2vec | The tok2vec layer of the model. Model |
nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. Optional[int] |
CREATES | The model using the architecture. Model[List[Doc],Floats2d] |
TextCatCNN.v1 had the exact same signature, but was not yet resizable. Since v2, new labels can be added to this component, even after training.
spacy.TextCatBOW.v1
Since spacy.TextCatBOW.v2, this architecture has become resizable, which means
that you can add labels to a previously trained textcat. TextCatBOW v1 did not
yet support that. Versions of this model before spacy.TextCatBOW.v3 used an
erroneous sparse linear layer that only used a small number of the allocated
parameters.
An n-gram “bag-of-words” model. This architecture should run much faster than the others, but may not be as accurate, especially if texts are short.
Name | Description |
---|---|
exclusive_classes | Whether or not categories are mutually exclusive. bool |
ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. int |
no_output_layer | Whether or not to add an output layer to the model (Softmax activation if exclusive_classes is True, else Logistic). bool |
nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. Optional[int] |
CREATES | The model using the architecture. Model[List[Doc],Floats2d] |
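The difference between the two output activations tied to exclusive_classes can be seen in a small numeric sketch. The scores below are made-up values and the functions are plain-Python stand-ins, not the library layers.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def logistic(scores: List[float]) -> List[float]:
    # Each score is squashed independently of the others.
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


scores = [2.0, 0.0, -1.0]  # hypothetical raw scores for three labels
exclusive = softmax(scores)
multilabel = logistic(scores)
print(exclusive, multilabel)
```

Softmax outputs sum to 1, which suits mutually exclusive categories; logistic outputs are independent per label, so several labels can score highly at once.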
spacy.TextCatBOW.v2
Versions of this model before spacy.TextCatBOW.v3 used an erroneous sparse
linear layer that only used a small number of the allocated parameters.
An n-gram “bag-of-words” model. This architecture should run much faster than the others, but may not be as accurate, especially if texts are short.
Name | Description |
---|---|
exclusive_classes | Whether or not categories are mutually exclusive. bool |
ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. int |
no_output_layer | Whether or not to add an output layer to the model (Softmax activation if exclusive_classes is True, else Logistic). bool |
nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. Optional[int] |
CREATES | The model using the architecture. Model[List[Doc],Floats2d] |
spacy.TransitionBasedParser.v1
Identical to spacy.TransitionBasedParser.v2, except that use_upper was set to
True by default.
Layers
These functions are available from @spacy.registry.layers.
spacy.StaticVectors.v1
Identical to spacy.StaticVectors.v2 except for the handling of tokens without
vectors.
Loggers
These functions are available from @spacy.registry.loggers.
spacy.ConsoleLogger.v1
Writes the results of a training step to the console in a tabular format.
Note that the cumulative loss keeps increasing within one epoch, but should start decreasing across epochs.
Name | Description |
---|---|
progress_bar | Whether the logger should print the progress bar. bool |
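The note about the cumulative loss can be illustrated with a running total that restarts each epoch. The per-step losses below are invented numbers and cumulative_losses is a hypothetical helper; the sketch only shows why the reported figure rises within an epoch while the end-of-epoch value can still fall.

```python
from typing import List


def cumulative_losses(step_losses: List[float]) -> List[float]:
    """Running total of per-step losses within a single epoch."""
    total, out = 0.0, []
    for loss in step_losses:
        total += loss
        out.append(total)
    return out


# Hypothetical per-step losses for two epochs; the numbers are made up.
epoch_1 = cumulative_losses([4.0, 3.0, 2.0])
epoch_2 = cumulative_losses([2.0, 1.5, 1.0])
print(epoch_1, epoch_2)
```

Each list is increasing, yet the final value of the second epoch is lower than that of the first, which is the trend to look for across epochs.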
Logging utilities for spaCy are implemented in the spacy-loggers repo, and the
functions are typically available from @spacy.registry.loggers.
More documentation can be found in that repo’s readme file.