A Julia package wrapping catch22, which is a set of 22 time-series features shown by Lubba et al. (2019) to be performant in a range of time-series classification problems.
The catch22 repository provides these 22 features, originally coded in Matlab as part of the hctsa toolbox, as C functions (in addition to Matlab and Python wrappers). This package simply uses Julia's ccall to wrap these C functions from a shared library that is accessed through catch22_jll and compiled by the fantastic BinaryBuilder package.
Below we provide a brief getting-started guide to using Catch22.jl. For more detailed information on the catch22 feature set, such as in-depth descriptions of each feature and a list of publications that use catch22, see the catch22 GitBook documentation.
using Pkg
Pkg.add("Catch22")
using Catch22
The input time series can be provided as a Vector{Float64} or Array{Float64, 2}. If an array is provided, the time series must occupy its columns. For example, this package contains a few test time series from catch22:
𝐱 = Catch22.testdata[:testSinusoid] # a Vector{Float64}
X = randn(1000, 10) # an Array{Float64, 2} with 10 time series
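Data is not always stored one series per column. The following is a minimal sketch, using only Base Julia, of how such data might be rearranged into the layout described above. The variable names (Xrows, series, xint) and the random data are hypothetical and only serve as illustration.

```julia
# Hypothetical data: 10 time series of length 1000 stored as rows
Xrows = randn(10, 1000)
Xfromrows = permutedims(Xrows)      # 1000×10 Matrix{Float64}, one series per column

# Hypothetical data: a vector of equal-length time series
series = [randn(1000) for _ in 1:10]
Xfromvecs = reduce(hcat, series)    # 1000×10 Matrix{Float64}, one series per column

# Hypothetical integer-valued series, converted to a Vector{Float64}
xint = collect(1:1000)
xfloat = Float64.(xint)
```

Either matrix can then be passed on in the same way as X above.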
A list of features (as symbols) can be obtained with getnames(catch22) and their short descriptions with getdescriptions(catch22). Each feature can be evaluated for a time series array or vector with the catch22 FeatureSet. For example, the feature DN_HistogramMode_5 can be evaluated using:
f = catch22[:DN_HistogramMode_5](𝐱) # Returns a scalar Float64
𝐟 = catch22[1](X) # Returns a 1×10 Matrix{Float64}
All features are returned as Float64 values, even though some take only integer values.
Alternatively, functions that calculate each feature individually are exported. DN_HistogramMode_5 can be evaluated with:
f = DN_HistogramMode_5(𝐱)
All catch22 features can be evaluated with:
𝐟 = catch22(𝐱)
F = catch22(X)
If an array is provided, containing one time series in each of N columns, then a 22×N FeatureArray of feature values will be returned (a subtype of AbstractDimArray).
A FeatureArray has most of the properties and methods of an Array but is annotated with feature names that can be accessed with getnames(F).
If a vector is provided (a single time series) then a vector of feature values will be returned as a FeatureVector, a one-dimensional FeatureArray.
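As a brief sketch of working with the returned FeatureArray, the snippet below uses only the calls described above plus ordinary integer indexing, on the assumption that this works as it does for a regular Array. The variable names featurenames and firstfeature are illustrative.

```julia
using Catch22

X = randn(1000, 10)          # 10 time series of length 1000, one per column
F = catch22(X)               # FeatureArray with one row per feature

dims = size(F)               # one row per feature, one column per time series
featurenames = getnames(F)   # feature names annotating the rows
firstfeature = F[1, :]       # values of the first feature across all 10 series
```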
Finally, note that since catch22 is a FeatureSet it can be indexed with a vector of feature names as symbols to calculate a FeatureArray for a subset of catch22. For details on the Feature, FeatureSet and FeatureArray types check out the package docs.
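The subset indexing described above might look like the following sketch. The two feature names are taken from the catch22 feature set, and the variable names selected and subset are hypothetical.

```julia
using Catch22

𝐱 = Catch22.testdata[:testSinusoid]                     # a test time series from catch22

selected = [:DN_HistogramMode_5, :DN_HistogramMode_10]  # feature names as symbols
subset = catch22[selected]                              # a smaller FeatureSet
𝐟 = subset(𝐱)                                           # feature values for the two chosen features

subsetnames = getnames(𝐟)                               # names annotating the result
```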
Calculating features for a single time series of a given length: