Catch22.jl

A Julia package wrapping catch22, a set of 22 time-series features that Lubba et al. (2019) showed to perform well across a range of time-series classification problems.

The catch22 repository provides these 22 features, originally coded in Matlab as part of the hctsa toolbox, as C functions (in addition to Matlab and Python wrappers). This package simply uses Julia's ccall to wrap these C functions from a shared library that is accessed through catch22_jll and compiled by the fantastic BinaryBuilder package.

Below we provide a brief getting-started guide to using Catch22.jl. For more detailed information on the catch22 feature set, such as in-depth descriptions of each feature and a list of publications that use catch22, see the catch22 GitBook documentation.


Usage

Installation

using Pkg
Pkg.add("Catch22")
using Catch22

Input time series

The input time series can be provided as a Vector{Float64} or Array{Float64, 2}. If an array is provided, each time series must occupy one column. This package also contains a few test time series from catch22, so either of the following is a valid input:

𝐱 = Catch22.testdata[:testSinusoid] # a Vector{Float64}
X = randn(1000, 10) # an Array{Float64, 2} with 10 time series
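
If your data are stored with one time series per row, or with an element type other than Float64, they can be rearranged before being passed to Catch22. The following is a minimal sketch using only Base Julia; the variable names (raw, X_cols) are illustrative and are not part of the package.

```julia
# Hypothetical raw data: 10 time series stored in rows, as Float32
raw = rand(Float32, 10, 1000)

# Move each time series into its own column and convert to Float64
X_cols = Float64.(permutedims(raw))

size(X_cols)   # (1000, 10): 10 time series of length 1000
eltype(X_cols) # Float64
```

permutedims is used here rather than a lazy transpose so that the result is an ordinary Matrix{Float64}, matching the input types listed above.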

Evaluating a feature

A list of features (as symbols) can be obtained with getnames(catch22) and their short descriptions with getdescriptions(catch22). Each feature can be evaluated for a time series array or vector with the catch22 FeatureSet. For example, the feature DN_HistogramMode_5 can be evaluated using:

f = catch22[:DN_HistogramMode_5](𝐱) # Returns a scalar Float64
𝐟 = catch22[1](X) # Returns a 1×10 Matrix{Float64}

All feature values are returned as Float64, even though some features only ever take integer values.
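
To browse the available features before choosing one, the names and descriptions can be printed side by side. This is a small sketch that assumes getnames(catch22) and getdescriptions(catch22) return collections of equal length in matching order; the variable names are illustrative.

```julia
using Catch22

featurenames = getnames(catch22)         # feature names as symbols
descriptions = getdescriptions(catch22)  # short descriptions

# Print one "name: description" line per feature
for (name, description) in zip(featurenames, descriptions)
    println(name, ": ", description)
end
```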

Alternatively, functions that calculate each feature individually are exported. DN_HistogramMode_5 can be evaluated with:

f = DN_HistogramMode_5(𝐱)

Evaluating a feature set

All catch22 features can be evaluated with:

𝐟 = catch22(𝐱)
F = catch22(X)

If an array is provided, containing one time series in each of N columns, then a 22×N FeatureArray of feature values will be returned. FeatureArray is a subtype of AbstractDimArray; it has most of the properties and methods of an Array but is annotated with feature names, which can be accessed with getnames(F). If a vector is provided (a single time series), then the feature values will be returned as a FeatureVector, which is a one-dimensional FeatureArray.

Finally, note that since catch22 is a FeatureSet, it can be indexed with a vector of feature names (as symbols) to calculate a FeatureArray for a subset of catch22. For details on the Feature, FeatureSet and FeatureArray types, check out the package docs.
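
As a sketch of subset indexing, the example below takes the first three names from getnames(catch22) instead of hard-coding them. It assumes that getnames(catch22) can be sliced like an ordinary vector and that the result follows the same features × time series layout described above.

```julia
using Catch22

X = randn(1000, 10)              # 10 time series, one per column

# Pick a subset of feature names (here, simply the first three)
subset = getnames(catch22)[1:3]

F = catch22[subset](X)           # FeatureArray containing only the chosen features
size(F)                          # expected to be (3, 10)
getnames(F)                      # expected to match `subset`
```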


Single-threaded performance

Calculating features for a single time series of a given length:

[Figure: scaling]

Multithreaded performance

Calculating features for 100 time series of a given length:

[Figure: multithread_scaling]