Benutzer:Debilski/Artikelentwurf
- Old text:
Independent Component Analysis (ICA) solves the blind source separation (BSS) problem by assuming that the sources of a data mixture are statistically independent. In other words, ICA solves the equation $X = A \cdot S$, where $X$, $A$ and $S$ are matrices, when only $X$ is known and $A$ and $S$ are to be computed. The only assumption needed for this is that the columns of $S$ are statistically independent and that at most one of the columns of $S$ is Gaussian-distributed.
Definition
A frequent problem in data analysis is that the sources have been superimposed and one has to infer the sources from the superimposed data. A real-world example is the cocktail party effect, which denotes the ability of human hearing to pick out specific signals, such as speech, from a general background of noise.
As a rule, ICA assumes that the number of sources is identical to the number of mixed measurements, but there are also approaches that deal with the underdetermined case (fewer mixtures than sources) and the overdetermined case (more mixtures than sources). In addition, it is usually assumed, as a simplification, that the data contain neither an echo nor a time delay.
Mathematical description
Prerequisites
The most important foundation for applying independent component analysis is the central limit theorem. It states that a sum of statistically independent random variables approaches a Gaussian distribution as the number of summands grows. Under this assumption, the original sources can be recovered by measuring the non-Gaussianity of candidate components.
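To make this concrete, one can compare the excess kurtosis (zero for a Gaussian) of two independent non-Gaussian sources with that of their mixture: the mixture comes out closer to Gaussian, as the central limit theorem suggests. A minimal sketch, assuming NumPy and SciPy are available; the sources and mixing weights are invented for illustration:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)

# Two independent, strongly non-Gaussian sources:
# a Laplacian source (super-Gaussian) and a uniform source (sub-Gaussian).
s1 = rng.laplace(size=100_000)
s2 = rng.uniform(-1, 1, size=100_000)

# A generic linear mixture of the two sources.
x = 0.6 * s1 + 0.8 * s2

# Excess kurtosis is 0 for a Gaussian; the mixture lies closer to 0
# than the more non-Gaussian source, i.e. the mixture is "more Gaussian".
print("kurtosis s1:", kurtosis(s1))   # ~  3.0 (Laplacian)
print("kurtosis s2:", kurtosis(s2))   # ~ -1.2 (uniform)
print("kurtosis x :", kurtosis(x))    # in between, closer to 0 than s1
```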
Mathematical model
Algorithms
Limitations
See also
-- English article:
When the independence assumption is correct, blind ICA separation of a mixed signal gives very good results. ICA is also applied, for analysis purposes, to signals that are not supposed to have been generated by a mixing. A simple application of ICA is the "cocktail party problem", where the underlying speech signals are separated from sample data consisting of people talking simultaneously in a room. Usually the problem is simplified by assuming no time delays and no echoes. An important point is that if N sources are present, at least N observations (i.e. microphones) are needed to recover the original signals. This constitutes the square case (J = D, where D is the input dimension of the data and J is the dimension of the model). The underdetermined (J > D) and overdetermined (J < D) cases have also been investigated.
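As a rough numerical illustration of the square case (J = D = 2), the sketch below mixes two synthetic stand-in signals with a 2×2 matrix and separates them with scikit-learn's FastICA. The signals and the mixing matrix are made up for the demonstration, and scikit-learn is assumed to be available:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two synthetic source signals standing in for two speakers.
s1 = np.sign(np.sin(3 * t))                              # square wave
s2 = np.sin(5 * t) + 0.1 * rng.standard_normal(t.size)   # noisy sine

S = np.c_[s1, s2]                    # shape (n_samples, n_sources)
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])           # 2x2 mixing matrix: square case, J = D

X = S @ A.T                          # two "microphone" recordings

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)         # estimated sources
A_hat = ica.mixing_                  # estimated mixing matrix

# Recovery is only up to permutation, sign and scale of the sources.
print(S_hat.shape, A_hat.shape)
```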
The statistical method finds the independent components (also called factors, latent variables or sources) by maximizing the statistical independence of the estimated components. Non-Gaussianity, motivated by the central limit theorem, is one way of measuring the independence of the components; it can be quantified, for instance, by kurtosis or by approximations of negentropy. Mutual information is another popular criterion for measuring statistical independence of signals.
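Negentropy is hard to estimate directly, so it is usually replaced by contrast-function approximations of the form $J(y) \approx (E[G(y)] - E[G(\nu)])^2$, with $\nu$ a standard Gaussian variable and, for example, $G(u) = \log \cosh(u)$. A minimal sketch of this approximation, assuming only NumPy; the helper name is ours, not from any library:

```python
import numpy as np

def negentropy_approx(y, n_ref=100_000, seed=0):
    """Approximate negentropy J(y) ~ (E[G(y)] - E[G(nu)])**2 for a
    standardized signal y, with G(u) = log cosh(u) and nu a standard
    Gaussian reference sample."""
    y = (y - y.mean()) / y.std()          # negentropy assumes unit variance
    nu = np.random.default_rng(seed).standard_normal(n_ref)
    G = lambda u: np.log(np.cosh(u))
    return (G(y).mean() - G(nu).mean()) ** 2

rng = np.random.default_rng(1)
print(negentropy_approx(rng.standard_normal(50_000)))  # ~ 0 for a Gaussian
print(negentropy_approx(rng.laplace(size=50_000)))     # clearly > 0: non-Gaussian
```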
Typical algorithms for ICA use centering, whitening and dimensionality reduction as preprocessing steps to reduce the complexity of the problem for the actual iterative algorithm. Whitening and dimension reduction can be achieved with principal component analysis or singular value decomposition. Whitening ensures that all dimensions are treated equally a priori before the algorithm is run. Well-known algorithms for ICA include infomax, FastICA and JADE, but there are many others.
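For concreteness, centering and whitening can be written in a few lines of linear algebra. The sketch below uses SVD-based whitening, one of several equivalent choices, and assumes only NumPy; the function name is invented for illustration:

```python
import numpy as np

def center_and_whiten(X):
    """Center each row (signal) of X and whiten so that the covariance
    of the returned data is the identity matrix.
    X has shape (n_signals, n_samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)          # centering
    U, d, _ = np.linalg.svd(Xc, full_matrices=False)
    # Whitening matrix built from the SVD of the centered data.
    K = (U / d).T * np.sqrt(X.shape[1])
    Z = K @ Xc                                      # whitened data
    return Z, K

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 5000)) * np.array([[3.0], [0.5]])
Z, K = center_and_whiten(X)
print(np.cov(Z))   # approximately the 2x2 identity matrix
```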
Most ICA methods cannot determine the actual number of source signals, the order of the source signals, or the signs and scales of the sources.
ICA is important to blind signal separation and has many practical applications. It is closely related to (or even a special case of) the search for a factorial code of the data, i.e., a new vector-valued representation of each data vector such that it gets uniquely encoded by the resulting code vector (loss-free coding), while the code components are statistically independent.
Mathematical definitions
Linear independent component analysis can be divided into noiseless and noisy cases, where noiseless ICA is a special case of noisy ICA. Nonlinear ICA should be considered as a separate case.
General definition
The data is represented by the random vector $x = (x_1, \ldots, x_m)^T$ and the components as the random vector $s = (s_1, \ldots, s_n)^T$. The task is to transform the observed data $x$, using a linear static transformation $W$ as $s = Wx$, into maximally independent components $s$ measured by some function $F(s_1, \ldots, s_n)$ of independence.
Generative model
Linear noiseless ICA
The components $x_i$ of the observed random vector $x = (x_1, \ldots, x_m)^T$ are generated as a sum of the independent components $s_k$, $k = 1, \ldots, n$:

$x_i = a_{i,1} s_1 + \cdots + a_{i,k} s_k + \cdots + a_{i,n} s_n$,

weighted by the mixing weights $a_{i,k}$.
The same generative model can be written in vectorial form as $x = \sum_{k=1}^{n} s_k a_k$, where the observed random vector $x$ is represented by the basis vectors $a_k = (a_{1,k}, \ldots, a_{m,k})^T$. The basis vectors $a_k$ form the columns of the mixing matrix $A = (a_1, \ldots, a_n)$ and the generative formula can be written as $x = As$, where $s = (s_1, \ldots, s_n)^T$.
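A quick numeric check of the two equivalent forms, with invented numbers: summing the basis vectors $a_k$ weighted by $s_k$ gives the same $x$ as the matrix product $As$.

```python
import numpy as np

A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4]])      # mixing matrix; columns are the a_k
s = np.array([2.0, -1.0, 0.5])       # independent components s_k

# Component-wise form: x_i = sum_k a_{i,k} s_k
x_sum = sum(s[k] * A[:, k] for k in range(A.shape[1]))

# Matrix form: x = A s
x_mat = A @ s

print(x_sum, x_mat)                  # identical vectors
assert np.allclose(x_sum, x_mat)
```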
Given the model and realizations (samples) $x_1, \ldots, x_N$ of the random vector $x$, the task is to estimate both the mixing matrix $A$ and the sources $s$. This is done by adaptively calculating the $w$ vectors and setting up a cost function which either maximizes the non-Gaussianity of the calculated $s_k = w^T x$ or minimizes the mutual information. In some cases, a priori knowledge of the probability distributions of the sources can be used in the cost function.
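As a sketch of how such a cost function can be optimized, the following implements a one-unit fixed-point update of the FastICA type, $w \leftarrow E[z\,g(w^T z)] - E[g'(w^T z)]\,w$ with $g = \tanh$, on already centered and whitened data. It is a minimal illustration under those assumptions, not a complete ICA implementation (a single component, no deflation, no convergence test):

```python
import numpy as np

def one_unit_ica(Z, n_iter=200, seed=0):
    """Estimate a single unmixing vector w on whitened data Z
    (shape (n_signals, n_samples)) by maximizing the non-Gaussianity
    of w^T Z with the FastICA fixed-point update and g = tanh."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ Z                              # current component estimate
        g, g_prime = np.tanh(y), 1.0 - np.tanh(y) ** 2
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)         # keep w on the unit sphere
        w = w_new
    return w                                   # one row of the unmixing matrix

# Usage: s_hat = one_unit_ica(Z) @ Z, where Z is centered and whitened data.
```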
The original sources $s$ can be recovered by multiplying the observed signals $x$ with the inverse of the mixing matrix $W = A^{-1}$, also known as the unmixing matrix. Here it is assumed that the mixing matrix is square ($n = m$).
Linear noisy ICA
With the added assumption of zero-mean and uncorrelated Gaussian noise $n \sim N(0, \operatorname{diag}(\Sigma))$, the ICA model takes the form $x = As + n$.
Nonlinear ICA
The mixing of the sources does not need to be linear. Using a nonlinear mixing function $f(\cdot\,|\,\theta)$ with parameters $\theta$, the nonlinear ICA model is $x = f(s\,|\,\theta) + n$.
Identifiability
The identifiability of independent component analysis requires that:
- At most one of the sources $s_k$ may be Gaussian,
- The number of observed mixtures, $m$, must be at least as large as the number of estimated components $n$: $m \ge n$,
- The mixing matrix $A$ must be of full rank.
See also
- Blind signal separation (BSS)
- Principal component analysis (PCA)
- SVD
- Non-negative matrix factorization (NMF)
- Varimax rotation
- Projection pursuit
- Blind deconvolution
- Factor analysis
- Factorial codes
- Nonlinear PCA
- Redundancy reduction
External links
- Independent Component Analysis: a new concept by Pierre Comon. The original 1994 paper describing the concept of ICA.
- What is independent component analysis? by Aapo Hyvärinen
- Nonlinear ICA, Unsupervised Learning, Redundancy Reduction by Jürgen Schmidhuber, with links to papers
- A Brief Introduction to Independent Component Analysis by JV Stone, 2005 (7 pages).
- Introductory chapter of the book A. Hyvärinen, J. Karhunen, E. Oja (2001). Independent Component Analysis
- FastICA as a package for Matlab, in R language, C++, and Python
- ICALAB Toolboxes for Matlab, developed at RIKEN
- High Performance Signal Analysis Toolkit provides C implementations of FastICA and Infomax
- Free software for ICA by JV Stone.
- ICA toolbox Matlab tools for ICA with Bell-Sejnowski, Molgedey-Schuster and mean field ICA. Developed at DTU.
- Demonstration of the cocktail party problem
- EEGLAB Toolbox ICA of EEG for Matlab, developed at UCSD.
- FMRLAB Toolbox ICA of fMRI for Matlab, developed at UCSD
- Discussion of ICA used in a biomedical shape-representation context