How to efficiently compute the Jacobian matrix of a model with respect to its parameters #8538
VirgileUniv asked this question in Q&A · Unanswered
Hello everyone.
I'm trying to implement the Neural Galerkin scheme in Taichi (https://arxiv.org/pdf/2203.01360).
Hence my goal is to compute (efficiently) the Jacobian matrix of a model with respect to its parameters.
Let's take the simple example of a single (unbiased) linear unit with $1$ input and $n$ outputs. If we denote the model as a function $U$ of its parameters $w_1,\dots,w_n$ and input $x$, we have $U(w_1,\dots,w_n, x) = (w_1 x, \dots, w_n x)$. Hence the Jacobian with respect to the parameters will be:

$$\frac{\partial U_i}{\partial w_j} = \delta_{ij}\, x, \qquad \text{i.e.} \quad J = x\, I_n .$$
My solution for the moment is to set the $i$-th component of `output.grad` to 1 and the rest to 0, then retrieve the $i$-th row of the Jacobian matrix in `weight.grad` after running `linear.grad()` (where `linear` is the kernel computing the linear unit). I'll put the code down below.

However, this seems pretty inefficient to me, and I wonder if there isn't a way to do it much faster. Even though I have no idea how `kernel.grad` works internally, I'm pretty confident that it computes the Jacobian matrix at some point, before doing a vector-matrix multiplication (with `output.grad`, most certainly). If I'm not mistaken on that point, I would just need to retrieve that matrix before the vector multiplication, but I have no idea how to do that (nor am I even certain that I'm not mistaken on that point).

If anyone can help me with that, or just has some info on the internals of `kernel.grad`, it would be much appreciated.

Here is the code on the example of an (unbiased) linear unit:
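(A minimal sketch of the method described above; it assumes Taichi's autodiff API with `needs_grad` fields and `kernel.grad()`, and the names `n`, `x`, `weight`, `output`, and `jacobian` are just illustrative.)

```python
import taichi as ti

ti.init(arch=ti.cpu)

n = 4  # number of outputs / weights (illustrative size)

# Fields with needs_grad=True so Taichi's autodiff can track them.
x = ti.field(dtype=ti.f32, shape=(), needs_grad=True)       # scalar input
weight = ti.field(dtype=ti.f32, shape=n, needs_grad=True)   # parameters w_1..w_n
output = ti.field(dtype=ti.f32, shape=n, needs_grad=True)   # U(w, x)

jacobian = ti.field(dtype=ti.f32, shape=(n, n))             # dU_i / dw_j

@ti.kernel
def linear():
    # Unbiased linear unit: output_i = w_i * x
    for i in range(n):
        output[i] = weight[i] * x[None]

# Some arbitrary test values.
x[None] = 2.0
for i in range(n):
    weight[i] = float(i + 1)

linear()

# Build the Jacobian row by row: seed output.grad with the i-th
# basis vector, run the backward kernel, and read the row from
# weight.grad.
for i in range(n):
    output.grad.fill(0.0)
    weight.grad.fill(0.0)
    x.grad[None] = 0.0
    output.grad[i] = 1.0
    linear.grad()
    for j in range(n):
        jacobian[i, j] = weight.grad[j]

print(jacobian.to_numpy())  # should be x * I, i.e. 2.0 on the diagonal
```

This launches the backward kernel once per output, i.e. $n$ times for the full Jacobian, which is exactly the inefficiency I'd like to avoid.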