Einsum

Please read the referenced post [1] for a better understanding of einsum.

Problem with conventional tensor contraction codes

If you've ever written PyTorch code to build a neural network, then you have already written tensor contraction code. Tensor contraction is a fancy term for combining tensors to build a new tensor.

Dot product, cross product, matrix multiplication, element-wise multiplication, and so on: all of these familiar operations are actually special cases of tensor contraction.

In this post, I will give you a powerful tool that expresses a tensor contraction in a single line (without any messy unsqueeze, transpose, or axis swap).
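As a concrete illustration (the shapes and the attention-style contraction are a hypothetical example of mine, not from the original post), here is the kind of transpose-based code this section complains about, next to its one-line einsum equivalent:

```python
import torch

# Batched "query x key" contraction: (B, H, L, D) x (B, H, L, D) -> (B, H, L, L)
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)

# Conventional style: transpose the last two axes of k, then matmul.
scores = torch.matmul(q, k.transpose(-2, -1))          # (2, 4, 8, 8)

# The same contraction as a single einsum, spelled out with index names.
scores_einsum = torch.einsum("bhld,bhmd->bhlm", q, k)  # (2, 4, 8, 8)

assert torch.allclose(scores, scores_einsum, atol=1e-6)
```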

Einsum

Let's say we have two tensors $P \in \mathbb{R}^{p_1 \times p_2 \times \dots \times p_n}$ and $Q \in \mathbb{R}^{q_1 \times q_2 \times \dots \times q_m}$.

The notation $P(i_1, i_2, \dots, i_n)$ means the element of $P$ at index $(i_1, i_2, \dots, i_n)$.

Then we can express an einsum as follows:

$$\sum_a \sum_b P(i_1, i_2, \dots, i_{n-1}, a)\ Q(b, j_2, j_3, \dots, j_m) = i_1 i_2 \dots i_{n-1} a,\ b j_2 j_3 \dots j_m \rightarrow i_1 i_2 \dots i_{n-1} j_2 j_3 \dots j_m$$
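A minimal sketch of that formula in PyTorch, with small made-up shapes (the index letters in the einsum string are arbitrary): because the labels for the last axis of P and the first axis of Q do not appear in the output, einsum sums over each of them.

```python
import torch

# P has shape (i1, i2, a) = (3, 4, 5); Q has shape (b, j2, j3) = (6, 7, 2).
P = torch.randn(3, 4, 5)
Q = torch.randn(6, 7, 2)

# "a" and "b" are absent from the output indices, so both are summed over.
out = torch.einsum("xya,bjk->xyjk", P, Q)  # shape (3, 4, 7, 2)

# The same thing written as explicit sums over the dropped indices.
ref = P.sum(dim=-1)[:, :, None, None] * Q.sum(dim=0)[None, None, :, :]
assert torch.allclose(out, ref, atol=1e-5)
```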

Why do we use einsum?

Einsum may also be easier to optimize, because it compresses a sequence of operations into one compact expression. This gives the compiler (or the einsum backend) more opportunity to optimize.
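For example (a hypothetical chain of my own, not from the post), a single einsum over three operands describes the whole contraction at once, so the backend is free to pick a cheaper pairwise contraction order than a strict left-to-right chain of matmuls:

```python
import torch

A = torch.randn(100, 5)
B = torch.randn(5, 100)
C = torch.randn(100, 5)

# One expression for the whole chain; the backend may reorder the pairwise
# contractions (e.g. contract B with C first) to keep intermediates small.
out = torch.einsum("ij,jk,kl->il", A, B, C)  # shape (100, 5)

# Equivalent chained matmuls, evaluated strictly left to right.
ref = A @ B @ C
assert torch.allclose(out, ref, atol=1e-3)
```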

Also, it is beautiful!

Hands-on Einsum

You can try out some examples below!
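Here is a small collection of common patterns to experiment with (a sketch of mine; the shapes are arbitrary example values):

```python
import torch

A = torch.randn(3, 4)
B = torch.randn(4, 5)
v = torch.randn(4)
w = torch.randn(6)
M = torch.randn(5, 5)
batch = torch.randn(10, 3, 4)

torch.einsum("ij->ji", A)              # transpose                -> (4, 3)
torch.einsum("ij,jk->ik", A, B)        # matrix multiplication    -> (3, 5)
torch.einsum("ij,j->i", A, v)          # matrix-vector product    -> (3,)
torch.einsum("i,j->ij", v, w)          # outer product            -> (4, 6)
torch.einsum("ii->", M)                # trace                    -> scalar
torch.einsum("ij,ij->", A, A)          # sum of element-wise mul  -> scalar
torch.einsum("bij,jk->bik", batch, B)  # batched matmul           -> (10, 3, 5)
```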

References

[1] Einsum is All You Need, https://rockt.ai/2018/04/30/einsum