Contents

A first look at Mamba

From the perspective of its structure, Mamba is a discretized, selective state space model that runs in time and memory linear in the sequence length.

Let A be the state matrix applied to the previous system state h(t-1). We can then compute the next state h(t) and the output y with the following equations:

$$ h(t) = A\,h(t-1) + B\,x(t) $$

$$ y(t) = C\,h(t) $$

Here B weights the input x(t) and C maps the state h(t) to the output y.
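The recurrence above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming a scalar input per step, a zero initial state, and fixed A, B, C; the function name `ssm_scan` is hypothetical. In Mamba itself, B, C and the discretization step depend on the input (that is the "selective" part), which this sketch does not show.

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Run h(t) = A h(t-1) + B x(t), y(t) = C h(t) over a 1-D input sequence.

    A: (N, N) state matrix, B: (N,) input weights, C: (N,) output weights.
    The initial state h is assumed to be zero.
    """
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B * x      # state update
        ys.append(C @ h)       # readout
    return np.array(ys)

# Toy example: a 2-dimensional state that halves at every step.
A = 0.5 * np.eye(2)
B = np.array([1.0, 1.0])
C = np.array([1.0, 0.0])
ys = ssm_scan(A, B, C, [1.0, 0.0, 0.0])
print(ys)  # values 1.0, 0.5, 0.25
```

The loop does one matrix-vector product per step, so the cost grows linearly with the sequence length.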

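We define A as a HiPPO matrix, with n the row index and k the column index:

$$ A_{nk} = \begin{cases} \sqrt{(2n+1)(2k+1)} & \text{below the diagonal } (n > k) \\ n+1 & \text{on the diagonal } (n = k) \\ 0 & \text{above the diagonal } (n < k) \end{cases} $$

The S4 paper writes this matrix with an overall minus sign.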
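As a small sketch, the matrix can be built directly from the three cases. The function name `hippo_matrix` and the zero-based indices n, k = 0..N-1 are assumptions for illustration. The code follows the sign as written in this note; the S4 formulation uses the negative of this matrix.

```python
import numpy as np

def hippo_matrix(N):
    """Build the N x N HiPPO matrix with the sign convention used in this note."""
    n = np.arange(N)
    r = np.sqrt(2 * n + 1)
    # r[n] * r[k] = sqrt((2n+1)(2k+1)), kept strictly below the diagonal
    below = np.tril(np.outer(r, r), k=-1)
    # n + 1 on the diagonal, zeros above it
    return below + np.diag(n + 1.0)

M = hippo_matrix(4)
print(M)
```

The result is lower triangular, so its eigenvalues are the diagonal entries 1, 2, ..., N.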

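With this structure, A can be written as a normal matrix minus a low-rank term, which reduces the computational cost. Diagonalizing the normal part with a unitary matrix V gives: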

$$ A = V\Lambda V^* - PQ^\top = V\left(\Lambda - (V^*P)(V^*Q)^*\right)V^* $$
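The identity can be checked numerically. The sketch below rests on assumptions that are not stated in this note: it uses the S4 sign convention (the negative of the HiPPO matrix above) and the rank-1 choice P = Q with entries sqrt(n + 1/2). With that choice, A + P Pᵀ equals -1/2 times the identity plus a skew-symmetric matrix, which is normal and can be diagonalized by a unitary V. The function names are hypothetical.

```python
import numpy as np

def hippo_matrix(N):
    """HiPPO matrix with the sign convention used in this note."""
    n = np.arange(N)
    r = np.sqrt(2 * n + 1)
    return np.tril(np.outer(r, r), k=-1) + np.diag(n + 1.0)

def nplr_factors(N):
    """Return A, V, Lam, P with A = V diag(Lam) V* - P P^T (here Q = P)."""
    A = -hippo_matrix(N)               # S4 sign convention (assumption)
    P = np.sqrt(np.arange(N) + 0.5)    # low-rank vector, Q = P (assumption)
    S = A + np.outer(P, P)             # normal part: -0.5 * I + skew-symmetric
    K = S + 0.5 * np.eye(N)            # skew-symmetric, so -1j * K is Hermitian
    mu, V = np.linalg.eigh(-1j * K)    # real eigenvalues mu, unitary V
    Lam = -0.5 + 1j * mu               # eigenvalues of S
    return A, V, Lam, P

A, V, Lam, P = nplr_factors(8)
Vh = V.conj().T                        # V*

# First form: V Lam V* - P Q^T
first = V @ np.diag(Lam) @ Vh - np.outer(P, P)

# Second form: V (Lam - (V* P)(V* Q)*) V*
VP = Vh @ P
second = V @ (np.diag(Lam) - np.outer(VP, VP.conj())) @ Vh

print(np.allclose(first, A), np.allclose(second, A))
```

Both forms should reproduce A up to floating-point error. The second equality only uses the fact that V is unitary, so V V* is the identity.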

This can be done