A first look at Mamba
From a structural perspective, Mamba is a discrete selective state space model that runs in linear time and uses linear memory in the sequence length.
Let A be the state matrix applied to the previous system state h(t-1). The next state h(t) then follows from the equation below:
$$ h(t) = A h(t-1) + B x(t) $$
$$ y(t) = C h(t) $$
Here B weights the input x(t), and C maps the state h(t) to the output y(t).
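The recurrence can be sketched in a few lines of NumPy. This is only a minimal illustration: it assumes fixed A, B, and C, a scalar input per step, and a zero initial state. The function name `run_ssm` and the toy matrix values are hypothetical and not taken from Mamba.

```python
import numpy as np

def run_ssm(A, B, C, xs):
    """Run h(t) = A h(t-1) + B x(t), y(t) = C h(t) over a 1-D input sequence."""
    h = np.zeros(A.shape[0])      # assumption: the state starts at zero
    ys = []
    for x in xs:
        h = A @ h + B * x         # state update
        ys.append(C @ h)          # readout
    return np.array(ys)

# Toy 2-state system; the values are illustrative only.
A = np.array([[0.5, 0.0],
              [0.0, 0.25]])
B = np.array([1.0, 1.0])
C = np.array([1.0, 2.0])

ys = run_ssm(A, B, C, np.array([1.0, 0.0, 0.0]))
print(ys)  # [3.    1.    0.375]
```

Each step costs a fixed amount of work and keeps only the current state, so the loop is linear in the sequence length.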
We define the matrix A as a HiPPO matrix, with entries A_{nk} indexed by row n and column k:
$$ A_{nk} = -\begin{cases} \sqrt{(2n+1)(2k+1)} & \text{if } n > k \text{ (below the diagonal)} \\ n+1 & \text{if } n = k \text{ (on the diagonal)} \\ 0 & \text{if } n < k \text{ (above the diagonal)} \end{cases} $$
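As a rough sketch, the matrix can be built directly from the three cases. The helper name `make_hippo` is hypothetical, and indices are assumed to start at zero. The result is negated, following the sign convention of the S4 paper; drop the final minus sign to get the unsigned entries.

```python
import numpy as np

def make_hippo(N):
    """Build the N x N HiPPO matrix with zero-based indices n, k."""
    q = np.sqrt(2 * np.arange(N) + 1)          # sqrt(2n + 1) for n = 0..N-1
    A = np.tril(np.outer(q, q), k=-1)          # sqrt((2n+1)(2k+1)) below the diagonal
    A = A + np.diag(np.arange(N) + 1)          # n + 1 on the diagonal
    return -A                                  # assumption: negated sign convention

A = make_hippo(3)
print(A)
```

The matrix is lower triangular: everything above the diagonal stays zero.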
With this structure, A can be decomposed into a normal matrix, which is unitarily diagonalizable, minus a low-rank term. This reduces the computational cost:
$$ A = V \Lambda V^* - P Q^\top = V \left( \Lambda - (V^* P)(V^* Q)^* \right) V^* $$
This can be done
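The decomposition above can be checked numerically on a small example. The sketch below makes several assumptions that do not come from the text: it uses the negated HiPPO matrix, takes P_n = sqrt(n + 1/2) with Q = P (the choice used in the S4 construction for this matrix), and uses hypothetical helper names `make_hippo` and `nplr_decompose`.

```python
import numpy as np

def make_hippo(N):
    """Negated HiPPO matrix with zero-based indices (sign convention is an assumption)."""
    q = np.sqrt(2 * np.arange(N) + 1)
    return -(np.tril(np.outer(q, q), k=-1) + np.diag(np.arange(N) + 1))

def nplr_decompose(N):
    """Return A, Lambda, V, P such that A = V diag(Lambda) V^* - P P^T."""
    A = make_hippo(N)
    P = np.sqrt(np.arange(N) + 0.5)            # assumed low-rank vector, with Q = P
    S = A + np.outer(P, P)                     # normal part: -1/2 * I plus a skew-symmetric matrix
    skew = S + 0.5 * np.eye(N)                 # the skew-symmetric part alone
    # -1j * skew is Hermitian, so eigh gives real eigenvalues and a unitary V.
    lam_imag, V = np.linalg.eigh(-1j * skew)
    Lambda = -0.5 + 1j * lam_imag              # eigenvalues of the normal part
    return A, Lambda, V, P

A, Lambda, V, P = nplr_decompose(4)
Vh = V.conj().T

# First form: V Lambda V^* - P Q^T
first = V @ np.diag(Lambda) @ Vh - np.outer(P, P)

# Second form: V (Lambda - (V^* P)(V^* Q)^*) V^*
VP = Vh @ P
second = V @ (np.diag(Lambda) - np.outer(VP, VP.conj())) @ Vh

print(np.allclose(first, A), np.allclose(second, A))
```

Both forms should reproduce A up to floating-point error, which shows that, in the basis given by V, A is a diagonal matrix minus a rank-one term.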