Thoughts on state space models, and on whether we'll see a move from a predominantly transformer-based era into a more mixed-architecture period in 2024? It seems like there's a good chance SSMs could get past some of the existing limitations on long sequences and really bring something like working memory to AI.
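(For context on the "working memory" framing: the core of an SSM is a linear recurrence that compresses the whole history into a fixed-size hidden state, updated once per token, so cost grows linearly with sequence length rather than quadratically as with attention. Below is a minimal illustrative sketch of that recurrence, not any particular model's implementation; all names, shapes, and parameter values are made up for the example.)

```python
# Minimal sketch of the discretized SSM recurrence:
#   h_t = A h_{t-1} + B x_t,   y_t = C h_t
# The fixed-size state h plays the role of "working memory".
import numpy as np

def ssm_scan(A, B, C, x):
    """Scan a scalar input sequence x through the recurrence."""
    d_state = A.shape[0]
    h = np.zeros(d_state)          # constant-size state, regardless of sequence length
    ys = []
    for x_t in x:                  # one cheap update per token -> linear in seq_len
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_state, seq_len = 16, 1000    # illustrative sizes
    A = np.eye(d_state) * 0.95     # mild decay keeps the memory stable
    B = rng.normal(size=d_state)
    C = rng.normal(size=d_state)
    x = rng.normal(size=seq_len)
    y = ssm_scan(A, B, C, x)
    print(y.shape)                 # (1000,) -- state stayed 16-dim throughout
```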