07/10/2025
Reactive Transformer research
We just published a research paper introducing our Reactive Transformer (RxT) architecture. We would be grateful if you could read it and upvote it on HuggingFace Daily Papers: https://huggingface.co/papers/2510.03561
The architecture is based on stateful real-time processing with an innovative asynchronous memory update. Instead of reprocessing the entire conversation history for each message, it processes only the current query, with all the context moved to dedicated memory layers. Memory is updated after the answer is generated, so the update does not add to latency: in tests, time to first token was almost the same as the time to generate a single token. It also achieves better quality/accuracy in multi-turn dialogue than a stateless decoder-only model of the same size.
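To make the processing order concrete, here is a minimal toy sketch in Python. It is not the RxT implementation: the class `ToyReactiveModel`, its methods, the `memory_slots` parameter, and the string-based "memory" are all hypothetical stand-ins chosen for illustration. It only shows the control flow described above: answer from the current query plus a fixed-size memory, then update the memory in the background after the answer has been returned.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Optional


class ToyReactiveModel:
    """Hypothetical toy model: a fixed-size memory instead of a growing history."""

    def __init__(self, memory_slots: int = 4) -> None:
        self.memory: list[str] = []        # stands in for the memory layers
        self.memory_slots = memory_slots   # memory size stays constant
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None

    def _update_memory(self, query: str, answer: str) -> None:
        # Placeholder update rule (an assumption): keep the latest interactions.
        self.memory.append(f"{query} -> {answer}")
        self.memory = self.memory[-self.memory_slots:]

    def interact(self, query: str) -> str:
        # Wait for the previous turn's update before reading the memory.
        if self._pending is not None:
            self._pending.result()
        # The answer depends only on the current query and the memory state,
        # never on the full conversation history.
        answer = f"reply to {query!r} using {len(self.memory)} memory slot(s)"
        # Schedule the memory update in the background and return immediately,
        # so the update does not delay the answer.
        self._pending = self._executor.submit(self._update_memory, query, answer)
        return answer

    def close(self) -> None:
        # Let any pending memory update finish, then stop the worker thread.
        self._executor.shutdown(wait=True)


model = ToyReactiveModel(memory_slots=2)
answers = [model.interact(q) for q in ["hi", "what is RxT?", "thanks"]]
model.close()
```

In this sketch the memory never grows beyond `memory_slots` entries, so the work done per message stays roughly constant no matter how long the conversation gets. In a real model the update would run while the user is reading the answer or typing the next message.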
Initial experiments were small-scale (12M to 160M parameter models trained on simple synthetic datasets), but today we started training a bigger 270M parameter model on real data.
Join the discussion on this paper page