Commit 578ba724 authored by Gregor Kobsik

implement the fast recurrent sampler to speed up inference

- add API for token_embedding, transformer_module and generative_head to the architecture implementation
- extend composite head to process only the last depth layer
- extend ShapeSampler to accept a "recurrent" implementation. TODO: set the recurrent sampler as default for the "fast" architecture.
- implement recurrent sampler and basic/substitution/composite token generators
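
The idea behind a recurrent sampler can be sketched as follows. This is a toy illustration, not the code from this commit: only the names token_embedding, transformer_module and generative_head come from the message above, while their bodies, the `transformer_module_recurrent` helper, `VOCAB`, and both sampler functions are hypothetical stand-ins. The sketch assumes a causal module whose output at each position depends only on earlier tokens, so the state can be cached and each new token processed once instead of re-running the full sequence.

```python
from typing import List, Tuple

VOCAB = 5  # hypothetical toy vocabulary size


def token_embedding(token: int) -> int:
    """Toy embedding: map a token to a single integer feature."""
    return 2 * token + 1


def transformer_module(embeddings: List[int]) -> List[int]:
    """Toy causal module: output i is the sum of embeddings 0..i."""
    out, total = [], 0
    for e in embeddings:
        total += e
        out.append(total)
    return out


def transformer_module_recurrent(embedding: int, state: int) -> Tuple[int, int]:
    """Recurrent form: consume one embedding and the cached state.

    Returns (output feature, new state). Hypothetical helper for this sketch.
    """
    new_state = state + embedding
    return new_state, new_state


def generative_head(feature: int) -> int:
    """Toy head: deterministically pick the next token from the last feature."""
    return feature % VOCAB


def sample_full(prefix: List[int], steps: int) -> List[int]:
    """Baseline sampler: re-run the whole sequence for every new token.

    Assumes a non-empty prefix.
    """
    seq = list(prefix)
    for _ in range(steps):
        features = transformer_module([token_embedding(t) for t in seq])
        seq.append(generative_head(features[-1]))
    return seq


def sample_recurrent(prefix: List[int], steps: int) -> List[int]:
    """Recurrent sampler: process each token once and carry the state forward.

    Assumes a non-empty prefix.
    """
    seq = list(prefix)
    state, feature = 0, 0
    for t in seq:  # warm up the cached state on the prefix
        feature, state = transformer_module_recurrent(token_embedding(t), state)
    for _ in range(steps):
        token = generative_head(feature)
        seq.append(token)
        # Only the newly sampled token is pushed through the module.
        feature, state = transformer_module_recurrent(token_embedding(token), state)
    return seq
```

Both samplers should yield the same sequence for the same prefix; the recurrent one does constant work per new token in this toy, while the baseline redoes work proportional to the sequence length at every step.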
parent 1f7c86a6
Showing with 589 additions and 20 deletions