Commit 578ba724 authored 3 years ago by Gregor Kobsik

implement the fast recurrent sampler to speed up inference

- add API for token_embedding, transformer_module and generative_head to the architecture implementation
- extend composite head to process only the last depth layer
- extend ShapeSampler to accept a "recurrent" implementation. TODO: set the recurrent sampler as default for the "fast" architecture.
- implement recurrent sampler and basic/substitution/composite token generators
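
To illustrate why a recurrent sampler can speed up inference, here is a minimal sketch in plain Python. It is not the code from this commit: `ToyModel`, `sample_naive` and `sample_recurrent` are hypothetical names, and the three stages are toy integer functions that only mirror the `token_embedding` / `transformer_module` / `generative_head` split mentioned above. The assumption is that the middle stage is causal, so its state can be cached and only the newest token needs to be processed at each step.

```python
from typing import List, Tuple


class ToyModel:
    """Hypothetical stand-in exposing the three stages named in the commit."""

    vocab_size = 7

    def token_embedding(self, token: int) -> int:
        # Toy embedding: map a token id to a single integer feature.
        return token + 1

    def transformer_module(self, emb: int, state: int) -> Tuple[int, int]:
        # Toy causal layer: the cached state is a running sum of embeddings.
        new_state = state + emb
        return new_state, new_state  # (hidden output, updated state)

    def generative_head(self, hidden: int) -> int:
        # Toy deterministic head: pick the next token from the hidden value.
        return (3 * hidden + 1) % self.vocab_size


def sample_naive(model: ToyModel, start: int, steps: int) -> List[int]:
    """Re-process the whole sequence for every new token."""
    seq = [start]
    for _ in range(steps):
        state = 0
        hidden = 0
        for tok in seq:
            hidden, state = model.transformer_module(
                model.token_embedding(tok), state
            )
        seq.append(model.generative_head(hidden))
    return seq


def sample_recurrent(model: ToyModel, start: int, steps: int) -> List[int]:
    """Keep the state between steps and feed only the newest token."""
    seq = [start]
    state = 0
    for _ in range(steps):
        hidden, state = model.transformer_module(
            model.token_embedding(seq[-1]), state
        )
        seq.append(model.generative_head(hidden))
    return seq
```

Both functions produce the same sequence, but the naive loop calls the middle stage a number of times that grows quadratically with sequence length, while the recurrent loop calls it once per generated token.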

parent 1f7c86a6

Showing with 589 additions and 20 deletions
