[ot][spam][log] poking at lm-infinite was: attempt bits: talk to a language model on list

Undescribed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Mon Sep 11 09:22:04 PDT 2023


i’m looking at lm-infinite a little bit, maybe i can make some
progress on getting it to work

paper: https://arxiv.org/pdf/2308.16137.pdf
partial implementation:
https://github.com/kyegomez/LM-Infinite/blob/main/infinite/main.py

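it seems like the theory is that if each token only attends to the
very start of the context window plus a window of recent tokens (with
empirically-determined sizes), and the relative distances the model
sees are capped near what it saw in training, results on long-context
outputs radically increase in quality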
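
a minimal sketch of the attention pattern the paper calls a Λ-shaped
mask, to make the idea concrete. the function name and the n_global /
n_local parameters here are illustrative assumptions, not taken from
the linked implementation, and the sizes in the example are made up:

```python
def lambda_mask(seq_len, n_global, n_local):
    """Return a seq_len x seq_len list of bools.

    mask[i][j] is True when query position i may attend to key position j.
    Assumption: a key is visible if it is causal (j <= i) and is either one
    of the first n_global tokens or within the last n_local tokens.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i                # never look at future tokens
            is_global = j < n_global       # very start of the context
            is_local = (i - j) < n_local   # recent window, including i itself
            row.append(causal and (is_global or is_local))
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Illustrative sizes only; real values would be chosen empirically.
    for row in lambda_mask(seq_len=8, n_global=2, n_local=3):
        print("".join("x" if visible else "." for visible in row))
```

each printed row is one query position; the two leading columns stay
visible for every row while the recent window slides along the
diagonal.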

the partial implementation doesn’t include the modified calculation
of position encodings, which differs depending on the model the
length extension is applied to.
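
a rough sketch of the model-independent part of that calculation,
assuming the change amounts to capping the relative distance between
query and key. the names effective_distances, max_distance and
alibi_style_bias are hypothetical, and the bias function only shows
how an ALiBi-style model (bias = -slope * distance) might consume the
capped distances. a RoPE model would need its own handling, which is
not shown:

```python
def effective_distances(seq_len, max_distance):
    """Relative distance i - j for each causal (query i, key j) pair,
    capped at max_distance. Future keys (j > i) are marked None."""
    return [
        [min(i - j, max_distance) if j <= i else None for j in range(seq_len)]
        for i in range(seq_len)
    ]


def alibi_style_bias(distance, slope):
    """Linear distance penalty as used by ALiBi-style attention.
    The slope value is a per-head constant; here it is just an argument."""
    return -slope * distance


if __name__ == "__main__":
    # Illustrative numbers only: cap distances at 3 for a 6-token sequence.
    dist = effective_distances(seq_len=6, max_distance=3)
    print(dist[5])  # distances seen by the last query position
    print([alibi_style_bias(d, slope=0.5) for d in dist[5]])
```

with the cap in place, the last query sees every far-away key at the
same maximum distance instead of an ever-growing one.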

