I had to open this PR since I initially worked from my fork, and it
requires a handful of work to trigger a new github action on my fork's
specific branch (couldn't find a way, at least, despite trying all of
them).
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
IDK what else to add in this guide, I looked for relevant code in TGI
codebase and saw that it's used in quantization as well (maybe I could
add that?)
PR for conceptual guide on flash attention. I will add more info unless
I'm told otherwise.
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
I added ToC for docs v1 & started setting up for doc-builder. cc @Narsil
@osanseviero
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: osanseviero <osanseviero@gmail.com>
Co-authored-by: Mishig <mishig.davaadorj@coloradocollege.edu>