Llamacpp backend

If all your dependencies are installed at the system level, running cargo build should be sufficient. However, if you want to experiment with different versions of llama.cpp, some additional setup is required.
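
As a quick sanity check, you can ask pkg-config whether a system-level llama.cpp is already visible before going through the extra setup below. This is only a sketch, assuming the installed llama.cpp provides a llama.pc pkg-config file (its CMake install does):

# Does pkg-config already know about a system-level llama.cpp?
pkg-config --exists llama && pkg-config --modversion llama \
    || echo "no system-level llama.cpp found"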

Install llama.cpp

# Install prefix for the llama.cpp headers, libraries, and pkg-config file
LLAMACPP_PREFIX=$(pwd)/llama.cpp.out

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Configure a minimal library-only build: no common tools, tests, examples, or server
cmake -B build \
    -DCMAKE_INSTALL_PREFIX="$LLAMACPP_PREFIX" \
    -DLLAMA_BUILD_COMMON=OFF \
    -DLLAMA_BUILD_TESTS=OFF \
    -DLLAMA_BUILD_EXAMPLES=OFF \
    -DLLAMA_BUILD_SERVER=OFF
cmake --build build --config Release -j
cmake --install build
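
The configuration above produces the default CPU build. llama.cpp also accepts backend-specific options at configure time; for example, to target NVIDIA GPUs you can enable its CUDA backend (GGML_CUDA is an upstream llama.cpp flag, see its documentation for other hardware):

# Same configure step, with the CUDA backend enabled
cmake -B build \
    -DCMAKE_INSTALL_PREFIX="$LLAMACPP_PREFIX" \
    -DGGML_CUDA=ON \
    -DLLAMA_BUILD_COMMON=OFF \
    -DLLAMA_BUILD_TESTS=OFF \
    -DLLAMA_BUILD_EXAMPLES=OFF \
    -DLLAMA_BUILD_SERVER=OFF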

Build TGI

PKG_CONFIG_PATH="$LLAMACPP_PREFIX/lib/pkgconfig" cargo build
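
Because llama.cpp was installed into a custom prefix, and it builds a shared library by default, the dynamic linker also needs to find that library at runtime. A minimal sketch, running the backend straight from cargo with both paths set:

# Expose the pkg-config file at build time and the shared library at run time
PKG_CONFIG_PATH="$LLAMACPP_PREFIX/lib/pkgconfig" \
LD_LIBRARY_PATH="$LLAMACPP_PREFIX/lib:$LD_LIBRARY_PATH" \
    cargo run -- --help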