Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-06-24 10:00:16 +00:00)
* chore(neuron): bump version to 0.2.0
* refactor(neuron): use named parameters in inputs helpers
  This allows hiding the differences between the two backends in terms of input parameters.
* refactor(neuron): remove obsolete code paths
* fix(neuron): use neuron_config whenever possible
* fix(neuron): use new cache import path
* fix(neuron): neuron config is not stored in config anymore
* fix(nxd): adapt model retrieval to new APIs
* fix(generator): emulate greedy in sampling parameters
  When on-device sampling is enabled, we need to emulate the greedy behaviour using top-k=1, top-p=1, temperature=1.
* test(neuron): update models and expectations
* feat(neuron): support on-device sampling
* fix(neuron): adapt entrypoint
* tests(neuron): remove obsolete models
* fix(neuron): adjust test expectations for llama on nxd
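The greedy-emulation change above can be sketched as follows. This is a hypothetical illustration, not the project's actual API: when on-device sampling is always active, greedy decoding is recovered by forcing top-k=1 while keeping top-p and temperature at neutral values.

```python
# Hypothetical sketch of emulating greedy decoding via sampling parameters.
# A child sampler that always samples can still behave greedily if only the
# single most likely token survives filtering.

def emulate_greedy(params: dict) -> dict:
    """Return sampling parameters that reproduce greedy decoding.

    top_k=1 keeps only the highest-probability token; top_p=1.0 and
    temperature=1.0 are neutral values that leave the distribution untouched.
    """
    if not params.get("do_sample", False):
        # Greedy request: constrain the sampler instead of switching it off.
        params = dict(params, top_k=1, top_p=1.0, temperature=1.0)
    return params

greedy = emulate_greedy({"do_sample": False})
sampled = emulate_greedy({"do_sample": True, "top_k": 50})
```

With these values the sampler's argmax and the sampled token coincide, so no separate greedy code path is needed on device.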
17 lines
295 B
Bash
Executable File
#!/bin/bash
set -e -o pipefail -u

# Temporary file through which the Python entry point passes environment
# variables back to this wrapper script.
export ENV_FILEPATH=$(mktemp)

trap 'rm -f "${ENV_FILEPATH}"' EXIT

touch "${ENV_FILEPATH}"

# Resolve the directory containing this script.
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

# Run the Python entry point; it may write `export` lines into ENV_FILEPATH.
"${SCRIPT_DIR}/tgi_entry_point.py" "$@"

# Pick up any environment variables exported by the entry point.
source "${ENV_FILEPATH}"

# Replace this shell with the launcher so it receives signals directly.
exec text-generation-launcher "$@"
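The script works around the fact that a child process cannot modify its parent's environment: the Python entry point writes `export` lines into the file named by ENV_FILEPATH, and the wrapper then `source`s that file before exec'ing the launcher. A minimal sketch of the entry-point side, assuming a hypothetical `export_to_parent` helper (not the project's actual code):

```python
# Hypothetical sketch: hand environment variables back to the calling shell
# by appending `export` lines to the file named by ENV_FILEPATH.
import os


def export_to_parent(variables: dict) -> None:
    """Write `export NAME="value"` lines for the wrapper script to source."""
    env_file = os.environ.get("ENV_FILEPATH")
    if not env_file:
        # Not running under the wrapper; nothing to hand back.
        return
    with open(env_file, "a") as f:
        for name, value in variables.items():
            f.write(f'export {name}="{value}"\n')
```

Sourcing the file in the parent shell is what makes these variables visible to `text-generation-launcher`; values containing double quotes would need escaping, which this sketch omits.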