# OpenAssistant Inference

Preliminary implementation of the inference engine for OpenAssistant.

## Development Variant 1 (docker compose)

The services of the inference stack are prefixed with `inference-` in the
[unified compose descriptor](../docker-compose.yaml). <br/> Before building
them, make sure Docker's
[BuildKit](https://docs.docker.com/build/buildkit/) backend is enabled. See the
[FAQ](https://projects.laion.ai/Open-Assistant/docs/faq#enable-dockers-buildkit-backend)
for more info.

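Recent Docker releases already use BuildKit by default. On older ones it can be switched on per shell session through environment variables. The snippet below is a minimal sketch of that approach, assuming a POSIX shell; the linked FAQ remains the authoritative procedure for this repository.

```shell
# Enable BuildKit for the current shell session only.
export DOCKER_BUILDKIT=1
# Only relevant for the legacy docker-compose v1 binary, which otherwise
# uses its own builder instead of delegating to the Docker CLI.
export COMPOSE_DOCKER_CLI_BUILD=1
```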
To build the services, run:

```shell
docker compose --profile inference build
```

Spin up the stack:

```shell
docker compose --profile inference up -d
```

Tail the logs:

```shell
docker compose logs -f \
    inference-server \
    inference-worker \
    inference-text-client \
    inference-text-generation-server
```

Attach to the text-client, and start chatting:

```shell
docker attach open-assistant-inference-text-client-1
```

> **Note:** In the last step, `open-assistant-inference-text-client-1` is the
> name of the `text-client` container started in step 2 (spinning up the
> stack). Compose derives the prefix from the project name, so check
> `docker ps` if your container is named differently.

> **Note:** The compose file contains bind mounts that let you develop on
> the modules of the inference stack, and the `oasst-shared` package, without
> rebuilding.

> **Note:** You can spin up any number of workers by adjusting the number of
> replicas of the `inference-worker` service.

> **Note:** Wait for the `inference-text-generation-server` service to
> output `{"message":"Connected"}` before starting to chat.

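The last two notes can be scripted. The sketch below is one possible approach, not something shipped with the repository: both function names are hypothetical, and it uses the `--scale` flag of `docker compose up` as an alternative to editing the replica count in the compose file.

```shell
# Hypothetical helpers, not part of the repository.

# Start the stack with N inference workers (default 2). `--scale` overrides
# any replica count set in the compose file for this invocation.
start_stack_with_workers() {
  docker compose --profile inference up -d --scale "inference-worker=${1:-2}"
}

# Read log lines from stdin and succeed as soon as one contains the
# readiness message; returns non-zero if the input ends without it.
wait_for_connected() {
  grep -q -F '"message":"Connected"'
}
```

For example, `start_stack_with_workers 3` followed by `docker compose logs -f inference-text-generation-server | wait_for_connected` waits for the readiness message before you attach to the text client. The `logs -f` process may linger until it writes its next line after `grep` has exited.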
## Development Variant 2 (you'll need tmux)

Run `./full-dev-setup.sh` to start the full development setup. Wait until the
second terminal is ready and prints `{"message":"Connected"}` before entering
input into the last terminal.

## Development Variant 3 (you'll need multiple terminals)

Run a Redis container (or use the one from the general docker compose file):

```bash
docker run --rm -it -p 6379:6379 redis
```

Run the inference server:

```bash
cd server
pip install -r requirements.txt
uvicorn main:app --reload
```

Run one (or more) workers:

```bash
cd worker
pip install -r requirements.txt
python __main__.py
```

For the worker, you'll also want the text-generation-inference server running:

```bash
docker run --rm -it -p 8001:80 -e MODEL_ID=distilgpt2 ykilcher/text-generation-inference
```

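Before starting a worker, it can help to confirm that the generation server answers requests. The sketch below is a hypothetical smoke test: it assumes the `ykilcher/text-generation-inference` image exposes the same `/generate` route and JSON payload as the upstream text-generation-inference project, which this README does not confirm. The function names are made up for illustration, and the payload builder does not escape quotes, so keep the prompt free of `"` and `\`. Once the container has finished loading the model, call it as `tgi_generate "Hello" 20`.

```bash
# Hypothetical helpers, not part of the repository.

# Build the JSON request body: $1 is the prompt, $2 the token budget.
# No JSON escaping is performed on the prompt.
tgi_payload() {
  printf '{"inputs":"%s","parameters":{"max_new_tokens":%d}}' "$1" "${2:-20}"
}

# POST the prompt to the server on the host port mapped above (8001).
# Assumes the upstream /generate route exists in this image.
tgi_generate() {
  curl -s http://127.0.0.1:8001/generate \
    -X POST -H 'Content-Type: application/json' \
    -d "$(tgi_payload "$1" "${2:-20}")"
}
```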
Run the client:

```bash
cd text-client
pip install -r requirements.txt
python __main__.py
```

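With several terminals involved, a quick preflight check shows which pieces are already up. The sketch below is an illustration under stated assumptions: the function names are hypothetical, ports 6379 and 8001 come from the commands above, and port 8000 is uvicorn's default, which applies only if the server is started exactly as shown. It relies on bash's `/dev/tcp` pseudo-device, so run it with bash rather than plain `sh`.

```bash
# Hypothetical helpers, not part of the repository.

# Return 0 if something accepts TCP connections on host $1, port $2.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Print one status line per service used in this variant.
check_inference_ports() {
  local entry name port
  for entry in redis:6379 server:8000 text-generation:8001; do
    name=${entry%%:*}
    port=${entry##*:}
    if port_open 127.0.0.1 "$port"; then
      echo "$name: listening on $port"
    else
      echo "$name: nothing on $port"
    fi
  done
}
```

Calling `check_inference_ports` before launching the client prints three lines, one each for Redis, the inference server, and the text-generation server.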