mirror of https://github.com/wassname/Open-Assistant.git synced 2026-06-27 16:10:30 +08:00

Yannic Kilcher 1709dc0324 Initial implementation of the inference system (#869)

* very primitive implementation of inference

* re-worked with security in mind

* removed polling from clients

* switched workers to websockets

* implemented back and forth chats

2023-01-21 22:38:18 +01:00

OpenAssistant Inference Server

Workers communicate with the /work endpoint over a WebSocket. Each worker sends its configuration and, if a task is available, the server returns it. The worker then performs the task and streams the result back to the server, also over the WebSocket.
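The worker exchange described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the JSON message shapes (`type`, `config`, `prompt`, `text`), the `run_worker_once` function, and the `FakeSocket` stand-in are all hypothetical. The socket object only needs `send` and `recv` coroutines, so an in-memory fake is used here in place of a real WebSocket connection.

```python
import asyncio
import json


class FakeSocket:
    """In-memory stand-in for a WebSocket connection (hypothetical helper)."""

    def __init__(self, incoming):
        self._incoming = list(incoming)  # messages the "server" will send
        self.sent = []                   # messages the worker sent

    async def send(self, message):
        self.sent.append(message)

    async def recv(self):
        return self._incoming.pop(0)


async def run_worker_once(ws, config, generate):
    # 1. Announce the worker's configuration (assumed JSON shape).
    await ws.send(json.dumps({"type": "config", "config": config}))
    # 2. Wait for the server to hand out a task.
    task = json.loads(await ws.recv())
    # 3. Stream the result back token by token.
    for token in generate(task["prompt"]):
        await ws.send(json.dumps({"type": "token", "text": token}))
    # 4. Tell the server the stream is complete.
    await ws.send(json.dumps({"type": "done"}))


def echo_tokens(prompt):
    # Placeholder "model": yields the prompt's words one at a time.
    yield from prompt.split()


def demo():
    ws = FakeSocket([json.dumps({"prompt": "hello world"})])
    asyncio.run(run_worker_once(ws, {"model": "demo"}, echo_tokens))
    return [json.loads(m) for m in ws.sent]
```

Calling `demo()` returns the decoded messages the worker sent: one config message, one message per token, and a final completion marker. Sending tokens individually rather than as one final message is what lets the server forward partial output while generation is still running.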

Clients first call /chat to create a new chat, then add messages to it via /chat/<id>/message. The response is a server-sent events (SSE) stream that delivers tokens as they become available.
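On the client side, reading the token stream comes down to parsing SSE framing. The sketch below uses a hypothetical helper, `iter_sse_data`, that handles only the `data:` field and assumes each event carries one token as plain text. The actual payload format of this server is not specified here.

```python
def iter_sse_data(lines):
    """Yield the data payload of each event in a stream of SSE lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (e.g. keep-alives).
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.


def collect_tokens(lines):
    # Join the streamed tokens into the full response text.
    return "".join(iter_sse_data(lines))
```

For example, feeding it the lines `"data: Hel\n"`, `"\n"`, `"data: lo\n"`, `"\n"` yields the two tokens `Hel` and `lo`. Because `iter_sse_data` is a generator, a client can display each token as soon as it arrives instead of waiting for the whole response.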
