# OpenAssistant Inference Server

Workers communicate with the `/work` endpoint over a WebSocket. Each worker
sends its configuration and, if a task is available, the server returns it. The
worker then performs the task and streams the result back to the server, also
over WebSocket.
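The worker flow described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual protocol: the JSON message shapes (`type`, `config`, `prompt`, `text`), the `run_worker_once` function, and the `InMemoryConnection` stand-in are all hypothetical names chosen for the example. A real worker would pass in a WebSocket connected to `/work` instead of the in-memory object.

```python
import asyncio
import json
from typing import AsyncIterator, Callable, Protocol


class Connection(Protocol):
    """Minimal subset of a WebSocket connection used by the worker."""

    async def send(self, message: str) -> None: ...
    async def recv(self) -> str: ...


async def run_worker_once(
    ws: Connection,
    config: dict,
    generate: Callable[[str], AsyncIterator[str]],
) -> None:
    """Announce the config, wait for one task, stream the result back."""
    # 1. Tell the server what this worker can do (assumed message schema).
    await ws.send(json.dumps({"type": "config", "config": config}))

    # 2. Wait until the server hands out a task.
    task = json.loads(await ws.recv())

    # 3. Send each generated token as soon as it is available.
    async for token in generate(task["prompt"]):
        await ws.send(json.dumps({"type": "token", "text": token}))

    # 4. Signal that the response is complete (assumed convention).
    await ws.send(json.dumps({"type": "done"}))


class InMemoryConnection:
    """Stand-in for a real WebSocket so the sketch runs without a server."""

    def __init__(self, incoming: list[str]) -> None:
        self.incoming = list(incoming)
        self.sent: list[dict] = []

    async def send(self, message: str) -> None:
        self.sent.append(json.loads(message))

    async def recv(self) -> str:
        return self.incoming.pop(0)


async def split_words(prompt: str) -> AsyncIterator[str]:
    """Toy generator that returns the prompt word by word."""
    for word in prompt.split():
        yield word


def demo() -> list[dict]:
    """Run one worker round trip against the in-memory connection."""
    ws = InMemoryConnection([json.dumps({"prompt": "hello world"})])
    asyncio.run(run_worker_once(ws, {"model": "demo"}, split_words))
    return ws.sent
```

Keeping the connection behind a small `send`/`recv` interface means the same loop can be exercised in tests without a network and pointed at a real WebSocket client in production.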

Clients first call `/chat` to create a new chat, then add messages to it via
`/chat/<id>/message`. The response is an SSE event source that sends tokens as
they become available.
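On the client side, consuming the token stream mostly comes down to reading the SSE response line by line. The sketch below uses only the Python standard library. `iter_sse_data` handles the `data:` field and blank-line event boundaries of the SSE format and ignores other fields. `stream_reply` is a hypothetical helper: the use of `POST` and the `{"message": ...}` request body are assumptions, not the server's documented schema.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields (event, id, retry)
        # are ignored in this sketch.


def stream_reply(base_url: str, chat_id: str, message: str) -> Iterator[str]:
    """Send a message to /chat/<id>/message and yield SSE data payloads.

    The HTTP method and JSON body shape are assumptions for illustration.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/{chat_id}/message",
        data=json.dumps({"message": message}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # Iterating the response yields raw lines as bytes.
        yield from iter_sse_data(line.decode("utf-8") for line in response)
```

Because the parser takes any iterable of lines, it can be checked against canned input before it is connected to a live server. An event that is not followed by a blank line is never emitted, which matches how SSE treats a stream that ends mid-event.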