# Llama

Inference based on our fork of https://github.com/ggerganov/llama.cpp
## Instances

### general

General instance.

#### Instance Parameters

- `ctx_size` (integer): Size of the context
- `batch_size` (integer): Size of a single batch
- `ubatch_size` (integer): Size of the micro batch (physical batch size)
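
As a rough illustration, the instance parameters form a simple key-value map. The exact creation call depends on the host binding and is not specified in this document, so `create_instance` below is a hypothetical placeholder, not a documented API.

```python
# Hypothetical sketch: the keys and types follow the schema above,
# but `create_instance` is a placeholder name for the host's own call.
instance_params = {
    "ctx_size": 4096,    # integer: size of the context
    "batch_size": 512,   # integer: size of a single batch
    "ubatch_size": 256,  # integer: size of the micro batch
}
# instance = create_instance("general", instance_params)
```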
## Operations

### run

Run the llama.cpp inference and produce some output.

**Parameters**

- `prompt` (string, required): Prompt to complete
- `antiprompts` (array of string, default `[]`): Antiprompts that trigger a stop
- `max_tokens` (integer, default `0`): Maximum number of tokens to generate; 0 means unlimited

**Return**

- `result` (string, required): Generated result (the completion of the prompt)
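
Assuming a dict-style calling convention (the actual dispatch mechanism is not shown in this document), a `run` invocation might be shaped like the sketch below; `run_op` is a hypothetical placeholder.

```python
# Hypothetical sketch: `run_op` is a placeholder, not a documented API.
run_params = {
    "prompt": "France's capital is",  # required
    "antiprompts": ["\n"],            # optional, default []
    "max_tokens": 32,                 # optional, default 0 (unlimited)
}
# out = run_op("run", run_params)
# print(out["result"])  # the generated completion of the prompt
```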
### begin-chat

Begin a chat session.

**Parameters**

- `setup` (string, default `""`): Initial setup for the chat session
- `role_user` (string, default `"User"`): Role name for the user
- `role_assistant` (string, default `"Assistant"`): Role name for the assistant
### add-chat-prompt

Add a prompt to the chat session as the user.

**Parameters**

- `prompt` (string, default `""`): Prompt to add to the chat session
### get-chat-response

Get a response from the chat session.

**Return**

- `response` (string, required): Response from the chat session
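
Putting the chat operations together, one user turn adds a prompt and then fetches the reply. The sketch below keeps the same hypothetical `run_op` placeholder; only the operation names, parameter names, and the `response` field come from the schema above.

```python
# Hypothetical sketch: `run_op` is a placeholder, not a documented API.
# One chat turn: add the user's prompt, then fetch the assistant's reply.
# run_op("add-chat-prompt", {"prompt": "Hello! Who are you?"})
# reply = run_op("get-chat-response", {})  # get-chat-response documents no parameters
# print(reply["response"])
```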