endpoint | (none) | String | Full URL of the Triton Inference Server endpoint, e.g., https://triton-server:8000/v2/models. Both HTTP and HTTPS are supported; HTTPS is recommended for production. |
model-name | (none) | String | Name of the model to invoke on the Triton server. |
model-version | "latest" | String | Version of the model to use. Defaults to "latest". |
timeout | 30 s | Duration | HTTP request timeout (connect + read + write). This applies to each individual request and is separate from Flink's async timeout. Defaults to 30 seconds. |
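
To illustrate how `endpoint`, `model-name`, and `model-version` could combine into a request URL, here is a minimal Python sketch. It is not the connector's actual implementation. It assumes the URL layout of the KServe v2 inference protocol that Triton exposes (`v2/models/{name}[/versions/{version}]/infer`), and it assumes that `latest` is handled by omitting the version segment so that the server picks the version. The helper `build_infer_url` and the model name `text-classifier` are hypothetical.

```python
from datetime import timedelta
from urllib.parse import quote


def build_infer_url(endpoint: str, model_name: str, model_version: str = "latest") -> str:
    """Join the option values into a v2-style inference URL (hypothetical helper)."""
    base = endpoint.rstrip("/")  # tolerate a trailing slash in `endpoint`
    url = f"{base}/{quote(model_name, safe='')}"
    # Assumption: "latest" means "let the server choose", so no version segment is added.
    if model_version != "latest":
        url += f"/versions/{quote(model_version, safe='')}"
    return url + "/infer"


# Option values mirroring the table; the model name is made up for illustration.
options = {
    "endpoint": "https://triton-server:8000/v2/models",
    "model-name": "text-classifier",
    "model-version": "latest",
    "timeout": timedelta(seconds=30),  # per-request HTTP timeout
}

infer_url = build_infer_url(
    options["endpoint"], options["model-name"], options["model-version"]
)
timeout_seconds = options["timeout"].total_seconds()
```

Under these assumptions, a pinned version such as `"2"` would add a `/versions/2` segment before `/infer`, while the default `latest` leaves the version out of the URL.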