Kind
EmbeddingServer
Group
toolhive.stacklok.dev
Version
v1alpha1
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: EmbeddingServer
metadata:
  name: example
apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
metadata object
spec object
EmbeddingServerSpec defines the desired state of EmbeddingServer
args []string
Args are additional arguments to pass to the embedding inference server
env []object
Env are environment variables to set in the container
name string required
Name of the environment variable
value string required
Value of the environment variable
hfTokenSecretRef object
HFTokenSecretRef is a reference to a Kubernetes Secret containing the HuggingFace token. If set, the secret value is passed to the embedding server so it can authenticate with HuggingFace.
key string required
Key is the key within the secret
name string required
Name is the name of the secret
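As a minimal sketch of how the env and hfTokenSecretRef fields above fit together, the fragment below sets one environment variable and points at a Secret holding the token. The variable name, Secret name, and key are hypothetical placeholders, not values defined by the schema.

```yaml
spec:
  env:
    # Hypothetical variable; any name/value pair is accepted by the schema.
    - name: EXAMPLE_VAR
      value: "example-value"
  hfTokenSecretRef:
    # Hypothetical Secret name and key; the Secret must already exist
    # in the same namespace for the reference to resolve.
    name: hf-token
    key: token
```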
image string
Image is the container image for the embedding inference server. Images must be from HuggingFace Text Embeddings Inference (https://github.com/huggingface/text-embeddings-inference).
imagePullPolicy string
ImagePullPolicy defines the pull policy for the container image
enum: Always, Never, IfNotPresent
model string
Model is the HuggingFace embedding model to use (e.g., "sentence-transformers/all-MiniLM-L6-v2")
modelCache object
ModelCache configures persistent storage for downloaded models. When enabled, models are cached in a PVC and reused across pod restarts.
accessMode string
AccessMode is the access mode for the PVC
enum: ReadWriteOnce, ReadWriteMany, ReadOnlyMany
enabled boolean
Enabled controls whether model caching is enabled
size string
Size is the size of the PVC for model caching (e.g., "10Gi")
storageClassName string
StorageClassName is the storage class to use for the PVC. If not specified, the cluster's default storage class is used.
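The modelCache fields above could be combined as in the following sketch. The size reuses the "10Gi" example from the field description; the storage class name is a hypothetical placeholder and can be omitted to fall back to the cluster default.

```yaml
spec:
  modelCache:
    enabled: true
    size: 10Gi                    # example value from the field description
    accessMode: ReadWriteOnce     # one of ReadWriteOnce, ReadWriteMany, ReadOnlyMany
    storageClassName: standard    # hypothetical; omit to use the default class
```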
podTemplateSpec object
PodTemplateSpec allows customizing the pod (node selection, tolerations, etc.). This field accepts a PodTemplateSpec object as JSON/YAML. To modify the container that the embedding server runs in, you must specify the 'embedding' container name in the PodTemplateSpec.
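A sketch of a podTemplateSpec that covers the cases named above (node selection, tolerations, and a change to the embedding container) might look like this. The taint key and value are hypothetical, and how the operator merges these settings with its own defaults is not described in this reference.

```yaml
spec:
  podTemplateSpec:
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      tolerations:
        # Hypothetical taint; replace with one that exists on your nodes.
        - key: example.com/dedicated
          operator: Equal
          value: embeddings
          effect: NoSchedule
      containers:
        # The name must be 'embedding' to target the embedding server container.
        - name: embedding
          securityContext:
            allowPrivilegeEscalation: false
```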
port integer
Port is the port to expose the embedding service on
format: int32
minimum: 1
maximum: 65535
replicas integer
Replicas is the number of embedding server replicas to run
format: int32
minimum: 1
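The scalar fields described so far (image, imagePullPolicy, model, port, replicas) could be set together as sketched below. The image tag is an assumption and should be checked against the Text Embeddings Inference releases; the port and replica count are arbitrary values within the documented ranges, not documented defaults.

```yaml
spec:
  # Assumed image reference; verify the tag before use.
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
  imagePullPolicy: IfNotPresent   # one of Always, Never, IfNotPresent
  model: sentence-transformers/all-MiniLM-L6-v2
  port: 8080                      # must be between 1 and 65535
  replicas: 2                     # must be at least 1
```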
resourceOverrides object
ResourceOverrides allows overriding annotations and labels for resources created by the operator
persistentVolumeClaim object
PersistentVolumeClaim defines overrides for the PVC resource
annotations object
Annotations to add or override on the resource
labels object
Labels to add or override on the resource
service object
Service defines overrides for the Service resource
annotations object
Annotations to add or override on the resource
labels object
Labels to add or override on the resource
statefulSet object
StatefulSet defines overrides for the StatefulSet resource
annotations object
Annotations to add or override on the resource
labels object
Labels to add or override on the resource
podTemplateMetadataOverrides object
PodTemplateMetadataOverrides defines metadata overrides for the pod template
annotations object
Annotations to add or override on the resource
labels object
Labels to add or override on the resource
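The following sketch shows one possible use of resourceOverrides. All annotation and label keys are hypothetical. It assumes podTemplateMetadataOverrides is nested under statefulSet, which is inferred from the field ordering above rather than stated explicitly.

```yaml
spec:
  resourceOverrides:
    service:
      annotations:
        example.com/owner: ml-platform      # hypothetical annotation
    persistentVolumeClaim:
      labels:
        example.com/data: model-cache       # hypothetical label
    statefulSet:
      labels:
        example.com/tier: inference         # hypothetical label
      # Assumed to sit under statefulSet (inferred from field ordering).
      podTemplateMetadataOverrides:
        annotations:
          example.com/scrape: "true"        # hypothetical annotation
```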
resources object
Resources defines compute resources for the embedding server
limits object
Limits describes the maximum amount of compute resources allowed
cpu string
CPU is the CPU limit in cores (e.g., "500m" for 0.5 cores)
memory string
Memory is the memory limit, expressed as a Kubernetes quantity (e.g., "64Mi" for 64 mebibytes)
requests object
Requests describes the minimum amount of compute resources required
cpu string
CPU is the CPU request in cores (e.g., "500m" for 0.5 cores)
memory string
Memory is the memory request, expressed as a Kubernetes quantity (e.g., "64Mi" for 64 mebibytes)
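A resources block using the fields above could be written as follows. The quantities are illustrative only and are not sizing guidance for any particular model. Both cpu and memory are string fields, so whole-number CPU values are quoted.

```yaml
spec:
  resources:
    requests:
      cpu: 500m       # 0.5 cores
      memory: 1Gi
    limits:
      cpu: "2"        # quoted because the field type is string
      memory: 4Gi
```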
status object
EmbeddingServerStatus defines the observed state of EmbeddingServer
conditions []object
Conditions represent the latest available observations of the EmbeddingServer's state
lastTransitionTime string required
lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
message string required
message is a human readable message indicating details about the transition. This may be an empty string.
maxLength: 32768
observedGeneration integer
observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance.
format: int64
minimum: 0
reason string required
reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty.
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
minLength: 1
maxLength: 1024
status string required
status of the condition, one of True, False, Unknown.
enum: True, False, Unknown
type string required
type of condition in CamelCase or in foo.example.com/CamelCase.
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
maxLength: 316
message string
Message provides additional information about the current phase
observedGeneration integer
ObservedGeneration reflects the generation most recently observed by the controller
format: int64
phase string
Phase is the current phase of the EmbeddingServer
enum: Pending, Downloading, Ready, Failed, Terminating
readyReplicas integer
ReadyReplicas is the number of ready replicas
format: int32
url string
URL is the URL where the embedding service can be accessed