Capabilities
GET /capabilities lets a client discover which models the server hosts and
what each one supports. The request has no body; the response is a
Capabilities object.
Response shape
{
  "protocol_version": "1.0",
  "rotation_format": "quaternion_xyzw",
  "coordinate_system": "right_handed_y_up",
  "units": "meters",
  "response_formats": ["gltf_2.0_json", "gltf_2.0_binary"],
  "models": [
    {
      "id": "kimodo-soma-rp",
      "fps": 30.0,
      "supports_retargeting": false,
      "supports_async": false,
      "supported_constraints": ["root_path", "effector_target", "pose_keyframe"],
      "supported_segments": ["text", "unconditioned"],
      "supported_guidance_types": ["nocfg", "regular", "separated"],
      "predicted_contact_joints": ["LeftHeel", "LeftToe", "RightHeel", "RightToe"],
      "native_clip_seconds": 10.0,
      "chunking": "stitched",
      "recommended_max_duration_seconds": 12.0,
      "canonical_skeleton": { "joints": [ ] },
      "limits": {
        "max_duration_seconds": 30.0,
        "max_num_samples": 16,
        "max_constraints_per_request": 64,
        "max_prompt_length": 1000,
        "max_request_bytes": 1048576
      }
    }
  ]
}
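
As a rough illustration of consuming this response, the sketch below looks up one model entry by its id. It works on an already-parsed response (a plain dict) so that it does not depend on any particular HTTP client; the helper name `find_model` is hypothetical and not part of the protocol.

```python
from typing import Any, Dict, Optional


def find_model(capabilities: Dict[str, Any], model_id: str) -> Optional[Dict[str, Any]]:
    """Return the entry of capabilities["models"] whose id matches, or None."""
    for model in capabilities.get("models", []):
        if model.get("id") == model_id:
            return model
    return None


# Trimmed copy of the example response above.
capabilities = {
    "protocol_version": "1.0",
    "models": [
        {"id": "kimodo-soma-rp", "fps": 30.0, "supports_retargeting": False},
    ],
}

model = find_model(capabilities, "kimodo-soma-rp")
missing = find_model(capabilities, "no-such-model")  # None: id not hosted
```

Returning None for an unknown id lets the caller decide whether to fall back to another model or surface an error.
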
Model fields
| Field | Type | Notes |
|---|---|---|
| id | string | Used as request.model |
| fps | number | Native frame rate of the model |
| supports_retargeting | bool | If false, requests MUST send canonical_skeleton verbatim |
| supports_async | bool | If true, server may return 202 Accepted for long generations (see Async) |
| supported_constraints | string[] | Constraint type names the model accepts |
| supported_segments | string[] | Segment types the model accepts. Defaults to ["text", "unconditioned"]; backbones with a specialized text-to-pose model add "pose" |
| supported_guidance_types | string[] | Subset of ["nocfg", "regular", "separated"] |
| predicted_contact_joints | string[] | Joint names (from canonical_skeleton) for which the model emits foot_contacts. Empty means no contact prediction |
| native_clip_seconds | number | Clip duration the model was trained on. Longer requests are handled according to chunking |
| chunking | "none" \| "stitched" | If "stitched", server may chunk longer requests internally |
| recommended_max_duration_seconds | number | Comfort-zone duration; clients SHOULD warn the user past this point |
| canonical_skeleton | Skeleton | The skeleton the model was trained on. See Skeletons → |
| limits | object | Per-request limits |
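
The supported_* fields lend themselves to a client-side check before a request is sent. The following is a minimal sketch under stated assumptions: the caller already knows which segment types, constraint types, and guidance type it intends to use (how those appear inside a request body is not covered in this section), and the helper name `unsupported_features` is hypothetical. The fallback for a missing supported_segments follows the default given in the table; treating a missing supported_constraints or supported_guidance_types as empty is an assumption.

```python
from typing import Any, Dict, Iterable, List, Optional

# Default taken from the supported_segments row of the table above.
DEFAULT_SEGMENTS = ["text", "unconditioned"]


def unsupported_features(
    model: Dict[str, Any],
    segment_types: Iterable[str],
    constraint_types: Iterable[str],
    guidance_type: Optional[str] = None,
) -> List[str]:
    """List the requested features that the model entry does not advertise."""
    segments_ok = set(model.get("supported_segments", DEFAULT_SEGMENTS))
    # Assumption: a missing list means nothing of that kind is supported.
    constraints_ok = set(model.get("supported_constraints", []))
    guidance_ok = set(model.get("supported_guidance_types", []))

    problems: List[str] = []
    for name in segment_types:
        if name not in segments_ok:
            problems.append(f"segment:{name}")
    for name in constraint_types:
        if name not in constraints_ok:
            problems.append(f"constraint:{name}")
    if guidance_type is not None and guidance_type not in guidance_ok:
        problems.append(f"guidance:{guidance_type}")
    return problems


model = {
    "id": "kimodo-soma-rp",
    "supported_constraints": ["root_path", "effector_target", "pose_keyframe"],
    "supported_segments": ["text", "unconditioned"],
    "supported_guidance_types": ["nocfg", "regular", "separated"],
}

problems = unsupported_features(model, ["text", "pose"], ["root_path"], "regular")
print(problems)  # ['segment:pose']
```
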
Limits
| Field | Type | Notes |
|---|---|---|
| max_duration_seconds | number | Largest accepted total duration |
| max_num_samples | int | Largest accepted Options.num_samples |
| max_constraints_per_request | int | Sum across all constraints in the request |
| max_prompt_length | int | Measured in Unicode code points, per TextSegment.prompt |
| max_request_bytes | int | Body size limit. Requests exceeding it return 413 payload_too_large |
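
A client can check most of these limits locally before sending anything. The sketch below is illustrative only and rests on several assumptions that this section does not confirm: that the request carries `options.num_samples`, a top-level `constraints` list in which each entry counts as one constraint, and a `prompt` string on text segments. The total duration is passed in separately because its representation in the request is not described here, and the helper name `limit_violations` is hypothetical.

```python
import json
from typing import Any, Dict, List


def limit_violations(
    request: Dict[str, Any], limits: Dict[str, Any], duration_seconds: float
) -> List[str]:
    """Return the names of the limits that the request would exceed."""
    violated: List[str] = []

    if duration_seconds > limits["max_duration_seconds"]:
        violated.append("max_duration_seconds")

    # Assumed request layout: {"options": {"num_samples": ...}}.
    num_samples = request.get("options", {}).get("num_samples", 1)
    if num_samples > limits["max_num_samples"]:
        violated.append("max_num_samples")

    # Assumed request layout: a flat top-level "constraints" list.
    if len(request.get("constraints", [])) > limits["max_constraints_per_request"]:
        violated.append("max_constraints_per_request")

    # len() on a Python str counts code points, matching the table's unit.
    prompts = [s.get("prompt") for s in request.get("segments", [])]
    if any(isinstance(p, str) and len(p) > limits["max_prompt_length"] for p in prompts):
        violated.append("max_prompt_length")

    # Estimate only: the real size depends on how the HTTP client serializes.
    body = json.dumps(request, ensure_ascii=False).encode("utf-8")
    if len(body) > limits["max_request_bytes"]:
        violated.append("max_request_bytes")

    return violated


limits = {
    "max_duration_seconds": 30.0,
    "max_num_samples": 16,
    "max_constraints_per_request": 64,
    "max_prompt_length": 1000,
    "max_request_bytes": 1048576,
}
request = {
    "model": "kimodo-soma-rp",
    "options": {"num_samples": 17},
    "segments": [{"type": "text", "prompt": "x" * 1001}],
}
violations = limit_violations(request, limits, duration_seconds=31.0)
```

A local check like this only saves a round trip; the server remains the authority on whether a request is accepted.
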
Using the canonical skeleton
A client that wants to use a model without retargeting reads
canonical_skeleton from this endpoint and submits it verbatim in
subsequent /generate requests:
curl -s https://server/capabilities \
| jq '.models[] | select(.id == "kimodo-soma-rp") | .canonical_skeleton' \
> canonical.json
Then in your request:
{
  "protocol_version": "1.0",
  "model": "kimodo-soma-rp",
  "skeleton": { "joints": [/* paste from canonical.json */] },
  "segments": [/* … */]
}
When supports_retargeting is true (the common case for production servers),
you can ignore the canonical skeleton and send the user's actual skeleton.
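
The choice between the two skeletons can be made programmatically. The sketch below is one possible way to do it, assuming the model entry has already been read from /capabilities; the helper name `build_generate_request` is hypothetical, and the joint entries in the usage example are placeholders, since the Skeleton schema is defined elsewhere.

```python
import copy
from typing import Any, Dict, List, Optional


def build_generate_request(
    model: Dict[str, Any],
    segments: List[Dict[str, Any]],
    user_skeleton: Optional[Dict[str, Any]] = None,
    protocol_version: str = "1.0",
) -> Dict[str, Any]:
    """Build a /generate body, choosing the skeleton from supports_retargeting."""
    if model.get("supports_retargeting") and user_skeleton is not None:
        # Retargeting available: send the user's actual skeleton.
        skeleton = user_skeleton
    else:
        # No retargeting: the canonical skeleton must be sent verbatim.
        skeleton = copy.deepcopy(model["canonical_skeleton"])
    return {
        "protocol_version": protocol_version,
        "model": model["id"],
        "skeleton": skeleton,
        "segments": segments,
    }


# Placeholder joints for illustration; real entries come from the server.
model = {
    "id": "kimodo-soma-rp",
    "supports_retargeting": False,
    "canonical_skeleton": {"joints": [{"name": "placeholder_root"}]},
}
user_skeleton = {"joints": [{"name": "placeholder_user_root"}]}

request = build_generate_request(model, [{"type": "unconditioned"}], user_skeleton)
# supports_retargeting is false here, so the canonical skeleton is used.
```

The deep copy keeps later edits to the request from mutating the cached capabilities data.
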