senkani_embed
Status: Live · Replaces: API call · Savings: $0/call
Text embeddings on Apple Silicon via MLX. MiniLM-L6-v2 → 384-dim Float32. Zero API cost.
Signature
senkani_mcp.call(tool="senkani_embed", args={...})
Behavior
Runs on-device through MLX (Metal on the Apple GPU). Sub-200 ms per call on M-series chips. A shared `MLXInferenceLock` serializes every MLX call in FIFO order and drops loaded model containers when macOS signals a memory-pressure warning.
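The FIFO serialization described above can be sketched in Python (the real `MLXInferenceLock` lives in the Swift sources; the class and names here are illustrative, not the actual implementation):

```python
import threading
from collections import deque

class InferenceLock:
    """FIFO-ordered lock: callers acquire in strict arrival order.

    A plain threading.Lock gives no fairness guarantee, so each
    waiter gets its own Event and is woken in queue order.
    """
    def __init__(self):
        self._mutex = threading.Lock()
        self._waiters = deque()
        self._held = False

    def acquire(self):
        with self._mutex:
            if not self._held and not self._waiters:
                self._held = True   # uncontended: take it immediately
                return
            ev = threading.Event()
            self._waiters.append(ev)
        ev.wait()                   # block until release() hands off

    def release(self):
        with self._mutex:
            if self._waiters:
                # Hand ownership directly to the oldest waiter.
                self._waiters.popleft().set()
            else:
                self._held = False
```

Each inference call wraps its MLX work in `acquire()`/`release()`, so concurrent tool calls never interleave GPU work.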
Inputs
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| texts | array<string> | — | Batch of strings to embed. |
| normalize | boolean | true | L2-normalize the output vectors. |
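With `normalize` left at its default, each output vector is scaled to unit length. A minimal sketch of what L2 normalization does (pure Python, no MLX required):

```python
import math

def l2_normalize(vec):
    """Scale a vector so its Euclidean (L2) norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

v = [3.0, 4.0]
u = l2_normalize(v)          # [0.6, 0.8]
length = math.sqrt(sum(x * x for x in u))  # ≈ 1.0
```

Unit-length vectors make cosine similarity a plain dot product, which is why normalization is on by default for retrieval use cases.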
Output
Array of 384-dim Float32 vectors, one per input.
Example
{"tool":"senkani_embed","args":{"texts":["orders","payments"]}}
Details
First call loads the model (~80 MB) into unified memory; subsequent calls reuse it. Idle models drop on memory pressure.
Fully offline once the model is downloaded. No outbound traffic.
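The load-once-then-reuse behavior, with a drop hook for memory pressure, can be sketched as follows (a hypothetical Python cache; the actual Swift container management is in `Sources/MLX/`):

```python
import threading

class ModelCache:
    """Load-once model holder: the first access pays the load cost,
    later accesses reuse the cached instance, and a memory-pressure
    callback can drop it so the next call reloads from disk."""
    def __init__(self, loader):
        self._loader = loader
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            if self._model is None:
                self._model = self._loader()  # e.g. read ~80 MB of weights
            return self._model

    def drop(self):
        # Invoked on a memory-pressure warning; frees the model.
        with self._lock:
            self._model = None

loads = 0
def load_model():
    global loads
    loads += 1
    return object()  # stand-in for real model weights

cache = ModelCache(load_model)
a = cache.get()
b = cache.get()   # same instance, no second load
cache.drop()      # simulate memory pressure
c = cache.get()   # reloaded on next use
```

The trade-off is latency: the first call after a drop pays the full load cost again, which is why models stay resident until the OS actually reports pressure.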
See also
senkani_vision — Local vision inference via Gemma.
Model Manager pane — Download and inspect local models.
Source:
Sources/MCPServer/Tools/EmbedTool.swift + Sources/MLX/