AI on demand: Voxtral

Calling the model

# Set your personal key
STONEY_KEY=sk-...

# Set the desired model
MODEL_ID=mistralai/Voxtral-Mini-3B-2507

# Path to the audio file to transcribe
AUDIO_FILE=your-audio-file.wav

curl -s https://llm.stoney-cloud.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $STONEY_KEY" \
  -F "file=@$AUDIO_FILE" \
  -F "model=$MODEL_ID" \
  | jq .

Example output:

{
  "text": "Hello and welcome to the Stepping Stone LLM Gateway.",
  "usage": {
    "type": "duration",
    "seconds": 4
  }
}
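
In scripts, usually only the transcript text is needed rather than the full JSON. The following is a minimal sketch, assuming the response has been saved to a file and has the shape shown in the example output above. The function names transcript_text and transcript_seconds are hypothetical helpers, not part of the gateway.

```shell
# Hypothetical helper: print only the transcript text from a saved response.
# Assumes the JSON has a top-level "text" field, as in the example output.
transcript_text() {
  jq -r '.text' "$1"
}

# Hypothetical helper: print the audio duration in seconds reported under "usage".
transcript_seconds() {
  jq -r '.usage.seconds' "$1"
}
```

To use it, redirect the curl output to a file (for example response.json) instead of piping it to jq, then run transcript_text response.json.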

Limitations

The maximum request body size is currently 50 MiB, which limits the size of the audio file.
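
Because the limit applies to the whole request body, the multipart encoding adds a small overhead on top of the file itself. The sketch below checks a file locally before uploading. The helper name fits_size_limit and the variable MAX_BODY_BYTES are assumptions for illustration, and the check is deliberately strict (the file must be smaller than the limit) to leave room for that overhead.

```shell
# 50 MiB expressed in bytes (50 * 1024 * 1024 = 52428800).
MAX_BODY_BYTES=$((50 * 1024 * 1024))

# Hypothetical helper: succeed if the file is smaller than the body limit.
# Returns 2 if the file does not exist, 1 if it is too large.
fits_size_limit() {
  local size
  [ -f "$1" ] || return 2
  size=$(wc -c < "$1")
  # $((size)) normalizes the padded output some wc implementations produce.
  [ "$((size))" -lt "$MAX_BODY_BYTES" ]
}
```

A call such as fits_size_limit "$AUDIO_FILE" || echo "file too large or missing" can be placed before the curl command.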

The following audio formats/containers/codecs are supported: aiff, au, avr, caf, flac, htk, ircam, mat4, mat5, mp3, mpc2k, nist, ogg, paf, pvf, raw, rf64, sd2, sds, svx, voc, w64, wav, wavex, wve, xi
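
A quick local check can catch unsupported files before a request is sent. The sketch below compares the file extension against the list above. This is only a rough proxy, since it assumes the extension matches the format name and does not inspect the file contents. The names SUPPORTED_FORMATS and has_supported_extension are hypothetical.

```shell
# Format names copied from the list above, space-separated.
SUPPORTED_FORMATS="aiff au avr caf flac htk ircam mat4 mat5 mp3 mpc2k nist ogg paf pvf raw rf64 sd2 sds svx voc w64 wav wavex wve xi"

# Hypothetical helper: succeed if the file extension is in the list.
has_supported_extension() {
  local ext
  # A name without a dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  # Reject extensions containing spaces so they cannot match two list entries.
  case "$ext" in
    *" "*) return 1 ;;
  esac
  case " $SUPPORTED_FORMATS " in
    *" $ext "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, has_supported_extension "$AUDIO_FILE" || echo "unsupported format" warns about a file such as recording.m4a, which would need to be converted to one of the listed formats first.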