Request Types
The Resultity API supports multiple inference request types to accommodate a wide range of use cases — from fast single-turn queries to heavy distributed tasks and multimodal jobs.
Currently Available
Synchronous
- Standard request-response cycle.
- The client sends a request and waits for the result in the same HTTP connection.
- Ideal for fast, low-latency jobs (e.g., short completions).
Polling
- The client sends a job and immediately receives a
request_id. - The result must be fetched later using a status endpoint.
- Designed for longer or queued tasks.
Planned
Batch
- Submit multiple requests in a single call.
- Suitable for preloading embeddings, generating multiple completions, etc.
Streaming
- Sends tokens to the client as they are generated.
- Used for chatbots, streaming UIs, and voice assistants.
Multimodel
- Executes the same prompt on multiple models in parallel.
- Can be used for ensemble approaches or fallback logic.
- Examples:
mistral+gemma, oropenchat+zephyr.
Multinode
- Executes a single heavy model (e.g., LLaMA2 70B) across several nodes using sharding.
- Enables distributed inference with large context windows.
Media-Oriented Extensions
Planned support for non-text jobs using compatible APIs:
Vision
/v1/images/generations— generate images from prompts (DALL·E-style);/v1/images/variations— modify or enhance existing images;/v1/images/description— describe or caption images;- Based on models such as Kandinsky, Stable Diffusion, Playground v2.
Audio
/v1/audio/transcriptions— convert speech to text (e.g., Whisper, SeamlessM4T);/v1/audio/speech— text-to-speech (e.g., Bark, xtts, tts-zero);/v1/audio/translation— audio-based translation.
Video (experimental)
- Future support for generation and description (e.g., SVD, AnimateDiff, Zer0Scope);
- Initially via Space or custom Subclouds, not main Cloud API.
This modular structure allows Resultity to evolve from OpenAI-compatible endpoints to full multimodal capability, leveraging existing open-source and hosted models while maintaining API consistency.