
Quickstart

Deploy your first AI application on Cordatus in about 10 minutes. This page is a condensed guide; for full steps, screenshots, and videos, follow the links in each section.

Prerequisites

  • A Cordatus account with appropriate permissions
  • At least one device connected to Cordatus (status: Online or Connected)
  • Device with GPU support (recommended for LLM applications)
  • Cordatus Client installed and running on your device
Tip: If you don't have a device connected yet, complete the Device Hub Quickstart first.


Getting Started in 10 Minutes

1. Browse Available Applications (1 minute)

  1. Go to Containers > Applications from the left menu
  2. Browse the application catalog:
    • vLLM (high-throughput LLM inference)
    • TensorRT-LLM (optimized NVIDIA inference)
    • Ollama (simple local models)
    • NVIDIA Dynamo (distributed multi-GPU)
    • NVIDIA VSS (video analysis)
  3. Click on any application to view its Detail Page
  4. Check supported platforms and available Docker image versions

See full details → Application Launch Guide


2. Launch Your First Application (3 minutes)

Example: Deploy vLLM with a Small Model

  1. Click Start Application on the vLLM detail page
  2. Select Device: Choose your connected device
  3. Select Version: Choose the latest Docker image version
    • Green checkmark = already downloaded
    • Download icon = will be downloaded
  4. Click Next to proceed to Advanced Settings

See full details → Application Launch Guide - Section 4


3. Configure Basic Settings (3 minutes)

  • General Settings:

    • Environment Name: Leave blank for auto-generated name (or enter custom name like vllm-llama2-7b)
    • Select GPU: Choose All GPU or select specific GPUs
    • Resource Limits: Keep default values (or customize CPU/RAM limits)
    • Enable Open Web UI: Check this box (creates chat interface automatically)
  • Model Selection:

    • Switch to Cordatus Models tab
    • Search for llama-2-7b or choose any small model (7B or 13B recommended for first deployment)
    • Click on the model to select it
  • Skip Other Settings for Now:

    • Docker Options: Auto-configured
    • Environment Variables: Pre-filled with defaults
    • Engine Arguments: Optimized for selected model

See full details → Application Launch Guide - Section 5


4. Launch the Container (1 minute)

  1. Click Start Environment button (bottom right)
  2. Enter your sudo password when prompted
  3. If Docker image needs download, confirm the download
  4. Wait for deployment to complete (1-5 minutes depending on image size)

What happens behind the scenes:

  • Cordatus connects to your device
  • Downloads Docker image (if needed)
  • Creates and starts the container
  • Configures networking and volumes
  • Starts Open Web UI container (if enabled)

See full details → Application Launch Guide - Section 6


5. Verify Deployment (2 minutes)

  1. Go to Containers > Containers from the left menu
  2. Find your newly created container group
  3. Verify status shows Running
  4. Click See the Container Information (three-dot menu)
  5. Check Logs tab - you should see model loading messages
  6. Go to Ports tab - copy the Local URL

Access Your Application:

  • API Endpoint: Open the Local URL in browser (shows API documentation)
  • Open Web UI: If enabled, find the second container in the group and open its URL
  • Test with curl:
    curl http://localhost:8000/v1/models

See full details → Container Management Guide - Section 3.6
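
The curl check above can also be scripted. Below is a minimal Python sketch (standard library only) for talking to the deployed container's OpenAI-compatible endpoint; it assumes the default vLLM port 8000, so substitute the Local URL you copied from the Ports tab, and use whatever model name your deployment actually serves:

```python
import json
from urllib import request

# Base URL of the deployed container. Assumes the default vLLM port 8000;
# replace with the Local URL copied from the Ports tab.
BASE_URL = "http://localhost:8000"

def build_chat_request(model, prompt, max_tokens=64):
    """Build an OpenAI-compatible chat completion payload for the server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(path, payload=None):
    """POST (or GET when payload is None) to the server and decode the JSON reply."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = request.Request(BASE_URL + path, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example usage once the container reports Running (requires a live server):
# models = send("/v1/models")                       # same check as the curl above
# reply = send("/v1/chat/completions",
#              build_chat_request("llama-2-7b", "Hello!"))
```

The commented calls at the bottom mirror the curl verification step; run them only after the Logs tab shows the model has finished loading.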


Expected Result

After completing these steps, you should have:

  • Container Running

    • Status: Running (green indicator)
    • Logs show successful model loading
    • API endpoint accessible via Local URL
  • Open Web UI Active (if enabled)

    • Second container in group shows Running
    • Chat interface accessible via browser
    • Can send messages and receive AI responses
  • Application in Containers List

    • Visible under Containers > Containers
    • Container group properly organized
    • All components healthy

What's Next?

Explore VRAM Calculator (5 minutes)

Before deploying larger models, calculate VRAM requirements:

  1. Go to VRAM Calculator from the main menu
  2. Select Model: Choose a larger model (e.g., Llama-2-70B)
  3. Select GPU: Choose your GPU model
  4. Review Results: Check if VRAM is sufficient
  5. Adjust Settings: Try different quantization levels (INT8, INT4)

Learn to:

  • Calculate memory requirements before deployment
  • Compare different quantization options
  • Determine optimal batch size and sequence length
  • Plan multi-GPU deployments

See full details → VRAM Calculator User Guide
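
As a sanity check on the calculator's output, the memory needed for model weights alone is roughly parameters × bytes per parameter. The sketch below is a back-of-the-envelope estimate, not the Cordatus VRAM Calculator itself, which also accounts for KV cache, batch size, and sequence length:

```python
# Approximate bytes per parameter at common quantization levels.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def estimate_weight_vram_gb(params_billion, quant):
    """GB needed just to hold the weights at the given quantization."""
    return params_billion * BYTES_PER_PARAM[quant]

# A 70B model at INT4 needs ~35 GB for weights alone; with runtime overhead
# (KV cache, activations) this lands in the 40-50 GB range shown in the
# Quick Reference table.
```

This is why quantization matters in step 5 above: dropping from FP16 to INT4 cuts weight memory by 4x, often the difference between needing multiple GPUs and fitting on one.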


Add Your Own Models (10 minutes)

Use models you already downloaded on your device:

  1. Configure Model Paths:

    • Connect to your device
    • Go to Metrics > Model Info
    • Define paths for Huggingface, Ollama, or NVIDIA NIM
  2. Scan for Models:

    • Go to LLM Models page
    • Click Explore Models on Your Device
    • Click Start Scanning
    • Select models to add to Cordatus
  3. Deploy User Model:

    • Launch an application as in Step 2, selecting your model from the User Models tab instead of Cordatus Models

Deploy Advanced Applications (15-30 minutes)

Try more complex deployments:

NVIDIA AI Dynamo (Multi-GPU Distributed Inference):

  • Configure processing mode (Aggregated/Disaggregated)
  • Set up router strategy (KV-Aware recommended)
  • Create multiple workers with GPU assignments
  • Launch distributed inference pipeline

See full details → NVIDIA AI Dynamo Creation Guide

NVIDIA VSS (Video Analysis):

  • Configure main VSS container
  • Set up VLM, LLM, Embed, and Rerank components
  • Choose to create new or use existing components
  • Deploy complete video analysis pipeline

See full details → NVIDIA VSS Creation Guide

Manage Your Containers (5 minutes)

Learn container management operations:

  1. Start/Stop Containers:

    • Click Start/Stop buttons for any container
    • Select multiple containers for batch operations
  2. View Container Information:

    • Monitor real-time logs
    • Review configuration parameters
    • Check port mappings
  3. Generate Public URLs:

    • Make your applications accessible externally
    • Share access with team members
  4. Create Open Web UI:

    • Add chat interface to existing LLM containers
    • Generate public URLs for sharing
  5. Duplicate Containers:

    • Clone an existing container's configuration for quick redeployment

Troubleshooting Quick Tips

Container Won't Start:

  • Check device status is Connected
  • Verify GPU is available and not in use
  • Review container logs for error messages
  • Ensure sufficient disk space for Docker image

Out of VRAM Error:

  • Use VRAM Calculator to verify requirements
  • Try lower quantization (FP16 → INT8 → INT4)
  • Reduce batch size or sequence length
  • Add more GPUs or use smaller model

Model Not Found:

  • For Custom Models: Verify model name is correct
  • For User Models: Ensure model paths are configured
  • Check model is accessible from container
  • Review volume mappings in Docker Options

Open Web UI Not Working:

  • Verify container is running
  • Check port is not already in use
  • Review Open Web UI container logs
  • Ensure network connectivity between containers
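
To rule out the "port already in use" case, you can probe a port directly from the device. A minimal Python sketch using only the standard library; the ports in the example comment are common defaults (8000 for vLLM, 3000 for Open Web UI), not guaranteed Cordatus settings, so check the Ports tab for the actual values:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connection, i.e. a listener exists.
        return s.connect_ex((host, port)) == 0

# Example: probe the ports your containers expect before launching.
# print(port_in_use(8000), port_in_use(3000))
```

If the port is taken, stop the conflicting process or change the port mapping in Docker Options before relaunching.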

Quick Reference

Application Types

| Type | Use Case | Complexity | Setup Time |
| --- | --- | --- | --- |
| Standard Apps | Basic containers | Low | 5 min |
| LLM Engines | Model inference | Medium | 5 min |
| NVIDIA Dynamo | Multi-GPU distributed | High | 5 min |
| NVIDIA VSS | Video analysis | High | 10 min |

VRAM Requirements

| Model Size | Quantization | Minimum VRAM | Recommended GPU |
| --- | --- | --- | --- |
| 7B | INT4 | 4-6 GB | RTX 3090, RTX 4090 |
| 7B | INT8 | 8-10 GB | RTX 4090, A10 |
| 13B | INT4 | 8-10 GB | RTX 4090, A10 |
| 13B | INT8 | 14-16 GB | A10, A100 40GB |
| 70B | INT4 | 40-50 GB | A100 80GB, 2x A100 40GB |
| 70B | INT8 | 80-90 GB | A100 80GB, 2x A100 80GB |

Key Shortcuts

  • Containers > Applications: Browse application catalog
  • Containers > Containers: Manage running containers
  • LLM Models: Access Cordatus Models and User Models
  • VRAM Calculator: Calculate memory requirements
  • Device Metrics: View GPU/CPU/RAM usage

Get Help

For detailed documentation with screenshots and videos: