
Quickstart

Deploy your first AI application on Cordatus in about 10 minutes. This page is a condensed guide; for full steps, screenshots, and videos, follow the links in each section.

Prerequisites

  • A Cordatus account with appropriate permissions
  • At least one device connected to Cordatus (status: Online or Connected)
  • Device with GPU support (recommended for LLM applications)
  • Cordatus Client installed and running on your device
Tip: If you don't have a device connected yet, complete the Device Hub Quickstart first.


Getting Started in 10 Minutes

1. Browse Available Applications (1 minute)

  1. Go to Containers > Applications from the left menu
  2. Browse the application catalog:
    • vLLM (high-throughput LLM inference)
    • TensorRT-LLM (optimized NVIDIA inference)
    • Ollama (simple local models)
    • NVIDIA Dynamo (distributed multi-GPU)
    • NVIDIA VSS (video analysis)
  3. Click on any application to view its Detail Page
  4. Check supported platforms and available Docker image versions

See full details → Application Launch Guide


2. Launch Your First Application (3 minutes)

Example: Deploy vLLM with a Small Model

  1. Click Start Application on the vLLM detail page
  2. Select Device: Choose your connected device
  3. Select Version: Choose the latest Docker image version
    • Green checkmark = already downloaded
    • Download icon = will be downloaded
  4. Click Next to proceed to Advanced Settings

See full details → Application Launch Guide - Section 4


3. Configure Basic Settings (3 minutes)

  • General Settings:

    • Environment Name: Leave blank for auto-generated name (or enter custom name like vllm-llama2-7b)
    • Select GPU: Choose All GPU or select specific GPUs
    • Resource Limits: Keep default values (or customize CPU/RAM limits)
    • Enable Open Web UI: Check this box (creates chat interface automatically)
  • Model Selection:

    • Switch to Cordatus Models tab
    • Search for llama-2-7b or choose any small model (7B or 13B recommended for first deployment)
    • Click on the model to select it
  • Skip Other Settings for Now:

    • Docker Options: Auto-configured
    • Environment Variables: Pre-filled with defaults
    • Engine Arguments: Optimized for selected model

See full details → Application Launch Guide - Section 5


4. Launch the Container (1 minute)

  1. Click Start Environment button (bottom right)
  2. Enter your sudo password when prompted
  3. If Docker image needs download, confirm the download
  4. Wait for deployment to complete (1-5 minutes depending on image size)

What happens behind the scenes:

  • Cordatus connects to your device
  • Downloads Docker image (if needed)
  • Creates and starts the container
  • Configures networking and volumes
  • Starts Open Web UI container (if enabled)

See full details → Application Launch Guide - Section 6


5. Verify Deployment (2 minutes)

  1. Go to Containers > Containers from the left menu
  2. Find your newly created container group
  3. Verify status shows Running
  4. Click See the Container Information (three-dot menu)
  5. Check Logs tab - you should see model loading messages
  6. Go to Ports tab - copy the Local URL

Access Your Application:

  • API Endpoint: Open the Local URL in browser (shows API documentation)
  • Open Web UI: If enabled, find the second container in the group and open its URL
  • Test with curl:
    curl http://localhost:8000/v1/models

See full details → Container Management Guide - Section 3.6
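
The curl check above can also be scripted. Below is a minimal Python sketch (standard library only) for talking to the deployed container's OpenAI-compatible endpoint; it assumes the default vLLM port 8000, so substitute the Local URL you copied from the Ports tab, and use whatever model name your deployment actually serves:

```python
import json
from urllib import request

# Base URL of the deployed container. Assumes the default vLLM port 8000;
# replace with the Local URL copied from the Ports tab.
BASE_URL = "http://localhost:8000"

def build_chat_request(model, prompt, max_tokens=64):
    """Build an OpenAI-compatible chat completion payload for the server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(path, payload=None):
    """POST (or GET when payload is None) to the server and decode the JSON reply."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = request.Request(BASE_URL + path, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example usage once the container reports Running (requires a live server):
# models = send("/v1/models")                       # same check as the curl above
# reply = send("/v1/chat/completions",
#              build_chat_request("llama-2-7b", "Hello!"))
```

The commented calls at the bottom mirror the curl verification step; run them only after the Logs tab shows the model has finished loading.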


Expected Result

After completing these steps, you should have:

  • Container Running

    • Status: Running (green indicator)
    • Logs show successful model loading
    • API endpoint accessible via Local URL
  • Open Web UI Active (if enabled)

    • Second container in group shows Running
    • Chat interface accessible via browser
    • Can send messages and receive AI responses
  • Application in Containers List

    • Visible under Containers > Containers
    • Container group properly organized
    • All components healthy

What's Next?

Explore VRAM Calculator (5 minutes)

Before deploying larger models, calculate VRAM requirements:

  1. Go to VRAM Calculator from the main menu
  2. Select Model: Choose a larger model (e.g., Llama-2-70B)
  3. Select GPU: Choose your GPU model
  4. Review Results: Check if VRAM is sufficient
  5. Adjust Settings: Try different quantization levels (INT8, INT4)

Learn to:

  • Calculate memory requirements before deployment
  • Compare different quantization options
  • Determine optimal batch size and sequence length
  • Plan multi-GPU deployments

See full details → VRAM Calculator User Guide
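
As a sanity check on the calculator's output, the memory needed for model weights alone is roughly parameters × bytes per parameter. The sketch below is a back-of-the-envelope estimate, not the Cordatus VRAM Calculator itself, which also accounts for KV cache, batch size, and sequence length:

```python
# Approximate bytes per parameter at common quantization levels.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def estimate_weight_vram_gb(params_billion, quant):
    """GB needed just to hold the weights at the given quantization."""
    return params_billion * BYTES_PER_PARAM[quant]

# A 70B model at INT4 needs ~35 GB for weights alone; with runtime overhead
# (KV cache, activations) this lands in the 40-50 GB range shown in the
# Quick Reference table.
```

This is why quantization matters in step 5 above: dropping from FP16 to INT4 cuts weight memory by 4x, often the difference between needing multiple GPUs and fitting on one.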


Add Your Own Models (10 minutes)

Use models you already downloaded on your device:

  1. Configure Model Paths:

    • Connect to your device
    • Go to Metrics > Model Info
    • Define paths for Huggingface, Ollama, or NVIDIA NIM
  2. Scan for Models:

    • Go to LLM Models page
    • Click Explore Models on Your Device
    • Click Start Scanning
    • Select models to add to Cordatus
  3. Deploy User Model:

    • Launch an application as in Step 2, selecting your model from the User Models tab instead of Cordatus Models

Deploy Advanced Applications (15-30 minutes)

Try more complex deployments:

NVIDIA AI Dynamo (Multi-GPU Distributed Inference):

  • Configure processing mode (Aggregated/Disaggregated)
  • Set up router strategy (KV-Aware recommended)
  • Create multiple workers with GPU assignments
  • Launch distributed inference pipeline

See full details → NVIDIA AI Dynamo Creation Guide

NVIDIA VSS (Video Analysis):

  • Configure main VSS container
  • Set up VLM, LLM, Embed, and Rerank components
  • Choose to create new or use existing components
  • Deploy complete video analysis pipeline

See full details → NVIDIA VSS Creation Guide

Manage Your Containers (5 minutes)

Learn container management operations:

  1. Start/Stop Containers:

    • Click Start/Stop buttons for any container
    • Select multiple containers for batch operations
  2. View Container Information:

    • Monitor real-time logs
    • Review configuration parameters
    • Check port mappings
  3. Generate Public URLs:

    • Make your applications accessible externally
    • Share access with team members
  4. Create Open Web UI:

    • Add chat interface to existing LLM containers
    • Generate public URLs for sharing
  5. Duplicate Containers:

    • Clone an existing container's configuration for quick redeployment

Troubleshooting Quick Tips

Container Won't Start:

  • Check device status is Connected
  • Verify GPU is available and not in use
  • Review container logs for error messages
  • Ensure sufficient disk space for Docker image

Out of VRAM Error:

  • Use VRAM Calculator to verify requirements
  • Try lower quantization (FP16 → INT8 → INT4)
  • Reduce batch size or sequence length
  • Add more GPUs or use smaller model

Model Not Found:

  • For Custom Models: Verify model name is correct
  • For User Models: Ensure model paths are configured
  • Check model is accessible from container
  • Review volume mappings in Docker Options

Open Web UI Not Working:

  • Verify container is running
  • Check port is not already in use
  • Review Open Web UI container logs
  • Ensure network connectivity between containers
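
To rule out the "port already in use" case, you can probe a port directly from the device. A minimal Python sketch using only the standard library; the ports in the example comment are common defaults (8000 for vLLM, 3000 for Open Web UI), not guaranteed Cordatus settings, so check the Ports tab for the actual values:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connection, i.e. a listener exists.
        return s.connect_ex((host, port)) == 0

# Example: probe the ports your containers expect before launching.
# print(port_in_use(8000), port_in_use(3000))
```

If the port is taken, stop the conflicting process or change the port mapping in Docker Options before relaunching.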

Quick Reference

Application Types

| Type | Use Case | Complexity | Setup Time |
| --- | --- | --- | --- |
| Standard Apps | Basic containers | Low | 5 min |
| LLM Engines | Model inference | Medium | 5 min |
| NVIDIA Dynamo | Multi-GPU distributed | High | 5 min |
| NVIDIA VSS | Video analysis | High | 10 min |

VRAM Requirements

| Model Size | Quantization | Minimum VRAM | Recommended GPU |
| --- | --- | --- | --- |
| 7B | INT4 | 4-6 GB | RTX 3090, RTX 4090 |
| 7B | INT8 | 8-10 GB | RTX 4090, A10 |
| 13B | INT4 | 8-10 GB | RTX 4090, A10 |
| 13B | INT8 | 14-16 GB | A10, A100 40GB |
| 70B | INT4 | 40-50 GB | A100 80GB, 2x A100 40GB |
| 70B | INT8 | 80-90 GB | A100 80GB, 2x A100 80GB |

Key Shortcuts

  • Containers > Applications: Browse application catalog
  • Containers > Containers: Manage running containers
  • LLM Models: Access Cordatus Models and User Models
  • VRAM Calculator: Calculate memory requirements
  • Device Metrics: View GPU/CPU/RAM usage

Get Help

For detailed documentation with screenshots and videos: