Developer Portal

Build with Logix.

API-first model compression and deployment. From ingestion to edge delivery in a single pipeline. Python SDK, Swift runtime, CLI tools.

API Reference

Deploy quantized models to any target with a single function call.

Deploy a Model

python

Deploy a quantized model to any target device with one call.

import logix

# Initialize the Logix client
client = logix.Client(api_key="lx-sk-...")

# Deploy a quantized medical model to iPhone
deployment = client.deploy(
    model_id="logix-phi4-medical-v3",
    target="iphone_16_pro",
    precision="int4",
    runtime="coreml",
    config={
        "max_context_length": 4096,
        "batch_size": 1,
        "thermal_profile": "sustained"
    }
)

print(f"Deployed: {deployment.id}")
print(f"Artifact size: {deployment.artifact_size_mb}MB")
print(f"Estimated latency: {deployment.est_latency_ms}ms")
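
For intuition on the reported artifact size: INT4 stores each weight in 4 bits, so size scales directly with parameter count. A back-of-envelope estimate in plain Python, independent of the SDK (the 10% overhead factor covering quantization scales and metadata is an assumption for illustration):

```python
# Rough artifact-size estimate for a quantized model.
# INT4 stores each weight in 4 bits (0.5 bytes); the overhead
# factor is an assumed allowance for scales, zero-points, metadata.

def estimate_artifact_mb(n_params: float, bits_per_weight: int,
                         overhead: float = 1.1) -> float:
    """Approximate on-disk size of a quantized model in MB."""
    bytes_total = n_params * bits_per_weight / 8 * overhead
    return bytes_total / (1024 ** 2)

# A 1B-parameter model at INT4:
size_mb = estimate_artifact_mb(1e9, bits_per_weight=4)
print(f"~{size_mb:.0f} MB")  # → ~525 MB
```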

Run Inference

swift

Run on-device inference with the deployed model.

// Swift — On-Device Inference (CoreML)
import LogixRuntime

let model = try LogixModel(
    artifact: "logix-phi4-medical-v3.mlpackage",
    precision: .int4
)

let result = try await model.generate(
    prompt: "Summarize the patient's cardiac markers",
    context: patientRecord.deidentified(),
    config: .init(
        maxTokens: 256,
        temperature: 0.1, // Low temperature for clinical precision
        topP: 0.9
    )
)

print(result.text)
// Latency: 23ms | Tokens/sec: 42 | Memory: 1.2GB

Logix CLI

One CLI.
Full pipeline.

The Logix CLI handles the entire model lifecycle — from ingestion and distillation to pruning, quantization, benchmarking, and deployment. Install it and own the pipeline.

# Install the Logix CLI
$ pip install logix-cli

Distillation
Quantization
Pruning
Benchmarks
Deployment
OTA Updates

Initialize a new Logix project with default config

$ logix init my-medical-slm

Ingest a base model and training dataset

$ logix ingest --model meta-llama/Llama-3.2-1B --dataset ./clinical_notes/

Run recursive distillation with GPT-5 as teacher

$ logix distill --teacher gpt-5 --student ./student_1B/ --cycles 47
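
The distill step trains the student to match the teacher's softened output distribution. The loss commonly optimized for this is a temperature-scaled KL divergence; a minimal pure-Python sketch of that standard soft-target loss (illustrative, not the Logix implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaled by T^2 so gradient magnitudes stay comparable as the
    temperature changes (the standard soft-target formulation).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

teacher = [2.0, 1.0, 0.1]   # teacher logits for one token
student = [1.5, 1.2, 0.3]   # student logits for the same token
print(f"loss = {distill_loss(teacher, student):.4f}")
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.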

Quantize the refined model to INT4 with AWQ

$ logix quantize --precision int4 --method awq --calibration ./cal_set/
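
AWQ rescales salient channels before rounding; the rounding itself is ordinary symmetric quantization onto the INT4 grid [-8, 7]. A minimal sketch of that core step in plain Python (illustrative only, not Logix's or AWQ's implementation):

```python
def quantize_int4(weights):
    """Symmetric INT4 quantization: map floats to integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = max positive INT4 code
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from INT4 codes."""
    return [qi * scale for qi in q]

w = [0.12, -0.53, 0.07, 0.91, -0.88]
q, scale = quantize_int4(w)
w_hat = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, f"max error {max_err:.3f}")
```

The calibration set passed via `--calibration` is what lets methods like AWQ pick better per-channel scales than this single global one.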

Apply structured pruning at 30% sparsity

$ logix prune --sparsity 0.3 --structured --preserve-heads 0-7
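
Structured pruning removes whole rows (neurons or channels) rather than scattered individual weights, so the result stays dense and hardware-friendly. A minimal magnitude-based sketch at 30% row sparsity (the `prune_rows` helper is hypothetical, for illustration, not part of the CLI):

```python
def prune_rows(matrix, sparsity):
    """Zero out the lowest-L2-norm rows until `sparsity` fraction is removed."""
    n_prune = int(len(matrix) * sparsity)
    norms = [sum(w * w for w in row) ** 0.5 for row in matrix]
    # Indices of the weakest rows by magnitude
    weakest = set(sorted(range(len(matrix)), key=lambda i: norms[i])[:n_prune])
    return [[0.0] * len(row) if i in weakest else list(row)
            for i, row in enumerate(matrix)]

W = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],   # near-zero row: pruned first
    [0.5, 0.4, -0.6],
    [0.03, -0.02, 0.02],   # near-zero
    [0.8, 0.9, 0.7],
    [0.6, -0.5, 0.4],
    [0.02, 0.01, 0.03],    # near-zero
    [0.7, 0.6, -0.8],
    [0.4, 0.5, 0.6],
    [0.5, 0.6, 0.4],
]
W_pruned = prune_rows(W, sparsity=0.3)
zero_rows = sum(all(w == 0.0 for w in row) for row in W_pruned)
print(f"{zero_rows}/{len(W)} rows zeroed")  # 3/10 at 30% sparsity
```

Flags like `--preserve-heads` would exempt chosen structures (here, attention heads 0-7) from the candidate set before ranking.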

Run comprehensive benchmark suite on target hardware

$ logix benchmark --suite mmlu,hellaswag,medqa --device m4-pro

Package and deploy to NVIDIA Orin via OTA

$ logix deploy --target nvidia_orin --runtime tensorrt --package ota