Now in Beta - Start Free Today

GPU Inference Without the GPU

Run PyTorch inference on powerful remote GPUs from any device. A $6 VPS. A Raspberry Pi. A Chromebook. If it runs Python, it runs ML now.

Signups temporarily disabled
See the Code Comparison
inference.py
# No GPU on your machine? No problem.
import remotorch

# Connect to a remote RTX 4090
remotorch.connect(
    api_key="rk_...",
    gpu_type="rtx4090"  # Choose your GPU
)

# Use PyTorch as normal - tensors live on remote GPU
x = remotorch.tensor([1.0, 2.0, 3.0])
y = remotorch.tensor([4.0, 5.0, 6.0])
z = x + y  # Computed on remote GPU

print(z.cpu())  # tensor([5., 7., 9.])

GPU Options & Pricing

Choose the right GPU for your workload. Pay only for what you use.

GPU Inventory Coming Soon

We're bringing more GPUs online. Check back soon!

Everything You Need for Remote Inference

Focus on your models, not infrastructure

Instant Access

Connect to a GPU in seconds. No provisioning, no waiting for instances to boot. Just connect and compute.

PyTorch Compatible

Use familiar PyTorch syntax. Your existing code works with minimal changes. All standard operations supported.
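To illustrate the "minimal changes" claim: the page documents connect(), tensor(), and .cpu(); treating remotorch as torch-compatible for everything else is this sketch's assumption. One way to keep code portable is to write against either module:

```python
def run_inference(torch_like):
    """Accepts either the local `torch` module or `remotorch` --
    the tensor-building and arithmetic code is identical."""
    x = torch_like.tensor([1.0, 2.0, 3.0])
    y = torch_like.tensor([4.0, 5.0, 6.0])
    return (x + y).cpu()  # with remotorch, the add runs on the remote GPU

# Hypothetical remote usage (needs a live API key, so not executed here):
#   import remotorch
#   remotorch.connect(api_key="rk_...", gpu_type="rtx4090")
#   run_inference(remotorch)
# The identical call works locally with `import torch`.
```

The only remote-specific lines are the import and the one-time connect() call; everything downstream stays plain PyTorch-style code.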

Secure & Isolated

Each session runs in an isolated container. Your models and data are private and encrypted in transit.

Model Storage

Upload your models once, load them instantly. No need to transfer weights every session.

Usage Dashboard

Track GPU hours, monitor sessions, and manage API keys from a beautiful dashboard.

Pay As You Go

Prepaid balance, billed per minute. No subscriptions, no idle charges. Your balance never expires.

Run ML From Literally Anywhere

Your code runs locally. The GPU computation happens remotely. No CUDA installation. No GPU drivers. No expensive hardware.

$6/mo VPS

Deploy an ML-powered API on the cheapest DigitalOcean or Hetzner droplet. No GPU instance needed.

Raspberry Pi

Run image classification, object detection, or LLM inference from a $35 Pi at the edge.

MacBook Air

Develop and test ML models on your laptop without worrying about CUDA compatibility.

AWS Lambda

Add GPU inference to serverless functions. No cold starts loading massive models.

CI/CD Pipelines

Run ML tests in GitHub Actions without expensive GPU runners. Pay only for actual compute.

That Old Laptop

Dust off your 2015 ThinkPad. If it runs Python, it can now run GPU inference.

Chromebook

Run Stable Diffusion from a $200 Chromebook with Linux enabled. Seriously.

Tiny Containers

Ship 50MB Docker images instead of 15GB GPU containers. No nvidia-docker needed.

The Math Makes Sense

Why pay $300+/month for a GPU instance that sits idle 90% of the time? With Remotorch, you only pay for actual GPU seconds used.

  • $6 droplet + Remotorch beats a $300 GPU instance for bursty workloads
  • No CUDA driver headaches, no PyTorch version conflicts
  • Scale to zero when not in use - no idle costs
Monthly cost comparison for 10 hrs/day inference:

  • AWS p3.2xlarge (reserved): $918/mo
  • Lambda Labs A10: $219/mo
  • Remotorch + $6 VPS: ~$50/mo*

*Based on pay-as-you-go pricing with typical inference workloads. Your usage may vary.
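As a worked version of the comparison above: the $0.50/hr RTX 4090 rate comes from the FAQ, but the ~30% GPU-busy fraction is purely an assumption here (bursty inference rarely keeps the GPU saturated for the full 10 hours, and that fraction depends entirely on your workload):

```python
# Back-of-envelope monthly cost under stated assumptions.
HOURS_PER_DAY = 10          # wall-clock hours your service runs inference
DAYS = 30
RATE_PER_HOUR = 0.50        # RTX 4090 rate (from the FAQ)
GPU_BUSY_FRACTION = 0.30    # assumed utilization; adjust for your workload
VPS_MONTHLY = 6.00          # cheapest droplet hosting the API

gpu_cost = HOURS_PER_DAY * DAYS * GPU_BUSY_FRACTION * RATE_PER_HOUR
total = gpu_cost + VPS_MONTHLY
print(f"~${total:.0f}/mo")  # ~$51/mo vs. $918/mo for an always-on instance
```

An always-on GPU instance bills for every hour whether busy or idle; pay-per-use only charges the busy fraction, which is where the gap comes from.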

How It Works

Get running in under 5 minutes

1. Install the Package

pip install git+https://github.com/vocalogic/remotorch-py.git

View on GitHub

2. Get Your API Key

Sign up for free and create an API key from your dashboard.

3. Start Computing

Connect and run your PyTorch code on remote GPUs instantly.

Pay As You Go

No subscriptions. No commitments. Only pay for what you use.

Add Funds

Top up your account with $10, $25, $50, or $100. Your balance never expires.

Use GPUs

Run your workloads on any available GPU. Billed per minute of actual usage.

Auto-Refresh

Optionally enable auto-refresh to keep your balance topped up automatically.

Recommended

Prepaid Balance

$0 minimum

Add funds and pay only for what you use. No monthly fees.

What's included

  • Per-minute billing
  • Access to all GPU types
  • Multiple concurrent sessions
  • Model storage included

Top-up amounts

$10
$25
$50
$100
Signups temporarily disabled

Enterprise

Custom

For teams with high-volume needs

  • Volume discounts
  • Dedicated GPUs
  • SLA & priority support
  • Invoice billing
  • On-premise option
Contact Sales

Frequently Asked Questions

How does billing work?

Add funds to your account balance, then use GPUs as needed. Your balance is deducted based on actual GPU time used, billed in 1-minute increments with a 1-minute minimum. Check your balance and transaction history anytime in your dashboard.
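The billing rule above (1-minute increments, 1-minute minimum) as a small sketch; the function name and the 4-decimal rounding are illustrative choices, not part of any documented API:

```python
import math

def session_cost(seconds_used: float, rate_per_hour: float) -> float:
    """Cost of one session: billed in 1-minute increments
    with a 1-minute minimum, per the billing description above."""
    billed_minutes = max(1, math.ceil(seconds_used / 60))
    return round(billed_minutes * rate_per_hour / 60, 4)

# A 95-second session on a $0.50/hr GPU bills as 2 minutes:
print(session_cost(95, 0.50))   # 0.0167
# A 10-second session still bills the 1-minute minimum:
print(session_cost(10, 0.50))   # 0.0083
```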

What's the minimum balance to start a session?

You need enough balance to cover at least 10 minutes of GPU time for the GPU type you're using. For example, an RTX 4090 at $0.50/hr requires about $0.08 minimum balance ($0.50 × 10/60).
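The minimum-balance rule is just 10 minutes of GPU time at the session's hourly rate; the helper below is a sketch of that arithmetic, not a documented API:

```python
def minimum_balance(rate_per_hour: float, minutes: int = 10) -> float:
    """Smallest balance needed to start a session:
    `minutes` of GPU time at the given hourly rate."""
    return round(rate_per_hour * minutes / 60, 2)

print(minimum_balance(0.50))  # 0.08 -- RTX 4090 at $0.50/hr
```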

Does my balance expire?

No, your balance never expires. Add funds whenever you need them and use at your own pace.

What is auto-refresh?

Auto-refresh automatically tops up your balance when it drops below a threshold you set. This ensures your workloads are never interrupted due to insufficient funds.

Ready to Run Inference at Scale?

Join developers using Remotorch for their ML workloads.

Signups temporarily disabled