HASP Public API — HIPAA-compliant AI inference for developers

  import AI from "@your-ai/sdk";

  const client = new AI({
-   apiKey: process.env.PROVIDER_API_KEY,
+   apiKey: process.env.HASP_API_KEY,
+   baseURL: "https://api.usehasp.com",
  });

  const msg = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });

  // msg.meta, added by the HASP gateway:
  // { billing: { credits_used: 312, cap: 500_000 },
  //   audit:   { entry_id: "ae_01j…", signed: true },
  //   phi:     { scanned: true, detected: 0 } }

What those 2 lines add

Compliance · frameworks we ship under

Framework	Detail	Status / scope
HIPAA	BAA included	Active
SOC 2 Type II	Inherited	AOC under NDA
HITRUST r2	Inherited	AOC under NDA
GDPR	Art. 17 · 20 · 30	EU + UK
PIPEDA / CPPA	Canada	Active + CPPA-ready
ISO 27001	In progress	Q3 2026

Architecture

What runs between your code and the model

Every request routes through the HASP AI Gateway before it reaches an inference provider. The gateway enforces controls server-side, so your client code stays free of PHI-scanning, budget, and allowlist logic. Provider routing happens under the gateway: pick any supported model as your default, and HASP automatically fails over to a different provider on the same BAA when yours is unavailable.

Why we don't lock you to a single provider →

Request: Your code → PHI Scan → Budget & Rate → Model Allowlist → LLM

Response: Your code ← Audit Log ← Credit Track ← Pass-through ← LLM

All gateway steps are logged to your tenant's signed audit chain. The PHI scan result, budget deduction, and rate-limit verdict each produce an individual audit entry — not just the final response.
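As a rough illustration of the paragraph above, the sketch below models each gateway step appending its own entry to a hash-linked chain. Everything in it is hypothetical: the `AuditEntry` shape, the step names, and the FNV-1a hash (a non-cryptographic stand-in used only to keep the example dependency-free) are assumptions, not HASP's audit format or signing scheme.

```typescript
// Toy model of a per-step audit chain. Field names and the hash function are
// illustrative assumptions, not HASP's real audit format.
type Step = "phi_scan" | "budget" | "rate_limit" | "response";

interface AuditEntry {
  seq: number;
  step: Step;
  verdict: string;
  prevHash: string;
  hash: string;
}

// FNV-1a 32-bit. NOT cryptographic: a real chain would use a cryptographic
// hash plus a signature over each entry.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

function entryHash(seq: number, step: Step, verdict: string, prevHash: string): string {
  return fnv1a(`${seq}|${step}|${verdict}|${prevHash}`);
}

// Appends one entry whose hash covers the previous entry's hash.
function appendEntry(chain: AuditEntry[], step: Step, verdict: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : "genesis";
  const seq = chain.length + 1;
  const hash = entryHash(seq, step, verdict, prevHash);
  return [...chain, { seq, step, verdict, prevHash, hash }];
}

// Recomputes every link; any edited, reordered, or dropped entry breaks it.
function verifyChain(chain: AuditEntry[]): boolean {
  let prevHash = "genesis";
  return chain.every((e, i) => {
    const linked = e.seq === i + 1 && e.prevHash === prevHash;
    const intact = e.hash === entryHash(e.seq, e.step, e.verdict, e.prevHash);
    prevHash = e.hash;
    return linked && intact;
  });
}

// One request yields one entry per gateway step, not a single summary entry.
let chain: AuditEntry[] = [];
chain = appendEntry(chain, "phi_scan", "detected=0");
chain = appendEntry(chain, "budget", "deducted=312");
chain = appendEntry(chain, "rate_limit", "allowed");
chain = appendEntry(chain, "response", "ok");
console.log(verifyChain(chain));
```

The point of the design is that each verdict can be audited on its own: a reviewer can check the PHI scan entry without trusting the final response entry.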

Endpoints

Two surfaces, one key

The SDK-compatible endpoint lets you migrate by changing only the base URL and API key. The native endpoints give you higher-level operations with server-side PHI defaults already configured.

POST /v1/messages

@anthropic-ai/sdk compatible

Wire-compatible with @anthropic-ai/sdk. Change baseURL and your key. Response shape is identical; meta fields are additive and non-breaking.

Identical parameters to the upstream SDK
Streaming SSE supported
Tool use, vision, extended thinking all work

POST /v1/ai/chat

Native workflow endpoints

Higher-level endpoint. Org prompt templates, PHI handling defaults, and conversation threading are enforced server-side, so there is less to get wrong in integration code.

Org-published prompt templates applied automatically
Conversation history managed server-side
PHI policy (redact / allow / block) per org

Response

Standard response, non-breaking additions

The /v1/messages response is wire-compatible with leading AI SDKs. The meta key is additive — existing code ignores it, observability tooling reads it.

{
  "id":           "msg_01xNSa7Cdmrg…",
  "type":         "message",
  "role":         "assistant",
  "content":      [ { "type": "text", "text": "…" } ],
  "model":        "claude-sonnet-4-6",
  "stop_reason":  "end_turn",
  "usage":        { "input_tokens": 48, "output_tokens": 312 },

  // "meta" is additive — existing SDK code ignores it
  "meta": {
    "billing": {
      "credits_used":      312,
      "credits_remaining": 847688,
      "cap":               500000
    },
    "audit": {
      "entry_id":  "ae_01j_7xN4Rb…",
      "chain_seq": 1847,
      "signed":    true
    },
    "phi": {
      "scanned":  true,
      "detected": 0,
      "action":   null
    }
  }
}
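A minimal sketch of how client code might read the additive meta block without breaking when it is absent. The field names mirror the sample response above; which fields are optional is an assumption, and `readMeta` and `summarizeMeta` are hypothetical helpers, not part of any SDK.

```typescript
// Shapes mirror the sample response; optionality is an assumption.
interface HaspMeta {
  billing?: { credits_used: number; credits_remaining?: number; cap?: number };
  audit?: { entry_id: string; chain_seq?: number; signed: boolean };
  phi?: { scanned: boolean; detected: number; action: string | null };
}

// Returns the meta block if present, or undefined for a response that has none
// (for example, one that came straight from an upstream provider).
function readMeta(response: unknown): HaspMeta | undefined {
  if (typeof response !== "object" || response === null) return undefined;
  const meta = (response as { meta?: unknown }).meta;
  return typeof meta === "object" && meta !== null ? (meta as HaspMeta) : undefined;
}

// Hypothetical helper: the fields an observability pipeline might log.
function summarizeMeta(response: unknown): string {
  const meta = readMeta(response);
  if (!meta) return "no meta";
  const credits = meta.billing ? meta.billing.credits_used : "n/a";
  const phi = meta.phi ? meta.phi.detected : "n/a";
  const audit = meta.audit ? meta.audit.entry_id : "n/a";
  return `credits=${credits} phi_detected=${phi} audit=${audit}`;
}

const sample = {
  id: "msg_example",
  type: "message",
  meta: {
    billing: { credits_used: 312, credits_remaining: 847688, cap: 500000 },
    audit: { entry_id: "ae_example", chain_seq: 1847, signed: true },
    phi: { scanned: true, detected: 0, action: null },
  },
};

console.log(summarizeMeta(sample)); // credits=312 phi_detected=0 audit=ae_example
console.log(summarizeMeta({ id: "msg_plain" })); // no meta
```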

Keys & Controls

Per-key configuration.
Not per-account.

Issue, scope, and rotate keys per organization. Optional IP allowlists on Enterprise. Every request authenticated, every response carrying meta.usage and meta.billing.

Budget cap — set a per-key monthly spend ceiling. Alerts fire at thresholds you choose. Limit enforced at the gateway.
Rate limits — 30–2,000 RPM depending on plan, scoped per key. 429 returns with Retry-After header.
Model allowlist — restrict which models the key can access. Heavy-tier models are off-by-default; opt in per org.
IP allowlists — optionally restrict key usage to specific IP ranges. Enterprise tier.
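The rate-limit behavior above (a 429 with a Retry-After header) can be handled with a small client-side helper. This is a sketch under assumptions: the header is parsed as either delta-seconds or an HTTP-date, which is what the HTTP specification allows, and the `send` callback with its `{ status, retryAfter }` result is a hypothetical stand-in for whatever HTTP client you use.

```typescript
// Converts a Retry-After header value into milliseconds to wait.
// Accepts delta-seconds ("30") or an HTTP-date; falls back when missing or unparseable.
function retryAfterMs(header: string | null, now: number = Date.now(), fallbackMs = 1000): number {
  if (header === null) return fallbackMs;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const date = Date.parse(trimmed);
  if (Number.isNaN(date)) return fallbackMs;
  return Math.max(0, date - now);
}

interface SendResult {
  status: number;
  retryAfter: string | null;
}

// Retries only on 429, waiting for the server-advised delay between attempts.
// `send` is a hypothetical callback that performs one request.
async function withRateLimitRetry(
  send: () => Promise<SendResult>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => {
      setTimeout(resolve, ms);
    }),
): Promise<SendResult> {
  let result = await send();
  for (let attempt = 1; attempt < maxAttempts && result.status === 429; attempt++) {
    await sleep(retryAfterMs(result.retryAfter));
    result = await send();
  }
  return result;
}
```

Because limits are scoped per key, retrying with the same key after the advised delay is the relevant case; switching keys to dodge the limit is a different policy question.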

Example key: Northbridge Systems · Production (Active)
Key: sk_live_••••••••••••••••••••••dXpq
Budget: $91 / $500 per month
Rate limit: 60 RPM
Models: claude-haiku-4-5, claude-sonnet-4-6

Models

All BAA-covered models

Model IDs match the upstream provider's. Token allotments are denominated in base-model units — fast/cheap tiers consume less, heavy tiers consume more. Allowlist enforcement happens at the gateway; unlisted model requests return 403 MODEL_NOT_ALLOWED. The full catalog of supported models is on the models page.

Model ID	Model	Credit multiplier	Default access	BAA covered
claude-sonnet-4-6	Claude Sonnet 4.6	1×	Included	✓
claude-haiku-4-5	Claude Haiku 4.5	0.3×	Included	✓
claude-opus-4-6	Claude Opus 4.6	1.7×	Opt-in per org	✓
claude-opus-4-7	Claude Opus 4.7	1.7×	Opt-in per org	✓
gpt-5.5	GPT-5.5	1.7×	Opt-in per org	✓
gpt-5.5-pro	GPT-5.5 Pro	10×	Opt-in per org	✓
gpt-5.4	GPT-5.4	0.8×	Included	✓
gpt-5.4-mini	GPT-5.4 mini	0.25×	Included	✓
gpt-5.3-codex	GPT-5.3 Codex	0.6×	Included	✓

Model IDs are pinned — HASP notifies 90 days before deprecation and keeps old IDs live with forwarding until the cutover date. No silent breaking changes.
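To make the credit multipliers concrete, here is a small estimator. It assumes base-model units scale linearly as billable tokens times the multiplier; which tokens count as billable and how results are rounded are not stated on this page, so treat the output as an estimate. `baseModelUnits` is a hypothetical helper.

```typescript
// Multipliers copied from the model table on this page.
const CREDIT_MULTIPLIER: Record<string, number> = {
  "claude-sonnet-4-6": 1,
  "claude-haiku-4-5": 0.3,
  "claude-opus-4-6": 1.7,
  "claude-opus-4-7": 1.7,
  "gpt-5.5": 1.7,
  "gpt-5.5-pro": 10,
  "gpt-5.4": 0.8,
  "gpt-5.4-mini": 0.25,
  "gpt-5.3-codex": 0.6,
};

// Assumption: units consumed = billable tokens x multiplier (no rounding applied).
function baseModelUnits(modelId: string, billableTokens: number): number {
  const multiplier = CREDIT_MULTIPLIER[modelId];
  if (multiplier === undefined) throw new Error(`unknown model: ${modelId}`);
  return billableTokens * multiplier;
}

console.log(baseModelUnits("claude-sonnet-4-6", 1000)); // 1000
console.log(baseModelUnits("gpt-5.4-mini", 1000)); // 250
console.log(baseModelUnits("gpt-5.5-pro", 1000)); // 10000
```

Under this assumption, the same 1,000 tokens draw 40 times more from an allotment on gpt-5.5-pro than on gpt-5.4-mini, which is why heavy-tier models are opt-in.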

Pricing

Three published tiers.
15% off annual.

Developer · Growth · Scale — published, self-serve, with generous token allotments. Enterprise pricing is volume-negotiated. 15% annual discount applies to every published tier.

Developer 1.5M tokens included

Growth 7.5M tokens included

Scale 15M tokens included

See pricing — 15% off annual →

Everything in the API

SDK-compatible /v1/messages

Change the base URL and the API key. Existing @anthropic-ai/sdk code keeps working. PHI scanning, audit logging, and credit accounting happen at the Gateway — invisible to your code.

Native /v1/ai/* workflow endpoints

Higher-level endpoints (chat, documents, summarize) where prompt templates and PHI-handling defaults are enforced server-side per organization policy. Less to get wrong in the integration code.

Per-org API keys with rotation

Issue, scope, and rotate keys per organization. Optional IP allowlists on Enterprise. Every request authenticated, every response carrying meta.usage and meta.billing.

Streaming SSE responses

Standard SSE on both SDK-compat and native endpoints. Stream tokens to your UI; the audit entry lands when the stream completes.
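If you consume the stream without an SDK, the standard SSE framing can be parsed along these lines. This is a generic sketch of the SSE wire format (frames separated by a blank line, `event:` and `data:` fields), not HASP-specific code; the event names and JSON payloads you will see depend on the endpoint and are not shown here. `parseSse` is a hypothetical helper.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses complete SSE frames from a text buffer. Returns the parsed events plus
// any trailing partial frame, which the caller prepends to the next chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? "";
  const events: SseEvent[] = [];
  for (const frame of frames) {
    let event = "message"; // default event type when no event: field is sent
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}

const chunk = 'event: ping\ndata: {"a":1}\n\ndata: hello\ndata: world\n\ndata: par';
const parsed = parseSse(chunk);
console.log(parsed.events.length, JSON.stringify(parsed.rest)); // 2 "data: par"
```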

Webhooks for AI events

Subscribe to chat-completed, document-ingested, phi-detected, and audit-export-requested events. Each outbound delivery is itself recorded on the audit chain, so you can prove the webhook fired.
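A receiver for these events might route on the event name as sketched below. The four event names come from the text above; the `{ type, payload }` envelope, the action labels, and the `routeWebhook` helper are all hypothetical, and signature verification of incoming deliveries is not covered because its scheme is not described here.

```typescript
// Event names are from this page; everything else is an illustrative assumption.
type WebhookEventType =
  | "chat-completed"
  | "document-ingested"
  | "phi-detected"
  | "audit-export-requested";

interface WebhookEvent {
  type: WebhookEventType;
  payload: Record<string, unknown>;
}

// Hypothetical mapping from event type to the action a receiver takes.
const ACTIONS: Record<WebhookEventType, string> = {
  "chat-completed": "log-usage",
  "document-ingested": "index-document",
  "phi-detected": "alert-compliance",
  "audit-export-requested": "prepare-export",
};

// Narrowing guard: rejects unknown event names and malformed bodies.
function isWebhookEvent(value: unknown): value is WebhookEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { type?: unknown; payload?: unknown };
  return (
    typeof v.type === "string" &&
    Object.prototype.hasOwnProperty.call(ACTIONS, v.type) &&
    typeof v.payload === "object" &&
    v.payload !== null
  );
}

function routeWebhook(value: unknown): string {
  return isWebhookEvent(value) ? ACTIONS[value.type] : "ignored";
}

console.log(routeWebhook({ type: "phi-detected", payload: {} })); // alert-compliance
console.log(routeWebhook({ type: "unknown-event", payload: {} })); // ignored
```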

Budget controls on your terms

Configure org-level spend caps per meter, or per-entity caps for individual users, apps, and API keys. Email + webhook + UI alerts at any thresholds you pick. Set a cap to hard-stop at your limit, or leave it off to let usage flow into metered overage.
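The two cap modes described above (hard stop versus metered overage) can be modeled as a small decision function. This is a toy model for reasoning about the behavior, not the gateway's implementation; the `BudgetPolicy` shape and `applyCharge` are hypothetical, and thresholds are expressed as fractions of the cap by assumption.

```typescript
interface BudgetPolicy {
  cap: number; // spend ceiling for the period
  hardStop: boolean; // true: block at the cap; false: allow metered overage
  alertThresholds: number[]; // fractions of the cap, e.g. [0.5, 0.8]
}

interface BudgetDecision {
  allowed: boolean;
  overage: number; // amount beyond the cap after this charge
  alerts: number[]; // thresholds newly crossed by this charge
}

// Decides whether a charge goes through and which alerts it triggers.
function applyCharge(policy: BudgetPolicy, spent: number, charge: number): BudgetDecision {
  const next = spent + charge;
  if (policy.hardStop && next > policy.cap) {
    return { allowed: false, overage: 0, alerts: [] };
  }
  const alerts = policy.alertThresholds.filter(
    (t) => spent < t * policy.cap && next >= t * policy.cap,
  );
  return { allowed: true, overage: Math.max(0, next - policy.cap), alerts };
}

const hard: BudgetPolicy = { cap: 500, hardStop: true, alertThresholds: [0.5, 0.8] };
const soft: BudgetPolicy = { cap: 500, hardStop: false, alertThresholds: [0.5, 0.8] };

console.log(applyCharge(hard, 91, 200)); // allowed, crosses the 0.5 threshold
console.log(applyCharge(hard, 490, 20)); // blocked by the hard stop
console.log(applyCharge(soft, 490, 20)); // allowed with overage of 10
```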

Why the API tier, on this substrate

HASP matches the leading HIPAA AI API tier-for-tier on base price, with 50% more included tokens at every tier.
Provider failover under the gateway — when one BAA-covered provider has an outage, requests route to another on the same BAA chain. No code change, no manual cutover.
Compliance posture inherited end-to-end; pass-through trust artifacts mean your customers' security reviews go faster.

FAQ

Yes — change the base URL and API key, and existing code keeps working. HASP also offers higher-level workflow endpoints where PHI controls and prompt templates are enforced server-side. Full API reference is in the developer docs.

Per-org RPM at the Gateway, scaling with your plan tier. Enterprise limits are custom. Rate-limit events come back as 429 with a Retry-After header; the event itself is logged. Current limits per tier are on the pricing page.

Q: Are the responses identical to the upstream provider's?

On /v1/messages, response shape and field names are compatible with the leading AI provider APIs. We add meta.usage and meta.billing fields for observability; these are additive, not breaking.

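Q: What happens when a model is deprecated?

You get a 90-day deprecation notice, after which requests for the model return 410 MODEL_RETIRED. New model IDs surface in the docs and in the admin model-allowlist UI before the deprecation takes effect.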
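Client code can branch on the gateway statuses mentioned on this page. The status codes and code strings (429, 403 MODEL_NOT_ALLOWED, 410 MODEL_RETIRED) come from the text; how the code string is carried in the error body is not specified, so this sketch takes it as an explicit argument. The action labels and `classifyGatewayResponse` are hypothetical.

```typescript
type GatewayAction =
  | "ok"
  | "retry-later"
  | "request-allowlist-change"
  | "migrate-model"
  | "fail";

// Maps a gateway status (and optional error code) to a suggested client action.
function classifyGatewayResponse(status: number, code?: string): GatewayAction {
  if (status >= 200 && status < 300) return "ok";
  if (status === 429) return "retry-later"; // honor Retry-After before retrying
  if (status === 403 && code === "MODEL_NOT_ALLOWED") return "request-allowlist-change";
  if (status === 410 && code === "MODEL_RETIRED") return "migrate-model";
  return "fail";
}

console.log(classifyGatewayResponse(410, "MODEL_RETIRED")); // migrate-model
console.log(classifyGatewayResponse(403, "MODEL_NOT_ALLOWED")); // request-allowlist-change
```

Treating 410 as a signal to migrate, and never as something to retry, keeps a retired model from turning into a retry loop.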

Yes. Those are the API-first plans (Starter through Scale). They include the admin UI for keys, usage, audit export, billing, and BAA status, but no Assistant chat UI or Studio app-builder. Platform plans bundle all of those.

View all frequently asked questions →

The other product pages

API plans bundle Public API + Agent SDK + admin UX for keys, usage, audit export, billing, and BAA workflows — built for developers integrating HASP into their own software. Platform plans bundle Assistant chat + Studio for teams using AI in their day-to-day work. Same compliance substrate, same audit chain across both. Orgs can hold one plan or both.

Chat & documents

A HIPAA-ready chat interface and document analysis tool for your whole team. Ask questions, get summaries, upload files — all with PHI scanning built in and every action on your audit trail.

View Chat & documents →

AI Studio

Describe the internal tool you need and watch it build live. No developers required — AI Studio generates a working app inside HASP, audited from the first keystroke.

View AI Studio →

Agent SDK

Connect external agents, automation pipelines, and A2A-protocol clients to HASP's policy gate. Every tool invocation is authorized, identity-scoped, and recorded to the signed audit chain — whether the caller is a human, a Studio app, or a fully autonomous agent.

View Agent SDK →

Audit & Trust

A tamper-evident record of every action across every surface — signed, chained, and independently verifiable. The thing procurement teams stop scrolling for.

View Audit & Trust →

Evaluate with a free key today.

Free Evaluation includes a gateway-scoped API key plus admin metering views. PHI remains blocked until BAAs are finalized. See the pricing page for current evaluation limits, and convert to unlock production posture.

Talk to Sales