Product · Public API

Your existing AI SDK code,
now HIPAA-compliant.

Change the base URL and the API key. Existing @anthropic-ai/sdk code keeps working. PHI scanning, audit logging, and credit accounting happen at the Gateway — invisible to your code.

What those 2 lines add

Compliance · frameworks we ship under
HIPAA: BAA included (Active)
SOC 2 Type II: inherited (AOC under NDA)
HITRUST r2: inherited (AOC under NDA)
GDPR: Art. 17 · 20 · 30 (EU + UK)
PIPEDA / CPPA: Canada (Active + CPPA-ready)
ISO 27001: in progress (Q3 2026)

Architecture

What runs between your code and the model

Every request routes through the HASP AI Gateway before it reaches an inference provider. The gateway enforces controls server-side, so your client code stays clean. Provider routing happens behind the gateway: pick any supported model as your default, and HASP automatically fails over to a different provider covered by the same BAA when yours is unavailable.

Request:  Your code → PHI Scan → Budget & Rate → Model Allowlist → LLM
Response: Your code ← Audit Log ← Credit Track ← Pass-through ← LLM

All gateway steps are logged to your tenant's signed audit chain. The PHI scan result, budget deduction, and rate-limit verdict each produce an individual audit entry — not just the final response.
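
The request path above can be modeled as a short chain of checks that each emit their own audit entry. This is a minimal sketch for reasoning about the flow, not HASP's implementation: every type and function name is hypothetical, the PHI scanner is a toy regular expression, and a PHI hit is treated as a block even though a redact or allow policy would behave differently.

```typescript
// Illustrative model of the request path. All names are hypothetical,
// not HASP's actual implementation or API.
type Step = "phi_scan" | "budget_rate" | "model_allowlist";

interface AuditEntry {
  seq: number;
  step: Step;
  verdict: "pass" | "block";
}

interface GatewayRequest {
  model: string;
  text: string;
  estimatedCredits: number;
}

interface KeyPolicy {
  allowedModels: string[];
  creditsRemaining: number;
  containsPhi: (text: string) => boolean; // stand-in for the real PHI scanner
}

// Runs the checks in diagram order. Each check appends its own audit entry,
// and the first failing check stops the request before it reaches the model.
function runRequestPath(
  req: GatewayRequest,
  policy: KeyPolicy
): { forwarded: boolean; audit: AuditEntry[] } {
  const checks: Array<[Step, () => boolean]> = [
    ["phi_scan", () => !policy.containsPhi(req.text)],
    ["budget_rate", () => req.estimatedCredits <= policy.creditsRemaining],
    ["model_allowlist", () => policy.allowedModels.indexOf(req.model) !== -1],
  ];
  const audit: AuditEntry[] = [];
  for (const [step, check] of checks) {
    const ok = check();
    audit.push({ seq: audit.length + 1, step, verdict: ok ? "pass" : "block" });
    if (!ok) return { forwarded: false, audit };
  }
  return { forwarded: true, audit };
}

const demoPolicy: KeyPolicy = {
  allowedModels: ["claude-haiku-4-5", "claude-sonnet-4-6"],
  creditsRemaining: 1000,
  containsPhi: (text) => /\b\d{3}-\d{2}-\d{4}\b/.test(text), // toy SSN-like pattern
};

const allowed = runRequestPath(
  { model: "claude-sonnet-4-6", text: "Draft a reminder.", estimatedCredits: 50 },
  demoPolicy
);
const rejected = runRequestPath(
  { model: "claude-opus-4-7", text: "Draft a reminder.", estimatedCredits: 50 },
  demoPolicy
);
```

In this model the rejected request still leaves three audit entries (two passes and one block), which mirrors the idea that each step is logged on its own rather than only the final outcome.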

Endpoints

Two surfaces, one key

The SDK-compatible endpoint lets you migrate with no code changes beyond the base URL and API key. The native endpoints give you higher-level operations with server-side PHI defaults already configured.

POST /v1/messages · @anthropic-ai/sdk compatible

Wire-compatible with @anthropic-ai/sdk. Change baseURL and your key. Response shape is identical; meta fields are additive and non-breaking.

  • Identical parameters to the upstream SDK
  • Streaming SSE supported
  • Tool use, vision, extended thinking all work
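
To make the two-value change concrete, here is a minimal sketch that builds (but does not send) a POST /v1/messages request. The gateway origin and key below are placeholders, and the `x-api-key` and `anthropic-version` headers are an assumption carried over from the upstream Messages API on the strength of the wire-compatibility claim; confirm both against the developer docs.

```typescript
// Placeholder values: substitute the base URL and key issued for your org.
const GATEWAY_BASE_URL = "https://gateway.example.com";
const GATEWAY_API_KEY = "sk_live_your_key_here";

interface MessagesRequestBody {
  model: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
  stream?: boolean;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a request description without performing any network I/O.
function buildMessagesRequest(
  baseURL: string,
  apiKey: string,
  body: MessagesRequestBody
): PreparedRequest {
  return {
    url: `${baseURL.replace(/\/+$/, "")}/v1/messages`, // tolerate a trailing slash
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey, // assumed: same auth header as upstream
      "anthropic-version": "2023-06-01", // assumed: upstream version header is accepted
    },
    body: JSON.stringify(body),
  };
}

const prepared = buildMessagesRequest(GATEWAY_BASE_URL, GATEWAY_API_KEY, {
  model: "claude-sonnet-4-6",
  max_tokens: 256,
  messages: [{ role: "user", content: "Summarize this visit note." }],
});

// With the SDK itself, the equivalent change is two constructor options:
//   new Anthropic({ baseURL: GATEWAY_BASE_URL, apiKey: GATEWAY_API_KEY })
```

Passing `prepared.url` and the remaining fields to `fetch` would send the request; everything else in an existing integration stays as it was.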

POST /v1/ai/chat · Native workflow endpoints

Higher-level endpoint. Org prompt templates, PHI handling defaults, and conversation threading enforced server-side — less to get wrong in integration code.

  • Org-published prompt templates applied automatically
  • Conversation history managed server-side
  • PHI policy (redact / allow / block) per org

Response

Standard response, non-breaking additions

The /v1/messages response is wire-compatible with leading AI SDKs. The meta key is additive — existing code ignores it, observability tooling reads it.

response.jsonc
{
  "id":           "msg_01xNSa7Cdmrg…",
  "type":         "message",
  "role":         "assistant",
  "content":      [ { "type": "text", "text": "…" } ],
  "model":        "claude-sonnet-4-6",
  "stop_reason":  "end_turn",
  "usage":        { "input_tokens": 48, "output_tokens": 312 },

  // "meta" is additive — existing SDK code ignores it
  "meta": {
    "billing": {
      "credits_used":      312,
      "credits_remaining": 847688,
      "cap":               500000
    },
    "audit": {
      "entry_id":  "ae_01j_7xN4Rb…",
      "chain_seq": 1847,
      "signed":    true
    },
    "phi": {
      "scanned":  true,
      "detected": 0,
      "action":   null
    }
  }
}
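
For observability code that wants to read the additive fields, the sample above can be typed roughly as follows. This is a sketch inferred from that one sample: the field list may be incomplete, and every `meta` member is marked optional so the same code also tolerates responses that carry no `meta` at all.

```typescript
// Types inferred from the sample response; treat them as a starting point.
interface GatewayMeta {
  billing?: { credits_used: number; credits_remaining: number; cap: number };
  audit?: { entry_id: string; chain_seq: number; signed: boolean };
  phi?: { scanned: boolean; detected: number; action: string | null };
}

interface MessageResponse {
  id: string;
  usage: { input_tokens: number; output_tokens: number };
  meta?: GatewayMeta;
}

// Produces a one-line summary suitable for a log line or a metrics tag.
function summarizeMeta(res: MessageResponse): string {
  const m = res.meta;
  if (!m) return "no gateway meta";
  const parts: string[] = [];
  if (m.billing) parts.push(`credits_used=${m.billing.credits_used}`);
  if (m.audit) parts.push(`audit_seq=${m.audit.chain_seq}`);
  if (m.phi) parts.push(`phi_detected=${m.phi.detected}`);
  return parts.join(" ");
}

const sampleResponse: MessageResponse = {
  id: "msg_example", // placeholder id
  usage: { input_tokens: 48, output_tokens: 312 },
  meta: {
    billing: { credits_used: 312, credits_remaining: 847688, cap: 500000 },
    audit: { entry_id: "ae_example", chain_seq: 1847, signed: true },
    phi: { scanned: true, detected: 0, action: null },
  },
};

const summary = summarizeMeta(sampleResponse);
```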

Keys & Controls

Per-key configuration.
Not per-account.

Issue, scope, and rotate keys per organization. Optional IP allowlists on Enterprise. Every request authenticated, every response carrying meta.usage and meta.billing.

  • Budget cap: set a per-key monthly spend ceiling. Alerts fire at thresholds you choose. The limit is enforced at the gateway.
  • Rate limits: 30–2,000 RPM depending on plan, scoped per key. Requests over the limit return 429 with a Retry-After header.
  • Model allowlist: restrict which models the key can access. Heavy-tier models are off by default; opt in per org.
  • IP allowlists: optionally restrict key usage to specific IP ranges. Enterprise tier only.

Northbridge Systems · Production · Active
Key: sk_live_••••••••••••••••••••••dXpq
Budget: $91 / $500 per month
Rate limit: 60 RPM
Models: claude-haiku-4-5, claude-sonnet-4-6
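
A client that hits the per-key rate limit needs to turn the Retry-After header into a wait time. The helper below is a sketch: the function name and fallback parameter are hypothetical, and because the page does not say which form the gateway sends, it accepts both forms the HTTP specification allows (delay in seconds, or an HTTP date).

```typescript
// Converts a Retry-After header value into a wait in milliseconds.
// Returns fallbackMs when the header is absent or unparseable.
function retryAfterMs(headerValue: string | null, now: Date, fallbackMs: number): number {
  if (headerValue === null) return fallbackMs;
  const trimmed = headerValue.trim();
  if (trimmed === "") return fallbackMs;
  // Form 1: a non-negative integer number of seconds.
  if (/^\d+$/.test(trimmed)) return parseInt(trimmed, 10) * 1000;
  // Form 2: an HTTP date; wait until that instant, never a negative time.
  const when = Date.parse(trimmed);
  if (isNaN(when)) return fallbackMs;
  return Math.max(0, when - now.getTime());
}

const waitFromSeconds = retryAfterMs("30", new Date(0), 1000);
const waitFromDate = retryAfterMs("Thu, 01 Jan 1970 00:00:10 GMT", new Date(0), 1000);
const waitFallback = retryAfterMs(null, new Date(0), 1000);
```

Passing the current time in explicitly keeps the function deterministic, which makes retry logic easier to unit-test.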

Models

All BAA-covered models

Model IDs match the upstream provider's. Token allotments are denominated in base-model units — fast/cheap tiers consume less, heavy tiers consume more. Allowlist enforcement happens at the gateway; unlisted model requests return 403 MODEL_NOT_ALLOWED. The full catalog of supported models is on the models page.

Model ID             Name                 Credit multiplier   Default access    BAA covered
claude-sonnet-4-6    Claude Sonnet 4.6                        Included          Yes
claude-haiku-4-5     Claude Haiku 4.5     0.3×                Included          Yes
claude-opus-4-6      Claude Opus 4.6      1.7×                Opt-in per org    Yes
claude-opus-4-7      Claude Opus 4.7      1.7×                Opt-in per org    Yes
gpt-5.5              GPT-5.5              1.7×                Opt-in per org    Yes
gpt-5.5-pro          GPT-5.5 Pro          10×                 Opt-in per org    Yes
gpt-5.4              GPT-5.4              0.8×                Included          Yes
gpt-5.4-mini         GPT-5.4 mini         0.25×               Included          Yes
gpt-5.3-codex        GPT-5.3 Codex        0.6×                Included          Yes

Model IDs are pinned — HASP notifies 90 days before deprecation and keeps old IDs live with forwarding until the cutover date. No silent breaking changes.
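
As a rough illustration of how multipliers affect an allotment, the sketch below estimates base-model units for a request. The formula (tokens times multiplier, rounded to the nearest unit) is an assumption, since the page does not spell out the exact accounting, and `baseUnits` is a hypothetical helper name. Only models with a listed multiplier are included.

```typescript
// Multipliers copied from the model table. claude-sonnet-4-6 is left out
// because the table lists no multiplier for it.
const CREDIT_MULTIPLIER: Record<string, number> = {
  "claude-haiku-4-5": 0.3,
  "claude-opus-4-6": 1.7,
  "claude-opus-4-7": 1.7,
  "gpt-5.5": 1.7,
  "gpt-5.5-pro": 10,
  "gpt-5.4": 0.8,
  "gpt-5.4-mini": 0.25,
  "gpt-5.3-codex": 0.6,
};

// Assumed formula: units = tokens × multiplier, rounded to the nearest unit.
function baseUnits(model: string, tokens: number): number {
  if (!Object.prototype.hasOwnProperty.call(CREDIT_MULTIPLIER, model)) {
    throw new Error(`no multiplier listed for ${model}`);
  }
  return Math.round(tokens * CREDIT_MULTIPLIER[model]);
}

const miniUnits = baseUnits("gpt-5.4-mini", 1000);
const proUnits = baseUnits("gpt-5.5-pro", 1000);
```

Under this assumption the same 1,000 tokens draw 40 times more from the allotment on gpt-5.5-pro than on gpt-5.4-mini, which is the practical reason heavy-tier models are opt-in.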

Pricing

Three published tiers.
15% off annual.

Developer · Growth · Scale — published, self-serve, with generous token allotments. Enterprise pricing is volume-negotiated. 15% annual discount applies to every published tier.

Developer 1.5M tokens included
Growth 7.5M tokens included
Scale 15M tokens included

Everything in the API

Native /v1/ai/* workflow endpoints

Higher-level endpoints (chat, documents, summarize) where prompt templates and PHI-handling defaults are enforced server-side per organization policy. Less to get wrong in the integration code.

Streaming SSE responses

Standard SSE on both SDK-compat and native endpoints. Stream tokens to your UI; the audit entry lands when the stream completes.
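
For readers who consume the stream without the SDK, this sketch extracts assistant text from SSE frames. It assumes the SDK-compatible endpoint passes through the upstream event shape (`content_block_delta` events carrying a `text_delta`), which follows from the wire-compatibility claim but is not shown on this page; the native endpoints may use a different event shape. It also works on an already buffered body, so a real client would feed it frames incrementally.

```typescript
// Collects assistant text from a fully buffered SSE body.
function collectText(sseBody: string): string {
  let text = "";
  // Frames are separated by a blank line.
  for (const frame of sseBody.split(/\r?\n\r?\n/)) {
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      // Per the SSE format, "data:" may be followed by one optional space.
      if (line.indexOf("data:") === 0) dataLines.push(line.slice(5).replace(/^ /, ""));
    }
    if (dataLines.length === 0) continue;
    let payload: any;
    try {
      payload = JSON.parse(dataLines.join("\n"));
    } catch (_err) {
      continue; // skip frames whose data is not JSON
    }
    if (
      payload &&
      payload.type === "content_block_delta" &&
      payload.delta &&
      payload.delta.type === "text_delta" &&
      typeof payload.delta.text === "string"
    ) {
      text += payload.delta.text;
    }
  }
  return text;
}

const sampleStream = [
  "event: content_block_delta",
  'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}',
  "",
  "event: content_block_delta",
  'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}',
  "",
  "event: message_stop",
  'data: {"type":"message_stop"}',
  "",
  "",
].join("\n");

const streamedText = collectText(sampleStream);
```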

Webhooks for AI events

Subscribe to chat-completed, document-ingested, phi-detected, and audit-export-requested events. Each outbound delivery is itself recorded on the audit chain, so you can prove the webhook fired.
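
On the receiving side, a handler can narrow incoming event names to the four listed here before dispatching. The event names come from this page; the envelope shape (`event` plus `data`) and the function names are hypothetical, since the payload format is not documented in this section.

```typescript
// Event names as listed on this page.
const WEBHOOK_EVENTS = [
  "chat-completed",
  "document-ingested",
  "phi-detected",
  "audit-export-requested",
] as const;
type WebhookEvent = (typeof WEBHOOK_EVENTS)[number];

// Hypothetical envelope: the real payload shape may differ.
interface WebhookEnvelope {
  event: string;
  data: unknown;
}

type Handlers = { [E in WebhookEvent]?: (data: unknown) => string };

function isWebhookEvent(name: string): name is WebhookEvent {
  return (WEBHOOK_EVENTS as readonly string[]).indexOf(name) !== -1;
}

// Routes an envelope to its handler and reports what happened.
function dispatch(envelope: WebhookEnvelope, handlers: Handlers): string {
  const name = envelope.event;
  if (!isWebhookEvent(name)) return "ignored: unknown event";
  const handler = handlers[name];
  return handler ? handler(envelope.data) : "ignored: no handler";
}

const handlers: Handlers = {
  "phi-detected": () => "alerted compliance channel",
};

const handled = dispatch({ event: "phi-detected", data: {} }, handlers);
const unknown = dispatch({ event: "something-else", data: {} }, handlers);
```

Ignoring unknown event names rather than failing keeps the receiver forward-compatible if new event types are added later.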

Budget controls on your terms

Configure org-level spend caps per meter, or per-entity caps for individual users, apps, and API keys. Email + webhook + UI alerts at any thresholds you pick. Set a cap to hard-stop at your limit, or leave it off to let usage flow into metered overage.
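
The difference between a hard-stop cap and metered overage can be sketched as a small pure function. All names and the configuration shape below are hypothetical, and thresholds are expressed as fractions of the cap purely for illustration; the actual configuration lives in the admin UI.

```typescript
// Hypothetical cap configuration for one entity (org, user, app, or key).
interface BudgetCap {
  limit: number; // spend ceiling for the period
  alertThresholds: number[]; // fractions of the limit, e.g. 0.5 for 50%
  hardStop: boolean; // true: block at the limit; false: allow metered overage
}

interface SpendStatus {
  alerts: number[]; // thresholds that have been reached
  blocked: boolean; // whether further requests would be refused
  overage: number; // spend beyond the limit when overage is allowed
}

function evaluateSpend(spent: number, cap: BudgetCap): SpendStatus {
  const alerts = cap.alertThresholds.filter((t) => spent >= t * cap.limit);
  const over = Math.max(0, spent - cap.limit);
  return {
    alerts,
    blocked: cap.hardStop && spent >= cap.limit,
    overage: cap.hardStop ? 0 : over,
  };
}

const underCap = evaluateSpend(91, { limit: 500, alertThresholds: [0.5, 0.8], hardStop: true });
const metered = evaluateSpend(520, { limit: 500, alertThresholds: [0.5, 0.8], hardStop: false });
const stopped = evaluateSpend(500, { limit: 500, alertThresholds: [0.5, 0.8], hardStop: true });
```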

Why the API tier, on this substrate

FAQ

Existing SDK code keeps working: change the base URL and API key. HASP also offers higher-level workflow endpoints where PHI controls and prompt templates are enforced server-side. The full API reference is in the developer docs.

Rate limits are per-org RPM at the Gateway and scale with your plan tier. Enterprise limits are custom. Rate-limit events come back as 429 with a Retry-After header, and the event itself is logged. Current limits per tier are on the pricing page.

On /v1/messages, the response shape and field names are compatible with the leading AI provider APIs. We add meta.usage and meta.billing fields for observability; these are additive, not breaking.

Deprecated models get a 90-day deprecation notice, then return 410 MODEL_RETIRED. New model IDs surface in the docs and in the admin model-allowlist UI before deprecation lands.

API-only use is covered by the API-first plans (Developer through Scale). They include the admin UI for keys, usage, audit export, billing, and BAA status, but no Assistant chat UI or Studio app-builder. Platform plans bundle all of those.

The other product pages

API plans bundle Public API + Agent SDK + admin UX for keys, usage, audit export, billing, and BAA workflows — built for developers integrating HASP into their own software. Platform plans bundle Assistant chat + Studio for teams using AI in their day-to-day work. Same compliance substrate, same audit chain across both. Orgs can hold one plan or both.

Evaluate with a free key today.

Free Evaluation includes a gateway-scoped API key plus admin metering views. PHI remains blocked until the BAA is finalized. See the pricing page for current evaluation limits, and convert to unlock production use.