Batch-processing charts with an astrology API

To batch-process birth charts with the Vedika astrology API you issue many independent requests against POST /api/v1/astrology/query (for AI interpretations) or the /v2/astrology/* compute endpoints (for raw chart data), driven by a bounded client-side worker pool. There is no single multi-chart endpoint; instead you control concurrency, attach an Idempotency-Key per request so retries never double-charge, and back off when rate-limit headers tell you to. This guide walks through a production-grade batch pipeline end to end.

Why client-side batching, not a batch endpoint

It is tempting to want a single /batch route that swallows a thousand charts and returns a thousand results. In practice, per-request batching on the client is the more robust design for astrology workloads, and it is the pattern Vedika is built around. Each chart is an independent unit of work: it can succeed, fail, or need a retry on its own, without dragging the rest of the run with it.

The Vedika API exposes 700+ operations across 25 domains (704 enumerated as of June 2026), and a batch job often mixes them — a Vedic kundli here, a Western natal wheel there, a KP significator lookup for a third record. Client-side fan-out lets you compose exactly the operations you need per row and stream results back as they complete, rather than waiting for the slowest chart in a server-side batch.

Independent retries — one malformed birth record does not poison the whole batch.
Backpressure you control — you set the worker count and tune it live against rate-limit headers.
Mixed systems in one run — Vedic, Western, and KP requests interleave freely.
Streaming results — write each chart to your store as it returns, so a crash at row 9,000 keeps the first 8,999.
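The streaming-results point can be sketched as an append-only JSON Lines log. This is a minimal sketch rather than anything the Vedika API prescribes: the file layout, the record fields (`id`, `ok`), and the helper names `append_result` and `read_results` are all hypothetical.

```python
import json
import os


def append_result(path, record):
    """Append one result as a JSON line and force it to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the row survives a crash right after this call


def read_results(path):
    """Load every complete record written so far; tolerate a torn last line."""
    records = []
    if not os.path.exists(path):
        return records
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # a crash mid-write can leave a truncated line; skip it
    return records
```

Because each row is flushed before the next one is processed, a crash loses at most the row that was in flight, and a restart can read the log to see which ids are already done.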

Choosing the right endpoint for the job

Pick the endpoint that matches what you actually need out of each chart. The two families behave differently in latency and cost, and using the wrong one at scale is the most common way batch jobs get slow and expensive.

| Need | Endpoint | Returns |
| --- | --- | --- |
| Natural-language reading per chart | `POST /api/v1/astrology/query` | Vedika AI interpretation grounded in computed facts |
| Same, but token-streamed | `POST /api/v1/astrology/query/stream` | Server-Sent Events stream |
| Raw computed data (positions, dashas, yogas) | `/v2/astrology/*` | Structured JSON, no AI cost |

For bulk enrichment — say, computing the Moon nakshatra and Vimshottari dasha for every user in a database — the /v2/astrology/* compute endpoints are the right tool. They are faster and cheaper per call because there is no language generation step. Reserve /api/v1/astrology/query for the rows that genuinely need prose, and pass speed: "fast" when a quicker, lighter interpretation is acceptable. Note the request shapes differ: the v1 query nests birth data under birthDetails, while the v2 compute endpoints take flat datetime, latitude, longitude, and timezone fields.
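The difference in request shapes is easy to get wrong when one stored record feeds both families. The sketch below assumes a flat stored record (a hypothetical storage shape) and builds the two bodies described above. The helper names are made up for illustration, and individual v2 operations may need extra fields that this guide does not list, so check the docs for the specific route.

```python
def to_v1_query(record, question, speed=None):
    """Body for POST /api/v1/astrology/query: birth data nested under birthDetails."""
    body = {
        "question": question,
        "birthDetails": {
            "datetime": record["datetime"],
            "latitude": record["latitude"],
            "longitude": record["longitude"],
            "timezone": record["timezone"],
        },
    }
    if speed is not None:
        body["speed"] = speed  # e.g. "fast" for a lighter interpretation
    return body


def to_v2_compute(record):
    """Body for a /v2/astrology/* compute call: the same four fields, flat."""
    return {
        "datetime": record["datetime"],
        "latitude": record["latitude"],
        "longitude": record["longitude"],
        "timezone": record["timezone"],
    }
```

Keeping both builders next to each other makes it harder to send a nested body to a compute route by accident.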

A single request, the right way

Before fanning out, get one request correct. Authentication is a single header, x-api-key: vk_live_*, and the base URL is https://api.vedika.io. Always send an explicit IANA timezone — the chart's ascendant depends on it, and a missing or wrong zone silently produces the wrong rising sign.

curl -s https://api.vedika.io/api/v1/astrology/query \
  -H "x-api-key: vk_live_xxx" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: chart-00421-v1" \
  -d '{
    "question": "Summarize career indications from the 10th house and its lord.",
    "birthDetails": {
      "datetime": "1990-08-15T07:42:00",
      "latitude": 19.0760,
      "longitude": 72.8777,
      "timezone": "Asia/Kolkata"
    },
    "speed": "fast"
  }'

The Idempotency-Key is the load-bearing detail for batch work. Derive it deterministically from the inputs, for example a hash of the operation name, the birth details, and anything else that changes the response (such as the question text), so that a retried request carries the same key and two different requests never share one. The server returns the original response for a repeated key without billing again, which is what makes re-running a half-finished batch safe.
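One way to make the key deterministic is to hash a canonical serialization of the request. This is a sketch under assumptions: the `operation:digest` format, the 32-character truncation, and the function name are illustrative choices, and the key length or character set the server accepts is not stated in this guide.

```python
import hashlib
import json


def idempotency_key(operation, payload):
    """Derive a stable key from the operation name and the full request payload."""
    # sort_keys and fixed separators give the same bytes regardless of dict order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{operation}\n{canonical}".encode("utf-8")).hexdigest()
    return f"{operation}:{digest[:32]}"
```

Hashing the whole payload means a changed question or coordinate produces a new key, while a plain retry reproduces the old one. Normalize types before hashing: the string "19.0760" and the number 19.076 serialize differently and would yield different keys.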

Building the batch runner

The core of a batch job is a bounded worker pool that pulls from a queue of charts, respects rate limits, and writes each result as it lands. The example below uses the standard fetch API (available globally in Node.js 18 and later) and a fixed concurrency; it works the same whether your rows come from a CSV, a database cursor, or a message queue.

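import { createHash } from "node:crypto"; // ES module syntax; use require() under CommonJS

const BASE = "https://api.vedika.io";
const API_KEY = process.env.VEDIKA_API_KEY; // vk_live_*
const CONCURRENCY = 6;

// Stable key per logical request: operation + birth details + question.
// Hashing keeps the header short and ASCII-only even for long or non-ASCII questions.
function idemKey(row) {
  const { datetime, latitude, longitude, timezone } = row.birthDetails;
  const raw = ["query", datetime, latitude, longitude, timezone, row.question].join("|");
  return "q:" + createHash("sha256").update(raw).digest("hex").slice(0, 24);
}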

async function callOne(row, attempt = 1) {
  const res = await fetch(`${BASE}/api/v1/astrology/query`, {
    method: "POST",
    headers: {
      "x-api-key": API_KEY,
      "Content-Type": "application/json",
      "Idempotency-Key": idemKey(row),
    },
    body: JSON.stringify({
      question: row.question,
      birthDetails: row.birthDetails,
      speed: "fast",
    }),
  });

  if (res.status === 429 || res.status >= 500) {
    if (attempt > 5) throw new Error(`giving up after ${attempt} tries`);
    const retryAfter = Number(res.headers.get("retry-after")) || 0;
    const backoff = retryAfter * 1000 || Math.min(2 ** attempt * 250, 8000);
    const jitter = Math.random() * 250;
    await new Promise((r) => setTimeout(r, backoff + jitter));
    return callOne(row, attempt + 1);
  }
  if (!res.ok) throw new Error(`HTTP ${res.status} for row ${row.id}`);
  return res.json();
}

async function runBatch(rows, onResult) {
  const queue = [...rows];
  async function worker() {
    let row;
    while ((row = queue.shift())) {
      try {
        const data = await callOne(row);
        await onResult({ id: row.id, ok: true, data });
      } catch (err) {
        await onResult({ id: row.id, ok: false, error: String(err) });
      }
    }
  }
  await Promise.all(
    Array.from({ length: CONCURRENCY }, worker)
  );
}

Two design choices matter here. First, onResult is called per row, so you persist incrementally and never hold the full result set in memory. Second, failures are captured per row rather than thrown out of the whole run — you finish the batch, then re-feed only the failed rows, and idempotency keys make that re-feed free of double charges.
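The re-feed step can be sketched as a small filter over the recorded outcomes. It assumes each recorded result carries the `id` and `ok` fields passed to onResult above; the function name `rows_to_retry` is hypothetical.

```python
def rows_to_retry(rows, results):
    """Return input rows whose latest recorded outcome is a failure, or that have none."""
    latest = {}
    for rec in results:  # later records override earlier ones for the same id
        latest[rec["id"]] = bool(rec["ok"])
    return [row for row in rows if not latest.get(row["id"], False)]
```

Feeding the returned rows back through the same runner repeats only the unfinished work, and because each row regenerates the same idempotency key, rows that had actually succeeded server-side are not billed again.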

The same loop in Python

import asyncio, hashlib, os, random, httpx

BASE = "https://api.vedika.io"
API_KEY = os.environ["VEDIKA_API_KEY"]
CONCURRENCY = 6

def idem_key(row):
    # Stable key per logical request: operation + birth details + question.
    bd = row["birthDetails"]
    parts = ["query", bd["datetime"], bd["latitude"], bd["longitude"], bd["timezone"], row["question"]]
    raw = "|".join(str(p) for p in parts)
    return "q:" + hashlib.sha256(raw.encode()).hexdigest()[:24]

async def call_one(client, row, attempt=1):
    r = await client.post(
        f"{BASE}/api/v1/astrology/query",
        headers={"x-api-key": API_KEY, "Idempotency-Key": idem_key(row)},
        json={"question": row["question"], "birthDetails": row["birthDetails"], "speed": "fast"},
    )
    if r.status_code == 429 or r.status_code >= 500:
        if attempt > 5:
            r.raise_for_status()
        # Retry-After may be absent or an HTTP-date; fall back to exponential backoff.
        hint = r.headers.get("retry-after", "")
        wait = float(hint) if hint.isdigit() and int(hint) > 0 else min(2 ** attempt * 0.25, 8)
        await asyncio.sleep(wait + random.uniform(0, 0.25))  # jitter, as in the JS version
        return await call_one(client, row, attempt + 1)
    r.raise_for_status()
    return r.json()

async def run_batch(rows, on_result):
    sem = asyncio.Semaphore(CONCURRENCY)
    async with httpx.AsyncClient(timeout=60) as client:
        async def work(row):
            async with sem:
                try:
                    await on_result(row["id"], True, await call_one(client, row))
                except Exception as e:
                    await on_result(row["id"], False, str(e))
        await asyncio.gather(*(work(r) for r in rows))

Rate limits, retries, and backpressure

Every response carries rate-limit headers; read them instead of guessing. When you receive a 429, honor the Retry-After value rather than retrying immediately, and apply exponential backoff with jitter for 5xx responses so a transient blip does not turn into a thundering herd. The worker-pool shape above naturally provides backpressure: with a fixed pool of six, you never have more than six requests in flight, so concurrency is bounded by construction rather than by hope.
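The retry timing described above can be isolated into a pure function, which makes it easy to test. This sketch mirrors the 250 ms base, 8 s cap, and additive jitter used in the runner. The function names and the `rng` parameter are illustrative, and handling the HTTP-date form of Retry-After is an addition based on the HTTP specification, not something this guide says the API emits.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Parse Retry-After as delay-seconds or an HTTP-date; None if absent or invalid."""
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def retry_delay(attempt, retry_after=None, base=0.25, cap=8.0, jitter=0.25, rng=random.random):
    """Seconds to sleep before retry number `attempt` (1-based)."""
    hinted = retry_after_seconds(retry_after)
    # Prefer the server's hint; otherwise double the wait each attempt up to the cap.
    backoff = hinted if hinted is not None else min(cap, base * 2 ** attempt)
    return backoff + rng() * jitter  # jitter spreads out workers that failed together
```

Injecting `rng` keeps the function deterministic under test while production code uses the default random source.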

Tune concurrency against your plan. A higher tier raises the request ceiling, so the same code runs a larger batch faster simply by lifting CONCURRENCY and re-checking the headers. If you are validating throughput before committing, the free sandbox mirrors the request and response shapes with no key required, so you can wire up and exercise the runner before pointing it at production.
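If you would rather not tune the worker count by hand, one option is an additive-increase, multiplicative-decrease controller driven by observed 429s. This is a technique swapped in for illustration, not something the Vedika docs describe. The class name, defaults, and thresholds are assumptions, and the object only tracks a target: the worker pool still has to consult `limit` before starting new requests.

```python
class AdaptiveLimit:
    """Additive-increase, multiplicative-decrease target for in-flight requests."""

    def __init__(self, start=4, floor=1, ceiling=8, grow_after=50):
        self.limit = start
        self.floor = floor
        self.ceiling = ceiling
        self.grow_after = grow_after  # clean responses required before adding a worker
        self._streak = 0

    def on_success(self):
        """Record a non-throttled response; grow by one after a clean streak."""
        self._streak += 1
        if self._streak >= self.grow_after and self.limit < self.ceiling:
            self.limit += 1
            self._streak = 0

    def on_rate_limited(self):
        """Record a 429; halve the target, never dropping below the floor."""
        self.limit = max(self.floor, self.limit // 2)
        self._streak = 0
```

The asymmetry is deliberate: the target creeps up slowly while responses are clean and drops quickly on a 429, so the run settles just under the plan's ceiling.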

Cost control at scale

Per-query pricing runs $0.01–$0.05 depending on the operation and path, so at those rates a batch of ten thousand charts costs roughly $100 to $500. That figure stays predictable only if you keep retries from re-billing. Three habits keep the meter honest:

Idempotency on every write-equivalent call. A retried request with the same key is not charged twice.
Compute endpoints for data, AI for prose. Routing bulk enrichment through /v2/astrology/* avoids paying for language generation you do not need.
Pre-validate inputs. Reject rows with missing timezone or out-of-range coordinates before they hit the API, so you never spend a call on a request that cannot produce a correct chart.
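The pre-validation habit can be sketched with the standard library alone. The function name and the exact messages are illustrative, and the checks cover only the four birth fields named in this guide; the API may enforce further rules of its own.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def _in_range(value, low, high):
    """True for real numbers (not bools, not NaN) within [low, high]."""
    return isinstance(value, (int, float)) and not isinstance(value, bool) and low <= value <= high


def validate_birth_details(bd):
    """Return a list of problems; an empty list means the row is safe to send."""
    problems = []
    try:
        datetime.fromisoformat(str(bd.get("datetime", "")))
    except ValueError:
        problems.append("datetime is missing or not ISO 8601")
    if not _in_range(bd.get("latitude"), -90, 90):
        problems.append("latitude must be a number in [-90, 90]")
    if not _in_range(bd.get("longitude"), -180, 180):
        problems.append("longitude must be a number in [-180, 180]")
    tz = bd.get("timezone")
    if not isinstance(tz, str) or not tz:
        problems.append("timezone is missing")
    else:
        try:
            ZoneInfo(tz)  # requires the IANA database (system tzdata or the tzdata package)
        except (KeyError, ValueError, OSError):
            problems.append(f"timezone {tz!r} is not a known IANA zone")
    return problems
```

Run it over the whole input before starting the pool and route rows with a non-empty problem list to a reject file, so they never consume a billable call.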

Plans range from Starter at $12/mo through Professional at $60, Business at $120, and Enterprise at $240, with the higher tiers adding the fast path and voice. For a sense of where this sits, general-purpose astrology API providers such as Prokerala, AstrologyAPI.com, and RoxyAPI publish entry plans in a similar monthly range and cover solid chart computation; Vedika's batch story leans on three systems in one API, deterministic idempotency, and an in-house ephemeris. See pricing for the current breakdown.

Why the underlying numbers are stable across a batch

Batch jobs are only as trustworthy as the engine behind them. Vedika computes charts on the XALEN Ephemeris, its own open-source astronomical engine (Apache-2.0, published to crates.io, PyPI, and npm, with roughly 2,200 tests). It has been validated against JPL DE440 and swetest, with zero charts deviating beyond 0.1° across a reproducible JPL DE440 benchmark — an astronomical-precision result, distinct from any claim about interpretive accuracy. For a ten-thousand-row batch, that consistency means the planetary positions feeding every interpretation are computed the same way every time, so two identical birth records always yield identical underlying math.

On the interpretive side, astrological statements in Vedika AI output trace to classical sources actually taught in formal Jyotish, KP, and Western training — texts such as Brihat Parashara Hora Shastra, Phaladeepika, and the KP Readers — rather than generic paraphrase. That matters in batch work because every row in your output carries the same sourcing discipline.

Key facts

Batch charts via client-side fan-out to POST /api/v1/astrology/query (AI) or /v2/astrology/* (compute); there is no single multi-chart endpoint.
Authenticate with x-api-key: vk_live_* against https://api.vedika.io.
Attach a deterministic Idempotency-Key per request so retries never double-charge.
Honor Retry-After on 429; use exponential backoff with jitter on 5xx.
Use compute endpoints for bulk data and reserve AI queries for prose to control cost ($0.01–$0.05 per query).
Three systems — Vedic, Western, and KP — can be mixed in one batch run.
Charts are computed on the open-source XALEN Ephemeris, validated to within 0.1° against JPL DE440 across a 5M-chart test.

Where to go next

Prototype the runner against the no-key sandbox, confirm your idempotency keys are stable, then point the same code at production with a vk_live_* key. Full request and response schemas, including the v2 compute families, are in the docs. If your batch needs streaming interpretations rather than batched JSON, the SSE variant at /api/v1/astrology/query/stream slots into the same worker-pool pattern with a streaming reader per request.
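For the streaming variant, the reader's job is to turn a line stream into events. The sketch below follows the generic Server-Sent Events framing (data lines, blank-line terminators, colon-prefixed comments). What Vedika puts inside each data payload is not described in this guide, so the parser hands back raw strings and leaves decoding to the caller; the function name is hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each complete Server-Sent Event from text lines."""
    buf = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if buf:
                yield "\n".join(buf)
                buf = []
            continue
        if line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the SSE format strips one leading space
        if field == "data":
            buf.append(value)
    # Per the SSE spec, an event not terminated by a blank line is discarded.
```

Any iterable of lines works as input, for example the line iterator of a streaming HTTP response, so one reader per in-flight request fits inside the same bounded pool.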

FAQ

Does the Vedika API have a single batch endpoint?

No. You batch on the client side by issuing concurrent calls to POST /api/v1/astrology/query or /v2/astrology/* with a bounded worker pool, which keeps each chart independently retryable.

How do I avoid double-charging when a batch job retries?

Send a stable Idempotency-Key header per logical request, derived from the chart's birth details and operation. A retry with the same key returns the original result without billing again.

What concurrency should I use?

Start with 4–8 workers and tune against the rate-limit headers. Respect Retry-After on a 429 and raise concurrency only while you stay clear of your plan's limit.

Query endpoint or V2 compute for bulk work?

Use /v2/astrology/* for raw computed data such as positions, dashas, and yogas — it is faster and cheaper per call. Use /api/v1/astrology/query when each chart needs a natural-language interpretation.