Frequently Asked Questions

Everything you need to know about Verbalyze — our products, integration, compliance, and pricing.

Product & Capabilities

What languages does Verbalyze support?

Verbalyze supports 30+ Indian languages and dialects including Hindi, Tamil, Telugu, Kannada, Malayalam, Marathi, Bengali, Gujarati, Punjabi, Odia, Urdu, Assamese, Maithili, Bhojpuri, Rajasthani, Haryanvi, and more. We also handle code-switching, such as Hinglish (Hindi-English) and Tanglish (Tamil-English), which generic global models often struggle with.

What is the speech recognition accuracy?

Our Hindi ASR achieves 3.2% Word Error Rate (WER) on a standard benchmark dataset — among the best reported for Hindi. Accuracy varies per language and domain; BFSI and healthcare domain-fine-tuned models typically outperform generic models by 15–30% on domain vocabulary.
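For readers unfamiliar with the metric: WER is conventionally the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words, so 3.2% corresponds to roughly 3 errors per 100 reference words. The sketch below illustrates that standard definition only. The function name is illustrative and is not part of any Verbalyze SDK, and real evaluations usually normalise punctuation and casing first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, a four-word reference transcribed with one substituted word and one dropped word gives a WER of 2 / 4 = 0.5.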

What is the end-to-end latency?

Our streaming ASR delivers first-word results in under 90ms. Full end-to-end voice agent round-trip (ASR → LLM → TTS) completes in under 600ms on our cloud infrastructure. On-premise latency depends on hardware configuration.

Does Verbalyze support real-time streaming?

Yes. We provide a WebSocket-based streaming ASR endpoint (wss://api.verbalyze.in/v2/stt/stream) for real-time transcription as audio is being spoken. This is used for live call transcription, agent assist, and voice agent applications.

Integration & APIs

How do I get an API key?

Sign up at verbalyze.in/contact to request API access. Our team will provision a sandbox account with 10,000 free API minutes for evaluation. Production keys are provisioned after onboarding.

What SDKs are available?

We provide official SDKs for Python (pip install verbalyze) and Node.js (npm install @verbalyze/sdk). The REST API can also be called directly, for example with cURL. SDKs for Go and Java are on the roadmap for Q3 2026.

Can I self-host Verbalyze on my own infrastructure?

Yes. Our Self-Hosted LLM and on-premise ASR/TTS products are designed for enterprise deployments where data sovereignty is critical. We provide Docker containers, Kubernetes Helm charts, and dedicated deployment engineering support.

What audio formats are supported?

We accept WAV, MP3, FLAC, OGG, M4A, and raw PCM audio. For streaming, we accept 16-bit PCM at 16kHz mono. Automatic format detection is available for batch transcription requests.
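As an illustration of the streaming format requirement, the sketch below checks that a WAV file is 16-bit PCM at 16kHz mono and splits it into fixed-duration raw PCM chunks using only the Python standard library. The 100ms chunk size and the function name are assumptions made for this example; they are not documented Verbalyze parameters, and the sketch does not show how chunks are sent to the streaming endpoint.

```python
import io
import wave
from typing import Iterator

SAMPLE_RATE = 16_000  # Hz, as stated in the FAQ
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono


def pcm_frames(wav_bytes: bytes, frame_ms: int = 100) -> Iterator[bytes]:
    """Yield raw PCM chunks of frame_ms each from an in-memory WAV file.

    Raises ValueError if the audio is not 16-bit, 16 kHz, mono.
    The 100 ms default is an assumed chunk size, not a documented one.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        actual = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        if actual != (SAMPLE_RATE, SAMPLE_WIDTH, CHANNELS):
            raise ValueError("expected 16-bit PCM, 16 kHz, mono")
        samples_per_chunk = SAMPLE_RATE * frame_ms // 1000
        while True:
            chunk = wav.readframes(samples_per_chunk)
            if not chunk:
                break
            yield chunk
```

With these settings each full chunk holds 1,600 samples (3,200 bytes); the final chunk may be shorter.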

Compliance & Security

Is Verbalyze compliant with India's DPDP Act 2023?

Yes. Verbalyze is built for compliance with the Digital Personal Data Protection (DPDP) Act from the ground up. This includes automatic PII redaction (Aadhaar, PAN, phone numbers), explicit consent management APIs, data localization in Indian data centres, and audit trail logging.

Does Verbalyze comply with RBI telemarketing guidelines?

Yes. Our Voice Agent platform enforces RBI-mandated calling windows (9AM–7PM IST), automatically cross-checks recipient numbers against the DND (Do Not Disturb) registry, and maintains encrypted consent records for every outbound interaction.
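The pre-call check described above can be pictured roughly as follows. This is a minimal sketch, not the platform's implementation: the function name is hypothetical, the DND registry is stood in for by an in-memory set, and treating 7PM as an exclusive boundary is an assumption.

```python
from datetime import datetime, time, timedelta, timezone

# IST is UTC+05:30 with no daylight saving, so a fixed offset is sufficient.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")
WINDOW_START = time(9, 0)  # 9AM IST, from the FAQ
WINDOW_END = time(19, 0)   # 7PM IST, from the FAQ (treated as exclusive here)


def may_place_call(now_utc: datetime, number: str, dnd_registry: set[str]) -> bool:
    """Return True only inside the calling window and for non-DND numbers.

    dnd_registry is a stand-in for a real DND registry lookup.
    """
    if now_utc.tzinfo is None:
        raise ValueError("now_utc must be timezone-aware")
    local_time = now_utc.astimezone(IST).time()
    in_window = WINDOW_START <= local_time < WINDOW_END
    return in_window and number not in dnd_registry
```

Passing the current time in explicitly, rather than reading the clock inside the function, keeps the check easy to test at the window boundaries.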

Where is my data stored?

All customer data is stored in India: in the AWS Mumbai region for cloud deployments, or on your own hardware for on-premise deployments. We do not transfer audio or transcription data outside India by default. For on-premise deployments, all data stays within your own infrastructure.

Pricing & Plans

How is Verbalyze priced?

We offer usage-based pricing per minute of audio processed, with volume discounts at scale. Enterprise annual contracts are available with committed usage and SLA guarantees. Visit verbalyze.in/pricing for detailed tier information.

Is there a free trial?

Yes. Every new account gets 10,000 free API minutes for evaluation — no credit card required. The sandbox includes access to all core products: ASR, TTS, and basic Voice Agent capabilities.

Call Analytics & Data

What structured data does each call produce?

Every call processed by Verbalyze produces a structured JSON telemetry record containing: call_id, duration, per-speaker transcripts with timecodes, diarization statistics (agent vs. customer talktime %), sentiment_vectors (trajectory, peak frustration timestamp, resolution confidence), intent_flags (e.g. payment_inquiry, competitor_mention, solution_offered), pii_redacted field list, and a compliance audit object (disclosure_verified, script_adherence_score, DND check result). This payload is delivered via REST webhook or S3 auto-sync within 800ms of call end.
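To give a feel for consuming such a record, here is a hedged sketch of a webhook-side parser. The top-level names call_id, duration, intent_flags, pii_redacted, disclosure_verified and script_adherence_score come from the description above; the key name compliance_audit, the nesting, the value types, and the sample values are all assumptions, so check them against a real payload before relying on them.

```python
import json
from dataclasses import dataclass

# Hypothetical payload: field names follow the FAQ where it gives them; the
# "compliance_audit" key and all nested shapes and values are assumptions.
SAMPLE_RECORD = json.dumps({
    "call_id": "call-0001",
    "duration": 182.4,
    "intent_flags": ["payment_inquiry", "solution_offered"],
    "pii_redacted": ["aadhaar", "mobile"],
    "compliance_audit": {
        "disclosure_verified": True,
        "script_adherence_score": 0.92,
    },
})


@dataclass
class CallSummary:
    call_id: str
    duration_s: float
    intents: list[str]
    redacted_field_count: int
    disclosure_verified: bool


def summarize(raw: str) -> CallSummary:
    """Extract a few headline fields from one telemetry record (JSON text)."""
    record = json.loads(raw)
    audit = record.get("compliance_audit", {})
    return CallSummary(
        call_id=record["call_id"],
        duration_s=float(record["duration"]),
        intents=list(record.get("intent_flags", [])),
        redacted_field_count=len(record.get("pii_redacted", [])),
        disclosure_verified=bool(audit.get("disclosure_verified", False)),
    )
```

Calling summarize(SAMPLE_RECORD) returns a CallSummary for the sample call. Optional sections fall back to empty defaults, so a record without intents or redactions still parses.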

How is sentiment scoring computed?

Sentiment is computed over 10-second sliding prosody windows throughout the call. We analyse pitch variance, speech rate, pause distribution, and energy intensity alongside language-model semantic scoring. The result is a per-second sentiment index (−1.0 to +1.0), which is aggregated into trajectory labels (e.g. 'negative → positive') and used to trigger real-time escalation rules (e.g. a sentiment index below −0.35 sustained for more than 20 seconds).
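The example escalation rule can be sketched as a simple run-length check over the per-second index. This assumes one score per second and reads "for more than 20 seconds" as more than 20 consecutive seconds below the threshold; the function name and defaults are illustrative, not a Verbalyze API.

```python
from typing import Optional, Sequence


def first_escalation_second(
    scores: Sequence[float],
    threshold: float = -0.35,
    min_run_s: int = 20,
) -> Optional[int]:
    """Return the second at which a below-threshold run first exceeds
    min_run_s consecutive seconds, or None if that never happens.

    Assumes scores holds one sentiment value per second of the call.
    """
    run = 0
    for second, score in enumerate(scores):
        # Extend the current run, or reset it when sentiment recovers.
        run = run + 1 if score < threshold else 0
        if run > min_run_s:
            return second
    return None
```

A call that stays at −0.5 from second 5 onward would trigger at second 25, the 21st consecutive second below the threshold. A brief recovery above −0.35 resets the count.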

Can I query historical call analytics?

Yes. Verbalyze delivers structured call intelligence records through three channels: (1) REST webhooks: a per-call event fired within 800ms of call end; (2) S3 auto-sync: call records written to your S3 bucket in real time; (3) Redacted Transcripts API: a queryable endpoint returning transcript and analytics payloads filtered by date, agent ID, intent type, language, or sentiment range. For historical queries, use the Redacted Transcripts API. All records are retained for 90 days by default; custom retention periods are available on Enterprise plans.

How are PII fields identified and redacted in real time?

Our PII detection layer runs in parallel with transcription. We identify and redact Aadhaar numbers, PAN card numbers, credit/debit card numbers, bank account numbers, mobile numbers, and email addresses in real time as speech is processed. Redacted fields are logged as structured events in the compliance audit object — showing field type, character position in transcript, and redaction timestamp. This satisfies DPDP Act 2023, PCI-DSS, and RBI data handling requirements.
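To illustrate the shape of the redaction events described above, here is a simplified regex-based sketch. It is not Verbalyze's detection layer: the patterns are rough approximations that assume digits already appear as numerals in the transcript, they cover only four of the listed field types, and the redaction timestamp is omitted. All names are hypothetical.

```python
import re
from dataclasses import dataclass

# Approximate patterns for illustration only; production detection needs
# checksum validation, spoken-digit handling, and more field types.
PATTERNS = {
    "aadhaar": r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)",
    "pan": r"\b[A-Z]{5}\d{4}[A-Z]\b",
    "mobile": r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)",
    "email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
}
COMBINED = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS.items()))


@dataclass
class RedactionEvent:
    field_type: str
    start: int  # character offset in the original transcript
    end: int


def redact(transcript: str) -> tuple[str, list[RedactionEvent]]:
    """Replace matched PII with a [TYPE] tag and log where each match was."""
    events: list[RedactionEvent] = []

    def replace(match: re.Match) -> str:
        field_type = match.lastgroup  # name of the pattern that matched
        events.append(RedactionEvent(field_type, match.start(), match.end()))
        return f"[{field_type.upper()}]"

    return COMBINED.sub(replace, transcript), events
```

Matching everything in a single pass keeps the logged character positions relative to the original transcript rather than to partially redacted text.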

Still have questions?

Our team responds within one business day.