Arcanada — Adsessor: Introducing an In-Call AI Assistant for Zoom and Google Meet

Video calls have changed how we work. But they have not changed how we think during calls. You sit in a meeting, someone mentions a project decision from six months ago, and you either remember it or you don't. Someone speaks in a language you partially understand, and you nod along hoping to catch the context later from a transcript — after the call, when it is too late to ask.

This is the gap Adsessor exists to close.

What Adsessor Is

The name comes from Latin: adsessor (or assessor) — an assistant judge or expert advisor seated beside the magistrate during legal proceedings. The word fits the ecosystem's naming tradition (Verdicus, Scrutator, Consilium, Munera, Transcribator), but more importantly, it captures the product's essence: an expert seated beside you during the conversation, not a report delivered after it.

Adsessor is an in-call AI assistant for Zoom and Google Meet. It listens (with consent and disclosure), transcribes in real time, translates across languages, and — in later phases — speaks as a role-persona agent that participates in the conversation. It is not a post-call summariser. It is a real-time cognitive layer for your meetings.

The Problem

Three distinct gaps drive the design:

One: knowledge surfaces after the call, not during it. Current tools (Otter.ai, Fireflies, Fathom) record and transcribe, then deliver summaries post-meeting. You still reach for information in the moment and come up empty. The transcript is useful for documentation, but it does not help when the client asks a question you know is answered somewhere in your knowledge base and you cannot find it while talking.

Two: language barriers split attention. International teams default to English even when every participant is more fluent in another language. Comprehension drops. Nuance is lost. Some participants stay silent because they are processing, not because they have nothing to say. Live subtitles help, but they show the same language to everyone — they do not translate.

Three: the missing expert-seated-beside-you role. In a physical meeting, you can lean over and ask a colleague a quiet question. In a video call, that role does not exist. There is no one whispering context, suggesting references, or flagging when the current topic touches something the team decided last quarter.

Adsessor is designed to fill all three gaps — not after the call, but during it.

What Adsessor Will Be

The system has two modes, building from passive baseline toward full active agency:

Passive Baseline (Phase 1 target). Listen, record, transcribe, and summarise. No speaking, and no presence in the conversation beyond the required disclosure notice. Host-only controls: start/stop recording, view live transcript, receive post-call summary. This mode mirrors existing tools in capability but differs in architecture: it is built directly on the platform SDK rather than on a third-party bot vendor.

Active Mode (Phase 3+ target). Speaking AI agents with role personas. An Architect persona that understands technical context from your project documentation. A Business Advisor persona that recalls historical decisions. A Legal Consul persona that flags compliance concerns. Each persona can address individual participants in their detected language. The host sees live subtitles with real-time translation overlays that are independent of what other participants see — no one else's screen changes. During the call, the agent surfaces insights from a private knowledge base via RAG. It remembers participants across meetings, building cross-meeting memory over time.

Both modes operate with explicit bot disclosure — Zoom's On-Behalf-Of flow and GDPR-aligned consent — so every participant knows an AI assistant is present.

Architecture Choices

Two decisions shape the architecture more than any others:

Direct platform integration, not a third-party bot vendor. The industry-standard approach is to route audio through a bot platform like Recall.ai or Rewatch. These services handle the Zoom/Meet bridge, audio extraction, and participant management for a per-meeting fee. Adsessor does the opposite: it integrates directly with the Zoom Meeting SDK and uses headless Chromium for Google Meet. The rationale is pragmatic rather than ideological. At low call volume, third-party bots are cheaper and faster to ship. At the volume we target, usage sits past the cost crossover, so direct integration is cheaper per call. More importantly, controlling the audio path gives us direct access to the raw PCM stream (16-bit little-endian, 16 kHz, mono) required for downstream speech-to-text, without intermediate format conversions that degrade quality. And for GDPR compliance, not routing audio through an additional third party simplifies the data path.
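The crossover argument can be made concrete with a small worked example. This is a minimal sketch in Python: the function name break_even_calls and every figure in it are hypothetical placeholders for illustration, not Arcanada's costs or any vendor's actual pricing.

```python
import math


def break_even_calls(fixed_monthly_cost: float,
                     vendor_cost_per_call: float,
                     direct_cost_per_call: float) -> int:
    """Smallest monthly call count at which direct integration costs no more
    than paying a vendor per call.

    Direct total = fixed_monthly_cost + direct_cost_per_call * n
    Vendor total = vendor_cost_per_call * n
    """
    saving_per_call = vendor_cost_per_call - direct_cost_per_call
    if saving_per_call <= 0:
        raise ValueError("direct integration never breaks even at these rates")
    return math.ceil(fixed_monthly_cost / saving_per_call)


# Hypothetical figures, chosen only to show the arithmetic.
calls = break_even_calls(fixed_monthly_cost=2000.0,
                         vendor_cost_per_call=0.75,
                         direct_cost_per_call=0.25)
print(calls)  # 4000
```

Below the break-even count the vendor route is cheaper; above it, the fixed cost of running the integration in-house is spread thin enough that the direct route is cheaper.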

Ecosystem-native integration. Adsessor does not reinvent identity, model routing, knowledge retrieval, or memory. It consumes them from existing Arcanada services: identity and authentication through Auth Arcana, LLM routing through Model Connector, knowledge retrieval through Scrutator (the open-source hybrid search engine), and cross-meeting participant memory through the Long Term Memory layer. This reduces duplication and means that an improvement to any shared service benefits every product that consumes it.

The ecosystem reuse mandate is the reason Adsessor can start as a Phase 0 spike rather than a ground-up rewrite of meeting bot infrastructure.

Where We Are Now

Phase 0 is in progress: design validation through a structured 5-day Zoom Meeting SDK spike. Four technical unknowns are being resolved before committing to Phase 1.

On-Behalf-Of cross-account flow. Zoom's February 2026 enforcement requires explicit account-level authorisation for apps acting on behalf of users. The spike validates that the OAuth and Bearer token flow works for a developer-account app targeting external Zoom accounts.

Raw audio capture. Zoom's Raw Data API exposes PCM 16-bit little-endian, 16 kHz mono audio, the format that Whisper and other speech-to-text engines accept without resampling. The spike is meant to confirm that the API is accessible to Pro-tier accounts and that the audio buffer can be streamed in real time to an STT pipeline.
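For readers unfamiliar with the format, here is a minimal Python sketch of what handling such a stream involves: decoding 16-bit little-endian samples and cutting the byte stream into fixed-length frames for a speech-to-text pipeline. The function names and the 20 ms frame length are assumptions made for illustration. Nothing here calls the Zoom SDK; the input is simply a bytes object in the format described above.

```python
import struct

SAMPLE_RATE_HZ = 16_000   # 16 kHz mono, as described above
BYTES_PER_SAMPLE = 2      # 16-bit samples


def pcm16le_to_floats(buf: bytes) -> list[float]:
    """Decode 16-bit little-endian mono PCM into floats in [-1.0, 1.0)."""
    if len(buf) % BYTES_PER_SAMPLE:
        raise ValueError("buffer length must be a multiple of 2 bytes")
    count = len(buf) // BYTES_PER_SAMPLE
    samples = struct.unpack(f"<{count}h", buf)
    return [s / 32768.0 for s in samples]


def duration_ms(buf: bytes) -> float:
    """Playback duration of a PCM buffer in milliseconds."""
    return len(buf) / BYTES_PER_SAMPLE * 1000.0 / SAMPLE_RATE_HZ


def split_frames(buf: bytes, frame_ms: int = 20) -> tuple[list[bytes], bytes]:
    """Split a buffer into complete frames plus the leftover tail.

    The tail should be prepended to the next incoming buffer so that
    no samples are lost between callbacks.
    """
    frame_bytes = SAMPLE_RATE_HZ * frame_ms // 1000 * BYTES_PER_SAMPLE
    usable = len(buf) - len(buf) % frame_bytes
    frames = [buf[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]
    return frames, buf[usable:]
```

At 16 kHz and 2 bytes per sample, a 20 ms frame is 640 bytes, so one second of audio is 32,000 bytes.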

Pro-plan eligibility. Zoom's Raw Data API and Meeting SDK advanced features require a Pro subscription or higher. The spike is meant to confirm that Pro is sufficient and that Business or Enterprise is not required.

Acoustic echo cancellation (AEC). If Adsessor plays audio into a call (active mode), it must avoid creating a feedback loop where its own output is treated as input. The spike tests Zoom's AEC handling with a secondary audio stream.
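If platform-level AEC turns out to be insufficient, one simpler fallback is half-duplex gating: stop forwarding captured audio to speech-to-text while the agent's own speech is playing, plus a short tail. This is not echo cancellation, and it is not something the spike is known to use. The sketch below only illustrates the feedback problem in Python. The class name PlaybackGate and the 300 ms tail are assumptions.

```python
class PlaybackGate:
    """Half-duplex gate: drop captured frames that overlap agent playback.

    Assumes frames are checked in time order, using one shared clock
    measured in milliseconds.
    """

    def __init__(self, tail_ms: int = 300):
        self.tail_ms = tail_ms          # extra mute time after playback ends
        self._muted_until_ms = -1       # nothing muted initially

    def on_playback(self, start_ms: int, duration_ms: int) -> None:
        """Record that the agent plays audio from start_ms for duration_ms."""
        end_ms = start_ms + duration_ms + self.tail_ms
        self._muted_until_ms = max(self._muted_until_ms, end_ms)

    def should_forward(self, frame_start_ms: int) -> bool:
        """True if a captured frame may be sent on to speech-to-text."""
        return frame_start_ms >= self._muted_until_ms


gate = PlaybackGate(tail_ms=300)
gate.on_playback(start_ms=1000, duration_ms=500)
print(gate.should_forward(1200))  # False: agent is speaking
print(gate.should_forward(1800))  # True: playback and tail have ended
```

The trade-off is that anything a participant says while the agent is talking is not transcribed, which is why real echo cancellation is the preferred outcome.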

Per the AAL framework, Adsessor declares current AAL = L0 (no production deployment — nothing is running in customer-facing environments) and target AAL = L4 (self-healing commercial agent with autonomous recovery). The marketing site adsessor.app is registered, but the content there is a placeholder until Phase 1 deploys. No firm dates are set for any phase — this is honest build-in-public reporting, not a roadmap commitment.

Phase 1+ Roadmap

The following describes what we intend to build, in sequence. No dates are attached; each phase is gated on the previous phase completing validation.

Phase 1 — Zoom passive baseline. Finish Zoom Meeting SDK validation. Implement bot disclosure following GDPR and Zoom's On-Behalf-Of requirements. Ship the first production version: a Node.js service that joins Zoom calls on host invitation, records raw audio, generates real-time transcription via Whisper or a similar model, stores the transcript with speaker diarisation, and delivers a post-call summary. No speaking, no visible agent presence. Deployed behind Tailscale-only access initially, then opened to early testers.

Phase 2 — Google Meet bridge + live translation. Add headless Chromium integration for Google Meet calls. Implement live subtitles with translation overlay for the host only — other participants see no change to their interface. The overlay translates between English, Russian, and Spanish initially, with language detection per speaker segment.
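The host-only overlay logic can be sketched as a small routing decision per speaker segment. This is an illustrative Python sketch, not the planned implementation: the Segment and Subtitle types, the host_subtitle function, and the caller-supplied translate callable are all hypothetical names, and language detection is assumed to have already happened upstream.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Initial language set named in the roadmap (ISO 639-1 codes).
SUPPORTED = {"en", "ru", "es"}


@dataclass
class Segment:
    speaker: str
    language: str   # language detected for this speaker segment
    text: str


@dataclass
class Subtitle:
    speaker: str
    original: str
    translated: Optional[str]   # None when no translation is shown


def host_subtitle(seg: Segment,
                  host_language: str,
                  translate: Callable[[str, str, str], str]) -> Subtitle:
    """Build the subtitle line shown to the host only.

    translate(text, source, target) is supplied by the caller; it stands in
    for whatever translation backend is eventually used.
    """
    same_language = seg.language == host_language
    unsupported = seg.language not in SUPPORTED or host_language not in SUPPORTED
    if same_language or unsupported:
        return Subtitle(seg.speaker, seg.text, None)
    return Subtitle(seg.speaker, seg.text,
                    translate(seg.text, seg.language, host_language))
```

Because the result is rendered only in the host's view, nothing about this step changes what other participants see.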

Phase 3 — Speaking AI agents with role personas. Introduce the Architect, Business Advisor, and Legal Consul personas. Each persona is backed by Scrutator retrieval over the host's private knowledge base — not a generic LLM, but context-aware search over the documents, wikis, and past conversations the host has explicitly indexed. Cross-meeting memory remembers participant names, preferences, and past decisions. The agent speaks only when invited by the host (push-to-talk or keyword trigger).
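The "speaks only when invited" rule amounts to a gate in front of the speech output. A minimal Python sketch, assuming a hypothetical push-to-talk flag and a hypothetical trigger word "adsessor" matched against the host's transcribed speech:

```python
import re


def host_invited(push_to_talk: bool,
                 host_text: str,
                 keyword: str = "adsessor") -> bool:
    """True if the host has invited the agent to speak.

    Either the host holds push-to-talk, or the host's own transcribed
    utterance contains the trigger keyword as a whole word.
    """
    if push_to_talk:
        return True
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, host_text, flags=re.IGNORECASE) is not None


print(host_invited(False, "Adsessor, what did we decide last quarter?"))  # True
print(host_invited(False, "The assessors agreed with us."))               # False
```

Only the host's speech is checked, so another participant saying the keyword does not trigger the agent.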

Why Build This in Public

Arcanada has a pattern of publishing infrastructure before products. Scrutator — the knowledge retrieval engine — is MIT-licensed open source. Combateka — the learning game — is publicly playable. The Datarim framework and the AAL methodology are documented in public posts. This is not accidental: infrastructure should be transparent, auditable, and improvable by the community.

Adsessor is a commercial product (closed-source by necessity — the bot-disclosure landscape and platform SDK terms make open-source licensing impractical for an in-call agent). But the build process follows the same transparency principle. Status updates, architectural decisions, and honest assessments of what works and what doesn't will be published on this blog. Customers who are evaluating the product before MVP can follow the journey and make informed decisions about when to integrate.

No marketing fluff. No shipping dates disguised as certainty. Just solid engineering, published as we go.

Follow arcanada.ai/blog for updates. The source code for the underlying retrieval engine is at github.com/Arcanada-one/scrutator.