FASTPIX VIDEO SEARCH AGENT

Add "ask your video library" search to your product.

Multimodal video catalog search API with timestamped clip citations.

Your users ask a question in plain language and get a direct answer, with the exact clips cited at the right timestamp, across your whole VOD and live archive.

It is a search engine you embed in your product, not a tool your users leave to go use. Title-search returns the hour-long video; Video Search returns the 18-second moment that answers the question.


Cited moments

answer + exact timestamp

Multimodal

transcripts + frames + objects + chapters

Sub-second

on the query path


What your users can do

Drop Video Search in, and your users can ask your library anything

Once Video Search is in your product, your audience asks in plain language and gets the exact moment back, with the cited clip. A few ways teams put it to work:

Connected fitness

"Get me 10 hamstring exercises for today."

Mid-class, a member asks for something specific. Video Search pulls the exact moments from your class library and queues them up, without anyone leaving the workout.

Newsroom

"Find the clip where this politician spoke on this."

Mid-interview, a producer asks the archive a question. Video Search returns the exact cited moment from years of footage, ready to roll on air in seconds.

Education

"Show me how this concept was taught."

A learner asks; Video Search returns the exact teaching moments. Assemble them into a custom lesson, for modern, ask-anything e-learning on your own catalog.

In production: how Aadhan built news-catalog Q&A →

WHAT IT LOOKS LIKE IN YOUR PRODUCT

Query in. Answer out.
With cited moments.

Two pipes: indexing runs once per video on ingest. Query runs on every user search. Both share the same multimodal index.

INDEX · once, on ingest

Video uploaded → Transcript + frames + objects → Multimodal index built → Stored, ready to query

One shared multimodal index: INDEX writes to it, QUERY reads from it.

QUERY · on every user search

User asks a question → CITED scoring → Top clips ranked → Structured answer + citations

Example timestamps: 02:14, 17:46

Sub-second on the query path for typical catalogs. Indexing is one-time on ingest; the index lives alongside the VOD asset.
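To make the two-pipe split concrete, here is a minimal sketch in Python. It is a toy, transcript-only stand-in and not FastPix's implementation: `ToyIndex`, `Segment`, and the word-overlap ranking are all hypothetical names and choices made for illustration. The point it shows is the shape: one write per video on ingest, and read-only queries against the same index afterwards.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    video_id: str
    start_s: float
    end_s: float
    text: str  # transcript text for this segment (toy: text is the only signal)


class ToyIndex:
    """Hypothetical shared index: written once per video, read on every query."""

    def __init__(self):
        self._postings = defaultdict(list)  # word -> segments containing it
        self._indexed = set()               # video ids already ingested

    def index_video(self, video_id, segments):
        """INDEX pipe. Returns False (and does nothing) if already indexed."""
        if video_id in self._indexed:
            return False
        for seg in segments:
            for word in set(seg.text.lower().split()):
                self._postings[word].append(seg)
        self._indexed.add(video_id)
        return True

    def query(self, question, top_n=3):
        """QUERY pipe. Read-only: ranks segments by number of matched words."""
        hits = defaultdict(int)
        for word in set(question.lower().split()):
            for seg in self._postings.get(word, []):
                hits[seg] += 1
        ranked = sorted(hits, key=lambda s: (-hits[s], s.video_id, s.start_s))
        return ranked[:top_n]
```

Nothing in `query` rebuilds or mutates the index, which is the property the query path relies on.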

Why catalog search is broken today

Your viewers ask questions. They get title-search results.

If your catalog has more than 500 videos, title-search is the wrong surface. Your viewers are typing questions; your UI is matching keywords.

The right moment is buried inside an hour-long video.

Title-search returns the lecture. The learner still has to scrub. The first three minutes don't answer the question. Drop-off compounds.

200 hours of manual tagging to make the catalog searchable.

Having content ops teams hand-tag topics, time-codes, and themes is the most-hated workflow in EdTech and media. It scales linearly with catalog size, then breaks when content changes.

You wrote "intelligent search" in your product brief 8 months ago.

Then realized building it means an embedding model, a vector store, a re-ranker, transcript chunking strategy, frame extraction, query expansion, a UX shell, evals, and a feedback loop. The shipping date moved twice.

CITED. FIVE DIMENSIONS PER ANSWER

Every cited clip is scored on five axes

CITED is how Video Search Agent decides which clips answer the question and which ones are background. Per-vertical defaults shipped. Per-customer tuning supported.

Confidence

How sure is the model that this clip answers the question? Score 0-1. Drives inline-vs-related routing in your UI.

Index match

Multimodal signal strength: transcript + frame + object + chapter score, jointly weighted.

Timestamp precision

Exact start/end of the answering moment. Not the whole video. The 18-second moment where the question is actually addressed.

Excerpt quality

The surfaced transcript chunk that proves the moment answers the question. Renders as a quote next to the clip.

Depth

Is the clip self-contained? Or does it depend on context viewers don't have? Filters out cliffhanger moments that need 90 seconds of setup.
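As a rough illustration of the five axes, the sketch below models one scored clip as a record and computes Index match as a weighted average of the four modality scores. The weights, field names, and the `index_match` helper are assumptions for this example; the page says the signals are jointly weighted but does not publish how.

```python
from dataclasses import dataclass

# Hypothetical weights: chosen only for illustration, not FastPix defaults.
INDEX_WEIGHTS = {"transcript": 0.4, "frame": 0.3, "object": 0.2, "chapter": 0.1}


def index_match(signals: dict) -> float:
    """Weighted average of per-modality scores, each assumed to lie in [0, 1].

    A modality missing from `signals` counts as 0.0.
    """
    total = sum(INDEX_WEIGHTS.values())
    weighted = sum(w * signals.get(name, 0.0) for name, w in INDEX_WEIGHTS.items())
    return weighted / total


@dataclass(frozen=True)
class CitedScore:
    confidence: float     # C: how sure the model is, 0-1
    index_match: float    # I: joint multimodal signal strength
    start_s: float        # T: start of the answering moment, in seconds
    end_s: float          # T: end of the answering moment, in seconds
    excerpt: str          # E: transcript chunk shown as a quote
    self_contained: bool  # D: True if the clip needs no extra setup
```

For example, `index_match({"transcript": 1.0})` scores a clip that matches only on transcript, and `CitedScore` is the kind of record a UI could sort or filter on.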

EVERY QUERY GETS A CITED ANSWER

100% routed by your UX. Not 60/40.

Confidence ≥0.8 surfaces inline as the primary answer. Confidence from 0.5 up to (but not including) 0.8 surfaces as "related moments." Confidence under 0.5 surfaces as a fallback search across the catalog. Three lanes; same Agent.
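On the UI side, the three lanes could be routed with something like the sketch below. The thresholds come from the text above; the lane names (`"inline"`, `"related"`, `"fallback"`) and the `route` function are hypothetical, and the sketch assumes a score of exactly 0.5 lands in the "related" lane.

```python
def route(confidence: float) -> str:
    """Map a 0-1 confidence score to one of three UI lanes."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= 0.8:
        return "inline"    # primary answer
    if confidence >= 0.5:
        return "related"   # "related moments"
    return "fallback"      # fallback search across the catalog
```

Every score maps to exactly one lane, so no query is left unrouted.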

FIVE STEPS, START TO ANSWER

Index. Query. Score. Answer. Cite.

01
Index
Catalog ingests once. Multimodal index built across transcripts, frames, objects, chapters. Lives with the asset.
02
Query
User types or speaks a question. Query expanded; intent classified.
03
Score
Every candidate clip scored on CITED. Top-N retained; threshold lanes computed.
04
Answer
Structured answer composed from top clips. Sections, bullets, supporting quotes assembled.

05
Cite
Each section returns: clip + start/end timestamp + excerpt + thumbnail + playback_id.
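A client might model what the Cite step returns roughly as follows. Only `playback_id` is named on this page; the other field names, the `Citation` class, and both helpers are assumptions for illustration, not the actual response schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    clip_id: str        # hypothetical name for the "clip" reference
    playback_id: str    # named in the source
    start_s: float      # start of the cited moment, in seconds
    end_s: float        # end of the cited moment, in seconds
    excerpt: str        # transcript quote supporting the answer
    thumbnail_url: str  # hypothetical name for the thumbnail


def mmss(seconds: float) -> str:
    """Format whole seconds as MM:SS for display next to a clip."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render(c: Citation) -> str:
    """One-line text rendering of a citation, e.g. for logs or a plain UI."""
    return f'[{mmss(c.start_s)}-{mmss(c.end_s)}] "{c.excerpt}" (playback_id={c.playback_id})'
```

Keeping the start and end on the record, rather than a single offset, is what lets a player jump to the moment and stop when it is over.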

WHAT MAKES IT POWERFUL

Search that understands video, not just titles.

Multimodal, not text-only

The index reads transcripts, scene boundaries, on-screen frames, object and face detection, OCR, NER, and audio events, so a question resolves to the right moment even when the title says nothing.

The exact moment, cited

Every answer comes with the clip, the start and end timestamp, a quotable excerpt, a thumbnail, and a playback ID, ready to render and play.

Sub-second answers

Indexing is one-time on ingest, so the query path returns in under a second for typical catalogs. No re-build per query.

Follow-up questions

A follow-up token keeps conversation state, so viewers can refine and dig in without starting the search over.
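Threading the token through a conversation could look like the sketch below. The FAQ names the response field `follow_up_token`; everything else here is an assumption, including the `search(question, token)` call shape. `fake_search` is a stand-in so the example runs without any real API.

```python
from typing import Callable, Optional

# Hypothetical call shape: (question, follow_up_token or None) -> response dict.
SearchFn = Callable[[str, Optional[str]], dict]


def ask_in_sequence(search: SearchFn, questions: list) -> list:
    """Ask each question in order, passing the previous response's token along."""
    token = None
    responses = []
    for question in questions:
        response = search(question, token)
        responses.append(response)
        token = response.get("follow_up_token")  # None ends the thread
    return responses


def fake_search(question: str, follow_up_token: Optional[str]) -> dict:
    """Stand-in for a real search call; echoes the token it received."""
    turn = 1 if follow_up_token is None else int(follow_up_token.split("-")[1]) + 1
    return {
        "answer": f"answer to {question!r}",
        "received_token": follow_up_token,
        "follow_up_token": f"turn-{turn}",
    }
```

The first call sends no token; each later call sends the token from the response before it, so the conversation state stays on the server side.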

Ask in any language

Indexing covers transcripts in multiple languages plus scene and object signals, so a viewer can ask in their own words and still land the cited moment.

One index, one stack

The same multimodal index powers In-Video AI summary and the Clipping Agent, lives next to the asset, and is priced as part of the stack, not a separate understanding bill.

PLUGS INTO YOUR STACK

One Agent. Every catalog. Every surface.

Video Search Agent runs on the same FastPix multimodal index that powers In-Video AI summary and Clipping Agent CHAIN scoring. The index is built once. Three products read from it.

VOD assets: uploaded videos

Once you opt in, every video uploaded to FastPix VOD is indexed automatically. No re-encode required.

Live recordings: auto-archive

Live streams with auto-archive become VOD assets. Same index path. Live archive becomes searchable the moment recording completes.

External catalog: bring your own

If your video lives elsewhere (S3, Cloudflare R2, partner CDN), point the indexer at the source URL. Index runs without re-ingest.

FEATURED CUSTOMER

How Aadhan replaced title-search with question-answer + cited news moments.

Aadhan ships short-form news to a large multilingual audience. Their catalog of broadcast news segments was searchable by headline only. Viewers typed questions like "what did the minister say about inflation?", but the UI matched keywords in headlines, not what people said. The engineering team wired Video Search Agent against the existing FastPix Live archive: they built a multimodal index across multiple languages, dropped a query UI into the Aadhan app, and shipped behind a feature flag.

Read the Aadhan case study

Buyer journey

How Video Search Agent shows up across FastPix.

Four cluster pages that go deeper than this one. Each is a self-contained landing page for a specific buyer journey.

AI video search across catalog

The shipping-customer pattern from Aadhan. Multimodal index, news-shaped queries, cited 30-second moments. Architecture that powered it.

Read the use case

Lecture to study shorts

Long lectures become topic-tagged study shorts with AI-search across the catalog. EdTech platform pattern.

Read the use case

In-Video AI deep dive

The multimodal indexing layer that powers Video Search Agent, Clipping Agent CHAIN scoring, In-Video AI summary, and auto-tagging.

Read the product page

How Aadhan built news-catalog Q&A

Engineering team at a short-form news platform. Multimodal index across multiple languages of broadcast archives. Viewers ask; Agent answers with cited moments.

Read the story

SEARCH AGENT FAQ

Questions buyers ask before going to production

What is multimodal indexing?
Transcripts plus scene detection plus object/face recognition plus caption/chapter markers, indexed jointly. The Agent answers across all four signals, not just title or description metadata. That's what makes "show me the moment Professor Singh factors a quadratic" resolve correctly.
How fast is the answer?
Sub-second on the query path for catalogs under ~10K assets. Indexing cost is one-time per video on ingest. The index lives alongside the VOD asset; no re-build on every query.
Can it answer follow-up questions?
Yes. The Agent maintains conversation state via the follow_up_token in the response. Pass that token into the next query. Follow-ups are scoped to the prior answer and the same catalog index.
Does it work for live archives?
Yes. Live recordings with auto-archive flow into the same multimodal index as VOD. The Agent answers across both.
How does it compare to Algolia or Elastic?
Algolia and Elastic index text. Video Search Agent indexes video: transcripts plus frames plus objects plus chapters, jointly weighted. Different category. Most customers keep Algolia for text-content search and use Video Search Agent for video-content search.
What happens when content updates or is deleted?
On asset update (source re-uploaded, transcript corrected, chapter edited), the indexer detects content drift and fires a reindex.required webhook. Your call when to re-run fp.search.index(). On delete, index entries are purged within minutes; the Agent stops citing the asset immediately.
What does it cost?
Video Search Agent is in early access. Indexing is priced per minute of video; queries are priced per million. Sandbox catalog queries are free during early access. A bundled discount applies when combined with FastPix VOD + In-Video AI. See FastPix pricing or request access for current pricing.
How is this different from Twelve Labs?
Twelve Labs is a standalone video-understanding API you wire into a stack you assemble yourself. FastPix Video Search Agent is search inside the same platform that already runs your upload, encoding, delivery, player, and analytics, so the index lives next to the asset and there is nothing to stitch. It is priced as part of the stack, not as a separate understanding bill.
How do I add AI search to my own video catalog?
Run the index once across your existing FastPix playlist. Indexing is multimodal and one-time per video on ingest; queries then return cited moments with no re-build per query.
Does search work across languages?
Yes. Indexing covers transcripts in multiple languages plus scene, object, and face signals, so a viewer can ask in their own language and get the cited moment.
Can I give my viewers a YouTube-style "ask my videos" experience?
Yes. That is the core of the Video Search Agent: viewers ask a question in plain language and get a direct answer plus the exact cited clip, on your own catalog and brand. It is the "ask" experience large platforms ship, running on your archive.
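
For the FAQ answer on content updates, a receiving service might queue reindex requests and run them on its own schedule, along the lines of the sketch below. The event name `reindex.required` comes from the FAQ; the payload shape (`type`, `asset_id`) and both functions are assumptions. The `reindex` callable is a placeholder: in practice it might wrap `fp.search.index()`, whose signature this page does not give.

```python
from collections import deque

# Asset ids waiting for a reindex, in arrival order.
reindex_queue = deque()


def handle_event(event: dict) -> bool:
    """Queue the asset for reindexing if this is a reindex.required event.

    Assumes a payload like {"type": "reindex.required", "asset_id": "..."};
    the real payload shape is not documented on this page.
    """
    if event.get("type") != "reindex.required":
        return False
    asset_id = event.get("asset_id")
    if not asset_id:
        raise ValueError("reindex.required event without asset_id")
    if asset_id not in reindex_queue:  # skip duplicates already waiting
        reindex_queue.append(asset_id)
    return True


def drain(reindex) -> int:
    """Run `reindex(asset_id)` for every queued asset; return how many ran."""
    count = 0
    while reindex_queue:
        reindex(reindex_queue.popleft())
        count += 1
    return count
```

Separating `handle_event` from `drain` keeps the "your call when to re-run" decision in your hands: the webhook only records that a reindex is due.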

Three ways to get unstuck

Whatever kind of help you need, there is a path.

Engineering support

Talk to a video engineer.

Stuck on an API call, a webhook signature, or a player integration? Reach the engineering team directly. Response within hours, not days.

Contact engineering

Integration help

Docs, code samples, video tutorials.

Self-serve resources for the most common integrations. Quickstart guides, SDK examples, recorded walkthroughs on YouTube, and detailed playback logs in your dashboard.

Browse the docs

Solution architect

Plan the rollout with a human.

New integration, migration off another platform, or a complex multi-tenant build. Book a session with a FastPix solution architect or join the Slack community to compare notes with other teams.

Join the Slack community