FASTPIX VIDEO SEARCH AGENT

Add "ask your video library" search to your product.

Multimodal video catalog search API with timestamped clip citations.

Your users ask a question in plain language and get a direct answer, with the exact clips cited at the right timestamp, across your whole VOD and live archive.

It is a search engine you embed in your product, not a tool your users leave to go use. Title-search returns the hour-long video; Video Search returns the 18-second moment that answers the question.

Cited moments
answer + exact timestamp
Multimodal
transcripts + frames + objects + chapters
Sub-second
on the query path

TRUSTED BY PRODUCT TEAMS SHIPPING VIDEO AT SCALE


What your users can do

Drop Video Search in, and your users can ask your library anything

Once Video Search is in your product, your audience asks in plain language and gets the exact moment back, with the cited clip. A few ways teams put it to work:

Connected fitness

"Get me 10 hamstring exercises for today."

Mid-class, a member asks for something specific. Video Search pulls the exact moments from your class library and queues them up, without anyone leaving the workout.

Newsroom

"Find the clip where this politician spoke on this."

Mid-interview, a producer asks the archive a question. Video Search returns the exact cited moment from years of footage, ready to roll on air in seconds.

Education

"Show me how this concept was taught."

A learner asks; Video Search returns the exact teaching moments. Assemble them into a custom lesson, for modern, ask-anything e-learning on your own catalog.

WHAT IT LOOKS LIKE IN YOUR PRODUCT

Query in. Answer out.
With cited moments.

Two pipes: indexing runs once per video on ingest. Query runs on every user search. Both share the same multimodal index.

INDEX (once, on ingest)
Video uploaded → Transcript + frames + objects → Multimodal index built → Stored, ready to query

writes ↑ one shared multimodal index ↓ reads

QUERY (every user search)
User asks a question → CITED scoring → Top clips ranked → Structured answer + citations (02:14, 17:46)

Sub-second on the query path for typical catalogs. Indexing is one-time on ingest, and the index lives alongside the VOD asset.

Why catalog search is broken today

Your viewers ask questions. They get title-search results.

If your catalog has more than 500 videos, title-search is the wrong surface. Your viewers are typing questions; your UI is matching keywords.

The right moment is buried inside an hour-long video.

Title-search returns the lecture. The learner still has to scrub. The first three minutes don't answer the question. Drop-off compounds.

200 hours of manual tagging to make the catalog searchable.

Hand-tagging topics, time-codes, and themes is the most-hated workflow for content ops teams in EdTech and media. It scales linearly with catalog size, then breaks when content changes.

You wrote "intelligent search" in your product brief 8 months ago.

Then realized building it means an embedding model, a vector store, a re-ranker, transcript chunking strategy, frame extraction, query expansion, a UX shell, evals, and a feedback loop. The shipping date moved twice.

CITED. FIVE DIMENSIONS PER ANSWER

Every cited clip is scored on five axes

CITED is how Video Search Agent decides which clips answer the question and which ones are background. Per-vertical defaults shipped. Per-customer tuning supported.

C

Confidence

How sure is the model that this clip answers the question? Score 0-1. Drives inline-vs-related routing in your UI.

I

Index match

Multimodal signal strength: transcript + frame + object + chapter score, jointly weighted.

T

Timestamp precision

Exact start/end of the answering moment. Not the whole video. The 18-second moment where the question is actually addressed.

E

Excerpt quality

The surfaced transcript chunk that proves the moment answers the question. Renders as a quote next to the clip.

D

Depth

Is the clip self-contained? Or does it depend on context viewers don't have? Filters out cliffhanger moments that need 90 seconds of setup.
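To make the five axes concrete, here is a minimal sketch of how a scored clip could be represented, and how "jointly weighted" could work for Index match. The field names, the weights, and the weighted-average formula are assumptions for illustration, not the shipped defaults or the actual FastPix scoring.

```python
from dataclasses import dataclass

# Hypothetical per-signal weights. Real defaults are per-vertical and tunable.
SIGNAL_WEIGHTS = {"transcript": 0.4, "frame": 0.3, "object": 0.2, "chapter": 0.1}


def index_match(signals: dict[str, float]) -> float:
    """Weighted average of per-signal scores (each 0-1); missing signals count as 0."""
    total_weight = sum(SIGNAL_WEIGHTS.values())
    weighted = sum(w * signals.get(name, 0.0) for name, w in SIGNAL_WEIGHTS.items())
    return weighted / total_weight


@dataclass(frozen=True)
class CitedScore:
    """One candidate clip scored on the five CITED axes (illustrative shape)."""
    confidence: float   # C: how sure the model is that the clip answers the question
    index_match: float  # I: combined transcript + frame + object + chapter signal
    start_s: float      # T: start of the answering moment, in seconds
    end_s: float        # T: end of the answering moment, in seconds
    excerpt: str        # E: transcript chunk shown as a quote next to the clip
    depth: float        # D: how self-contained the clip is (assumed 0-1)


score = CitedScore(
    confidence=0.86,
    index_match=index_match({"transcript": 0.9, "frame": 0.7, "object": 0.5}),
    start_s=134.0,
    end_s=152.0,
    excerpt="Keep the knee soft and hinge at the hips.",
    depth=0.8,
)
```

A clip with a strong transcript hit but no chapter marker still scores well here, which is the point of weighting the signals jointly rather than requiring all of them.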

EVERY QUERY GETS A CITED ANSWER

100% routed by your UX. Not 60/40.

Confidence ≥0.8 surfaces inline as the primary answer. Confidence from 0.5 up to (but not including) 0.8 surfaces as "related moments." Confidence under 0.5 falls back to a search across the catalog. Three lanes; same Agent.
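On the client side, the three lanes come down to two threshold checks. A minimal sketch, using the thresholds stated above; the function and lane names are illustrative and not part of any FastPix SDK.

```python
from typing import Literal

Lane = Literal["inline", "related", "fallback"]

# Thresholds from the copy above; the constant names are illustrative.
INLINE_MIN = 0.8
RELATED_MIN = 0.5


def route_lane(confidence: float) -> Lane:
    """Map a CITED confidence score (0-1) to the UI lane that should render it."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= INLINE_MIN:
        return "inline"    # primary answer
    if confidence >= RELATED_MIN:
        return "related"   # "related moments"
    return "fallback"      # fallback search across the catalog
```

Because every score in the 0-1 range maps to exactly one lane, no query is left without somewhere to render.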

FIVE STEPS, START TO ANSWER

Index. Query. Score. Answer. Cite.

  • 01

    Index

    Catalog ingests once. Multimodal index built across transcripts, frames, objects, chapters. Lives with the asset.

  • 02

    Query

    User types or speaks a question. Query expanded; intent classified.

  • 03

    Score

    Every candidate clip scored on CITED. Top-N retained; threshold lanes computed.

  • 04

    Answer

    Structured answer composed from top clips. Sections, bullets, supporting quotes assembled.

  • 05

    Cite

    Each section returns: clip + start/end timestamp + excerpt + thumbnail + playback_id.
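The Score and Cite steps above can be sketched as: keep the top-N candidates by confidence, then emit one citation per clip with the fields listed in step 05. The `Citation` class, its field names, and the `mm:ss` formatting are assumptions for illustration; the actual response schema may differ.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Citation:
    """Illustrative citation shape: clip + start/end + excerpt + thumbnail + playback_id."""
    clip_id: str
    playback_id: str
    start_s: float
    end_s: float
    excerpt: str
    thumbnail_url: str
    confidence: float


def mmss(seconds: float) -> str:
    """Format a second offset as mm:ss for display next to the clip."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def top_citations(candidates: list[Citation], n: int = 3) -> list[dict]:
    """Rank candidates by confidence, keep the top n, and add display timestamps."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)[:n]
    return [{**asdict(c), "start": mmss(c.start_s), "end": mmss(c.end_s)} for c in ranked]
```

Keeping the raw second offsets alongside the formatted strings lets a player seek precisely while the UI shows a readable timestamp.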

WHAT MAKES IT POWERFUL

Search that understands video, not just titles.

Multimodal, not text-only

The index reads transcripts, scene boundaries, on-screen frames, object and face detection, OCR, NER, and audio events, so a question resolves to the right moment even when the title says nothing.

The exact moment, cited

Every answer comes with the clip, the start and end timestamp, a quotable excerpt, a thumbnail, and a playback ID, ready to render and play.

Sub-second answers

Indexing is one-time on ingest, so the query path returns in under a second for typical catalogs. No re-build per query.

Follow-up questions

A follow-up token keeps conversation state, so viewers can refine and dig in without starting the search over.
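A minimal sketch of carrying conversation state between queries. The FAQ names the response field `follow_up_token`; the `question` key and both helper functions are hypothetical, and no HTTP call is made here.

```python
from typing import Optional


def build_query(question: str, follow_up_token: Optional[str] = None) -> dict:
    """Build a query payload; include the token only when continuing a conversation."""
    payload = {"question": question}  # "question" is an assumed field name
    if follow_up_token is not None:
        payload["follow_up_token"] = follow_up_token
    return payload


def next_query(previous_response: dict, question: str) -> dict:
    """Scope a follow-up to the prior answer by forwarding the token it returned."""
    return build_query(question, previous_response.get("follow_up_token"))


first = build_query("Which classes cover hamstring stretches?")
# Pretend the service answered with a token (value is made up for the example).
response = {"answer": "...", "follow_up_token": "tok_example"}
follow_up = next_query(response, "Only the ones under ten minutes")
```

If the previous response carries no token, `next_query` simply produces a fresh query, so the same code path handles both first and follow-up questions.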

Ask in any language

Indexing covers transcripts in multiple languages plus scene and object signals, so a viewer can ask in their own words and still land the cited moment.

One index, one stack

The same multimodal index powers In-Video AI summary and the Clipping Agent, lives next to the asset, and is priced as part of the stack, not a separate understanding bill.

PLUGS INTO YOUR STACK

One Agent. Every catalog. Every surface.

Video Search Agent runs on the same FastPix multimodal index that powers In-Video AI summary and Clipping Agent CHAIN scoring. The index is built once. Three products read from it.

VOD assets · uploaded videos

Once you opt in, every video uploaded to FastPix VOD is indexed automatically. No re-encode required.

Live recordings · auto-archive

Live streams with auto-archive become VOD assets. Same index path. Live archive becomes searchable the moment recording completes.

External catalog · bring your own

If your video lives elsewhere (S3, Cloudflare R2, partner CDN), point the indexer at the source URL. Index runs without re-ingest.

FEATURED CUSTOMER

How Aadhan replaced title-search with question-answer + cited news moments.

Aadhan ships short-form news to a large multilingual audience. Their catalog of broadcast news segments was searchable by headline only. Viewers typed questions like "what did the minister say about inflation?", but the UI matched keywords in headlines, not what people said. The engineering team wired Video Search Agent against the existing FastPix Live archive: they built a multimodal index across multiple languages, dropped a query UI into the Aadhan app, and shipped behind a feature flag.

Read the Aadhan case study

SEARCH AGENT FAQ

Questions buyers ask before going to production

  • What is multimodal indexing?

    Transcripts plus scene detection plus object/face recognition plus caption/chapter markers, indexed jointly. The Agent answers across all four signals, not just title or description metadata. That's what makes "show me the moment Professor Singh factors a quadratic" resolve correctly.
  • How fast is the answer?

    Sub-second on the query path for catalogs under ~10K assets. Indexing cost is one-time per video on ingest. The index lives alongside the VOD asset, so nothing is rebuilt per query.
  • Can it answer follow-up questions?

    Yes. The Agent maintains conversation state via the follow_up_token in the response. Pass that token into the next query. Follow-ups are scoped to the prior answer and the same catalog index.
  • Does it work for live archives?

    Yes. Live recordings with auto-archive flow into the same multimodal index as VOD. The Agent answers across both.
  • How does it compare to Algolia or Elastic?

    Algolia and Elastic index text. Video Search Agent indexes video: transcripts plus frames plus objects plus chapters, jointly weighted. Different category. Most customers keep Algolia for text-content search and use Video Search Agent for video-content search.