In-Video AI
AI video intelligence as one API.
Auto-clipping. Multimodal search. Scene + object detection. Captions and translations in 10+ languages. Built into FastPix Agents: Notes, Clipping, Breaking News.
curl -X POST https://api.fastpix.com/v1/on-demand \
  -H "Content-Type: application/json" \
  -u "<username>:<password>" \
  -d '{
    "inputs": [
      { "type": "video", "url": "https://static.fastpix.com/fp-sample-video.mp4" }
    ],
    "accessPolicy": "public",
    "metadata": { "key1": "value1" },
    "maxResolution": "1080p",
    "mediaQuality": "standard",
    "moderation": { "type": "video" },
    "namedEntities": true,
    "chapters": true,
    "generate": true
  }'
9 features per asset, one call
Native, not bolt-on, no second pipeline
Same billing as upload + encode
Knovo built lecture search on In-Video AI. MyClassboard uses chapter detection + transcripts for K-12 content. Multiple OTT customers use NSFW moderation as table stakes.
TRUSTED BY PRODUCT TEAMS SHIPPING VIDEO AT SCALE
Why In-Video AI exists
Three reasons it ships in the core API, not as a second pipeline.
01
One API call, not two.
Set in_video_ai flags on the same asset.create call that uploads and encodes. Results are ready when encoding finishes. No second model to call, no second invoice.
02
No model picking required.
We pick and tune the underlying model. Scene analysis, chapters, transcripts, search, summary, NER, moderation. You get features, not a model gateway.
03
Same billing as the rest.
AI features are priced per-minute alongside encoding. No separate AI invoice. No surprise per-token billing.
How it works
Nine AI features, one toggle.
TL;DR: 3 steps from file to AI output
01
Set AI flags
Pick which features you want at upload time. Flags live on the same asset.create call.
02
AI runs inline
AI processing runs as part of the encoding pipeline. No separate model call.
03
Receive structured output
Webhook fires with structured JSON: chapters, transcript, summary, NER, moderation flags.
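Step 01 above can be sketched in Python as a small helper that assembles the request body. The field names (inputs, accessPolicy, maxResolution, mediaQuality, moderation, namedEntities, chapters) are taken from the curl sample on this page; the helper name build_media_request and its keyword arguments are hypothetical, and the sketch only builds the JSON without sending it.

```python
import json


def build_media_request(video_url, *, chapters=True, named_entities=True,
                        moderate=True, access_policy="public"):
    """Build the JSON body for one upload + encode + AI request.

    Hypothetical helper: field names mirror the curl sample on this page.
    """
    body = {
        "inputs": [{"type": "video", "url": video_url}],
        "accessPolicy": access_policy,
        "maxResolution": "1080p",
        "mediaQuality": "standard",
    }
    # Only include the AI flags the caller asked for.
    if chapters:
        body["chapters"] = True
    if named_entities:
        body["namedEntities"] = True
    if moderate:
        body["moderation"] = {"type": "video"}
    return body


payload = build_media_request("https://static.fastpix.com/fp-sample-video.mp4")
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed as the -d body of the curl call, so the upload, encode, and AI flags travel in a single request.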
Speaker-attributed diarization + transcripts
Full transcript exported as text, VTT, or SRT. Display via FastPix Player or any HLS player.
- 30+ languages supported
- Full transcript exported as text, VTT, or SRT
- Native-language transcripts generated from the original audio
- Speaker diarization (who said what, when)
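To show how diarized output maps onto a caption file, here is a minimal Python sketch that renders speaker-attributed segments as WebVTT using the format's voice tags. The segment shape (start, end, speaker, text) is an assumption for illustration and is not the documented FastPix response format.

```python
def vtt_timestamp(seconds):
    """Format a number of seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments):
    """Render diarized segments as a WebVTT document.

    Each segment is assumed to be a dict with start, end, speaker, and text.
    """
    lines = ["WEBVTT", ""]
    for seg in segments:
        lines.append(f"{vtt_timestamp(seg['start'])} --> {vtt_timestamp(seg['end'])}")
        # <v Name> is the WebVTT voice tag, which carries the speaker label.
        lines.append(f"<v {seg['speaker']}>{seg['text']}")
        lines.append("")
    return "\n".join(lines)


segments = [
    {"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "Welcome back."},
    {"start": 2.5, "end": 6.0, "speaker": "Speaker 2", "text": "Thanks for having me."},
]
print(to_webvtt(segments))
```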
Auto-chapters + structured entities.
Chapters detected from visual + audio cues. Each chapter is queryable. Named entities (people, places, products, dates) are returned as structured JSON. Use them as preview text, recommendation seeds, or search indexing input.
- Chapters with title + timestamp
- Returns matches with timestamps
- Named Entity Recognition (people, orgs, places, products)
- Powers in-product video help, lecture review, support libraries
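As one way to use chapters with a title and timestamp, the Python sketch below finds the chapter that contains a given playback position. The chapter dict shape (title, start in seconds) and the sample values are assumptions for illustration; the real webhook payload may differ.

```python
from bisect import bisect_right

# Hypothetical chapter list, sorted by start time in seconds.
chapters = [
    {"title": "Intro", "start": 0.0},
    {"title": "Setup", "start": 95.0},
    {"title": "Demo", "start": 410.5},
]


def chapter_at(chapters, position):
    """Return the chapter containing `position` (seconds), or None.

    Assumes `chapters` is sorted by ascending start time.
    """
    starts = [c["start"] for c in chapters]
    # bisect_right counts the chapters that start at or before the position.
    index = bisect_right(starts, position) - 1
    return chapters[index] if index >= 0 else None


current = chapter_at(chapters, 120.0)
print(current["title"] if current else "before first chapter")
```

A player UI could call chapter_at on each time update to highlight the active chapter.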
curl -X POST https://api.fastpix.com/v1/on-demand \
  -H "Content-Type: application/json" \
  -u "<username>:<password>" \
  -d '{
    "inputs": [
      { "type": "video", "url": "https://static.fastpix.com/fp-sample-video.mp4" }
    ],
    "accessPolicy": "public",
    "metadata": { "key1": "value1" },
    "maxResolution": "1080p",
    "mediaQuality": "standard",
    "namedEntities": true,
    "chapters": true
  }'
AI Video clipping + AI Reframing + Scene analysis
Detect high-engagement moments using scene analysis, make those clips available for social distribution. Reframe clips for vertical, square, and landscape formats.
- Generate short clips based on topics, keywords, or speaker activity
- Reframe videos for vertical (9:16), square (1:1), and landscape (16:9) formats
- Optimized for Shorts, Reels, TikTok, and social distribution
- Returned as clips with timestamps and metadata
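The geometry behind reframing can be illustrated with a plain centered crop. This Python sketch is a simplification and not the FastPix reframing method (an AI reframer would presumably follow the subject rather than the frame center); the function name center_crop is hypothetical.

```python
from fractions import Fraction


def center_crop(width, height, ratio_w, ratio_h):
    """Return (x, y, crop_width, crop_height) for a centered crop.

    The crop is the largest region of the source frame that matches
    the target aspect ratio ratio_w:ratio_h.
    """
    target = Fraction(ratio_w, ratio_h)
    source = Fraction(width, height)
    if source > target:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = int(height * target)
    else:
        # Source is taller or equal: keep full width, trim top and bottom.
        crop_w = width
        crop_h = int(width / target)
    return ((width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h)


for name, (rw, rh) in {"vertical": (9, 16), "square": (1, 1), "landscape": (16, 9)}.items():
    print(name, center_crop(1920, 1080, rw, rh))
```

Encoders often require even pixel dimensions, so a production version would likely round the crop width and height down to the nearest even number.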
AI Video Search + Video moderation
NSFW detection, policy violation flags, scene change detection, content classification on every asset. Threshold tuneable. AI video search returns the exact second a topic was mentioned.
- NSFW classifier with threshold
- Policy violation flags (configurable per platform)
- AI video search via /search endpoint
- Webhook fires immediately on flag
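A receiver for the moderation webhook might apply its own threshold to the returned scores. The Python sketch below uses the event type and field names from the sample payload on this page (type, object.id, data.moderationResult.categoryScores). The shape of each categoryScores item (category, score) and the sample score are assumptions, since the sample payload shows an empty list.

```python
import json

MODERATION_EVENT = "video.mediaAI.moderation.ready"


def flagged_categories(event, threshold=0.8):
    """Return (media_id, flagged category names), or None for other events."""
    if event.get("type") != MODERATION_EVENT:
        return None
    media_id = event["object"]["id"]
    scores = event["data"]["moderationResult"]["categoryScores"]
    # Assumed item shape: {"category": str, "score": float}.
    flagged = [s["category"] for s in scores if s["score"] >= threshold]
    return media_id, flagged


# Trimmed sample event; the categoryScores entry is hypothetical.
raw = """
{
  "type": "video.mediaAI.moderation.ready",
  "object": {"type": "media", "id": "c9ed7167-16e8-45a9-a1ad-170489a94785"},
  "status": "ready",
  "data": {
    "isModerationGenerated": true,
    "moderationResult": {
      "categoryScores": [{"category": "nsfw", "score": 0.93}]
    }
  }
}
"""
print(flagged_categories(json.loads(raw)))
```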
{
  "type": "video.mediaAI.moderation.ready",
  "object": { "type": "media", "id": "c9ed7167-16e8-45a9-a1ad-170489a94785" },
  "id": "44ba6038-bb03-4517-b17c-db27c6c10836",
  "workspace": { "name": "Dashboard videos", "id": "4fa0e115-9209-4b7f-b498-39c750c82bc4" },
  "status": "ready",
  "data": {
    "isModerationGenerated": true,
    "moderationResult": { "categoryScores": [] }
  },
  "createdAt": "2026-06-25T10:10:29.052566688Z",
  "attempts": []
}
Security, compliance, and partnerships
Customers
“AI video search across 200+ hours of educational video was a whole separate roadmap for us. With In-Video AI, the search endpoint shipped with our standard upload flow. No second pipeline.”
Knovo product team
Microlearning AI search
“Auto-chapter detection on every lecture meant teachers stopped manually marking chapters. The transcript + summary feed our content completion + share events you can feed into your recommendation engine. One API call, three workflows automated.”
MyClassboard
K-12 EdTech
0
Separate AI pipeline to maintain
Knovo + MyClassboard
1 API call
For upload + encode + AI
Both customers
Native language
Transcripts built in
30+ languages supported
Inline billing
AI cost on the same invoice
Both customers
Capabilities that ship
Nine AI primitives, one toggle.
Meeting notes agent
Auto joins the meeting on time, records, gives summary + action items + key decisions, fires the structured payload to your webhook.
Notes agent guide
Auto-chapter detection
Chapters detected from visual + audio cues. Chapter timestamps in webhook.
Chapter detection guide
AI Video search
GET /search returns matches with timestamps. Powers in-product help, lecture review.
Request access
Summary + Named entity recognition
Structured JSON entities. Topic tags.
Summary + NER guide
Video moderation
Threshold-tuneable classifier. Webhook fires on flag. Audit log per decision.
Moderation guide
Scene analysis + AI Video clipping
Detect high-engagement moments using scene analysis, make those clips available for social distribution.
Request access
Verified counts, In-Video AI
9
AI features per asset, one call
Inline with encoding
30+
Languages for transcripts + captions
EN, ES, PT, FR, DE, HI, AR, etc.
Inline
AI runs in encoding pipeline
No second pipeline
Per-minute
Same billing as encoding
No separate AI invoice
Tech specs
What In-Video AI handles.
Features, languages, output formats, integration patterns.
Transcription languages
Caption formats


Chapter detection
Search
Summary
Moderation
NER
Integration
Questions developers ask
In-Video AI questions, answered.
How is In-Video AI different from running my own model?
You don't pick or maintain the model. Set a flag at upload time and the feature is part of the asset. We tune the underlying model. Same per-minute billing as encoding.
How many languages do transcripts support?
30+ languages. Common ones (English, Spanish, Portuguese, Kannada, Malayalam, Hindi, Tamil, Telugu, Bahasa, Vietnamese, Thai, Arabic, French, German) are tested daily.
How does AI video search work?
/v1/video/{id}/search?q=... returns matches with timestamps. The result tells you exactly which second of which video matched, and the surrounding transcript context.
Is moderation real-time?
Yes for VOD: moderation flags arrive within minutes of upload completion. For Live: moderation runs on the live stream and a webhook fires on flag.
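The search path from the FAQ above can be assembled safely with the standard library. In this Python sketch the path /v1/video/{id}/search and the q parameter come from this page; pairing it with the https://api.fastpix.com host is an assumption (the page only shows that host for /v1/on-demand), and both helper names are hypothetical. The sketch builds the URL and formats a matched second for display without making a network call.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.fastpix.com"  # assumed host for the search path


def search_url(video_id, query):
    """Build the search URL for one video, escaping the id and the query."""
    path = f"/v1/video/{quote(video_id, safe='')}/search"
    return f"{API_BASE}{path}?{urlencode({'q': query})}"


def as_clock(seconds):
    """Format a matched second as HH:MM:SS for display."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(search_url("c9ed7167-16e8-45a9-a1ad-170489a94785", "neural networks"))
print(as_clock(3725))
```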
Pricing
Per-minute AI processing.
Inline with encoding. Same per-minute billing model. See full pricing.
TRANSCRIPTION + CAPTIONS
Per minute transcribed.
$0.048 / minute
30+ languages with VTT and SRT export. Same rate for live auto-generated subtitles.
- 30+ languages
- VTT + SRT export
- Speaker diarization
SEARCH + SUMMARY + NER
Per minute analyzed.
$0.0035 / minute
Markdown summaries, structured entities, video chapters, and a conversational search endpoint. Same per-minute rate across NER, chapters, and summary.
- Conversational search endpoint
- Markdown summary
- Named-entity recognition + chapters
MODERATION + SCENE DETECTION
Per minute moderated.
$0.10 / minute
Tunable NSFW / profanity classifier with audit-log-grade webhooks.
- NSFW + profanity detection
- Threshold tuneable per workload
- Webhook + audit log per decision
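The per-minute rates above make a quick estimate straightforward. This Python sketch assumes the three feature groups are billed independently and simply add up, with no rounding rules or minimums; those billing details are assumptions, and the feature keys are hypothetical labels for the three pricing tiers listed on this page.

```python
from decimal import Decimal

# Rates in USD per minute, copied from the pricing tiers on this page.
RATES_PER_MINUTE = {
    "transcription": Decimal("0.048"),
    "search_summary_ner": Decimal("0.0035"),
    "moderation": Decimal("0.10"),
}


def estimate_cost(minutes, features):
    """Estimate AI processing cost in USD for a video of `minutes` length.

    Assumes each selected feature group is billed additively per minute.
    """
    minutes = Decimal(str(minutes))
    return sum((RATES_PER_MINUTE[f] * minutes for f in features), Decimal("0"))


total = estimate_cost(10, ["transcription", "search_summary_ner", "moderation"])
print(f"10-minute video, all three groups: ${total}")
```

Decimal is used instead of float so that fractional-cent rates such as 0.0035 add up exactly.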
Three ways to get unstuck
Whatever kind of help you need, there is a path.
Engineering support
Talk to a video engineer.
Stuck on an API call, a webhook signature, or a player integration? Reach the engineering team directly. Response within hours, not days.
Contact engineering
Integration help
Docs, code samples, video tutorials.
Self-serve resources for the most common integrations. Quickstart guides, SDK examples, and detailed playback logs in your dashboard.
Browse the docs
Solution architect
Plan the rollout with a human.
New integration, migration off another platform, or a complex multi-tenant build. Book a session with a FastPix solution architect.
Join the Slack community
Developer resources
Everything you need to start building.
Five-minute quick-start
Sign up, hit the endpoint, ship.
Quick-start guide
Full API reference
Every endpoint, every parameter, every response.
API reference
Webhook reference
Every event FastPix emits, with sample payloads.
Webhooks
Code samples
Sample apps and SDK examples on GitHub.
GitHub
Slack community
Talk to FastPix engineers and other developers.
Join Slack
Service status
Real-time uptime and incident reports.
Status page