FastPix Developer Documentation

Video libraries become unsearchable when the only metadata is a filename and an upload date: viewers cannot find content by topic, speaker, or location. FastPix solves this by extracting named entities (people, organizations, locations, dates, and key terms) from video and audio transcripts using Named Entity Recognition (NER), part of in-video AI multimodal indexing. Enable the feature by setting namedEntities: true during upload, or by sending a PATCH request to the named entities endpoint for existing media.

Prerequisites

A FastPix account with an active workspace (Activate your account)
Your Access Token ID and Secret Key from the FastPix dashboard
A video or audio file that you can upload by URL or direct upload, or the mediaId of previously uploaded media
A webhook endpoint if you plan to listen for the video.mediaAI.namedEntities.ready event

Key concepts

mediaId is the unique identifier FastPix assigns to every uploaded media file, whether video or audio.
playbackId is a separate, access-controlled identifier used to construct the HLS URL https://stream.fastpix.com/<playbackId>.m3u8. Named entity extraction operates on the mediaId and produces structured metadata you retrieve through the Get Media by ID endpoint or a webhook event.
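The distinction between the two identifiers can be sketched in a few lines of Python. The helper name hls_playback_url is hypothetical; only the URL pattern comes from the text above. Note that the mediaId never appears in the playback URL.

```python
def hls_playback_url(playback_id: str) -> str:
    """Build the HLS manifest URL for a playbackId (not a mediaId)."""
    # Reject empty values and path separators so the result stays a single URL segment.
    if not playback_id or "/" in playback_id:
        raise ValueError("playback_id must be a non-empty identifier without '/'")
    return f"https://stream.fastpix.com/{playback_id}.m3u8"
```

Named entity calls, by contrast, are keyed on the mediaId, so keep both values when you store a media record.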

Named Entity Recognition is a natural language processing technique that turns unstructured transcript text into structured tags. In the sentence “Apple announced its latest product in California,” NER identifies Apple as an Organization, not a fruit, and California as a Location. FastPix attaches a category to every extracted entity so you can filter and index by type.

Entity categories FastPix extracts

FastPix groups entities into categories that cover the common NER taxonomy:

People - for example, John Smith
Organizations - for example, UNICEF, FastPix
Locations - for example, Mount Everest
Dates and times - for example, July 4, 1776
Concepts and actions - abstract nouns and verbs drawn from the transcript
Miscellaneous entities - product names such as iPhone 14 or monetary values such as $1,000

Read more background in our blog on named entity recognition.

Generate named entities for new media

Enable NER at upload time by adding namedEntities: true to the request body. FastPix runs transcription, then extracts and ranks entities during encoding.

Collect the URL of the media file you want to upload through the pull-from-URL method, or prepare a direct upload.
Send a POST request to the Create media from URL endpoint or the Direct upload endpoint.

The following parameters control entity extraction:

type: specify whether the input is video or audio.
url: the HTTPS URL of the source file (URL-based uploads only).
namedEntities: set to true to enable extraction.
accessPolicy: optional. Set to public or private.
maxResolution: optional. Caps the output rendition, for example 1080p.

Request body (create new media from URL):

{
  "inputs": [
    {
      "type": "video",
      "url": "https://static.fastpix.com/fp-sample-video.mp4"
    }
  ],
  "namedEntities": true,
  "accessPolicy": "public"
}
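As a minimal sketch, the request above could be assembled in Python with only the standard library. Several details here are assumptions that this section does not state: the base URL https://api.fastpix.io/v1, the /on-demand path for the Create media from URL endpoint, and the use of HTTP Basic authentication with the Access Token ID as username and the Secret Key as password. Check them against the API reference before relying on this.

```python
import base64
import json
import urllib.request

# Assumption: base URL and endpoint path are not given in this section.
API_BASE = "https://api.fastpix.io/v1"


def build_create_media_request(token_id, secret_key, source_url,
                               media_type="video", access_policy="public"):
    """Return an unsent POST request that creates media with NER enabled."""
    body = {
        "inputs": [{"type": media_type, "url": source_url}],
        "namedEntities": True,  # enables entity extraction
        "accessPolicy": access_policy,
    }
    # Assumption: HTTP Basic auth built from the token ID and secret key.
    credentials = base64.b64encode(f"{token_id}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        url=f"{API_BASE}/on-demand",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
    )
```

The function only builds the request. To send it, pass the result to urllib.request.urlopen and decode the JSON response, which should include the mediaId of the new media.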

Request body (create new media by direct upload):

{
  "corsOrigin": "*",
  "pushMediaSettings": {
    "namedEntities": true,
    "accessPolicy": "public"
  }
}
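The two request bodies place the flag at different levels: at the top level for URL uploads, and inside pushMediaSettings for direct uploads. The following sketch makes that difference explicit. The helper name enable_named_entities is hypothetical, and the detection rule (look for a pushMediaSettings or inputs key) is an assumption based only on the two examples shown here.

```python
import copy


def enable_named_entities(body: dict) -> dict:
    """Return a copy of a create-media body with namedEntities set in the right place."""
    updated = copy.deepcopy(body)  # leave the caller's dict untouched
    if "pushMediaSettings" in updated:
        # Direct upload: the flag lives inside pushMediaSettings.
        updated["pushMediaSettings"]["namedEntities"] = True
    elif "inputs" in updated:
        # Create from URL: the flag sits at the top level of the body.
        updated["namedEntities"] = True
    else:
        raise ValueError("body has neither 'inputs' nor 'pushMediaSettings'")
    return updated
```

Putting the flag at the wrong level is an easy mistake to make when switching between the two upload methods, so a single helper keeps the placement in one spot.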

Generate named entities for existing media

Run NER on a video you have already uploaded by calling the Generate named entities endpoint.

Copy the mediaId of the media from the dashboard or from a prior create-media response.
Send a PATCH request to /on-demand/<mediaId>/named-entities, replacing <mediaId> with the actual value.

Example request body:

{
  "namedEntities": true
}
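A minimal Python sketch of this PATCH call follows. The path /on-demand/<mediaId>/named-entities comes from the step above; the base URL https://api.fastpix.io/v1 and the HTTP Basic authentication scheme are assumptions not stated in this section.

```python
import base64
import json
import urllib.request
from urllib.parse import quote

# Assumption: base URL is not given in this section.
API_BASE = "https://api.fastpix.io/v1"


def build_named_entities_patch(token_id, secret_key, media_id):
    """Return an unsent PATCH request that starts NER for existing media."""
    if not media_id:
        raise ValueError("media_id is required")
    # Assumption: HTTP Basic auth built from the token ID and secret key.
    credentials = base64.b64encode(f"{token_id}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        # quote() keeps the mediaId confined to a single path segment.
        url=f"{API_BASE}/on-demand/{quote(media_id, safe='')}/named-entities",
        data=json.dumps({"namedEntities": True}).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
    )
```

Send the request with urllib.request.urlopen. An HTTPError with code 404 most likely means the mediaId is missing or does not match any media in the workspace.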

Generate named entities from the dashboard

For new media

Add your media

In the left navigation, go to Video > Media. On the Upload media page, add your video using one of the following methods:

Upload from device: Drag and drop your file into the upload area, or click Browse to open your device’s file picker. Navigate to the video you want to upload, select it, and click Open.
Upload using video URL: Paste a public video URL into the Upload using video URL field and click Submit URL.

Enable named entity recognition in media settings

The Media Settings panel opens. Select Custom settings and set "namedEntities": true in the JSON configuration. For example:

{
  "pushMediaSettings": {
    "namedEntities": true
  }
}

Click Continue, then click Start upload all media.

For existing media

Open the media details page

In the left navigation, go to Video > Media. Select the video you want to process from the media list to open its Media Details page.

Generate named entities

In the left navigation of the Media Details page, click Named Entity Recognition under In-Video AI. Click Generate to start the analysis. FastPix analyzes the transcript and extracts named entities such as people, organizations, locations, dates, and other key terms.

Retrieve named entity results

Retrieve extracted entities in two ways:

Call the Get media by ID endpoint for an on-demand poll.
Listen for the video.mediaAI.namedEntities.ready webhook, which fires when extraction completes.

Example event payload:

{
  "type": "video.mediaAI.named-entities.ready",
  "object": {
    "type": "media",
    "id": "f8579bd1-e6ac-4d34-b00c-17b453788e73"
  },
  "id": "8ee27849-0b1b-4664-9e26-4abd5cc91d4f",
  "workspace": {
    "name": "BrightCovePlayer-iOS",
    "id": "f7a13f50-7f5c-48f4-b7b2-c901dcff61c6"
  },
  "status": "ready",
  "data": {
    "isNamedEntitiesGenerated": true,
    "namedEntities": {
      "entityCount": 4,
      "namedEntities": [
        {
          "entity": "FastPix",
          "category": "Organization"
        },
        {
          "entity": "content",
          "category": "Concept"
        },
        {
          "entity": "library",
          "category": "Concept"
        },
        {
          "entity": "stream",
          "category": "Action"
        }
      ]
    }
  },
  "createdAt": "2025-11-05T08:51:15.113942549Z",
  "attempts": []
}

The namedEntities array contains each extracted entity and its category. Entities are ordered by relevance, so the first items reflect the topics most central to the transcript. Use this payload to populate a search index, drive semantic tagging, or enrich downstream recommendations.
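One way to turn this payload into something a search index can consume is to group entities by category, as in the sketch below. The function name entities_by_category is hypothetical. This page spells the event type both as video.mediaAI.namedEntities.ready and as video.mediaAI.named-entities.ready, so the sketch accepts either spelling rather than assuming one.

```python
from collections import defaultdict

# Both spellings appear on this page; accept either until you confirm
# which one your webhook actually receives.
READY_EVENT_TYPES = {
    "video.mediaAI.namedEntities.ready",
    "video.mediaAI.named-entities.ready",
}


def entities_by_category(event: dict) -> dict:
    """Group entity names from a named-entities webhook event by category."""
    if event.get("type") not in READY_EVENT_TYPES:
        raise ValueError(f"unexpected event type: {event.get('type')!r}")
    block = (event.get("data") or {}).get("namedEntities") or {}
    grouped = defaultdict(list)
    # Iterating in payload order preserves the relevance ranking within each category.
    for item in block.get("namedEntities") or []:
        grouped[item["category"]].append(item["entity"])
    return dict(grouped)
```

For the example event above, the result maps Organization to FastPix, Concept to content and library, and Action to stream. The media the entities belong to is identified by object.id in the same event.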

Limits and considerations

Confirm the mediaId is correct when calling /on-demand/<mediaId>/named-entities. A missing or mismatched ID returns a 404.
accessPolicy and maxResolution are optional. Set them when you need private playback or a capped rendition.
NER runs on the transcript FastPix generates, so speech clarity and language affect results. If a video has no speech, entity counts drop to zero.
Entity ranking reflects transcript frequency and topical weight, not a confidence score. Treat the ordering as a relevance signal, not an accuracy guarantee.
Named entity extraction is part of FastPix in-video AI multimodal indexing. See Generate video chapters and Add auto-generated subtitles for related AI workflows.

Frequently asked questions

What is the difference between NER and POS tagging?

Part-of-speech (POS) tagging labels every word in a sentence with a grammatical role, such as noun, verb, or adjective. NER operates at a higher level: it finds spans of text that refer to real-world entities (people, organizations, locations) and assigns each span a category. FastPix uses NER, not POS tagging, so you receive entity-level metadata rather than per-token grammar tags.

How are NER models trained?

NER models are trained on large corpora of text in which human annotators have marked entity spans and categories. The model learns contextual patterns (for example, that a capitalized token following “Mr.” is likely a person) and generalizes them to new text. FastPix applies pretrained models to the transcript generated for your video, so you do not need to train or host a model yourself.

Can I build a custom NER model with FastPix?

FastPix does not expose a custom NER training endpoint. The in-video AI pipeline uses managed models optimized for transcripts across common domains. If you need domain-specific entities beyond the default categories, post-process the returned entities with your own classifier.
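As an illustration of such post-processing, the sketch below uses a plain lookup table as a stand-in for a real classifier. The DOMAIN_TERMS vocabulary, the reclassify function, and the domainCategory field are all hypothetical and not part of the FastPix API. The input is the list of entity objects from the namedEntities array.

```python
# Hypothetical domain vocabulary: lowercase entity text -> your own category.
DOMAIN_TERMS = {
    "fastpix": "Vendor",
    "library": "Product feature",
}


def reclassify(entities, domain_terms=DOMAIN_TERMS):
    """Add a domain-specific category to each entity, keeping the original fields."""
    result = []
    for item in entities:
        key = item["entity"].strip().lower()
        # None marks entities the domain vocabulary does not cover.
        result.append({**item, "domainCategory": domain_terms.get(key)})
    return result
```

Keeping the original category next to the new field lets you fall back to the default taxonomy for anything your own classifier does not recognize.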

Why does my event payload show zero named entities?

Entity counts depend on the transcript. If the source file is silent, has no intelligible speech, or uses a language the transcription model does not support, the extractor returns an empty namedEntities array. Verify that the transcript was generated and that the media contains spoken content.
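When handling the event, it can help to tell "extraction did not run" apart from "extraction ran but found nothing". The sketch below does this using the isNamedEntitiesGenerated flag and the namedEntities array from the example payload. The function name and the three return labels are hypothetical, and reading a false flag as "not generated" is an assumption based on the field name.

```python
def diagnose_named_entities(event: dict) -> str:
    """Classify a named-entities event as 'not_generated', 'generated_but_empty', or 'ok'."""
    data = event.get("data") or {}
    # Assumption: a missing or false flag means extraction did not complete.
    if not data.get("isNamedEntitiesGenerated", False):
        return "not_generated"
    entities = (data.get("namedEntities") or {}).get("namedEntities") or []
    if not entities:
        # Extraction ran, but the transcript yielded no entities.
        return "generated_but_empty"
    return "ok"
```

A generated_but_empty result points at the transcript (silence, unclear speech, or an unsupported language) rather than at the request itself.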
