Video transcript API: one call for TikTok, YouTube, Instagram

One API gives you video transcripts across TikTok, YouTube, and Instagram. It's the same request shape every time, and the transcript always comes back under output.data. It's $0.002 a call with no per-platform glue.
Three platforms, three different walls. YouTube hides its captions behind OAuth, Instagram exposes none at all, and TikTok has no official API.
Build it yourself and you maintain three scrapers, three auth schemes, and three output shapes.
That's three integrations to write and keep alive, for one feature: a transcript.
One request shape returns the spoken text from any of the three, under one envelope. The rest of this page walks through it.
One shape, every platform
The call is the same on all three. You pass a URL, and the transcript comes back under output.data. Here's the call I actually run, using YouTube as the example:
curl -X POST https://api.getanyapi.com/v1/run/youtube.video_transcript \
  -H "Authorization: Bearer $ANYAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'

import os
import requests

res = requests.post(
    "https://api.getanyapi.com/v1/run/youtube.video_transcript",
    headers={"Authorization": f"Bearer {os.environ['ANYAPI_KEY']}"},
    json={"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"},
)
print(res.json()["output"]["data"]["transcript"])

const res = await fetch("https://api.getanyapi.com/v1/run/youtube.video_transcript", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.ANYAPI_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }),
});
const { output } = await res.json();
console.log(output.data.transcript);

You change only the SKU per platform: tiktok.video_transcript for TikTok, instagram.media_transcript for Instagram. Everything else stays the same.
One key, one auth header, one envelope across all three. No Google Cloud project for YouTube, no Graph API for Instagram, no headless browser for TikTok.
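Since only the SKU changes, choosing it can be derived from the link's host. This is a minimal sketch that assumes the three SKU strings named above. The `sku_for_url` helper and the host list are hypothetical and not part of the API:

```python
from urllib.parse import urlparse

# SKU strings as named in this guide; the host-to-SKU mapping is an assumption.
SKU_BY_HOST = {
    "youtube.com": "youtube.video_transcript",
    "youtu.be": "youtube.video_transcript",
    "tiktok.com": "tiktok.video_transcript",
    "instagram.com": "instagram.media_transcript",
}


def sku_for_url(url: str) -> str:
    """Return the SKU for a video URL, matching the host or any subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    for domain, sku in SKU_BY_HOST.items():
        if host == domain or host.endswith("." + domain):
            return sku
    raise ValueError(f"unsupported host: {host!r}")
```

Raising on an unknown host keeps a mistyped link from turning into a billed call against the wrong SKU.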
What each platform gives you
The calls match; the platforms don't. How much each one exposes decides what you get back, and it's worth knowing before you build:
| | TikTok | YouTube | Instagram |
|---|---|---|---|
| Source of the text | caption track | caption track | transcribed audio |
| Output shape | WebVTT | timed line array | plain text |
| Timestamps | yes | yes | no |
| Language tag | sometimes | yes | no |
| Price per call | $0.002 | $0.002 | $0.002 |
TikTok and YouTube publish a caption track, so you get timestamps for free. Instagram publishes none, so the audio is transcribed and you get clean text without offsets.
The price is the same on all three.
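If you want the TikTok timestamps as numbers rather than raw WebVTT, a small parser is enough. This is a sketch that assumes the TikTok transcript arrives as a standard WebVTT string. The `parse_webvtt` name is hypothetical, and it ignores cue settings and does not strip inline tags:

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:03.500";
# the hours field is optional in WebVTT.
_TIMING = re.compile(
    r"((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def _seconds(stamp: str) -> float:
    """Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds."""
    total = 0.0
    for part in stamp.split(":"):
        total = total * 60 + float(part)
    return total


def parse_webvtt(vtt: str) -> list[dict]:
    """Return cues as {"start", "end", "text"} dicts from a WebVTT document."""
    cues = []
    # Cues are separated by blank lines; blocks without a timing line are skipped.
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append({
                    "start": _seconds(match.group(1)),
                    "end": _seconds(match.group(2)),
                    "text": text,
                })
                break
    return cues
```

Joining the `text` fields of the cues gives plain text comparable to what Instagram returns, at the cost of dropping the offsets.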
The deep dive per platform
Each platform has its own gotcha and its own deep dive. Start with the one you need:
- TikTok. No official API, so a scraper or a hosted call is the only way. How to get a TikTok transcript with one API call.
- YouTube. The official captions.download needs OAuth and 403s on videos you don't own, and the popular OSS library breaks on cloud IPs. YouTube transcript API, explained.
- Instagram. No transcript resource and no caption track at all, so the audio has to be transcribed. Instagram transcript API, explained.
Normalize once, branch where it matters
Because the envelope is shared, a small wrapper handles every platform and you branch only on the output shape. Here's the whole seam:
async function transcript(sku, url) {
  const res = await fetch(`https://api.getanyapi.com/v1/run/${sku}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.ANYAPI_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url }),
  });
  const { output } = await res.json();
  if (!output.found) return null; // no transcript available
  // YouTube/TikTok: output.data.transcript ; Instagram: output.data.transcripts[].text
  return output.data.transcript ?? output.data.transcripts?.map((t) => t.text).join("\n\n");
}

Write the fetch once, and adding a platform is just a new SKU string. That's the point of a single gateway: the surface you maintain doesn't grow with the number of sources.
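The same seam can be sketched in Python. This version uses only the standard library and assumes the response fields shown in the JavaScript wrapper (`output.found`, `output.data.transcript`, `output.data.transcripts[].text`). Splitting out `extract_text` is a design choice of this sketch, so the shape handling can be tested without a network call:

```python
import json
import os
import urllib.request

API_BASE = "https://api.getanyapi.com/v1/run"


def extract_text(payload: dict):
    """Pull the transcript out of a decoded response, or None if there is none."""
    output = payload.get("output") or {}
    if not output.get("found"):
        return None  # no transcript available
    data = output.get("data") or {}
    if data.get("transcript") is not None:
        return data["transcript"]  # YouTube/TikTok shape
    parts = data.get("transcripts")  # Instagram shape: list of {"text": ...}
    if parts is None:
        return None
    return "\n\n".join(part["text"] for part in parts)


def transcript(sku: str, url: str):
    """POST the URL to the given SKU and return the extracted transcript."""
    req = urllib.request.Request(
        f"{API_BASE}/{sku}",
        data=json.dumps({"url": url}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['ANYAPI_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=30) as res:
        return extract_text(json.load(res))
```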
Try them without writing code
Each platform has a free, no-login tool. Paste a link, copy the text, and see the output shape before you wire up the API.
Frequently asked questions
Is there one API for video transcripts across platforms?
Yes. POST /v1/run/<sku> (for example, youtube.video_transcript) with a video URL returns the transcript under output.data for TikTok, YouTube, and Instagram, using one key and one request shape. You change the SKU per platform; everything else stays the same.
Which platforms are supported?
TikTok, YouTube, and Instagram today, each with its own deep-dive guide linked above. They sit alongside 200+ other data sources in the catalog.
Do all platforms return the same format?
The envelope is the same (output.data), but the shape inside differs because the platforms expose different things. TikTok returns WebVTT, YouTube a timed line array, Instagram plain text. The wrapper above smooths over the difference.
Why does only Instagram lack timestamps?
TikTok and YouTube serve a caption track, which carries timing. Instagram serves no caption track, so the audio is transcribed to plain text with no offsets. See the Instagram guide for the detail.
How much does a transcript cost?
$0.002 per call on every platform, billed in dollars, with no subscription. That's $2 per 1,000 transcripts.
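The arithmetic is simple enough to check in a few lines. This sketch assumes only the flat $0.002 price quoted above. The function names are illustrative, and `Decimal` is used to avoid floating-point rounding on small amounts:

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.002")  # dollars per call, as quoted in this guide


def cost(calls: int) -> Decimal:
    """Total cost in dollars for a number of transcript calls."""
    return PRICE_PER_CALL * calls


def calls_for_budget(budget: str) -> int:
    """How many calls a dollar budget covers, rounded down."""
    return int(Decimal(budget) // PRICE_PER_CALL)
```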
Can I get a transcript for free?
Yes, for a one-off. Each platform has a free in-browser tool (linked above) with no key or login. For bulk or programmatic use, the endpoint is one call at $0.002.
Do I need an account on each platform?
No. You don't touch YouTube's OAuth, Instagram's Graph API, or any platform login. One AnyAPI key covers all three.
Can I transcribe many videos at once?
Yes. Loop the endpoint over a list of URLs; each call is independent and priced the same. The per-platform guides show the shell pattern.
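Because each call is independent, a batch is a plain loop. The per-platform guides use shell. This is a Python sketch in which `fetch` is a hypothetical callable you supply, for example a function that POSTs one URL to the endpoint and returns the transcript text:

```python
from typing import Callable, Iterable, Optional


def transcribe_many(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[str]],
) -> tuple[dict, dict]:
    """Call fetch(url) for each URL; return (results, errors) keyed by URL."""
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = fetch(url)  # may be None when no transcript exists
        except Exception as exc:
            # Record the failure and keep going; one bad URL should not stop the batch.
            errors[url] = str(exc)
    return results, errors
```

Keeping failures in a separate dict makes it easy to retry only the URLs that errored.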
What can I build with it?
Cross-platform search over a creator's whole footprint, an LLM summarizer or repurposing tool, subtitles, translation, or a RAG index for an assistant, all from one integration.
Video transcripts on three platforms, plus 200+ other data sources, one key, priced in dollars. New accounts start with free credit.