How to Scrape YouTube Transcripts - The Easy Way
Extracting transcripts from YouTube videos manually doesn't scale. Whether you're analyzing educational content, building accessibility tools, or conducting content research, you need a reliable way to get YouTube transcripts programmatically - including YouTube Shorts.
In this guide, I'll show you how to scrape YouTube transcripts using SocialKit's YouTube Transcript API - a straightforward solution that handles YouTube's dynamic content loading and anti-scraping measures, and works with both regular videos and Shorts.
Getting Started
1. Get Your API Access Key
First, you'll need an API access key. Visit your SocialKit Dashboard to get your free access key. The free tier includes 20 requests - perfect for testing and small projects.
2. The API Endpoint
Here's the endpoint you'll be working with:
GET https://api.socialkit.dev/youtube/transcript
Required Parameters:
- access_key: Your API access key
- url: The YouTube video URL (works with regular videos and Shorts)
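Because the video URL is itself passed as a query-string value, it is safest to percent-encode it so that its own `?` and `=` characters are not confused with the request's parameters. Here is a minimal Python sketch using only the standard library; `build_transcript_url` is a hypothetical helper name for illustration, not part of SocialKit's API:

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.socialkit.dev/youtube/transcript"


def build_transcript_url(access_key, video_url):
    """Return the full request URL with both parameters percent-encoded.

    Hypothetical helper for illustration only.
    """
    query = urlencode({"access_key": access_key, "url": video_url})
    return f"{ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_transcript_url(
        "your-access-key-here",
        "https://youtube.com/watch?v=dQw4w9WgXcQ",
    ))
```

HTTP clients such as axios and requests do this encoding for you when you pass the parameters as an object or dict, which is what the full examples in this guide do.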
Example Request & Response
Let's look at a real example using a YouTube video:
GET https://api.socialkit.dev/youtube/transcript?access_key=<your-access-key>&url=https://youtube.com/watch?v=dQw4w9WgXcQ
Response:
{
  "success": true,
  "data": {
    "url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "transcript": "[♪♪♪] ♪ We're no strangers to love ♪ ♪ You know the rules\nand so do I ♪ ♪ A full commitment's\nwhat I'm thinking of ♪ ♪ You wouldn't get this\nfrom any other guy ♪ ♪ I just wanna tell you\nhow I'm feeling ♪ ♪ Gotta make you understand ♪ ♪ Never gonna give you up ♪ ♪ Never gonna let you down ♪ ♪ Never gonna run around\nand desert you ♪ ♪ Never gonna make you cry ♪ ♪ Never gonna say goodbye ♪ ♪ Never gonna tell a lie\nand hurt you ♪",
    "transcriptSegments": [
      {
        "text": "[♪♪♪]",
        "start": 0,
        "duration": 5,
        "timestamp": "00:00"
      },
      {
        "text": "♪ We're no strangers to love ♪",
        "start": 5,
        "duration": 5,
        "timestamp": "00:05"
      },
      {
        "text": "♪ You know the rules\nand so do I ♪",
        "start": 10,
        "duration": 5,
        "timestamp": "00:10"
      },
      {
        "text": "♪ A full commitment's\nwhat I'm thinking of ♪",
        "start": 15,
        "duration": 5,
        "timestamp": "00:15"
      },
      {
        "text": "♪ You wouldn't get this\nfrom any other guy ♪",
        "start": 20,
        "duration": 5,
        "timestamp": "00:20"
      }
    ],
    "wordCount": 458,
    "segments": 61
  }
}
You get:
- Full transcript as plain text
- Timestamped segments with precise start times and durations
- Word count for quick content analysis
- Segment count to understand video structure
- Works with YouTube Shorts - same API, same format
Code Examples
Choose your preferred language and start extracting YouTube transcripts:
JavaScript/Node.js Example
const axios = require('axios');

/**
 * Extract transcript from a YouTube video or Short
 * @param {string} videoUrl - The YouTube video URL
 * @param {string} accessKey - Your SocialKit API access key
 * @param {boolean} cache - Enable caching for faster subsequent requests
 * @returns {Promise<Object>} Transcript data
 */
async function getYouTubeTranscript(videoUrl, accessKey, cache = false) {
  const endpoint = 'https://api.socialkit.dev/youtube/transcript';

  let response;
  try {
    response = await axios.get(endpoint, {
      params: {
        access_key: accessKey,
        url: videoUrl,
        cache: cache,
      },
    });
  } catch (error) {
    // axios rejects on non-2xx responses; surface the API's message if there is one
    if (error.response) {
      throw new Error(
        `API Error: ${error.response.data?.message || error.response.statusText}`
      );
    }
    throw new Error(`Request failed: ${error.message}`);
  }

  // Checked outside the try block so this error is not re-wrapped as "Request failed"
  const data = response.data;
  if (!data.success) {
    throw new Error(`API Error: ${data.message || 'Unknown error'}`);
  }
  return data.data;
}

// Usage
(async () => {
  const ACCESS_KEY = 'your-access-key-here';
  const VIDEO_URL = 'https://youtube.com/watch?v=dQw4w9WgXcQ';

  try {
    const result = await getYouTubeTranscript(VIDEO_URL, ACCESS_KEY);
    console.log(`Video URL: ${result.url}`);
    console.log(`Word Count: ${result.wordCount}`);
    console.log(`Segments: ${result.segments}`);
    console.log(`\nFull Transcript:\n${result.transcript}`);
    console.log('\n--- First 5 Segments ---');
    result.transcriptSegments.slice(0, 5).forEach((segment) => {
      console.log(`[${segment.timestamp}] ${segment.text}`);
    });
  } catch (error) {
    console.error(`Error: ${error.message}`);
  }
})();
Python Example
import requests


def get_youtube_transcript(video_url, access_key, cache=False):
    """
    Extract transcript from a YouTube video or Short

    Args:
        video_url: The YouTube video URL
        access_key: Your SocialKit API access key
        cache: Enable caching for faster subsequent requests

    Returns:
        dict: Transcript data including full text and timestamped segments
    """
    endpoint = "https://api.socialkit.dev/youtube/transcript"
    params = {
        "access_key": access_key,
        "url": video_url,
        # requests would serialize a Python bool as "True"/"False";
        # send lowercase to match what the JavaScript example sends
        "cache": "true" if cache else "false",
    }

    try:
        # A timeout keeps the call from hanging indefinitely on a stalled connection
        response = requests.get(endpoint, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
    except requests.exceptions.RequestException as e:
        raise Exception(f"Request failed: {e}") from e

    if not data["success"]:
        raise Exception(f"API Error: {data.get('message', 'Unknown error')}")
    return data["data"]


# Usage
if __name__ == "__main__":
    ACCESS_KEY = "your-access-key-here"
    VIDEO_URL = "https://youtube.com/watch?v=dQw4w9WgXcQ"

    try:
        result = get_youtube_transcript(VIDEO_URL, ACCESS_KEY)
        print(f"Video URL: {result['url']}")
        print(f"Word Count: {result['wordCount']}")
        print(f"Segments: {result['segments']}")
        print(f"\nFull Transcript:\n{result['transcript']}")
        print("\n--- First 5 Segments ---")
        for segment in result['transcriptSegments'][:5]:
            print(f"[{segment['timestamp']}] {segment['text']}")
    except Exception as e:
        print(f"Error: {e}")
Response Structure Explained
The response data falls into two groups of fields:
Full Transcript Data
- transcript: Complete text transcript of the video
- wordCount: Total number of words in the transcript
- segments: Total number of transcript segments
Timestamped Segments
Each segment in transcriptSegments contains:
- text: The spoken text for this segment
- start: Start time in seconds
- duration: Duration of the segment in seconds
- timestamp: Human-readable timestamp (MM:SS format)
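To illustrate how these fields relate, the sketch below derives a segment's end time from `start` and `duration` and looks up the segment being spoken at a given second. The sample segments are copied from the example response earlier in this guide; `format_timestamp` and `segment_at` are hypothetical helpers, and the MM:SS formatting assumes videos under an hour (I haven't confirmed how the API formats longer ones).

```python
def format_timestamp(seconds):
    """Format a start time in seconds as MM:SS (assumes under one hour)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def segment_at(segments, t):
    """Return the segment whose [start, start + duration) range contains t."""
    for seg in segments:
        if seg["start"] <= t < seg["start"] + seg["duration"]:
            return seg
    return None


# Sample data taken from the example response in this guide
segments = [
    {"text": "[♪♪♪]", "start": 0, "duration": 5, "timestamp": "00:00"},
    {"text": "♪ We're no strangers to love ♪", "start": 5, "duration": 5, "timestamp": "00:05"},
    {"text": "♪ You know the rules\nand so do I ♪", "start": 10, "duration": 5, "timestamp": "00:10"},
]

if __name__ == "__main__":
    hit = segment_at(segments, 7)
    print(hit["timestamp"], hit["text"])
```

A lookup like this is handy for features such as "jump to the moment this was said" or linking search hits back to a point in the video.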
Caching for Better Performance
Enable caching to speed up repeated requests:
// Cache for 1 hour (3600 seconds)
const result = await axios.get(endpoint, {
  params: {
    access_key: accessKey,
    url: videoUrl,
    cache: true,
    cache_ttl: 3600,
  },
});
Caching Benefits:
- Faster response times for repeated requests
- Reduced API costs
- Better performance for batch processing
- TTL range: 1 hour (3600s) to 1 month (2592000s)
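Since the TTL has to fall within that range, it can help to clamp the value before sending it. Here is a small Python sketch; `cache_params` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) is my choice for this example, not documented API behavior:

```python
MIN_TTL = 3600       # 1 hour, the lower bound listed above
MAX_TTL = 2592000    # 1 month (30 days), the upper bound listed above


def cache_params(ttl_seconds):
    """Build the caching query parameters with the TTL clamped to the allowed range.

    Hypothetical helper; the parameter names `cache` and `cache_ttl`
    come from the JavaScript caching example in this guide.
    """
    ttl = max(MIN_TTL, min(MAX_TTL, int(ttl_seconds)))
    return {"cache": "true", "cache_ttl": ttl}


if __name__ == "__main__":
    print(cache_params(60))         # too small, raised to the minimum
    print(cache_params(86400))      # one day, within range
```

The returned dict can be merged into the request parameters alongside `access_key` and `url`.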
Use Cases
Here are practical ways to use YouTube transcript extraction:
- Content Analysis: Extract keywords and topics from educational videos
- Accessibility: Generate accurate subtitles and captions
- Research: Analyze video content at scale for academic studies
- SEO: Convert video content into searchable text
- AI Training: Create datasets from YouTube educational content
- Translation: Source material for multi-language subtitles
- Sentiment Analysis: Analyze tone and messaging in video content
- Content Moderation: Detect policy violations in video speech
- YouTube Shorts Analysis: Extract transcripts from short-form content
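As an example of the accessibility use case, timestamped segments map naturally onto the SubRip (.srt) subtitle format. The Python sketch below shows one way to do the conversion; `srt_time` and `segments_to_srt` are hypothetical helpers, and the code assumes each segment carries the `text`, `start`, and `duration` fields shown in the example response.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Convert transcript segments into SRT text.

    Each cue is a 1-based index, a "start --> end" line, and the text,
    with a blank line between cues.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_time(seg["start"])
        end = srt_time(seg["start"] + seg["duration"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {"text": "[♪♪♪]", "start": 0, "duration": 5},
        {"text": "♪ We're no strangers to love ♪", "start": 5, "duration": 5},
    ]
    print(segments_to_srt(sample))
```

Writing the returned string to a `.srt` file gives you subtitles that most video players can load.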
YouTube Shorts Support
The API seamlessly handles YouTube Shorts URLs:
// Works with Shorts URLs
const shortsUrl = 'https://youtube.com/shorts/abc123';
const result = await getYouTubeTranscript(shortsUrl, ACCESS_KEY);
No special handling needed - the API automatically detects Shorts and returns the same structured data.
Try It Free First
Want to test transcript quality before integrating the API? Use our free YouTube Transcript Extractor tool to extract transcripts from any YouTube video or Short directly in your browser. No coding required!
Want to Build Your Own Scraper?
Prefer building your own YouTube transcript scraper with Puppeteer? Check out our comprehensive DIY guide:
How to Scrape YouTube Transcripts With Puppeteer: Complete step-by-step tutorial covering:
- Setting up Puppeteer for YouTube scraping
- Handling YouTube's dynamic content loading
- Extracting timestamped transcript segments
- Bypassing anti-bot detection measures
- Error handling and retry logic
- Code examples and best practices
This tutorial is perfect if you:
- Want to learn how YouTube transcript scraping works under the hood
- Need custom scraping logic for specific requirements
- Prefer self-hosted solutions over APIs
- Want to avoid API costs for low-volume projects
Keep in mind that building your own scraper means maintaining it as YouTube's interface changes, handling rate limits, and dealing with CAPTCHAs. For production use at scale, the API approach is more reliable and cost-effective.
Related Resources
Expand your YouTube data extraction capabilities:
- Best YouTube Transcript APIs in 2025: Compare different API options
- Best YouTube Shorts Transcript APIs: Specialized guide for Shorts
- How to Scrape YouTube Transcripts With Puppeteer: DIY approach with code
- YouTube Summarizer API: Get AI-powered video summaries
- YouTube Stats API: Extract views, likes, comments, and more
- YouTube Comments API: Get comments for sentiment analysis
Complete YouTube API Suite
SocialKit offers a comprehensive suite of YouTube APIs:
- YouTube Transcript API: Extract timestamped transcripts (what you just learned!)
- YouTube Summarizer API: AI-powered video summaries
- YouTube Comments API: Extract and analyze comments
- YouTube Stats API: Get detailed video metrics
All APIs work seamlessly with both regular YouTube videos and YouTube Shorts.
Conclusion
Extracting YouTube transcripts doesn't have to be complicated. With SocialKit's YouTube Transcript API, you can skip the complexity of web scraping, browser automation, and anti-bot detection.
Get your free API access key and start extracting YouTube transcripts in minutes. Whether you're building accessibility tools, conducting research, or analyzing content at scale, you'll have reliable transcript data at your fingertips.
Works perfectly with YouTube videos and Shorts - one API for all your transcript needs!
Happy coding! 🚀