yt-transcript-kit

Lightweight YouTube transcript extraction and YouTube search for apps that need video text and metadata first and decide what to do with them afterwards. Written in TypeScript, with zero runtime dependencies.

Install

node --version # requires Node.js 18+
npm install yt-transcript-kit

To run the CLI without a global install, use npx yt-transcript-kit --help.

Quick Start

import { fetchYouTubeTranscript, searchYouTube } from 'yt-transcript-kit';

const result = await fetchYouTubeTranscript('https://www.youtube.com/watch?v=dQw4w9WgXcQ');
console.log(result.title, result.fullText);

const videos = await searchYouTube({ query: 'typescript tutorial', maxResults: 5 });
// Optional chaining avoids a TypeError when the search returns no results.
console.log(videos[0]?.title, videos[0]?.url);

New APIs

YouTube Search

Search YouTube videos without a YouTube Data API key.

import { searchYouTube } from 'yt-transcript-kit';

const results = await searchYouTube({
  query: 'node.js streams',
  maxResults: 10,
  hl: 'en',
});

for (const video of results) {
  console.log(video.title);
  console.log(video.channelName, video.duration, video.viewCount);
  console.log(video.url);
}

Each result includes:

videoId
title
channelName
channelId
publishedAt
viewCount
duration
durationSeconds
thumbnailUrl
description
url

Search With Transcripts

Fetch transcripts for search results in one call.

import { searchYouTubeWithTranscripts } from 'yt-transcript-kit';

const results = await searchYouTubeWithTranscripts({
  query: 'react server components',
  maxResults: 3,
  includeTranscripts: true,
  transcriptOptions: {
    languages: ['en'],
  },
});

for (const video of results) {
  if (video.transcript) {
    console.log(video.title, video.transcript.fullText.slice(0, 200));
  } else {
    console.warn(video.title, video.transcriptError);
  }
}

includeTranscripts: true makes one transcript request per search result, so keep maxResults modest for CLI tools and server routes.
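
The request cost described above can be made explicit before calling the library. The sketch below is only an illustration: capMaxResults, estimateRequests, and the cap of 5 are hypothetical names and values that are not part of yt-transcript-kit, and it assumes the search itself costs a single request.

```typescript
// Hypothetical helpers, not part of yt-transcript-kit.
interface SearchBudget {
  maxResults: number;
  includeTranscripts: boolean;
}

// Cap maxResults only when transcripts are requested, because each result
// then triggers one extra transcript request.
// The default cap of 5 is an arbitrary illustrative value, not a library limit.
function capMaxResults(options: SearchBudget, transcriptCap = 5): number {
  const requested = Math.max(1, Math.floor(options.maxResults));
  return options.includeTranscripts ? Math.min(requested, transcriptCap) : requested;
}

// Rough request estimate: one search (an assumption) plus one per transcript.
function estimateRequests(options: SearchBudget, transcriptCap = 5): number {
  const results = capMaxResults(options, transcriptCap);
  return options.includeTranscripts ? 1 + results : 1;
}
```

The capped value can then be passed as maxResults to searchYouTubeWithTranscripts.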

Transcript Search

import { fetchYouTubeTranscript, searchTranscript } from 'yt-transcript-kit';

const transcript = await fetchYouTubeTranscript('videoId');
const matches = searchTranscript(transcript, 'keyword', {
  caseSensitive: false,
  maxResults: 20,
  contextChars: 24,
});

console.log(matches[0]);
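
To give a feel for what the options above might control, here is a minimal standalone sketch of keyword search over segments. It is not the library's implementation: findInSegments, the Segment and Match shapes, and the reading of contextChars as "characters kept on each side of the hit" are all assumptions made for illustration.

```typescript
// Assumed shapes for illustration; the library's real types may differ.
interface Segment {
  text: string;
}

interface Match {
  segmentIndex: number;
  excerpt: string;
}

interface FindOptions {
  caseSensitive?: boolean;
  maxResults?: number;
  contextChars?: number;
}

// Hypothetical helper: returns the first hit in each matching segment,
// trimmed to contextChars characters on either side of the keyword.
function findInSegments(segments: Segment[], keyword: string, options: FindOptions = {}): Match[] {
  const { caseSensitive = false, maxResults = 20, contextChars = 24 } = options;
  const matches: Match[] = [];
  if (keyword.length === 0) return matches;

  const needle = caseSensitive ? keyword : keyword.toLowerCase();
  for (let i = 0; i < segments.length && matches.length < maxResults; i++) {
    const text = segments[i].text;
    // Lowercasing can shift offsets for a few Unicode characters;
    // that is acceptable for a sketch.
    const haystack = caseSensitive ? text : text.toLowerCase();
    const at = haystack.indexOf(needle);
    if (at === -1) continue;

    const from = Math.max(0, at - contextChars);
    const to = Math.min(text.length, at + keyword.length + contextChars);
    matches.push({ segmentIndex: i, excerpt: text.slice(from, to) });
  }
  return matches;
}
```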

Chunking for LLM pipelines

import { fetchYouTubeTranscript, chunkTranscript } from 'yt-transcript-kit';

const transcript = await fetchYouTubeTranscript('videoId');

const chunks = chunkTranscript(transcript, {
  maxChars: 4000,
  maxTokens: 1200,
  overlapSegments: 1,
  mergeAdjacentShortSegments: true,
});

console.log(chunks[0]);
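
For readers wondering how maxChars and overlapSegments could interact, the sketch below shows one plausible segment-based chunking strategy. It is an illustration under assumptions, not the algorithm chunkTranscript actually uses: chunkSegments and the Segment and Chunk shapes are hypothetical, and maxTokens and mergeAdjacentShortSegments are left out.

```typescript
// Assumed shapes for illustration; the library's real types may differ.
interface Segment {
  text: string;
}

interface Chunk {
  text: string;
  startIndex: number; // index of the first segment in the chunk
  endIndex: number;   // index of the last segment in the chunk
}

// Hypothetical helper: greedily packs whole segments into chunks of at most
// maxChars characters, repeating the last overlapSegments segments at the
// start of the next chunk.
function chunkSegments(segments: Segment[], maxChars: number, overlapSegments: number): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;

  while (start < segments.length) {
    let end = start;
    let length = 0;

    // Always take at least one segment, then keep adding while under budget.
    while (end < segments.length) {
      const added = segments[end].text.length + (end > start ? 1 : 0); // +1 for the joining space
      if (end > start && length + added > maxChars) break;
      length += added;
      end++;
    }

    chunks.push({
      text: segments.slice(start, end).map((s) => s.text).join(' '),
      startIndex: start,
      endIndex: end - 1,
    });

    if (end >= segments.length) break;
    // Step back by the overlap, but always advance by at least one segment.
    start = Math.max(end - overlapSegments, start + 1);
  }
  return chunks;
}
```

Splitting on segment boundaries keeps every chunk aligned with transcript timing, at the cost of letting a single oversized segment exceed maxChars.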

Formatting modes

import { fetchYouTubeTranscript, formatTranscript } from 'yt-transcript-kit';

const transcript = await fetchYouTubeTranscript('videoId');

const plainText = formatTranscript(transcript, { mode: 'plainText' });
const markdown = formatTranscript(transcript, { mode: 'markdown', includeTimestamps: true });
const paragraphs = formatTranscript(transcript, { mode: 'paragraphs', paragraphMergeThresholdSec: 2 });
const segments = formatTranscript(transcript, { mode: 'segments' });

console.log(markdown);
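
As a rough picture of what timestamped markdown output could look like, the sketch below formats segment start times by hand. The helper names, the TimedSegment shape, and the exact line layout are assumptions for illustration; formatTranscript may produce different markup.

```typescript
// Assumed segment shape for illustration; start is in seconds.
interface TimedSegment {
  start: number;
  text: string;
}

// Hypothetical helper: renders seconds as mm:ss, or h:mm:ss from one hour up.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => String(n).padStart(2, '0');
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${pad(minutes)}:${pad(seconds)}`;
}

// Hypothetical helper: one markdown bullet per segment, prefixed by its time.
function toMarkdownLines(segments: TimedSegment[]): string {
  return segments.map((s) => `- **[${formatTimestamp(s.start)}]** ${s.text}`).join('\n');
}
```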

Metadata helper

import { getTranscriptWithMetadata } from 'yt-transcript-kit';

const enriched = await getTranscriptWithMetadata('videoId');
console.log(enriched.channelName, enriched.duration, enriched.thumbnailUrls);

Optional cache support

import { fetchYouTubeTranscript, InMemoryTranscriptCache } from 'yt-transcript-kit';

const cache = new InMemoryTranscriptCache({ ttlMs: 60_000 });
const transcript = await fetchYouTubeTranscript('videoId', { cache });

console.log(transcript.videoId);
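
The ttlMs option suggests entries expire after a fixed time. The sketch below shows that idea with a small standalone TTL cache. TtlCache is a hypothetical class: it is not InMemoryTranscriptCache and does not implement the library's cache interface, which this README does not document.

```typescript
// Hypothetical standalone TTL cache, shown only to illustrate expiry by ttlMs.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired entries are evicted lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A short TTL such as the 60_000 ms used above keeps repeated lookups cheap within a request burst while still picking up caption changes fairly quickly.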

Batch fetching

import { fetchManyYouTubeTranscripts } from 'yt-transcript-kit';

const results = await fetchManyYouTubeTranscripts(['videoId1', 'videoId2'], { concurrency: 3 });

for (const item of results) {
  if (item.success) {
    console.log(item.result.videoId, item.result.languageCode);
  } else {
    console.error(item.input, item.error.code);
  }
}
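
When a batch is large, it can help to group failures by error code rather than log them one by one. The sketch below does that with a local BatchItem type that mirrors only the fields used in the loop above; the type and summarizeBatch are hypothetical, and the library's actual result type may carry more fields.

```typescript
// Local type mirroring only the fields shown in the example above (an assumption).
type BatchItem =
  | { success: true; result: { videoId: string; languageCode: string } }
  | { success: false; input: string; error: { code: string } };

interface BatchSummary {
  succeeded: string[];                    // video IDs that were fetched
  failedByCode: Record<string, string[]>; // error code -> failing inputs
}

// Hypothetical helper: splits a batch into successes and failures grouped by code.
function summarizeBatch(items: BatchItem[]): BatchSummary {
  const summary: BatchSummary = { succeeded: [], failedByCode: {} };
  for (const item of items) {
    if (item.success) {
      summary.succeeded.push(item.result.videoId);
    } else {
      (summary.failedByCode[item.error.code] ??= []).push(item.input);
    }
  }
  return summary;
}
```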

Cleanup helpers

import {
  cleanTranscriptSegments,
  cleanTranscriptText,
  fetchYouTubeTranscript,
} from 'yt-transcript-kit';

const transcript = await fetchYouTubeTranscript('videoId');

const cleanedText = cleanTranscriptText(transcript, {
  stripBracketedMarkers: true,
  dedupeAdjacentLines: true,
});

const cleanedSegments = cleanTranscriptSegments(transcript.segments, {
  normalizeWhitespace: true,
});

console.log(cleanedText, cleanedSegments.length);

CLI

npx yt-transcript-kit <url>
npx yt-transcript-kit <url> --format txt
npx yt-transcript-kit <url> --format json
npx yt-transcript-kit <url> --format markdown
npx yt-transcript-kit <url> --languages de,en
npx yt-transcript-kit <url> --search "keyword"
npx yt-transcript-kit <url> --chunks --max-chars 4000
npx yt-transcript-kit batch urls.txt --format json
npx yt-transcript-kit batch urls.txt --concurrency 3
npx yt-transcript-kit search "typescript tutorial" --max-results 5
npx yt-transcript-kit search "typescript tutorial" --format json
npx yt-transcript-kit search "typescript tutorial" --transcripts --languages en

Use --help to print command help.

Typical CLI uses:

--search prints matching transcript segments with their segment index.
--chunks prints chunked transcript text, or structured JSON when combined with --format json.
batch <file> --format json returns per-input success or failure records.
search <query> prints video metadata and URLs from YouTube search results.
search <query> --transcripts also attempts to fetch a transcript for each returned video.

Error Codes

INVALID_VIDEO_ID, VIDEO_UNAVAILABLE, RATE_LIMITED, NO_TRANSCRIPT, LANGUAGE_NOT_AVAILABLE, REQUEST_FAILED, EMPTY_QUERY, SEARCH_FAILED.
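
Callers often want to decide which of these codes are worth retrying. The sketch below is one way to do that. The code list comes from the line above, but everything else is an assumption: the helper names are hypothetical, and treating RATE_LIMITED and REQUEST_FAILED as transient is a judgment call rather than documented library behavior.

```typescript
// Error codes as listed in this README.
const ERROR_CODES = [
  'INVALID_VIDEO_ID',
  'VIDEO_UNAVAILABLE',
  'RATE_LIMITED',
  'NO_TRANSCRIPT',
  'LANGUAGE_NOT_AVAILABLE',
  'REQUEST_FAILED',
  'EMPTY_QUERY',
  'SEARCH_FAILED',
] as const;

type ErrorCode = (typeof ERROR_CODES)[number];

// Hypothetical type guard for values read from an error's code property.
function isKnownErrorCode(value: unknown): value is ErrorCode {
  return typeof value === 'string' && (ERROR_CODES as readonly string[]).includes(value);
}

// Assumption: only these two codes describe transient conditions.
function isRetryable(code: ErrorCode): boolean {
  return code === 'RATE_LIMITED' || code === 'REQUEST_FAILED';
}

// Exponential backoff with a ceiling; the default values are illustrative.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

Codes such as INVALID_VIDEO_ID or NO_TRANSCRIPT describe the input rather than the network, so retrying them is unlikely to help.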

Environment Notes

Node.js 18+ is required.
Standard browsers are not supported because YouTube transcript requests are blocked by CORS.
Server runtimes, CLIs, browser extensions, and React Native are the intended environments.
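
An application that embeds the library may want to fail early on an unsupported Node.js version. The sketch below shows one way to check the Node.js 18+ requirement; parseNodeMajor and meetsMinimumNode are hypothetical helpers, not exports of yt-transcript-kit.

```typescript
// Hypothetical helper: extracts the major version from strings such as
// "v18.17.0" or "20.11.1".
function parseNodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version: ${version}`);
  }
  return Number(match[1]);
}

// Hypothetical helper: true when the version satisfies the minimum major.
// The default of 18 comes from the requirement stated in this README.
function meetsMinimumNode(version: string, minimumMajor = 18): boolean {
  return parseNodeMajor(version) >= minimumMajor;
}
```

In a Node.js entry point, the input would typically be process.versions.node.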

Development

npm run build
npm run typecheck
npm run test
