YouTubeScrap: a comprehensive tool for scraping YouTube data and transcript

Authors

DOI:

https://doi.org/10.5902/2357797592749

Keywords:

Data scraping, Computing, Computational social sciences

Abstract

YouTubeScrap is an open-source tool that streamlines the collection, analysis, and organization of YouTube video data and transcripts for researchers, analysts, and content creators. It lets users run targeted searches, extract detailed metadata, and retrieve multilingual transcripts without API keys, which addresses growing restrictions on data accessibility. Because it runs in Google Colab, YouTubeScrap needs no local installation and offers a ready-to-use cloud environment for users with varying technical expertise. The tool integrates Python libraries such as yt_dlp, YouTubeTranscriptAPI, and scrapetube to automate video searches and filter results by criteria such as keywords and date ranges, and it stores outputs in Google Sheets for easy collaboration and compliance with international data privacy standards. This API-free approach broadens access to digital content and supports large-scale data collection and analysis in academic research, media studies, and communication. By simplifying complex data-handling steps, YouTubeScrap helps users work with large volumes of online content ethically and efficiently, promoting equitable access to online information at a time of increasing platform restrictions. The tool is intended as a scalable, user-friendly resource for data-driven research.
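The filtering step described in the abstract (narrowing results by keywords and date ranges) can be illustrated with a minimal sketch. This is not the code of YouTubeScrap itself. It assumes each video is a plain dictionary with a `title` and an `upload_date` in `YYYYMMDD` form, which is the format yt_dlp uses for that field. The function name `filter_videos` and the sample records are hypothetical.

```python
from datetime import date, datetime


def filter_videos(videos, keywords, start, end):
    """Keep videos whose title contains any keyword (case-insensitive)
    and whose upload date falls within [start, end], inclusive."""
    lowered = [k.lower() for k in keywords]
    selected = []
    for video in videos:
        title = (video.get("title") or "").lower()
        raw_date = video.get("upload_date")
        if not raw_date:
            continue  # skip records without a usable date
        # Assumption: dates are strings in YYYYMMDD form.
        uploaded = datetime.strptime(raw_date, "%Y%m%d").date()
        if start <= uploaded <= end and any(k in title for k in lowered):
            selected.append(video)
    return selected


# Hypothetical sample records, for illustration only.
sample = [
    {"id": "a1", "title": "Election debate highlights", "upload_date": "20240115"},
    {"id": "b2", "title": "Cooking pasta at home", "upload_date": "20240220"},
    {"id": "c3", "title": "Election results explained", "upload_date": "20230101"},
]

result = filter_videos(sample, ["election"], date(2024, 1, 1), date(2024, 12, 31))
print([v["id"] for v in result])  # → ['a1']
```

In a real workflow, the input records would come from a search step and the filtered output would be written to a spreadsheet. Both steps are omitted here because they depend on network access and credentials.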

Author biographies

Isabela Rocha, University of Brasília

PhD Candidate, University of Brasília, Brasília, Federal District, Brazil.

Ergon Cugler de Moraes Silva, Council for Sustainable Economic and Social Development

Master in Public Administration and Government from Fundação Getulio Vargas; Member of the Council for Sustainable Economic and Social Development, Brasília, Federal District, Brazil.
Website: https://ergoncugler.com/
Contact: contato@ergoncugler.com

References

Silva, Ergon Cugler de Moraes; Rocha, Isabela. YouTubeScrap: A comprehensive tool for scraping YouTube data and transcript. (Dec, 2024). Available at: https://github.com/ergoncugler/web-scraping-youtube.

Statista. Leading countries based on YouTube audience size as of July 2024 (in millions) (2024). Available at: https://www.statista.com/statistics/280685/number-of-monthly-unique-youtube-users/.

Statista. Leading social media platforms in Brazil 2023, by reach (2023). Available at: https://www.statista.com/statistics/1307747/social-networks-penetration-brazil/.

Published

2025-09-30

How to cite

Rocha, I., & Silva, E. C. de M. (2025). YouTubeScrap: a comprehensive tool for scraping YouTube data and transcript. InterAção, 16(4), e92749. https://doi.org/10.5902/2357797592749