
Oxylabs, a leading web intelligence platform, announced the launch of the industry’s first consent-based YouTube datasets. Designed for ethical AI training, these datasets provide creator-approved video data, enabling seamless collaboration between content creators and AI developers while addressing copyright and innovation challenges.
Oxylabs launches first YouTube datasets with creator consent.
Includes videos, transcripts, and metadata for AI training.
Supports multimodal AI for text, audio, and visual processing.
Ensures transparency in data sourcing for ethical AI.
Aligns with the Ethical Web Data Collection Initiative, which Oxylabs co-founded.
Fosters fair collaboration between creators and AI companies.
Launched in Vilnius, Lithuania, Oxylabs’ YouTube datasets mark a milestone in ethical AI development. “In the ecosystem aiming to find a fair balance between respecting copyright and facilitating innovation, YouTube streamlining consent giving for AI training and providing creators with flexibility is an important step forward,” said Julius Černiauskas, CEO at Oxylabs. By ensuring all data has explicit creator consent, Oxylabs provides a transparent, verifiable source for AI training.
The datasets, comprising videos, transcripts, and detailed metadata, are optimized for training multimodal AI systems that process text, audio, and visual data. This structured, AI-ready data simplifies the development of advanced AI tools, addressing the industry’s need for high-quality, ethically sourced datasets to power content generation and task automation.
Oxylabs’ initiative promotes a cooperative ecosystem for AI development. “These datasets offer a breath of fresh air to a tense ecosystem in dire need of facilitating systematic cooperation between creators and AI companies based on mutual agreement,” said Černiauskas. This approach ensures creators’ rights are respected while enabling AI companies to innovate responsibly.
The launch builds on Oxylabs’ earlier work in ethical data sourcing, which includes co-founding the Ethical Web Data Collection Initiative (EWDCI) and establishing a transparent proxy sourcing framework. These consent-based datasets set a new standard for sustainable AI development, fostering trust and innovation across the industry.
Established in 2015, Oxylabs is a web intelligence platform and premium proxy provider, enabling companies of all sizes to utilise the power of big data. Constant innovation, an extensive patent portfolio, and a focus on ethics have allowed Oxylabs to become a global leader in the web intelligence collection industry and forge close ties with dozens of Fortune Global 500 companies. Oxylabs was named Europe's fastest-growing web intelligence acquisition company in the Financial Times FT 1000 list for several consecutive years.