Google is using its expansive library of YouTube videos to train its artificial intelligence models, including Gemini and the Veo 3 video and audio generator, CNBC has learned.

The tech company is turning to its catalog of 20 billion YouTube videos to train these new-age AI tools, according to a person who was not authorized to speak publicly about the matter. Google confirmed to CNBC that it relies on its vault of YouTube videos to train its AI models, but the company said it only uses a subset of its videos for the training and that it honors specific agreements with creators and media companies.

“We’ve always used YouTube content to make our products better, and this hasn’t changed with the advent of AI,” said a YouTube spokesperson in a statement. “We also recognize the need for guardrails, which is why we’ve invested in robust protections that allow creators to protect their image and likeness in the AI era — something we’re committed to continuing.”

Such use of YouTube videos has the potential to lead to an intellectual property crisis for creators and media companies, experts said.

While YouTube says it has shared this information previously, experts who spoke with CNBC said it’s not widely understood by creators and media organizations that Google is training its AI models using its video library.

CNBC spoke with multiple leading creators and IP professionals; none were aware, or had been informed by YouTube, that their content could be used to train Google’s AI models.

  • supersquirrel@sopuli.xyz
    17 days ago

    I need everyone to understand how BIG of a theft this is from artists, from all artists.

    This Is War

    “It doesn’t hurt their competitive advantage at all to tell people what kind of videos they train on and how many they trained on,” Arrigoni said. “The only thing that it would really impact would be their relationship to creators.”

    The landlords don’t like it when they have to tell us how they finance their ownership of all our homes.

  • Asafum@feddit.nl
    17 days ago

    “Sure, I can provide you an easy recipe for pancakes: just be sure to click like and subscribe! Now the ingredients…”

    Not to mention all the theft as mentioned elsewhere here…

    • Grimy@lemmy.world
      16 days ago

      You sometimes get these types of hallucinations during moments of silence when using the Whisper speech-to-text model. It’s very off-putting.

  • heyWhatsay@slrpnk.net
    16 days ago

    I thought it was obvious. First they use the videos for ad revenue, then for AI food. Btw, everything in Gmail and Google Docs is on the menu too.

  • Ogmios@sh.itjust.works
    17 days ago

    It’s kind of funny that AI is training on videos which have already been deliberately manipulated to game YouTube’s monetization policies, rather than caring about the actual product itself.

  • nullpotential@lemmy.dbzer0.com
    17 days ago

    Well that sucks. I doubt I’ll be using YouTube to make content now, which is a catch-22 because there is no other platform as popular.

  • Grimy@lemmy.world
    16 days ago

    I’m pretty certain all video generation models use YouTube. It’s actually a treasure trove of data. Almost all video data is either owned by Hollywood or YouTube and similar apps (tiktok, twitch).

    Sucks for content creators, but I don’t think they will be getting anything out of it. It will either be freely usable or Google will lobby the government, using the anti-AI sentiment that has been drummed up, to give itself ownership.