Apple reportedly used videos from late-night hosts and others without permission to train AI
A new report says that Apple and other companies have used content from YouTube videos to train AI models without the permission of the videos' creators. According to the report, a third party compiled a file of subtitles taken from over 170,000 videos. These videos include content from longtime tech reviewer Marques Brownlee (MKBHD) and late-night comics Stephen Colbert and Jimmy Kimmel.
“Technology companies have run roughshod. People are concerned about the fact that they didn’t have a choice in the matter,” said Amy Keller, a partner at the law firm DiCello Levitt. “I think that’s what’s really problematic.”
Large companies like Apple used a dataset created by EleutherAI called YouTube Subtitles, which does not include imagery but does contain the plain text of the videos’ subtitles, along with translations into languages such as Japanese, German, and Arabic. YouTube Subtitles contains content from over 12,000 videos, some of which have since been deleted from YouTube. One unnamed creator who had deleted all of his online videos discovered that his work was still included in some AI models.
The problem is that none of the YouTube creators were asked for permission to use their videos to train AI models. Although members of the AI community have faced lawsuits for using content without permission, companies like OpenAI and Meta have defended themselves by arguing that their actions are covered by the fair use doctrine, which allows the unlicensed use of copyrighted material in certain situations.