Technology

Comment: Questions about little-known US web scraper could mean increased legal scrutiny for AI training data

A large chunk of the data used to train AI systems such as ChatGPT comes from a little-known nonprofit with no paid employees whose address comes back to a nondescript building beside a parking deck in Beverly Hills, California.
While its 9.5 petabytes of data scraped from the web have


Mike Swift

Chief Global Digital Risk Correspondent


Mike Swift is an award-winning journalist who has been at the forefront of covering data, privacy and cybersecurity regulatory news for more than a decade. As the Chief Global Digital Risk Correspondent for MLex, in addition to reporting, he coordinates MLex’s worldwide coverage in the practice area. Formerly chief Internet reporter for the San Jose Mercury News and SiliconValley.com, Mike has covered Google, Facebook, Apple, Microsoft, Twitter and other tech companies and has closely tracked technology and regulatory trends in Silicon Valley. He has wide-ranging expertise, from the business of professional sports to computer-assisted reporting. A former John S. Knight Fellow at Stanford University, he is a graduate of Colby College.
