This week, an AI training startup called Shift announced it would clean New Yorkers' homes for free, with plans to expand into other cities including London. Looking around my own flat, I get the appeal. But there's a catch - there's always a catch.
In exchange for the cleaning, Shift wants footage of its cleaners at work: scrubbing dishes, wiping counters, dusting tables, mopping floors. It wants video of all the boring domestic labor we'd happily outsource if we could - the same chores robotics companies are racing to teach machines to do, so they can sell us something that does them for us. That's harder than it sounds. Unlike chatbots, image generators, and other AI tools that have exploded in recent years, robots have to deal with the physical world - understanding space, motion, force, friction, weird shapes and materials, awkward lighting, and everything else humans grasp instinctively. It's why things that are generally easy for us, like folding clothes, picking up an apple, or pouring a glass of water, have proven maddening for roboticists to codify.
Teaching machines to do those things takes lots of data. Text, images, and videos can be scraped from the internet easily and at industrial scale - often without compensating the people who made them. The physical world is harder to scrape, and harder still to scrape quietly without paying for it. This makes access to high-quality data a massive bottleneck for companies developing physical AI, so startups like Shift are getting creative.
They're not alone. In India, recent reporting revealed that home services platform Pronto has been using clients' homes as a source of AI training footage for chores like cooking, cleaning, and laundry. Pronto says it only records footage if customers explicitly opt in - it's not clear what customers get in return, other than a copy of the footage - but the practice set off a wave of backlash, with rival startups insisting they have never recorded inside homes to train AI.
Other startups are focused on scaling data collection. Silicon Valley-based Human Archive hopes to partner with companies like Pronto and have gig workers record their activities using not-so-stylish camera caps. The hats collect footage from the wearer's point of view - exactly the kind of “egocentric” or first-person data robotics companies need to teach machines how people navigate physical space. Shift also taps consumers directly, claiming to have paid tens of thousands of people across 15 countries to record their activities through its app. Some companies are skipping useful work altogether, paying workers to complete the exact same physical tasks again and again while cameras and sensors capture every movement - turning rote physical activity like folding towels, picking up cups, and carrying boxes into valuable AI training material.
And some data comes from robots already out in the world. Despite the hype, true automation is still a long way away - hence the need for all this data - but companies are keen to ship products anyway, using data from customers' homes to improve them. Many rely on remote workers to step in when robots inevitably get stuck, and the companies use that data too.
Of course, trading data for something of value is not new - companies have offered discounts, convenience, and free services in exchange for access to your data for years, from loyalty cards and cookies to dashcams, insurance apps monitoring how people drive, and that heinous smart TV always showing ads. What's new is the kind of data companies are willing to pay for. For now, that means maybe letting a human clean your home in a snazzy hat for free so that, eventually, a company can sell you a robot to do it instead.