- 16th April 2018
- Posted by: Manolis
Will robots replace humans as arbiters of culture, beauty, and refinement?
When Hui Wu was growing up in China in the 1990s, she had two interests: fashion and math. The farming town where she lived was so small and poor the fields were tilled by oxen, so there wasn’t much opportunity for her to explore the first interest, and she was a girl, so her teachers told her there wasn’t much point in pursuing the second, since she would fall behind the boys eventually anyway. Nevertheless, she persisted, winning admission to an elite high school, and she learned computer programming in college.
Wu became fascinated with the problem of image recognition; she took as a challenge the computer’s inability to perform seemingly basic functions. “The very simple task of detecting a slow-moving vehicle might involve hundreds of lines of code,” she says today.
This was something a two-year-old could do. Why was it so hard to train a computer? Wu came to the U.S. to get her PhD in computer science at the University of North Carolina, Charlotte, and she put her mind to medical imaging, publishing several conference papers on things like programming a computer to find the chambers of a human heart in an ultrasound image.
Now 29, Wu works at IBM’s Thomas J. Watson Research Center in Westchester County. Wearing a long-sleeve black knit dress that spreads into pleats at the skirt, plum-colored patent leather shoes with chunky heels, and an Hermès scarf gathered below the neck with a diamond band, she describes how she has, at last, been able to link her interest in fashion to her professional life.
“Fashion to me is a great place to apply computer vision,” she says, “because there are so many sources of data that the tools can leverage now, like social media and online catalog images.” Her colleagues tell her she’s the best-dressed computer scientist at IBM; her fashionable friends say she’s the best programmer among them.
Wu and her colleagues have developed a tool called Cognitive Prints that looks for clothing designs similar to an image fed into the software. It’s essentially a search engine that works off of pictures instead of words. A designer can upload a photo of a dress she has seen in a runway show and, in seconds, view hundreds of photos of dresses that are comparable in silhouette, color, and pattern.
Or she can crop the image down to a particular component of the dress—say, its ruffles—and the computer will find dresses with ruffles, so the designer can see how their use has trended over the years.
The search results can be filtered by year, designer, or domain (such as runway shows or street fashion). “Designers can ask questions like, ‘Hey, I’ve designed this new dress, but can you show me similar designs from a particular designer in the ’80s? I want to see how they match up,’ ” Wu says. The computer works with a data set so large no human could possibly get through it in a reasonable amount of time, so it provides insights not previously possible. “A lot of the knowledge I learned in medical imaging can be transferred to fashion images—say, for segmenting handbags or skirts.”
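IBM hasn’t published Cognitive Prints’ internals, but search-by-image systems of this kind typically embed every catalog photo with a neural network and rank results by vector similarity. Here is a minimal sketch of that idea, assuming PyTorch, an off-the-shelf pretrained ResNet as the feature extractor, and hypothetical file names; none of these choices come from IBM.

```python
# Illustrative sketch of image-based similarity search, the core idea behind
# a tool like Cognitive Prints. The pretrained ResNet and cosine-similarity
# ranking here are stand-ins, not IBM's actual pipeline.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

# Pretrained CNN with its classification head removed: images in,
# fixed-length feature vectors ("embeddings") out.
backbone = models.resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()
backbone.eval()

prep = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def embed(path: str) -> torch.Tensor:
    """Map one image (or a cropped detail, e.g. just the ruffles) to a vector."""
    with torch.no_grad():
        return backbone(prep(Image.open(path).convert("RGB")).unsqueeze(0))[0]

# Index a catalog of dress photos (hypothetical files), then rank them by
# cosine similarity to the query: nearest vectors = most similar designs.
catalog = ["dress_001.jpg", "dress_002.jpg", "dress_003.jpg"]
index = torch.stack([embed(p) for p in catalog])

query = embed("runway_photo.jpg")  # hypothetical query image
scores = F.cosine_similarity(index, query.unsqueeze(0))
for path, score in sorted(zip(catalog, scores.tolist()), key=lambda t: -t[1]):
    print(f"{score:.3f}  {path}")
```

Filtering by year or designer, as Wu describes, would then just mean restricting which catalog vectors are searched.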
Cognitive Prints can design patterns based on existing ones, from clothes or other data sets, such as architectural images. Michael Ferraro, executive director of FIT’s Infor Design and Tech Lab, foresees practical applications for the tool, which FIT students, collaborating with industry pros from Tommy Hilfiger, explored in a pilot project last fall. “It allows designers to explore contexts and situations that characterize the problem they’re facing,” he says. “And you also find situations where you had a great idea but someone had it last year.”
Wu’s project is one of a number of new tools computer scientists are developing to help craftspeople and artists with their creative processes, and to produce creative works robotically. “We see AI as potentially transformative in the way people might be able to communicate through the arts,” says Doug Eck, a research scientist working on Project Magenta at Google AI who is focused on what is known as machine learning for creativity.
His team is teaching an AI how to improvise on the piano. Engineers at Sony’s Computer Science Laboratories, in Paris, have created an AI that writes pop songs. A professor at MIT has written code that composes poetry. Programmers at Rutgers University’s Art & AI Laboratory and Facebook AI Research collaborated with art historians to generate images that look like abstract oil paintings—pretty good ones, according to a panel of humans who were shown the images.
Researchers at Google have taught an AI to crop landscape images from Google Street View and apply a combination of filters to give the pictures aesthetic appeal. Those images tricked judges at photography competitions into thinking humans had created them.
Scientists in Denmark and Portugal collaborated on a sort of artificial mouth that measures the level of dryness in wine, which could help vintners make adjustments to achieve a desired result. On the retail side, Verve Wine will tailor recommendations based on personal preferences fed into its algorithm.
Fashion has not been immune to the trend. The online style service Thread uses AI to recommend clothing based on what’s already in your closet, much in the way Netflix and Spotify offer suggestions drawing on what you’ve already streamed. Stitch Fix takes it one step further, with an AI that suggests outfits for customers based on its analysis of what’s trending on the internet, and of what it predicts will be. And H&M’s online brand Ivyrevel, in collaboration with Google, promises to collect lifestyle data—where you go and what you do—and then recommend a custom outfit based on its findings.
Amazon researchers at Lab 126 in San Francisco have built an AI that can design a new look derived from images it’s fed. Designers at Marchesa even collaborated with IBM’s AI platform, Watson, to design a dress for Karolina Kurkova to wear to the 2016 Met Gala.
“This project really changed our view on the potential of technology in the fashion industry,” says Marchesa co-founder Keren Craig. “This era is opening up new avenues for designers to approach creative thinking.”
Marc C. Close, founder and CEO of Bespokify, which enables retailers to create custom patterns from 3-D body scans and customize details for each purchaser, envisions a day when technology will supplant brands’ designers, developing unique products on the spot for anyone who walks into a store or visits an e-commerce site.
“Rather than training people in design, we’ll train them in how to train AI designers,” he says. “So you’ll have the skills to teach AI to curate the design experience around an individual based on what the brand is hoping to achieve. Designers of the future will treat AI as a paintbrush to achieve retail and brand objectives.”
Nevertheless, Jean Z. Poh, founder and CEO of Swoonery, one of the most technologically advanced e-commerce jewelry retailers, is skeptical that AI design will replace humans anytime soon. “There’s something about the human touch that is transferred in the act of creation, and a machine is just too perfect,” she says. “AI is basing it off data, but creativity is all about the ‘anti-data.’”
Since the earliest days of AI, computer engineers and philosophers have struggled to define intelligence. Is it the ability to reason? To learn? Is it a physical attribute, such as the number of brain cells relative to mass? (A prominent early-20th-century German scientist asserted that “with a head circumference under 52 centimeters you cannot expect an intellectual performance of any significance,” a premise that was used to discriminate against women.) Is creativity a component of intelligence? Or perhaps a certain level of intelligence is required for creativity.
The problem is, defining creativity may be even more difficult than defining intelligence. But some of the programmers—or should we call them artists?—working in this new field of artificial creativity are trying to at least describe some of its attributes. Ahmed Elgammal and his colleagues at Rutgers, in a paper about teaching an AI to produce abstract works of art, cite the writings of 20th-century psychologist Daniel E. Berlyne on how significant works of art produce “arousal” in people who encounter them—a level of excitement, alertness, or passion. “Berlyne emphasized that the most significant arousal-raising properties for aesthetics are novelty, surprisingness, complexity, ambiguity, and puzzlingness,” the paper said.
“But it also has to be rooted in something we recognize,” says John R. Smith, a colleague of Wu’s at IBM who as a student at Columbia in the 1990s developed one of the earliest image search engines for the web. Successful works of art riff off, and sometimes refer to, those that preceded them—meaning their creators need to be familiar with the canon. Picasso learned from Raphael; Lennon and McCartney learned from Muddy Waters; Robert Mapplethorpe learned from Joseph Cornell. Bob Dylan told the Los Angeles Times in 2004, “Anyone who wants to be a songwriter should listen to as much folk music as they can, study the form and structure of stuff that has been around for 100 years.”
Machine learning is no different. For an artificial intelligence to be creative, it turns out, first you have to teach it to have good taste. Eck, of Google’s music composition project, says, “Just as I go to concerts and listen to my favorite bands, or I go to museums to see my favorite paintings, these machine learning models—in their own numerical way—experience this material and try to make sense of it.” Similarly, Wu explains that Cognitive Prints was “trained using big data, and in the fashion domain that means lots of fashion images.”
In the visual arts, an online platform based in Berlin called EyeEm uses machine learning to recommend high-quality photographs. More than 15 million photographers have uploaded their work to the site, and brand marketers or photo editors can search it for pictures they want to license. Nobody wants to use a crappy photograph, though, and EyeEm has millions of images, so Appu Shaji, the company’s head of R&D, asked a question: “Can we start to recognize patterns of how an expert curator would make this choice and duplicate that in a machine?”
To recommend aesthetically pleasing images, Shaji and colleagues developed a tool called EyeEm Vision, which can distinguish master-quality photographs from images lacking visual charm and coherence. This is a subjective, even philosophical question, no doubt, but posed mathematically it’s one a computer can answer. Simply put, you want the machine to find the differential between two high-quality images to be smaller than the differential between a high-quality image and one of low quality.
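That inequality has a textbook formulation: a triplet loss over learned image embeddings. EyeEm hasn’t published EyeEm Vision’s training code, so the sketch below is only a minimal illustration of the constraint just described, assuming PyTorch and an embedding model trained elsewhere.

```python
# Sketch of the ranking constraint described above, expressed as a standard
# triplet loss: the distance between two curator-approved images should be
# smaller (by a margin) than the distance between a good image and a bad one.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, good, bad, margin=0.2):
    """anchor/good: embeddings of high-quality photos; bad: a low-quality one.
    The loss reaches zero only once d(anchor, good) + margin <= d(anchor, bad)."""
    d_good = F.pairwise_distance(anchor, good)  # differential: two high-quality images
    d_bad = F.pairwise_distance(anchor, bad)    # differential: high- vs. low-quality
    return torch.clamp(d_good - d_bad + margin, min=0).mean()

# Toy usage with random 128-d "embeddings" standing in for a real model's output.
emb = lambda: torch.randn(32, 128)
print(triplet_loss(emb(), emb(), emb()))
```

PyTorch ships this as `nn.TripletMarginLoss`; it is written out here only to make the correspondence with Shaji’s description explicit.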
So what constitutes “high quality”? The system had to be trained—it had to be taught good taste—by expert human curators, who fed it more than 100,000 images they regarded as good. “So it has a notion of what our curators would see as an amazing work of art and what is not so great,” Shaji says.
The AI learned from the data set features that the selected images had in common. For instance, an abundance of positive and negative correlations in visual content across various scales contributes to a photo’s visual appeal.
In other words, similarities and contrasts of content—light/dark, hard/soft—placed near to and far from one another in the composition. In painting, Rembrandt’s and Caravaggio’s uses of chiaroscuro are perhaps the most easily seen examples of this. By breaking down an image pixel by pixel and comparing it with features that appear in the data set, Shaji says, the AI “can recompose patterns in a new setting—a photograph it has never seen before.” From there, EyeEm Vision can start to make recommendations.
Google AI’s Magenta Performance RNN (recurrent neural network) was trained on roughly 1,400 piano performances to learn how to play its own original music. “It’s crucial that they were performances—it was trained not on music scores but on the tiny decisions that pianists make,” Eck says.
Playing a Chopin sonata by hitting every key equally hard will yield a dramatically different result from allowing expressivity and dynamics to influence decisions on how loudly or softly a note should be played. “This model is just trying to learn to push the keys on the piano in the same way that these musicians did,” he says.
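Magenta has described Performance RNN’s input as a stream of discrete events (note-ons, note-offs, time shifts, and velocity changes) rather than a score, which is how timing and loudness become part of what the model learns. The sketch below encodes notes into such a stream; the bin sizes and the helper itself are a simplified illustration, not Magenta’s code.

```python
# Simplified event vocabulary in the style Magenta described for Performance
# RNN: a performance becomes a token stream an RNN can learn one event at a time.
NOTE_ON = 0       # events 0..127: press MIDI pitch 0..127
NOTE_OFF = 128    # events 128..255: release MIDI pitch 0..127
TIME_SHIFT = 256  # events 256..355: advance the clock by 10 ms .. 1 s
VELOCITY = 356    # events 356..387: set a 32-bin loudness for coming notes

def encode(notes):
    """notes: (start_sec, end_sec, pitch, velocity) tuples.
    Returns integer event ids capturing *how* each note was played."""
    raw = []
    for start, end, pitch, vel in notes:
        raw.append((start, 1, pitch, vel))  # 1 = note-on
        raw.append((end, 0, pitch, 0))      # 0 = note-off (sorts first on ties)
    raw.sort()
    events, clock, last_bin = [], 0.0, None
    for t, kind, pitch, vel in raw:
        while t - clock > 1e-9:  # emit time-shifts until the clock reaches t
            step = min(t - clock, 1.0)
            events.append(TIME_SHIFT + max(0, min(99, round(step * 100) - 1)))
            clock += step
        if kind == 1:
            bin_ = vel * 32 // 128
            if bin_ != last_bin:  # loudness changes are themselves events
                events.append(VELOCITY + bin_)
                last_bin = bin_
            events.append(NOTE_ON + pitch)
        else:
            events.append(NOTE_OFF + pitch)
    return events

# Two notes: middle C at velocity 80, then an E at velocity 96 half a second later.
print(encode([(0.0, 0.5, 60, 80), (0.5, 1.0, 64, 96)]))
```

A score-trained model would see only pitches and durations; here, hitting the same notes harder or later produces a different token stream, which is exactly the expressivity Eck is talking about.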
He puts the problem in human terms: “I could have spent every day for the last five years at the museum looking at paintings, but when I pick up a paintbrush and start to paint for the first time, it’s not at all clear to me how to relate what I’ve seen to creating a new instance of that with some paint.”
If a machine can judge images against an existing data set, it is not such a leap for it to compose a new image of its own. Transposing what Google’s piano improviser does to photography would mean EyeEm Vision starting to compose its own pictures. Another team at Google is working toward exactly that, with a project dubbed Creatism that attempts to compose, from images cribbed from Google Street View, landscape photographs that humans find aesthetically pleasing.
Granted, the programmers cheated a little by picking from photographs taken from foot trails in gorgeous places like the Alps and Big Sur, but the approach is the same: Show the machine a bunch of landscape photos humans judged to be of high quality, and inject certain statistics modeled in those images to boost the aesthetic power of the resulting images.
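In published descriptions, Creatism works roughly by proposing crops and enhancements from a panorama and letting learned judges score them. The loop below is a toy version of that propose-and-score idea, assuming Pillow; the `aesthetic_score` stub stands in for a real model trained on human judgments, and the saturation tweak is an arbitrary placeholder for learned filters.

```python
# Toy sketch of a crop-then-judge loop in the spirit of Creatism. Everything
# learned in the real system (which filters to apply, how to score a photo)
# is stubbed out here.
import random
from PIL import Image, ImageEnhance

def aesthetic_score(img: Image.Image) -> float:
    """Stand-in for a model trained on photos humans judged to be high quality."""
    return random.random()  # replace with a real learned predictor

def best_shot(panorama: Image.Image, n_candidates: int = 64) -> Image.Image:
    w, h = panorama.size
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        # Propose a random crop in a 3:2 aspect ratio.
        cw = random.randint(w // 4, w // 2)
        ch = int(cw * 2 / 3)
        if ch >= h:
            continue
        x = random.randint(0, w - cw)
        y = random.randint(0, h - ch)
        crop = panorama.crop((x, y, x + cw, y + ch))
        # Apply a candidate enhancement (saturation here; the real system
        # learns which adjustments to make and how strongly).
        candidate = ImageEnhance.Color(crop).enhance(random.uniform(0.8, 1.5))
        # Keep whichever candidate the judge scores highest.
        score = aesthetic_score(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

With a judge trained on professional landscape photos, the same loop that today ranks images becomes, in effect, a composer of new ones.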
And that is not so different from the problem that intrigued Wu in college back in China. Nor is it so mysterious why it was difficult to teach a computer to recognize a truck. Humans have around 100 billion neurons with more than 100 trillion connections among them; some, in the visual cortex, are specialized to recognize shapes and edges, and others are sensitive to details like distinguishing a face from a football or a truck from a building. Against billions of years of evolution, for machine learning to have figured out how to do this in fewer than 100 lines of code since deep learning cracked image recognition in 2012 is pretty impressive.
As engineers push AI further, more creativity will result. “It’s going to be a very interesting next few years,” Shaji says. The Infor Design and Tech Lab at FIT has already taken a deep dive into 3-D fashion design to explore virtual prototyping as a means of replacing the slow, costly, and wasteful process of developing fabric samples. “We’re going to see virtual prototyping leak into the entire manufacturing process,” Ferraro says.
But not everyone is so confident about the potential of artificial intelligence to replace human creativity. “The poetic spirit and the cultural contribution and timing of an artist cannot be ‘created’ by a machine,” says Bill Griffin, a partner at prominent Los Angeles art gallery Kayne Griffin Corcoran. “The source always originates within the human condition. AI can’t take into account the intentionality, context, conceptual journey, and, ultimately, the creative life force of the artist.”
But Eck sees the field as expanding, rather than replacing, human creativity. “We’re trying to help give rise to a new kind of art and a new kind of artist,” he says. Success seems a ways off yet—Magenta’s musical compositions are only around 30 seconds long, and the programmers admit they lack structure—but Eck thinks that, like a great photograph, he’ll know it when he sees it. “If an artist is able to work with our tools and builds a following that cares about this art, I think we’ll know if we’re making good or bad art,” he says. “I’m totally fine if it’s 10,000 people who love what this artist is doing and the critics are panning it.”