Neural network can create high-res images based on a text description

As far as artificial intelligence goes, 2016 has been the year of deep learning. Brain-inspired neural networks have received massive amounts of investment in time, resources and funding — and, boy, has it ever paid off!

In a new piece of research, investigators at Rutgers University, the University of North Carolina at Charlotte, Lehigh University, and the Chinese University of Hong Kong have used neural networks to generate high-quality images from nothing more detailed than basic text descriptions.

“Generating realistic images from text descriptions has many applications,” researcher Han Zhang told Digital Trends. “Previous approaches have difficulty in generating high resolution images, and their synthesized images in many cases lack details and vivid object parts. Our StackGAN for the first time generates 256 x 256 images with photo-realistic details.”

A video of the work was shared online by YouTuber Károly Zsolnai-Fehér as part of his excellent series of Two Minute Papers educational videos.

“For many years, we have trained neural networks to perform tasks like face, traffic sign, or handwriting recognition,” Zsolnai-Fehér told us. “Generally, with millions of training examples, we show the neural network how to do something, and expect them to learn these concepts, and do well on their own afterwards. This piece of work is completely different: here, after learning the neural networks are able to create something completely new — such as synthesizing new, photorealistic images from a piece of text we have written. This opens up a world of possibilities, and I am super-excited to see where researchers take this concept in the future.”

While there have certainly been examples of computational creativity before, ranging from MIT’s Nightmare Machine to projects that can generate predictive video simply by looking at a still image, this is nonetheless an intriguing piece of work. It’s also fascinating because StackGAN’s two-stage method, in which a first stage produces a rough, low-resolution image and a second stage refines it into the detailed final result, looks, to our way of thinking, a whole lot like the way artists will sketch out a piece of work and then do a second pass to add detail.

We may still be some way from replacing human illustrators with robots, but this is an exciting leap forward.

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…