How deep learning is transforming design: NLP and CV applications

Learn how Natural Language Processing and Computer Vision are being applied today in the design field

Javier Fuentes
Towards Data Science


If you have ever tried creating a user interface, you probably realized quickly that design is hard. Choosing the right colors, using fonts that match, keeping your layout balanced… and all of that while keeping your users’ needs in mind! Can we somehow reduce this complexity so that everyone can design, even without knowing spacing rules or color contrast theory? Wouldn’t it be nice if software could help you with all of this?

This problem is not new; the Human-Computer Interaction (HCI) community has been working on it for years. Deep learning has only recently started to be applied (check this or this paper, for example), and, as has happened in other fields, it has quickly become a core enabler for making these sorts of technologies work in real products. It turns out that, for some of these problems, collecting data is more feasible than coming up with a full mathematical formulation of why a certain design works. Not without challenges, though. As a complex, high-dimensional problem with many valid solutions, it is often difficult to define your model inputs, your outputs, or even what to optimize for!

While current deep learning approaches are not ready to take on the same level of responsibility as a designer, they have started to remove friction and partially automate steps of the design process. This empowers non-designers to prototype their own ideas without the direct input of a designer.

No matter if you are a designer or not, these are some exciting applications of deep learning in the space:

Computer vision

Since design is a visual domain, there are plenty of applications for computer vision in the field.

Sketch to design conversion

Source: uizard.io

Using complex design software can be a daunting challenge for non-designers, but a pen and paper? Such a low barrier to entry means that everyone can get started really quickly.

However, this is no easy task. Solving it at the computer vision level involves not only understanding shapes, but also understanding intent. Recognizing lines is an easy problem; knowing what a given line means can be really challenging.

And even if you manage to understand everything from the visual perspective, you still need to go through the poorly defined layout modeling task of “making everything look good”.
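
To make the recognition step concrete, below is a minimal PyTorch sketch of one piece of such a pipeline: classifying an already-localized hand-drawn element into a UI component type. The label set, architecture, and sizes are illustrative assumptions, not how Uizard's (or any production) system actually works.

```python
import torch
import torch.nn as nn

# Assumed label set, for illustration only.
UI_CLASSES = ["button", "text", "image", "checkbox", "slider"]

class SketchElementClassifier(nn.Module):
    """Classifies a cropped grayscale sketch patch into a UI component type."""
    def __init__(self, num_classes=len(UI_CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):                      # x: (batch, 1, 64, 64) crops
        return self.head(self.features(x).flatten(1))

model = SketchElementClassifier()
crop = torch.rand(1, 1, 64, 64)                # one detected sketch element
label = UI_CLASSES[model(crop).argmax(dim=1).item()]
```

The intent problem described above lives exactly in this classification step: the same rectangle could be a button, an image placeholder, or an input field, and only context disambiguates it.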

Sketch-based query of design resources

Source: Google AutoDraw

What if you already have a set of visual assets you want to work with? Exploring these resources through classical language-based search alone can be challenging. You have an image of what you want in mind, but what is that particular icon/drawing called? Just quickly sketch it!

This is a complex vision problem, especially when we assume the set of assets is not fixed, as the sketch representation can vary greatly from the actual assets.
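
A common way to frame retrieval like this is a shared embedding space: one encoder maps both sketches and assets to vectors, and lookup becomes nearest-neighbor search. In the hedged sketch below, `encode` is a placeholder for whatever trained image encoder you use; everything else is standard cosine-similarity retrieval.

```python
import torch
import torch.nn.functional as F

def encode(images):                            # placeholder encoder:
    return images.flatten(1)[:, :128]          # a real system uses a trained network here

asset_images = torch.rand(1000, 1, 64, 64)     # your icon/drawing library
asset_vecs = F.normalize(encode(asset_images), dim=1)

query_sketch = torch.rand(1, 1, 64, 64)        # the user's quick doodle
query_vec = F.normalize(encode(query_sketch), dim=1)

scores = query_vec @ asset_vecs.T              # cosine similarity to every asset
top5 = scores.topk(5).indices                  # indices of the best matches
```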

Vision-based theme creation

Source: uizard.io

Sure, you may be able to fetch some basic components and shapes, but what about colors, typographies, complex component designs, etc.? These are really time-consuming deliverables that designers spend weeks developing for each new project. The idea here is that you just select an image of your existing project, the URL of your website, or even any random image from the internet, and in a few seconds you save weeks of a designer’s work! A whole “design system” is created from your visual inspiration.

Sounds useful, but how do you go about modeling this? The first step is to reach a rich understanding of the pixels in the image, from which we can extract a set of colors, typographies, components, etc. But of course, not all components may be present, so you also need to generate the components you don’t see. Read more.
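
As a taste of the “extract a set of colors” step, here is a classic baseline (an assumption for illustration, not Uizard's actual method): cluster the image’s pixels with k-means and read the cluster centers off as a palette. The file name is a placeholder.

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# "inspiration.png" is a placeholder path for your source image.
img = Image.open("inspiration.png").convert("RGB").resize((128, 128))
pixels = np.asarray(img).reshape(-1, 3).astype(float)

kmeans = KMeans(n_clusters=5, n_init=10).fit(pixels)
palette = kmeans.cluster_centers_.astype(int)   # five dominant RGB colors

print([f"#{r:02x}{g:02x}{b:02x}" for r, g, b in palette])
```

Typography and component extraction demand a much deeper understanding of the pixels; color is the one piece where a few lines get you surprisingly far.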

Natural language processing

While applying natural language processing to a visual domain may sound counterintuitive at first, design is a multimodal domain, which presents plenty of opportunities for NLP techniques. When you come to think of it, components in a design can be modeled as words, screens as sentences, and your whole app as a long text. This means we can leverage much of the work done in the NLP field in recent years and use it to learn from layouts. This approach has been proven effective in LayoutLM, as well as in numerous other papers applied to design, such as this one or this one.

The design components, whether textual or otherwise, are the “words” of your “sentence”. This can then be fed into a transformer to solve numerous tasks. Source
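
To make the analogy concrete, here is a minimal PyTorch sketch that treats component types as tokens and runs a screen through a standard transformer encoder. The vocabulary and dimensions are made up; models like LayoutLM additionally embed each component’s bounding box and add it to the token embedding.

```python
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "navbar": 1, "image": 2, "text": 3, "button": 4}
screen = torch.tensor([[1, 2, 3, 3, 4]])       # one screen as a "sentence"

embed = nn.Embedding(len(vocab), 64)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)

tokens = embed(screen)                          # (1, 5, 64) component embeddings
context = encoder(tokens)                       # contextualized, BERT-style
```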

Description to design generation

But why even bother solving the individual problems that make up design, when we could just describe what we want? Well, while there have been prototypes that do this, don’t expect a perfect solution just yet. Humans are still needed to iterate over the infinite solutions to a design problem, tweak things to their liking, and so on.

However, it is a really interesting multimodal problem. Mixing modalities is always a challenge, and here we are going from plain English to a 2D layout, where you need to predict not only the tokens, but also where they are located, their content, their style, etc.
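
Under heavy assumptions (the module sizes, the vocabulary, and using raw text embeddings as decoder memory are all simplifications), such a model could be wired up roughly like this: condition a transformer decoder on the description and, at each step, predict both the next component’s type and its normalized bounding box.

```python
import torch
import torch.nn as nn

class TextToLayout(nn.Module):
    def __init__(self, text_vocab=10000, n_components=20, d=128):
        super().__init__()
        self.text_embed = nn.Embedding(text_vocab, d)   # simplification: no text encoder
        self.comp_embed = nn.Embedding(n_components, d)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=d, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.type_head = nn.Linear(d, n_components)     # which component comes next
        self.box_head = nn.Linear(d, 4)                 # and where: (x, y, w, h)

    def forward(self, text_ids, layout_so_far):
        memory = self.text_embed(text_ids)              # the encoded description
        h = self.decoder(self.comp_embed(layout_so_far), memory)
        return self.type_head(h), self.box_head(h).sigmoid()

model = TextToLayout()
text = torch.randint(0, 10000, (1, 12))                 # "a login screen with ..."
partial = torch.tensor([[1, 2]])                        # components placed so far
type_logits, boxes = model(text, partial)               # next types + positions
```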

Design autocompletion

Given a partial design layout (each box is a component), a model is tasked with finishing it. Source

Autocompletion is saving the world millions of hours by making the experience of writing text on tiny mobile phone screens more efficient. It is time to leverage this while designing as well.

I mean, if designs can be modeled almost as text, why can’t we have autocompletion capabilities while designing? Indeed, similar modeling approaches can be applied to both problems: instead of predicting the next words, we predict the next design components.
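
The parallel is almost literal, as this last sketch shows: a causal transformer over component tokens, where the final hidden state scores the next component instead of the next word. The vocabulary and sizes are, again, assumptions for illustration.

```python
import torch
import torch.nn as nn

vocab = {"navbar": 0, "image": 1, "text": 2, "button": 3}
d = 64

embed = nn.Embedding(len(vocab), d)
lm = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(d, len(vocab))

partial_design = torch.tensor([[0, 1, 2]])      # navbar, image, text, ...
n = partial_design.size(1)
causal_mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)

h = lm(embed(partial_design), mask=causal_mask)
next_component = head(h[:, -1]).argmax(dim=-1)  # e.g. suggests "button"
```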

Final words

With abundant data and a low cost of mistakes, the design field is the perfect playground for deep learning research and development. We covered just a handful of projects here, but there are numerous other challenging tasks where NLP and CV modeling techniques are the key to success, from code generation to screen link prediction or design retrieval, just to name a few.

If you are interested in building the future of design through deep learning, remember to follow me on Twitter. I will keep you posted whenever we have open positions at Uizard.
