Neural networks and artificial intelligence can help solve complex problems, but that does not yet mean they can replace a person. Let's look at why they cannot, and where they can help, using several neural networks as examples.

ruDALL-E
Midjourney
Inpainting Demo
Balaboba
Point

In the world of technology and robotics, one hears that in recent decades robots have been performing tasks that, before the advent of these machines, were performed only by people. At the same time, as practice shows, the use of robots and artificial intelligence does not in every case achieve the desired quality of performance compared to people. Against the backdrop of the information noise around ChatGPT and other text-generation tools, I decided to study several neural networks that create content in the form of text and images.

ruDALL-E

Russian DALL-E (ruDALL-E) is a neural network that creates pictures from a text description. But in fact it is a neural network that can recognize images and also perform a search function: enter text and get images. It also works with complex expressions, such as “why did he hit him”: the neural network will find the word “hit” in the text and show it in the picture. It works only with raster image formats (bmp, gif, jpeg, png), and the file size must not exceed 1 MB. As my first request, I used the word “verification”. The neural network should generate what is written in the text. At least that is what its documentation says; we will see how it actually behaves. First I still had to figure out what exactly to enter as a text query.

And, frankly, the result was not encouraging.

It was even less encouraging when the additional suggested images came out as some kind of haze. In general, if you do not enjoy turning every picture this way and that to make sense of it, this neural network is not for you.

However, by trying different approaches, I managed to get a result that I could more or less match with the word “test”. It turned out that I can write a query that fully describes the action in the picture. After trying several variations, I settled on “Testing the Neural Network”.

I thought I would stop there, but the next request, “The boy is doing homework”, made me realize that this was only the beginning.

And the following request pulled me completely into the research: “3D printer in space”.

This is not a magazine-cover picture, but the fact that the network drew a sketch of a 3D printer and placed it in space delighted me at the time. A good start, I thought, but then my attention was diverted by a more powerful neural network.

Midjourney

While testing ruDALL-E, I was messaging with my wife, who suggested that I test another neural network called Midjourney. It was very interesting, but very difficult.

Compared to ruDALL-E, Midjourney at first glance uses different algorithms and has more advanced functionality. ruDALL-E does not return results immediately, so I decided to test Midjourney in the meantime and find out what it can and cannot do.

The process of setting up the workspace simply exhausted me. The neural network takes input and returns output through Discord.

I knew that the neural network only works in English, but the process of entering a request was not obvious. You can’t just type in text as you would in a search engine.

And when I figured out the commands for entering information, I still stumbled over all sorts of permissions and confirmations.

And finally, the process began.

The neural network created the first result.

The neural network offers to choose one of the results obtained in order to continue generating. The end result is shown in the following image.

Now compare this with the result obtained from ruDALL-E. Significantly different, right? To test this neural network, I tried several queries related to 3D printing. Below are the examples the neural network produced for different queries.

Waiting for 3d printing

Farm 3D printing

Super 3D printer

Abstraction generation is certainly good, but let's try a more difficult task.

Let's try to create a logo for a company that develops software for 3D printing. The first request turned out to be too difficult for the neural network to interpret and gave the following result.

Therefore, I decided to simplify the request in order to get something more interesting and non-standard. Any request must be phrased so that the network understands it. Simply writing whatever comes to mind will not work.

I decided to develop the fourth option.

And after several rounds of refinement, the neural network gave me the following final version.

It is difficult for me to evaluate the result, so I leave it to your judgment in the comments. I can say for sure that I will not use this logo for myself. I noticed that wherever the term 3D printing is used, different variations of skulls always appear. And that is no accident. At a certain point in the development of 3D printing, many people printed skulls, so the picture was to be expected. Let's try to generate a logo for the query "Online 3D printing service".

I did not delve into the generation algorithms, although the result shows that from the word “logo” the neural network pulls up some general idea of a logo. All these monograms and the color palette reflect a generic notion of what a logo looks like. I don't like this style, so I tried to combine our logo with a picture from the Internet on the subject of 3D printing, in a color that I like.

As a result, we got a concept that is quite interesting in my opinion.

And yet it is not a ready-made solution. It is just a concept which, if necessary, will need to be finalized by a designer, that is, by a real person.

In the end, I realized that on relatively narrow topics the neural network copes only with the simplest queries. Of course, the number of these requests is growing, and the network is learning from them, accumulating knowledge and experience through interaction with real people. For example, there is an approach to training neural networks called "saturation", and there is a method called "compression". With "saturation" the neural network is trained on a large set of training examples, and with "compression" on only one. So far, this is like communicating with a child who can be taught both good and bad. But you probably won't be able to learn anything from him. On the one hand, there is no need to invent anything, and on the other hand, even if you come up with something new, it may well have been invented long ago.

Inpainting Demo

Inpainting Demo is a neural network that lets you edit images and photos in order to remove unwanted objects. A preview demonstrating the functionality is shown on its main page.

The first step is to select an image to edit.

I chose photos from a test fitting of our art object on a wall, where the object was held up by a hand. I set myself the task of removing the hand from the photo.

I followed all the steps as instructed.

It didn't work. I tried several times, but never got the desired result. Pixelmator coped with this task right away.

In the end, I did not figure out how to work with it. Maybe you can.

Balaboba

Balaboba is a neural network that generates a continuation of a text from short theses and a brief description.

I thought, well, at least there should be no problems with text. Now it will offer me a bunch of variations of the text, based on short sentences and theses. But that was not the case. Most likely, my expectations after generating images were too high.

At the same time, Balaboba did help me write some of the points that fill this text. When writing this text, I used Balaboba's help, and I can confidently identify where exactly the neural network helped me: it suggested options for continuing the text, which reminded me of points and aspects that I most likely would have forgotten to describe if I had not worked with it. However, it lacks text structuring, error correction, and other functionality that is usually used for editing. And if I forget something, I can remember it later anyway. So the main help is in speeding things up. And speeding up the work is also very good!

For myself, I have worked out the following algorithm for working with Balaboba. I write the text briefly and concisely. The first step is to feed in each sentence in turn as input, immediately adding a word or expression on the topic to the sentence. I wait for the results, look through them, choose the suitable options and add them. As soon as a paragraph is ready, I run it through in its entirety and again look, choose and add the suitable options. As soon as the whole text is ready, I run it through in its entirety. Then comes structuring.
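The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `suggest` is a hypothetical stand-in for the text generator (Balaboba's real interface is not shown here), `choose` stands for the author picking which suggestions to keep, and `hint` is the extra topic word added to each sentence.

```python
from typing import Callable, List

# Hypothetical stand-ins, not a real Balaboba API:
# suggest(prompt) returns candidate continuations,
# choose(options) returns the ones the author decides to keep.
Suggest = Callable[[str], List[str]]
Choose = Callable[[List[str]], List[str]]


def expand(text: str, suggest: Suggest, choose: Choose, hint: str = "") -> str:
    """Feed one piece of text (plus an optional topic hint) to the generator
    and append the continuations the author kept."""
    prompt = f"{text} {hint}".strip()
    kept = choose(suggest(prompt))
    return " ".join([text, *kept])


def draft_article(outline: List[List[str]], hint: str,
                  suggest: Suggest, choose: Choose) -> str:
    """outline is a list of paragraphs, each a list of short thesis sentences."""
    paragraphs = []
    for sentences in outline:
        # Step 1: run each sentence separately, with the topic hint added.
        expanded = [expand(s, suggest, choose, hint) for s in sentences]
        # Step 2: run the finished paragraph as a whole.
        paragraphs.append(expand(" ".join(expanded), suggest, choose))
    # Step 3: run the finished text as a whole; structuring stays manual.
    return expand("\n\n".join(paragraphs), suggest, choose)
```

The selection step is deliberately a separate function: in practice it is a person reading the options, and that is the part the neural network does not replace.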

For the request

Read more about the neural network on the website studia3d.com on my blog.

the network produced

On the site you can download 3d models of people, animals, vegetables, fruits, cars, etc. All this you can download to your computer and use in your model.

How does it even know that? But it's true! In any case, well done =)

Let's compare this with my usual method of writing an article. I write a short text, point by point, as theses. Then I begin to unfold each sentence and word in order to describe and convey as fully as possible what I had in mind. After that, I work over the structure of everything that has come out. As you can see, there is practically no difference. The neural network helps a little “not to forget anything”. This is its main advantage.

The experience of writing text with a neural network really did feel new to me, although the algorithm is approximately the same. The speed of writing is higher and the work is more productive, since “unfolding” the descriptive part of the text goes faster. However, the structure of the text and the job of conveying the essence of the article still fall to me, sometimes even the spelling, because Balaboba lacks the functions that Glavred has. Glavred, if anyone does not know, helps to clear a text of verbal garbage and checks it for compliance with the informational style.

The text suggested by the neural network:

In the world of technology and robotics, more and more often, you can hear that robots can now perform tasks that were previously only possible for humans. At the same time, as practice shows, the use of robots and artificial intelligence does not always allow achieving a higher quality of performance than people did.

Glavred rating 5 out of 10.

Text corrected by me:

In the world of technology and robotics, one hears that in recent decades robots have been performing tasks that, before the advent of these machines, were performed only by people. At the same time, as practice shows, the use of robots and artificial intelligence does not in every case achieve the desired quality of performance compared to people.

Glavred rating 10 out of 10.

But anyway, I liked it. It's like with a child: you probably won't be able to learn anything from him, although in general the child can help with something. At least it is not boring. It's great that there is such a source of positivity.

Point

It would be interesting to find neural networks for generating 3D models. An image is good, of course, but a picture can only be printed on a regular printer. For a 3D printer, the model must be three-dimensional, consist of polygons rather than pixels, and meet certain requirements.

Through a search engine query, I found Point.

Let's see how it works. I enter the query "small 3d printer".

And I got a box. Interesting. Let's enter a simpler query, "a 3D printer".

I don't understand anything. Apparently it's too early. And how nice it would be to give our customers the opportunity to generate 3D models simply from a textual description.
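To make the difference between pixels and polygons concrete, here is a minimal sketch in Python. It describes a tetrahedron as a list of vertices plus triangles that index into them, and writes it out as ASCII STL, a common exchange format for 3D printing. The names `VERTICES`, `TRIANGLES`, `facet_normal` and `to_ascii_stl` are mine and chosen for illustration only; a real printable model has thousands of triangles and additional requirements that this sketch does not check.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]

# A mesh is points in space plus triangles that index into them.
VERTICES: List[Vec] = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                       (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# Vertex order is counter-clockwise seen from outside, so normals point outward.
TRIANGLES: List[Tuple[int, int, int]] = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]


def facet_normal(p0: Vec, p1: Vec, p2: Vec) -> Vec:
    """Unit normal of a triangle: normalized cross product of two edges."""
    a = (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])
    b = (p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2])
    n = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # degenerate triangle, no meaningful normal
    return (n[0] / length, n[1] / length, n[2] / length)


def to_ascii_stl(name: str, vertices: List[Vec],
                 triangles: List[Tuple[int, int, int]]) -> str:
    """Serialize a triangle mesh as ASCII STL text."""
    lines = [f"solid {name}"]
    for i, j, k in triangles:
        p0, p1, p2 = vertices[i], vertices[j], vertices[k]
        nx, ny, nz = facet_normal(p0, p1, p2)
        lines.append(f"  facet normal {nx:.6f} {ny:.6f} {nz:.6f}")
        lines.append("    outer loop")
        for x, y, z in (p0, p1, p2):
            lines.append(f"      vertex {x:.6f} {y:.6f} {z:.6f}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"
```

Calling `to_ascii_stl("tetra", VERTICES, TRIANGLES)` returns text that can be saved with an `.stl` extension and opened in a slicer. An image generator produces nothing of this kind, only a grid of colored pixels.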

This year, neural networks can be used as an addition to existing services. A good example is GTranslate, a website machine translation service.

GTranslate is a website translator that can automatically translate any website into any language and make it available to the whole world!

A feature of this service is that it selects translations according to the subject of the site.

So far, the existing neural networks, for all the beauty of their images and bright colors, cannot replace artists, photographers and similar professionals. The functionality is weak, the quality and the logic are rather weak, and the system of interaction is very complex. The response time is good, but it is not clear how long it will take before something acceptable is generated. So far, this is interesting only for throwaway pictures that are needed to fill some empty space with something colorful. Although it is better to use abstraction for that.