Microsoft isn't wasting any time in the current artificial intelligence race with already established tech giants and smaller companies that are on a meteoric rise. There is no doubt that the company is doing everything within its power to secure its spot as one of the leaders in this new paradigm.
While the Bing chatbot (which integrates OpenAI's GPT-4 technology) has still not been made available to the public, Microsoft released a generative AI text-to-image model today for anyone to use.
The text-to-image AI trend arguably started with the release of the original DALL-E deep learning model in January 2021. Today, the leader in this space is arguably Midjourney, a project from an independent research lab in San Francisco.
Midjourney has played a huge role in popularizing text-to-image models on a global level. While everyone has recently been talking about the impressive advancements in Midjourney v5, Microsoft has decided to preview its own generative AI model for images.
What Is the Bing Image Creator?
Due to its close relationship with OpenAI, Microsoft plans to use the technology that its partner company developed. That's why the Bing Image Creator is actually just an advanced version of the DALL-E text-to-image model.
The Bing Image Creator comes with standard OpenAI safeguards as well as additional protective measures that aim to minimize the chance of unsafe or harmful images being generated with the model. Microsoft notes that it is fully committed to deploying its AI systems responsibly.
Will Image Creator Be Integrated into the Bing Chatbot?
According to an article published on the official Microsoft blog, the company plans to fully integrate the Bing Image Creator into its chatbot. In other words, you'll be able to get both text and images as output when you give instructions to the chatbot.
I have to say that this is what I'm most excited about. While I view ChatGPT (or GPT-4) as the most amazing thing developed in the world of AI and available for public use, I wish that it could generate images as output. It looks like the Bing chatbot will solve this.
For now, you can try out the Bing Image Creator tool for free. All you need to start generating images is a Microsoft account. You can navigate to the official Bing Image Creator website and start creating images now.
On the website, you may notice interesting information about the AI tool. For instance, it is noted that the model only supports prompts written in English. However, it will support more languages in the near future.
How to Use Bing Image Creator
One of the first things that you'll see when you begin using the Bing Image Creator is that you have a limited number of 'boosts'. Each boost will increase the speed at which an image is generated.
In case you use all of your boosts, you will have the option to perform small tasks to get Microsoft Rewards that you can then exchange for additional boosts. This is a nice little gamified system that the Microsoft team created.
Apart from the "Create" button, there is also a "Surprise Me" option in the top right corner that you can press if you want a random prompt to be generated. I pressed the button and it gave me a prompt that read "neon living room from the future, Scandinavian design." Here is the result.
I'm going to be honest - it didn't do a good job with this one. This doesn't seem like a living room from the future and the quality isn't great now that I've gotten used to Midjourney v5. But I decided to come up with a few of my own prompts and see what type of results I'd get.
The first prompt that came to mind was "sushi rolls on a wooden board, beautiful lighting, photorealistic." Before we get to the result, I just want to take a second to say how much I appreciate that the tool gives you useful tips while you're waiting for the image to be generated.
It's like playing a video game and waiting for a scene to load, and you get a useful tip on how to improve your performance while playing. I can see how this can effectively help people learn to write better prompts as they're using the tool to generate images.
Now, let's see the result for the sushi rolls.
This is much better. The first image is the best in my opinion, but it could be even further improved with a bit of work. However, I wanted to test the model's capabilities when it comes to creating an abstract piece of art.
I decided to share only one of the four images it generated because it was clearly the best. I wrote a prompt asking the model to create an "abstract image of a giraffe sailing on a cloud over a small majestic settlement," and you can see the result above. I like it!
I couldn't help but continue coming up with ridiculous prompt ideas as soon as I saw what it looks like when a giraffe is sailing on a cloud. The next prompt I gave the Bing Image Creator was to make an image of an "international space station made out of tiny balloons." Here is the result.
I like all four of these concepts, but the prompt is very simple. I now wanted to see what it would do if I added a bunch of attributes to a prompt.
My next prompt involved asking the model to create an image of an "anatomical diagram of the human heart, full detail, hand drawn, technical, old manuscript, schematic, da vinci drawing." Here's what the model generated.
These are amazing results. It just goes to show how important it is to write a good and detailed prompt if you want to get great results. If you've never experimented with text-to-image models before, I recommend that you take the time to learn how to express your creativity with words. Even adding a few adjectives to your prompt can greatly enhance the generated output.
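If you want to experiment with this systematically, here is a minimal Python sketch of how a simple prompt can be turned into a detailed one by appending attributes. The helper name build_prompt is my own and purely hypothetical; it is not part of the Bing Image Creator or any Microsoft or OpenAI API. It only assembles the text that you would then paste into the prompt box.

```python
def build_prompt(subject, attributes=()):
    """Join a subject and optional style attributes into one comma-separated prompt.

    Hypothetical helper for illustration only; it does not call any image model.
    """
    # Strip stray whitespace and drop empty attributes so the prompt stays clean.
    parts = [subject.strip()] + [a.strip() for a in attributes if a.strip()]
    return ", ".join(parts)


# A bare subject, like the space station prompt from earlier.
simple = build_prompt("international space station made out of tiny balloons")

# The same idea with a list of attributes, like the heart diagram prompt.
detailed = build_prompt(
    "anatomical diagram of the human heart",
    ["full detail", "hand drawn", "technical", "old manuscript", "schematic", "da vinci drawing"],
)

print(simple)
print(detailed)
```

Keeping the subject and the attributes separate makes it easy to swap one attribute at a time and compare how each change affects the generated image.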
Final Thoughts
The Bing Image Creator is an awesome AI tool that you can use to generate anything from vector illustrations to photorealistic scenes. It is based on an enhanced version of the DALL-E model.
In my opinion, Midjourney v5 is far better than the Bing Image Creator, for now. There's no telling how this might change in the future as they further develop the technology. One thing I will say is that it's very exciting to be living in these times and seeing such rapid improvements in the AI space.
One of the main things I like about this text-to-image model is the inclusion of a "Surprise Me" button, which generates a random prompt every time you press it. It might not mean much to an experienced prompter, but I believe it plays a key role in helping people who've never used the technology before learn how to write good prompts.