Graphic designers and those who depend on their skills should pay attention: a new tool called COLE has emerged that could shake up the profession. Named after Henry Cole, who introduced the first commercial Christmas card in 1843, COLE lets users input a graphic design idea, such as “a poster for an upcoming Winter Holiday concert with people playing instruments in warm clothes among falling snow,” and uses AI to generate both the image and accompanying text.
COLE is not a single model but a combination of several, including Meta’s Llama2-13B, DeepFloyd IF, LLaVA1.5-13B, and GPT-4V, along with the open-source graphics renderer Skia. Developed by a team of 12 researchers from Microsoft Research Asia and Peking University, COLE tackles the complexity of graphic design by merging SVG elements and additional embellishments into a single image layer, which the AI then processes to extract the background and generate descriptive text.
The team trained their background modeler AI on a massive dataset of 100,000 high-quality raw graphic design images sourced from the internet. The result is more of a framework than a final product for now. However, initial results are impressive: COLE can generate sophisticated, crisp, and organized graphic designs from simple text prompts, similar to other text-to-image generators like DALL-E 3 or Midjourney. What sets COLE apart is its ability to embed text accurately into images, a task that has traditionally been challenging for AI art generators.
Furthermore, COLE creates images with distinct, editable blocks for text and objects, allowing users to tweak these elements without having to switch to other programs like Adobe Photoshop or InDesign. Users can easily change the displayed text or visual elements directly within the COLE framework.
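To make the idea of separately editable blocks concrete, here is a minimal Python sketch of a layered design in which the wording of a text block can be changed without touching the background or objects. This is an illustration only, not COLE's actual data format or API: the names `TextBlock`, `Design`, and `with_text`, the fields, and the file names are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical editable text element drawn on top of the image layers."""
    text: str
    x: int
    y: int
    font_size: int


@dataclass(frozen=True)
class Design:
    """Hypothetical layered design: background, objects, and text kept apart."""
    background: str                          # assumed: path to a background image
    objects: tuple[str, ...] = ()            # assumed: paths to object images
    text_blocks: tuple[TextBlock, ...] = ()

    def with_text(self, index: int, new_text: str) -> "Design":
        """Return a copy with one block's wording changed; layout is untouched."""
        blocks = list(self.text_blocks)
        blocks[index] = replace(blocks[index], text=new_text)
        return replace(self, text_blocks=tuple(blocks))


# Placeholder file names; no images are actually loaded in this sketch.
poster = Design(
    background="background.png",
    objects=("musicians.png",),
    text_blocks=(TextBlock("Winter Holiday Concert", x=40, y=60, font_size=48),),
)
edited = poster.with_text(0, "Winter Holiday Gala")
```

Because the text lives in its own block rather than in the pixels, changing the headline leaves the background and object layers exactly as they were, which is the property the article describes.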
The researchers behind COLE envision a system that requires minimal effort from users while providing high-quality and flexible editing capabilities. COLE’s results compete favorably with other state-of-the-art AI tools, especially in generating covers, headers, and posters, and offer superior editing capabilities for text and objects within the images.
Despite its promise, COLE is not without limitations. The current system does not allow users to change the arrangement or placement of typography blocks, and it supports only one color of typography per image. Nevertheless, the researchers plan to address these issues in future updates.
COLE could represent both a challenge and an asset for graphic designers. While it can produce professional-quality designs that might rival those created by trained designers, it’s also designed to assist users in refining the output with human expertise when necessary. This suggests that graphic design training will still be valuable.
In summary, COLE aims to democratize high-quality graphic design, a vision shared by companies like Adobe and Canva. It is not yet publicly available, but a demo is expected soon on the project's GitHub page.