Visual Harmony in Generative Design Systems

Table of Links
-
Introduction
-
Related Work
2.1 Semantic Typographic Logo Design
2.2 Generative Model for Computational Design
2.3 Graphic Design Authoring Tool
-
Formative Study
3.1 General Workflow and Challenges
3.2 Concerns in Generative Model Involvement
3.3 Design Space of Semantic Typography Work
-
Design Consideration
-
Typedance and 5.1 Ideation
5.2 Selection
5.3 Generation
5.4 Evaluation
5.5 Iteration
-
Interface Walkthrough and 6.1 Pre-generation stage
6.2 Generation stage
6.3 Post-generation stage
-
Evaluation and 7.1 Baseline Comparison
7.2 User Study
7.3 Results Analysis
7.4 Limitation
-
Discussion
8.1 Personalized Design: Intent-aware Collaboration with AI
8.2 Incorporating Design Knowledge into Creativity Support Tools
8.3 Mix-User Oriented Design Workflow
-
Conclusion and References
7.3 Results Analysis
7.3.1 Satisfaction of Generated Outcome. All participants found that the generated outcome effectively blends the information of both the selected typeface and the imagery (𝑀𝐸𝐴𝑁 = 4.78, 𝑆𝐷 = 0.43), and the majority of them (𝑀𝐸𝐴𝑁 = 4.17, 𝑆𝐷 = 0.62) agreed that the outcome achieves a visually harmonious effect. Additionally, over half of the participants (𝑀𝐸𝐴𝑁 = 4.06, 𝑆𝐷 = 0.73) acknowledged that the generated outcomes were diverse. Their feedback supports that TypeDance is capable of achieving a natural blend and providing diverse results, which aligns with the second design consideration (D2) defined in Sect. 4.
• Preservation. The majority of participants (11/18) expressed that the generated results were “beyond their expectation” and “innovative.” They found that TypeDance was capable of producing reasonable results that effectively combined both typeface and imagery. As mentioned by P3, “I initially didn’t see any relation between the swan and the letter ‘E,’ but the result showed that they could be combined in a way that is visually pleasing (P3, Fig. 7).”
• Harmony. Participants (16/18) agreed that the generated results exhibited aesthetic harmony. TypeDance successfully maintained the legibility of the typeface while enhancing the visual appeal by incorporating imagery that “aligned with the skeleton of the text (P1, P4, Fig. 7).”
• Diversity. Most participants (14/18) agreed that the generated results were diverse. Some participants (N=4) emphasized the importance of obtaining alternative designs in practice, commenting, “Though I have achieved a satisfactory result, I still want to regenerate to see more interesting results (P2, Fig. 8; E7, Fig. 9).”
In terms of preservation, general users exhibited a lower sensitivity than designers in recognizing the typeface and imagery. In contrast, designers could swiftly perceive the content and showed a tendency to export potential designs to advanced tools for further preservation enhancement. Despite differing levels of design expertise, both novice users and designers demonstrated similar scores in terms of the harmony and diversity of the generated results. Besides perceptual harmony, designers identified more artistic effects. As E1 commented, “I never thought that AI could understand and produce negative space (Fig. 7).” In that case, TypeDance integrated the dog into the typeface by filling the empty space in the letter “E”. During the creation process, all participants experimented with different combinations of design priors to achieve more diversified results. Interestingly, color was more frequently employed than shape, while semantics were consistently selected without specifying a text prompt.
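The summary statistics reported in this section (mean, SD, and counts of participants who agreed) can be reproduced from raw questionnaire ratings along the following lines. This is a minimal sketch, not the authors' analysis script: it assumes a 5-point Likert scale and a sample standard deviation, the function name `summarize_likert` is hypothetical, and the ratings below are placeholders rather than data from the study.

```python
from statistics import mean, stdev


def summarize_likert(ratings, scale_max=5):
    """Summarize Likert ratings: mean, sample SD, and agreement counts.

    Assumes ratings are integers in 1..scale_max (an assumption of this
    sketch; the scale is not restated in this section).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to compute a sample SD")
    if any(r < 1 or r > scale_max for r in ratings):
        raise ValueError("rating outside the assumed 1..scale_max range")
    return {
        "n": len(ratings),
        "mean": round(mean(ratings), 2),
        "sd": round(stdev(ratings), 2),  # sample SD (n - 1 denominator)
        # "agree" counts the top two scale points, "strongly_agree" the top one
        "agree": sum(r >= scale_max - 1 for r in ratings),
        "strongly_agree": sum(r == scale_max for r in ratings),
    }


# Placeholder ratings for illustration only; NOT data from the study.
example_ratings = [5, 4, 4, 5, 3, 4, 5, 4]
summary = summarize_likert(example_ratings)
print(summary)
```

Reporting the agreement counts next to the mean and SD makes it easier to tell a uniformly positive response apart from a split one with the same average.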
7.3.2 Usability of System. The user study indicated that most participants (𝑀𝐸𝐴𝑁 = 4.39, 𝑆𝐷 = 0.67) found TypeDance to maintain workflow integrity in the design process. Additionally, a majority (𝑀𝐸𝐴𝑁 = 4.33, 𝑆𝐷 = 0.77) expressed satisfaction with the flexibility of blending different granularities of typeface and imagery. In terms of controllability during the generation process and editability of the generated result, more than half of the participants (N=12) agreed that TypeDance provided satisfactory control and editability options. These features align with the design considerations of customization and post-editability (D3 and D4) defined in Sect. 4.
• Integrity. Most participants (N=10) strongly agreed that the complete workflow has been instantiated within TypeDance. A participant highlighted, “I don’t need to switch between different platforms to finish a design (E2, Fig. 8).”
• Flexibility. Half of the participants (N=9) strongly agreed with the flexibility provided by TypeDance to personalize their designs. Most participants (N=15) experimented with more than two types of typeface granularity in their designs. E2 pointed out, “I can easily select a single stroke that overlaps with other strokes in the typeface.”
• Controllability. More than half of the participants (N=12) agreed that TypeDance provides a high level of controllability. They found that the generated results were able to accurately “reflect the selected imagery” and “adhere to the chosen shape”.
• Editability. Nearly half of the participants (N=8) strongly agreed on the post-editability of TypeDance. Several participants (N=3) expressed their desire for a generative tool that not only generates designs once but also provides the ability to make adjustments and rectify the results.
All participants recognized the workflow integrity of TypeDance, though designers and general users valued it from different perspectives. Designers valued it for integrating essential functionalities that typically require switching between various platforms in the traditional workflow, while general users praised TypeDance for allowing them to sequentially follow the components in the interface to finish a design. Logos demand high customization because of their special property of revealing identity. The option to select imagery from personal photos adds a personalized touch, surpassing the resources available in a shared community. Both designers and novices emphasized the ability to control and draw inspiration from the real world with specified visual representation, color, and shape. This feature is especially crucial in some scenarios, e.g., “designing a city logo.”
The gap between flexibility and editability demonstrates the different expectations of designers and general users. General users demonstrated less interest in experimenting with different granularities of typeface, predominantly utilizing letter-level blending. Designers, on the other hand, highly praised this function, as it allows them to segment various parts of the typeface or even combine across different granularities. After obtaining the generated results, general users expressed satisfaction with changing colors or deleting elements (P4 & P6, Fig. 8). Designers found delight in the refinement function; as E5 noted, “it simulates the real design process where imagery is progressively simplified or details are added to the typeface (E5, Fig. 9).” They also expressed a desire for more advanced editing functions, such as Bézier curves, to fine-tune shapes.
7.3.3 Usefulness of Individual Functions. Participants also provided evaluations for each component within the TypeDance system. The selection and generation components received agreement from all participants, with high and comparable scores. The coherence between selecting the typeface and imagery and considering design factors, such as preparing design materials, had a direct impact on the scores of the selection and generation components. E3 stated, “It saves much time for me to select the desired typeface and adjust the bezier curves to create shapes resembling specific objects, like a dog.” They also appreciated the diverse range of results offered by the system, which they found crucial for the design process (N=3). In terms of ideation, more than half of the participants (N=11) agreed that the concepts provided during the pre-generation stage are “helpful to extend imagination” and that “the explanation makes sense for me”. During the post-generation stage, the scores for evaluation and refinement were comparable due to the cohesive nature of the operations. Some participants (N=4, including E1, E3, and E4) expressed particular satisfaction with these two components, as TypeDance achieves a “recognization the similarity between typeface and imagery” and “a more fine-grained adjustment that is independent of the generation”. These post-generation tools are “especially suitable for semantic typography design,” as E1 said.
7.4 Limitation
In response to the issues encountered by users while using TypeDance, we identified the main limitations of the current system from three dimensions.
7.4.1 Trial and Error in Selecting Typeface and Imagery. Although the current TypeDance allows creators to select and blend flexibly, facilitating quick generation, participant feedback suggests that trial and error with heterogeneous mapping between typeface and imagery could prolong the creation process. For instance, E9 achieved the final design after three attempts, experimenting with different typeface granularities, including “Bea”, single “e” and “a”, and “ea”. Participants noted that, apart from trying different parts of the typeface based on the selected imagery, it would also be challenging to find suitable imagery based on the chosen typeface.
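To give a sense of how quickly the number of possible selections grows, the following minimal sketch enumerates the contiguous letter-level selections of a word such as “Bea”. It is an illustration only, not TypeDance's selection mechanism: the helper name `contiguous_selections` is hypothetical, and stroke-level or non-contiguous selections, which the tool also supports, would enlarge the space further.

```python
def contiguous_selections(word):
    """Return every contiguous run of letters in `word`, shortest first.

    A word with n letters has n * (n + 1) / 2 such runs.
    """
    n = len(word)
    return [
        word[start:start + length]
        for length in range(1, n + 1)
        for start in range(n - length + 1)
    ]


candidates = contiguous_selections("Bea")
print(candidates)  # ['B', 'e', 'a', 'Be', 'ea', 'Bea']
```

Even a three-letter word yields six letter-level candidates, each of which may pair differently with a given piece of imagery, which is consistent with participants needing several attempts.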
7.4.2 Tradeoff between Imagery Diversity and Result Style Consistency. Fig. 11 displays this limitation, where using the same imagery for “Hong Kong” produces stylistically consistent results, while using different imagery leads to noticeable inconsistency. E1 remarked, “These elements look good individually, but when combined, they appear discordant.” This inconsistency arises from transferring imagery and style from image references to the generated result, resulting in various styles when using multiple references. While adding a textual prompt is a partial solution to alleviate this issue, it lacks precise control. Incorporating multiple imageries within a single typeface is a common and significant format for semantic typographic logos. Thus, achieving precise control over imagery diversity and result style consistency remains an important area for further investigation.
Authors:
(1) SHISHI XIAO, The Hong Kong University of Science and Technology (Guangzhou), China;
(2) LIANGWEI WANG, The Hong Kong University of Science and Technology (Guangzhou), China;
(3) XIAOJUAN MA, The Hong Kong University of Science and Technology, China;
(4) WEI ZENG, The Hong Kong University of Science and Technology (Guangzhou), China.
This paper is