Introduction

Typography, the art of arranging text, plays a crucial role in making written language not only legible and readable but also emotionally engaging and appropriate for its content. It is a fundamental element of graphic design and holds a central position in visual communication across mediums ranging from print to digital displays. Selecting and pairing fonts is a significant challenge in the creation of documents, posters, logos, advertisements, and other design forms, as it profoundly influences visual impact and communication effectiveness. Because design is inherently subjective, designers often rely on intuitive judgments to choose fonts that are stylistically suitable and harmonious for a specific medium. In recent years, advances in font generation and selection techniques have emerged, particularly through GAN-based models1 and multimodal approaches like CLIP2,3, which focus on creating fonts for various languages and aesthetic styles4. However, these models are typically geared towards font generation rather than providing insights into font pairing, which remains an underexplored area. Given the highly visual and intuitive nature of typography, designers seek practical guidelines that not only enhance visual aesthetics but also align with the emotional tone of the content. Therefore, understanding the fundamental visual characteristics of fonts is critical for practical design application.

Many design books and websites provide various rules-of-thumb for font pairing5,6, such as “Use Fonts from the Same Family,” “Mix Serifs and Sans-Serifs,” and “Create Contrast.” However, these rules are generally based on subjective assessments and knowledge and are difficult to formalize or generalize7. Moreover, they often apply only to a handful of popular fonts, neglecting others7. Previous studies have also reported that certain types of fonts are more effective in specific mediums8,9,10,11. However, these studies typically selected a single representative example of a font type and conducted a qualitative investigation based on user experience, and they reached differing conclusions. Recently, the use of artificial intelligence to support areas of art and design, such as font recognition, selection, and prediction, has garnered significant interest12,13,14,15. These approaches primarily assist in recommending and selecting typefaces based on web images, without a clear understanding of the morphological basis (the visual elements of fonts) behind the recommendations, and they lack specific guidelines for selecting and pairing fonts across various mediums. Therefore, this study conducted a quantitative analysis of font usage across various mediums and proposed a framework that offers objective and reliable criteria for font selection and pairing. This approach complements the traditional subjective and intuitive methods used by designers with data-driven insights to enhance the effectiveness and appropriateness of typographic choices in different design contexts. Specifically, we aimed to analyze the visual elements of typefaces and their pairings across different mediums, going beyond existing empirical rules. For this purpose, we collected font-use case data (N=22,897) from Fontsinuse.com, a website showing real examples of font use, which includes fonts (N=9,022) used across different mediums.

In this study, the visual characteristics of fonts are broken down into three key dimensions: Serif vs. Sans-Serif, Basic vs. Decorative letterforms, and Light vs. Bold. These dimensions were chosen as they represent the most fundamental distinctions in how fonts are perceived and used across different design contexts. These three visual characteristics are widely recognized in typography and are often used as primary categories to classify fonts16. Each of these visual characteristics directly impacts readability, tone, and overall design balance, which are critical factors for designers when making practical and informed font selections.

Using this data, we collected font morphological data to visualize each font in uppercase, lowercase, symbols, and numbers and applied Non-negative Matrix Factorization (NMF) to reduce the font design parameters into three dimensions. Consequently, our analysis revealed statistically significant differences in the font parameters used across each medium. This enabled us to predict the specific font parameters that most influenced the selection of fonts for individual use or pairing with other fonts. In addition, we employed network analysis to identify authentic font pairings within each medium. By quantifying these essential visual features, this study bridges the gap between subjective aesthetic decisions and objective design criteria, allowing designers to make more informed and practical font choices.

To achieve the objectives of this study, we performed the following tasks:

  • Data Collection: We collected a large-scale dataset of font-use cases and fonts from fontsinuse.com, a website showing real examples of typeface use, across 19 mediums.

  • Font Morphological Data Collection: Each font was visualized by capturing its visual elements in upper case, lower case, symbols, and numbers.

  • Font Design Parameter Identification: We applied Non-negative Matrix Factorization (NMF) to reduce the design elements of each font into three dimensions, thereby quantifying the visual characteristics of the typefaces (N=3,798).

  • Font Analysis per Medium: We used the axis values of the typefaces to statistically verify the differences between each medium, particularly to predict the differences between single fonts and fonts used in pairs using a Random Forest.

  • Network Analysis of Font Pairing: We identified influential font combinations such as singles, pairs, and triplets in each medium based on our morphological analysis.

Related works

Typeface and font pairing

Typography involves the skillful arrangement of text to make written language both clear and engaging to readers. It plays a key role in guiding the audience and enhancing the legibility and accessibility of text, thereby ensuring a positive reading experience15. Typography focuses on the application of various typefaces, where a typeface refers to a set of fonts that share common design features. The use of fonts is crucial to communication design. Despite the subjective nature of the design, rules regarding the characteristics of typefaces and font pairing are widely shared. Various studies have addressed this topic and some have found that certain typefaces are more effective in specific fields. For example, Altaboli8 suggested that Sans-Serif is more effective for screen displays, whereas Serif is better suited for printed displays. Similarly, Hojjati and Balakrishnan9 proposed that Sans-Serif is more suitable for long text on screen displays. However, other research indicates that there is little or no difference in readability between Serif and Sans-Serif on both screen and printed displays10. These contrasting opinions often arise because each study selected a single representative example of a typeface and conducted a qualitative analysis based on user experience.

Additionally, Wallace et al.11 noted that the fonts we commonly read on the web have also changed. For instance, based on Google Font Analytics, Roboto and Open Sans account for 51% of the total font views today, whereas Lato, Montserrat, and Oswald account for another 11%. Various visual designs such as posters, book covers, advertisements, and product packaging contain textual information printed in various fonts. Rules for using fonts vary within different mediums, but the use of fonts is based on intuitive and subjective evaluations and knowledge17, making them difficult to formalize or generalize. Additionally, font pairing rules often apply only to a few popular fonts, whereas others are overlooked7. Font pairing is a fine-grained problem in which subtle distinctions between fonts may be important, making the rules and conventions of font pairing provided by human experts difficult to formalize.

Recent advancements in typography and font generation have shifted from subjective evaluations to more data-driven and computational approaches. In particular, methods like CF-Font4 have demonstrated the utility of content and style disentanglement to generate new fonts from a few reference images. By employing a content fusion module, this method optimizes the fusion of style and content, making it a key innovation in few-shot font generation. Hassan et al. (2023) introduced a Conditional Font GAN (CFGAN) that enhances the quality of generated fonts by utilizing adaptive instance normalization (AdaIN) at every layer of the generator, allowing for style consistency and diversity even at high resolutions1. Similarly, CLIP-Font2 introduces a semantic self-supervised few-shot font generation technique that uses attention-based contrastive learning to capture fine-grained style representations, providing superior performance in font synthesis. FontCLIP3 takes this further by enabling cross-lingual font retrieval and editing, allowing typography-specific knowledge to integrate with large-scale vision-language models. This approach not only bridges the gap between Roman alphabet fonts and multilingual fonts but also generalizes font characteristics across various languages, showcasing a new frontier in typographic applications. While these methods excel in real-time font generation, there is still a need for further research in understanding the morphological characteristics of fonts in a lower-dimensional space. Our research differs by focusing on how fonts are used and paired in real-world graphic design contexts. Rather than generating fonts, we analyze their morphological characteristics to provide practical guidelines for font pairing based on large-scale data, offering a more structured approach to font selection across various mediums.

Therefore, we intend to solve these problems using a computational approach with large-scale data. By quantitatively analyzing the morphological characteristics of typefaces, we aim to develop more objective and comprehensive criteria and rules and, on this basis, provide a reliable foundation for the use of typefaces and font pairing across various mediums.

Non-negative matrix factorization for typeface morphological features

NMF decomposes a given matrix into two non-negative matrices, allowing the extraction of inherent features or patterns from the data. Notably, NMF can automatically extract spatial locality from image data and analyze the meaning of each axis in reduced dimensions. Research has also been conducted using NMF with font data. For example, Adaptifont applied the NMF technique to reduce the dimensionality of image data in 25 classic fonts18. This process effectively reflects the actual characteristics of glyphs in each dimension, allowing for the identification of key visual features of fonts, even in a lower-dimensional space. In addition, Lee and Jung19 and Lee et al.20 explored font classification methods using NMF, achieving high accuracy rates, such as 98%, in classifications involving 48 different fonts based on character-level analyses. Thus, NMF-based analytical methods provide a method for accurately extracting and understanding the morphological characteristics of typefaces in complex font design spaces. In particular, the various axes in NMF and their meanings are instrumental in quantitatively analyzing the characteristics of the typefaces used in pairings within the design field.
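
As a minimal sketch of the factorization described above, the following pure-Python example applies the well-known multiplicative update rules of Lee and Seung to a small non-negative matrix. The toy matrix, the function names, the rank of 3, and the iteration count are illustrative assumptions, not the data or settings used in the studies cited here; in a font setting, each row would hold the flattened glyph images of one font.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def reconstruction_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5


def nmf(v, rank=3, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: 4 "fonts" (rows) x 6 "pixel intensities" (columns).
V = [
    [1.0, 0.9, 0.0, 0.1, 0.5, 0.4],
    [0.9, 1.0, 0.1, 0.0, 0.4, 0.5],
    [0.0, 0.1, 1.0, 0.9, 0.5, 0.6],
    [0.1, 0.0, 0.9, 1.0, 0.6, 0.5],
]
W, H = nmf(V, rank=3)  # each row of W is one font's 3-D coordinate
```

Because every entry of W and H stays non-negative, each font is described as an additive mixture of the three components in H, which is what makes the axes open to interpretation.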

In addition to NMF, other dimensionality reduction techniques such as PCA (Principal Component Analysis)21, UMAP (Uniform Manifold Approximation and Projection)22, and t-SNE (t-Distributed Stochastic Neighbor Embedding)23 are commonly used to reduce data complexity. PCA is effective for finding orthogonal components that capture the most variance in data, while UMAP excels in preserving local and global structure in low-dimensional embeddings. t-SNE, on the other hand, is particularly effective at visualizing high-dimensional data by focusing on preserving local relationships between data points, making it a popular tool in exploratory data analysis. However, NMF is preferred in this study for its ability to provide non-negative and additive components, which align well with the visual and physical characteristics of fonts, making it more interpretable in the context of typography.

Machine learning classification algorithms for typeface characterization

Logistic regression and decision-tree ensemble models are commonly used for binary classification. Decision-tree ensemble models, including Random Forest24, XGBoost25, and LightGBM26, combine multiple decision trees to enhance performance. Random Forest is a bagging ensemble model that can handle datasets with many prediction variables27. It is a supervised learning algorithm that creates a forest of decision trees and merges them to obtain a more accurate classification and regression model. Random Forest has been found to be highly effective in various studies, consistently ranking among the methods with the highest prediction accuracies for classification tasks. Tyagi et al.28 used a dataset of 26,000 projects and applied various classifiers, concluding that Random Forest was the best based on experimental values24. Similarly, Hussain et al.29 used a dataset of approximately 130,000 projects, including 40,000 successful projects, and found Random Forest to be the best approach, with an accuracy of 80.67%. This is consistent with reports that Random Forest achieves among the highest prediction accuracies in classification settings30.

In this study, we focus on Random Forest because of its proven efficiency and robustness in handling complex datasets. We classified typefaces in a binary manner based on whether they were paired in font-use cases and built prediction models using values extracted from NMF. We evaluated the models using accuracy, precision, and recall. These metrics provide insight into how accurately the model classified typefaces, the proportion of true positives among positively classified items, and the proportion of true positives correctly identified by the model. The use of Random Forest in our analysis allowed us to effectively decipher the morphological features of fonts across various formats, demonstrating its utility in typographical research.

Network

Network analysis enables the quantitative analysis of large-scale data on abstract subjects, modeling complex interactions between individuals and groups in the form of nodes and links. This method facilitates the exploration of structures, diffusion, and evolutionary processes using these relationships. Ahn et al.31 used a bipartite network constructed from data on ingredients and their flavor compounds in the culinary field. This network analysis revealed relationships based on shared flavor compounds and identified new patterns regardless of specific cuisines or ingredients, offering a method for quantitatively analyzing ingredient combinations in a creativity-centric culinary field. They constructed four types of null models to randomly sample ingredient pairs in recipes and analyzed the mean number of shared compounds per region. These models include uniform random, frequency-conservation, frequency and ingredient category preservation, and uniform random with ingredient category preservation. Regardless of the sampling criteria, significant differences were found in the recipes across regions. They also measured the authenticity and prevalence of ingredients in each cuisine. Using this approach, Park and Hyun32 constructed a network using large-scale image data collected from an online interior design platform to reveal rules for color, material, and furniture combinations according to the interior style. These studies have provided a way to analyze new patterns inherent in large-scale data in a particular field through network analysis. In this study, we applied this approach to typefaces to reveal the typeface pairing rules that had previously been evaluated primarily on a qualitative basis. This allows for a more accurate and objective understanding of the interactions between typefaces (i.e., their pairings) and the complex networks they form. We assumed that this would enable more valid and grounded choices when multiple typefaces are used.

Methods

The following section outlines the key stages of the research process, providing a clear step-by-step overview of the methodology used in this study (see Fig. 1). This flowchart captures the sequential steps, starting with Data Collection, where a large dataset of font-use cases and fonts from various mediums was gathered. The next step, Preprocessing, involved organizing the visual data of fonts, including their uppercase, lowercase, symbols, and numbers. In the Font Parameter Reduction phase, Non-Negative Matrix Factorization (NMF) was applied to reduce the dimensionality of the font data and to extract key visual features. Following this, Font Pairing Classification used a Random Forest classification model to distinguish whether fonts were used individually or in pairs based on their morphological characteristics. Lastly, in the Network Analysis step, relationships between fonts and their usage across various mediums were examined using bipartite network projection, identifying authentic font combinations in each medium. This step involved predicting font pairs and interpreting the network analysis to highlight significant font pairings and usage patterns across different mediums. This flowchart provides a systematic overview of the research design, showing the sequence and interaction of the experimental steps.

Fig. 1

Framework of the Research Design and Experimental Steps: The framework illustrates the research process, beginning with data collection and preprocessing, followed by dimensionality reduction using NMF, classification of fonts using Random Forest, and concluding with network analysis to evaluate font pairings across various mediums.

Font data collection

In this study, extensive data collection was conducted on the website fontsinuse.com. This site introduces a wide range of usage cases for typefaces, from popular typefaces, such as Helvetica and Futura, to newer typefaces, allowing people to explore practical applications in specific mediums. We collected large-scale data including 22,897 font-use cases and 9,022 fonts. The font-use cases were categorized into 20 mediums and are detailed in Fig. 2a. Subsequently, we combined .ttf files with typefaces collected as .png files from MyFonts.com, totaling 3,798 fonts. The fonts were subjected to morphological analyses. Additionally, we gathered detailed classifications of each typeface (such as Serif and Sans-serif) from sources such as dafont.com and organized this information, as shown in Fig. 2b. As a result, 15,132 font-use cases with fonts were examined in this manner, out of a total of 22,897. Only those uses containing at least one font were included in the analysis.

Fig. 2

(a) Number of font-use cases by medium; (b) Number of fonts by categorization. Our font analysis focused on the five mediums with the largest numbers of font-use cases (indicated by *).

Extracting font morphological features by NMF

For the morphological analysis of the fonts, we utilized NMF. Following the research method of Kadner et al.18, we reconstructed the data to account for missing symbols in certain fonts using 26 uppercase letters, 26 lowercase letters, 4 symbols, and 10 numbers (Fig. 3a). Based on these data, we built a three-dimensional model trained on the five most frequently used fonts in each medium and conducted a morphological analysis of the fonts. During our exploration, we incorporated bold, light, and italic variants into the model, but this addition did not significantly clarify the interpretation of axis-specific tendencies.

In our analysis, if the specific font style used (e.g., light, bold, or condensed) was not specified, we classified it as regular or medium for consistency. Following the suggestion of Kadner et al.18, we then performed a three-dimensional visual analysis to better understand the morphological features (Fig. 3b). NMF was chosen for its ability to provide interpretable and intuitive results33,34, which is essential when analyzing visual features like thickness and curvature, key factors in font selection and pairing. The non-negativity constraint ensures that features are represented additively, aligning with how designers perceive and manipulate font characteristics. In the analyzed three-dimensional space, the x-axis represents Serif vs. Sans-Serif, the y-axis represents Basic vs. Decorative letterforms, and the z-axis represents Light vs. Bold, effectively reflecting the characteristics of fonts. These characteristics are informed by typographic literature and studies on font classification and visual characteristics35,36,37, including the font classification guide provided by MyFonts.com. The labels were assigned based on visual inspection of evenly spaced samples from the NMF-reduced data, which aligned with these categories, allowing us to provide intuitive and visually interpretable axis labels. Interestingly, within the same font, we observed movement along the z-axis as the ‘bold’ attribute was emphasized. However, comparing the difference in thickness between Sans-Serif and Serif is challenging: stroke thickness in Sans-Serif is largely uniform, which makes comparison easier, whereas stroke thickness in Serif is not uniform, leading to ambiguity in setting the analysis criteria. The meanings of these axes allowed us to determine the font design parameters.

We conducted a detailed correlation analysis using the Pearson correlation coefficient38 to measure the thickness and curvature of fonts. The analysis revealed strong correlations between the NMF-derived axes and key font characteristics, with the X-axis linked to curvature and the Z-axis to thickness, further validating the semantic meanings assigned to each axis (See Supplementary Information S3 for detailed results).

This analysis revealed the following patterns. The X-axis was associated with only a minimal increase in curvature (correlation of 0.077), whereas the Y-axis and Z-axis showed more substantial effects. The Y-axis demonstrated a moderate correlation with thickness (0.3858) and a slight positive correlation with curvature (0.096), indicating that fonts with higher Y-axis values tend to be slightly thicker and more curved. The Z-axis presented a strong correlation with thickness (0.6189) and a slight negative correlation with curvature (-0.061), suggesting that as fonts become bolder (higher values on the Z-axis), they tend to be thicker but slightly less curved. These findings further support the semantic interpretation of the axes, linking key design features such as serif characteristics and boldness directly to font attributes. A comparative analysis of NMF with other methods like PCA is provided in the Supplementary Information S3, highlighting why NMF is the most suitable for this visual analysis.
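
For readers who want to reproduce this kind of check, the Pearson coefficient can be computed as in the following sketch. The function name and the sample values (Z-axis coordinates and stroke thickness for five hypothetical fonts) are assumptions made for illustration; they are not measurements from this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical values: Z-axis coordinate and measured stroke thickness.
z_axis = [5.0, 12.0, 20.0, 33.0, 47.0]
thickness = [0.8, 1.1, 1.9, 2.6, 3.9]
r_z_thickness = pearson(z_axis, thickness)
```

A coefficient close to 1 or -1 indicates a strong linear relationship between an axis and a measured attribute, while values near 0 (such as the curvature correlations reported above) indicate a weak one.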

Fig. 3

NMF: (a) Font morphological data example (Futura); and (b) Results of NMF - font space, examples along three axes, example of Euclidean distance value.

Machine learning algorithms for analyzing differences between font as pairing or single

In our study, we labeled fonts as 1 for pairing and 0 for single use in each medium. We then trained prediction models on the coordinates of the fonts extracted through NMF. We evaluated three machine learning models, Logistic Regression, Random Forest, and XGBoost (Fig. 4a), to determine which provided the most accurate prediction of font usage patterns. Random Forest consistently outperformed the other models. Considering the dataset size (N=3,798), we measured the accuracy, precision, and recall of the prediction models. The dataset was divided into a training set (70%) and a testing set (30%). The accuracy, precision, recall, and F1 values of the prediction models trained for each medium are shown in Fig. 4b. Here, accuracy is the proportion of correctly predicted samples from the total samples, precision is the proportion of true-positive samples among those predicted as positive, and recall is the proportion of true-positive samples correctly identified by the model. F1 is the harmonic mean of precision and recall and is used to evaluate the predictive performance of the model. The accuracy of the Random Forest model was generally above 0.8 across all mediums. Using the Random Forest model trained for Single or Pairing classification (Accuracy Avg=0.81, Precision Avg=0.85, Recall Avg=0.93), we predicted and classified whether the fonts were used for pairing within the medium based on morphological characteristics. Additionally, we analyzed Partial Dependence Plots (PDPs) to assess which specific values along the X, Y, and Z axes were most influential in determining font pairing tendencies, that is, the points along each axis where fonts were most likely to be classified as paired or single. This approach reveals significant insights into font pairing tendencies within each medium.
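
The metric definitions above can be made concrete with a short sketch. The function name and the label vectors are hypothetical (1 = used in a pairing, 0 = used singly) and serve only to illustrate how the four values are derived from the confusion counts.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = paired)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example there are four true positives, two true negatives, one false positive, and one false negative, so accuracy is 6/8 and precision and recall are both 4/5.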

Fig. 4

Machine Learning Algorithm Performance: (a) Average performance metrics (accuracy, precision, recall, F1) for Logistic Regression, Random Forest, and XGBoost with test sizes of 0.2 and 0.3; (b) Random Forest model performance across different mediums in terms of accuracy, precision, recall, and F1 score.

Network analysis

To analyze the pairing of fonts further, we constructed a Font Network. We followed the methodology proposed by Ahn et al.31 to form a font-medium network, exploring how various fonts are utilized in specific mediums. Specifically, we adopted the Bipartite Network Projection method from Ahn et al.31 This network structure allows connections between only two distinct sets of nodes, representing the relationships between fonts and the various mediums in which they are applied.

These connections were visualized as a One-mode Network, clearly showing the relationship between the fonts. Additionally, the color of each node indicates the type of font, and the thickness of the edges between nodes represents how often different fonts are connected within the same usage.
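
A one-mode projection of this kind can be sketched as follows. The sketch assumes that each font-use case is available as a list of font names and that two fonts are linked once for every use case they share; the function name and the sample use cases are hypothetical.

```python
from collections import Counter
from itertools import combinations


def project_to_fonts(use_cases):
    """Project a (use case)-(font) bipartite structure onto fonts.

    Returns a Counter mapping each unordered font pair to the number of
    use cases in which both fonts appear (the edge weight).
    """
    weights = Counter()
    for fonts in use_cases:
        # sorted(set(...)) removes duplicates and fixes the pair order.
        for a, b in combinations(sorted(set(fonts)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical font-use cases within one medium.
use_cases = [
    ["Futura", "Helvetica"],
    ["Helvetica", "Futura", "Times"],
    ["Futura"],
]
edges = project_to_fonts(use_cases)
```

Use cases with a single font contribute no edges, and the resulting weights correspond to the edge thickness in the visualized network.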

To delve further into the significance of font combinations within the mediums, we applied the Authenticity Algorithm of Ahn et al.31 to calculate the prevalence and authenticity of each font. This algorithm measures each font’s prevalence (\(P_{i}^{s}\)) and authenticity (\(A_{i}^{s}\)) in a specific medium, based on the number of font-use cases in that medium (\(N_s\)). The prevalence of a pair of fonts (ij) in the medium (s) is expressed as \(P_{ij}^{s} = \frac{n_{ij}^{s}}{N_s}\), and that of a triplet of fonts (ijk) as \(P_{ijk}^{s} = \frac{n_{ijk}^{s}}{N_s}\), where \(n_{ij}^{s}\) and \(n_{ijk}^{s}\) are the numbers of font-use cases in medium s that contain the pair and the triplet, respectively. The authenticity is then calculated by subtracting the average prevalence of the font pairing in the other mediums from its prevalence in the specific medium, expressed as \(A_{ij}^{s} = P_{ij}^{s} - \langle P_{ij}^{s'} \rangle _{s' \ne s}\) for pairs, and \(A_{ijk}^{s} = P_{ijk}^{s} - \langle P_{ijk}^{s'} \rangle _{s' \ne s}\) for triplets. Additionally, we calculated centrality metrics, such as betweenness and eigenvector centrality, to identify the key elements within the network. This approach aims to interpret the complex network structure of font usage and how the fonts in each medium interact and are utilized. We constructed null models to create randomized font pairings and conducted a z-test to compare the real and randomized pairings, examining the axis-specific differences. This allowed us to measure how much actual pairings deviated from random expectations and confirm their significance. For further details, see Supplementary Information Section S2.
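
The pair-level prevalence and authenticity formulas can be illustrated with the following sketch. It assumes the font-use cases are grouped by medium as lists of font names; the function names, medium names, and use cases are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter
from itertools import combinations


def pair_prevalence(use_cases):
    """P_ij^s = n_ij^s / N_s for every font pair within one medium."""
    n_s = len(use_cases)
    counts = Counter()
    for fonts in use_cases:
        for pair in combinations(sorted(set(fonts)), 2):
            counts[pair] += 1
    return {pair: c / n_s for pair, c in counts.items()}


def pair_authenticity(cases_by_medium, medium):
    """A_ij^s = P_ij^s minus the mean of P_ij^s' over all other mediums s'."""
    if len(cases_by_medium) < 2:
        raise ValueError("authenticity needs at least two mediums")
    prevalence = {s: pair_prevalence(c) for s, c in cases_by_medium.items()}
    others = [s for s in prevalence if s != medium]
    return {
        pair: p - sum(prevalence[s].get(pair, 0.0) for s in others) / len(others)
        for pair, p in prevalence[medium].items()
    }


# Hypothetical font-use cases grouped by medium.
cases_by_medium = {
    "Web": [["Futura", "Helvetica"], ["Futura", "Helvetica"],
            ["Times"], ["Helvetica", "Times"]],
    "Books": [["Futura", "Helvetica"], ["Garamond"]],
    "Posters": [["Druk", "Futura"], ["Futura"]],
}
web_authenticity = pair_authenticity(cases_by_medium, "Web")
```

Here the Futura-Helvetica pair has a prevalence of 0.5 in "Web", 0.5 in "Books", and 0 in "Posters", so its authenticity in "Web" is 0.5 - 0.25 = 0.25. Triplets follow the same pattern with `combinations(..., 3)`.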

Results and discussion

Data overview

The collected data are summarized as follows: On average, each font-use case incorporates a combination of 2.05 fonts, with a maximum of 42. Figure 5a shows the number of font-use cases with up to 10 fonts. Specifically, 9,655 use cases involved a single font, 8,042 font pairs, and 3,089 triplets. Among these, the most frequently used fonts were Sans-Serif, such as Helvetica and Futura, which were also predominantly used in pairs and triplets (Fig. 5b).

Fig. 5

Data Overview: (a) Number of fonts paired in font-use cases, (b) Top six singles with high frequency, (c) Top six pairs with high frequency.

Fig. 6

(a) Partial dependence plots of x, y, and z for the eight top-ranked mediums. The gray dotted lines indicate points with higher prediction probabilities. Examples of predicted fonts and font-use cases: (b) Branding / Identity, and (c) Booklets / Pamphlets.

Additionally, across all mediums, the most widely used fonts included Futura, Helvetica, Gotham, Akzidenz-Grotesk, and FF-DIN, with a total of 13. The most frequently used pairs in the 15 mediums were Futura-Futura Condensed, Futura-Helvetica, and Helvetica-Times. In the eight mediums, the most commonly used triplets were Helvetica-Eurostile-ITC Avant Garde Gothic, Helvetica-Futura-Eurostile, and Druk-Druk Condensed-Druk Wide. Furthermore, we compared the average values of each axis in the NMF of the fonts used in each medium to ascertain the differences across various mediums (see Supplementary Information Table S1). Using One-way ANOVA, we verified the statistical significance of the average differences among the mediums (\(F_X = 9.3008, p < 0.001\); \(F_Y= 15.492, p < 0.001\); \(F_Z= 5.7044, p < 0.001\)). This statistical approach helps identify significant patterns and variations in font usage across different mediums, providing a robust basis for further analysis. For each medium, we observed distinct trends in the font design parameters. On the x-axis, which represents a range from Sans-Serif to Serif, printed mediums like ’Magazines/Periodicals’ (\(Mean_X = 41.95\)) typically used Serif, highlighting their choice for traditional typography. On the y-axis, indicative of decorative style, ’Advertising’ (\(Mean_Y = 30.84\)) showed a higher usage of visually ornate fonts, suggesting an emphasis on visual design elements. Lastly, on the z-axis, which measures font thickness, digital mediums such as ’Mobile/Tablet’ (\(Mean_Z = 32.04\)) generally favored thicker fonts, highlighting varied font preferences across different mediums. In addition, most digital mediums significantly prefer thicker fonts, with average values above 30 (p< 0.05).
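
The F statistic of a one-way ANOVA, as used for the comparison across mediums, is the ratio of between-group to within-group mean squares. The sketch below computes it for hypothetical axis values in three mediums; the function name and the numbers are illustrative assumptions, and the p-value (which requires the F distribution) is not computed here.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    k = len(groups)                    # number of groups (mediums)
    n = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical axis values for fonts in three mediums.
axis_values_by_medium = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat = one_way_anova_f(axis_values_by_medium)
```

For this toy input the between-group sum of squares is 26 (2 degrees of freedom) and the within-group sum of squares is 6 (6 degrees of freedom), giving F = 13.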

Morphological characteristics between single and paired fonts

We employed a Random Forest prediction model to analyze the morphological differences between fonts used individually and those paired with other fonts across various mediums. Using the model, we predicted the X, Y, and Z values of paired fonts in each medium and analyzed the axial differences between fonts in a pair. In this study, we focused on eight mediums, each with over 100 pairs of data, where all NMF coordinate values were available, as shown in Fig. 6a.

The Partial Dependence Plots (PDPs) derived from the Random Forest model in Fig. 6a show how changes in the X, Y, and Z values (representing different font characteristics) affect the likelihood of a font being used in combination with others. Using the PDP obtained from the Random Forest model, we determined the points on each axis that were more likely to be classified as 1 in the model’s classification decision. For example, when examining the PDP for each coordinate in Branding/Identity, we found that the prediction rate was high, from above 60 on the x-axis, between 0-10 and at point 75 on the z-axis. At these points of high prediction rates on the x-, y-, and z-axes, we observed fonts with bold and distinct glyphs that were used in Branding/Identity contexts (Fig. 6b). Similarly, in Booklets/Pamphlets, the fonts at high prediction points on each axis were found to be Serif, like Apple Garamond and HTF Didot around points 23-24 on the x-axis, and lighter fonts below 10 on the z-axis (Fig. 6c). These insights highlight the practical value of the PDPs, providing data-driven guidance to designers on how certain font traits affect usage across various mediums.
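
The idea behind a partial dependence curve can be sketched independently of any particular model: one feature is fixed to each grid value in turn for every sample, and the model's predictions are averaged. In the sketch below, `toy_predict` is a hypothetical stand-in for a trained classifier's predicted probability of the "paired" class, and the rows and grid are invented for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value   # override one coordinate only
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_predict(row):
    """Hypothetical model: pairing is likely when x exceeds 60."""
    x, _y, _z = row
    return 0.9 if x > 60 else 0.3


# Hypothetical (x, y, z) NMF coordinates of three fonts.
rows = [(10.0, 5.0, 20.0), (70.0, 8.0, 75.0), (40.0, 2.0, 10.0)]
pdp_x = partial_dependence(toy_predict, rows, feature=0, grid=[0.0, 50.0, 100.0])
```

Grid values where the curve rises mark the regions of an axis in which the model is more inclined to classify a font as paired, which is how the high-prediction points above are read.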

Furthermore, we examined the axis-specific differences between the fonts used in pairs. Using four null models, we randomly sampled 1,000 pairs and calculated the axis-specific differences for pairs used in each of the eight mediums by conducting a z-test. This allowed us to analyze the trends in the x, y, and z coordinates for paired fonts across the eight mediums. We present the mediums that showed the same trends and statistically significant differences in the z-test results across all four null models, as per the method of Ann et al.31; despite the influence of frequency and font types in random pairs, certain mediums showed robust patterns across each axis (see Supplementary Information S2). The absolute value of the z-statistic indicates the number of standard deviations of the sample mean from the population mean; the larger the absolute value, the greater the deviation.
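The comparison against random pairs can be illustrated as follows. This is a sketch under several assumptions: it uses a single uniform-sampling null model (not the four null models of the study), it measures the absolute coordinate difference within a pair, and the font names and coordinates are placeholders. The function name `pair_difference_z` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def pair_difference_z(observed_pairs, font_pool, coords, axis, n_random=1000, seed=0):
    """z-statistic of the mean within-pair difference versus randomly drawn pairs."""
    rng = random.Random(seed)
    observed = [abs(coords[a][axis] - coords[b][axis]) for a, b in observed_pairs]

    # Null distribution: differences for pairs drawn uniformly from the font pool.
    null = []
    for _ in range(n_random):
        a, b = rng.sample(font_pool, 2)
        null.append(abs(coords[a][axis] - coords[b][axis]))

    mu, sigma = mean(null), pstdev(null)
    # Number of standard errors between the observed mean and the null mean.
    return (mean(observed) - mu) / (sigma / sqrt(len(observed)))


# Placeholder fonts with made-up (x, y, z) coordinates.
coords = {
    "sans_a": (0.0, 5.0, 20.0),
    "sans_b": (1.0, 6.0, 60.0),
    "serif_a": (100.0, 4.0, 25.0),
    "serif_b": (101.0, 7.0, 55.0),
}
observed_pairs = [("sans_a", "sans_b"), ("serif_a", "serif_b")]
z_x = pair_difference_z(observed_pairs, list(coords), coords, axis=0)
print(f"x-axis z-statistic: {z_x:.2f}")
```

A negative value means the observed pairs are closer on that axis than random pairs would be, which is how like-with-like pairing shows up in this kind of test.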

For differences along the x-axis, different trends were observed depending on the null model, with only Album Art showing significant results (p < 0.001). In all four models, its z-statistic fell between -4 and -5, that is, four to five standard deviations below the random-pair mean, suggesting that Sans-Serif tends to pair with Sans-Serif and Serif with Serif more than expected. For the y-axis, most mediums showed the same trend regardless of the null model. Web, Booklets/Pamphlets, Branding/Identity, and Magazines/Periodicals showed significant differences compared to random pairs (p < 0.001), indicating a tendency to pair fonts with minimal differences along the y-axis in these mediums. The absolute value differences and the corresponding coordinate differences can be found in Supplementary Information S2. Conversely, Album Art showed significant differences in all but the frequency and medium conservation models, mainly showing positive differences, suggesting pairs of Basic and Decorative letterform fonts. On the z-axis, all mediums predominantly showed positive values, indicating that all mediums tended to use both light and bold fonts simultaneously. Album Art, Books, and Magazines/Periodicals, in particular, showed significant differences across all null models.

Network analysis of fonts and authentic font pairings in medium

The results of the Backbone of the Font Network (see Fig. 7) revealed distinct patterns in how fonts are used across various design mediums. The network was constructed by analyzing font pairings observed across different design contexts, with each node representing a specific font. The size of the nodes indicates the prevalence of each font: larger nodes, like those representing Helvetica, Futura, and Univers, highlight fonts that are frequently used across multiple mediums. The edges between nodes represent the frequency of font pairings within the same medium, with thicker edges indicating more frequent combinations. Node colors represent different font types (e.g., Sans-Serif, Serif, Decorative), providing a visual guide to how these types distribute and interact within the network.
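A co-occurrence network of this kind can be assembled with simple counting. The sketch below builds node weights (how often a font appears) and edge weights (how often two fonts appear together) from a handful of invented use cases; it pools all use cases rather than separating them by medium, and it does not perform the backbone extraction step. The function name `build_font_network` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_font_network(use_cases):
    """Count font occurrences (node weights) and co-occurrences (edge weights)."""
    node_weight = Counter()
    edge_weight = Counter()
    for fonts in use_cases:
        unique = sorted(set(fonts))  # sorted so each pair has one canonical key
        node_weight.update(unique)
        edge_weight.update(combinations(unique, 2))
    return node_weight, edge_weight


# Invented use cases: each inner list holds the fonts that appear in one design.
use_cases = [
    ["Helvetica", "Futura"],
    ["Futura", "Helvetica", "Univers"],
    ["Helvetica"],
]
node_weight, edge_weight = build_font_network(use_cases)
print(node_weight.most_common())
print(edge_weight.most_common())
```

Node size and edge thickness in a drawing would then be scaled from `node_weight` and `edge_weight`, respectively.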

The analysis clearly shows that Helvetica, Futura, and Univers occupy central positions within the network, not only due to their prevalence but also because of their frequent connections to other fonts. These fonts are highly versatile, serving as the foundation for typographic designs across a wide variety of contexts. Their strong connections to other fonts, particularly the frequent pairings between Helvetica and Futura, demonstrate a robust complementary relationship, making them staple choices in contemporary design practices. In contrast, Decorative fonts like Cooper Black and Gill Sans form smaller, more concentrated clusters. While these fonts are paired with others less frequently, their clustering suggests they play a distinct role in certain design contexts, though further data would be needed to determine the exact contexts (e.g., posters, advertisements) in which they are predominantly used. Serif fonts, while still visible in the network, appear on the periphery. Their placement indicates they are less frequently paired with other fonts compared to Sans-Serif fonts. While the network does not specify this, it may suggest their more selective use in design contexts, such as traditional or editorial layouts, where readability or aesthetic preferences are prioritized.

This analysis highlights the most influential fonts within the network and shows how different font types interact. Sans-Serif fonts dominate, appearing as central nodes, while Serif, Decorative, and Script fonts tend to occupy more specialized roles, forming smaller clusters. The network thus offers a broad view of both widely used and less common font pairings.

Fig. 7. Backbone of the font network.

We investigated the impact of font pairing in the form of singles, pairs, and triplets for each medium. This allowed us to identify font-pairing combinations in mediums that had not been previously analyzed using the null model. We calculated the authenticity of all singles, pairs, and triplets used in each medium and created an Authenticity Pyramid for each medium. Figure 8 shows the Authenticity Pyramids for Branding/Identity, Posters/Flyers, Books, and the Web. For instance, in ‘Branding/Identity,’ regular/bold Sans-Serif fonts with low x- and y-coordinate values and high z-coordinate values in the NMF coordinate space are predominantly used (Fig. 8e); interestingly, triplets combining Gotisch and Sans-Serif fonts have high authenticity (Fig. 8a). In Posters/Flyers, pairings not used in other mediums measured the highest authenticity, predominantly involving Serial B/D, Ogg, and Decorative fonts such as Eniac and Digestive (Fig. 8b). The Authenticity Pyramid for Books (Fig. 8c) shows a tendency different from other mediums, where Sans-Serifs are mainly paired with other Sans-Serifs, and Serifs with other Serifs. Additionally, in the ‘Web,’ Neue-Helvetica not only ranks high in authenticity but also plays an important role in the network (Fig. 8f), having the highest Betweenness Centrality and ranking fourth in Eigenvector Centrality among all fonts. Furthermore, in ‘Web,’ there is a tendency to pair fonts of the same type (Fig. 8d), and pairs that do combine Sans-Serif with Serif often use fonts with a y-axis value of 0 and a low distance between them. Fonts from the Halyard family are frequently used in pairs and triplets.

In addition, we analyzed the authenticity of high-frequency fonts across different mediums. Notable differences were observed even among universally used fonts (Fig. 9). Helvetica, for instance, showed high authenticity in Album Art and Object Product, but lower authenticity in mediums such as Exhibition/Installation, Film/Video, and Magazine/Maps. This suggests that although Helvetica is frequently used, its influence varies across different mediums. Neue-Helvetica, in turn, shows its highest authenticity in ‘Software Apps’. Notably, while most other fonts display negative authenticity in the ‘Web’, Neue-Helvetica alone has positive values. This indicates that Neue-Helvetica significantly influences software and web mediums.
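Negative and positive authenticity values are easiest to read with a small worked example. The sketch below assumes that authenticity is a relative prevalence in the style of food-pairing network studies: the share of use cases in a medium that contain the font, minus the average of that share across the other mediums. This definition is an assumption for illustration and may differ from the one in the Methods; the counts, the placeholder font name, and the function name `authenticity` are all invented.

```python
def authenticity(counts, totals, font, medium):
    """Relative prevalence of `font` in `medium` versus the mean of other mediums.

    ASSUMED definition (not confirmed by the text):
        prevalence(font, m) = counts[m][font] / totals[m]
        authenticity        = prevalence in `medium`
                              - mean prevalence over all other mediums
    """
    prevalence = {m: counts[m].get(font, 0) / totals[m] for m in totals}
    others = [p for m, p in prevalence.items() if m != medium]
    return prevalence[medium] - sum(others) / len(others)


# Made-up numbers: use cases containing a placeholder font, per medium.
counts = {
    "Web": {"font_a": 30},
    "Books": {"font_a": 10},
    "Album Art": {"font_a": 20},
}
totals = {"Web": 100, "Books": 100, "Album Art": 100}

for medium in totals:
    print(medium, round(authenticity(counts, totals, "font_a", medium), 3))
```

Under this assumed definition, a font that is common everywhere can still have negative authenticity in a given medium if it is even more common elsewhere, which matches the kind of contrast described for Helvetica above.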

Fig. 8. Authenticity pyramid for the top five mediums: (a) Branding/Identity, (b) Posters/Flyers, (c) Books, (d) Web, (e) Branding/Identity, high authenticity font NMF distribution, and (f) Web networks.

Conclusion

This study introduces a data-driven framework to provide objective and reliable criteria for font selection and pairing, addressing the limitations of subjective, intuition-based approaches commonly used in typography and graphic design. Using NMF, we visualized fonts across a variety of morphological features, reducing their characteristics into three interpretable dimensions: Serif vs. Sans-Serif (X-axis), Basic vs. Decorative letterforms (Y-axis), and Light vs. Bold (Z-axis). These axes capture key aspects of font morphology and provide designers with a practical tool to quantitatively assess visual differences between fonts, aiding in both single font usage and font pairings across different design contexts.

Our results demonstrate that NMF offers a visually interpretable method for representing the essential elements of font design by delivering more meaningful correlations between the axes and key visual traits, such as thickness and curvature. This novel framework allows for a deeper understanding of font morphology, particularly in the context of font pairing, which has previously been difficult to quantify in design practice. Additionally, we compared axis-specific differences between random font pairings from null models and actual pairs used in various mediums, offering insights into the relationships between fonts and the patterns of their usage.

However, while the three-dimensional latent space defined by NMF effectively captures core morphological characteristics, these three axes may not fully explain the complexity of all fonts, particularly those with intricate design features. Although more refined axes could further improve the granularity of font analysis, this remains a challenge for future work. The current framework nevertheless provides a significant first step in understanding font morphology and pairing relationships, offering a practical and interpretable model that can be expanded upon in subsequent research. Furthermore, the study primarily focused on font pairs and triplets, without exploring more complex combinations of four or more fonts. This simplification, though practical, limits the scope of the analysis in more intricate design projects. Future research should explore more complex font interactions, as well as user interaction data, to enhance the practical relevance of the framework. Expanding the dataset to include a wider range of fonts and design mediums would improve the generalizability of the results and provide deeper insights into font selection processes across various contexts.

In conclusion, this study quantitatively assesses the visual characteristics of fonts and provides a robust, interpretable framework that enables designers to make more informed and objective decisions about font pairings. By leveraging global font data and applying advanced computational methods, we have revealed new insights into font usage patterns, offering a valuable tool that bridges the gap between the art and science of typography. As typography continues to play a pivotal role, especially in graphic design, this research contributes to the growing understanding of font selection and pairing, sheds light on new patterns in font usage, and advances both the theoretical and practical applications of typography in contemporary design environments.

Fig. 9. Comparison of authenticity by medium for high-frequency fonts.