Function textwrap::core::display_width
pub fn display_width(text: &str) -> usize
Compute the display width of text while skipping over ANSI escape sequences.
§Examples
use textwrap::core::display_width;
assert_eq!(display_width("Café Plain"), 10);
assert_eq!(display_width("\u{1b}[31mCafé Rouge\u{1b}[0m"), 10);
assert_eq!(display_width("\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\"), 14);
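To make the skipping behavior concrete, here is a minimal sketch of how such sequences could be removed before measuring. It is not textwrap's actual implementation: strip_ansi is a hypothetical helper, and it assumes only two sequence shapes matter, namely CSI sequences (ESC [ up to a final byte in the range 0x40 to 0x7E) and OSC sequences (ESC ] terminated by BEL or by ESC \).

```rust
/// Hypothetical helper (not part of textwrap): returns `text` with CSI and
/// OSC escape sequences removed. Assumes well-formed input; an unterminated
/// sequence swallows the rest of the string.
fn strip_ansi(text: &str) -> String {
    let mut out = String::new();
    let mut chars = text.chars();
    while let Some(ch) = chars.next() {
        if ch != '\u{1b}' {
            out.push(ch);
            continue;
        }
        match chars.next() {
            // CSI: skip parameter bytes up to and including the final byte.
            Some('[') => {
                for c in chars.by_ref() {
                    if ('\u{40}'..='\u{7e}').contains(&c) {
                        break;
                    }
                }
            }
            // OSC: skip until BEL or the two-character terminator ESC \.
            Some(']') => {
                let mut prev = ']';
                for c in chars.by_ref() {
                    if c == '\u{07}' || (prev == '\u{1b}' && c == '\\') {
                        break;
                    }
                    prev = c;
                }
            }
            // Any other escape: drop ESC and the single character after it.
            _ => {}
        }
    }
    out
}
```

With this sketch, the colored string and the hyperlink from the examples above reduce to "Café Rouge" and "This is a link", whose visible characters match the widths of 10 and 14 reported by display_width.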
Note: When the unicode-width Cargo feature is disabled, the width of a char is determined by a crude approximation which simply counts chars below U+1100 as 1 column wide, and all other characters as 2 columns wide. With the feature enabled, this function correctly handles combining characters in their decomposed form (see Unicode equivalence).
An example of a decomposed character is “é”, which can be
decomposed into: “e” followed by a combining acute accent: “◌́”.
Without the unicode-width Cargo feature, every char below U+1100 has a width of 1. This includes the combining accent:
use textwrap::core::display_width;
assert_eq!(display_width("Cafe Plain"), 10);
#[cfg(feature = "unicode-width")]
assert_eq!(display_width("Cafe\u{301} Plain"), 10);
#[cfg(not(feature = "unicode-width"))]
assert_eq!(display_width("Cafe\u{301} Plain"), 11);
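The fallback rule described in the note can be sketched in a few lines. The names approx_char_width and approx_display_width are hypothetical and exist only for illustration. The sketch covers only the "below U+1100 is 1 column, everything else is 2 columns" rule and does not skip ANSI escape sequences.

```rust
/// Hypothetical stand-in for the fallback rule: chars below U+1100 count
/// as 1 column, all other chars as 2 columns.
fn approx_char_width(ch: char) -> usize {
    if ch < '\u{1100}' {
        1
    } else {
        2
    }
}

/// Sums the approximate width of every char in `text`.
/// Unlike display_width, this sketch does not skip escape sequences.
fn approx_display_width(text: &str) -> usize {
    text.chars().map(approx_char_width).sum()
}
```

Under this rule the combining accent U+0301 lies below U+1100 and is counted as one column, which is why "Cafe\u{301} Plain" comes out as 11 rather than 10 when the unicode-width feature is disabled.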
§Emojis and CJK Characters
Characters such as emojis and CJK characters used in the Chinese, Japanese, and Korean languages are treated as double-width, even if the unicode-width feature is disabled:
use textwrap::core::display_width;
assert_eq!(display_width("😂😭🥺🤣✨😍🙏🥰😊🔥"), 20);
assert_eq!(display_width("你好"), 4); // “Nǐ hǎo” or “Hello” in Chinese
§Limitations
The displayed width of a string cannot always be computed from the string alone. This is because the width depends on the rendering engine used. This is particularly visible with emoji modifier sequences where a base emoji is modified with, e.g., skin tone or hair color modifiers. It is up to the rendering engine to detect this and to produce a suitable emoji.
A simple example is “❤️”, which consists of “❤” (U+2764: Heavy Black Heart) followed by U+FE0F (Variation Selector-16). By itself, “❤” is a black heart, but if you follow it with the variation selector, you may get a wider red heart.
A more complex example would be “👨‍🦰”, which should depict a man with red hair. Here the computed width is too large, and it differs depending on whether the unicode-width feature is enabled:
use textwrap::core::display_width;
assert_eq!("👨🦰".chars().collect::<Vec<char>>(), ['\u{1f468}', '\u{200d}', '\u{1f9b0}']);
#[cfg(feature = "unicode-width")]
assert_eq!(display_width("👨🦰"), 4);
#[cfg(not(feature = "unicode-width"))]
assert_eq!(display_width("👨🦰"), 6);
This happens because the grapheme consists of three code points: “👨” (U+1F468: Man), Zero Width Joiner (U+200D), and “🦰” (U+1F9B0: Red Hair). You can see them in the test above. With unicode-width enabled, the ZWJ is correctly seen as having zero width; without it, the ZWJ is counted as a double-width character.
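The width of 6 without the unicode-width feature can be reproduced by hand. The sketch below uses a hypothetical helper, approx_widths, which applies the fallback rule (1 column below U+1100, 2 columns otherwise) to each code point separately. It is an illustration of the arithmetic, not textwrap's code.

```rust
/// Hypothetical helper: pairs each code point in `text` with its width
/// under the fallback rule (1 column below U+1100, 2 columns otherwise).
fn approx_widths(text: &str) -> Vec<(char, usize)> {
    text.chars()
        .map(|ch| (ch, if ch < '\u{1100}' { 1 } else { 2 }))
        .collect()
}
```

For "\u{1f468}\u{200d}\u{1f9b0}" every code point, including the Zero Width Joiner at U+200D, lies at or above U+1100, so each contributes 2 columns for a total of 6.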
§Terminal Support
Modern browsers typically do a great job of combining characters as shown above, but terminals often struggle more. As an example, GNOME Terminal version 3.38.1 shows “❤️” as a big red heart, but shows “👨‍🦰” as two separate emojis, “👨” followed by “🦰”.