Function textwrap::core::display_width

pub fn display_width(text: &str) -> usize

Compute the display width of text while skipping over ANSI escape sequences.

§Examples

use textwrap::core::display_width;

assert_eq!(display_width("Café Plain"), 10);
assert_eq!(display_width("\u{1b}[31mCafé Rouge\u{1b}[0m"), 10);
assert_eq!(display_width("\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\"), 14);
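
The following is a minimal sketch of how escape sequences like the ones above could be skipped. It is not textwrap's implementation: `strip_escape_sequences` is a hypothetical helper, and it assumes only CSI sequences (`ESC [` up to a final byte in 0x40..=0x7E) and OSC sequences (`ESC ]` up to BEL or `ESC \`) need to be recognized.

```rust
/// Hypothetical helper (not part of textwrap): return `text` with CSI and
/// OSC escape sequences removed.
fn strip_escape_sequences(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut chars = text.chars();
    while let Some(ch) = chars.next() {
        if ch != '\u{1b}' {
            out.push(ch);
            continue;
        }
        // An ESC starts a sequence; the next char selects its kind.
        match chars.next() {
            // CSI: ESC [ parameters, ended by a final byte in 0x40..=0x7E.
            Some('[') => {
                for c in chars.by_ref() {
                    if ('\u{40}'..='\u{7e}').contains(&c) {
                        break;
                    }
                }
            }
            // OSC: ESC ] payload, ended by BEL or by ESC \.
            Some(']') => {
                let mut prev = ']';
                for c in chars.by_ref() {
                    if c == '\u{07}' || (prev == '\u{1b}' && c == '\\') {
                        break;
                    }
                    prev = c;
                }
            }
            // Assumption: any other char after ESC (or a trailing ESC) is dropped.
            _ => {}
        }
    }
    out
}
```

Under these assumptions, only the visible text remains to be measured: the colored string reduces to `Café Rouge` and the hyperlink to `This is a link`.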

Note: When the unicode-width Cargo feature is disabled, the width of a char is determined by a crude approximation that counts chars below U+1100 as 1 column wide and all other chars as 2 columns wide. With the feature enabled, the function correctly handles combining characters in their decomposed form (see Unicode equivalence).

An example of a decomposed character is “é”, which can be decomposed into “e” followed by a combining acute accent, “◌́”. Without the unicode-width Cargo feature, every char below U+1100 has a width of 1. This includes the combining accent:

use textwrap::core::display_width;

assert_eq!(display_width("Cafe Plain"), 10);
#[cfg(feature = "unicode-width")]
assert_eq!(display_width("Cafe\u{301} Plain"), 10);
#[cfg(not(feature = "unicode-width"))]
assert_eq!(display_width("Cafe\u{301} Plain"), 11);
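
The fallback rule described in the note can be sketched in a few lines. This is an illustration of the stated rule only, not the crate's code: `fallback_char_width` and `fallback_width` are hypothetical names, and escape-sequence skipping is left out.

```rust
/// Hypothetical stand-in for the fallback rule: chars below U+1100 count as
/// one column, all other chars as two.
fn fallback_char_width(ch: char) -> usize {
    if ch < '\u{1100}' {
        1
    } else {
        2
    }
}

/// Sum the per-char widths. Escape sequences are not handled in this sketch.
fn fallback_width(text: &str) -> usize {
    text.chars().map(fallback_char_width).sum()
}
```

Because the combining accent U+0301 lies below U+1100, this rule counts it as one column, which is why the decomposed string is reported as one column wider than it renders.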

§Emojis and CJK Characters

Characters such as emojis and CJK characters used in the Chinese, Japanese, and Korean languages are seen as double-width, even if the unicode-width feature is disabled:

use textwrap::core::display_width;

assert_eq!(display_width("😂😭🥺🤣✨😍🙏🥰😊🔥"), 20);
assert_eq!(display_width("你好"), 4);  // “Nǐ hǎo” or “Hello” in Chinese

§Limitations

The displayed width of a string cannot always be computed from the string alone. This is because the width depends on the rendering engine used. This is particularly visible with emoji modifier sequences where a base emoji is modified with, e.g., skin tone or hair color modifiers. It is up to the rendering engine to detect this and to produce a suitable emoji.

A simple example is “❤️”, which consists of “❤” (U+2764: Black Heart Symbol) followed by U+FE0F (Variation Selector-16). By itself, “❤” is a black heart, but if you follow it with the variation selector, you may get a wider red heart.

A more complex example would be “👨‍🦰”, which should depict a man with red hair. Here the computed width is too large, and it differs depending on whether the unicode-width feature is enabled:

use textwrap::core::display_width;

assert_eq!("👨‍🦰".chars().collect::<Vec<char>>(), ['\u{1f468}', '\u{200d}', '\u{1f9b0}']);
#[cfg(feature = "unicode-width")]
assert_eq!(display_width("👨‍🦰"), 4);
#[cfg(not(feature = "unicode-width"))]
assert_eq!(display_width("👨‍🦰"), 6);

This happens because the grapheme consists of three code points: “👨” (U+1F468: Man), Zero Width Joiner (U+200D), and “🦰” (U+1F9B0: Red Hair). You can see them in the test above. With unicode-width enabled, the ZWJ is correctly seen as having zero width; without it, the ZWJ is counted as a double-width character.
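
To make the arithmetic concrete, the sketch below pairs each code point with the width it would get under the fallback rule (chars below U+1100 are 1 column, all others 2). `fallback_breakdown` is a hypothetical helper written for this illustration and is not part of textwrap.

```rust
/// Hypothetical helper: pair each code point with its width under the
/// fallback rule (1 column below U+1100, otherwise 2).
fn fallback_breakdown(text: &str) -> Vec<(char, usize)> {
    text.chars()
        .map(|ch| (ch, if ch < '\u{1100}' { 1 } else { 2 }))
        .collect()
}
```

For the sequence U+1F468, U+200D, U+1F9B0, every code point is at or above U+1100, so each contributes 2 columns, giving 2 + 2 + 2 = 6. Treating the ZWJ as zero-width instead gives 2 + 0 + 2 = 4.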

§Terminal Support

Modern browsers typically do a great job of combining characters as shown above, but terminals often struggle more. As an example, GNOME Terminal version 3.38.1 shows “❤️” as a big red heart, but shows “👨‍🦰” as “👨🦰”.