
Compatibility Holes

  • Compatibility holes (reserved positions) exist in some Unicode sequences to avoid duplicate encodings (ugh!)
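  • E.g., U+2072 and U+2073 are holes for ² and ³, which are encoded at U+00B2 and U+00B3, respectively. U+2071 was originally the corresponding hole for ¹ (U+00B9), but has since been assigned to SUPERSCRIPT LATIN SMALL LETTER I.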
  • Math alphanumerics have holes corresponding to Letterlike symbols.
  • Recommendation: you can use the hole codes internally, but must import and export the standard codes.

Character standards defined before Unicode included some of the most common math alphabetics. To be interoperable with these standards, Unicode added them in the Letterlike Symbols block U+2100 – U+214F. It's undesirable to have two code points for the same character, so the math alphabetics that are already in the Letterlike Symbols block are not defined again in the math alphanumerics block in plane 1.

However, to aid implementations, holes occur in the math alphanumerics block at the positions where these characters would have been if the Letterlike Symbols hadn't already been defined. The recommendation is that you can use the hole codes internally, but should import and export the standard Letterlike Symbol codes if they exist.
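The import/export recommendation above can be sketched as a small lookup table. This is a minimal illustration, not a definitive implementation: the names HoleMapping, hole_table, export_code, and import_code are hypothetical, and the table is deliberately partial (it lists only a few of the holes; a complete table would be built from the Unicode code charts).

```c
#include <stddef.h>
#include <stdint.h>

/* One hole in the math alphanumerics block and the Letterlike Symbol
   that is the standard encoding of the same character. */
typedef struct {
    uint32_t hole;      /* reserved position in U+1D400..U+1D7FF */
    uint32_t standard;  /* code point in U+2100..U+214F */
} HoleMapping;

/* Deliberately partial table (assumption: a full implementation would
   list every hole given in the Unicode code charts). */
static const HoleMapping hole_table[] = {
    { 0x1D455, 0x210E },  /* italic small h   -> PLANCK CONSTANT */
    { 0x1D49D, 0x212C },  /* script capital B -> SCRIPT CAPITAL B */
    { 0x1D506, 0x212D },  /* fraktur capital C -> BLACK-LETTER CAPITAL C */
    { 0x1D53A, 0x2102 },  /* double-struck C  -> DOUBLE-STRUCK CAPITAL C */
    { 0x1D53F, 0x210D },  /* double-struck H  -> DOUBLE-STRUCK CAPITAL H */
    { 0x1D545, 0x2115 },  /* double-struck N  -> DOUBLE-STRUCK CAPITAL N */
    { 0x1D547, 0x2119 },  /* double-struck P  -> DOUBLE-STRUCK CAPITAL P */
    { 0x1D548, 0x211A },  /* double-struck Q  -> DOUBLE-STRUCK CAPITAL Q */
    { 0x1D549, 0x211D },  /* double-struck R  -> DOUBLE-STRUCK CAPITAL R */
    { 0x1D551, 0x2124 },  /* double-struck Z  -> DOUBLE-STRUCK CAPITAL Z */
};

static const size_t hole_count = sizeof hole_table / sizeof hole_table[0];

/* Export: replace an internal hole code with the standard Letterlike
   Symbol code; any other code point is returned unchanged. */
uint32_t export_code(uint32_t ch)
{
    for (size_t i = 0; i < hole_count; i++) {
        if (hole_table[i].hole == ch)
            return hole_table[i].standard;
    }
    return ch;
}

/* Import: replace a standard Letterlike Symbol code with the internal
   hole code; any other code point is returned unchanged. */
uint32_t import_code(uint32_t ch)
{
    for (size_t i = 0; i < hole_count; i++) {
        if (hole_table[i].standard == ch)
            return hole_table[i].hole;
    }
    return ch;
}
```

With this arrangement, internal code can treat each math alphabet as a contiguous run (for example, italic small h sits at its natural offset from italic small a), while text read from or written to the outside world only ever contains the standard codes.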

If you use Unicode superscripted and subscripted digits (see U+2070 – U+209F), be sure they display the same way as they would using the corresponding markup.

To test whether a character is a math alphanumeric character, check for inclusion in the two ranges U+2100 – U+214F and U+1D400 – U+1D7FF. In C, you can test whether the character ch is in these ranges using the if() statement

if (IN_RANGE(0x2100, ch, 0x214F) || IN_RANGE(0x1D400, ch, 0x1D7FF)) {}

where the macro IN_RANGE(n1, ch, n2) is defined by

#define IN_RANGE(n1, b, n2) ((unsigned)((b) - (n1)) <= (unsigned)((n2) - (n1)))

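Because the unsigned subtraction wraps around to a large value when ch is below n1, the macro needs only one comparison and one conditional branch, and is almost as fast as a single compare.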
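As a self-contained sketch of the range test, the version below wraps the check in a function. It is an illustration under stated assumptions: the function name is_math_alphanumeric is hypothetical, and the macro is written with uint32_t rather than unsigned so that the plane 1 values fit even where unsigned is only 16 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Variant of the range macro using a fixed-width type. If b < n1, the
   unsigned subtraction wraps to a large value, so one comparison
   rejects values on both sides of the range. */
#define IN_RANGE(n1, b, n2) \
    ((uint32_t)((b) - (n1)) <= (uint32_t)((n2) - (n1)))

/* Returns true if ch lies in the Letterlike Symbols block
   (U+2100..U+214F) or the math alphanumerics block (U+1D400..U+1D7FF).
   This is the coarse block-level test described in the text; it does
   not exclude unassigned positions inside those blocks. */
bool is_math_alphanumeric(uint32_t ch)
{
    return IN_RANGE(0x2100, ch, 0x214F) || IN_RANGE(0x1D400, ch, 0x1D7FF);
}
```

For example, is_math_alphanumeric(0x210E) (PLANCK CONSTANT) is true, while is_math_alphanumeric(0x41) (LATIN CAPITAL LETTER A) is false.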

©2003-2007 Microsoft Corporation. All rights reserved.