[texinfo-pretest] texinfo 4.7.90 pretest available
Karl Berry
karl at freefriends.org
Sun Dec 5 18:48:01 EST 2004
How does this look for the specification / documentation on the two
points we've discussed?
Thanks,
k
--- texinfo.txi.~1.119.~ 2004-11-30 05:29:20.000000000 -0800
+++ texinfo.txi 2004-12-05 15:45:37.000000000 -0800
@@ -16354,6 +16354,10 @@
@enumerate
@item
-The standard ASCII letters (a-z and A-z), and numbers (0-9) are not
-modified. All other characters are changed as specified below.
+The standard ASCII letters (a-z and A-Z) are not modified. All other
+characters are changed as specified below.
+
+@item
+The standard ASCII numbers (0-9) are not modified except when a number
+is the first character of the node name. In that case, see below.
@item
@@ -16375,4 +16379,11 @@
This includes @samp{_}, which is mapped to @samp{_005f}.
+@item
+If the node name does not begin with a letter, the literal string
+@samp{g_t} is prefixed to the result. (Due to the rules above, that
+string can never occur otherwise; it is an arbitrary choice, standing
+for ``GNU Texinfo''.) This is necessary because XHTML requires that
+identifiers begin with a letter.
+
@end enumerate
@@ -16499,8 +16510,9 @@
@cindex Expansion of 8-bit characters in HTML cross-references
-Characters other than plain 7-bit ASCII are transformed into the
-corresponding Unicode code point(s), in Normalization Form C, which
+Usually, characters other than plain 7-bit ASCII are transformed into
+the corresponding Unicode code point(s) in Normalization Form C, which
uses precomposed characters where available. (This is the
-normalization form recommended by the W3C and other bodies.)
+normalization form recommended by the W3C and other bodies.) This
+holds when that code point is 0xffff or less, as it almost always is.
These will then be further transformed by the rules above into the
@@ -16519,4 +16531,11 @@
therefore expands to @samp{B_0306} (B with combining breve).
+When the Unicode code point is above 0xffff, the transformation is
+@samp{__@var{xxxxxx}}, with two leading underscores followed by six
+hex digits. Since Unicode has declared that their highest code point
+is 0x10ffff, this is sufficient. (We felt it was better to define
+this extra escape than to always use six hex digits, since the first
+two would nearly always be zeros.)
+
For the definition of Unicode Normalization Form C, see Unicode report
UAX#15, @uref{http://www.unicode.org/reports/tr15/}. Many related
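
To make the two additions concrete, here is a rough Python sketch of the
rules as worded in the patch above. It is an illustration only, not
code from makeinfo or texi2html. The function names are made up, the
lowercase hex digits are an assumption taken from the @samp{_005f}
example, and the whitespace rules that sit in the parts of the list not
shown in these hunks are left out, so it assumes a node name without
spaces.

```python
import unicodedata

ASCII_LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
ASCII_DIGITS = "0123456789"


def escape_char(ch):
    """Escape one character (hypothetical helper, not a Texinfo API)."""
    # ASCII letters and digits pass through unchanged.
    if ch in ASCII_LETTERS or ch in ASCII_DIGITS:
        return ch
    cp = ord(ch)
    # Code points above 0xffff: two underscores and six hex digits.
    if cp > 0xFFFF:
        return "__%06x" % cp
    # Everything else: one underscore and four hex digits, so that
    # "_" itself becomes "_005f". Lowercase hex is an assumption.
    return "_%04x" % cp


def node_name_to_id(name):
    """Turn a node name into an identifier (whitespace rules omitted)."""
    # Normalization Form C uses precomposed characters where available.
    normalized = unicodedata.normalize("NFC", name)
    result = "".join(escape_char(ch) for ch in normalized)
    # Prefix "g_t" when the result does not begin with an ASCII letter.
    # This matches "the node name does not begin with a letter", since
    # only ASCII letters survive unescaped in first position.
    if not result or result[0] not in ASCII_LETTERS:
        result = "g_t" + result
    return result


if __name__ == "__main__":
    for sample in ("B\u0306", "_", "1abc", "a\U0001D11E"):
        print(repr(sample), "->", node_name_to_id(sample))
```

With this sketch, B followed by a combining breve stays @samp{B_0306}
as in the manual's example, because Unicode has no precomposed form for
it, while a name consisting of a lone underscore picks up the prefix
and becomes @samp{g_t_005f}.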