utf8(5)

NAME

     utf8 -- UTF-8, a transformation format of ISO 10646


SYNOPSIS

     ENCODING "UTF-8"


DESCRIPTION

     The UTF-8 encoding represents UCS-4 characters as a sequence of octets,
     using between 1 and 6 octets for each character.  It is backwards
     compatible with ASCII, so 0x00-0x7f refer to the ASCII character set.
     The multibyte encoding of a non-ASCII character consists entirely of
     octets whose high order bit is set.  The actual encoding is represented
     by the following table:

     [0x00000000 - 0x0000007f] [00000000.0bbbbbbb] -> 0bbbbbbb
     [0x00000080 - 0x000007ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
     [0x00000800 - 0x0000ffff] [bbbbbbbb.bbbbbbbb] ->
	     1110bbbb, 10bbbbbb, 10bbbbbb
     [0x00010000 - 0x001fffff] [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
	     11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
     [0x00200000 - 0x03ffffff] [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	     111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
     [0x04000000 - 0x7fffffff] [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	     1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
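
     The table can be read as an encoding procedure: pick the row whose range
     contains the value, emit the lead octet with its prefix bits, then emit
     6 bits per continuation octet.  The following Python sketch illustrates
     this under stated assumptions; the function name encode_ucs4 is
     hypothetical and is not part of any FreeBSD interface.

```python
def encode_ucs4(cp):
    """Encode a UCS-4 value (0..0x7fffffff) following the table's six rows."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value outside the 31-bit range")
    if cp < 0x80:
        # First row: ASCII is passed through as a single octet.
        return bytes([cp])
    # Remaining rows: (upper limit of range, octet count, lead-octet prefix).
    for limit, length, lead in (
        (0x7FF, 2, 0xC0),
        (0xFFFF, 3, 0xE0),
        (0x1FFFFF, 4, 0xF0),
        (0x3FFFFFF, 5, 0xF8),
        (0x7FFFFFFF, 6, 0xFC),
    ):
        if cp <= limit:
            break
    out = bytearray(length)
    # Fill continuation octets from the end, 6 payload bits each (10bbbbbb).
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (cp & 0x3F)
        cp >>= 6
    # Whatever bits remain fit in the lead octet after its prefix.
    out[0] = lead | cp
    return bytes(out)
```

     For values that Python's built-in codec also accepts, such as 0x20ac,
     the result should match chr(0x20ac).encode("utf-8").  The 5- and 6-octet
     rows cover values beyond 0x10ffff, which the built-in codec rejects.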

     If more than a single representation of a value exists (for example,
     0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always
     used.  Longer ones are rejected as errors because they pose a potential
     security risk and destroy the 1:1 mapping between characters and octet
     sequences.
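
     One way to enforce the shortest-form rule is to compare each decoded
     value against the smallest value that needs that many octets.  The
     Python sketch below shows this check; decode_one is a hypothetical
     helper written for illustration, not the FreeBSD implementation.

```python
def decode_one(data):
    """Decode one character from the start of data.

    Returns (value, octets consumed); raises ValueError on malformed input,
    including redundant (non-shortest) encodings.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < 0x80:
        return first, 1
    # (prefix mask, prefix, octet count, smallest value needing that count)
    for mask, prefix, length, minimum in (
        (0xE0, 0xC0, 2, 0x80),
        (0xF0, 0xE0, 3, 0x800),
        (0xF8, 0xF0, 4, 0x10000),
        (0xFC, 0xF8, 5, 0x200000),
        (0xFE, 0xFC, 6, 0x4000000),
    ):
        if (first & mask) == prefix:
            break
    else:
        raise ValueError("invalid lead octet")
    if len(data) < length:
        raise ValueError("truncated sequence")
    # Payload bits of the lead octet are those not covered by the mask.
    value = first & ~mask & 0xFF
    for octet in data[1:length]:
        if (octet & 0xC0) != 0x80:
            raise ValueError("invalid continuation octet")
        value = (value << 6) | (octet & 0x3F)
    # A value below the row's minimum could have used fewer octets.
    if value < minimum:
        raise ValueError("non-shortest form")
    return value, length
```

     With this check, 0x00 decodes to the value 0, while 0xC0 0x80 and
     0xE0 0x80 0x80 both raise an error instead of decoding to 0 as well.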


COMPATIBILITY

     The utf8 encoding supersedes the utf2(5) encoding.  The only differences
     between the two are that utf8 handles the full 31-bit character set of
     ISO 10646 whereas utf2(5) is limited to a 16-bit character set, and that
     utf2(5) accepts redundant, non-"shortest form" representations of
     characters.


SEE ALSO

     euc(5), utf2(5)

     Rob Pike and Ken Thompson, "Hello World", Proceedings of the Winter 1993
     USENIX Technical Conference, USENIX Association, January 1993.

     F. Yergeau, UTF-8, a transformation format of ISO 10646, January 1998,
     RFC 2279.

     The Unicode Standard, Version 3.0, The Unicode Consortium, 2000, as
     amended by the Unicode Standard Annex #27: Unicode 3.1 and by the Unicode
     Standard Annex #28: Unicode 3.2.


STANDARDS

     The utf8 encoding is compatible with RFC 2279 and Unicode 3.2.

FreeBSD 5.4			 April 7, 2004			   FreeBSD 5.4
