Funcons-beta : Characters.cbs | PRETTY | PDF
Outline
Characters
[
Type characters Alias chars
Datatype unicode-characters Alias unicode-chars
Type unicode-points
Funcon unicode-character Alias unicode-char
Funcon unicode-point Alias unicode
Type basic-multilingual-plane-characters Alias bmp-chars
Type basic-multilingual-plane-points
Type iso-latin-1-characters Alias latin-1-chars
Type iso-latin-1-points
Type ascii-characters Alias ascii-chars
Type ascii-points
Funcon ascii-character Alias ascii-char
Funcon utf-8
Funcon utf-16
Funcon utf-32
Funcon backspace
Funcon horizontal-tab
Funcon line-feed
Funcon form-feed
Funcon carriage-return
Funcon double-quote
Funcon single-quote
Funcon backslash
]Built-in Type
characters <: valuesLiteral characters can be written 'C' where C is any visible character
other than a single-quote or backslash character, which need to be
escaped as '\'' and '\\'.
Alias
chars = charactersUnicode character set
The set of Unicode characters and allocated points is open to extension. See https://en.wikipedia.org/wiki/Plane_(Unicode)
Built-in Datatype
unicode-characters <: characters
Alias
unicode-chars = unicode-charactersBuilt-in Type
unicode-points <: bounded-integers(0, unsigned-bit-vector-maximum(21))Built-in Funcon
unicode-character(_:unicode-points) : unicode-characters
Alias
unicode-char = unicode-characterThe values in unicode-characters are the values of
unicode-character(UP:unicode-points).
Funcon
unicode-point(_:unicode-characters) : =>unicode-points
Alias
unicode = unicode-pointRule
unicode-point(unicode-character(UP:unicode-points)) ~> UPUnicode basic multilingual plane
The set of Unicode BMP characters and allocated points is open to extension.
Built-in Datatype
basic-multilingual-plane-characters <: unicode-characters
Alias
bmp-chars = basic-multilingual-plane-charactersBuilt-in Type
basic-multilingual-plane-points <:
bounded-integers(0, unsigned-bit-vector-maximum(17))The values in basic-multilingual-plane-characters are the values of
unicode-character(BMPP:basic-multilingual-plane-points).
ISO Latin-1 character set
Built-in Datatype
iso-latin-1-characters <: basic-multilingual-plane-characters
Alias
latin-1-chars = iso-latin-1-charactersType
iso-latin-1-points ~> bounded-integers(0, unsigned-bit-vector-maximum(8))The values in iso-latin-1-characters are the values of
unicode-character(ILP:iso-latin-1-points).
ASCII character set
Built-in Type
ascii-characters <: iso-latin-1-characters
Alias
ascii-chars = ascii-charactersType
ascii-points ~> bounded-integers(0, unsigned-bit-vector-maximum(7))The values in ascii-characters are the values of
unicode-character(AP:ascii-points).
Funcon
ascii-character(_:strings) : =>ascii-characters?
Alias
ascii-char = ascii-characterascii-character"C" takes a string. When it consists of a single ASCII
character C it gives the character, otherwise ( ).
Rule
ascii-character[C:ascii-characters] ~> C
Rule
C : ~ ascii-characters
------------------------------------
ascii-character[C:characters] ~> ( )
Rule
length(C*) =/= 1
--------------------------------------
ascii-character[C*:characters*] ~> ( )Character point encodings
See https://en.wikipedia.org/wiki/Character_encoding
Built-in Funcon
utf-8(_:unicode-points) : =>(bytes, (bytes, (bytes, bytes?)? )? )Built-in Funcon
utf-16(_:unicode-points) : =>(bit-vectors(16), (bit-vectors(16))? )Built-in Funcon
utf-32(_:unicode-points) : =>bit-vectors(32)Control characters
Funcon
backspace : =>ascii-characters
~> unicode-character(hexadecimal-natural"0008")Funcon
horizontal-tab : =>ascii-characters
~> unicode-character(hexadecimal-natural"0009")Funcon
line-feed : =>ascii-characters
~> unicode-character(hexadecimal-natural"000a")Funcon
form-feed : =>ascii-characters
~> unicode-character(hexadecimal-natural"000c")Funcon
carriage-return : =>ascii-characters
~> unicode-character(hexadecimal-natural"000d")Funcon
double-quote : =>ascii-characters
~> unicode-character(hexadecimal-natural"0022")Funcon
single-quote : =>ascii-characters
~> unicode-character(hexadecimal-natural"0027")Funcon
backslash : =>ascii-characters
~> unicode-character(hexadecimal-natural"005c")