module Html: sig .. end
Encodes characters that need protection by converting them to
entity references. E.g. "<" is converted to "&lt;".
As the entities may be named, there is a dependency on the character
set.
Legacy functions:
val encode_from_latin1 : string -> string
val decode_to_latin1 : string -> string
The following functions have a more general interface and should be preferred in new programs.
val unsafe_chars_html4 : string
The string contains '<', '>', '"', '&' and the control characters 0-8, 11-12, 14-31, 127.
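To make the character list concrete, the following stand-alone sketch builds a string with the same contents as described for unsafe_chars_html4. The name unsafe_chars_sketch is hypothetical and the construction is only an illustration; it is not how the library defines the value.

```ocaml
(* Hypothetical reconstruction, not the library's definition: builds a
   string holding the characters listed in the description above. *)
let unsafe_chars_sketch : string =
  let buf = Buffer.create 34 in
  (* The four markup-significant ASCII characters. *)
  Buffer.add_string buf "<>\"&";
  (* Append every character whose code lies in [lo, hi]. *)
  let add_range lo hi =
    for code = lo to hi do
      Buffer.add_char buf (Char.chr code)
    done
  in
  (* Control characters; TAB (9), LF (10) and CR (13) are left out. *)
  add_range 0 8;
  add_range 11 12;
  add_range 14 31;
  add_range 127 127;
  Buffer.contents buf
```

Since unsafe_chars is a plain string, a custom set can presumably be derived by concatenation, for example unsafe_chars_html4 ^ "'" to have apostrophes treated as unsafe too.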
val encode : in_enc:Netconversion.encoding ->
?out_enc:Netconversion.encoding ->
?prefer_name:bool -> ?unsafe_chars:string -> unit -> string -> string
The input string, which is encoded as in_enc, is recoded to
out_enc, and the following characters are encoded as HTML
entities (&name; or &#num;):
- the ASCII characters contained in unsafe_chars
- the characters that cannot be represented in out_enc. By
  default (out_enc=`Enc_usascii), only ASCII characters can be
  represented, and thus all code points >= 128 are encoded as
  HTML entities. If you pass out_enc=`Enc_utf8, all characters
  can be represented.
For example, the string "(a<b) & (c>d)" is encoded as
"(a&lt;b) &amp; (c&gt;d)".
It is required that out_enc is an ASCII-compatible encoding.
The option prefer_name selects whether named entities (e.g. &lt;)
or numeric entities (e.g. &#60;) are preferred.
The efficiency of the function can be improved when the same encoding is applied to several strings. Create a specialized encoding function by passing all arguments up to the unit argument, and apply this function several times. For example:
let my_enc = encode ~in_enc:`Enc_utf8 () in
let s1' = my_enc s1 in
let s2' = my_enc s2 in ...
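The encoding behaviour described for encode and the partial-application pattern above can be modelled without the library. The sketch below is a simplified stand-in, not Netencoding's implementation. It assumes ISO-8859-1 input and US-ASCII output, knows only four named entities, and uses a default unsafe_chars that omits the control characters. The names encode_sketch, enc and enc_numeric are hypothetical.

```ocaml
(* Illustrative stand-in, NOT the library's implementation.
   Assumption: input is ISO-8859-1 and output is US-ASCII, so every byte
   is one code point and bytes >= 128 cannot be represented. *)
let encode_sketch ?(prefer_name = true) ?(unsafe_chars = "<>\"&") () =
  (* Only four named entities are known here; HTML 4 defines many more. *)
  let name_of = function
    | '<' -> Some "lt"
    | '>' -> Some "gt"
    | '&' -> Some "amp"
    | '"' -> Some "quot"
    | _ -> None
  in
  (* Use the name if preferred and known, else fall back to &#num;. *)
  let entity c =
    match (if prefer_name then name_of c else None) with
    | Some n -> "&" ^ n ^ ";"
    | None -> Printf.sprintf "&#%d;" (Char.code c)
  in
  (* Everything above is set up once; the closure below is reusable. *)
  fun s ->
    let buf = Buffer.create (String.length s) in
    String.iter
      (fun c ->
        if String.contains unsafe_chars c || Char.code c >= 128
        then Buffer.add_string buf (entity c)
        else Buffer.add_char buf c)
      s;
    Buffer.contents buf

(* Specialized encoders, created once and applied to many strings. *)
let enc = encode_sketch ()
let enc_numeric = encode_sketch ~prefer_name:false ()
```

With these definitions, enc "(a<b) & (c>d)" yields "(a&lt;b) &amp; (c&gt;d)" and enc_numeric "<" yields "&#60;". Placing the unit argument before the string is what allows the setup work to be shared between calls.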
val encode_tstring : in_enc:Netconversion.encoding ->
out_kind:'s Netstring_tstring.tstring_kind ->
?out_enc:Netconversion.encoding ->
?prefer_name:bool ->
?unsafe_chars:string -> unit -> Netsys_types.tstring -> 's
This version takes a tstring argument, and returns the string type
chosen by the out_kind argument.
val encode_poly : in_enc:Netconversion.encoding ->
in_ops:'s Netstring_tstring.tstring_ops ->
out_kind:'t Netstring_tstring.tstring_kind ->
?out_enc:Netconversion.encoding ->
?prefer_name:bool -> ?unsafe_chars:string -> unit -> 's -> 't
Fully polymorphic version.
type entity_set = [ `Empty | `Html | `Xml ]
val decode : in_enc:Netconversion.encoding ->
out_enc:Netconversion.encoding ->
?lookup:(string -> string) ->
?subst:(int -> string) ->
?entity_base:entity_set -> unit -> string -> string
The input string is recoded from in_enc to out_enc, and HTML
entities (&name; or &#num;) are resolved. The input encoding
in_enc must be ASCII-compatible.
By default, the function knows all entities defined for HTML 4 (this
can be changed using entity_base, see below). If other
entities occur, the function lookup is called and the name of
the entity is passed as input string to the function. It is
expected that lookup returns the value of the entity, and that this
value is already encoded as out_enc.
By default, lookup raises a Failure exception.
If a character cannot be represented in the output encoding,
the function subst is called. subst must return a substitute
string for the character.
By default, subst raises a Failure exception.
The option entity_base determines which set of entities is
considered known, i.e. can be decoded without the help of the
lookup function: `Html selects all entities defined
for HTML 4, `Xml selects only &lt;, &gt;, &amp;, &quot;,
and &apos;,
and `Empty selects the empty set (i.e. lookup is always called).
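The interplay of entity_base and lookup can be pictured as a two-step resolution: first consult the built-in set, then delegate unknown names to lookup. The sketch below is a hypothetical, stand-alone model of that dispatch for a single entity name. It leaves out `Html because the HTML 4 table is large, and the Failure message is made up. The names entity_set_sketch and resolve are not part of the library.

```ocaml
(* Sketch only: `Html is omitted because the HTML 4 table is large. *)
type entity_set_sketch = [ `Empty | `Xml ]

(* Hypothetical helper: map one entity name to its replacement text.
   The default lookup fails, mirroring the documented default. *)
let resolve ?(lookup = fun name -> failwith ("unknown entity: " ^ name))
    ~(entity_base : entity_set_sketch) (name : string) : string =
  (* The entities that are known without consulting lookup. *)
  let known =
    match entity_base with
    | `Empty -> []
    | `Xml ->
        [ ("lt", "<"); ("gt", ">"); ("amp", "&");
          ("quot", "\""); ("apos", "'") ]
  in
  match List.assoc_opt name known with
  | Some value -> value
  | None -> lookup name (* unknown names are delegated to lookup *)
```

For instance, resolve ~entity_base:`Xml "lt" returns "<", while resolve ~entity_base:`Empty "lt" raises Failure unless a lookup function is supplied.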
val decode_tstring : in_enc:Netconversion.encoding ->
out_kind:'s Netstring_tstring.tstring_kind ->
out_enc:Netconversion.encoding ->
?lookup:(string -> string) ->
?subst:(int -> string) ->
?entity_base:entity_set ->
unit -> Netsys_types.tstring -> 's
This version takes a tstring argument, and returns the string type
chosen by the out_kind argument.
val decode_poly : in_enc:Netconversion.encoding ->
in_ops:'s Netstring_tstring.tstring_ops ->
out_kind:'t Netstring_tstring.tstring_kind ->
out_enc:Netconversion.encoding ->
?lookup:(string -> string) ->
?subst:(int -> string) ->
?entity_base:entity_set -> unit -> 's -> 't
Fully polymorphic version.