NAME
HTML::HTML5::Writer - output a DOM as HTML5

SYNOPSIS
use HTML::HTML5::Writer;
my $writer = HTML::HTML5::Writer->new;

DESCRIPTION
This module outputs XML::LibXML::Node objects as HTML5 strings. It works
well on DOM trees that represent valid HTML/XHTML documents; less well
on other DOM trees.
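
As a minimal sketch of the workflow described above, the following parses a small XHTML source into a DOM with XML::LibXML and hands it to a writer. It assumes XML::LibXML is installed and that the whole-document serialisation method is named document; the sample markup is purely illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

# Build a small DOM from well-formed XHTML source (illustrative input).
my $dom = XML::LibXML->load_xml(string => <<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p class="intro">Hello, world.</p></body>
</html>
XHTML

my $writer = HTML::HTML5::Writer->new;

# document() returns the serialised markup as a string; it prints nothing
# by itself.
my $html = $writer->document($dom);
print $html, "\n";
```

The elements are placed in the XHTML namespace because the module is described as working best on DOM trees that represent valid HTML/XHTML documents.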
"$writer = HTML::HTML5::Writer->new(%opts)"
Create a new writer object. Options include:
* markup
Choose which serialisation of HTML5 to use: 'html' or 'xhtml'.
* polyglot
Set to true to attempt to produce output that works as both XML
and HTML. Set to false to produce content that might not.
If you don't explicitly set it, it defaults to false for HTML
and true for XHTML.
* doctype
Set this to a string to choose which <!DOCTYPE> tag to output.
Note that this purely sets the <!DOCTYPE> tag and does not change
how the rest of the document is output. This really is just a
plain string literal...
# Yes, this works...
my $w = HTML::HTML5::Writer->new(doctype => '');
The following constants are provided for convenience:
DOCTYPE_HTML2, DOCTYPE_HTML32, DOCTYPE_HTML4 (latest stable
strict HTML 4.x), DOCTYPE_HTML4_RDFA (latest stable HTML
4.x+RDFa), DOCTYPE_HTML40 (strict), DOCTYPE_HTML40_FRAMESET,
DOCTYPE_HTML40_LOOSE, DOCTYPE_HTML40_STRICT, DOCTYPE_HTML401
(strict), DOCTYPE_HTML401_FRAMESET, DOCTYPE_HTML401_LOOSE,
DOCTYPE_HTML401_STRICT, DOCTYPE_HTML5, DOCTYPE_LEGACY
(about:legacy-compat), DOCTYPE_NIL (empty string),
DOCTYPE_XHTML1 (strict), DOCTYPE_XHTML1_FRAMESET,
DOCTYPE_XHTML1_LOOSE, DOCTYPE_XHTML1_STRICT, DOCTYPE_XHTML11,
DOCTYPE_XHTML_RDFA (latest stable strict XHTML+RDFa),
Defaults to DOCTYPE_HTML5 for HTML and DOCTYPE_LEGACY for XHTML.
* charset
This module always returns strings in Perl's internal utf8
encoding, but you can set the 'charset' option to 'ascii' to
create output that would be suitable for re-encoding to ASCII
(e.g. it will entity-encode characters which do not exist in
ASCII).
* quote_attributes
Set this to true to force attributes to be quoted. If not
explicitly set, the writer will automatically detect when
attributes need quoting.
* voids
Set this to true to force void elements to always be terminated
with '/>'. If not explicitly set, they'll only be terminated
that way in polyglot or XHTML documents.
* start_tags and end_tags
Except in polyglot and XHTML documents, some elements allow
their start and/or end tags to be omitted in certain
circumstances. By setting these to true, you can prevent them
from being omitted.
* refs
Special characters that can't be encoded as named entities need
to be encoded as numeric character references instead. These can
be expressed in decimal or hexadecimal. Setting this option to
'dec' or 'hex' allows you to choose. The default is 'hex'.
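
The constructor options above can be combined freely. The sketch below builds three writers over the same DOM so their output can be compared side by side. The option keys used (markup, doctype, charset, quote_attributes, voids, start_tags, end_tags, refs) are assumed to be the ones accepted by new, and the sample document is illustrative only. The doctype is given as a plain string; one of the DOCTYPE_* constants could be used in its place.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

# Illustrative input containing a non-ASCII character and a void element.
my $dom = XML::LibXML->load_xml(string => <<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Caf&#233;</title></head>
  <body><p>One<br/>Two</p></body>
</html>
XHTML

# Default settings: HTML serialisation.
my $plain = HTML::HTML5::Writer->new;

# HTML serialisation with the optional behaviours set explicitly.
my $strict = HTML::HTML5::Writer->new(
    markup           => 'html',
    doctype          => '<!DOCTYPE html>',  # just a plain string literal
    charset          => 'ascii',  # entity-encode non-ASCII characters
    quote_attributes => 1,        # always quote attribute values
    voids            => 1,        # terminate void elements with '/>'
    start_tags       => 1,        # do not omit optional start tags
    end_tags         => 1,        # do not omit optional end tags
    refs             => 'dec',    # decimal numeric character references
);

# XHTML serialisation.
my $xhtml = HTML::HTML5::Writer->new(markup => 'xhtml');

for my $w ($plain, $strict, $xhtml) {
    print $w->document($dom), "\n\n";
}
```

Comparing the three outputs is a quick way to see which tags, quotes and trailing slashes each configuration adds or leaves out.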
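"$writer->document($node)"
Outputs (i.e. returns a string that is) an XML::LibXML::Document as
HTML.
"$writer->element($node)"
Outputs an XML::LibXML::Element as HTML.
"$writer->attribute($node)"
Outputs an XML::LibXML::Attr as HTML.
"$writer->text($node)"
Outputs an XML::LibXML::Text as HTML.
"$writer->cdata($node)"
Outputs an XML::LibXML::CDATASection as HTML.
"$writer->comment($node)"
Outputs an XML::LibXML::Comment as HTML.
"$writer->pi($node)"
Outputs an XML::LibXML::PI as HTML.
"$writer->doctype"
Outputs the writer's DOCTYPE.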
"$writer->encode_entities($string, characters=>$more)"
Takes a string and returns the same string with some special
characters replaced. By default these special characters do not
include any of '&', '<', '>' or '"', but you can provide a string of
additional characters to treat as special:
$encoded = $writer->encode_entities($raw, characters=>'&<>"');
"$writer->encode_entity($char)"
Returns $char entity-encoded. Encoding is done regardless of whether
$char is "special" or not.
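"$writer->is_xhtml"
Boolean indicating if $writer is configured to output XHTML.
"$writer->is_polyglot"
Boolean indicating if $writer is configured to output polyglot HTML.
"$writer->should_force_start_tags"
"$writer->should_force_end_tags"
Booleans indicating whether optional start and end tags should be
forced.
"$writer->should_quote_attributes"
Boolean indicating whether attributes need to be quoted.
"$writer->should_slash_voids"
Boolean indicating whether void elements should be closed in the
XHTML style.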
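
The node-level methods can be called directly when only a fragment of a document is needed. The following is a hedged sketch, not a definitive recipe: the method names used (element, comment, doctype, is_xhtml, is_polyglot) are assumed to match the installed version of the module, and the sample markup is illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::HTML5::Writer;

my $dom = XML::LibXML->load_xml(string => <<'XHTML');
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p title="a b">Fish &amp; chips</p><!-- note --></body>
</html>
XHTML

my $writer = HTML::HTML5::Writer->new(markup => 'html', polyglot => 0);

# Pick out individual nodes rather than serialising the whole document.
my ($p)       = $dom->getElementsByTagName('p');
my ($comment) = grep { $_->isa('XML::LibXML::Comment') }
                $p->parentNode->childNodes;

print $writer->doctype, "\n";
print $writer->element($p), "\n";
print $writer->comment($comment), "\n";

# Entity-encode a raw string, also treating & < > " as special.
my $encoded = $writer->encode_entities(q{1 < 2 & "3" > 2},
                                       characters => '&<>"');
print $encoded, "\n";

# Inspect how the writer is configured.
printf "xhtml=%d polyglot=%d\n",
    ($writer->is_xhtml    ? 1 : 0),
    ($writer->is_polyglot ? 1 : 0);
```

Because every method returns a string, the pieces can be concatenated or embedded in a larger template as needed.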
BUGS AND LIMITATIONS
Certain DOM constructs cannot be output in non-XML HTML. e.g.
my $xhtml = <<XHTML;
<hr>This text is within the HR element</hr>