
mlreportgen.utils.html2dom.prepHTMLFile

Prepare HTML file for conversion to DOM

Since R2020a

Description

preppedHTMLStr = mlreportgen.utils.html2dom.prepHTMLFile(htmlFile) prepares the HTML in the file specified by htmlFile for conversion to the MATLAB® Report Generator™ internal document object model (DOM). The prepared HTML in preppedHTMLStr can be converted to a DOM API representation by using an mlreportgen.dom.HTML object. The mlreportgen.utils.html2dom.prepHTMLFile function:

  • Corrects invalid markup by calling mlreportgen.utils.tidy with the settings for HTML output.

  • Uses the MATLAB web browser to convert the tidied markup to an HTML DOM document. See https://www.w3.org/TR/WD-DOM/introduction.html.

    The MATLAB web browser computes the CSS properties of the elements in the HTML input based on internal and external style sheets specified by the input HTML, and on the style attribute of an element. The CSS property computation supports all valid CSS style sheet selectors, including selectors not directly supported by mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile objects.

  • Converts the HTML DOM document to HTML markup that is supported by mlreportgen.dom.HTML and mlreportgen.dom.HTMLFile objects. The style attribute for each element specifies the element CSS properties that the MATLAB web browser computed.

  • Returns the prepared HTML as a string scalar.


preppedHTMLFileName = mlreportgen.utils.html2dom.prepHTMLFile(htmlFile,preppedHTMLFileName) writes the prepared HTML to the file specified by preppedHTMLFileName. The prepared HTML in that file can be converted to a DOM API representation by using an mlreportgen.dom.HTMLFile object.

preppedHTMLStr = mlreportgen.utils.html2dom.prepHTMLFile(___,"Tidy",false) prepares the HTML without first tidying it. Specify "Tidy",false after all other input arguments. Use this syntax if you want to tidy the HTML markup yourself. For example, you might want to call mlreportgen.utils.tidy with different options than the ones used by mlreportgen.utils.html2dom.prepHTMLFile, then pass the tidied HTML as the input to mlreportgen.utils.html2dom.prepHTMLFile.
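The "Tidy",false workflow described above might look like the following sketch. It assumes a file named myHTML.html in the current folder, and the output file name myTidiedHTML.html is hypothetical. The "OutputType" and "OutputFile" options of mlreportgen.utils.tidy are given as recalled from its reference page, so confirm them for your release before relying on them.

```matlab
% Tidy the markup yourself with settings of your choice.
% ASSUMPTION: "OutputType" and "OutputFile" are name-value options of
% mlreportgen.utils.tidy; check the tidy reference page for your release.
tidiedFile = mlreportgen.utils.tidy("myHTML.html", ...
    "OutputType","xhtml", ...
    "OutputFile","myTidiedHTML.html");   % hypothetical output file name

% Prepare the already tidied file and skip the built-in tidy step.
preppedHTMLStr = mlreportgen.utils.html2dom.prepHTMLFile(tidiedFile,"Tidy",false);
```

Because tidying is skipped, the file passed to prepHTMLFile should already contain valid markup.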

Examples


Use mlreportgen.utils.html2dom.prepHTMLFile to prepare an HTML file for conversion to a DOM object that you can append to a report.

Create a CSS style sheet, myCSS.css, to specify that the text in a paragraph is red and that the font family is Arial.

p {
    color: red;
    font-family: Arial;
}

Create a file, myHTML.html, that contains this HTML:

<html>
<head>
    <link rel="stylesheet" type="text/css" href="myCSS.css" >
</head>
<body>
    <p> Hello World</p>
</body>
</html>

The HTML is not XML-parsable because the link element is not properly closed. The slash / before the closing angle bracket > is missing.

Try to convert the HTML to a DOM object and append the object to a report.

import mlreportgen.dom.*;
rpt = Document("MyReport","docx");
htmlObj = HTMLFile("myHTML.html");
append(rpt,htmlObj);
close(rpt);
rptview(rpt);

Error using mlreportgen.dom.HTMLFile
HTML error: expected end of tag 'link'

The mlreportgen.dom.HTMLFile constructor throws an error because the link element is not closed.

Prepare the HTML for conversion to DOM by using mlreportgen.utils.html2dom.prepHTMLFile. Create an mlreportgen.dom.HTMLFile object from the prepared HTML and append the object to the report.

import mlreportgen.dom.*
import mlreportgen.utils.html2dom.*
d = Document("test","pdf");
preppedHTMLFile = prepHTMLFile("myHTML.html","mypreppedHTML.html");
htmlObj = HTMLFile(preppedHTMLFile);
append(d,htmlObj);
close(d);
rptview(d);
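A variant of this example can use the single-input syntax, which returns the prepared HTML as a string instead of writing it to a file. The sketch below assumes the same myHTML.html and myCSS.css files; the report name testFromString is hypothetical. The string is converted with char before it is passed to mlreportgen.dom.HTML as a precaution, in case the constructor in your release expects a character vector.

```matlab
import mlreportgen.dom.*
import mlreportgen.utils.html2dom.*

% Prepare the HTML and keep the result in memory as a string scalar.
preppedHTMLStr = prepHTMLFile("myHTML.html");

% Convert the prepared markup to a DOM object and append it to a report.
d = Document("testFromString","pdf");     % hypothetical report name
htmlObj = HTML(char(preppedHTMLStr));     % char() is a defensive conversion
append(d,htmlObj);
close(d);
rptview(d);
```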

Input Arguments


htmlFile: HTML file to be prepared for conversion to DOM, specified as a character vector or string scalar.

preppedHTMLFileName: File for the prepared HTML, specified as a character vector or string scalar.

Example: "myHTML.html"

Output Arguments


preppedHTMLStr: Prepared HTML, returned as a string scalar.

preppedHTMLFileName: Name of the file that contains the prepared HTML, returned as a string scalar.

Tips

  • MATLAB Report Generator mlreportgen.dom.HTML and mlreportgen.dom.HTMLFile objects typically cannot accept the raw HTML that third-party applications, such as Microsoft® Word, produce when they export native documents as HTML markup. In these cases, your Report API program can use the mlreportgen.utils.html2dom.prepHTMLString and mlreportgen.utils.html2dom.prepHTMLFile functions to prepare the raw HTML for use with these objects. Typically, your program must then process the prepared HTML further to remove valid but undesirable content, such as line feeds that were in the raw content.
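One possible form of that further processing is sketched below. It is an illustration only: the sample string stands in for the output of prepHTMLFile, and replacing every run of line-feed characters with a space is an assumed cleanup rule that may not suit content such as preformatted text.

```matlab
% Stand-in for the string returned by prepHTMLFile (hypothetical content).
preppedHTMLStr = "<p>Hello" + newline + "World</p>";

% Replace each run of carriage returns and line feeds with a single space.
cleanedHTMLStr = regexprep(preppedHTMLStr,"[\r\n]+"," ");

disp(cleanedHTMLStr)   % displays <p>Hello World</p>
```

The cleaned string can then be passed to an mlreportgen.dom.HTML object in the same way as the unmodified prepared HTML.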

Version History

Introduced in R2020a