Changing the default configuration
Let's take a look at another HTML file: thoreau.html. If we use the default configuration to create the corresponding PDF, we detect some problems in the resulting PDF thoreau0.pdf:
A font is defined in the body tag (style="font-family: Nimbus Roman No9 L,Times New Roman"), but the font in the resulting PDF is Helvetica.
Let's find out how to fix these problems by looking at the following example:
// Register the fonts found in common system font directories
FontFactory.registerDirectories();
// Create the document and the writer that produces the PDF
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document,
    new FileOutputStream("results/xmlworker/thoreau1.pdf"));
document.open();
// Configure the HTML context: tag processors, image root, and link root
HtmlPipelineContext htmlContext = new HtmlPipelineContext();
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.setImageProvider(new AbstractImageProvider() {
    public String getImageRootPath() {
        return "src/main/resources/html/";
    }
});
htmlContext.setLinkProvider(new LinkProvider() {
    public String getLinkRoot() {
        return "http://tutorial.itextpdf.com/src/main/resources/html/";
    }
});
// Chain the pipelines: CSS resolution, then HTML processing, then PDF output
CSSResolver cssResolver =
    XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
Pipeline<?> pipeline =
    new CssResolverPipeline(cssResolver,
        new HtmlPipeline(htmlContext,
            new PdfWriterPipeline(document, writer)));
// Parse the HTML file and close the document
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser p = new XMLParser(worker);
p.parse(HTMLParsingProcess.class.getResourceAsStream("/html/thoreau.html"));
document.close();
See HTMLParsingImagesLinks1.java and the resulting PDF thoreau1.pdf
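As you can see, we don't need a special XML Worker configuration to fix the font problem: the call to FontFactory.registerDirectories() at the start of the example makes the fonts installed on the system available to iText.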
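The example also sets a link root through a LinkProvider. To get a feeling for what a link root means for a relative href, here is a minimal sketch that uses only java.net.URI from the JDK. It is not XML Worker code, and XML Worker's own way of combining the root with an href may differ. The class LinkRootSketch, the method resolveLink, and the file name walden.html are hypothetical names chosen for illustration.

```java
import java.net.URI;

public class LinkRootSketch {

    // Hypothetical helper, not part of iText or XML Worker:
    // resolves an href against a link root using standard URI rules.
    public static String resolveLink(String linkRoot, String href) {
        return URI.create(linkRoot).resolve(href).toString();
    }

    public static void main(String[] args) {
        // The link root returned by getLinkRoot() in the example above
        String root = "http://tutorial.itextpdf.com/src/main/resources/html/";

        // A relative href (hypothetical file name) is appended to the root:
        // http://tutorial.itextpdf.com/src/main/resources/html/walden.html
        System.out.println(resolveLink(root, "walden.html"));

        // An absolute href is left untouched:
        // http://example.org/page.html
        System.out.println(resolveLink(root, "http://example.org/page.html"));
    }
}
```

Note that the root ends with a slash. Without it, standard URI resolution would replace the last path segment ("html") instead of appending to it.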