Java Code for Saving PDF File in HTML Format

by Lara Sheen .NET & Java Developer

The long-awaited Aspose.Pdf for Java 4.3.0 has been released. This release adds the ability to convert PDF files to HTML format with just two lines of code. Earlier versions could already extract PDF content as HTML through the PdfExtractor class's extractTextAsHTML(..) method, but the new approach greatly improves output fidelity.


Here is the code:

// Load source PDF file
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("c:/source.pdf");
// Save the file into HTML format
pdfDocument.save("c:/output.html", com.aspose.pdf.SaveFormat.Html);
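When converting more than one file, it can help to derive the output name from the source name instead of hard-coding both paths. The following is a minimal sketch of that idea using only the standard library. HtmlOutputPaths and htmlPathFor are hypothetical names chosen for illustration and are not part of Aspose.Pdf.

```java
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical helper (not part of Aspose.Pdf): picks an .html path next to a PDF. */
final class HtmlOutputPaths {
    private HtmlOutputPaths() {}

    /** Returns a sibling path with the same base name and an .html extension. */
    static Path htmlPathFor(Path pdf) {
        Path fileName = pdf.getFileName();
        if (fileName == null) {
            throw new IllegalArgumentException("Path has no file name: " + pdf);
        }
        String name = fileName.toString();
        // Drop a trailing ".pdf" (any letter case); otherwise keep the whole name.
        String base = name.toLowerCase(Locale.ROOT).endsWith(".pdf")
                ? name.substring(0, name.length() - 4)
                : name;
        return pdf.resolveSibling(base + ".html");
    }
}
```

The resulting path, converted with toString(), could then be passed as the first argument to the pdfDocument.save(...) call shown above.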


This release also introduces a separate JAR file, aspose-pdf-4.3.0-jdk14.jar, targeted at JDK 1.4 and JDK 1.5. In addition, it improves PDF to image conversion, in both output quality and performance. Some important new and improved features in this release are listed below:

  • PDF to HTML conversion
  • Aspose.Pdf for Java 4.0.0 (merged version) is now compatible with JDK 1.5
  • PDF to JPEG conversion: performance is much improved
  • PDF to HTML: overlapping words and formatting issues are resolved
  • Problems caused by the Insert method of the PdfFileEditor class are fixed
  • The missing TextSegment class issue in the com.aspose.pdf.facades package is resolved
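Because the release ships a separate JAR for JDK 1.4 and JDK 1.5, a build or startup script may want to check which runtime it is on. The sketch below assumes the standard java.specification.version system property is the deciding input. JdkJarCheck and needsJdk14Jar are hypothetical names for illustration and do not come from Aspose.Pdf.

```java
/** Hypothetical helper: decides whether the JDK 1.4/1.5 JAR applies to a runtime. */
final class JdkJarCheck {
    private JdkJarCheck() {}

    /**
     * @param specVersion a value of the "java.specification.version" property,
     *                    for example "1.4", "1.5", "1.8" or "17"
     * @return true only for specification versions 1.4 and 1.5
     */
    static boolean needsJdk14Jar(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        if (major != 1) {
            return false; // Java 9 and later report "9", "11", "17", ...
        }
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return minor == 4 || minor == 5;
    }
}
```

A caller would typically pass System.getProperty("java.specification.version") and, when the result is true, put aspose-pdf-4.3.0-jdk14.jar on the classpath.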

