pdf-generation interview questions
Top pdf-generation frequently asked interview questions
I'm looking to create a printable PDF version of my website's pages. Something like express.render(), only rendering the page as a PDF.
Does anyone know a Node module that does that?
If not, how would you go about implementing one? I've seen some methods that use a headless browser like phantom.js, but I'm not sure what the flow is.
Source: (StackOverflow)
I need to automatically generate a PDF file from an existing (X)HTML document. The input files (reports) use a rather simple, table-based layout, so support for really fancy JavaScript/CSS stuff is probably not needed.
As I am used to working in Java, a solution that can easily be used in a Java project is preferable. It only needs to work on Windows systems, though.
One feasible approach, which does not produce good-quality output (at least out of the box), is to use CSS2XSLFO and Apache FOP to create the PDF files. The problem I encountered was that while CSS attributes are converted nicely, the table layout is pretty messed up, with text flowing out of the table cells.
I also took a quick look at Jrex, a Java API for using the Gecko rendering engine.
Is there maybe a way to grab the rendered page from the Internet Explorer rendering engine and send it to a PDF printer tool automatically? I have no experience with OLE programming on Windows, so I have no clue what's possible and what is not.
Do you have an idea?
EDIT: The FlyingSaucer/iText thing looks very promising. I will try to go with that.
Thanks for all the answers
Source: (StackOverflow)
- Many applications in the Android Market, like RepliGo, Aldiko, Mantano, and ezPdf, offer the type of annotation shown in the image below in their PDF viewers.
- I have tried many ways to implement this annotation, but I failed. I have a PDF viewer for Android and separate Java code for annotations that uses iText to draw lines.
- My question is: can I use iText on Android? If so, which package do I have to import?
- Also, some applications use a canvas method for drawing lines. Is it possible to use this canvas method on Android instead of an annotation? The goal would be to have the same features that annotations have.
- In the image below (RepliGo PDF Reader), what kind of code did they use for annotations?

Source: (StackOverflow)
I'm attempting to create a PDF file from an HTML file. After looking around a little, I've found wkhtmltopdf to be perfect. I need to call this .exe from the ASP.NET server. I've attempted:
Process p = new Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.FileName = HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
p.StartInfo.Arguments = "TestPDF.htm TestPDF.pdf";
p.Start();
p.WaitForExit();
No files are created on the server, though. Can anyone give me a pointer in the right direction? I put the wkhtmltopdf.exe file in the top-level directory of the site. Is there anywhere else it should be kept?
Edit: If anyone has better solutions to dynamically create pdf files from html, please let me know.
Source: (StackOverflow)
There have been questions on this, but not recently, and the technology must have moved on since then.
Requirements:
- generating pdf documents based on predefined template (I can use either pdf forms or xsl-fo)
- being able to fill textual data
- being able to fill graphical data (generated bar codes)
- being able to alter pdf template in production environment without patching (recompiling)
- generating pdf file to be saved in the database (as blob) and/or printed
- open source/free
The options I'm considering are iText, PDFBox, and FOP; is there anything else? What would you recommend based on the requirements above?
Source: (StackOverflow)
I am creating a PDF file using DOMPDF. I have a lot of content to render into the PDF, and we need a header on every page. Can anyone tell me how to add a header and footer to the PDF so that the header is shown on all pages using DOMPDF?
Source: (StackOverflow)
When I generate Doxygen documentation in PDF format, I get plenty of different files with a single diagram in each.
Is it possible to obtain a single PDF document, organized as a book, roughly as the HTML version?
Is it possible to get it automatically, i.e. without dealing manually with the latex files?
Thanks!
Source: (StackOverflow)
After 10 hours and trying 4 other HTML to PDF tools I'm about ready to explode.
wkhtmltopdf sounds like an excellent solution...the problem is that I can't execute a process with enough permissions from asp.net so...
Process.Start("wkhtmltopdf.exe","http://www.google.com google.pdf");
starts but doesn't do anything.
Is there an easy way to either:
- a) allow ASP.NET to start processes (that can actually do something), or
- b) compile/wrap/whatever wkhtmltopdf.exe into something I can use from C# like this: WkHtmlToPdf.Save("http://www.google.com", "google.pdf");
Source: (StackOverflow)
I'm currently using Kramdown to generate HTML from Markdown in Ruby.
I know that I can generate a LaTeX file using Kramdown and convert it to PDF using a command-line utility. But I want a pure Ruby solution.
Is there a way to convert Markdown to PDF using only Ruby, without using command-line utilities?
Source: (StackOverflow)
I'm new to using jsPDF, and for the life of me I can't get any CSS to apply to this thing! I've tried inline, internal, and external styles, all to no avail! I read in another SO post that since it's technically printing stuff to a file I need a print style sheet, and that didn't work either.
I have a very basic page that I'm just trying to get any CSS to work with:
JS:
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
<script type="text/javascript" src="jspdf.js"></script>
<script type="text/javascript" src="./libs/FileSaver.js/FileSaver.js"></script>
<script type="text/javascript" src="./libs/Blob.js/BlobBuilder.js"></script>
<script type="text/javascript" src="jspdf.plugin.standard_fonts_metrics.js"></script>
<script type="text/javascript" src="jspdf.plugin.split_text_to_size.js"></script>
<script type="text/javascript" src="jspdf.plugin.from_html.js"></script>
<script>
$(document).ready(function(){
    $('#dl').click(function(){
        var specialElementHandlers = {
            '#editor': function(element, renderer){
                return true;
            }
        };
        var doc = new jsPDF('landscape');
        doc.fromHTML($('body').get(0), 15, 15, {'width': 170, 'elementHandlers': specialElementHandlers});
        doc.output('save');
    });
});
</script>
HTML:
<body>
<div id="dl">Download Maybe?</div>
<div id="testcase">
<h1>
We support special element handlers. Register them with jQuery-style
</h1>
</div>
</body>
And finally the stylesheet that is external:
h1{
    color: red;
}
div{
    color: red;
}
I'm sure everything is being included correctly and that there are no errors; I've already checked all of that. Is there some extra function I need to call to get the CSS to work as well? Let me know please! Thanks a lot! Any other tips you may have are also appreciated!
EDIT:
This is the exact webpage:
<html>
<head>
<link rel="stylesheet" href="print.css" type="text/css" media="print"/>
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
<script type="text/javascript" src="jspdf.js"></script>
<script type="text/javascript" src="./libs/FileSaver.js/FileSaver.js"></script>
<script type="text/javascript" src="./libs/Blob.js/BlobBuilder.js"></script>
<script type="text/javascript" src="jspdf.plugin.standard_fonts_metrics.js"></script>
<script type="text/javascript" src="jspdf.plugin.split_text_to_size.js"></script>
<script type="text/javascript" src="jspdf.plugin.from_html.js"></script>
<script>
$(document).ready(function(){
    $('#dl').click(function(){
        var specialElementHandlers = {
            '#editor': function(element, renderer){
                return true;
            }
        };
        var doc = new jsPDF('landscape');
        doc.fromHTML($('body').get(0), 15, 15, {'width': 170, 'elementHandlers': specialElementHandlers});
        doc.output('save');
    });
});
</script>
</head>
<body>
<div id="dl">Download Maybe?</div>
<div id="testcase">
<h1>
We support special element handlers. Register them with jQuery-style
</h1>
</div>
</body>
</html>
Source: (StackOverflow)
I am a PHP developer and in one of my projects, I need to convert some HTML documents (about 30 to 50 pages) into PDF documents.
My search has turned up the following possible solutions. Among them are some PHP libraries and some command line applications. Each has its own advantages and disadvantages.
PHP libraries:
- fpdf (need more effort to convert)
- tcpdf (need more effort to convert)
- html2fpdf http://html2fpdf.sourceforge.net
- html2pdf http://html2pdf.fr/
- dompdf http://code.google.com/p/dompdf/ (works well compared to the others)
For each library, I have problems like:
- Takes a long time (more than five minutes to convert 30 HTML pages)
- Requires too many resources (memory and time). I set the following parameters in php.ini, but things still don't work:
max_execution_time = 600
memory_limit = 250M
- Needs HTML pages to be well-formed (e.g. no missing close tags)
All of these work when I try to convert simple HTML docs (five or fewer pages with little CSS).
Command line applications
All command line apps work perfectly and very quickly compared to the above libraries, but only when I run them directly on the console. When I try to use them in PHP with exec() or system(), they give me errors.
The following are the command line applications and their errors when I run them in PHP:
html2pdf (http://www.tufat.com/s_html2ps_html2pdf.htm)
(html2pdf:11380): Gtk-WARNING **: cannot open display: :0.0
No protocol specified
wkhtmltopdf
Loading page: 10%
Loading page: 33%
Loading page: 100%
Waiting for redirect
Outputting pages
QPainter::begin(): Returned false
QPainter::begin(): Returned false
QPainter::save: Painter not active
QPainter::scale: Painter not active
QPainter::setRenderHint: Painter must be active to set rendering hints
QPainter::setBrush: Painter not active
QPainter::pen: Painter not active
QPainter::setPen: Painter not active
htmltopdf (http://www.ultrashareware.com/html-to-pdf.htm)
So now I am looking for help. Can anyone answer:
Which PHP library would work well in my case?
Why do these errors occur in command line applications?
Source: (StackOverflow)
Does someone know a (preferably open-source) PDF layout engine for Java, capable of rendering tables with horizontal page breaks? "Horizontal page breaking" is at least how the feature is named in BIRT, but to clarify: If a table has too many columns to fit across the available page width, I want the table to be split horizontally across multiple pages, e.g. for a 10-column table, the columns 1-4 to be output on the first page and columns 5-10 on the second page. This should of course also be repeated on the following pages, if the table has too many rows to fit vertically on one page.
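The column split described above can be sketched as a greedy partition over column widths. This is a minimal, library-independent sketch, written in Python rather than Java for brevity; the function name `split_columns`, the column widths, and the page width are illustrative assumptions and are not taken from BIRT or any other engine.

```python
from typing import List


def split_columns(col_widths: List[float], page_width: float) -> List[List[int]]:
    """Greedily assign column indices to pages, left to right.

    A new page starts whenever the next column would overflow the
    available width. A column wider than the page gets a page of its own.
    """
    pages: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for index, width in enumerate(col_widths):
        if current and used + width > page_width:
            pages.append(current)
            current, used = [], 0.0
        current.append(index)
        used += width
    if current:
        pages.append(current)
    return pages


# Ten columns: four wide ones, then six narrow ones (hypothetical widths).
widths = [100, 100, 100, 100, 60, 60, 60, 60, 60, 60]
print(split_columns(widths, page_width=400))
# [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
```

With these made-up widths, the zero-based result corresponds to columns 1-4 on the first page and columns 5-10 on the second, as in the example in the question.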
So far, it has been quite difficult to search for products. I reckon that such a feature may be named differently in other products, making it difficult to use aunt Google to find a suitable solution.
So far, I've tried:
BIRT claims to support this, but the actual implementation is so buggy that it cannot be used. I thought it was self-evident for such functionality that the row height is kept consistent across all pages, making it possible to align the rows when placing the pages next to each other. BIRT, however, calculates the required row height separately for each page.
Jasper has no support.
I also considered Apache FOP, but I don't find any suitable syntax for this in the XSL-FO specification.
iText is generally a little bit too "low level" for this task anyway (making it difficult to layout other parts of the intended PDF documents), but does not seem to offer support.
Since there seem to be dozens of other reporting or layout engines, which may or may not fit, and I find it a little difficult to guess exactly what to look for, I was hoping that someone has already had similar requirements and can at least point me in the right direction. It is relatively important that the product can be easily integrated into a Java server application; a native Java library would be ideal.

Now, to keep the rows aligned across all pages, the row heights must be calculated as follows:
Row1.height = max(A1.height, B1.height, C1.height, D1.height)
Row2.height = max(A2.height, B2.height, C2.height, D2.height)
While BIRT currently seems to do something like:
Page1.Row1.height = max(A1.height, B1.height)
Page2.Row1.height = max(C1.height, D1.height)
Page1.Row2.height = max(A2.height, B2.height)
Page2.Row2.height = max(C2.height, D2.height)
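The difference between the two calculations above can be illustrated with a small sketch. It assumes a hypothetical table with two rows and four columns (A to D), with columns A and B on page 1 and C and D on page 2; the cell heights are made up for the example, and neither function reflects actual BIRT code.

```python
from typing import Dict, List

# Hypothetical cell heights: rows are Row1 and Row2, columns are A, B, C, D.
cell_heights: List[List[float]] = [
    [12.0, 24.0, 36.0, 12.0],  # A1, B1, C1, D1
    [12.0, 12.0, 12.0, 48.0],  # A2, B2, C2, D2
]
# Column indices placed on each page (A, B on page 1; C, D on page 2).
pages: List[List[int]] = [[0, 1], [2, 3]]


def global_row_heights(heights: List[List[float]]) -> List[float]:
    """Row height = max over all cells in the row, regardless of page."""
    return [max(row) for row in heights]


def per_page_row_heights(
    heights: List[List[float]], page_columns: List[List[int]]
) -> Dict[int, List[float]]:
    """Row height = max over only the cells shown on that page."""
    return {
        page: [max(row[c] for c in cols) for row in heights]
        for page, cols in enumerate(page_columns, start=1)
    }


print(global_row_heights(cell_heights))
# [36.0, 48.0]
print(per_page_row_heights(cell_heights, pages))
# {1: [24.0, 12.0], 2: [36.0, 48.0]}
```

With the global calculation, every page uses the same heights, so the rows line up when the pages are placed side by side. With the per-page calculation, Row1 is 24 high on page 1 but 36 high on page 2, which is the misalignment described above.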

Source: (StackOverflow)
I have an old project that uses the iTextSharp library for PDF generation. The iTextSharp DLL is added as a reference to the project. iTextSharp was originally available under the LGPL licence. Some time ago (in release 5.0.0) the licence changed to AGPL, which is copyleft, so you'd have to GPL all your code if you used it.
My problem is that I don't know when I downloaded the DLL file that is linked in my project, so I don't know whether the DLL is still under the LGPL or already under the AGPL, which would mean that I have to GPL my project.
Is there any way to check the version of iTextSharp when you only have the DLL? Or what its licence is?
Or is there any place where I can download an old version of iTextSharp that is still under the LGPL, so I can be sure I'm not breaking the licence by not making my project GPL?
Source: (StackOverflow)