PDF interview questions
Top frequently asked PDF interview questions
Which is the right/best tag to use in my HTML file when I want to display the Adobe PDF viewer? Right now I'm using the code below, but there are weird side effects (e.g. it seems to steal the starting focus that I've set to another INPUT text box; it doesn't seem to play very well with the jQuery UI Resizable class; etc.)
<embed src="abc.pdf" type="application/pdf" />
Could I even do the same thing with the OBJECT tag? Are there advantages/disadvantages to using one tag vs. the other?
Source: (StackOverflow)
When working with PDFs, I've run across the MIME types application/pdf and application/x-pdf among others.
Is there a difference between these two types, and if so what is it? Is one preferred over the other?
I'm working on a web app which must deliver huge amounts of PDFs and I want to do it the correct way, if there is one.
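For context, application/pdf is the media type registered with IANA for PDF, while the x- prefix in application/x-pdf conventionally marks an unregistered, experimental name. A minimal Python sketch of the headers a server might send, assuming a hypothetical helper called pdf_response_headers (not part of any framework):

```python
# Registered media type for PDF; "application/x-pdf" is an unregistered legacy alias.
PDF_MIME_TYPE = "application/pdf"


def pdf_response_headers(filename, size):
    """Build HTTP response headers for serving a PDF inline (hypothetical helper)."""
    return {
        "Content-Type": PDF_MIME_TYPE,
        "Content-Length": str(size),
        # "inline" asks the browser to display the file rather than download it.
        "Content-Disposition": f'inline; filename="{filename}"',
    }


if __name__ == "__main__":
    for name, value in pdf_response_headers("abc.pdf", 1024).items():
        print(f"{name}: {value}")
```

Whatever framework actually serves the files, the point is only that the Content-Type value should be the registered one.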
Source: (StackOverflow)
Ok, I'm now banging my head against a brick wall with this one.
I have an HTML (not XHTML) document that renders fine in Firefox 3 and IE 7. It uses fairly basic CSS to style it and renders fine in HTML.
I'm now after a way of converting it to PDF. I have tried:
- DOMPDF: it had huge problems with tables. I factored out my large nested tables and it helped (before, it was just consuming up to 128M of memory then dying; that's my memory limit in php.ini) but it makes a complete mess of tables and doesn't seem to get images. The tables were just basic stuff with some border styles to add some lines at various points;
- HTML2PDF and HTML2PS: I actually had better luck with this. It rendered some of the images (all the images are Google Chart URLs) and the table formatting was much better but it seemed to have some complexity problem I haven't figured out yet and kept dying with unknown node_type() errors. Not sure where to go from here; and
- Htmldoc: this seems to work fine on basic HTML but has almost no support for CSS whatsoever so you have to do everything in HTML (I didn't realize it was still 2001 in Htmldoc-land...) so it's useless to me.
I tried a Windows app called Html2Pdf Pilot that actually did a pretty decent job but I need something that at a minimum runs on Linux and ideally runs on-demand via PHP on the Webserver.
I really can't believe I'm this stuck. Am I missing something?
Source: (StackOverflow)
There have been many questions recently about drawing PDFs.
Yes, you can render PDFs very easily with a UIWebView, but this can't give the performance and functionality that you would expect from a good PDF viewer.
You can draw a PDF page to a CALayer or to a UIImage. Apple even has sample code to show how to draw a large PDF in a zoomable UIScrollView.
But the same issues keep cropping up.
UIImage Method:
- PDFs in a UIImage don't optically scale as well as a Layer approach.
- The CPU and memory hit on generating the UIImages from a PDFcontext limits/prevents using it to create a real-time render of new zoom-levels.
CATiledLayer Method:
- There's a significant overhead (time) drawing a full PDF page to a CALayer: individual tiles can be seen rendering (even with a tileSize tweak).
- CALayers can't be prepared ahead of time (rendered off-screen).
Generally PDF viewers are pretty heavy on memory too. Just monitor the memory usage of Apple's zoomable PDF example.
In my current project, I'm developing a PDF viewer and am rendering a UIImage of a page in a separate thread (issues here too!) and presenting it while the scale is x1. CATiledLayer rendering kicks in once the scale is >1. iBooks takes a similar two-pass approach: if you scroll the pages, you can see a lower-res version of the page for just under a second before a crisp version appears.
I'm rendering 2 pages on each side of the page in focus so that the PDF image is ready to mask the layer before it starts drawing. Pages are destroyed again when they are +2 pages away from the focused page.
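The windowing described above can be sketched in a language-neutral way. This Python version is only an illustration, not the viewer's actual code: it reads "+2 pages away" as "more than two pages from the focused page", and the names pages_to_keep, update_cache and the render callback are hypothetical.

```python
def pages_to_keep(focused, page_count, radius=2):
    """Return the 0-based page indices that should stay rendered."""
    first = max(0, focused - radius)
    last = min(page_count - 1, focused + radius)
    return list(range(first, last + 1))


def update_cache(cache, focused, page_count, render, radius=2):
    """Drop pages outside the window, then render any missing ones."""
    wanted = set(pages_to_keep(focused, page_count, radius))
    for page in list(cache):
        if page not in wanted:
            del cache[page]  # destroy pages that fell out of the window
    for page in sorted(wanted):
        if page not in cache:
            cache[page] = render(page)  # pre-render a page near the focus
    return cache


if __name__ == "__main__":
    cache = update_cache({}, focused=5, page_count=20, render=lambda p: f"image-{p}")
    print(sorted(cache))
```

Evicting before rendering keeps the peak number of cached pages at the window size, which matters when each entry is a full-page bitmap.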
Does anyone have any insights, no matter how small or obvious, on improving the performance/memory handling of drawing PDFs, or on any of the other issues discussed here?
EDIT: Some Tips (Credit: Luke Mcneice, VdesmedT, Matt Gallagher, Johann):
- Save any media to disk when you can.
- Use larger tileSizes if rendering on TiledLayers.
- Init frequently used arrays with placeholder objects; alternatively, another design approach is this one
- Note that images will render faster than a CGPDFPageRef.
- Use NSOperations or GCD & Blocks to prepare pages ahead of time.
- Call CGContextSetInterpolationQuality(ctx, kCGInterpolationHigh); CGContextSetRenderingIntent(ctx, kCGRenderingIntentDefault); before CGContextDrawPDFPage to reduce memory usage while drawing.
- Init'ing your NSOperations with a docRef is a bad idea (memory); wrap the docRef into a singleton.
- Cancel needless NSOperations when you can, especially if they will be using memory; beware of leaving contexts open though!
- Recycle page objects and destroy unused views.
- Close any open Contexts as soon as you don't need them.
- On receiving memory warnings, release and reload the DocRef and any page caches.
Other PDF Features:
Documentation
Example projects
Source: (StackOverflow)
How could I merge / convert multiple pdf files into one large pdf file?
I tried the following, but the content of the target file was not as expected:
convert file1.pdf file2.pdf merged.pdf
I need a very simple/basic cli solution. Best would be if I could pipe the output of the merge / convert straight into pdf2ps ( as originally attempted in my previously asked question here: http://stackoverflow.com/questions/2507596/linux-piping-convert-pdf2ps-lp ).
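One commonly used command-line route is Ghostscript's pdfwrite device, which concatenates the input PDFs rather than rasterizing them. The sketch below assumes Ghostscript is installed as gs; the helper names gs_merge_command and merge_pdfs are made up for illustration.

```python
import subprocess


def gs_merge_command(inputs, output="merged.pdf"):
    """Build a Ghostscript argv that concatenates PDFs into one file."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs",
        "-q",                 # suppress the startup banner
        "-dBATCH",            # exit after processing the inputs
        "-dNOPAUSE",          # do not prompt between pages
        "-sDEVICE=pdfwrite",  # write PDF instead of rendering to an image
        f"-sOutputFile={output}",
        *inputs,
    ]


def merge_pdfs(inputs, output="merged.pdf"):
    """Run the merge; raises CalledProcessError if gs exits with an error."""
    subprocess.run(gs_merge_command(inputs, output), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without gs.
    print(" ".join(gs_merge_command(["file1.pdf", "file2.pdf"])))
```

The printed command can also be run directly from a shell, with no Python involved.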
Source: (StackOverflow)
I am looking for a library enabling the creation of PDFs, including files and bitmaps. The license should allow free usage in commercial applications. Good documentation is important too.
Where is a list of current open source libraries for generating PDF files that can be used with Java? I would like to have some kind of comparison table of the functionality offered by the various library alternatives.
Criteria I am looking for (suggestions for additional criteria are welcome) are:
- ease of use of the library and the interfaces provided by the library for a Java programmer to specify inputs and to programmatically provide text and image data to be included in the PDF output,
- output quality of the PDF output from the library and the output compatibility with the most commonly used PDF readers,
- ability to include images into the PDF output with the ability to specify many of the standard image formats such as bitmap, jpeg, GIF, and png,
- reliability and robustness of the library along with error detection, error handling, and error reporting,
- the ability to programmatically process an existing PDF file to search for content and extract content from the file including text, font data, and images,
- general acceptance by the Java community and expectation of continuing product support and innovation.
I have heard about Apache PDFBox; is there any better library? Do you have experience to share?
Source: (StackOverflow)
I got very frustrated when I realized that Android is not able to display PDFs (in a WebView or whatever) out-of-the-box.
So my question is: are there any (OS) JARs or classes to display a PDF document within an app?
Does anybody have experience with using some of the standard Java PDF viewer libraries on Android? The libraries don't need to be free, only usable with Android phones.
I heard that iText got ported over to Android. Have any of you done something with it yet?
Source: (StackOverflow)
How could I search the contents of PDF files in a directory/subdirectory? I am looking for some command line tools. It seems that grep can't search PDF files.
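grep struggles here because the text inside a PDF is usually stored in compressed streams, so one common workaround is to extract the text first and search that. A rough Python sketch, assuming poppler's pdftotext is on the PATH; the function names and the pluggable extract parameter are illustrative only:

```python
import os
import subprocess


def pdftotext_extract(path):
    """Extract text with poppler's pdftotext (assumed to be installed)."""
    result = subprocess.run(
        ["pdftotext", path, "-"],  # "-" sends the extracted text to stdout
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def search_pdfs(root, needle, extract=pdftotext_extract):
    """Yield (path, line) for every extracted line containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            for line in extract(path).splitlines():
                if needle in line:
                    yield path, line.strip()


if __name__ == "__main__":
    for path, line in search_pdfs(".", "invoice"):
        print(f"{path}: {line}")
```

Passing the extractor as a parameter makes it easy to swap in another tool if pdftotext is not available.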
Source: (StackOverflow)
I have an SSRS report. When I tried to export it to PDF, it took 4 pages because of its width, with the 2nd and 4th pages displaying just one field from the table. So I tried setting the layout size in the report properties to width = 18in and height = 8.5in.
That gave me the whole table on a single PDF page, but the 2nd and 4th pages now come out blank.
Is my approach incorrect, or how else can I get rid of those blank pages?
Source: (StackOverflow)
I'm trying to use the command line program convert to take a PDF into an image (JPEG or PNG). Here is one of the PDFs that I'm trying to convert.
I want the program to trim off the excess white-space and return a high enough quality image that the superscripts can be read with ease.
This is my current best attempt. As you can see, the trimming works fine, I just need to sharpen up the resolution quite a bit. This is the command I'm using:
convert -trim 24.pdf -resize 500% -quality 100 -sharpen 0x1.0 24-11.jpg
I've tried to make the following conscious decisions:
- resize it larger (has no effect on the resolution)
- make the quality as high as possible
- use the -sharpen option (I've tried a range of values)
Any suggestions please on getting the resolution of the image in the final PNG/JPEG higher would be greatly appreciated!
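A likely factor is that a PDF is rasterized when it is read, at ImageMagick's default density of 72 dpi, so a later -resize only stretches pixels that already exist. Setting -density before the input file raises the rasterization resolution. The sketch below only builds the command line; pdf_to_image_command is a hypothetical helper and 300 dpi is an arbitrary starting value.

```python
def pdf_to_image_command(pdf, image, dpi=300):
    """Build a convert argv that rasterizes the PDF at the given density."""
    return [
        "convert",
        "-density", str(dpi),  # must precede the input so it applies while reading the PDF
        pdf,
        "-trim",               # trim the surrounding white space after rasterizing
        "-quality", "100",
        image,
    ]


if __name__ == "__main__":
    print(" ".join(pdf_to_image_command("24.pdf", "24-11.jpg")))
```

Higher densities give sharper superscripts at the cost of larger images and longer conversion times.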
Source: (StackOverflow)
I need to automatically generate a PDF file from an existing (X)HTML document. The input files (reports) use a rather simple, table-based layout, so support for really fancy JavaScript/CSS stuff is probably not needed.
As I am used to working in Java, a solution that can easily be used in a Java project is preferable. It only needs to work on Windows systems, though.
One way to do it that is feasible, but does not produce good quality output (at least out of the box), is using CSS2XSLFO and Apache FOP to create the PDF files. The problem I encountered was that while CSS attributes are converted nicely, the table layout is pretty messed up, with text flowing out of the table cells.
I also took a quick look at Jrex, a Java-API for using the Gecko rendering engine.
Is there maybe a way to grab the rendered page from the Internet Explorer rendering engine and send it to a PDF printer tool automatically? I have no experience in OLE programming on Windows, so I have no clue what's possible and what is not.
Do you have an idea?
EDIT: The FlyingSaucer/iText thing looks very promising. I will try to go with that.
Thanks for all the answers
Source: (StackOverflow)
I've asked a couple of questions here but am still having issues. I'd appreciate it if you could tell me what I am doing wrong in my code. I run the code below from an ASP.NET page and get "Cannot Access a Closed Stream".
var doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("First Paragraph"));
doc.Add(new Paragraph("Second Paragraph"));
doc.Close(); //if I remove this line the email attachment is sent but with 0 bytes
MailMessage mm = new MailMessage("username@gmail.com", "username@gmail.com")
{
Subject = "subject",
IsBodyHtml = true,
Body = "body"
};
mm.Attachments.Add(new Attachment(memoryStream, "test.pdf"));
SmtpClient smtp = new SmtpClient
{
Host = "smtp.gmail.com",
Port = 587,
EnableSsl = true,
Credentials = new NetworkCredential("username@gmail.com", "my_password")
};
smtp.Send(mm); //the "Cannot Access a Closed Stream" error is thrown here
Thanks!!!
EDIT:
Just to help somebody looking for the answer to this question, the code to send a pdf file attached to an email without having to physically create the file is below (thanks to Ichiban and Brianng):
var doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("First Paragraph"));
doc.Add(new Paragraph("Second Paragraph"));
writer.CloseStream = false; // keep memoryStream open when doc.Close() runs
doc.Close();
memoryStream.Position = 0; // rewind so the attachment is read from the start
MailMessage mm = new MailMessage("username@gmail.com", "username@gmail.com")
{
Subject = "subject",
IsBodyHtml = true,
Body = "body"
};
mm.Attachments.Add(new Attachment(memoryStream, "filename.pdf"));
SmtpClient smtp = new SmtpClient
{
Host = "smtp.gmail.com",
Port = 587,
EnableSsl = true,
Credentials = new NetworkCredential("username@gmail.com", "password")
};
smtp.Send(mm);
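The two fixes above (keeping the stream open and rewinding it) address a pitfall that is not specific to .NET. As an analogy only, here is the same behaviour with Python's io.BytesIO; build_report is a made-up stand-in for the PDF writer, not iTextSharp's API.

```python
import io


def build_report(stream):
    """Write some bytes to the stream, as a PDF writer would."""
    stream.write(b"%PDF-1.4 fake content")


buffer = io.BytesIO()
build_report(buffer)

at_end = buffer.read()      # position is at the end, so this reads nothing (the 0-byte attachment)
buffer.seek(0)              # rewind, like memoryStream.Position = 0
from_start = buffer.read()  # now the full content is available

buffer.close()
try:
    buffer.read()           # reading a closed stream fails, like "Cannot Access a Closed Stream"
    closed_error = ""
except ValueError as exc:
    closed_error = str(exc)

print(len(at_end), len(from_start), closed_error)
```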
Source: (StackOverflow)
For my Django-powered site, I am looking for an easy solution to convert dynamic HTML pages to PDF.
Pages include HTML and charts from the Google Visualization API (which is JavaScript-based, yet including those graphs is a must).
Source: (StackOverflow)
In my application I will receive a byte stream and convert it to a pdf file in the phone memory. How do I render that to a pdf? And show it on an activity?
Source: (StackOverflow)