EzDevInfo.com

Top PDF frequently asked interview questions

EMBED vs. OBJECT

Which is the right/best tag to use in my HTML file when I want to display the Adobe PDF viewer? Right now I'm using the code below, but there are weird side effects (e.g. it seems to steal the starting focus that I've set to another INPUT text box; it doesn't seem to play well with the jQuery UI Resizable class; etc.)

<embed src="abc.pdf" type="application/pdf" />

Could I even do the same thing with the OBJECT tag? Are there advantages/disadvantages to using one tag vs. the other?


Source: (StackOverflow)

Proper MIME media type for PDF files

When working with PDFs, I've run across the MIME types application/pdf and application/x-pdf among others.

Is there a difference between these two types, and if so what is it? Is one preferred over the other?

I'm working on a web app which must deliver huge amounts of PDFs and I want to do it the correct way, if there is one.
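For reference, application/pdf is the media type registered with IANA (RFC 8118, which replaced RFC 3778), while application/x-pdf is an unregistered legacy name. The following is only a minimal Python sketch of what a typical server-side lookup returns, assuming the standard library's mimetypes table; the helper name and the file name are hypothetical.

```python
import mimetypes


def pdf_content_type(filename):
    """Return the media type the standard library associates with filename."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


# "report.pdf" is a hypothetical file name used only for illustration.
print(pdf_content_type("report.pdf"))
```

A web app would normally send this value in the Content-Type response header.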


Source: (StackOverflow)

Convert HTML + CSS to PDF with PHP?

Ok, I'm now banging my head against a brick wall with this one.

I have an HTML (not XHTML) document that renders fine in Firefox 3 and IE 7. It uses fairly basic CSS to style it and renders fine in HTML.

I'm now after a way of converting it to PDF. I have tried:

  • DOMPDF: it had huge problems with tables. I factored out my large nested tables and it helped (before, it was just consuming up to 128M of memory and then dying; that's my memory limit in php.ini), but it makes a complete mess of tables and doesn't seem to get images. The tables were just basic stuff with some border styles to add some lines at various points;
  • HTML2PDF and HTML2PS: I actually had better luck with this. It rendered some of the images (all the images are Google Chart URLs) and the table formatting was much better but it seemed to have some complexity problem I haven't figured out yet and kept dying with unknown node_type() errors. Not sure where to go from here; and
  • Htmldoc: this seems to work fine on basic HTML but has almost no support for CSS whatsoever so you have to do everything in HTML (I didn't realize it was still 2001 in Htmldoc-land...) so it's useless to me.

I tried a Windows app called Html2Pdf Pilot that actually did a pretty decent job but I need something that at a minimum runs on Linux and ideally runs on-demand via PHP on the Webserver.

I really can't believe I'm this stuck. Am I missing something?


Source: (StackOverflow)

Fast and Lean PDF Viewer for iPhone / iPad / iOs - tips and hints?

There have been many questions recently about drawing PDFs.

Yes, you can render PDFs very easily with a UIWebView, but this can't give the performance and functionality that you would expect from a good PDF viewer.

You can draw a PDF page to a CALayer or to a UIImage. Apple even has sample code showing how to draw a large PDF in a zoomable UIScrollView.

But the same issues keep cropping up.

UIImage Method:

  1. PDFs in a UIImage don't optically scale as well as with a layer approach.
  2. The CPU and memory hit of generating the UIImages from a PDF context limits/prevents using it to create a real-time render of new zoom levels.

CATiledLayer Method:

  1. There's a significant overhead (time) in drawing a full PDF page to a CALayer: individual tiles can be seen rendering (even with a tileSize tweak).
  2. CALayers can't be prepared ahead of time (rendered off-screen).

Generally, PDF viewers are pretty heavy on memory too. Just monitor the memory usage of Apple's zoomable PDF example.

In my current project, I'm developing a PDF viewer and am rendering a UIImage of a page in a separate thread (issues here too!) and presenting it while the scale is x1. CATiledLayer rendering kicks in once the scale is >1. iBooks takes a similar two-pass approach: if you scroll the pages, you can see a lower-res version of the page for just under a second before a crisp version appears.

I'm rendering 2 pages on each side of the page in focus so that the PDF image is ready to mask the layer before it starts drawing. Pages are destroyed again when they are more than 2 pages away from the focused page.
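The pre-render window described above can be sketched in a language-neutral way. This is only an illustrative Python sketch, not the viewer's actual code: the function names, the dictionary cache, and the `render` callback are all hypothetical stand-ins for the real page-image rendering.

```python
def pages_to_keep(focused, page_count, radius=2):
    """Return the 1-based page numbers that should stay rendered."""
    first = max(1, focused - radius)
    last = min(page_count, focused + radius)
    return set(range(first, last + 1))


def update_cache(cache, focused, page_count, render, radius=2):
    """Drop pages outside the window, then render any missing neighbours."""
    wanted = pages_to_keep(focused, page_count, radius)
    for page in list(cache):
        if page not in wanted:
            del cache[page]  # page is more than `radius` away: destroy it
    for page in sorted(wanted):
        if page not in cache:
            cache[page] = render(page)  # hypothetical rendering callback
    return cache


cache = {}
update_cache(cache, focused=5, page_count=10, render=lambda p: f"image-{p}")
print(sorted(cache))  # [3, 4, 5, 6, 7]
update_cache(cache, focused=6, page_count=10, render=lambda p: f"image-{p}")
print(sorted(cache))  # [4, 5, 6, 7, 8]
```

Moving the focus by one page renders only the newly exposed page and frees the one that fell out of range, which keeps the number of page images held in memory bounded.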

Does anyone have any insights, no matter how small or obvious, on improving the performance/memory handling of drawing PDFs, or on any of the other issues discussed here?

EDIT: Some tips (credit: Luke Mcneice, VdesmedT, Matt Gallagher, Johann):

  • Save any media to disk when you can.

  • Use larger tileSizes if rendering on TiledLayers

  • Init frequently used arrays with placeholder objects; alternatively, another design approach is this one

  • Note that images will render faster than a CGPDFPageRef

  • Use NSOperations or GCD & Blocks to prepare pages ahead of time.

  • Call CGContextSetInterpolationQuality(ctx, kCGInterpolationHigh); CGContextSetRenderingIntent(ctx, kCGRenderingIntentDefault); before CGContextDrawPDFPage to reduce memory usage while drawing

  • Init'ing your NSOperations with a docRef is a bad idea (memory); wrap the docRef in a singleton instead.

  • Cancel needless NSOperations when you can, especially if they will be using memory. Beware of leaving contexts open, though!

  • Recycle page objects and destroy unused views

  • Close any open Contexts as soon as you don't need them

  • On receiving memory warnings, release and reload the docRef and any page caches

Other PDF Features:

Documentation

Example projects


Source: (StackOverflow)

merge / convert multiple pdf files into one pdf

How could I merge / convert multiple pdf files into one large pdf file?

I tried the following, but the content of the target file was not as expected:

convert file1.pdf file2.pdf merged.pdf

I need a very simple/basic cli solution. Best would be if I could pipe the output of the merge / convert straight into pdf2ps ( as originally attempted in my previously asked question here: http://stackoverflow.com/questions/2507596/linux-piping-convert-pdf2ps-lp ).


Source: (StackOverflow)

What is the best PDF open source library for Java? [closed]

I am looking for a library enabling the creation of PDFs, including files and bitmaps. The license should allow free usage in commercial applications. Good documentation is important too.

Where is a list of current open source libraries for generating PDF files that can be used with Java? I would like to have some kind of comparison table of the functionality offered by the various library alternatives.

Criteria I am looking for (suggestions for additional criteria are welcome) are:

  1. ease of use of the library and the interfaces provided by the library for a Java programmer to specify inputs and to programmatically provide text and image data to be included in the PDF output,

  2. output quality of the PDF output from the library and the output compatibility with the most commonly used PDF readers,

  3. ability to include images into the PDF output with the ability to specify many of the standard image formats such as bitmap, jpeg, GIF, and png,

  4. reliability and robustness of the library, along with error detection, error handling, and error reporting,

  5. the ability to programmatically process an existing PDF file to search for content and extract content from the file including text, font data, and images,

  6. general acceptance by the Java community and expectation of continuing product support and innovation.

I have heard about Apache PDFBox, is there any better library? Do you have experience to share?


Source: (StackOverflow)

Display PDF within app on Android? [closed]

I got very frustrated when I realized that Android is not able to display PDFs (in a WebView or whatever) out-of-the-box.

So my question is: are there any (OS) JARs or classes to display a PDF document within an app?

Does anybody have experience with using some of the standard Java PDF viewer libraries on Android? The libraries don't need to be free, only usable with Android phones.

I heard that iText got ported over to Android. Have any of you done something with it yet?


Source: (StackOverflow)

How to search contents of multiple pdf files?

How could I search the contents of PDF files in a directory/subdirectory? I am looking for some command line tools. It seems that grep can't search PDF files.


Source: (StackOverflow)

How to get rid of blank pages in PDF exported from SSRS

I have an SSRS report. When I tried to export it to PDF, it took 4 pages due to its width, and the 2nd and 4th pages displayed just one of the fields from my table. So I tried setting the layout size in the report properties to width = 18in and height = 8.5in.

That gave me the whole table on a single page of the PDF, but the 2nd and 4th pages are now blank. Is my approach incorrect? If not, how do I get rid of those blank pages?


Source: (StackOverflow)

Convert PDF to image with high resolution

I'm trying to use the command line program convert to take a PDF into an image (JPEG or PNG). Here is one of the PDFs that I'm trying to convert.

I want the program to trim off the excess white-space and return a high enough quality image that the superscripts can be read with ease.

This is my current best attempt. As you can see, the trimming works fine, I just need to sharpen up the resolution quite a bit. This is the command I'm using:

convert -trim 24.pdf -resize 500% -quality 100 -sharpen 0x1.0 24-11.jpg

I've tried to make the following conscious decisions:

  • resize it larger (has no effect on the resolution)
  • make the quality as high as possible
  • use the -sharpen (I've tried a range of values)

Any suggestions please on getting the resolution of the image in the final PNG/JPEG higher would be greatly appreciated!


Source: (StackOverflow)

Converting HTML files to PDF [closed]

I need to automatically generate a PDF file from an existing (X)HTML document. The input files (reports) use a rather simple, table-based layout, so support for really fancy JavaScript/CSS stuff is probably not needed.

As I am used to working in Java, a solution that can easily be used in a Java project is preferable. It only needs to work on Windows systems, though.

One way to do it that is feasible, but does not produce good quality output (at least out of the box), is using CSS2XSLFO and Apache FOP to create the PDF files. The problem I encountered was that while CSS attributes are converted nicely, the table layout is pretty messed up, with text flowing out of the table cells.

I also took a quick look at Jrex, a Java API for using the Gecko rendering engine.

Is there maybe a way to grab the rendered page from the Internet Explorer rendering engine and send it to a PDF printer tool automatically? I have no experience in OLE programming on Windows, so I have no clue what's possible and what is not.

Do you have an idea?

EDIT: The FlyingSaucer/iText thing looks very promising. I will try to go with that.

Thanks for all the answers


Source: (StackOverflow)

iTextSharp - Sending in-memory pdf in an email attachment

I've asked a couple of questions here but am still having issues. I'd appreciate it if you could tell me what I am doing wrong in my code. I run the code below from an ASP.NET page and get "Cannot Access a Closed Stream".

var doc = new Document();

MemoryStream memoryStream = new MemoryStream();

PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("First Paragraph"));
doc.Add(new Paragraph("Second Paragraph"));

doc.Close(); //if I remove this line the email attachment is sent but with 0 bytes 

MailMessage mm = new MailMessage("username@gmail.com", "username@gmail.com")
{
    Subject = "subject",
    IsBodyHtml = true,
    Body = "body"
};

mm.Attachments.Add(new Attachment(memoryStream, "test.pdf"));
SmtpClient smtp = new SmtpClient
{
    Host = "smtp.gmail.com",
    Port = 587,
    EnableSsl = true,
    Credentials = new NetworkCredential("username@gmail.com", "my_password")
};

smtp.Send(mm); //the "Cannot Access a Closed Stream" error is thrown here

Thanks!!!

EDIT:

Just to help anybody looking for the answer to this question, the code to send a PDF file attached to an email without having to physically create the file is below (thanks to Ichiban and Brianng):

var doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);

doc.Open();
doc.Add(new Paragraph("First Paragraph"));
doc.Add(new Paragraph("Second Paragraph"));

writer.CloseStream = false;
doc.Close();
memoryStream.Position = 0;

MailMessage mm = new MailMessage("username@gmail.com", "username@gmail.com")
{
    Subject = "subject",
    IsBodyHtml = true,
    Body = "body"
};

mm.Attachments.Add(new Attachment(memoryStream, "filename.pdf"));
SmtpClient smtp = new SmtpClient
{
    Host = "smtp.gmail.com",
    Port = 587,
    EnableSsl = true,
    Credentials = new NetworkCredential("username@gmail.com", "password")
};

smtp.Send(mm);
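
The two changes in the fixed version (keeping the stream open and rewinding it with Position = 0) reflect general behaviour of in-memory streams. As a rough analogue only, here is a Python sketch using the standard library's io.BytesIO; it does not use iTextSharp, and the payload is a placeholder rather than a real PDF.

```python
import io

buffer = io.BytesIO()
buffer.write(b"%PDF-1.4 example bytes")  # placeholder payload, not a real PDF

# After writing, the position sits at the end, so a consumer reads nothing.
empty = buffer.read()

# Rewinding first (the analogue of memoryStream.Position = 0) exposes the data.
buffer.seek(0)
payload = buffer.read()

# Once the stream is closed, any further read fails, which mirrors the
# "Cannot Access a Closed Stream" error in the original code.
buffer.close()
try:
    buffer.read()
    closed_error = None
except ValueError as exc:
    closed_error = str(exc)

print(len(empty), len(payload), closed_error is not None)
```

The same ordering applies in the C# code: finish writing, keep the stream open, rewind, and only then hand it to the attachment.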

Source: (StackOverflow)

Render HTML to PDF in Django site

For my Django-powered site, I am looking for an easy solution to convert dynamic HTML pages to PDF.

The pages include HTML and charts from the Google Visualization API (which is JavaScript-based, yet including those graphs is a must).


Source: (StackOverflow)

How to render PDF in Android

In my application I will receive a byte stream and convert it to a PDF file in the phone's memory. How do I render that PDF and show it in an activity?


Source: (StackOverflow)