EzDevInfo.com

docx4j

JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files

docx4j does not replace variables

I just followed approach No 2 in the VariableReplace example from docx4j 2.8.1, and all it does is remove the variable markers ${}.

The steps I did:

  • Opened Word 2013 and typed ${variable} as plain text
  • Saved the document
  • Read it in my Java program and built my HashMap with .put("variable", "TEST");
  • Copied the rest of the code from the example above
  • Saved the document

I expected to get just 'TEST', but the output document contains 'variable' without the markers.


Source: (StackOverflow)

How to save images from a word document in DOCX4J

I'm trying to traverse a Word document and save all the images found in it. I uploaded a sample Word document to the online demo and noticed that the images are listed as:

/word/media/image1.png  rId5    image/png
/word/media/image2.png  rId5    image/png
/word/media/image3.jpg  rId5    image/jpeg

How can I programmatically save these images while traversing the document?

Currently I get all the text from the document like this:

   WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(filePath));
   MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
   Document wmlDocumentEl = (org.docx4j.wml.Document) documentPart.getJaxbElement();
   Body body =  wmlDocumentEl.getBody();
   DocumentTraverser traverser = new DocumentTraverser();

   class DocumentTraverser  extends TraversalUtil.CallbackImpl {
      @Override
      public List<Object> apply(Object o) {
         if (o instanceof org.docx4j.wml.Text) {
         ....
         }
         return null;
      }
   }
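
One possible route, offered as a sketch rather than a confirmed docx4j answer: a .docx file is a ZIP archive, and the listing above shows the images living under /word/media/, so they can be copied out with the JDK's own zip classes without traversing the document at all. The class name DocxMediaExtractor and the method extractMedia are hypothetical names chosen for this illustration, and the sketch assumes every image sits under word/media/ as in the listing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: copies every entry under word/media/ out of a .docx archive. */
final class DocxMediaExtractor {

    /** Returns the paths of the files written into targetDir. */
    static List<Path> extractMedia(InputStream docx, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<Path> saved = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumption: images live under word/media/, as in the listing above.
                if (entry.isDirectory() || !name.startsWith("word/media/")) {
                    continue;
                }
                // Use only the last path segment as the output file name.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty() || fileName.equals(".") || fileName.equals("..")) {
                    continue;
                }
                Path out = targetDir.resolve(fileName);
                // Files.copy reads up to the end of the current zip entry only.
                Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                saved.add(out);
            }
        }
        return saved;
    }
}
```

It could be called as DocxMediaExtractor.extractMedia(new java.io.FileInputStream(filePath), java.nio.file.Paths.get("images")). This does not tell you where in the document each image is used; for that, the relationship ids (rId...) would still have to be resolved through the document parts.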

Source: (StackOverflow)


How to apply new line in docx file generation using DOCX4J

From the tutorials I have seen, I learned how to add text when generating a docx file. But every time I add a line of text, there is always a gap between the first line and the second, just like hitting the Enter key twice. I know the main cause is that every time I add a line of text I use a paragraph, and each paragraph starts with spacing after the previous one.

This is how I add text:

ObjectFactory factory;
factory = Context.getWmlObjectFactory();
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
P spc = factory.createP();
R rspc = factory.createR();
rspc.getContent().add(wordMLPackage.getMainDocumentPart().createParagraphOfText("sample"));
spc.getContent().add(rspc);

java.io.InputStream is = new java.io.FileInputStream(file);
wordMLPackage.getMainDocumentPart().addObject(spc);

This code runs successfully and produces the right output. But when I add another paragraph or more text, I want it to appear directly under the first line. Is there any way to add a simple line of text without using a paragraph? Thanks in advance.

EDIT: I've also tried adding a simple org.docx4j.wml.Text like this

Text newtext = factory.createText();
newtext.setValue("sample new text");
wordMLPackage.getMainDocumentPart().addObject(newtext);

The program runs, but when I open the generated docx file, I get a message saying that there are problems with the contents.


Source: (StackOverflow)

How to convert office documents into html in android

Can we convert Microsoft Office documents (doc, docx, ppt, pptx, xls, xlsx, etc.) into an HTML string on Android? I need to show Office documents in my app. I searched and found docx4j, Apache POI, and http://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/ for converting files to HTML. This approach works fine on the desktop, but on Android I get "Unable to convert in Dalvik format error 1", which may be due to using too many jars in my Android project. Is there a single way to convert Office documents to HTML on Android? Sorry for my English.

EDIT

I am now able to convert doc to HTML using Apache POI. Here is the method:

public void showsimpleWord() {
    File file = new File("/sdcard/test.doc");

    HWPFDocumentCore wordDocument = null;
    try {
        wordDocument = WordToHtmlUtils.loadDoc(new FileInputStream(file));
    } catch (Exception e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }

    WordToHtmlConverter wordToHtmlConverter = null;
    try {
        wordToHtmlConverter = new WordToHtmlConverter(
                DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .newDocument());
        wordToHtmlConverter.processDocument(wordDocument);
        org.w3c.dom.Document htmlDocument = wordToHtmlConverter
                .getDocument();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DOMSource domSource = new DOMSource(htmlDocument);
        StreamResult streamResult = new StreamResult(out);
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer serializer = tf.newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        serializer.transform(domSource, streamResult);
        out.close();
        String result = new String(out.toByteArray());
        System.out.println(result);
        ((WebView) findViewById(R.id.webview)).loadData(result,
                "text/html", "utf-8");

    } catch (Exception e) {
        e.printStackTrace();
    }
}

Now I am looking for ways to handle the other formats.


Source: (StackOverflow)

how to put images at a certain place in the Word(.docx) file by using DOCX4J in java

I have a Word (.docx) file, and I need a Java program that uses docx4j to put an image at a certain place in the document. Can anyone please help?

I'm trying with the following code...

final String XPATH = "//w:t";
String image_Path = "D:\\Temp\\ex.png";
String template_Path = "D:\\Temp\\example.docx";

// "package" is a reserved word in Java, so the variable is named wordPackage here
WordprocessingMLPackage wordPackage = WordprocessingMLPackage.createPackage();
List texts = wordPackage.getMainDocumentPart().getJAXBNodesViaXPath(XPATH, true);
for (Object obj : texts) {
  Text text = (Text) ((JAXBElement) obj).getValue();

  ObjectFactory factory = new ObjectFactory();
  P paragraph = factory.createP();
  R run = factory.createR();
  paragraph.getContent().add(run);
  Drawing drawing = factory.createDrawing();
  run.getContent().add(drawing);
  drawing.getAnchorOrInline().add(image_Path);
  wordPackage.getMainDocumentPart().addObject(paragraph);
  wordPackage.save(new java.io.File("D:\\Temp\\example.docx"));
}

Source: (StackOverflow)

Docx4j library is not thread-safe. What are possible ways to workaround this issue?

I have written an application that must parse and retrieve data from a few thousand large docx files. It will run on a high-performance production server with many CPUs, a large amount of RAM, and fast SSDs in RAID arrays, so obviously I want to use all the available performance.

I found that my application handles every other job in many concurrent threads, but it fails to parse many docx files concurrently using the docx4j library. Moreover, the library cannot safely handle more than one instance of the WordprocessingMLPackage class (which holds the data from a docx file) across separate threads.

Googling and examining the library's source code confirm that it is not thread-safe at all (its classes, for example, contain many static fields and instances that cannot be used concurrently).

So I have some questions to ask:

  • Are there any other libraries with the same capabilities that are guaranteed to be thread-safe?
  • Can I launch my workers in separate processes instead of separate threads to work around this issue? If so, how badly will it hurt the performance of my application?

Source: (StackOverflow)

Convert docX to a custom XML

I have been trying to convert my docx files to a custom XML format I have designed. My users want their data converted to this XML for easier content queries in their web app, and they want the input to come from their docx files.

I have looked for a converter API in Java, but none seems to fit my requirement. I looked into docx4j but realized that it only converts to HTML and PDF. I am wondering whether there is a converter API to which I can supply, say, an intermediate translator (XSLT), so that the output would be my custom XML complete with the data from my docx.

Is there an existing tool for this? If not, any suggestions on the approach I should take in coding my own converter, e.g. converting from OpenXML to XSL-FO first and then to the custom XML?

Would love to hear from the community.

Thank you very much.
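
A minimal sketch of the XSLT idea raised above, using only JDK classes: since a docx is a ZIP archive whose main content is WordprocessingML, the main part can be read out of the archive and fed straight to a stylesheet. The names DocxXsltConverter, convert and EXAMPLE_XSLT are hypothetical, the <doc>/<para> output vocabulary is invented for illustration, and the sketch assumes the main part sits at the conventional location word/document.xml.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical converter: applies a caller-supplied XSLT to the main part of a .docx. */
final class DocxXsltConverter {

    /** Illustrative stylesheet: one <para> per w:p, holding the text of its w:t elements. */
    static final String EXAMPLE_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'"
        + " exclude-result-prefixes='w'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'><doc><xsl:for-each select='//w:p'><para>"
        + "<xsl:for-each select='.//w:t'><xsl:value-of select='.'/></xsl:for-each>"
        + "</para></xsl:for-each></doc></xsl:template>"
        + "</xsl:stylesheet>";

    /** Reads word/document.xml (assumed location of the main part) out of the archive. */
    static byte[] readMainDocument(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
        }
        throw new IOException("word/document.xml not found in archive");
    }

    /** Transforms the main document part with the given stylesheet and returns the result. */
    static String convert(InputStream docx, String xslt) throws IOException, TransformerException {
        byte[] documentXml = readMainDocument(docx);
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter result = new StringWriter();
        transformer.transform(new StreamSource(new ByteArrayInputStream(documentXml)),
            new StreamResult(result));
        return result.toString();
    }
}
```

With this shape, the custom XML format is defined entirely by the stylesheet, so no XSL-FO step is involved. Content stored outside the main part (headers, footnotes, images) would need additional handling.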


Source: (StackOverflow)

ClassNotFoundException: org.docx4j.openpackaging.exceptions.Docx4JException

So here we go again. I have been banging my head against my PC for a few hours and can't figure out what to do. On my local PC I run the Java code from IntelliJ IDEA, and it works. Now I have to create a jar file so it can be used on a remote server. I added all the libraries and jars my program needs in the project settings (added the libraries in the Artifacts section), but it doesn't work when run on the remote server. These are the imports my program needs:

import org.docx4j.dml.CTBlip;
import org.docx4j.jaxb.XPathBinderAssociationIsPartialException;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.relationships.RelationshipsPart;
import org.docx4j.relationships.Relationship;

import javax.xml.bind.JAXBException;
import java.io.File;
import java.util.List;

Error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/docx4j/openpackaging/exceptions/Docx4JException
Caused by: java.lang.ClassNotFoundException: org.docx4j.openpackaging.exceptions.Docx4JException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: Main. Program will exit.

So is the problem in how I create the jar? Did I miss something?
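
A small diagnostic sketch that may help narrow this down: a class with no docx4j imports of its own can be run on the remote server with the same classpath as the failing program, to print the effective classpath and report whether the missing class is visible. ClasspathCheck and isLoadable are hypothetical names for this illustration.

```java
/** Hypothetical diagnostic: reports whether a class is visible on the runtime classpath. */
final class ClasspathCheck {

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The classpath the JVM was actually started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        String name = "org.docx4j.openpackaging.exceptions.Docx4JException";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If the docx4j jar does not show up in the printed classpath, the jar was built or launched without its dependencies, which would match the NoClassDefFoundError in the trace above.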


Source: (StackOverflow)

docx4j find and replace

I have a docx document with some placeholders. I need to replace them with other content and save a new docx document. I started with docx4j and found this method:

public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
    List<Object> result = new ArrayList<Object>();
    if (obj instanceof JAXBElement) obj = ((JAXBElement<?>) obj).getValue();

    if (obj.getClass().equals(toSearch))
        result.add(obj);
    else if (obj instanceof ContentAccessor) {
        List<?> children = ((ContentAccessor) obj).getContent();
        for (Object child : children) {
            result.addAll(getAllElementFromObject(child, toSearch));
        }
    }
    return result;
}

public static void findAndReplace(WordprocessingMLPackage doc, String toFind, String replacer){
    List<Object> paragraphs = getAllElementFromObject(doc.getMainDocumentPart(), P.class);
    for(Object par : paragraphs){
        P p = (P) par;
        List<Object> texts = getAllElementFromObject(p, Text.class);
        for(Object text : texts){
            Text t = (Text)text;
            if(t.getValue().contains(toFind)){
                t.setValue(t.getValue().replace(toFind, replacer));
            }
        }
    }
}

But that only works rarely, because the placeholders are usually split across multiple text runs.

I tried UnmarshallFromTemplate, but it rarely works either.

How can this problem be solved?
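
To illustrate the split-run problem and one way around it, here is a sketch on plain strings rather than docx4j objects: the texts of all runs in a paragraph are joined, the replacement is done on the joined string, and the result is written back into the first run while the others are emptied. The nested Run class is a stand-in for org.docx4j.wml.Text (which exposes getValue and setValue as used in the code above); SplitRunReplace and replaceAcrossRuns are hypothetical names.

```java
import java.util.List;

/** Hypothetical illustration of replacing a placeholder that spans several text runs. */
final class SplitRunReplace {

    /** Stand-in for a text run; in the code above this role is played by Text. */
    static final class Run {
        private String value;
        Run(String value) { this.value = value; }
        String getValue() { return value; }
        void setValue(String value) { this.value = value; }
    }

    /**
     * Joins the run texts of one paragraph, replaces toFind in the joined text,
     * stores the result in the first run and blanks the remaining runs.
     * Returns true if a replacement was made.
     */
    static boolean replaceAcrossRuns(List<Run> runs, String toFind, String replacer) {
        if (runs.isEmpty()) {
            return false;
        }
        StringBuilder joined = new StringBuilder();
        for (Run run : runs) {
            joined.append(run.getValue());
        }
        String whole = joined.toString();
        if (!whole.contains(toFind)) {
            return false;
        }
        runs.get(0).setValue(whole.replace(toFind, replacer));
        for (int i = 1; i < runs.size(); i++) {
            runs.get(i).setValue("");
        }
        return true;
    }
}
```

The trade-off is that all text ends up in the first run, so any formatting that differed between the runs of that paragraph is lost. The VariablePrepare.prepare call that appears elsewhere on this page targets the same split-run situation inside docx4j itself.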


Source: (StackOverflow)

How to replace EclipseLink 2.3.2 with EclipseLink 2.5 in WebLogic Server 12c

I am currently trying to run Docx4j in WebLogic Server 12c. WebLogic Server 12c comes with EclipseLink 2.3.2.

There is a similar post describing the situation, which unfortunately yielded no answer.

Docx4j does not work with the JAXB (MOXy) implementation which is part of EclipseLink 2.3.2. I got Docx4j running standalone with EclipseLink 2.5, so I am very confident that using EclipseLink 2.5 with WebLogic Server 12c will solve the issue with Docx4j.

How can I replace the EclipseLink version 2.3.2 that WebLogic Server 12c runs on with EclipseLink version 2.5?


Source: (StackOverflow)

How to remove all comments from docx file with docx4j?

I'd like to remove all the comments from a docx file using docx4j.

I can remove the actual comments with a piece of code like the one shown below, but I think I also need to remove the comment references from the main document part (otherwise the document is corrupted), and I can't figure out how to do that.

CommentsPart cmtsPart = wordMLPackage.getMainDocumentPart().getCommentsPart();
org.docx4j.wml.Comments cmts = cmtsPart.getJaxbElement();
List<Comments.Comment> coms = cmts.getComment();
coms.clear();

Any guidance appreciated!

I also posted this question on the docx4j forum: http://www.docx4java.org/forums/docx-java-f6/how-to-remove-all-comments-from-docx-file-t1329.html.

Thanks.


Source: (StackOverflow)

docx4j replace variable with html

I got this sample code to replace variables with text, and it works perfectly.

WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File("c:/template.docx"));

VariablePrepare.prepare(wordMLPackage);

MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();            

HashMap<String, String> mappings = new HashMap<String, String>();
mappings.put("firstname", "Name"); //${firstname}
mappings.put("lastname", "Name"); //${lastname}

documentPart.variableReplace(mappings);

wordMLPackage.save(new java.io.File("c:/replace.docx"));

But now I have to replace the variables with HTML. I tried something like this, but of course it does not work:

WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File("c:/template.docx"));

VariablePrepare.prepare(wordMLPackage);

MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();    

String html = "<html><head><title>Import me</title></head><body><p style='color:#ff0000;'>Hello World!</p></body></html>"; 

AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/hw.html"));
afiPart.setBinaryData(html.toString().getBytes());
afiPart.setContentType(new ContentType("text/html"));
Relationship altChunkRel = documentPart.addTargetPart(afiPart);
CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
ac.setId(altChunkRel.getId());


HashMap<String, String> mappings = new HashMap<String, String>();
mappings.put("firstname", ac.toString()); //${firstname}
mappings.put("lastname", "Name"); //${lastname}

documentPart.variableReplace(mappings);

wordMLPackage.save(new java.io.File("c:/replace.docx"));

Is there any way to achieve this?


Source: (StackOverflow)

How to center text in docx4j

I have a paragraph of text which I would like to appear in the center of the document. How can I do this in docx4j? I am currently using:

    PPr paragraphProperties = factory.createPPr();

    //creating the alignment
    TextAlignment align = new TextAlignment();
    align.setVal("center");
    paragraphProperties.setTextAlignment(align);

    //centering the paragraph
    paragraph.setPPr(paragraphProperties);

but it isn't working.


Source: (StackOverflow)

Rejecting re-init on previously-failed class java.lang.Class on Android Lollipop

I am trying to generate PDF/Docx on Android.

I tried a lot of libraries (Apache POI, docx4j and PDFBox), but I always get this message in the console.

Any ideas?

For example, with this sample code for docx4j:

public class ExportNotebookToWordTask extends RoboAsyncTask<Void> {

        private ProgressDialog exportProgress;
        private Activity activity;

        protected ExportNotebookToWordTask (Context context, Activity activity) {
            super(context);
            this.activity = activity;
            exportProgress = new ProgressDialog(activity);
            exportProgress.setIndeterminate(true);
            exportProgress.setCancelable(false);
            exportProgress.setCanceledOnTouchOutside(false);
            exportProgress.setMessage(context.getString(R.string.export_notebook_to_pdf_progress));
        }

        @Override
        protected void onPreExecute() throws Exception {
            super.onPreExecute();
            exportProgress.show();
        }

        @Override
        public Void call() throws Exception {

            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
            wordMLPackage.getMainDocumentPart().addParagraphOfText("Hello Word!");

            File notebookDir = new File(Environment.getExternalStorageDirectory().getAbsolutePath() + File.separator + Constants.NOTEBOOKS_DIR);
            if(!notebookDir.exists()) {
                notebookDir.mkdir();
            }
            wordMLPackage.save(new File(notebookDir, course.getName() + Constants.DOCX_EXTENSION_FILE));

            return null;
        }

        @Override
        protected void onSuccess(Void result) throws Exception {
            super.onSuccess(result);
            DigitalNotebookActivity.this.finish();
        }

        @Override
        protected void onFinally() throws RuntimeException {
            super.onFinally();
            if (exportProgress != null && exportProgress.isShowing()) {
                exportProgress.dismiss();
            }
        }
    }

Log:

05-25 22:41:42.927  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.JaxbValidationEventHandler>
05-25 22:41:42.927  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.JaxbValidationEventHandler>
05-25 22:41:42.957  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.JaxbValidationEventHandler>
05-25 22:41:42.977  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.JaxbValidationEventHandler>
05-25 22:41:43.027  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.041 [pool-4-thread-8] INFO  org.docx4j.jaxb.Context - java.vendor=The Android Project
05-25 22:41:43.027  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.043 [pool-4-thread-8] INFO  org.docx4j.jaxb.Context - java.version=0
05-25 22:41:43.137  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.152 [pool-4-thread-8] DEBUG org.docx4j.utils.ResourceUtils - Attempting to load: org/docx4j/wml/jaxb.properties
05-25 22:41:43.147  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.160 [pool-4-thread-8] DEBUG org.docx4j.utils.ResourceUtils - Not using MOXy, since no resource: org/docx4j/wml/jaxb.properties
05-25 22:41:43.147  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.161 [pool-4-thread-8] INFO  org.docx4j.jaxb.Context - No MOXy JAXB config found; assume not intended..
05-25 22:41:43.147  29302-31419/com.digitalnotebook I/System.out﹕ 22:41:43.161 [pool-4-thread-8] DEBUG org.docx4j.jaxb.Context - org/docx4j/wml/jaxb.properties not found via classloader.
05-25 22:41:43.147  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperSunInternal>
05-25 22:41:43.157  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperSunInternal>
05-25 22:41:43.157  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapper>
05-25 22:41:43.157  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapper>
05-25 22:41:43.157  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperRelationshipsPartSunInternal>
05-25 22:41:43.157  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperRelationshipsPartSunInternal>
05-25 22:41:43.177  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperRelationshipsPart>
05-25 22:41:43.177  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapperRelationshipsPart>
05-25 22:41:43.177  29302-31419/com.digitalnotebook I/art﹕ Rejecting re-init on previously-failed class java.lang.Class<org.docx4j.jaxb.NamespacePrefixMapper>

Source: (StackOverflow)

How to get Page/Sheet Count of Word/Excel documents?

In my project I have a requirement to show the number of pages in Word documents (.doc, .docx) and the number of sheets in Excel documents (.xls, .xlsx). I tried reading the .docx file with Docx4j, but the performance is very poor when all I need is the count. I also tried Apache POI, but I get an error like:

"trouble writing output: Too many methods: 94086; max is 65536. By package:" 

I want to know whether there is any paid or open source library available for Android.
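
For the .docx case, one lightweight possibility that avoids loading a large library: the archive normally carries an extended-properties part (conventionally docProps/app.xml) in which the application that last saved the file records a <Pages> value. The sketch below reads that stored value with JDK classes only; DocxPageCount and storedPageCount are hypothetical names, and the number is only as accurate as what the saving application wrote, so it may be missing or stale.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reads the page count stored in a .docx file's docProps/app.xml. */
final class DocxPageCount {

    /** Returns the stored page count, or -1 if the part or its Pages element is absent. */
    static int storedPageCount(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the extended properties sit at the conventional path.
                if (!entry.getName().equals("docProps/app.xml")) {
                    continue;
                }
                // Buffer the entry so the XML parser never touches the zip stream itself.
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = zip.read(buf)) != -1) {
                    xml.write(buf, 0, n);
                }
                Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.toByteArray()));
                NodeList pages = doc.getElementsByTagName("Pages");
                if (pages.getLength() == 0) {
                    return -1;
                }
                return Integer.parseInt(pages.item(0).getTextContent().trim());
            }
        }
        return -1;
    }
}
```

The same unzip-and-parse pattern could be pointed at xl/workbook.xml to count sheet elements in an .xlsx file, though that variant is not shown here. The binary .doc and .xls formats are not ZIP archives, so this approach does not apply to them.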


Source: (StackOverflow)