Lucene & Community Commons PDF Merge

0
We're having some issues with the Lucene module. We like the capabilities that Lucene search adds to our application, but we have found that it requires a jar file (tika-app-1.6) that bundles its own copy of PDFBox. Unfortunately, Community Commons PDF Merge also requires PDFBox (version 2.0.3), and there was a major change to PDFBox between 1.8 and 2.0: specifically, the 2.0 PDFMergerUtility calls a PDDocument constructor that takes a MemoryUsageSetting parameter, which the older PDDocument class does not have. This causes a runtime error when we use PDF Merge, because both jar files contain the PDDocument class. Here's the stack trace:

	java.lang.NoSuchMethodError: org.apache.pdfbox.pdmodel.PDDocument.<init>(Lorg/apache/pdfbox/io/MemoryUsageSetting;)V
	    at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:254)
	    at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:227)
	    at communitycommons.Misc.mergePDF(Misc.java:618)
	    at communitycommons.actions.MergeMultiplePdfs.executeAction(MergeMultiplePdfs.java:42)
	    at communitycommons.actions.MergeMultiplePdfs.executeAction(MergeMultiplePdfs.java:1)
	    at com.mendix.systemwideinterfaces.core.UserAction.execute(UserAction.java:50)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.doCall(CoreActionHandlerImpl.scala:73)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.call(CoreActionHandlerImpl.scala:53)
	    at com.mendix.core.actionmanagement.CoreAction.call(CoreAction.java:51)
	    at com.mendix.basis.actionmanagement.ActionManager$1.execute(ActionManager.java:170)
	    at com.mendix.util.classloading.Runner.doRunUsingClassLoaderOf(Runner.java:33)
	    at com.mendix.basis.actionmanagement.ActionManager.executeSync(ActionManager.java:174)
	    at com.mendix.basis.component.InternalCore.execute(InternalCore.java:445)
	    at com.mendix.modules.microflowengine.actions.actioncall.JavaAction.execute(JavaAction.scala:52)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowObject.execute(MicroflowObject.java:47)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowImpl.executeAfterBreakingIfNecessary(MicroflowImpl.java:200)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowImpl.executeAction(MicroflowImpl.java:157)
	    at com.mendix.systemwideinterfaces.core.UserAction.execute(UserAction.java:50)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.doCall(CoreActionHandlerImpl.scala:73)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.call(CoreActionHandlerImpl.scala:53)
	    at com.mendix.core.actionmanagement.CoreAction.call(CoreAction.java:51)
	    at com.mendix.basis.actionmanagement.ActionManager$1.execute(ActionManager.java:170)
	    at com.mendix.util.classloading.Runner.doRunUsingClassLoaderOf(Runner.java:33)
	    at com.mendix.basis.actionmanagement.ActionManager.executeSync(ActionManager.java:174)
	    at com.mendix.basis.component.InternalCore.executeSync(InternalCore.java:523)
	    at com.mendix.modules.microflowengine.actions.SubMicroflowAction.execute(SubMicroflowAction.scala:44)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowObject.execute(MicroflowObject.java:47)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowImpl.executeAfterBreakingIfNecessary(MicroflowImpl.java:200)
	    at com.mendix.modules.microflowengine.microflow.impl.MicroflowImpl.executeAction(MicroflowImpl.java:157)
	    at com.mendix.systemwideinterfaces.core.UserAction.execute(UserAction.java:50)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.doCall(CoreActionHandlerImpl.scala:73)
	    at com.mendix.basis.actionmanagement.CoreActionHandlerImpl.call(CoreActionHandlerImpl.scala:53)
	    at com.mendix.core.actionmanagement.CoreAction.call(CoreAction.java:51)
	    at com.mendix.basis.actionmanagement.ActionManager$1.execute(ActionManager.java:170)
	    at com.mendix.util.classloading.Runner.doRunUsingClassLoaderOf(Runner.java:33)
	    at com.mendix.basis.actionmanagement.ActionManager.executeSync(ActionManager.java:174)
	    at com.mendix.basis.component.InternalCore.execute(InternalCore.java:445)
	    at com.mendix.webui.actions.client.ExecuteAction.execute(ExecuteAction.java:143)
	    at com.mendix.webui.requesthandling.ClientRequestHandler$$anonfun$handleRequest$1.apply$mcV$sp(ClientRequestHandler.scala:321)
	    at com.mendix.webui.requesthandling.ClientRequestHandler$$anonfun$handleRequest$1.apply(ClientRequestHandler.scala:307)
	    at com.mendix.webui.requesthandling.ClientRequestHandler$$anonfun$handleRequest$1.apply(ClientRequestHandler.scala:307)
	    at com.mendix.basis.actionmanagement.IMonitoredAction$$anon$1.execute(IMonitoredAction.scala:47)
	    at com.mendix.util.classloading.Runner.doRunUsingClassLoaderOf(Runner.java:33)
	    at com.mendix.basis.actionmanagement.IMonitoredAction$class.monitor(IMonitoredAction.scala:49)
	    at com.mendix.webui.requesthandling.ClientRequestHandler$ClientMonitoredAction.monitor(ClientRequestHandler.scala:419)
	    at com.mendix.webui.requesthandling.ClientRequestHandler.handleRequest(ClientRequestHandler.scala:307)
	    at com.mendix.webui.requesthandling.ClientRequestHandler.handleActionWithSessionRequired(ClientRequestHandler.scala:238)
	    at com.mendix.webui.requesthandling.ClientRequestHandler.handleAction(ClientRequestHandler.scala:202)
	    at com.mendix.webui.requesthandling.ClientRequestHandler.liftedTree1$1(ClientRequestHandler.scala:99)
	    at com.mendix.webui.requesthandling.ClientRequestHandler.processRequest(ClientRequestHandler.scala:91)
	    at com.mendix.externalinterface.connector.RequestHandler.doProcessRequest(RequestHandler.java:40)
	    at com.mendix.external.connector.MxRuntimeConnector$1.execute(MxRuntimeConnector.java:70)
	    at com.mendix.external.connector.MxRuntimeConnector$1.execute(MxRuntimeConnector.java:67)
	    at com.mendix.util.classloading.Runner.doRunUsingClassLoaderOf(Runner.java:33)
	    at com.mendix.external.connector.MxRuntimeConnector.processRequest(MxRuntimeConnector.java:73)
	    at com.mendix.basis.impl.MxRuntimeImpl.processRequest(MxRuntimeImpl.java:875)
	    at com.mendix.m2ee.appcontainer.server.handler.RuntimeHandler.handle(RuntimeHandler.java:41)
	    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
	    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	    at org.eclipse.jetty.server.Server.handle(Server.java:368)
	    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
	    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
	    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
	    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
	    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
	    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	    at java.lang.Thread.run(Thread.java:745)

We really need both of these capabilities (PDF Merge and Lucene). Does anybody have any ideas on how to fix this?
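
A conflict like this can usually be confirmed by asking the JVM which jar actually supplied the class. The following is only a minimal sketch using standard JDK calls; the class name `ClassOrigin` and the method `describe` are hypothetical names made up for illustration, and the result depends on the class loader it runs under (inside a Mendix Java action it should reflect the project's userlib).

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Mendix, Lucene, or Community Commons).
final class ClassOrigin {

    /** Returns the location a class was loaded from, or a short note if unknown. */
    static String describe(String className) {
        try {
            // Load without running static initializers; we only need the metadata.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself typically have no code source.
            return src == null
                    ? "no code source (JDK/bootstrap class)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found: " + className;
        }
    }

    public static void main(String[] args) {
        // With both jars on the classpath, this shows which one wins.
        System.out.println(describe("org.apache.pdfbox.pdmodel.PDDocument"));
    }
}
```

If the printed location points at the tika-app jar rather than the PDFBox 2.0 jar, that would match the NoSuchMethodError in the trace above.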
asked
2 answers
0

You can remove the getFileFormat() function and the tika*.jar, and replace the main part of UpdateIndexFileDocument with:

 

		// BEGIN USER CODE
		// Nothing to index when the file document has no contents (null-safe check).
		if (!Boolean.TRUE.equals(filedocument.getHasContents())) {
			return false;
		}
		InputStream is = null;
		try {
			is = Core.getFileDocumentContent(getContext(), filedocument.getMendixObject());
		} catch (Exception e) {
			logger.error("Error reading stream, file is missing from disk " + filedocument.getName(), e);
			return false;
		}
		try {
			String text = null;
			// Choose the extractor from the lower-cased extension; names without a dot give "".
			int pos = filedocument.getName().lastIndexOf('.');
			String ext = pos < 0 ? "" : filedocument.getName().substring(pos + 1).toLowerCase();
			logger.debug("Index file document, filetype " + ext);
			switch (ext) {
			case "pdf":
				text = textFromPDF(is);
				break;
			case "doc":
			case "xls":
				text = textFromOffice97(is);
				break;
			case "docx":
				text = textFromOfficeX(is);
				break;
			case "xlsx":
				text = textFromXLSX(is);
				break;
			case "pptx":
				text = textFromPPTX(is);
				break;
			case "xml":
				text = textFromXML(is);
				break;
			case "html":
				text = textFromHTML(is);
				break;
			case "txt":
				text = textFromTextFile(is);
				break;
			default:
				logger.error("Unsupported file type " + ext + " " + filedocument.getName());
				break;
			}
			if (text != null && !text.isEmpty()) {
				// Add the tag-stripped contents as a Reader-valued Text field so it will
				// get tokenized and indexed.
				StringReader reader = new StringReader(text);
				Document doc = new Document();
				doc.add(new TextField(LuceneFactory.TEXT, reader));
				doc.add(new TextField(LuceneFactory.MXTYPE, __filedocument.getType(), Field.Store.YES));
				doc.add(new TextField(LuceneFactory.MXID, Long.toString(__filedocument.getId().toLong()), Field.Store.YES));
				LuceneFactory.getInstance().addDocumentQueued(indexId, doc);
			}
		} catch (Exception e) {
			logger.error("Error indexing filedocument", e);
		} finally {
			// Always release the stream, whether or not indexing succeeded.
			try {
				is.close();
			} catch (Exception e) {
				logger.debug("Could not close file document stream: " + e.getMessage());
			}
		}
		return true;
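
The textFrom...() helpers called above are not shown in this answer, so they have to be supplied separately. As a rough sketch of the simplest one, here is what textFromTextFile could look like using only the JDK. The class name `IndexTextHelpers`, the UTF-8 assumption, and the `extensionOf` helper are my own assumptions for illustration, not code from the Lucene module.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helpers; names and behavior are assumptions, not module code.
final class IndexTextHelpers {

    /** Lower-cased extension without the dot, or "" when the name has none. */
    static String extensionOf(String fileName) {
        if (fileName == null) {
            return "";
        }
        int pos = fileName.lastIndexOf('.');
        if (pos < 0 || pos == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(pos + 1).toLowerCase(Locale.ROOT);
    }

    /** Reads the whole stream and decodes it as UTF-8 (assumed encoding). */
    static String textFromTextFile(InputStream is) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        // Copy until end of stream; the caller remains responsible for closing it.
        while ((read = is.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Files in other encodings would need charset detection, which this sketch deliberately leaves out. The PDF and Office helpers would need PDFBox and a library such as Apache POI and are not sketched here.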

 

answered
-1

An easy and safe way to join or combine confidential files, such as PDF files that include personal data, is the Merge PDF Tool. The software is built with security in mind, so no data is lost when using it. You can trust the tool, as I have personally used it before. Visit https://www.osttopstapp.com/merge-pdf.html

answered