DocumentParser app service no longer available?

I'm trying to do a POC that extracts metadata from a FileDocument, and I came across this app service: DocumentParser. This would be very useful to showcase to our potential client, but I realized that the app service appears to be down. Any chance of reviving this app service?
1 answer

If you want to get the metadata from a file in much the same way the app service did, the code below could help. It is a Java action that takes a specialization of System.FileDocument, called MyDoc in the module MyFirstModule, and returns a single string in which each attribute name and its value are separated by a colon (:).

package myfirstmodule.actions;

import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import com.mendix.core.Core;
import com.mendix.systemwideinterfaces.core.IContext;
import com.mendix.systemwideinterfaces.core.IMendixObject;
import com.mendix.webui.CustomJavaAction;

public class JA_GetMetaData extends CustomJavaAction<java.lang.String>
{
	private IMendixObject __Document;
	private myfirstmodule.proxies.MyDoc Document;

	public JA_GetMetaData(IContext context, IMendixObject Document)
	{
		super(context);
		this.__Document = Document;
	}

	@Override
	public java.lang.String executeAction() throws Exception
	{
		this.Document = __Document == null ? null : myfirstmodule.proxies.MyDoc.initialize(getContext(), __Document);

		Parser parser = new AutoDetectParser();
		BodyContentHandler handler = new BodyContentHandler();
		Metadata metadata = new Metadata();
		ParseContext context = new ParseContext();

		// Close the file stream once parsing is done
		try (InputStream inputstream = Core.getFileDocumentContent(getContext(), Document.getMendixObject())) {
			parser.parse(inputstream, handler, metadata, context);
		}
		// the handler contains the text content of the file being processed

		// getting the list of all metadata elements
		String[] metadataNames = metadata.names();
		StringBuilder sb = new StringBuilder();
		for (String name : metadataNames) {
			// one "name: value" pair per line
			sb.append(name).append(": ").append(metadata.get(name)).append("\n");
		}

		return sb.toString();
	}

	/**
	 * Returns a string representation of this action
	 */
	@Override
	public java.lang.String toString()
	{
		return "JA_GetMetaData";
	}
}

For a sample PDF file I just show the metadata in a message box, and this will get you something like below:

With a little extra effort in the Java code you can create an action that stores the data in an entity associated with the document entity that holds the file. In addition, you can create a more generic version by using the type parameters of the Java action in the modeler. And if you are interested in the content of the file, have a look at the handler.
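
As a starting point for storing the values per attribute, here is a minimal sketch in plain Java that splits the returned string back into name/value pairs. It assumes each line has the form "name: value" (a colon followed by a space). The class MetadataLines and its parse method are hypothetical helpers, not part of Mendix or Tika. It splits on the first colon-plus-space rather than on the first colon, because Tika metadata names such as dc:title contain colons themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Mendix or Tika.
class MetadataLines {

    // Splits text with one "name: value" pair per line into an ordered map.
    // Assumption: name and value are separated by a colon followed by a space.
    static Map<String, String> parse(String text) {
        Map<String, String> pairs = new LinkedHashMap<>();
        if (text == null) {
            return pairs;
        }
        for (String line : text.split("\\R")) {
            // Use the first ": " so names like "dc:title" stay intact
            int sep = line.indexOf(": ");
            if (sep <= 0) {
                continue; // skip blank or malformed lines
            }
            String name = line.substring(0, sep);
            String value = line.substring(sep + 2).trim();
            pairs.put(name, value);
        }
        return pairs;
    }
}
```

Each entry of the resulting map could then be written to one object of your own metadata entity, with one attribute for the name and one for the value.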

Be aware that you'll need the Apache Tika library in your userlib folder. This can be downloaded from:

Also be aware that the method does not work for all files. I did a short test with pdf, docx, xlsx and png files, and these all return the metadata.
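
If you want to guard the showcase against untested formats, a small check on the file name before calling the action is one option. The sketch below is plain Java; the class TestedFileTypes and its isTested method are hypothetical names, and the list only reflects the four formats from my short test. Other formats may well work too, so treat a false result as "not tried", not as "unsupported".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of Mendix or Tika.
class TestedFileTypes {

    // Extensions from the short test in this answer (assumption: nothing else was tried)
    private static final Set<String> TESTED =
            new HashSet<>(Arrays.asList("pdf", "docx", "xlsx", "png"));

    // Returns true when the file name ends in one of the tested extensions.
    static boolean isTested(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension present
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TESTED.contains(extension);
    }
}
```

In a microflow you could pass the Name attribute of the FileDocument to a check like this and show a friendly message for anything it does not recognise.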

Hope this helps you further in your showcase.