Read Lined JSON/Raw JSON/multiple JSON lines

Hi folks,

Has anyone tried reading Lined JSON (JSON Lines, https://jsonlines.org/examples/) instead of regular JSON? Example file content:

{"id":1,"father":"Mark","mother":"Charlotte"}
{"id":2,"father":"John","mother":"Ann"}
{"id":3,"father":"Bob","mother":"Monika"}

Solution 1: Use a Java action to open the JSON Lines file, read the content as text, add '[' at the beginning and ']' at the end of the file, and add ',' at the end of each record. This may cause performance issues if the data is huge, because the whole JSON content then has to be parsed in one go.

Can you think of any other solutions?

Thanks and regards,
Nirmal Kumar
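For reference, a minimal sketch of Solution 1 in plain Java (jsonLinesToArray is a hypothetical helper; it joins the records with ',' rather than appending ',' to each one, which avoids a trailing comma, and it assumes the whole content fits in memory, which is exactly the performance concern above):

    import java.util.stream.Collectors;

    // Hypothetical helper: wrap newline-separated JSON records in a JSON array.
    // Note: reads the whole content at once, hence the performance concern above.
    public static String jsonLinesToArray(String jsonLines) {
        return jsonLines.lines()                     // one JSON object per line (Java 11+)
                .filter(line -> !line.isBlank())     // skip empty lines
                .collect(Collectors.joining(",", "[", "]"));
    }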
asked
3 answers

I ran into this situation and solved it slightly differently. So here's my solution for future reference:

 

Use the CommunityCommons SplitItem action to split your raw JSON (Lines) string into SplitItem objects, with split parameter '\n'.

Then iterate over the SplitItem list and import as usual, with an import mapping based on a single JSON line.

No exotic Java actions needed; this worked fine for me with a file of 2,700 lines, imported on a monthly basis.

 

When importing larger files, you might want to add some extra batching, as Daan has done below.
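As a rough sketch, the same per-line idea expressed in plain Java (rawContent is a placeholder for the file content; the actual import mapping call stays in your microflow):

    // Split the raw JSON Lines content on newlines; each element is then a
    // self-contained JSON document for an import mapping based on one line.
    String[] records = rawContent.split("\\r?\\n");
    for (String record : records) {
        if (record.isBlank()) {
            continue; // skip empty lines
        }
        // pass 'record' to the single-line import mapping here
    }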

answered

I'm not sure if you can change the source where the JSON is generated.

But if not, why not use the Mendix string replace and concatenation functions? That may be faster than a Java action:

https://docs.mendix.com/refguide6/string-function-calls#replaceall

- Replace all '}' with '},'
- Add '[' at the start and ']' at the end
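As a hedged sketch, that idea as a single microflow expression ($RawContent is a placeholder; this variant joins records on the newline instead of after every '}', so nested objects stay intact and there is no trailing comma, assuming exactly one JSON object per line):

    '[' + replaceAll($RawContent, '\n', ',') + ']'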

answered

I had a similar problem with large .txt files containing a JSON document on every line, and resolved it with a custom file-to-string Java action that returns only part of the file as a string. As input (besides the file) you give an Offset and an Amount (similar to retrieving from the database). You can set an Amount of 1 and iterate over all lines in your file, or use a larger Amount and split the resulting string into separate JSON documents with string functions. The resulting one-line string (JSON) can then be mapped with normal Mendix functionality.

    // BEGIN USER CODE
    // Required imports (at the top of the Java action file):
    // import java.io.InputStream;
    // import org.apache.commons.io.IOUtils;
    // import org.apache.commons.io.LineIterator;

    // Get an input stream for the FileDocument's contents
    InputStream fis = Core.getFileDocumentContent(getContext(), InputFile.getMendixObject());

    // Build the result incrementally; StringBuilder avoids the cost of
    // repeated string concatenation on large files
    StringBuilder result = new StringBuilder();
    String newLine = System.lineSeparator();

    // LineIterator streams the file line by line instead of loading it whole
    LineIterator it = IOUtils.lineIterator(fis, "UTF-8");
    try {
        // Counter to track which line we are on
        long lineCount = 0;

        // Stop once we have read up to line Offset + Amount - 1
        while (it.hasNext() && lineCount < Offset + Amount) {
            String line = it.nextLine();

            // Only collect lines from row number = Offset onwards
            if (lineCount >= Offset) {
                result.append(newLine).append(line);
            }
            lineCount++;
        }
    } finally {
        // Always release the underlying stream
        it.close();
    }

    // Trim the leading blank line and return
    return result.toString().trim();
    // END USER CODE

The LineIterator from Apache Commons IO doesn't keep the complete file in memory, which makes this very efficient for large files. By combining it with end-/start-transaction Java actions, we were able to reliably process multi-GB files within minutes.
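For illustration, a hedged sketch of that batching pattern (startTransaction() and endTransaction() are IContext methods; the surrounding loop, readLines, and processChunk are hypothetical placeholders):

    long offset = 0;
    final long amount = 1000;
    String chunk;
    // Read the file in chunks of 'amount' lines until nothing is left
    while (!(chunk = readLines(file, offset, amount)).isEmpty()) {
        processChunk(chunk);        // hypothetical: split + import per line
        context.endTransaction();   // commit the work done so far
        context.startTransaction(); // continue in a fresh transaction
        offset += amount;
    }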

Note that this only works for files where the JSON documents are separated by newlines (i.e., where nextLine() works).

As far as I'm aware, Mendix does not support Lined JSON out of the box, so I suppose you will have to resort to a custom solution. Hope this helps.

 

Cheers,

Daan

answered