How to trace the root cause of a java.lang.OutOfMemoryError

Hi, I have an on-premise Mendix app that stopped unexpectedly. When I start the interactive m2ee shell, it mentions that 10 critical errors were logged:

[mxtmgfs@lelvdck0070 ~]$ m2ee
INFO: The application process is running, the MxRuntime has status: running
ERROR: 10 critical error(s) were logged. Use show_critical_log_messages to view them.
ERROR: Executing get_logged_in_user_names did not succeed: result: 1, message: , caused by: Java heap space
INFO: Application Name: Task Services
m2ee(mxtmgfs): show_critical_log_messages
2022-01-12 09:26:34 - Error in execution of monitored action 'Unnamed-Action-138' (execution id: 5bc8041e-23f4-43a7-9582-ecd824c9ddf9, execution type: SCHEDULED_EVENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-01-12 09:26:34 - Error in execution of monitored action 'RegularClientAction' (execution id: 1642001159935-18, execution type: CLIENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-01-12 09:26:34 - An unhandled error occurred in the MxRuntime. - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-01-31 01:16:18 - Error in execution of monitored action 'RegularClientAction' (execution id: 1643613366487-38, execution type: CLIENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-01-31 01:16:18 - An unhandled error occurred in the MxRuntime. - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-02-01 16:01:19 - Error in execution of monitored action 'Unnamed-Action-136' (execution id: af7614a7-759f-4220-85fd-9f512e8646dd, execution type: SCHEDULED_EVENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-02-01 16:01:19 - Error in execution of monitored action '{"request":"InternalOqlTextGetRequest (depth = -1): SELECT * FROM System.BackgroundJob WHERE EndTime < $END_TIME or StartTime < $START_TIME","type":"RetrieveOQLDataTableAction"}' (execution id: dec07e3a-c69e-48f8-870f-4350abb051de, execution type: CUSTOM) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-02-01 16:01:19 - Error in execution of monitored action 'Unnamed-Action-153' (execution id: be45ec87-11d5-43e1-acab-147763f5b6cb, execution type: SCHEDULED_EVENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-02-01 16:01:19 - Error in execution of monitored action 'RegularClientAction' (execution id: 1643752825547-12, execution type: CLIENT) - Caused by: java.lang.OutOfMemoryError: Java heap space
2022-02-01 16:01:19 - An unhandled error occurred in the MxRuntime. - Caused by: java.lang.OutOfMemoryError: Java heap space
m2ee(mxtmgfs): exit

Though, when I grep for the same OutOfMemoryError in the application logs, I see the following:

[mxtmgfs@lelvdck0070 ~]$ grep -C 2 OutOfMemory mendix/myapp.log
2022-01-31 01:16:00.006 INFO - Encryption: Encrypted value is EMPTY!
2022-01-31 01:16:18.035 CRITICAL - ActionManager: Error in execution of monitored action 'RegularClientAction' (execution id: 1643613366487-38, execution type: CLIENT)
2022-01-31 01:16:18.035 CRITICAL - ActionManager: java.lang.OutOfMemoryError: Java heap space
2022-01-31 01:16:18.035 CRITICAL - M2EE: An unhandled error occurred in the MxRuntime.
2022-01-31 01:16:18.035 CRITICAL - M2EE: java.lang.OutOfMemoryError: Java heap space
2022-01-31 01:17:00.006 INFO - Encryption: Encrypted value is EMPTY!
[mxtmgfs@lelvdck0070 ~]$

The errors logged here (in myapp.log) don’t seem to match up with the OutOfMemory errors reported by m2ee’s “show_critical_log_messages” command.

Also, I wanted to trace the root cause of these errors, but I can’t find any stacktrace and I’m not sure what to do with “execution id: 1643613366487-38”.

Can you please suggest a way I can investigate the root cause of what may have triggered this?
asked
3 answers

Take a look at the following documentation:

https://docs.mendix.com/refguide/runtime-java-errors

answered

First, you need to understand the issue, so try reading this article: https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/memleaks002.html

After reading it, you should be able to understand the following:

- An OutOfMemoryError can happen for several reasons: objects are retained in memory for too long, leaving no space to create new objects, or a programming error creates a large number of objects within a short time, again leaving no memory for new objects.

- In both cases, look into your logs, metrics, etc. to see which situation applies: a prolonged build-up or a sudden rise in memory usage.

- You can also write a Java action that reports the application's memory usage at that moment (see the sketch after this list). Based on that, you can probably narrow down which functionality is causing the issue and work on that.

- You can then take action accordingly.
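
For illustration, here is a minimal sketch of the kind of heap-usage reporting such a Java action could delegate to. It uses only standard JDK APIs (Runtime and MemoryMXBean); the class and method names are hypothetical and you would still need to wire this into an actual Java action in your Mendix module.

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: call report() from a Java action or around suspect
// logic to capture the current heap usage at that point in time.
public final class HeapUsageReporter {

    private static final long MB = 1024L * 1024L;

    // Builds a one-line summary of current heap usage.
    public static String report() {
        MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memoryBean.getHeapMemoryUsage();

        Runtime rt = Runtime.getRuntime();

        return String.format(
            "Heap used=%dMB committed=%dMB max=%dMB | Runtime free=%dMB total=%dMB max=%dMB",
            heap.getUsed() / MB, heap.getCommitted() / MB, heap.getMax() / MB,
            rt.freeMemory() / MB, rt.totalMemory() / MB, rt.maxMemory() / MB);
    }

    public static void main(String[] args) {
        // Standalone check; inside a Mendix Java action you would write this
        // string to the runtime log instead of printing it.
        System.out.println(report());
    }
}

Logging this periodically, or at the start and end of suspect microflows and scheduled events, makes a gradual leak show up as a steadily climbing "used" figure, while a sudden allocation spike shows up as a jump right before the error.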

answered

@Nirmalkumar/Shreyash, thanks for your replies!

I did go through the logs actually (please see above). To repeat the specific question I'm trying to ask here - 

The errors logged here don’t seem to match up with the OutOfMemory errors reported by the m2ee’s “show_critical_log_messages” command.

Also I wanted to trace the root cause for these errors, but I can’t find any stacktrace & I’m not sure what to do with “execution id: 1643613366487-38”.

Typically, in my limited experience with Java-based apps, whenever the max heap size is reached I get a complete stacktrace showing the sequence of method calls, which lets me trace the problem back to its source. But in this case, all that gets logged is the “execution id”, with no stacktrace or other information to work with. I’m looking for a way to figure out which microflow was executing, which Scheduled Event was running, or which custom Java action was running when this error was triggered.

Where can I find that stacktrace? (Or is there some logging level I need to reconfigure in order to see it?)
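
One direction I’m considering (please correct me if this is the wrong approach for an m2ee-managed runtime) is to have the JVM write a heap dump when the OutOfMemoryError occurs, using the standard HotSpot flags. Assuming the usual m2ee.yaml layout for on-premise deployments, that would look roughly like the sketch below; the existing javaopts value and the dump path are just placeholders for whatever is configured in my environment.

m2ee:
  javaopts:
    - "-Xmx1024M"                             # whatever heap size is already configured
    - "-XX:+HeapDumpOnOutOfMemoryError"       # write a .hprof snapshot when the heap is exhausted
    - "-XX:HeapDumpPath=/tmp/mendix-heapdump" # directory must exist and be writable by the app user

If that works, the resulting .hprof file could be opened in a heap-analysis tool such as Eclipse MAT to see which objects dominate the heap at the moment of the error, which, combined with the timestamps, might point to the scheduled event or microflow responsible.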

(Note: this is an on-premise app and there is no monitoring/metrics-gathering in place yet.)

answered