"GC overhead limit exceeded" error in Sandbox / Exception in execution of monitored action 'null'

Hi guys, I see that the sandbox gives me this error:

00:35:51 APP ERROR ActionManager: Exception in execution of monitored action 'null' (execution id: b0e205de-f1c4-444b-90bd-0e73512f0a27, execution type: SCHEDULED_EVENT)
00:35:51 APP ERROR ActionManager: akka.pattern.AskTimeoutException: Recipient[Actor[akka://MxRuntimeSystem/user/dispatchSupervisor#-121283537]] had already been terminated.

I have three questions about this:

  1. Would this kill the running sandbox app? I don't think so, because I see the error keep coming back in the log.
  2. When I look at my scheduled events (SEs), none of them runs every 2 minutes. So I am wondering whether this is a Mendix runtime SE or something similar. And how can I solve it?
  3. I simulated a year of orders in the app (around 80k created), and afterwards the app seemed to work well. Could I be exceeding a limit of 100 MB or something? Where can I check this for the sandbox (without exporting the data from the app)?

EDIT:

  4. I discovered that, before the error above, I get a "GC overhead limit exceeded" error, so it probably has to do with the test orders created yesterday. What does this error mean? (I already looked, but the information was too scattered for me to find a solution.) Is there a way to solve this for the free sandbox version while keeping the test data in it?

After deploying to the sandbox I can use the app for several minutes, and after X minutes I get the access error "Sign in failed (category: sign in form)..."

10:00:39 app CRITICAL ActorSystem: exception on LARS’ timer thread
10:00:39 app CRITICAL ActorSystem: java.lang.OutOfMemoryError: GC overhead limit exceeded
10:00:39 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
10:00:39 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
10:00:39 app INFO at java.lang.Thread.run(Thread.java:745)
10:00:39 app INFO ActorSystem: starting new LARS thread
10:00:39 app CRITICAL ActorSystem: Uncaught fatal error from thread [MxRuntimeSystem-scheduler-1] shutting down ActorSystem [MxRuntimeSystem]
10:00:39 app CRITICAL ActorSystem: java.lang.OutOfMemoryError: GC overhead limit exceeded
10:00:39 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
10:00:39 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
10:00:39 app INFO at java.lang.Thread.run(Thread.java:745)
10:00:40 app ERROR WebServices: Exception occurred while processing webservice request
10:00:40 app ERROR WebServices: com.mendix.integration.WebserviceException: Internal server error

Update: I see that the app runs normally for around 17 minutes; then the GC error appears first, followed by a fatal error:

12:34:15 app INFO WARNING:m2ee:Runtime is being started in Development Mode. Set DEVELOPMENT_MODE to "false" (currently "true") to set it to production.
12:34:15 app WARNING Runtime is being started in Development Mode. Set DEVELOPMENT_MODE to "false" (currently "true") to set it to production.
12:34:15 app INFO S3 config detected, activating external file store
12:34:15 app INFO INFO:m2ee:S3 config detected, activating external file store
12:34:15 app INFO INFO:m2ee:Successfully updated backup service
12:34:15 app INFO Successfully updated backup service
12:34:15 app INFO INFO:m2ee:Trying to start the MxRuntime...
12:34:15 app INFO Trying to start the MxRuntime...
12:34:15 app INFO Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128M; support was removed in 8.0
12:34:18 app INFO Logging: Logging to file: /home/vcap/app/log/out.log, max size: 2048KiB, max rotation: 10
12:34:18 app INFO Core: Mendix Runtime 6.3.1 (build 7603). Copyright © 2003-2016 Mendix bv. All rights reserved.
12:34:21 app INFO Core: Storage service: S3 storage, bucket location: cloud-foundry-shared-sandbox-prod
12:34:21 app INFO Core: Clustering is disabled
12:34:21 app INFO ConnectionBus: Database: PostgreSQL 9.4.1, name: '0fbb7f03-30f0-422c-a1ab-c2351d10fa23'
12:34:21 app INFO Driver: PostgreSQL Native Driver PostgreSQL 9.4 JDBC4.1 (build 1206)
12:34:22 app INFO Core: Project company name is 'Mendix'
12:34:22 app INFO Core: License expiration date is 'null'
12:34:22 app INFO Core: License type is: 'Perpetual'
12:34:22 app INFO Core: Running after-startup-action...
12:34:22 app INFO AppCloudServices: Starting OpenId handler ... OpenIDReturnURL = https://app.mxapps.io/openid/callback; OpenIdProvider: https://mxid2.mendixcloud.com/mxid2/discover
12:34:22 app INFO Apr 04, 2016 10:34:22 AM org.openid4java.server.RealmVerifier setEnforceRpId
12:34:22 app WARNING RP discovery / realm validation disabled;
12:34:22 app INFO Apr 04, 2016 10:34:22 AM org.openid4java.discovery.Discovery discover
12:34:22 app INFO Starting discovery on URL identifier: https://mxid2.mendixcloud.com/mxid2/discover
12:34:23 app INFO Apr 04, 2016 10:34:23 AM org.openid4java.discovery.yadis.YadisResolver discover
12:34:23 app INFO Yadis discovered 1 endpoints from: https://mxid2.mendixcloud.com/mxid2/discover
12:34:23 app INFO Apr 04, 2016 10:34:23 AM org.openid4java.discovery.Discovery discover
12:34:23 app INFO Discovered 1 OpenID endpoints.
12:34:23 app INFO Apr 04, 2016 10:34:23 AM org.openid4java.consumer.ConsumerManager associate
12:34:23 app INFO Trying to associate with https://mxid2.mendixcloud.com/mxid2/ attempts left: 4
12:34:23 app INFO Apr 04, 2016 10:34:23 AM org.openid4java.consumer.ConsumerManager associate
12:34:23 app INFO Associated with https://mxid2.mendixcloud.com/mxid2/ handle: 1454396801984-S-156053c7-4d14-4dc0-b769-68bb6719a6a4
12:34:23 app INFO AppCloudServices: Starting OpenId handler ... DONE
12:34:23 app INFO Core: Successfully ran after-startup-action.
12:34:23 app INFO ActionManager: Scheduling CommunityCommons.releaseOldLocks every 5 minutes, starting from 2011-01-27 09:58:53 (UTC)
12:34:23 app INFO ActionManager: Scheduling EmailTemplate.SendQueuedEmails every 1 hour, starting from 2015-04-10 00:00:00 (UTC)
12:34:23 app INFO ActionManager: Scheduling EmailTemplate.Cleanup every 1 week, starting from 2015-04-10 01:00:00 (UTC)
12:34:23 app INFO ActionManager: Scheduling App.SE_TelVoorraadProductProductCategorien every 1 hour, starting from 2016-03-03 12:20:04 (UTC)
12:34:23 app INFO Core: Mendix Runtime successfully started, the application is now available.
12:34:24 app INFO INFO:m2ee:The MxRuntime is fully started now.
12:34:24 app INFO The MxRuntime is fully started now.
12:34:24 app INFO INFO:m2ee:Ensuring admin user credentials
12:34:24 app INFO Ensuring admin user credentials
12:34:24 app INFO Core: XAS instance {5ca327fd-c67a-4952-8738-91324fbfcd84} expired, removing this instance now
12:34:24 app INFO INFO:m2ee:Model version: 1.0.0.115
12:34:24 app INFO Model version: 1.0.0.115
12:34:24 app INFO The remote debugger is now enabled, the password to use is c1968dca@M36c75b9f1bd02ead00
12:34:24 app INFO You can use the remote debugger option in the Mendix Business Modeler to connect to the /debugger/ sub url on your application (e.g. https://app.example.com/debugger/).
12:34:24 app INFO INFO:m2ee:The remote debugger is now enabled, the password to use is c1968dca@M36c75b9f1bd02ead00
12:34:24 app INFO INFO:m2ee:You can use the remote debugger option in the Mendix Business Modeler to connect to the /debugger/ sub url on your application (e.g. https://app.example.com/debugger/).
12:37:46 app INFO WebUI: Anonymous user 'Anonymous_c6da2ee4-f905-4b71-bc96-8bce9d68b326' created (Number of concurrent sessions: 1).
13:00:37 app INFO SLF4J: Failed to load class "org.slf4j.impl.StaticMDCBinder".
13:00:37 app INFO SLF4J: Defaulting to no-operation MDCAdapter implementation.
13:00:37 app INFO SLF4J: See http://www.slf4j.org/codes.html#no_static_mdc_binder for further details.
**13:00:37 app CRITICAL ActorSystem: exception on LARS’ timer thread**
**13:00:37 app CRITICAL ActorSystem: java.lang.OutOfMemoryError: GC overhead limit exceeded**
13:00:37 app INFO at akka.dispatch.AbstractNodeQueue.(AbstractNodeQueue.java:22)
13:00:37 app INFO at akka.actor.LightArrayRevolverScheduler$TaskQueue.(Scheduler.scala:443)
13:00:37 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
13:00:37 app INFO at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
13:00:37 app INFO at java.lang.Thread.run(Thread.java:745)
13:00:37 app INFO ActorSystem: starting new LARS thread
**13:00:37 app CRITICAL ActorSystem: Uncaught fatal error from thread [MxRuntimeSystem-scheduler-1] shutting down ActorSystem [MxRuntimeSystem]**
**13:00:37 app CRITICAL ActorSystem: java.lang.OutOfMemoryError: GC overhead limit exceeded**
13:00:37 app INFO at akka.dispatch.AbstractNodeQueue.(AbstractNodeQueue.java:22)
asked
1 answer

I've run into this a fair bit lately.

Unfortunately, this error does kill your app (and kills it good). It breaks the actor system, which stays broken until you restart the environment.

Mind you, a GC overhead error is not exactly the same as an OutOfMemory error:

Cause: The detail message "GC overhead limit exceeded" indicates that the garbage collector is running all the time and the Java program is making very slow progress. After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection and if it is recovering less than 2% of the heap and has been doing so for the last 5 (compile time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. This exception is typically thrown because the amount of live data barely fits into the Java heap, leaving little free space for new allocations.

Action: Increase the heap size. The java.lang.OutOfMemoryError exception for GC overhead limit exceeded can be turned off with the command line flag -XX:-UseGCOverheadLimit.
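To see whether a runtime is heading toward this condition, the two quantities in that description (time spent in GC and heap occupancy) can be sampled from inside the JVM. Below is a minimal sketch using the standard `java.lang.management` beans. The class name `GcPressureCheck` and its helpers are hypothetical, and it assumes you can run Java code in the same JVM, which is more likely locally than in the sandbox. It only approximates the JVM's internal 98%/2% check; it does not reproduce it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples GC time and heap occupancy of the current JVM.
final class GcPressureCheck {

    /** Fraction (0.0 to 1.0) of elapsed wall time that was spent in GC between two samples. */
    static double gcTimeFraction(long gcMillisBefore, long gcMillisAfter, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) (gcMillisAfter - gcMillisBefore) / elapsedMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    /** Accumulated collection time in milliseconds, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Used heap divided by maximum heap, or 0.0 when the maximum is undefined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when undefined
        return max > 0 ? (double) heap.getUsed() / max : 0.0;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Thread.sleep(1000); // sampling interval, chosen arbitrarily for the sketch
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        double gcFraction = gcTimeFraction(gcBefore, totalGcMillis(), elapsedMillis);
        System.out.printf("GC time: %.1f%%, heap used: %.1f%%%n",
                gcFraction * 100, heapUsedFraction() * 100);
    }
}
```

A GC fraction that stays close to 100% while heap usage stays close to 100% is the pattern the quoted description refers to.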

The solution tends to be similar, however. Any logic that touches a large amount of data should run in batches, as Ronald suggests, so that each block of memory can be released to the GC before the next one is loaded. An alternative would be to increase the size of the Java heap, though that may not be an option in the Sandbox.
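As an illustration of the batching idea, here is a minimal, generic Java sketch. It is not Mendix API code: `BatchProcessor`, `processInBatches` and `fakeOrderIds` are hypothetical names, and the `fetchPage` function stands in for whatever paged retrieve (offset plus limit) your own logic uses. The point is only that each batch goes out of scope before the next one is fetched, so the collector can reclaim it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical helper: walks a large data set page by page instead of loading it at once.
final class BatchProcessor {

    /**
     * Calls fetchPage(offset, batchSize) repeatedly and hands every non-empty page to the handler.
     * Returns the total number of items processed.
     */
    static <T> int processInBatches(BiFunction<Integer, Integer, List<T>> fetchPage,
                                    int batchSize,
                                    Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int offset = 0;
        int processed = 0;
        while (true) {
            List<T> batch = fetchPage.apply(offset, batchSize);
            if (batch.isEmpty()) {
                break; // nothing left
            }
            handler.accept(batch);
            processed += batch.size();
            offset += batch.size();
            if (batch.size() < batchSize) {
                break; // short page means this was the last one
            }
            // 'batch' is reassigned on the next iteration, so the old page becomes garbage
        }
        return processed;
    }

    /** Stand-in data source for the demo: produces order ids 0..total-1 one page at a time. */
    static List<Integer> fakeOrderIds(int total, int offset, int limit) {
        List<Integer> page = new ArrayList<>();
        for (int id = offset; id < Math.min(offset + limit, total); id++) {
            page.add(id);
        }
        return page;
    }

    public static void main(String[] args) {
        int processed = BatchProcessor.<Integer>processInBatches(
                (offset, limit) -> fakeOrderIds(80_000, offset, limit),
                1_000,
                batch -> { /* per-batch work would go here */ });
        System.out.println("Processed " + processed + " orders");
    }
}
```

Advancing by offset assumes the handler does not remove items from the underlying query result; if it does, the offset should stay at zero instead.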

You indicate the app fails consistently after some 17 minutes.

  1. Can you replicate this in your local environment? Note: you can adjust your local VM memory settings to mirror those of the sandbox (keep in mind that the heap space tends to be 50% of the allocated memory).
  2. Does this happen regardless of user activity on the application?
  3. Are there any scheduled events (SEs) that could be a logical cause, and can you enable logging for them to see whether any of them coincides with the crashes?
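For the third point, once logging is on, the lines leading up to the first crash can be pulled out of the log mechanically. The sketch below assumes every log line starts with an `HH:mm:ss` timestamp, as in the log posted in the question, and that the excerpt does not cross midnight. `CrashWindow` and its methods are hypothetical names, not part of any Mendix tooling.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists the log lines written shortly before the first GC overhead error.
final class CrashWindow {

    static final String MARKER = "GC overhead limit exceeded";

    /** Parses the leading HH:mm:ss timestamp, or returns null when the line has none. */
    static LocalTime timeOf(String line) {
        if (line.length() < 8) {
            return null;
        }
        try {
            return LocalTime.parse(line.substring(0, 8));
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /** Returns the lines logged within 'window' before the first line containing MARKER. */
    static List<String> linesBeforeFirstCrash(List<String> lines, Duration window) {
        List<String> result = new ArrayList<>();
        LocalTime crash = null;
        for (String line : lines) {
            if (line.contains(MARKER)) {
                crash = timeOf(line);
                break;
            }
        }
        if (crash == null) {
            return result; // no crash in this excerpt, or the crash line has no timestamp
        }
        LocalTime from = crash.minus(window); // assumption: the window does not cross midnight
        for (String line : lines) {
            if (line.contains(MARKER)) {
                break; // stop at the crash itself
            }
            LocalTime t = timeOf(line);
            if (t != null && !t.isBefore(from) && !t.isAfter(crash)) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "12:37:46 app INFO WebUI: Anonymous user created",
                "13:00:37 app INFO SLF4J: Defaulting to no-operation MDCAdapter implementation.",
                "13:00:37 app CRITICAL ActorSystem: java.lang.OutOfMemoryError: GC overhead limit exceeded");
        for (String line : linesBeforeFirstCrash(lines, Duration.ofMinutes(5))) {
            System.out.println(line);
        }
    }
}
```

If the same SE shows up in that window before every crash, it is a good candidate to look at first.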
answered