Application hosted in AWS went down - No usage or minimum usage

Hi experts, we are facing issues with an application hosted in AWS. The application goes down, and when we checked the logs for the root cause, we found that the system was either unused or used by only a few users at the time (in some cases 1 or 2 users). According to the application's monitoring graphs, DB connections and CPU utilization were low. The system goes unused for some hours and then goes down. We are not sure why this happens when there is no usage. Has anyone faced a similar problem?
asked
2 answers

I think it's really weird that there is nothing in your log. I run a number of applications on AWS Fargate, and I send the logging to Splunk. When my app crashes, I always have some log line to work with. You should check your logs and see what is logged when your app goes down. If you don't have a log message, it's really hard to understand what's going on.

There are a few types of common errors you could search your log for, the first three documented here:

  • java.lang.OutOfMemoryError: Java heap space

  • java.lang.OutOfMemoryError: GC overhead limit exceeded

  • java.lang.StackOverflowError

  • Trial license exceeded

The fourth error should result in the log message below:

INFO - Core: Maximum run time exceeded, framework is now terminating
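Searching for these messages can be scripted. Below is a minimal sketch, assuming the log is available as plain text lines; the function name `scan_log` and the `KNOWN_ERRORS` list are hypothetical, and the substrings are copied from the messages above rather than from any official list.

```python
from typing import Iterable, List, Tuple

# Substrings copied from the messages listed in this answer.
# This is not an exhaustive or official list; extend it as needed.
KNOWN_ERRORS = [
    "java.lang.OutOfMemoryError: Java heap space",
    "java.lang.OutOfMemoryError: GC overhead limit exceeded",
    "java.lang.StackOverflowError",
    "Maximum run time exceeded, framework is now terminating",
]


def scan_log(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, matched_pattern) for every known error found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern in KNOWN_ERRORS:
            if pattern in line:
                hits.append((number, pattern))
    return hits
```

To use it on a file, open the file and pass the handle in, for example `scan_log(open("application.log"))`, where `application.log` is a placeholder for wherever your logs end up.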

Recently, I have encountered a new error when running on AWS Fargate. In my log, I see the line:

INFO: MENDIX-METRICS: {"jvm": {"crash": 1.0}, "version": "1.0", "timestamp": "2020-10-15T07:24:03.672159"}

I have no idea what caused this issue. It has only occurred twice, on a test environment, so I haven't investigated it.
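If you want to check whether your own logs contain the same crash metric, a small parser can flag it. This is only a sketch: the payload layout is inferred from the single line quoted above, not from a documented schema, and the names `parse_metrics_line` and `is_jvm_crash` are hypothetical.

```python
import json
from typing import Optional

# Marker text as it appears in the quoted log line.
MARKER = "MENDIX-METRICS: "


def parse_metrics_line(line: str) -> Optional[dict]:
    """Return the JSON payload of a MENDIX-METRICS line, or None if absent or invalid."""
    _, found, payload = line.partition(MARKER)
    if not found:
        return None
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def is_jvm_crash(line: str) -> bool:
    """True if the line reports a non-zero jvm.crash value (assumed layout)."""
    metrics = parse_metrics_line(line)
    if metrics is None:
        return False
    jvm = metrics.get("jvm")
    if not isinstance(jvm, dict):
        return False
    crash = jvm.get("crash", 0)
    return isinstance(crash, (int, float)) and crash > 0
```

Running `is_jvm_crash` over each log line gives you the timestamps of crashes, which you can then compare against the idle periods you describe.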

answered

Hi Rom, thanks for your response. Mythili and I work on the same team.

The issue is: 

- No memory leaks.

- EC2 and RDS CPU usage looks normal (below 10%).

- DB connections look normal.

- Nothing strange was found in syslog, except that the process is being terminated. We are not sure why.

- If there is very little or no usage for a few hours (let's say 10 hours), the system goes down. That conclusion may also be wrong; we are not sure. It is only what we have observed.

We also involved the Mendix expert team, but after a couple of meetings the conclusion is the same: they do not know either. We are not sure why the instances go down, and it happens in a different data center every time.

Indeed, it's really weird.

That's why we brought this to the forum, hoping that someone who has faced this might be able to help us.

 

answered