Hello Oliver,
Could I ask for more information?
- Status of other objects in the cluster: PVCs, ReplicaSets, etc.
- Was the ingress applied?
- The gateway pod is running, correct? All other pods are fine, from what I understand.
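For example, the output of commands along these lines would already help (a sketch; <namespace> is a placeholder for whichever namespace the IEM workloads were deployed into):

# <namespace> is a placeholder: replace it with your actual namespace
kubectl get pvc,replicasets,statefulsets -n <namespace>
kubectl get ingress -n <namespace>
kubectl get pods -n <namespace> -o wide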
Hello Carlos,
Thank you for your quick response and helpful hints!
Regarding the ingress: You were absolutely right! The ingress-nginx-controller pod was indeed not running. I've increased the allocated memory, and it is now running correctly.
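In case the detail is useful: raising the memory limit on the controller workload itself would look roughly like this (only a sketch; I'm assuming ingress-nginx runs as a DaemonSet named ingress-nginx-controller, as the pod name suggests, and 512Mi is just an example value):

# assumption: the controller is a DaemonSet named ingress-nginx-controller; 512Mi is an example value
kubectl -n <namespace> set resources daemonset ingress-nginx-controller --limits=memory=512Mi
kubectl -n <namespace> rollout status daemonset ingress-nginx-controller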
Here is the current status of all pods in the cluster:
NAME                                            READY   STATUS             RESTARTS
app-config-service-6f858dc4b7-l29vd             1/1     Running            1 (145m ago)
app-manager-768cfd7b65-sr62g                    1/1     Running            1 (145m ago)
auth-service-7dd4967b85-mcfn9                   1/1     Running            5 (145m ago)
auth-service-v2-775449d4b9-cqxrg                1/1     Running            1 (145m ago)
chartmuseum-0                                   1/1     Running            1 (145m ago)
edge-manager-0                                  1/1     Running            2 (145m ago)
edgeeye-config-service-0                        1/1     Running            1 (145m ago)
edgeeye-fluentbit-collector-6kqq2               0/1     Init:0/1           0
edgeeye-fluentbit-watcher-7b676fd4cc-q8bht      0/1     Unknown            0
firmwaremanagement-minio-76459c6f8-v6dqc        1/1     Running            1 (145m ago)
firmwaremanager-ui-bdbf6dc7d-vzbjm              1/1     Running            1 (145m ago)
ie2328-gateway-56885c9768-9x9cz                 2/2     Running            2 (145m ago)
ie2328-gateway-init-migrations-wdj9c            0/1     Completed          0
iels-deployment-7957565c69-2gpwl                1/1     Running            1 (145m ago)
iess-s3-statefulset-0                           5/5     Running            5 (145m ago)
iess-statefulset-0                              1/1     Running            1 (145m ago)
influxdb-0                                      1/1     Running            1 (145m ago)
ingress-nginx-controller-klcxs                  1/1     Running            0
keycloak-0                                      1/1     Running            1 (145m ago)
launchpad-c499cbf-hgnws                         1/1     Running            1 (145m ago)
notification-service-0                          1/1     Running            1 (145m ago)
onpremisedevicetypemanagement-cc4cf8746-c6mjz   0/1     Init:1/2           0
onpremisefirmwaremanagement-68cb6b4845-8l9r2    0/1     Init:0/3           0
portal-service-0                                1/1     Running            1 (145m ago)
portal-ui-78c546bbcf-5fs7v                      1/1     Running            1 (145m ago)
portal-wss-7b5d456df4-568fp                     1/1     Running            1 (145m ago)
portalhub-0                                     1/1     Running            1 (145m ago)
postgres-0                                      1/1     Running            1 (145m ago)
postgres-iedevicecatalog-0                      1/1     Running            1 (145m ago)
regsecgen-29242680-bz4cf                        0/1     Completed          0
tunnel-server-0                                 1/1     Running            1 (145m ago)
twin-service-546fbdb579-mrdl9                   0/1     Init:0/1           0
twin-service-546fbdb579-rvr9t                   0/1     Init:0/1           0
wfx-7cdddb8fc4-j72m6                            0/1     CrashLoopBackOff   68 (4m7s ago)
wfx-qx-54b958c76f-lfrbz                         0/1     CrashLoopBackOff   56 (3m50s ago)
workflowexecutor-6db5f4984d-qzpnv               1/1     Running            4 (35m ago)
As you can see, the ie2328-gateway pod is running (2/2 Ready), and my IEM Instance is indeed shown as online in the Hub.
However, the core issue persists: the wfx pods (wfx-7cdddb8fc4-j72m6 and wfx-qx-54b958c76f-lfrbz) are still in a CrashLoopBackOff state. Their logs continue to show problems connecting to the PostgreSQL databases.
For example, a typical error from their logs is:
{"level":"error","module":"pgx","err":"failed to connect to `host=postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local user=postgres database=wfx`: hostname resolving error (lookup postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local: Try again)","time":"2025-08-07T13:27:40Z","caller":"github.com/jackc/pgx/v4@v4.14.1/log/zerologadapter/adapter.go:100","message":"connect failed"}
{"level":"error","error":"failed to connect to `host=postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local user=postgres database=wfx`: hostname resolving error (lookup postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local: Try again)","time":"2025-08-07T13:27:40Z","caller":"code.siemens.com/swupdate/wfx/cmd/wfx-server/main.go:132","message":"Failed to create backend"}
{"level":"fatal","error":"failed to connect to `host=postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local user=postgres database=wfx`: hostname resolving error (lookup postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local: Try again)","time":"2025-08-07T13:27:40Z","caller":"code.siemens.com/swupdate/wfx/cmd/wfx-server/main.go:104","message":"Fatal error, bailing out"}
Do you have any further suggestions on how to troubleshoot the database initialization process within the PostgreSQL pod, or how to ensure the wfx pods can correctly connect and migrate their schemas?
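Would it also make sense to test cluster DNS directly? I was thinking of something along these lines (only a sketch; the namespace iem-anlagenmodell and the service name are taken from the hostname in the error message, and busybox is just a convenient test image):

# namespace and service name inferred from the FQDN in the wfx error
kubectl -n iem-anlagenmodell get svc postgres-iedevicecatalog
# one-off pod to test in-cluster name resolution
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup postgres-iedevicecatalog.iem-anlagenmodell.svc.cluster.local
# CoreDNS health (labeled k8s-app=kube-dns in k3s)
kubectl -n kube-system get pods -l k8s-app=kube-dns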
Thanks again for your continued support!
Hello Oliver,
Good to know that it has improved.
Regarding your other points, could I ask that you verify that all the general requirements are met?
https://docs.eu1.edge.siemens.cloud/get_started_and_operate/industrial_edge_management/how_to_setup_operate/pro/overview/general-reqs.html
I suspect that pods might be getting stuck due to a lack of resources.
Other than that, could you also check the logs for the other problematic pods? What type of error do they display?
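For example, something along these lines (the pod names are taken from your listing; replace <namespace> with the namespace you deployed into, and kubectl top only works if metrics-server is available):

# events and init container names for a stuck pod
kubectl -n <namespace> describe pod twin-service-546fbdb579-mrdl9
# logs of the previous (crashed) wfx container
kubectl -n <namespace> logs wfx-7cdddb8fc4-j72m6 --previous
# overall resource picture on the node(s); kubectl top needs metrics-server
kubectl describe nodes | grep -A 8 "Allocated resources"
kubectl top nodes

The describe output also lists the init container names, so you can then fetch their logs with kubectl logs <pod> -c <init-container>.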
Hello Carlos,
Thanks for the link. My device does meet the general requirements, but I'm not entirely sure whether those resources are actually available to the k3s environment.
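For what it's worth, I assume the resources the cluster actually sees could be checked with something like this (just a sketch of what I had in mind):

# what the node reports as usable to the cluster
kubectl describe node | grep -A 6 "Allocatable"
kubectl get nodes -o wide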
To verify, I re-ran the installation using Minikube instead of k3s, and in that setup everything worked successfully.
To me, it looks like this was an issue specific to my environment when using k3s. Thanks a lot for your support!
Best regards,
Oliver