We just upgraded our MT OS in a remote cloud deployment to 24R2.
After the OS reboot, the MT was completely out of service: all the pods were in Unknown status,
so I followed the MT delete and MT install procedure, and now everything is up and running again.
Is there any way to properly shut down the Kubernetes pods to avoid this issue?
Best regards
This works for me. Run it against the MT namespace:
cd <drive:>\IFS\MWS\<build-home>\ifsroot\deliveries\1-build-home\ifsinstaller
.\mtctl stop --namespace <namespace>
Hello, @MCIPSTEV
After upgrading the OS, components like systemd and the kernel can behave differently. This can break the communication between the kubelet and the container runtime, which results in all pods showing an Unknown status, just like in your screenshot.
In some cases, the upgrade can also impact CNI plugins like Calico or Flannel, either removing them or breaking their configuration, which prevents pods from starting or getting network access.
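If you suspect the CNI configuration was affected, one quick sanity check is to confirm that the conflist files under /etc/cni/net.d still parse and still declare the expected plugins. A minimal sketch, assuming the standard CNI .conflist layout; the file content, path, and plugin names below are illustrative, not taken from this system:

```python
import json

def cni_plugins(conflist_text: str) -> list[str]:
    """Return the plugin types declared in a CNI .conflist document."""
    doc = json.loads(conflist_text)
    return [p.get("type", "?") for p in doc.get("plugins", [])]

# Illustrative conflist content; a real file would live under /etc/cni/net.d/.
sample = """
{
  "cniVersion": "0.3.1",
  "name": "k8s-pod-network",
  "plugins": [
    {"type": "calico", "datastore_type": "kubernetes"},
    {"type": "portmap", "capabilities": {"portMappings": true}}
  ]
}
"""
print(cni_plugins(sample))  # prints ['calico', 'portmap']
```

If the file is missing or fails to parse after the upgrade, that would explain pods failing to get network access.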
To avoid this kind of issue going forward, it's a good idea to properly stop the Middleware Tier before a reboot or upgrade. You can use the mtctl.cmd tool for that; it allows you to gracefully stop and start the pods:
.\mtctl.cmd stop -n test
.\mtctl.cmd start -n test
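For an unattended reboot, the stop command above can be wrapped in a small pre-reboot script that waits until the pods are actually gone before letting the reboot proceed. This is only a sketch: the mtctl invocation, namespace, and timeout are assumptions for illustration, and the parsing helper assumes the plain table output of `kubectl get pods`.

```python
import subprocess
import time

def pods_remaining(kubectl_output: str) -> int:
    """Count pod rows in `kubectl get pods` table output (header excluded)."""
    lines = [l for l in kubectl_output.strip().splitlines() if l.strip()]
    if lines and lines[0].startswith("NAME"):
        lines = lines[1:]
    return len(lines)

def stop_mt(namespace: str = "test", timeout_s: int = 600) -> bool:
    """Gracefully stop the MT, then poll until the namespace has no pods left."""
    # Assumed invocation; adjust the path to mtctl.cmd for your installation.
    subprocess.run(["mtctl.cmd", "stop", "-n", namespace], check=True)
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        out = subprocess.run(
            ["kubectl", "get", "pods", "-n", namespace, "--no-headers"],
            capture_output=True, text=True,
        ).stdout
        if pods_remaining(out) == 0:
            return True  # all pods terminated; safe to reboot
        time.sleep(10)
    return False  # pods still running; investigate before rebooting
```

Only trigger the OS reboot when `stop_mt()` returns True, so the container runtime never gets killed underneath running pods.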
In our particular case, the underlying pods are not in good shape, so
.\mtctl.cmd stop -n test
.\mtctl.cmd start -n test
were not sufficient.
Do I need to reinstall the MT ?
Hi Steveninck,
Please perform a "describe" on the pods. It may give a better indication of why they have an Unknown status. Also check the output for the reason the previous pod restarted.
kubectl describe pod <pod name> -n ifs-ingress
You can also perform a "describe" on the node itself. It may also have additional clues.
Best regards -- Ben
Please find below the output :
PS E:\ifsroot\deliveries\build-home\ifsinstaller> kubectl describe pod ingress-ingress-nginx-controller-npntp -n ifs-ingress
Name:                 ingress-ingress-nginx-controller-npntp
Namespace:            ifs-ingress
Priority:             20000000
Priority Class Name:  ifs-infra-node-critical
Service Account:      ingress-ingress-nginx
Node:                 mcihpdifsl03.d29.tes.local/10.192.242.25
Start Time:           Tue, 20 May 2025 18:09:23 +0000
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=ingress
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.11.2
                      controller-revision-hash=5995844fb9
                      helm.sh/chart=ingress-nginx-4.11.2
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.192.242.25
IPs:
  IP:  10.192.242.25
Controlled By:  DaemonSet/ingress-ingress-nginx-controller
Containers:
  controller:
    Container ID:    containerd://0bdf453971a12e7b7162b9f38fc160196525fe4308651876e4979abc2a298954
    Image:           ifscloud.jfrog.io/docker/ingress-nginx/controller:v1.11.2
    Image ID:        ifscloud.jfrog.io/docker/ingress-nginx/controller@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce
    Ports:           80/TCP, 443/TCP, 10254/TCP
    Host Ports:      80/TCP, 443/TCP, 10254/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --enable-annotation-validation=true
      --default-backend-service=$(POD_NAMESPACE)/ingress-ingress-nginx-defaultbackend
      --election-id=ingress-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-ingress-nginx-controller
      --enable-ssl-passthrough
    State:          Terminated
      Reason:       Unknown
      Exit Code:    255
      Started:      Wed, 04 Jun 2025 03:14:08 +0000
      Finished:     Tue, 10 Jun 2025 03:17:36 +0000
    Ready:          False
    Restart Count:  8
    Requests:
      cpu:     100m
      memory:  90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-ingress-nginx-controller-npntp (v1:metadata.name)
      POD_NAMESPACE:  ifs-ingress (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qs2gq (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  kube-api-access-qs2gq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason          Age                       From     Message
  ----    ------          ----                      ----     -------
  Normal  SandboxChanged  2m57s (x1296 over 4h43m)  kubelet  Pod sandbox changed, it will be killed and re-created.
The final line indicates: "Pod sandbox changed, it will be killed and re-created." A quick search indicated multiple causes: 1) network problems, 2) insufficient host memory or CPU. You can perform a describe on the node itself to view the memory and CPU resources.
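One way to spot resource pressure quickly is to scan the Conditions table in the `kubectl describe node` output for any pressure condition reported as True. A small illustrative parser; the sample text below is made up, not taken from this system:

```python
def pressure_conditions(describe_node_output: str) -> list[str]:
    """Return the names of node pressure conditions reported as True."""
    flagged = []
    for line in describe_node_output.splitlines():
        parts = line.split()
        # Condition rows look like: "MemoryPressure   True" in the table.
        if len(parts) >= 2 and parts[0].endswith("Pressure") and parts[1] == "True":
            flagged.append(parts[0])
    return flagged

# Illustrative excerpt of a Conditions table from `kubectl describe node`.
sample = """\
Conditions:
  Type             Status
  MemoryPressure   True
  DiskPressure     False
  PIDPressure      False
  Ready            True
"""
print(pressure_conditions(sample))  # prints ['MemoryPressure']
```

If MemoryPressure or DiskPressure is True, the kubelet will evict or refuse to start pods, which matches the repeated sandbox re-creation seen above.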
Best regards -- Ben
It was just an unattended reboot; the network has not changed and the memory is still the same. We are going to upgrade the OS and reinstall the MT.
Thanks for your help.
At this stage, we have reinstalled the MT and it is working again. I'm looking for a procedure on how to correctly shut down the Kubernetes pods, and the underlying containers, so we can script it before a reboot.