
We just upgraded the OS of our MT (Middleware Tier) server in a remote deployment on 24R2.

After the OS reboot, the MT is completely out of service: all the pods are in Unknown status.

So I followed the procedure of deleting the MT and running the MT installer again, and now everything is up and running.

Is there any way to properly shut down the Kubernetes pods to avoid this issue?

Best regards

This works for me … run it against the namespace:

cd  <drive:>\IFS\MWS\<build-home>ifsroot\deliveries\1-build-home\ifsinstaller

.\mtctl stop --namespace <namespace>
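After the stop, you can check that the pods are really gone before rebooting, and start everything again afterwards (the namespace is a placeholder):

kubectl get pods --namespace <namespace>   # should eventually report no pods left
# ... reboot / OS upgrade ...
.\mtctl start --namespace <namespace>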
 


Hello @MCIPSTEV,

After upgrading the OS, components like systemd and the kernel can behave differently. This can break the communication between the kubelet and the container runtime, which results in all pods showing an Unknown status, just like in your screenshot.

In some cases, the upgrade can also impact CNI plugins like Calico or Flannel, either removing them or breaking their configuration, which prevents pods from starting or getting network access.
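If you suspect the CNI side, a quick check (hedged, since the exact pod names depend on which CNI the cluster uses, e.g. calico-node or flannel) is to confirm the system pods are Running and the node reports Ready:

kubectl get pods -n kube-system -o wide
kubectl get nodes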

To avoid this kind of issue going forward, it's a good idea to properly stop the Middleware Tier before a reboot or upgrade. You can use the mtctl.cmd tool for that; it allows you to gracefully stop and start the pods. There's a helpful section on its usage here:

https://docs.ifs.com/techdocs/25r1/070_remote_deploy/060_tips_managing_middle_tier/010_mtctl/#usage

Hope this helps!

Thanks,
Hardik


IFS\MWS\ifsroot\deliveries\1.0_delivery\ifsinstaller

.\mtctl.cmd stop -n test
.\mtctl.cmd start -n test


In our particular case, the underlying pods are not in good shape, so
.\mtctl.cmd stop -n test
.\mtctl.cmd start -n test

were not sufficient

Do I need to reinstall the MT?


Hi Steveninck,

Please perform a "describe" on the pods. It may give a better indication of why they have an Unknown status. Also check the output for why the previous pod restarted.

kubectl describe pod <pod name> -n ifs-ingress

You can also perform a "describe" on the node itself. It may also have additional clues.
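For example (the node name is just a placeholder), something like this should surface the node conditions and the recent events around the pod restarts:

kubectl get nodes
kubectl describe node <node name>
kubectl get events -n ifs-ingress --sort-by=.lastTimestamp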

Best regards -- Ben


Please find the output below:

PS E:\ifsroot\deliveries\build-home\ifsinstaller> kubectl describe pod ingress-ingress-nginx-controller-npntp -n ifs-ingress
Name:                 ingress-ingress-nginx-controller-npntp
Namespace:            ifs-ingress
Priority:             20000000
Priority Class Name:  ifs-infra-node-critical
Service Account:      ingress-ingress-nginx
Node:                 mcihpdifsl03.d29.tes.local/10.192.242.25
Start Time:           Tue, 20 May 2025 18:09:23 +0000
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=ingress
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.11.2
                      controller-revision-hash=5995844fb9
                      helm.sh/chart=ingress-nginx-4.11.2
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.192.242.25
IPs:
  IP:  10.192.242.25
Controlled By:        DaemonSet/ingress-ingress-nginx-controller
Containers:
  controller:
    Container ID:    containerd://0bdf453971a12e7b7162b9f38fc160196525fe4308651876e4979abc2a298954
    Image:           ifscloud.jfrog.io/docker/ingress-nginx/controller:v1.11.2
    Image ID:        ifscloud.jfrog.io/docker/ingress-nginx/controller@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce
    Ports:           80/TCP, 443/TCP, 10254/TCP
    Host Ports:      80/TCP, 443/TCP, 10254/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --enable-annotation-validation=true
      --default-backend-service=$(POD_NAMESPACE)/ingress-ingress-nginx-defaultbackend
      --election-id=ingress-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-ingress-nginx-controller
      --enable-ssl-passthrough
    State:           Terminated
      Reason:        Unknown
      Exit Code:     255
      Started:       Wed, 04 Jun 2025 03:14:08 +0000
      Finished:      Tue, 10 Jun 2025 03:17:36 +0000
    Ready:           False
    Restart Count:   8
    Requests:
      cpu:           100m
      memory:        90Mi
    Liveness:        http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:       http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-ingress-nginx-controller-npntp (v1:metadata.name)
      POD_NAMESPACE:  ifs-ingress (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qs2gq (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  kube-api-access-qs2gq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason          Age                       From     Message
  ----    ------          ----                      ----     -------
  Normal  SandboxChanged  2m57s (x1296 over 4h43m)  kubelet  Pod sandbox changed, it will be killed and re-created.

The final line indicates: "Pod sandbox changed, it will be killed and re-created."
A quick search indicated multiple causes: 1) network problems, 2) insufficient host memory or CPU.
You can perform a describe on the node itself to view the memory and CPU resources.
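For example (the node name is a placeholder), the Conditions and Allocated resources sections of the describe output show the pressure flags and how much CPU/memory is requested; kubectl top only works if a metrics server is installed:

kubectl describe node <node name>
kubectl top node   # requires metrics-server, may not be available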

Best regards -- Ben


It was just an unattended reboot; the network has not changed and the memory is still the same.
We are going to upgrade the OS and reinstall the MT.

Thanks for your help.


At this stage, we have reinstalled the MT and it is working again.
I'm looking for a procedure on how to correctly shut down the Kubernetes pods, and the underlying pods, so we can script it before a reboot.
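Building on the mtctl commands above, a rough sketch of what such a pre-reboot script could look like (the path and namespace are only examples taken from this thread, and whether mtctl also covers the infrastructure/underlying pods is something to confirm against the IFS documentation linked earlier):

# hypothetical pre-reboot sequence, adjust path and namespace to your environment
$ns = "test"
cd E:\ifsroot\deliveries\build-home\ifsinstaller

# gracefully stop the Middleware Tier pods
.\mtctl.cmd stop -n $ns

# wait until the namespace reports no pods before rebooting the node
do {
    Start-Sleep -Seconds 10
    $pods = kubectl get pods -n $ns --no-headers 2>$null
} while ($pods)

# reboot / OS upgrade of the node happens here

# once the node and cluster are back up:
.\mtctl.cmd start -n $ns

Would something like this be enough, or do the underlying pods need a separate step?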

 

