
One of our customers initially had a PROD environment (middle tier server) with 12 CPUs and 32 GB memory, and we set the scale to 10. If we increased the scale to 20 or 30, the pods did not start, and while the customer was using the application the reporting pod crashed.
After that we increased the hardware sizing to 24 CPUs and 64 GB memory and set the scale to 100. The pods then started, the application is up and running without any issues, and the reporting pods work properly.

Now, the issue is that the customer checked the hardware utilization of the MT server, and it is using a maximum of only 24 GB of memory, with even lower CPU usage.

The customer wants a justification for the above: why is the MT server not using even 40% of the allocated memory? Can someone explain?
 

There may be a case where the new architecture optimizes process management better through parallelism or a higher level of threading, since there are more resources available to run work concurrently and in good time.


This is the way Kubernetes (k8s) works. Even if we only use 40% of the memory, we need to reserve 100% of the memory we allow the containers to grow to (their memory limit). If the k8s node only has 24 GB, the pods might start and run well for some customers, but if the pods actually need all of their memory in a high-load scenario, the k8s node will crash.
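A quick, hedged way to see that difference on a running system is to compare the memory the pods are allowed to grow to (their limits) with what they currently use - the node name and namespace below are placeholders for your environment:

# Reserved memory (requests/limits) vs. current usage; <mt-node> and <namespace> are placeholders
kubectl describe node <mt-node> | grep -A8 "Allocated resources"
kubectl top pods -n <namespace>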


Hello everyone,

We are currently facing significant system requirements, specifically 98 GB of memory. Despite running our MT for several months, we have never exceeded 10% of this requirement, leading to substantial unnecessary costs.

Currently, we are using Cloud Remote on AWS EC2 servers. We are considering transitioning to AWS managed services, specifically:

  • Amazon Elastic Kubernetes Service (EKS)
  • Amazon RDS for Oracle

Our goal is to establish a fail-safe production environment.

Although IFS supports Kubernetes (as mentioned here), there is limited information available regarding managed Kubernetes services.

Has anyone here tried using these managed services? Any insights or experiences would be greatly appreciated.

Thank you!

 


IFS does not support customers that run IFS Cloud in their own Kubernetes infrastructure - it has to be deployed with our bundled MicroK8s.
EKS is not an option. We have no plans on the roadmap to support it. Maybe in the future… who knows...


Subject: Request for Procedure to Fully Shutdown Kubernetes Before OS Update

Hello,

I would like to request a clear and detailed procedure for properly shutting down the entire Kubernetes infrastructure prior to performing an OS update. Currently, using the MCTL start/stop command only stops our customer namespace, but does not affect the underlying Kubernetes infrastructure provided by IFS.

I was unable to find relevant guidance in the IFS documentation.

Thank you in advance for your support.

Best regards,


My understanding is that an OS patch can be applied while MicroK8s is running.
After a reboot, MicroK8s will start, including all applications within it.

If you have doubts, you can stop and start microk8s as follows:

sudo microk8s stop
Patch
reboot
sudo microk8s start
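If you want an extra sanity check after the reboot, something like this should confirm that MicroK8s and the pods are back (not required, just reassurance):

sudo microk8s status --wait-ready
sudo microk8s kubectl get pods -A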
 



For a proper shutdown, do we need to run the MCTL command before the microk8s stop? I.e.:

MCTL stop
sudo microk8s stop
Patch
reboot
sudo microk8s start
MCTL start

 


Again - it should not be necessary, but if you for some reason doubt that microk8s is functioning as it's designed to, then yes, you can stop the IFS Cloud middle tier as well.

The pod statuses might look a bit cleaner if you do so. After a VM restart (without stopping microk8s or the pods) you might see restarts on the pods that one might interpret as errors, but they are not - that is just the normal way k8s shows state after a k8s restart.
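To see those restart counts yourself (and confirm the pods are otherwise Running), a plain listing is enough - <namespace> is a placeholder:

# A non-zero RESTARTS value after a VM reboot is expected here, not an error
kubectl get pods -n <namespace>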

 



 

I just did a mctl stop and start, but I still have an issue with one pod:
ifsapp-connect NOT STARTED, 1/2. How can I solve this?
 

Best Regards


We are experiencing the same issues. We have been running the MT with 10 cores and 48 GB RAM for one year without any issues.

After rebooting the MT to apply patches, some pods do not come online, specifically the reporting pods. We will increase the cores to 16 and test later today.

But the underlying problem is as described: the pods require a lot of CPU/RAM but the actual utilization is appalling.

We are running Replicas: 3 and Scale: 100



In our case: there are no users connected, the requirements are met, and it worked before the reboot.

CPU(s): 16 cores and 96 GB of RAM


Status “NOT STARTED” is strange… I thought I had seen them all :)

To see the underlying reason for that state, do a “kubectl describe pod <connect-pod>” - the last lines will show the events of the pod. This has to be done immediately after you get a state of interest - k8s events are only kept for a short time.
The event can also reside on the deployment (“kubectl describe deploy ifsapp-connect”) or, more rarely, on the replica set (“kubectl describe rs <connect rs>”).
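As a rough sketch, the triage could look like this (the pod and replica set names are placeholders):

kubectl describe pod <connect-pod> | tail -n 20        # events on the pod itself
kubectl describe deploy ifsapp-connect | tail -n 20    # events on the deployment
kubectl describe rs <connect-rs> | tail -n 20          # rarely, the event sits here
kubectl get events --sort-by=.metadata.creationTimestamp | tail -n 30   # recent events in the namespace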


The log you pasted is from the linkerd-proxy container in the ifsapp-connect pod… you need to specify the ifsapp-connect container to get the desired log: “kubectl logs <connect-pod> -c ifsapp-connect”

I agree, Kubernetes memory management is “appalling”.
To avoid OOM kills on individual nodes when some pods peak on a node, we need to reserve peak memory for each pod, even if only part of that memory is used while the pods are idle.



You might have different issues than us. But we have deactivated certain pods that we do not use, to improve performance. 

For example, in our ifscloud-values.yaml (we have deactivated a dozen pods):

ifsappextensibility:
  replicas: 0

ifsappscim:
  replicas: 0

ifsvirusscanner:
  replicas: 0

ifsappsigningservice:
  replicas: 0

ifsapprmpanel:
  replicas: 0

ifsappnativeexecutor:
  replicas: 0

ifsappnativenotification:
  replicas: 0

ifsappnativeserver:
  replicas: 0

ifsappreportingwebdesigner:
  replicas: 0

ifsappreportingwebdesignerren:
  replicas: 0

ifsappreportingwebruntimeren:
  replicas: 0

ifsappreportingwebtransformer:
  replicas: 0

ifsapppowerbisvc:
  replicas: 0

ifsappbusmod:
  replicas: 0

ifsappdoc:
  replicas: 1
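After changing the replica counts and redeploying, a quick way to confirm which deployments actually ended up scaled to zero (namespace is a placeholder) is:

# Scaled-down deployments show READY 0/0
kubectl get deploy -n <namespace>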


We reconfigured the MT servers with 16 cores and kept 48 GB RAM; after restarting the machines all the pods started, and a bit quicker than usual.


Just made a script that can be used in a runtime environment to see the used, allocated and available memory. If it doesn't work or you have questions about it - use ChatGPT 😁


------------------------------------------- podmemory.sh

#!/bin/bash

# Dump the pod descriptions so we can sum up the configured memory limits
kubectl describe pods > ./descpods.txt

# Initialize total
total_mib=0

# Extract memory limits and sum them
while read -r mem; do
    #echo "$mem"
    if [[ $mem == *Mi ]]; then
        val=${mem%Mi}
        ((total_mib += val))
    elif [[ $mem == *M ]]; then
        val=${mem%M}
        ((total_mib += val))
    fi
done < <(grep -A3 'Limits:' ./descpods.txt | grep 'memory' | awk '{print $2}')

echo "Total Allocated Memory: $total_mib Mi"

# Now sum the memory actually in use, as reported by kubectl top
total_mib=0

kubectl top pods --no-headers > ./toppods.txt

while read -r mem; do
    #echo "$mem"
    if [[ $mem == *Mi ]]; then
        val=${mem%Mi}
        ((total_mib += val))
    elif [[ $mem == *M ]]; then
        val=${mem%M}
        ((total_mib += val))
    fi
done < <(awk '{print $3}' ./toppods.txt)

echo "Total Used Memory: $total_mib Mi"

# Show node capacity and allocatable CPU/memory
kubectl describe node | grep -A6 Capacity

rm ./descpods.txt ./toppods.txt
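A typical invocation, assuming the current kubeconfig context already points at the IFS namespace (otherwise add -n <namespace> to the kubectl calls inside the script):

chmod +x podmemory.sh
./podmemory.sh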
 


Hello,

Thanks for the feedback.

 

The major issue remains: why do we need to provide HUGE amounts of memory on the MT Linux server? It has a very high cost, even for a non-production server.

 

We tried to strip down the required memory from 96 GB to 48 GB and the pods simply did not start.

Regarding the memory use:
 

Memory usage per pod in namespace 'mcicfg':
------------------------------------------------
ifs-file-storage-8f4f455b9-whnfg : 152 Mi
ifs-virus-scanner-78fb69cb6b-r9x99 : 1482 Mi
ifsapp-am-workqueue-89b5b684b-6sjm8 : 13 Mi
ifsapp-amm-5958f8b94f-67srb : 323 Mi
ifsapp-application-svc-75d6dcc866-svpss : 300 Mi
ifsapp-client-5d799fdf4d-chjkt : 10 Mi
ifsapp-client-notification-56d665d96f-ldpsw : 177 Mi
ifsapp-client-services-66fbbdc698-xt7ws : 582 Mi
ifsapp-connect-5fd8dd6d9d-jmrcb : 1355 Mi
ifsapp-doc-58f69f56f9-df6rg : 1120 Mi
ifsapp-docman-esign-7d46d6557b-28k5g : 149 Mi
ifsapp-extensibility-6d7456464d-k6vht : 85 Mi
ifsapp-iam-5f7b9cf6d9-lctk9 : 674 Mi
ifsapp-native-executor-7b985b8587-mxlhv : 196 Mi
ifsapp-native-notification-9dc7c6755-smj56 : 186 Mi
ifsapp-native-server-685d6fd57-78gbq : 717 Mi
ifsapp-odata-665c9c4666-pfzwg : 1197 Mi
ifsapp-proxy-dcd55db84-ds9xs : 21 Mi
ifsapp-reporting-5dc5b4c46-fsc4x : 488 Mi
ifsapp-reporting-br-65857d7469-x82qm : 103 Mi
ifsapp-reporting-ren-7f954d69dd-wblfq : 382 Mi
ifsapp-reporting-web-designer-6d5c95d4d9-hppx8 : 5 Mi
ifsapp-reporting-web-designer-ren-bdcd44c58-q929z : 85 Mi
ifsapp-rmpanel-bfc469479-4lfgj : 97 Mi
ifsapp-rmsync-764c597488-gfzp9 : 133 Mi
ifsapp-scim-6bcc5d8bc4-2tfdt : 120 Mi
ifsapp-signing-service-55f76477f7-gvb7b : 147 Mi
------------------------------------------------
Total memory used: 10299 Mi

 

As you can see, barely 10 GB of memory is used!! It is insane to need a server with 96 GB.
Where is it HARDCODED?
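As far as I can tell it is not hardcoded in the images: the per-pod memory requests and limits sit in the generated Kubernetes deployment specs, not on the VM itself. You can list what each pod reserves versus what it may grow to like this (namespace is a placeholder):

kubectl get pods -n <namespace> -o custom-columns='NAME:.metadata.name,MEM_REQUEST:.spec.containers[*].resources.requests.memory,MEM_LIMIT:.spec.containers[*].resources.limits.memory'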


You need to consider the specifications of the infrastructure you're using. For example, we're currently running our middleware VM on Azure using a Standard D8as v4 instance. We’ve stopped unused pods to optimize performance, and we're running IFS Cloud 24R with 2 replicas for OData pods, which works well for our setup.

Additionally, if you increase the scale value you’ll need to ensure that your VM has sufficient memory (more scale means more memory).


Is there some documentation of the purpose of each pod, so we can decide whether a pod needs to be started or not?


Please check this document - https://docs.ifs.com/techdocs/25r1/070_remote_deploy/010_installing_fresh_system/010_planning_installation/007_middle_tier/010_overview/#kubernetes


Thanks for the link, but when looking at the Linux journal, pods are just crashing randomly when out of memory, and if you are lucky and have loads of memory, you are not even aware of it.

So log on to your MT with the ifs credentials

journalctl --since "2025-08-07 11:20"      

And then you can find entries for a pod like "ifsapp-doc":


Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: GC Thread#0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_>
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: CPU: 14 UID: 8983 PID: 3002637 Comm: GC Thread#0 Not tainted 6.14.0-1010-aws #1>
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: Hardware name: Amazon EC2 r5.4xlarge/, BIOS 1.0 10/16/2017
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: Call Trace:
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: <TASK>
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: dump_stack_lvl+0x76/0xa0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: dump_stack+0x10/0x20
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: dump_header+0x48/0x220
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: oom_kill_process+0x116/0x270
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: out_of_memory+0x102/0x330
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: mem_cgroup_out_of_memory+0x13b/0x170
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: try_charge_memcg+0x40a/0x5a0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: charge_memcg+0x34/0x70
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: __mem_cgroup_charge+0x31/0xa0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: alloc_anon_folio+0x1cd/0x450
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: do_anonymous_page+0x16b/0x440
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: handle_pte_fault+0x1ce/0x1f0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: __handle_mm_fault+0x62c/0x770
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: handle_mm_fault+0x106/0x2b0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: do_user_addr_fault+0x1be/0x840
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: exc_page_fault+0x85/0x1a0
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: asm_exc_page_fault+0x27/0x30
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: RIP: 0033:0x72d894713050
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: Code: 4c 89 ff e8 82 95 f6 ff eb 97 48 39 f7 73 17 48 83 fa 01 75 15 0f 1f 44 0>
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: RSP: 002b:000072d88162d7b8 EFLAGS: 00010206
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: RAX: 000072d894d5d7e8 RBX: 000072d893ad9288 RCX: 0000000000000001
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: RDX: 0000000000001000 RSI: 00000000fe800000 RDI: 00000000e0291000
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: RBP: 000072d88162d7e0 R08: 0000000000000000 R09: 0000000000000000
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000072d893ad9270
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: R13: 0000000040000000 R14: 000072d89085f800 R15: 000072d893ad9b38
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: </TASK>
Aug 07 11:20:33 MCIHPDIFSL02.d29.tes.local kernel: memory: usage 1562500kB, limit 1562500kB, failcnt 446

Has anyone encountered this issue??

This means that 1 562 500 kB is hardcoded somewhere in the IFS core pod.
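A hedged interpretation: 1 562 500 kB is 1 600 000 000 bytes, which would correspond to a 1600M memory limit on the ifsapp-doc container, i.e. the cgroup limit coming from the pod's resources.limits.memory rather than anything inside the image. You should be able to read it straight off the pod spec (pod name and namespace are placeholders):

kubectl get pod <ifsapp-doc-pod> -n <namespace> -o jsonpath='{.spec.containers[*].resources.limits.memory}'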

