Question

IFS10 - CPU 100% and very slow system

5 years ago
November 11, 2019
34 replies
6954 views

SORBJAHAU
Sidekick (Customer)
55 replies

After we upgraded to IFS10 we are struggling with slowness in our Oracle DB. We haven’t been able to pinpoint the cause of our problems. But very often (5-10 times each week), we see that the CPU goes to 100% and doesn’t let go. Every user in our company are then struggling with slowness when this happens. As an example, saving a time registration can take 3-4 minutes.

Today, a restart of the DB server was necessary to get it back to normal.

We have used the query below to find CPU consuming operations, and have also checked background jobs and DMBS jobs. Nothing special has been running when the problems occur, or at least nothing that helps us to conclude on a specific problem.

select s.username,
 s.inst_id,
 s.sid,
 s.serial#,
 s.status,
 (select value from gv$sesstat natural join v$statname where name = 'physical reads' and sid=s.sid)disk_reads,
 (select value from gv$sesstat natural join v$statname where name = 'session logical reads' and sid=s.sid) logical_reads,
 (select value/100 cpu_sec from gv$sesstat natural join v$statname where name = 'CPU used by this session' and sid=s.sid)cpu_sec,
 s.logon_time,
 s.program,
 s.osuser,
 s.machine,
 p.pid,
 p.spid,
 s.client_identifier,
 s.client_info,
 s.module,
 s.action,
 s.event,
 s.wait_class,
 case
 when s.status = 'ACTIVE' then
 s.last_call_et
 else
 null
 end wait_sec,
 s.blocking_session_status,
 s.blocking_instance,
 s.blocking_session,
 s.sql_id,
 s.sql_hash_value,
 s.sql_address,
 s.prev_sql_id,
 s.prev_hash_value,
 s.prev_sql_addr,
 s.sql_child_number,
 s.prev_child_number,
 s.server
 from gv$session s join gv$process p on (s.inst_id = p.inst_id and s.paddr = p.addr)
 where s.type = 'USER' and s.status='ACTIVE'
 order by disk_reads desc

We have integrations and scheduled tasks running, all which have been running fine in IFS 8.1 for years.

After googling, we some people says that Oracle 12.2 is not as effective as 11, but we can’t verify this.

Does anyone have some tips to solve these things, or at least find the root cause when the problems occur? Appreciate all help we can get.

Some facts:

Around 5-700 concurrent users
Using Azure AD for authentication
Using around major 10 Modules in IFS (DocMan, Finance, Engineering, SCM, etc)
DB size around 600GB
Running reports with Crystal Reports (set up via web, running queries towards IFS)

Page 1 / 2

Nadeesh Herath
Hero (Employee)
74 replies
5 years ago
November 11, 2019

Presuming most of the CPU is being consumed by the Oracle service,

Have you checked the memory usage of the server whilst CPU hits 100%? also does CPU reach 100% only when certain batch jobs execute? (you can check this by querying the executing background jobs when CPU hits 100%). This should shed some light on the issue.

If there are no background jobs running and CPU hits 100% and there is no significant errors found in the Oracle alert log, I believe an Oracle Stats pack report or AWR report has to be analyzed in order for any issues.

Nadeesh Herath

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 11, 2019

@NadeeshHerath , correct. All CPU is consumed by Oracle Service. The memory usage (tot 96GB) is constantly at around 90%, I think this is normal behavior.

We checked the background jobs when the CPU hits 100% Nothing time consuming or big things was executing, so no findings there.

We discovered that the SWAP drive was not configured according to increased RAM, and setup as SSD. This has been fixed now, waiting to check if this has an effect.

We have also played with the parameter OPTIMIZER_FEATURES_ENABLE, setting this to 11.2.0.4. And turned off parallel execution. Not 100% if these settings has any effect yet. Does anyone have experience with this?

We haven’t checked the Oracle Alert log, but I would assume we would’ve heard from our DB partner company if anything is wrong there. Can check with them. I also thing we should look at the Oracle Stats pack or AWR reports.

alanbclark
Sidekick (Partner)
60 replies
5 years ago
November 11, 2019

Pinpointing these issues may take several queries and a bit of monitoring to pin down. Can you please provide the following information.

The server has 96 GB of memory installed. How much memory is being used by the “oracle.exe” process(es)? Please confirm that this server only hosts one database.
What are the values of the following Oracle parameters?
- compatible
- cpu_count
- db_cache_size
- job_queue_processes
- optimizer_deatures_enable
- optimizer_index_cost_adj
- pga_aggregate_target
- processes
- sessions
- sga_max_size
- sga_target
- shared_pool_size
- sort_area_size
How many cores are assigned to this server?
Is the server physical or virtual?
Are the table statistics current (withing the past week)?
Have the indexes been validated lately?
What is the operating system version?
Do you have a utility installed to view the threads under the “oracle.exe” process to see what type of process (worker, execution, job, etc) is consuming the CPU? All sessions run as a separate thread under the “oracle.exe” process.
If the issue went away when the database was restarted it was either a background job (IFS or database) or an interactive process launched by a user. Please comment.
What is your schedule for updating the IFS search indexes on objects? Do you have all of them enabled?
When the excessive CPU was found was the database creating large numbers of archive logs? This will let us know if the process was performing any inserts, updates, or deletions as part of the processing.
Do you have the in-memory option turned on?

-Alan

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 12, 2019

@alanbclark , thanks for looking at this

81.6 GB. Yes, it only serves one database
- compatible: 12.2.0
- cpu_count: 16
- db_cache_size: 40G
- job_queue_processes: 100
- optimizer_features_enable: 11.2.0.4
- optimizer_index_cost_adj: 10
- pga_aggregate_target: 2000M
- processes: 800
- sessions: 1232
- sga_max_size: 80G
- sga_target: 80G
- shared_pool_size: 4G
- sort_area_size=65536
4 sockets / 16 vCPU
Virtual
Yes
Yes
Windows Server 2016 (1607)
Yes
The issue went away after the restart, and we didn’t notice any particular time consuming jobs or processes running before the restart.
Rebuild indexes are scheduled to run every 2. month. It doesn’t seems to have run after the upgrade.
No
Not in use

+19

durette
Superhero (Customer)
526 replies
5 years ago
November 12, 2019

Intelligent tuning is only possible with good measurements. If you’re not comfortable with STATSPACK (included with every version of Oracle) or AWR (included if you’ve separately licensed the Oracle Tuning Pack on top of Enterprise Edition), I highly recommend Database Performance Analyzer.

https://www.solarwinds.com/database-performance-analyzer

If you find this forum useful, please consider paying back to the community in your own areas of expertise. If you ask for help in a private message, please know that I will likely repost it as a forum question (with your personal details redacted) and answer it here. -- Kevin Durette - https://www.linkedin.com/in/kevindurette

alanbclark
Sidekick (Partner)
60 replies
5 years ago
November 12, 2019

As an alternative to a $2,000 utility, here is a query that will show sessions ranked by CPU usage and shows helpful information about the session so you can track down the process/user. The query displays helpful information like username, machine, program, type, event, and the sql text/command being referenced.

select sess_cpu.con_id,
sess_cpu.sid,
NVL(sess_cpu.username, 'Oracle Process') username,
sess_cpu.status,
sess_cpu.logon_time,
sess_cpu.machine, sess_cpu.program, sess_cpu.type, sess_cpu.event,
round((sysdate - sess_cpu.logon_time)*1440*60) logon_SECS,
sess_cpu.value/100 SESS_CPU_SECS, (sess_cpu.value - call_cpu.value)/100 LAST_CPU_SECS,
round((sess_cpu.value/100)/round((sysdate - sess_cpu.logon_time)*1440*60)*100,2) Percent,
sess_cpu.sql_id,
(select sql_text from v$sql sql where sql.sql_id = sess_cpu.sql_id) sql_text
from
(select se.con_id, se.sql_id, ss.statistic#, se.sid, se.username, se.status, se.program, se.logon_time, sn.name, ss.value,
se.machine, se.type, se.event
from v$session se, v$sesstat ss, v$statname sn
where se.sid = ss.sid
and sn.statistic# = ss.statistic#
and sn.name in ('CPU used by this session') ) sess_cpu,
(select se.con_id, ss.statistic#, se.sid, ss.value, value/100 seconds
from v$session se, v$sesstat ss, v$statname sn
where se.sid = ss.sid
and sn.statistic# = ss.statistic#
and sn.name in ('CPU used when call started') ) call_cpu
where sess_cpu.sid = call_cpu.sid
and sess_cpu.con_id = call_cpu.con_id
and round((sysdate - sess_cpu.logon_time)*1440*60) > 0
order by percent desc;

-Alan

+18

Srikanth
Superhero (Partner)
291 replies
5 years ago
November 12, 2019

Hello,

It is interesting to note that your Apps 10 database is facing performance issues, especially CPU spiking at 100%. In our experience, Apps 10 database is designed to perform fast.

Please check to see if there is any Oracle maintenance jobs running as per the Oracle alert log around the same time when the CPU spikes up. We had a client who ran into a similar issue running Oracle 12.1.0.2 and it turned out to be a maintenance job. The issue went away once the job was disabled.
Check for blocking locks in the database using the query below. This also causes high CPU usage.
------- REM IFS SQL to find blocking locks ----------------
select logon_time, sid, serial#, blocking_session, program, seconds_in_wait, machine, client_info, username "ORACLE USERNAME", wait_class, audsid, process
from v$session
where blocking_session is not NULL
order by logon_time;

Please check these two first and keep me posted.

Cheers, Srikanth

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 12, 2019

alanbclark wrote:

I’m getting an error running this

alanbclark
Sidekick (Partner)
60 replies
5 years ago
November 12, 2019

Try this one. I removed an inner select statement that displays the SQL_TEXT of the command being run. You can find the SQL commands being run by querying the v$sql VIEW and matching the SQL_ID being returned form the query shown below.

select sess_cpu.con_id,
sess_cpu.sid,
NVL(sess_cpu.username, 'Oracle Process') username,
sess_cpu.status,
sess_cpu.logon_time,
sess_cpu.machine, sess_cpu.program, sess_cpu.type, sess_cpu.event,
round((sysdate - sess_cpu.logon_time)*1440*60) logon_SECS,
sess_cpu.value/100 SESS_CPU_SECS, (sess_cpu.value - call_cpu.value)/100 LAST_CPU_SECS,
round((sess_cpu.value/100)/round((sysdate - sess_cpu.logon_time)*1440*60)*100,2) Percent,
sess_cpu.sql_id
from
(select se.con_id, se.sql_id, ss.statistic#, se.sid, se.username, se.status, se.program, se.logon_time, sn.name, ss.value,
se.machine, se.type, se.event
from v$session se, v$sesstat ss, v$statname sn
where se.sid = ss.sid
and sn.statistic# = ss.statistic#
and sn.name in ('CPU used by this session') ) sess_cpu,
(select se.con_id, ss.statistic#, se.sid, ss.value, value/100 seconds
from v$session se, v$sesstat ss, v$statname sn
where se.sid = ss.sid
and sn.statistic# = ss.statistic#
and sn.name in ('CPU used when call started') ) call_cpu
where sess_cpu.sid = call_cpu.sid
and sess_cpu.con_id = call_cpu.con_id
and round((sysdate - sess_cpu.logon_time)*1440*60) > 0
order by percent desc;

-Alan

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 12, 2019

Thanks @alanbclark . Is there any information to get from showing the INACTIVE sessions? I see that the query I posted in my first post shows almost the same information, but only showing the ACTIVE sessions. My also shows the SQL running.

+19

durette
Superhero (Customer)
526 replies
5 years ago
November 13, 2019

You'll need to also join by the SQL child ID. If it's null on the session, you'll want 0 on the library cache. This happens because the same query can be executed with multiple plans depending on table statistics, available indexes, and the parsing schema.

Fredrik Stenén
Do Gooder (Employee)
5 replies
5 years ago
November 13, 2019

Hi,

Could you if possible get a statpack report for that system during the time period when the problem occur.

Could you also describe the typical user behavior during this time frame?

Send the statpack report to me for further investigations in IFS Benchmarking lab.

+21

Sajith D
Superhero (Employee)
427 replies
5 years ago
November 13, 2019

Hi,

Is this a Multi-Tenant database or a standalone database? Are you running the base version of Oracle 12.2. or have you applied the later patch releases? There were significant performance issues related to Multi-tenant databases in the base release which were fixed in later PSU’s.

+19

durette
Superhero (Customer)
526 replies
5 years ago
November 13, 2019

@alanbclark , this is definitely going in my toolkit.

I’d like to suggest a few edits:

∙ You can usually look at the CLIENT_INFO to get username information for IFSSYS sessions.
∙ As I said, V$SQL may include multiple copies of a given query for things like index changes or for a different parsing user. I added the child number to make a unique join.
∙ It might be helpful to separate the session CPU from the call CPU. Otherwise, a session that starts out idle and then later does a lot of work might get underreported.
∙ I joined to active background jobs.

1 Attachments

top_sessions_recently.txt

+11

SHAUN_KERSLAKE
Hero (Customer)
115 replies
5 years ago
November 14, 2019

Thanks for the SQL; this is really interesting and something i will use for analysis on any performance issues.

I notice a difference between @durette and @alanbclark SQL scripts in that the percentage columns return very different results and this is the field used for ordering the highest consuming job. @alanbclark percentage column seems to have values for most lines and the value seems sensible, whereas @durette only a small number have a value returned in percentage and the rest are empty.

Are you able to explain why different?

Which of the two SQL statements should be more accurate to use?

Regards

Shaun

+19

durette
Superhero (Customer)
526 replies
5 years ago
November 14, 2019

I reported the average session CPU separately from the average call CPU. Most sessions will tend to be currently idle.

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 15, 2019

Sajith D wrote:

It’s a standalone database. We are running latest PSU’s

+15

ChanakaAmarasekara
Hero (Employee)
340 replies
5 years ago
November 17, 2019

From a Crystal Reports framework perspective make sure you have installed the new version of the Crystal Web Service and the Crystal .net runtime that comes with Apps 10.

Also you can enable the CWS logs and see how much time it takes to process and generate the layout.

- Chanaka Am -

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 21, 2019

Just wanted to give an update on this.

We did three changes and haven’t seen the CPU spike to 100% after this:

Changed Parallel Execution from 640 to 0 (alter system set parallel_max_servers=0 scope=both;)
Changed optimizer_features_enable to 11.2.0.4
Changed disks handling for swapfile, changed to SSD and size according to RAM

We are still struggling with slower query results than in IFS8, much more waiting for our users. If anyone have tips on how what to check, any help will be appreciated! Especially the HR checks (User_access_api) and Language/Translation checks (DOMAIN_SYS) seems to take much longer time in IFS10. We are only using one language in our IFS version. Could maybe be an configuration error in our system?

terje
Do Gooder (Customer)
2 replies
5 years ago
November 21, 2019

If your Database server is swapping, you really need to add more memory.

Another reason for poor performance can be backup jobs, either RMAN or Server backup

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 21, 2019

@terje , we have 96GB of RAM and the SSD swap drive is set to 200GB. Would think this is enough?

We'll check our RMAN and Server backup.

terje
Do Gooder (Customer)
2 replies
5 years ago
November 21, 2019

SSD swap area is a safety net, that only should be used in rare situations.

If you OS or DB is swapping, during daily operations, your performance will be downgraded.

81,6 GB SGA should be enough, if the db parameters are optimized.

If you have swapping on OS level, you need give more memory to the OS, reducing the DB SGA.

alanbclark
Sidekick (Partner)
60 replies
5 years ago
November 21, 2019

It would be interesting to know what and how many objects are available for parallel execution. Most IFS objects have a parallel processing degree of 1, and therefore won’t start up parallel processes.

SELECT owner, table_name, degree
FROM dba_tables
where trim(degree) <> '1'
order by owner, table_name;

Here’s q query where you can view parallel statistics should it still be enabled on other databases. Needs privilege to run this query.

select * from v$pq_sysstat;

-Alan

Fredrik Stenén
Do Gooder (Employee)
5 replies
5 years ago
November 22, 2019

If you look in your top SQL you might see:

select toid from type$ where…

It is related to metadata loading in Aurena and consumes quite some resources during metadata load. This is due to a bug in Oracle which can be resolved by applying a patch. Basically the patch solves a missing sys index.

Search Bug 25154887 : RECURSIVE QUERIES ON TYPE$ RUN DURING DESC SLOW IN 12C in Oracle Metalink.

//Fredrik

SORBJAHAU
Author
Sidekick (Customer)
55 replies
5 years ago
November 22, 2019

@alanbclark, the first query return 0 records. Does this mean that disabling parallel execution is not the reason why it’s better now?

@terje , our Oracle DB partner doubt that RMAN is the cause of slowness. Daily incremental backup at 19:00, runs for 20min, log backup each half hour, but this is finished in 1 minute. Full backup runs for a while, but this is only running in weekends.

We’ll check the swapping settings.

Page 1 / 2

Reply

Rich Text Editor, editor1

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos

1 Attachments

Reply

Related Topics

Sim kaart doet het niet meericon

simkaart doet het niet meericon

Nummerbehoud SMSicon

4g werkt nieticon

Simkaart Simpel in iPhone12icon

Did you find what you're looking for? If not:

Sign up

Login to the community

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings