Friday, January 22, 2016

Troubleshooting slow insert and latch: cache buffers chains



Recently we had an issue where App Support was complaining that they experience slow processing with batch loads.
Instead of processing each file in few seconds, it is not finishing even after 30 minutes. So they engaged DBA for investigation.

We noticed sudden surge in processing time from 6:15 am PST and subsequent load in CPU. We did not had any changes previous night.

So checked for high wait events where we started noticed latch: cache buffers chains wait event.



SAMPLE_TIME                    SESSION_ID    USER_ID SQL_ID        EVENT                          WAIT_CLASS       WAIT_TIME TIME_WAITED BLOCKING_SESSION INSTANCE_NUMBER
------------------------------ ---------- ---------- ------------- ------------------------------ --------------- ---------- ----------- ---------------- ---------------
02-DEC-15 11.16.12.283 AM             685         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      395981                                1
02-DEC-15 11.45.17.994 AM             205         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      332885                                1
02-DEC-15 11.31.25.164 AM            1433         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      302887                                1
02-DEC-15 11.31.25.164 AM             685         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      302082                                1
02-DEC-15 11.31.25.164 AM            1063         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      300508                                1
02-DEC-15 11.46.08.235 AM             685         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      300245                                1
02-DEC-15 11.16.02.217 AM             685         74 3uysnh37v9j6v latch: cache buffers chains    Concurrency              0      241780                                1
02-DEC-15 11.38.56.550 AM            1433         74 3uysnh37v9j6v latch: row cache objects       Concurrency              0      223027                                1

SQL ID:  3uysnh37v9j6v

insert into xabc_nrtrde 
(seqnum,rectype,imsi,imei,calltime,utcoffset,duration,cause,servicecode,
sscode,dialled,connected,thirdparty,recentid,callref,calling,
apnni,apnoi,datavolin,datavolout,sgsnaddr,ggsnaddr,chgid,neid)
values (seq_nrtrde_recnum.NEXTVAL,:p1,:p2,:p3,:p4,:p5,:p6,:p7,:p8,:p9,:p10,:p11,:p12,:p13,:p14,:p15,:p16,:p17,:p18,:p19,:p20,:p21,:p22,:p23)
Execution details for SQL ID:3uysnh37v9j6v. You could see that number of executions have come down drastically around 11:30 am and elapsed time per execution and buffer gets spiked up.

TXNDATETIME            INSTANCE_NUMBER PLAN_HASH_VALUE EXECUTIONS ELAPSED_TIME_PER_EXEC_MS ROWS_PER_EXEC DISK_READS_PER_EXEC BUFFER_GETS_PER_EXEC CPU_TIME_PER_EXEC  TOTALROWS AVGIORESPTIMEMS APPWAIT_DELTATIME_PER_EXEC_MS
---------------------- --------------- --------------- ---------- ------------------------ ------------- ------------------- -------------------- ----------------- ---------- --------------- -----------------------------
2015-12-02 05:45                     1      2311974949     612999                     .331             1                 .06                 7.64                 0     612993            .925                             0
2015-12-02 06:00                     1      2311974949     389008                     .267             1                 .05                 7.86                 0     389001           1.064                             0
2015-12-02 06:15                     1      2311974949     350089                     .288             1                 .05                  7.9                 0     350081           1.211                             0
2015-12-02 09:15                     1      2311974949     154871                   14.663             1                 .07               662.26              .003     154861           1.954                             0
2015-12-02 09:30                     1      2311974949      15316                  234.952             1                 .07             10963.88               .05      15316           4.587                             0
2015-12-02 09:45                     1      2311974949      14483                    250.6             1                 .07             10799.49              .053      14483           4.669                             0
2015-12-02 10:00                     1      2311974949      13731                  255.988             1                 .07             10129.57              .053      13730           4.353                             0
2015-12-02 10:15                     1      2311974949     333519                    8.662             1                 .08               328.05              .001     333509           2.112                             0
2015-12-02 10:30                     1      2311974949      14382                  246.728             1                 .07              10126.7              .048      14382           4.053                             0
2015-12-02 10:45                     1      2311974949      79687                   47.062             1                 .08              1885.24              .009      79685           2.142                             0
2015-12-02 11:00                     1      2311974949          0               4273869.19             0                   0            180304575            832.34          0               0                             0
2015-12-02 11:15                     1      2311974949      59023                   73.038             1                 .08              2887.12              .013      59019           2.423                             0
2015-12-02 11:30                     1      2311974949          0               4493535.76             0                   0            160664052            765.63          0               0                             0
2015-12-02 11:45                     1      2311974949          0               4496544.18             0                  39            153153458            773.05          0           5.901                             0
2015-12-02 12:00                     1      2311974949          0               4478637.42             0                   0            105239000            600.46          0               0                             0
2015-12-02 12:15                     1      2311974949          0                4238411.5             0                   0            164204052            900.73          0               0                             0
2015-12-02 12:30                     1      2311974949          5               102547.829             1                25.8            5440350.6            31.762          5           2.104                             0
2015-12-02 13:00                     1      2311974949      39951                   13.086             1                 .06               969.93              .005      39950           1.571                             0



So enabled tracing on one of the sessions to get more details and noticed a surprise.  Database is spending time going through recyclebin to reclaim space. We noticed very high number of dropped tables in DBA_RECYLEBIN as this application creates tables on the fly, process it and drop it.

=====================
PARSING IN CURSOR #4571527280 len=55 dep=1 uid=0 oct=7 lid=0 tim=23688084258045 hv=1974263057 ad='70000030196b230' sqlid='b52m6vduutr8j'
delete from RecycleBin$                     where bo=:1
END OF STMT
EXEC #4571527280:c=20000,e=51019,p=0,cr=3368,cu=0,mis=0,r=0,dep=1,og=4,plh=4234963248,tim=23688084258037
CLOSE #4571527280:c=0,e=14,dep=1,type=3,tim=23688084259183
XCTEND rlbk=0, rd_only=1, tim=23688084259221
=====================
PARSING IN CURSOR #4572432872 len=148 dep=1 uid=0 oct=3 lid=0 tim=23688084259288 hv=534114327 ad='7000002fab2b2f8' sqlid='0hhmdwwgxbw0r'
select obj#, type#, flags, related, bo, purgeobj, con#    from RecycleBin$    where ts#=:1 and to_number(bitand(flags, 16)) = 16    order by dropscn
END OF STMT
FETCH #4572432872:c=0,e=13,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=4,plh=815922003,tim=23688084259287
=====================

Dropped all objects from recyclebin to resolve current issue so that application will get back to a working state. Also advised Development team to add purge clause while dropping table  (eg: drop table MADHU_TEST purge; ) and wrote a cron job to clean recyclebin.