Archive for the ‘Administration’ Category
Now that you have got your audit table somewhere a little more sensible (i.e. not in the SYSTEM tablespace), there’s probably a policy about how many audit records should be kept.
Thoughtfully, the DBMS_AUDIT_MGMT package provides some of what you need to keep the audit records in check. However, a little more thought by Oracle would have helped. Let's see what I mean.
First we need to initialise for audit control. You can check to see if this has already been done as follows:
SET SERVEROUTPUT ON
BEGIN
  IF DBMS_AUDIT_MGMT.IS_CLEANUP_INITIALIZED(DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD) THEN
    DBMS_OUTPUT.PUT_LINE('AUD$ is initialized for cleanup');
  ELSE
    DBMS_OUTPUT.PUT_LINE('AUD$ is not initialized for cleanup.');
  END IF;
END;
/
NOTE: To do this for Fine-Grained auditing, you need to use the constant DBMS_AUDIT_MGMT.AUDIT_TRAIL_FGA_STD instead, and check on table FGA_LOG$.
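For fine-grained auditing, the equivalent check is a sketch along these lines, using the FGA constant mentioned above:

```sql
-- Sketch: check whether cleanup is initialised for the FGA trail (FGA_LOG$)
SET SERVEROUTPUT ON
BEGIN
  IF DBMS_AUDIT_MGMT.IS_CLEANUP_INITIALIZED(DBMS_AUDIT_MGMT.AUDIT_TRAIL_FGA_STD) THEN
    DBMS_OUTPUT.PUT_LINE('FGA_LOG$ is initialized for cleanup');
  ELSE
    DBMS_OUTPUT.PUT_LINE('FGA_LOG$ is not initialized for cleanup.');
  END IF;
END;
/
```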
If cleanup is not initialised, you need to set it up as follows:
BEGIN
  DBMS_AUDIT_MGMT.INIT_CLEANUP(
    AUDIT_TRAIL_TYPE         => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    DEFAULT_CLEANUP_INTERVAL => 999);
END;
/
Some VERY important things to note here:
- The DEFAULT_CLEANUP_INTERVAL doesn't do anything (up to and including release 12.1). It's for "future use", apparently. However, if it is not specified, it has been associated with bugs where cleanup to the last archive timestamp does not work and nothing gets cleaned up.
- If you have not already moved the audit tables AUD$ / FGA_LOG$ out of the SYSTEM tablespace to any other tablespace, this call will move them for you into SYSAUX, right now, whether desired or not.
- If you DEINIT_CLEANUP, it does not move the tables back to SYSTEM.
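If you do need to back the initialisation out (remembering the caveat above that the tables stay where they are), the call is a sketch like this:

```sql
-- Sketch: de-initialise cleanup for the standard audit trail.
-- NOTE: this does NOT move AUD$ back to the SYSTEM tablespace.
BEGIN
  DBMS_AUDIT_MGMT.DEINIT_CLEANUP(
    AUDIT_TRAIL_TYPE => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD);
END;
/
```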
OK, we are initialised. We could now create a purge job, which would wipe out ALL of our audit records (every 24 hours in this example), but that would be an unlikely requirement.
BEGIN
  DBMS_AUDIT_MGMT.CREATE_PURGE_JOB(
    AUDIT_TRAIL_TYPE           => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    AUDIT_TRAIL_PURGE_INTERVAL => 24,
    AUDIT_TRAIL_PURGE_NAME     => 'Purge_AUD$',
    USE_LAST_ARCH_TIMESTAMP    => FALSE);
END;
/
It's MORE likely that we want to keep the most recent N days' worth of records and wipe out everything older. To do this we need to set the point (LAST_ARCHIVE_TIMESTAMP) from which we want to retain records, and wipe out everything before that. So let's set a 30-day retention.
BEGIN
  DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
    AUDIT_TRAIL_TYPE    => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    LAST_ARCHIVE_TIME   => systimestamp-30,
    RAC_INSTANCE_NUMBER => 1);
END;
/
And check it
select * from DBA_AUDIT_MGMT_LAST_ARCH_TS;

AUDIT_TRAIL            RAC_INSTANCE LAST_ARCHIVE_TS
---------------------- ------------ ------------------------------------
STANDARD AUDIT TRAIL              0 17-MAY-14 11.00.01.000000 PM +00:00
Excellent. Now we create a job as before with "USE_LAST_ARCH_TIMESTAMP => TRUE" and all is good, EXCEPT that nothing is moving the timestamp forward.
The job will be called, purge the old records, and that's it. When it is next invoked, the timestamp will not have moved on. We therefore need another job to move the timestamp forward... so why bother setting up a job with these automatic routines if they don't automate the whole requirement? A bit annoying, that. I just create my own scheduled job with 2 calls, and forget the built-in (half a) job aspect of the management system:
BEGIN
  DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
    AUDIT_TRAIL_TYPE    => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    LAST_ARCHIVE_TIME   => systimestamp-30,
    RAC_INSTANCE_NUMBER => 1);
  DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL(
    AUDIT_TRAIL_TYPE        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    USE_LAST_ARCH_TIMESTAMP => TRUE);
END;
/
OK, the audit management system is pretty good; it deletes in batches, it works well, and it doesn't need much from Oracle to make it much better. 7/10. Good but could do better.
You need to check out the associated views which show you the basic system config and what's going on:
DBA_AUDIT_MGMT_CLEAN_EVENTS    Displays the cleanup event history
DBA_AUDIT_MGMT_CLEANUP_JOBS    Displays the currently configured audit trail purge jobs
DBA_AUDIT_MGMT_CONFIG_PARAMS   Displays the currently configured audit trail properties
DBA_AUDIT_MGMT_LAST_ARCH_TS    Displays the last archive timestamps set for the audit trails
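For example, a quick sketch to see the current cleanup configuration:

```sql
-- Sketch: show the configured audit management properties per trail
SELECT audit_trail, parameter_name, parameter_value
  FROM dba_audit_mgmt_config_params
 ORDER BY audit_trail, parameter_name;
```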
OK - that should keep things nice and tidy in the database. What about the audit files on the OS? Find out about that in Part 3.
One of the oldest problems with the Auditing capabilities within Oracle is that the SYS.AUD$ table resides in the SYSTEM tablespace. Unless you are rigorous in ensuring that your audit records are routinely pruned to keep the table manageable, it can single-handedly make the SYSTEM tablespace enormous.
Historically, we used to move the table and its associated objects to a new tablespace ourselves. In Oracle 7 it was a drop and re-create. Later we performed an alter table … move; command, coupled with an alter index … rebuild. However, some bits frequently got left behind doing this…
In Oracle 10, a new package appeared: DBMS_AUDIT_MGMT. The procedure SET_AUDIT_TRAIL_LOCATION allowed you to move the table to a new tablespace. It didn’t work properly: it didn’t move indexes or LOB segments, and shouldn’t be used. However, roll on Oracle 11 and the (obvious) bugs have been ironed out.
First of all, moving the table (NOTE: If the table is big, this may take quite a while. Only do this at a period of low system activity to avoid potential locking issues at the start and end of the move):
BEGIN
  DBMS_AUDIT_MGMT.SET_AUDIT_TRAIL_LOCATION(
    AUDIT_TRAIL_TYPE           => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
    AUDIT_TRAIL_LOCATION_VALUE => 'SYSAUX');
END;
/
This works a treat in Oracle 11 and 12 for the standard audit trail, and for fine-grained auditing. It successfully moved every object associated with SYS.AUD$.
select owner, table_name, tablespace_name from dba_tables where table_name = 'AUD$';

OWNER      TABLE_NAME   TABLESPACE_NAME
---------- ------------ ---------------
SYS        AUD$         SYSAUX
select owner, table_name, tablespace_name from dba_lobs where table_name = 'AUD$';

OWNER      TABLE_NAME   TABLESPACE_NAME
---------- ------------ ---------------
SYS        AUD$         SYSAUX
SYS        AUD$         SYSAUX

select owner, table_name, tablespace_name from dba_indexes where table_name = 'AUD$';

OWNER      TABLE_NAME   TABLESPACE_NAME
---------- ------------ ---------------
SYS        AUD$         SYSAUX
SYS        AUD$         SYSAUX
WARNING! Oracle still say that AUD$ should be in the SYSTEM tablespace for upgrades. I can't find anything that supersedes that, despite moving the table now being supported by an official package that works.
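As noted above, the same procedure also handles the fine-grained audit trail; a sketch of the equivalent call for FGA_LOG$:

```sql
-- Sketch: move the fine-grained audit trail (FGA_LOG$) to SYSAUX
BEGIN
  DBMS_AUDIT_MGMT.SET_AUDIT_TRAIL_LOCATION(
    AUDIT_TRAIL_TYPE           => DBMS_AUDIT_MGMT.AUDIT_TRAIL_FGA_STD,
    AUDIT_TRAIL_LOCATION_VALUE => 'SYSAUX');
END;
/
```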
The SCAN listener logs are a bit of a pain as their location isn’t intuitive. So, to remind me where they are:
Login to the server(s) as the grid owner and check the scan listener status. This will show you the location of the listener log. cd to just below the diag directory and you’re off!:
server-name:/u01/grid> ps -ef | grep SCAN
grid  8542  8282  0 10:20 pts/0  00:00:00 grep SCAN
grid  9349     1  0 Mar07 ?      00:07:33 /u01/app/11g/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
server-name:/u01/grid>lsnrctl status LISTENER_SCAN1
LSNRCTL for Linux: Version 126.96.36.199.0 - Production on 28-MAY-2014 10:20:12
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN1
Version                   TNSLSNR for Linux: Version 188.8.131.52.0 - Production
Start Date                07-MAR-2014 17:27:50
Uptime                    81 days 15 hr. 52 min. 21 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/11g/grid/network/admin/listener.ora
Listener Log File         /u01/app/11g/grid/log/diag/tnslsnr/server-name/listener_scan1/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.6.148.141)(PORT=1521)))
Services Summary...
Service "FRONT_APP_DB_SVC.WORLD" has 2 instance(s).
  Instance "ORCL1", status READY, has 1 handler(s) for this service...
  Instance "ORCL2", status READY, has 1 handler(s) for this service...
Service "ORCL.WORLD" has 2 instance(s).
  Instance "ORCL1", status READY, has 1 handler(s) for this service...
  Instance "ORCL2", status READY, has 1 handler(s) for this service...
Service "ORCLXDB.WORLD" has 2 instance(s).
  Instance "ORCL1", status READY, has 1 handler(s) for this service...
  Instance "ORCL2", status READY, has 1 handler(s) for this service...
The command completed successfully
ADRCI: Release 184.108.40.206.0 - Production on Wed May 28 10:22:47 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/11g/grid/log"
adrci> show homes
ADR Homes:
diag/asmcmd/user_grid/server-name
diag/tnslsnr/server-name/listener_scan3
diag/tnslsnr/server-name/listener_scan2
diag/tnslsnr/server-name/listener_scan1
diag/asmtool/user_root/host_3797755080_80
adrci> show alert
Choose the alert log from the following homes to view:
1: diag/asmcmd/user_grid/server-name
2: diag/tnslsnr/server-name/listener_scan3
3: diag/tnslsnr/server-name/listener_scan2
4: diag/tnslsnr/server-name/listener_scan1
5: diag/asmtool/user_root/host_3797755080_80
Q: to quit
Please select option: 4
...and there we are. Remember to be on the correct host for each SCAN listener, otherwise the alert (listener) log file you are looking at will be out of date.
When you end up spending a far greater percentage of your day than seems sensible killing off Java connections that Developers have carelessly left lying around, locking objects all over the place, you need a solution to get them to go away. The solution is to let them do it themselves!
I’m not advocating granting ALTER SYSTEM to Developers! That way madness lies, or certainly some unintended consequences. I’m all for Dev’s having a lot of freedom in the database, just not freedom with the database.
So, creating a stored procedure (in this example owned by SYS, but any user with an explicit ALTER SYSTEM privilege granted will do) to kill sessions, without allowing too much latitude to do anything else, seems appropriate. Here’s one I built earlier:
create or replace procedure sys.kill_session
( p_sid      IN number,
  p_serial   IN number,
  p_instance IN number )
as
-- Neil Chandler. Grant the ability to kill session on a restricted basis. 21.07.2010
  l_username varchar2(30) := null;
  l_priv     number := 1;
begin
  -- Who owns the session?
  select username into l_username
    from gv$session
   where sid = p_sid
     and serial# = p_serial
     and inst_id = p_instance;

  -- Check for DBA role
  select count(*) into l_priv
    from dba_role_privs
   where grantee = l_username
     and granted_role = 'DBA';

  -- If the user has the DBA priv, deny the kill request
  if l_priv > 0 or l_username is null then
    dbms_output.put_line
      ('User request to kill session '||p_sid||','||p_serial||',@'||p_instance||
       ' denied. Session is for privileged user '||l_username||'.');
  else
    dbms_output.put_line
      ('Killing user '||l_username||' - '||p_sid||','||p_serial||',@'||p_instance);
    execute immediate 'alter system disconnect session '''||
      p_sid||','||p_serial||',@'||p_instance||''' immediate';
  end if;
end;
/

-- and let the proc be seen and used
create or replace public synonym kill_session for sys.kill_session;
grant execute on kill_session to (whomever);

Then a nifty bit of SQL to generate the kill commands for the Developers. Please include your favourite columns from gv$session:

select username, status, blocking_session,
       'exec kill_session ('||sid||','||serial#||','||inst_id||')' Kill_Command
  from gv$session
 where username is not null
   and type <> 'BACKGROUND'
/

USERNAME  STATUS   BLOCKING_SESSION KILL_COMMAND
--------- -------- ---------------- ------------------------------------
SYS       ACTIVE                    exec kill_session (31,65,1)
SYSTEM    INACTIVE                  exec kill_session (58,207,1)
USER_1    INACTIVE                  exec kill_session (59,404,1)
USER_2    INACTIVE                  exec kill_session (72,138,1)
USER_2    INACTIVE                  exec kill_session (46,99,2)
May the odds be forever in your favour. Let the killing commence...
When installing Oracle Grid Infrastructure 11.2 (and all other releases), you need to make sure that you have all of the server settings correct and to standard before you do the install. One that bit me recently was the timezone setting. The Red Hat 6.4 server(s) in question had the correct file in /etc/localtime (copied from /usr/share/zoneinfo/whatever). If I typed in date, I got the reply in the correct timezone (GMT/BST as I’m in London), so all seemed correct.
However, the slack Unix Sysadmin (which might or might not have been me) had not put the correct setting in /etc/sysconfig/clock. Unfortunately, when you install Grid Infrastructure, the setting is read from /etc/sysconfig/clock and embedded into a Grid Infrastructure config file: $GRID_HOME/crs/install/s_crsconfig_hostname_env.txt
### This file can be used to modify the NLS_LANG environment variable, which determines the charset to be used for messages.
### For example, a new charset can be configured by setting NLS_LANG=JAPANESE_JAPAN.UTF8
### Do not modify this file except to change NLS_LANG, or under the direction of Oracle Support Services
If you change this entry (and you should check with Oracle Support that this is OK for your site), you will need to restart Grid Infrastructure. The one thing about this that I really don’t like is that Oracle is storing a runtime configuration file in an install directory. Does it do that anywhere else?
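For reference, a minimal /etc/sysconfig/clock for a London timezone might look like this; the ZONE value is what the installer reads, and the exact layout here is an illustrative example rather than a definitive template:

```
ZONE="Europe/London"
UTC=true
```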
So, you’re creating (or rebuilding) an index ONLINE on a busy system. If your session dies, or it becomes necessary to kill the command, you may find that Oracle does not (always manage to) automatically clean up after itself.
CREATE INDEX my_ind ON my_table (mycol ASC)
  LOCAL LOGGING COMPRESS 1 ONLINE;

(ctrl-c)

ORA-01013: user requested cancel of current operation

select * from user_indexes where index_name = 'my_ind';

INDEX_NAME   INDEX_TYPE
my_ind       NORMAL
OMG! WTF! TLA's! The index is there, even though I cancelled the create statement! Let's drop it...
drop index my_ind;
*
ERROR at line 1:
ORA-08104: this index object 79722 is being online built or rebuilt
So, HOW do I sort out this mess? Use DBMS_REPAIR!
declare
  lv_ret BOOLEAN;
begin
  lv_ret := dbms_repair.online_index_clean(79722);
end;
/

select * from user_indexes where index_name = 'my_ind';

no rows selected
Bang! and the index (or, rather, left-over temporary extents from the build attempt) is gone, ready for you to try again.
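Incidentally, if the ORA-08104 message (which reports the object number) has scrolled off screen, the number can be recovered from DBA_OBJECTS; a sketch, assuming the index name used above:

```sql
-- Sketch: find the object number of the half-built index
SELECT object_id
  FROM dba_objects
 WHERE object_name = 'MY_IND'
   AND object_type = 'INDEX';
```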
In Oracle 11G, Oracle introduced SQL Plan Management (SPM). It is excellent (I love it to bits). It allows you to create Baselines against SQL which lock-down the SQL execution plan. No more plan flips. More consistency. Perfect.
Whenever some Baselined SQL is run, Oracle still parses it and compares the parsed output to the accepted (Evolved) baselines. If the newly parsed plan is better, a new baseline is added to DBA_SQL_PLAN_BASELINES but is NOT accepted. This means that you need to spend time manually accepting the baseline: running DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE and checking the new plan. IF you want it, and/or Oracle evaluates that it is a better plan for that particular set of bind variables, the plan is accepted and becomes a candidate to be used by future executions of your SQL. Complete control over your execution plans.
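The manual evolve step looks roughly like this; the SQL handle below is made up for illustration, and the function returns a CLOB report of what was verified and accepted:

```sql
-- Sketch: manually evolve (verify and, if better, accept) pending plans
-- for one baseline. The sql_handle value is hypothetical.
SET LONG 100000
SELECT DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(
         sql_handle => 'SQL_123456789abcdef0',
         verify     => 'YES',
         commit     => 'YES')
  FROM dual;
```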
So, Oracle, what’s all this about in Oracle 12C, eh?
In Oracle 12C there’s a new SPM Evolve advisor task. “By default, SYS_AUTO_SPM_EVOLVE_TASK runs daily in the scheduled maintenance window” - so it runs every night, and by default it evolves all new baselines created that day and automatically accepts the new plans.
BY DEFAULT? NO! Oh no, Oracle, NO!
That is precisely what I don't want from baselines - Oracle making its own mind up about plans without any input from me. I'm using baselines to stop Oracle changing its mind; to explicitly limit the number of paths allowed by the Optimizer to ones I know about and with which I am comfortable. Don't introduce functionality to do the opposite.
So, immediately following the installation of 12C, I would recommend running (you need to be SYS for this):
SELECT PARAMETER_NAME, PARAMETER_VALUE AS "VALUE"
  FROM DBA_ADVISOR_PARAMETERS
 WHERE TASK_NAME = 'SYS_AUTO_SPM_EVOLVE_TASK'
   AND PARAMETER_NAME IN ('ACCEPT_PLANS', 'TIME_LIMIT');

PARAMETER_NAME            VALUE
------------------------- ----------
ACCEPT_PLANS              TRUE
TIME_LIMIT                3600
BEGIN
  DBMS_SPM.SET_EVOLVE_TASK_PARAMETER('SYS_AUTO_SPM_EVOLVE_TASK',
                                     'ACCEPT_PLANS', 'false');
END;
/
OK, back where we were, with any baselines fixed in place and doing what I want them to do! Not change.
Running RAC? (Why? No, really, WHY? Never heard of DataGuard? With a broker?)
Not sure if you’ve configured it correctly?
Not sure if you have all of the recommended initialisation parameters set?
All recommended RPM’s installed?
All daemons running?
etc, etc, etc,
Well, as of Oracle 11.2.0.3 there’s a new feature provided by default called RACCheck. You can find it installed in directory $ORACLE_HOME/suptools/raccheck (or you can download it from MOS article 1268927.1), and it’s called “raccheck”. With a little sudo configuration, or the root passwords, you can check the configuration of every node in a few minutes per node (run at a sensible time). All the basics appear to be covered, and you get a nice list of anomalies out of the system in HTML format.
I don’t necessarily agree with some of the errors/warnings produced (you might want the “problems” it’s finding!), but it gives you cause to re-think about an element of the system that may be configured in a non-standard way, and you get lots of relevant and useful links to MOS articles.
e.g. One problem:
|WARNING||SQL Check||Some user sessions lack proper failover mode (BASIC) and method (SELECT)||All Databases|
Can be happily ignored as I’m using a SCAN listener, which renders this WARNING irrelevant.
but I would recommend that you use the utility and accept/understand any exceptions. It should help stabilise any RAC installations you may have.
Some days you just forget to dot all of the i’s.
I had just installed a new RAC cluster, got it all up and running and was using DBCONSOLE to check the system out – no access to the Production Grid Control for this cluster yet. I then made a few more configuration changes and restarted one of the nodes. I was rather surprised that the console could no longer access the system. It was claiming the instance was down, and asking for server logins to allow restart. I was quite sure the instance was available, mainly because I was connected using SQL Developer and executing queries.
So, what went wrong? What config had changed before I restarted the nodes? I checked my notes and… I was hardening passwords. One of the passwords I changed was the SYSMAN password. However, I had completely neglected to inform the EM agent for the console that I had changed the password! Idiot.
cd $ORACLE_HOME/<node_database>/sysman/config
vi emoms.properties

change:
- oracle.sysman.eml.mntr.emdRepPwd=<clear-text-password>
- oracle.sysman.eml.mntr.emdRepPwdEncrypted=FALSE
emctl stop dbconsole
emctl start dbconsole
...and all is well again
This blog entry was brought to you by Pierrot.