Friday, December 12, 2014

RMAN Backups on NFS

There are many options for mounting NFS for use with various Oracle files (binaries, datafiles, OCR). This post shows the results of three mount options and the backup time for a single 16GB tablespace backed up with RMAN. The database is a single instance 11.2.0.3 database. Both the NFS server and client were on the same network segment, connected via a single switch.
NFS server export file content
 more /etc/exports
/opt/backup 192.168.0.66(rw,sync,all_squash,insecure,anonuid=500,anongid=500)
anonuid and anongid represent the user id and group id of the oracle user on the NFS client (the server running the DB). no_root_squash has not been used.
The options used for mounting the NFS share are
1. mount -t nfs -o rw,hard,rsize=32768,wsize=32768,vers=3,nointr,timeo=600,tcp,nolock 192.168.0.76:/opt/backup /opt/backup (backup time : 06:45)

2. mount -t nfs 192.168.0.76:/opt/backup /opt/backup (backup time : 07:05)

3. mount -t nfs -o rw,hard,rsize=32768,wsize=32768,vers=3,nointr,timeo=600,tcp,nolock,actimeo=0 192.168.0.76:/opt/backup /opt/backup (backup time : 06:35)

In test 1 the mount is without actimeo. Test 2 doesn't specify any mount options and relies on the defaults. In test 3 the options are the same as test 1 with the addition of actimeo=0.
There is some confusion as to when the actimeo option should be used. Some documents state explicitly not to use the noac option with RMAN yet say nothing explicit about actimeo. One note states it could be used with single instance databases (359515.1) while others explicitly state not to use actimeo with single instance databases and that it is only required for RAC (1164673.1, 762374.1). Nevertheless test 3 completed the backup in roughly the same time as test 1. Use of noac, however, increased the backup time to an unacceptable level (into hours; not included in this post).
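For reference, the mount options actually negotiated by the client can be verified with nfsstat, and the backup itself was a straightforward single tablespace backup. A sketch along these lines (the tablespace name and channel format are assumptions, not taken from the original test):
# check the options the NFS client actually negotiated
nfsstat -m
grep /opt/backup /proc/mounts

# RMAN backup of a single tablespace to the NFS mount (tablespace name assumed)
rman target / <<EOF
run {
  allocate channel c1 device type disk format '/opt/backup/%U';
  backup tablespace users;
}
EOF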




Below are the IO rate and throughput observed in the EM console during each test.


Useful metalink notes
Howto Optimize NFS Performance with NFS options. [ID 397194.1]
How To Setup DNFS (Direct NFS) On Oracle Release 11.2 [ID 1452614.1]
Step by Step - Configure Direct NFS Client (DNFS) on Linux (11g) [ID 762374.1]
Mount Options for Oracle files when used with NFS on NAS devices [ID 359515.1]
NFS Performance Decline Introduced by Mount Option "actimeo=0" [ID 1164673.1]

Monday, December 1, 2014

Installing Oracle Database 12.1.0.2 on RHEL 7

Oracle has certified 12.1.0.2 on RHEL 7 kernels 3.10.0-54.0.1.el7 or later (1304727.1). The current RHEL 7 available from the RedHat site is
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.0 (Maipo)
and comes with following kernel
uname -r
3.10.0-123.el7.x86_64
which is valid for installing 12.1.0.2. It must be noted that only 12.1.0.2 is certified on RHEL 7.

RHEL 7 comes with XFS as the default file system. Although Oracle has its own material praising XFS, it is still not listed as a supported file system in 236826.1 - Supported and Recommended File Systems on Linux. It's not yet known whether XFS is supported and recommended for Oracle RDBMS data files; any explicit support statement is with regard to Oracle Linux support, not Oracle RDBMS support. 1529864.1 (which is relevant for RHEL 6) explicitly states that the database is supported on ext2, ext3 and ext4 file systems, however there's no similar document (Requirements for Installing Oracle Database 12.1 on RHEL7 or OL7 64-bit (x86-64)) available yet (refer 1961997.1). On the other hand, documents published for RHEL 6 (1601759.1) state that Oracle does not certify local file systems and that support and certification is up to the OS vendor. Nevertheless it was possible to install a 12.1.0.2 database on an XFS local file system without any problem.
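For what it's worth, creating and mounting an XFS file system for the Oracle installation is no different from any other local file system. A minimal sketch (device name and mount point are assumptions):
mkfs.xfs /dev/sdb1
mkdir -p /opt/app/oracle
mount -t xfs /dev/sdb1 /opt/app/oracle
echo "/dev/sdb1  /opt/app/oracle  xfs  defaults  0 0" >> /etc/fstab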



In the updated document which lists RHEL 7 as an option for 12.1, compat-libstdc++-33 is missing from the required RPM list. However the installer checks for this package and raises a warning.
compat-libstdc++-33 is not available on the RHEL 7 install media. However it could be downloaded as a separate package from the RedHat site.
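Once downloaded, the package can be installed with yum so that any dependencies are pulled from the local repository (the exact rpm file name depends on the version downloaded):
yum localinstall compat-libstdc++-33-*.x86_64.rpm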

RHEL 7 uses grub version 2. Editing grub entries is different from RHEL 6, which comes into play when setting the elevator and transparent_hugepage kernel parameters.
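A rough sketch of the grub2 way of doing this (the parameter values shown are only illustrative):
# append the required parameters to GRUB_CMDLINE_LINUX in /etc/default/grub, e.g.
# GRUB_CMDLINE_LINUX="... elevator=deadline transparent_hugepage=never"

# regenerate the grub2 configuration (path differs on UEFI systems) and reboot
grub2-mkconfig -o /boot/grub2/grub.cfg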

In RHEL 7 the concept of runlevels has been replaced with systemd targets. The runlevel command exists but only for compatibility reasons and it is recommended to avoid using it where possible. In a minimal installation the runlevel command was returning "unknown" instead of the expected "N 3" (RHEL 7 installed with Server and GUI didn't have this problem). As a result the runlevel pre-requisite check could fail on grid infrastructure and database installs.
It was possible to ignore this pre-req failure and continue the installation.
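As a side note, the systemd equivalent of checking and setting the runlevel is the default target:
systemctl get-default                     # shows the current default target
systemctl set-default multi-user.target   # equivalent of runlevel 3
systemctl set-default graphical.target    # equivalent of runlevel 5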

All other preinstallation tasks are similar to earlier versions of RHEL.

Useful metalink notes
Certification Information for Oracle Database on Linux x86-64 [ID 1304727.1]
Supported and Recommended File Systems on Linux [ID 236826.1]
Requirements for Installing Oracle Database 12.1 on RHEL7 or OL7 64-bit (x86-64) [ID 1961997.1]
Requirements for Installing Oracle 11.2.0.4 RDBMS on RHEL7 or OL7 64-bit (x86-64) [ID 1962100.1]

Related Posts
Installing 11.2.0.3 on RHEL 6
Installing 11gR2 (11.2.0.3) GI with Role Separation on RHEL 6
Installing 11gR2 (11.2.0.3) GI with Role Separation on OEL 6
Installing 11gR2 Standalone Server with ASM and Role Separation on RHEL 6
Installing 12c (12.1.0.1) RAC on RHEL 6 with Role Separation - Clusterware

Saturday, November 29, 2014

RHEL 7 VNC Setup - "Oh no! Something has gone wrong"

If RHEL 7 was installed as a minimal install
 yum grouplist
Available environment groups:
   Minimal Install
then setting up VNC by installing tigervnc-server would result in the following screen when logging in with a vncviewer.
Installing X windows did not resolve the issue
yum groupinstall "X Window System"
The minimal installation does not have any GUI related components, as such VNC has no desktop to connect to. Installing the KDE desktop
yum groupinstall "kde-desktop"
or GNOME, the default desktop for RHEL 7,
yum groupinstall GNOME
resolved the issue.
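With a desktop installed, the VNC server itself is managed through systemd on RHEL 7. A sketch, assuming a session for the oracle user on display :1:
cp /lib/systemd/system/vncserver@.service /etc/systemd/system/vncserver@:1.service
# replace <USER> with oracle in the copied unit file, then
su - oracle -c vncpasswd
systemctl daemon-reload
systemctl start vncserver@:1.service
systemctl enable vncserver@:1.service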

Related Post
Blank VNC Screen


Friday, November 28, 2014

Creating Local Yum Repository Using an ISO or DVD for RHEL5, 6 and 7

Creating a local yum repository is a convenient way of installing the RPMs required as part of an Oracle installation. Using a yum repository is preferable to installing the required RPMs via rpm -i, as yum fetches the necessary dependencies from the repository. This post gives the steps for setting up a local yum repository for RHEL5, RHEL6 and RHEL7 using either an ISO or a DVD containing the installation media.
Mount the ISO or the DVD. In all cases the ISO or DVD is mounted on the /media mount point.
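Mounting would be along the following lines (the ISO path is an assumption):
mount -o loop /stage/rhel-server-dvd.iso /media   # for an ISO image
mount /dev/cdrom /media                           # for a DVD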

Setting up local yum repository on RHEL 5.
Currently there are no repositories.
 yum repolist
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
repolist: 0
Create the repository file (extension must be .repo).
cat rhel5.repo
[RHEL5ISO]
name=RHEL 5 ISO
baseurl=file:///media/Server
enabled=1
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
The repository data is taken from "repodata/repomd.xml". In RHEL5 this is located inside the Server directory for a Linux server installation. Similar repomd.xml files exist for Cluster, ClusterStorage and VT. Setting the baseurl to only /media will result in the following error
file:///media/repodata/repomd.xml: [Errno 5] OSError: [Errno 2] No such file or directory: '/media/repodata/repomd.xml'
If gpgcheck is set to 1 (enabled) then prior to installing a package its authenticity is checked using GPG signatures. For this to work the gpgkey must point to the RPM-GPG-KEY-redhat-release file, which is available under /media.
Once the file is set up, run the following commands to clean and list the repository data.
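The key can also be imported into the RPM database up front so that yum doesn't prompt for confirmation during the first install:
rpm --import /media/RPM-GPG-KEY-redhat-release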
yum clean all
Loaded plugins: rhnplugin, security
Cleaning up Everything

yum repolist
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
RHEL5ISO                                                                                                                                                                                                               | 1.5 kB     00:00
RHEL5ISO/primary                                                                                                                                                                                                       | 920 kB     00:00
RHEL5ISO                                                                                                                                                                                                                            3285/3285
repo id                                   repo name                                      status
RHEL5ISO                                  RHEL 5 ISO                                     enabled: 3,285 repolist: 3,285
Run yum list or yum grouplist to test the repository is working.
yum grouplist
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Setting up Group Process
RHEL5ISO                                                                                                                                                                                                               | 1.5 kB     00:00
RHEL5ISO/primary                                                                                                                                                                                                       | 920 kB     00:00
RHEL5ISO/group                                                                                                                                                                                                         | 1.0 MB     00:00
Installed Groups:
   Administration Tools
   Authoring and Publishing
   Development Libraries
   Development Tools
   Editors
   FTP Server
   GNOME Desktop Environment
   GNOME Software Development
   Graphical Internet
   Legacy Network Server
   Legacy Software Development
   Legacy Software Support
   Mail Server
   Network Servers
   Office/Productivity
   Printing Support
   Server Configuration Tools
   System Tools
   Text-based Internet
   X Software Development
   X Window System
Available Groups:
   DNS Name Server
   Engineering and Scientific
   Games and Entertainment
   Graphics
   Java Development
   KDE (K Desktop Environment)
   KDE Software Development
   MySQL Database
   News Server
   OpenFabrics Enterprise Distribution
   PostgreSQL Database
   Sound and Video
   Web Server
   Windows File Server
Done



Setting up local yum repository on RHEL 6 and RHEL 7.
Setting up a local yum repository is similar for RHEL6 and RHEL7 and the same repo file could be used for both versions. The only difference from RHEL5 is the baseurl.
cat rhel6.repo
[RHEL6ISO]
name=RHEL 6 ISO
baseurl=file:///media
enabled=1
gpgcheck=1
gpgkey=file:///media/RPM-GPG-KEY-redhat-release
The "repodata/repomd.xml" is available from the base media directory. As such baseurl is set to the mount point of the ISO or the DVD.
Run yum clean all and yum repolist as before and use grouplist to validate the repository.
From RHEL 6
 yum grouplist
Loaded plugins: product-id, refresh-packagekit, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Setting up Group Process
RHEL6ISO/group_gz                                                                                                                                                                                                      | 204 kB     00:00 ...
Installed Groups:
   Additional Development
   Base
   Compatibility libraries
   Console internet tools
   Debugging Tools
   Desktop Platform
   Dial-up Networking Support
   Directory Client
   E-mail server
   FTP server
   Fonts
   General Purpose Desktop
   Graphical Administration Tools
   Hardware monitoring utilities
   Java Platform
   KDE Desktop
....
From RHEL7
 yum grouplist
Loaded plugins: product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Available environment groups:
   Minimal Install
   Infrastructure Server
   File and Print Server
   Basic Web Server
   Virtualization Host
   Server with GUI
Available Groups:
   Compatibility Libraries
   Console Internet Tools
   Development Tools
   Graphical Administration Tools
   Legacy UNIX Compatibility
   Scientific Support
   Security Tools
   Smart Card Support
   System Administration Tools
   System Management
Done

Thursday, November 27, 2014

Tunneling VNC Over SSH Using PuTTY

By default VNC runs on port 5901. This port may not always be open for access. VNC access may be needed for GUI based work such as runInstaller, DBCA, DBUA etc. (there are other ways to get a GUI to the desktop, such as Xming). In situations where the VNC port is not open it could be tunneled over SSH. This post shows how to use PuTTY to this effect.
1. Set the tunneling information before opening the SSH connection. Source port is the local listening port; in this case port 5999 has been chosen as the local listener port. Destination is a host:port combination; in this case localhost is the destination host and the port is set to 5901, which is the remote VNC listening port.
2. Click Add so the tunnel takes effect when the SSH connection is established.



3. Establish the SSH connection.
4. Connect a vncviewer specifying the source port used earlier.
If the source port had been 5901 (the default VNC port) then the vncviewer connection could use the following
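For reference, the same tunnel could be created with a command line OpenSSH client instead of PuTTY (user and host names are placeholders):
ssh -L 5999:localhost:5901 oracle@remote-host
# then point the vncviewer at localhost:5999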

Wednesday, November 19, 2014

Databases With Different Timezones in Same Server

There may be occasions where two databases that reside in the same server are required to have different timezones. Changing the timezone of the database does not help in this case as that only applies to columns of type "timestamp with local time zone". Changing the timezone at OS level may also not be useful as there are two databases to contend with.
The solution is to use the TZ environment variable. This is applicable for both single instance and RAC databases. This post gives an example of having two databases with different timezones in the same server.
First up is the single instance case. The two databases are std11g2 and ent11g2 (both 11.2.0.3 databases). The timezone of std11g2 will be changed to GMT+5 while the timezone of ent11g2 will remain unaffected. As of now both databases have the same timezone
SQL> select dbtimezone from dual;

DBTIME
------
+00:00
Set TZ to the desired timezone and restart the database whose timezone needs to be changed
export TZ=Etc/GMT+5
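The restart itself would be along the following lines, from the same shell where TZ was exported (a sketch; the instance name is assumed to be std11g2):
export ORACLE_SID=std11g2
sqlplus / as sysdba <<EOF
shutdown immediate
startup
EOF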
There was no need to restart the listener. In fact in this case three databases were running in the same server, the listener was running out of a 12.1.0.2 Oracle home and the two 11.2 databases used for this post were registered with this listener. Even after the database is restarted, dbtimezone will still show the same value as before the restart, but querying systimestamp will show the time according to the timezone used. On std11g2
SQL> select dbtimezone from dual;

DBTIME
------
+00:00

SQL> SELECT systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 12.09.57.837228 PM -05:00
On ent11g2
SQL> select dbtimezone from dual;

DBTIME
------
+00:00

SQL> SELECT systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 05.09.57.694861 PM +00:00
All the remote connections to the database will use the respective timezones
unset TZ

sqlplus  sys@std11g2 as sysdba

SQL> select systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 07.26.24.918270 AM -05:00

sqlplus  sys@ent11g2 as sysdba

SQL> select systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 12.26.45.653530 PM +00:00
For RAC databases, where the start and stop of the database is managed by the clusterware, the timezone information is specified using srvctl setenv. In this case two databases (std12c1 and tzdb, both 12.1) reside on the same cluster nodes and tzdb is expected to have a different timezone. Both databases were using the same listeners (SCAN, listener). Query the current environment settings for any timezone information using getenv
srvctl getenv database -d std12c1 
std12c1:

srvctl getenv database -d tzdb
tzdb:
Set the timezone information using setenv for the tzdb database
srvctl setenv database -d tzdb -T 'TZ=Etc/GMT+5'
Verify the setting
srvctl getenv database -d tzdb
tzdb:
TZ=GMT+5
Stop and restart the database
srvctl stop database -d tzdb
srvctl start database -d tzdb
Query the databases for timestamp
sqlplus asanga@tzdb

SQL> select systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 11.22.50.685116 AM -05:00

sqlplus asanga@std12c1

SQL> select  systimestamp from dual;

SYSTIMESTAMP
-----------------------------------
19-NOV-14 04.23.08.054139 PM +00:00
To remove timezone information on RAC use
srvctl unsetenv database -d tzdb  -envs TZ
srvctl getenv database -d tzdb
tzdb:



A test case was used to simulate how an application server that connects to the database via JDBC would see the time values. The Java code is given at the end of the post. The output resulting from running this code against the RAC databases is given below.
2014-11-19 16:44:10.542481 xxxx 2014-11-19 16:44:10.542481 +0:00 xxxx 16:44:10
2014-11-19 16:44:10.0      xxxx 2014-11-19 16:44:10              xxxx 16:44:10

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

2014-11-19 16:44:10.606114 xxxx 2014-11-19 11:44:10.606114 -5:00 xxxx 16:44:10
2014-11-19 11:44:10.0      xxxx 2014-11-19 11:44:10              xxxx 11:44:10
The program queries the database for systimestamp and sysdate. The top half, before the divider (xxxxx), represents the output from the std12c1 database (which didn't change the timezone). The bottom half shows the output from the tzdb database, which had the timezone changed.
On each half there are two lines. The first line represents getting systimestamp from the JDBC resultset using the getTimestamp, getString and getTime methods. The second line represents getting sysdate from the resultset using the same set of methods. The machine that ran the Java program had the same timezone as the std12c1 database.
From the output it could be seen that querying systimestamp and getting the result using either the getTimestamp or getTime method loses the timezone information and shows the incorrect time. On the other hand, getting the result using the getString method preserves the timezone information.
However, querying sysdate and obtaining the result from any of the aforementioned methods doesn't have this problem and the time with respect to the timezone used is given. Therefore applications could run into problems depending on how systimestamp results are obtained, if the client side timezone is different from the database timezone.
To overcome this problem change the timezone on the application servers to match the database timezone. If multiple applications are running out of the same server, use the "user.timezone" JVM property to set the timezone for each application based on the database it is connecting to.
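Setting the JVM timezone is a startup flag rather than a code change; for example (the jar name is a placeholder):
java -Duser.timezone="Etc/GMT+5" -jar app.jar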

Java code used for the test case
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import oracle.jdbc.pool.OracleDataSource;

public class Test {

  public static void main(String[] args) throws SQLException {


        tz1();
        System.out.println("\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n");
        tz2();

    }

    public static void tz1() throws SQLException {

        OracleDataSource ds = new OracleDataSource();
        ds.setUser("asanga");
        ds.setPassword("asa");
        ds.setURL("jdbc:oracle:thin:@192.168.0.66:1521:ent11g2");

        Connection con = ds.getConnection();
        PreparedStatement pr = con.prepareStatement("select systimestamp,sysdate from dual");
        ResultSet rs = pr.executeQuery();
        while (rs.next()) {

            System.out.println(rs.getTimestamp(1) + " xxxx " + rs.getString(1) + " xxxx " + rs.getTime(1));
            System.out.println(rs.getTimestamp(2) + " xxxx " + rs.getString(2) + " xxxx " + rs.getTime(2));


        }

        rs.close();
        pr.close();
        con.close();

    }

    public static void tz2() throws SQLException {

        OracleDataSource ds = new OracleDataSource();
        ds.setUser("asanga");
        ds.setPassword("asa");
        ds.setURL("jdbc:oracle:thin:@192.168.0.66:1521:std11g2");

        Connection con = ds.getConnection();
        PreparedStatement pr = con.prepareStatement("select systimestamp,sysdate from dual");
        ResultSet rs = pr.executeQuery();
        while (rs.next()) {

            System.out.println(rs.getTimestamp(1) + " xxxx " + rs.getString(1) + " xxxx " + rs.getTime(1));
            System.out.println(rs.getTimestamp(2) + " xxxx " + rs.getString(2) + " xxxx " + rs.getTime(2));

        }

        rs.close();
        pr.close();
        con.close();
    }
}

Friday, October 24, 2014

APPEND_VALUES Hint and JDBC

The APPEND_VALUES hint was introduced in 11gR2 for direct path inserts with a values clause. The APPEND hint is only useful when doing direct path loading with a select sub-query. The Oracle documentation mentions that the APPEND_VALUES hint is useful in enhancing performance and lists OCI programs and PL/SQL as examples. There's no mention of JDBC. Is there a performance gain when using the APPEND_VALUES hint when the inserts are issued through JDBC?
This is a simple test case that tries to answer that question. The test case involves inserting 50000 rows into a single column table. The insert statement is issued with and without (conventional insert) the hint. The table is also created with and without logging enabled. APPEND* hints generate less redo when the table has nologging enabled, but this depends not only on the table but on other factors as well. The test measures the redo size and CPU used by session statistics and the total elapsed time to insert the rows (measured on the side of the Java test code). The test is repeated for the above combinations employing JDBC batching as well. The Java code used for the test case is given at the end of the post. The tests were carried out on an 11.2.0.3 EE database. (Update 2014/10/24: the same test was done on a 12.1.0.2 EE DB. The results were more or less the same. In 12.1.0.2 even nologging + batching with the append hint didn't outperform the same test without the hint.)
First comparison is the size of the redo generated during the inserts.
The first graph shows the redo generated without batching inserts. There's no surprise that when the table is in nologging mode the amount of redo generated is less than when the table is in logging mode. But not having the append hint (in this post append hint means the APPEND_VALUES hint) seems to generate less redo than having the append hint. This is true when the table is in logging and in nologging mode. On the other hand, when the inserts are batched and the table is in nologging mode, having the append hint results in the minimum redo being generated, and this amount is less than the redo generated without the append hint (in both logging and nologging modes). If the table is in logging mode then batching without the append hint results in less redo compared to using the append hint.
The next statistic is the CPU used for completing the inserts. The "CPU used by this session" statistic is used to calculate this, by capturing its value before and after the inserts.
Batching the inserts results in the minimum CPU being consumed compared to not batching. There's no great difference in CPU consumption when the append hint is used compared to when it is not. However, when inserts are not batched, the amount of CPU consumed doubles and triples when the append hint is used compared to without the hint. So in terms of CPU consumption, having the append hint and not batching the inserts will result in performance degradation.
The final statistic is the total elapsed time to insert the rows. This is roughly the total execution time of the test code. The time is measured in milliseconds.
Similar to CPU usage, batching results in the lowest elapsed time. This is no surprise as CPU is a component of the overall elapsed time. However, when inserts are not batched, having the append hint results in a higher elapsed time compared to inserting without the append hint.
From this limited test case it seems that the only time it's beneficial to use the append hint with JDBC is when inserts are batched and the table is in nologging mode. At other times non-batched and batched inserts outperform the append hint. But having a nologging table may not be an option in a production system, and even if it is possible to have a nologging table, highly concurrent inserts into a nologging table could result in a high number of control file related wait events.
Furthermore, there are a few other points to consider before deciding to use the append hint in an application. When APPEND_VALUES is used to insert into a table, another session cannot insert until the first session commits. The second session will hang, waiting on an enq: TM - contention wait event, which is usually associated with unindexed foreign key issues. So the concurrent nature of the inserts must be considered. If the inserts are highly concurrent then having the append hint may not be a good idea.
Within the same session after one insert another cannot be made without first committing the previous insert.
SQL> set auto off
SQL>  insert /*+ append_values */ into append_hint_test values ('abc');

1 row created.

SQL> insert /*+ append_values */ into append_hint_test values ('def');
insert /*+ append_values */ into append_hint_test values ('def')
                                 *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
Therefore Java code that's been written to reuse cursors with auto commit set to false will encounter the following error
Exception in thread "main" java.sql.SQLException: ORA-12838: cannot read/modify an object after modifying it in parallel
Also the append hint results in direct loading of data to the end of the table. This results in continuous growth of the table even if there's free space available (which may or may not be a problem in some cases). Therefore it may be better to use traditional batching with JDBC than the append hint, as the negative consequences of using it seem to outweigh the gains.



On the other hand, in PL/SQL with batch inserts (using FORALL) the append hint seems to outperform conventional inserts. The PL/SQL code used is also given at the end of the post. The graph below shows the redo size generated for inserting 50000 rows with and without the append hint.


Create table for the test cases.
create table append_hint_test(value varchar2(50));
create table append_hint_plsql_test(value varchar2(50));
Java Code Used.
import java.sql.Connection;
import java.sql.PreparedStatement;
import oracle.jdbc.pool.OracleDataSource;

public class AppendHintTest {

    public static void main(String[] args) throws Exception {

        OracleDataSource dataSource = new OracleDataSource();
        dataSource.setURL("jdbc:oracle:thin:@192.168.0.66:1521:ent11g2");
        dataSource.setUser("asanga");
        dataSource.setPassword("asa");

        Connection con = dataSource.getConnection();
//               con.setAutoCommit(false);  
        DBStats stats = new DBStats();
        stats.initStats(con);

        String SQL = "insert /*+ APPEND_VALUES */ into append_hint_test values (?)";
//               String SQL = "insert into append_hint_test values (?)"; 

        PreparedStatement pr = con.prepareStatement(SQL);


        long t1 = System.currentTimeMillis();

        for (int i = 0; i < 50000; i++) {
            pr.setString(1, "akhgaipghapga " + i);
            pr.execute();

//                   pr.addBatch();
//                   if(i%1000==0){
//                       pr.executeBatch();

//                   }
        }

//               pr.executeBatch();
        con.commit();
        pr.close();

        long t2 = System.currentTimeMillis();
        String[][] statsValues = stats.getStatsDiff(con);
        con.close();
        System.out.println("time taken " + (t2 - t1));

        for (String[] x : statsValues) {

            System.out.println(x[0] + " : " + x[1]);
        }

    }
}

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;

public class DBStats {

    private HashMap<String, Long> stats = new HashMap<>();
    private String SQL = "select name,value " + "from v$mystat,v$statname " + "where v$mystat.statistic#=v$statname.statistic# "
            + "and v$statname.name in ('CPU used when call started','CPU used by this session','db block gets',"
            + "'db block gets from cache','db block gets from cache (fastpath)','db block gets direct',"
            + "'consistent gets','consistent gets from cache','consistent gets from cache (fastpath)',"
            + "'consistent gets - examination','consistent gets direct','physical reads',"
            + "'physical reads direct','physical read IO requests','physical read bytes',"
            + "'consistent changes','physical writes','physical writes direct',"
            + "'physical write IO requests','physical writes from cache','redo size')";

    public void initStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();


            while (rs.next()) {

                stats.put(rs.getString(1), rs.getLong(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }

    public String[][] getStatsDiff(Connection con) {
        
        String[][] statDif = new String[stats.size()][2];

        try {
            PreparedStatement pr = con.prepareStatement(SQL);

            ResultSet rs = pr.executeQuery();

            int i = 0;
            while (rs.next()) {
               Long val = rs.getLong(2) - stats.get(rs.getString(1));
               statDif[i][0] = rs.getString(1);
               statDif[i][1] = val.toString();
               i++;
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }


        return statDif;


    }
}
PL/SQL code used. Before running the PL/SQL test, populate the Append_Hint_Test table with rows using the Java code above.
SET serveroutput ON
DECLARE
Type Arry_Type
IS
  TABLE OF Append_Hint_Test%Rowtype INDEX BY PLS_INTEGER;
  Loadtable Arry_Type;
  redosize1 NUMBER;
  redosize2 NUMBER;
  t1        NUMBER;
  t2        NUMBER;
Begin
  EXECUTE immediate 'truncate table append_hint_plsql_test';
  Select * Bulk Collect Into Loadtable From Append_Hint_Test ;
  
  Dbms_Output.Put_Line(Loadtable.Count);
  
  SELECT Value
  INTO Redosize1
  FROM V$mystat,
    V$statname
  Where V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  
  Dbms_Output.Put_Line('redo size 1 '||Redosize1);
  T1          := Dbms_Utility.Get_Time;
  
  Forall Indx IN 1 .. Loadtable.Count
--  INSERT /*+ APPEND_VALUES */
--  INTO append_hint_plsql_test VALUES
--    (Loadtable(Indx).Value
--    );
  Insert Into Append_Hint_Plsql_Test Values (Loadtable(Indx).Value);

  Commit;
  
  T2 := Dbms_Utility.Get_Time;
  
  SELECT Value
  INTO Redosize2
  FROM V$mystat,
    V$statname
  WHERE V$mystat.Statistic#=V$statname.Statistic#
  AND V$statname.Name      = 'redo size';
  Dbms_Output.Put_Line('redo size 2 '||Redosize2);
  Dbms_Output.Put_Line('redo generated : '||(Redosize2-Redosize1)|| ' Time taken : '||(t2-t1));
END;
/

Friday, September 12, 2014

Adding Far Sync Instances to Existing Data Guard Configuration

This post lists the steps for adding far sync instances to an existing data guard configuration (for both primary and standby databases, to be used when the standby becomes primary). Far sync is a new feature introduced with 12c which allows redo data to be transported synchronously from the primary DB to a "nearby" far sync instance, which then transports the redo data asynchronously "over a longer distance". The idea is that there's low overhead on the primary when transporting redo synchronously to a "nearby" far sync instance compared to transporting "over a long distance" to achieve zero data loss, as well as off-loading the redo transport.
The Oracle documentation provides a complete description of the far sync concept. Given below are a few important excerpts.
Many configurations have a primary database shipping redo to a standby database using asynchronous transport at the risk of some data loss at failover time. Using synchronous redo transport to achieve zero data loss may not be a viable option because of the impact on the commit response times at the primary due to network latency between the two databases. Creating a far sync instance close to the primary has the benefit of minimizing impact on commit response times to an acceptable threshold (due to the smaller network latency between primary and far sync instance) while allowing for higher data protection guarantees -- if the primary were to fail, and assuming the far sync instance was synchronized at the time of the failure, the far sync instance and the terminal standby would coordinate a final redo shipment from the far sync instance to the standby to ship any redo not yet available to the Standby and then perform a zero-data-loss failover.
A far sync instance manages a control file, receives redo into standby redo logs (SRLs), and archives those SRLs to local archived redo logs. A far sync instance does not have user data files, cannot be opened for access, cannot run redo apply, and can never function in the primary role or be converted to any type of standby database.
Far sync instances are part of the Oracle Active Data Guard Far Sync feature, which requires an Oracle Active Data Guard license.
In a configuration that contains a far sync instance, there must still be a direct network connection between the primary database and the remote standby database. The direct connection between the primary and the remote standby is used to perform health checks and switchover processing tasks. It is not used for redo transport unless the standby has been configured as an alternate destination in case the far sync instance fails and there is no alternate far sync configured to maintain the protection level.
The existing data guard configuration's primary database parameter settings and the active data guard creation script for the standby database are given below (other prerequisites for setting up data guard are omitted).
Primary database parameter changes
alter system set log_archive_config='dg_config=(ent12c1,ent12c1s)' scope=both ;
alter system set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=ent12c1' scope=both;
alter system set log_archive_dest_2='service=ENT12C1STNS ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=ent12c1s' scope=both;
alter system set log_archive_dest_state_2='defer' scope=both;
alter system set log_archive_dest_state_1='enable' scope=both;
alter system set fal_server='ENT12C1STNS' scope=both;
alter system set log_archive_max_processes=10 scope=both;
alter system set db_file_name_convert='/opt/app/oracle/oradata/ENT12C1S','/data/oradata/ENT12C1' scope=spfile;
alter system set log_file_name_convert='/opt/app/oracle/oradata/ENT12C1S','/data/oradata/ENT12C1' ,'/opt/app/oracle/fast_recovery_area/ENT12C1S','/data/flash_recovery/ENT12C1' scope=spfile;
alter system set standby_file_management='AUTO' scope=both;
Standby creation script
mkdir -p  /opt/app/oracle/oradata/ENT12C1S/controlfile
mkdir -p /opt/app/oracle/fast_recovery_area/ENT12C1S/controlfile

duplicate target database for standby from active database spfile
parameter_value_convert 'ent12c1','ent12c1s','ENT12C1','ENT12C1S','data','opt/app/oracle','flash_recovery','fast_recovery_area'
set db_unique_name='ent12c1s'
set db_create_file_dest='/opt/app/oracle/oradata'
set db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
set db_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/ENT12C1S'
set log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/ENT12C1S','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/ENT12C1S'
set log_archive_max_processes='10'
set fal_server='ENT12C1TNS'
set log_archive_dest_2='service=ENT12C1TNS ASYNC NOAFFIRM max_failure=10 max_connections=5 reopen=180 valid_for=(online_logfiles,primary_role) db_unique_name=ent12c1'
set log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=ent12c1s';
This single instance data guard configuration, which transports redo asynchronously, is transformed as shown below.
In situation 1 ENT12C1 is the primary database, ENT12C1S is the standby and FS12C1 is the far sync instance to which the primary database ships redo synchronously.
After a role switch, when ENT12C1S becomes the new primary it uses FS12C1S as the far sync instance to transport redo synchronously.
In both situations there exists a direct redo transport path between primary and standby which will transport redo asynchronously in case of far sync instance failure. Once the far sync is back up again the data guard configuration will revert to using the far sync instance for redo transport. If standby redo logs were created on the primary then standby redo logs will be created automatically for the far sync instances when they are in use for redo transport.
1. On the servers used for creating the far sync instances, install the Oracle database software and create a listener. There's no requirement to create a static listener configuration as the far sync instance automatically registers with the listener.
2. Create TNS entries for far sync instances on the existing databases (primary and standby) and copy the existing TNS entries into the far sync instances tnsnames.ora file.
cat tnsnames.ora

ENT12C1TNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ent12c1-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ent12c1)
    )
  )

ENT12C1STNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ent12c1s-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ent12c1)
    )
  )

FS12C1TNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fs12c1-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = fs12c1)
    )
  )

FS12C1STNS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fs12c1s-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = fs12c1s)
    )
  )
3. Create a control file for the far sync instances by connecting to the primary database. The same control file is used for both far sync instances in this case (fs12c1 and fs12c1s).
SQL> ALTER DATABASE CREATE FAR SYNC INSTANCE CONTROLFILE AS '/home/oracle/controlfs.ctl';
4. Copy the control file to the far sync instances. In this case the control file is multiplexed and renamed on the far sync instances (shown in the pfiles in subsequent steps).
scp controlfs.ctl fs12c1-host:/opt/app/oracle/oradata/FS12C1/controlfile/controlfs01.ctl
scp controlfs.ctl fs12c1s-host:/opt/app/oracle/oradata/FS12C1S/controlfile/controlfs01.ctl
Similarly copy the password file from the primary to $ORACLE_HOME/dbs on the servers where the far sync instances will be created. For far sync instances fs12c1 and fs12c1s the password file needs to be renamed to orapwfs12c1 and orapwfs12c1s respectively.
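A possible way of doing this, assuming the same ORACLE_HOME path on all servers and the primary password file being named orapwent12c1:
scp $ORACLE_HOME/dbs/orapwent12c1 fs12c1-host:$ORACLE_HOME/dbs/orapwfs12c1
scp $ORACLE_HOME/dbs/orapwent12c1 fs12c1s-host:$ORACLE_HOME/dbs/orapwfs12c1s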

5. Create pfile from the primary spfile. This will be modified to reflect the far sync instance settings.
SQL> create pfile='/home/oracle/pfilefs.ora' from spfile;
6. Copy the pfile to the far sync instances ($ORACLE_HOME/dbs) and rename it to reflect the instance names (e.g. initfs12c1.ora and initfs12c1s.ora). Modify the init file used for the primary's far sync instance (fs12c1) as shown below. Not all parameters are needed for far sync and those could be removed from the pfile.
cat initfs12c1.ora

*.audit_file_dest='/opt/app/oracle/admin/fs12c1/adump'
*.audit_trail='OS'
*.compatible='12.1.0.2'
*.control_files='/opt/app/oracle/oradata/FS12C1/controlfile/controlfs01.ctl','/opt/app/oracle/oradata/FS12C1/controlfile/controlfs02.ctl'
*.db_block_size=8192
*.db_create_file_dest='/opt/app/oracle/oradata'
*.db_name='ent12c1'
*.db_recovery_file_dest_size=21474836480
*.db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
*.db_unique_name='fs12c1'
*.diagnostic_dest='/opt/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=fs12c1XDB)'
*.fal_server='ENT12C1TNS'
*.log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)'
*.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=fs12c1'
*.log_archive_dest_2='service=ENT12C1STNS ASYNC NOAFFIRM valid_for=(STANDBY_LOGFILES,standby_role) db_unique_name=ent12c1s max_failure=10 max_connections=5 reopen=180'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=10
*.log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/FS12C1','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/FS12C1'
*.remote_login_passwordfile='EXCLUSIVE'
*.standby_file_management='AUTO'
7. Mount the far sync instance using the pfile and then create an spfile from the pfile. Without the spfile a warning is shown when the far sync is added to a data guard broker configuration. Besides that, having an spfile also helps with any subsequent parameter changes without the need to restart the far sync instance. Restart (mount) the far sync instance using the spfile.
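A sketch of this step from the far sync server (ORACLE_SID set to fs12c1, pfile at the default location):
export ORACLE_SID=fs12c1
sqlplus / as sysdba <<EOF
startup mount
create spfile from pfile;
shutdown immediate
startup mount
EOF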



8. Similarly create the pfile for the far sync instance used by the current standby (when it becomes the primary), fs12c1s.
cat initfs12c1s.ora

*.audit_file_dest='/opt/app/oracle/admin/fs12c1s/adump'
*.audit_trail='OS'
*.compatible='12.1.0.2'
*.control_files='/opt/app/oracle/oradata/FS12C1S/controlfile/control01.ctl','/opt/app/oracle/fast_recovery_area/FS12C1S/controlfile/control02.ctl'
*.db_create_file_dest='/opt/app/oracle/oradata'
*.db_name='ent12c1'
*.db_recovery_file_dest_size=21474836480
*.db_recovery_file_dest='/opt/app/oracle/fast_recovery_area'
*.db_unique_name='fs12c1s'
*.diagnostic_dest='/opt/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=fs12c1sXDB)'
*.fal_server='ENT12C1STNS'
*.log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)'
*.log_archive_dest_1='location=use_db_recovery_file_dest valid_for=(all_logfiles,all_roles) db_unique_name=fs12c1s'
*.log_archive_dest_2='service=ENT12C1TNS ASYNC NOAFFIRM valid_for=(standby_logfiles,standby_role) db_unique_name=ent12c1 max_failure=10 max_connections=5 reopen=180'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=10
*.log_file_name_convert='/data/oradata/ENT12C1','/opt/app/oracle/oradata/FS12C1S','/data/flash_recovery/ENT12C1','/opt/app/oracle/fast_recovery_area/FS12C1S'
*.remote_login_passwordfile='EXCLUSIVE'
*.standby_file_management='AUTO'
9. As in step 7, create an spfile and restart (mount) the far sync instance using the spfile.

10. Update the log archive config parameter on both primary and standby to include the far sync instance information as well.
alter system set log_archive_config='dg_config=(ent12c1,ent12c1s,fs12c1,fs12c1s)' scope=both ;
11. Update the fal server parameter on the primary (ent12c1) as below, which allows ent12c1 to fetch archive logs (when it becomes a standby) either from the new primary (ent12c1s) or from the far sync instance (fs12c1s).
alter system set fal_server='ENT12C1STNS','FS12C1STNS' scope=both;
12. Update the fal server parameter on the standby (ent12c1s) so that it can fetch the archive logs either from the ent12c1 (primary) or far sync instance (fs12c1).
alter system set fal_server='ENT12C1TNS','FS12C1TNS' scope=both;
13. Update the log archive destination and log archive destination state on the primary such that redo transport is synchronous between the primary and far sync and asynchronous between the primary and standby (direct). Furthermore, the asynchronous log archive destination is set to the alternate state so that when the synchronous log archive destination fails the data guard configuration starts shipping redo via this alternate log archive destination.
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set log_archive_dest_2='service=FS12C1TNS SYNC AFFIRM  db_unique_name=fs12c1  max_failure=1 valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_3 max_connections=5' scope=both;

alter system set log_archive_dest_state_3='alternate' scope=both;
alter system set log_archive_dest_3='service=ENT12C1STNS ASYNC NOAFFIRM  db_unique_name=ent12c1s  valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_2 max_failure=10 max_connections=5 reopen=180' scope=both;
14. Change and add log archive destination settings on the standby database so that when it becomes primary (ent12c1s) it too can use the far sync instance for synchronous redo transport and, failing that, use asynchronous redo transport directly to the standby at the time (ent12c1).
alter system set log_archive_dest_state_2='enable' scope=both;
alter system set log_archive_dest_2='service=FS12C1STNS SYNC AFFIRM db_unique_name=fs12c1s valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_3 max_connections=5  max_failure=1' scope=both;

alter system set log_archive_dest_state_3='alternate' scope=both;
alter system set log_archive_dest_3='service=ENT12C1TNS ASYNC NOAFFIRM  db_unique_name=ent12c1 valid_for=(online_logfiles,primary_role) alternate=log_archive_dest_2 max_failure=10 max_connections=5 reopen=180' scope=both;
15. Increase the protection mode of the primary database to maximum availability.
ALTER DATABASE SET STANDBY TO MAXIMIZE AVAILABILITY;

SQL> SELECT PROTECTION_MODE FROM V$DATABASE;

PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY
16. Do a few log switches and verify that redo transport is happening via the far sync instance. The easiest way to monitor is through the alert logs; the log switch will be logged in both the far sync instance alert log and the standby instance alert log if the redo was transported via the far sync. If there are any issues with regard to the log archive destinations they could be observed on the primary
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     FS12C1TNS
LOG_ARCHIVE_DEST_3             ALTERNATE ENT12C1STNS
and on the far sync instance
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     ENT12C1STNS
STANDBY_ARCHIVE_DEST           VALID     USE_DB_RECOVERY_FILE_DEST
The above output shows that the primary is transporting to log archive dest 2, whose status is valid, while dest 3 is still in the alternate state. The far sync output shows that the far sync instance is shipping redo as per its log archive dest 2 value and the current status is valid.
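A couple of forced log switches on the primary are enough to drive redo through the path and confirm the new sequence numbers appear in the far sync and standby alert logs; for example:
sqlplus / as sysdba <<EOF
alter system switch logfile;
alter system archive log current;
EOF
# then check the far sync and standby alert logs for the new sequences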

17. Shutdown abort the far sync instance and view the archive dest output. If the configuration works properly then log archive dest 3 should become valid and redo transport should happen directly between primary and standby in asynchronous mode.
SQL> SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             ALTERNATE FS12C1TNS
LOG_ARCHIVE_DEST_3             VALID     ENT12C1STNS
From the above output it could be seen that after the far sync instance is terminated, log archive dest 3 becomes the valid destination and log archive dest 2 is kept as an alternate destination. In the primary instance alert log the following could be observed when the far sync instance is terminated
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Failed to archive log 1 thread 1 sequence 1503 (3113)
Destination LOG_ARCHIVE_DEST_2 no longer supports SYNCHRONIZATION
and on the standby instance alert log the following
Primary database is in MAXIMUM PERFORMANCE mode
Changing standby controlfile to MAXIMUM PERFORMANCE mode
It must also be mentioned that on a few occasions where the far sync was abruptly terminated (shutdown abort) the recovery process on the standby stopped due to lost writes
MRP0: Background Media Recovery terminated with error 742  <-- far sync instance terminated
Fri Sep 12 17:32:28 2014
Errors in file /opt/app/oracle/diag/rdbms/ent12c1s/ent12c1s/trace/ent12c1s_pr00_10098.trc:
ORA-00742: Log read detects lost write in thread 1 sequence 1503 block 868
ORA-00312: online log 4 thread 1: '/opt/app/oracle/fast_recovery_area/ENT12C1S/onlinelog/o1_mf_4_b0y1dn8v_.log'
ORA-00312: online log 4 thread 1: '/opt/app/oracle/oradata/ENT12C1S/onlinelog/o1_mf_4_b0y1dml4_.log'
Managed Standby Recovery not using Real Time Apply
RFS[16]: Assigned to RFS process (PID:10165)
RFS[16]: Selected log 5 for thread 1 sequence 1504 dbid 209099011 branch 833730501
Fri Sep 12 17:32:28 2014
Recovery interrupted!
Recovered data files to a consistent state at change 19573793
Fri Sep 12 17:32:28 2014
Errors in file /opt/app/oracle/diag/rdbms/ent12c1s/ent12c1s/trace/ent12c1s_pr00_10098.trc:
ORA-00742: Log read detects lost write in thread 1 sequence 1503 block 868
ORA-00312: online log 4 thread 1: '/opt/app/oracle/fast_recovery_area/ENT12C1S/onlinelog/o1_mf_4_b0y1dn8v_.log'
ORA-00312: online log 4 thread 1: '/opt/app/oracle/oradata/ENT12C1S/onlinelog/o1_mf_4_b0y1dml4_.log'
Fri Sep 12 17:32:28 2014
MRP0: Background Media Recovery process shutdown (ent12c1s)
Fri Sep 12 17:32:28 2014
Archived Log entry 107 added for thread 1 sequence 1503 rlc 833730501 ID 0xde697d0 dest 3:
Fri Sep 12 17:38:51 2014
alter database recover managed standby database disconnect  <-- manual start at 17:38 after it was stopped 17:32
So it may be a good idea to keep an eye on the recovery when the far sync instance terminates. According to 1302539.1, when active data guard is in place there's automatic block repair transparent to the user. Once the far sync instance is back up again the redo transport will go back to the original setting
SQL>  SELECT DEST_NAME,STATUS,DESTINATION FROM V$ARCHIVE_DEST WHERE DESTINATION IS NOT NULL;

DEST_NAME                      STATUS    DESTINATION
------------------------------ --------- ------------------------------
LOG_ARCHIVE_DEST_1             VALID     USE_DB_RECOVERY_FILE_DEST
LOG_ARCHIVE_DEST_2             VALID     FS12C1TNS
LOG_ARCHIVE_DEST_3             ALTERNATE ENT12C1STNS
And the following could be observed in the alert log of the primary
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 1506
LGWR: Standby redo logfile selected for thread 1 sequence 1506 for destination LOG_ARCHIVE_DEST_2
Fri Sep 12 16:21:08 2014
Thread 1 advanced to log sequence 1506 (LGWR switch)
  Current log# 1 seq# 1506 mem# 0: /data/oradata/ENT12C1/onlinelog/o1_mf_1_9bcsl3ds_.log
  Current log# 1 seq# 1506 mem# 1: /data/flash_recovery/ENT12C1/onlinelog/o1_mf_1_9bcsl3hm_.log
Fri Sep 12 16:21:08 2014
Archived Log entry 1408 added for thread 1 sequence 1505 ID 0xde697d0 dest 1:
Fri Sep 12 16:21:11 2014
Thread 1 cannot allocate new log, sequence 1507
Checkpoint not complete
  Current log# 1 seq# 1506 mem# 0: /data/oradata/ENT12C1/onlinelog/o1_mf_1_9bcsl3ds_.log
  Current log# 1 seq# 1506 mem# 1: /data/flash_recovery/ENT12C1/onlinelog/o1_mf_1_9bcsl3hm_.log
Fri Sep 12 16:21:14 2014
Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
On the alert log of the far sync instance
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Standby controlfile consistent with primary
RFS[1]: Assigned to RFS process (PID:3557)
RFS[1]: Selected log 5 for thread 1 sequence 1506 dbid 209099011 branch 833730501
Fri Sep 12 17:04:06 2014
******************************************************************
TT00: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
TT00: Standby redo logfile selected for thread 1 sequence 1506 for destination LOG_ARCHIVE_DEST_2
Fri Sep 12 17:04:07 2014
Changing standby controlfile to MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process (PID:3561)
RFS[2]: Opened log for thread 1 sequence 1505 dbid 209099011 branch 833730501
Fri Sep 12 17:04:08 2014
Archived Log entry 203 added for thread 1 sequence 1505 rlc 833730501 ID 0xde697d0 dest 2:
Fri Sep 12 17:04:13 2014
Changing standby controlfile to MAXIMUM AVAILABILITY mode
On the standby alert log
Primary database is in MAXIMUM PERFORMANCE mode
Changing standby controlfile to MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Similarly, it is possible to shut down the standby instance and see if the primary is able to ship redo to the far sync instance and if the redo is fetched by the standby once it is started again. With these tests, situation 1 in the figure shown at the beginning of the post is complete.

18. To test situation 2 in the figure above, do a switchover and check the redo transport via the far sync instance (fs12c1s).

This concludes adding far sync instances to an existing data guard configuration on 12c.

Useful metalink notes
Cascaded Standby Databases in Oracle 12c [ID 1542969.1]
Data Guard 12c New Feature: Far Sync Standby [ID 1565071.1]
Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration [ID 1302539.1]

Tuesday, September 2, 2014

LOB Chunk and Tablespace Block Size

The chunk value corresponds to the data size used by Oracle when reading or writing a LOB value. Once set, the chunk size cannot be changed. Though it doesn't matter for LOBs stored in row, for out of row LOBs space is used in multiples of the chunk size.
From Oracle documentation (for basicfile lobs) A chunk is one or more Oracle blocks. You can specify the chunk size for the BasicFiles LOB when creating the table that contains the LOB. This corresponds to the data size used by Oracle Database when accessing or modifying the LOB value. Part of the chunk is used to store system-related information and the rest stores the LOB value. If the tablespace block size is the same as the database block size, then CHUNK is also a multiple of the database block size. The default CHUNK size is equal to the size of one tablespace block, and the maximum value is 32K. Once the value of CHUNK is chosen (when the LOB column is created), it cannot be changed. Hence, it is important that you choose a value which optimizes your storage and performance requirements.
For securefile CHUNK is an advisory size and is provided for backward compatibility purposes.
From a performance perspective it is considered that accessing LOBs in big chunks is more efficient. You can set CHUNK to the data size most frequently accessed or written. For example, if only one block of LOB data is accessed at a time, then set CHUNK to the size of one block. If you have big LOBs, and read or write big amounts of data, then choose a large value for CHUNK.
This post shows the results of a test case carried out to compare the performance benefits of using a large chunk size along with a tablespace with a large block size (8k vs 32k). The BLOB used for the test case is 800KB. The Java code used for the test case is given at the end of the post. For each chunk size (8k vs 32k) the caching option was also changed (nocache vs cache) as it also has a direct impact on IO usage. The LOB is stored out of row, in a separate tablespace from the table.
The table creation DDL used for the basicfile test is shown below (comment and uncomment each option based on the test case). LOB32KTBS is a tablespace with a 32k block size while LOB8KTBS is a tablespace with an 8k block size.
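A sketch of how such tablespaces could be created; a non-default block size tablespace requires the matching buffer cache to be sized first (datafile paths and sizes are assumptions):
sqlplus / as sysdba <<EOF
alter system set db_32k_cache_size=64m scope=both;
create tablespace LOB32KTBS datafile '/data/oradata/ENT11G2/lob32ktbs01.dbf' size 2g blocksize 32k;
create tablespace LOB8KTBS  datafile '/data/oradata/ENT11G2/lob8ktbs01.dbf'  size 2g;
EOF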
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
 LOB
  (
    sesob
  )
  STORE AS object_lob_seg (
    TABLESPACE LOB32KTBS
    --TABLESPACE LOB8KTBS
    DISABLE STORAGE IN ROW
    CHUNK 32K
    --CHUNK 8K
    CACHE
    --NOCACHE
    PCTVERSION 0
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
      TABLESPACE LOB32KTBS
      --TABLESPACE LOB8KTBS
      STORAGE (MAXEXTENTS UNLIMITED)
    )
  )
/
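For completeness, a tablespace with a non-default block size such as LOB32KTBS requires a matching buffer cache to be configured before it can be created. The sketch below is only an outline; the cache size, datafile path and datafile size are placeholders and not the exact values used in this test.
-- Allocate a buffer cache for the non-default 32k block size (size is a placeholder)
ALTER SYSTEM SET db_32k_cache_size = 64M;

-- Create the tablespace used for the LOB segment (datafile path and size are placeholders)
CREATE TABLESPACE lob32ktbs DATAFILE '/opt/app/oracle/oradata/ent11g2/lob32ktbs01.dbf' SIZE 1G
BLOCKSIZE 32K;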
The test cases comprised reading a LOB column for a row and inserting LOBs. The IO related statistics comparison for the select test case is given below. Based on the graphs it can be seen that a 32K chunk size on a tablespace with a block size of 32K requires fewer logical and physical reads compared to having an 8k chunk and the LOB segment on an 8k block size tablespace. Though not shown on the graphs, in a separate test, using a 32k chunk size and placing the LOB segment on an 8K block size tablespace had the same performance characteristics as having an 8k chunk on an 8k block size tablespace. On the other hand, having a chunk of 8k and placing the LOB segment on a 32k block size tablespace had the same performance characteristics as having a 32k chunk on a 32k block size tablespace. This means that the chunk size alone is not going to reduce the amount of IO; the block size of the tablespace where the LOB segment is stored has an influence as well.

The next test was the insertion of LOBs. The results are shown in the following two graphs. Similar to the read test, having a large chunk size and tablespace block size for the LOB reduces the IO.




The same test was carried out for securefile LOB segments. The table creation DDL is given below. The only differences compared to the basicfile DDL are the securefile keyword and "retention none" in place of "pctversion 0". All other parameters/options are the same.
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
 LOB
  (
    sesob
  )
  STORE AS securefile object_lob_seg (
    TABLESPACE LOB32KTBS
    --TABLESPACE LOB8KTBS
    DISABLE STORAGE IN ROW
    CHUNK 32K
    --CHUNK 8K
    CACHE
    --NOCACHE
    RETENTION NONE
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
      TABLESPACE LOB32KTBS
      --TABLESPACE LOB8KTBS
      STORAGE (MAXEXTENTS UNLIMITED)
    )
  )
/
The results of the select test are shown in the graphs below. Similar to basicfile, the larger chunk size advisory and tablespace block size combination outperforms the smaller chunk/block size combination. In all cases securefile outperforms basicfile in the amount of logical and physical reads.

The outcome of the insert test is also the same as that of the basicfile insert test, where the larger chunk/block size combination outperforms the smaller chunk/block size combination. Also, between basicfile and securefile, securefile outperforms basicfile.

These tests have shown that it's better to use large chunk/tablespace block sizes for larger LOBs to reduce LOB-related logical/physical IO.

Useful White papers
SecureFile Performance
Oracle 11g: SecureFiles

Related Post
Nologging vs Logging for LOBs and enq: CF - contention

Java code used for the test. For the code of the LobStat class refer to the earlier post.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import oracle.jdbc.OraclePreparedStatement;
import oracle.jdbc.pool.OracleDataSource;

public class LobChunkTest {

    final String URL = "jdbc:oracle:thin:@192.168.0.66:1521:ent11g2";
    final String USERNAME = "asanga";
    final String PASSWORD = "asa";

    public static void main(String[] args) {

        LobChunkTest test = new LobChunkTest();
        //Insert test
        test.insertTest();

        System.out.println("\n\n************* end of insert test **************\n\n");

        //select test
        test.selectTest();

    }

    public void insertTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL(URL);
            pool.setUser(USERNAME);
            pool.setPassword(PASSWORD);

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            long t1 = System.currentTimeMillis();

            LobStat.displayStats(con);

            byte[] x = new byte[800 * 1024];
            x[1] = 10;
            x[798 * 1024] = 20;

            for (int i = 0; i < 100; i++) {

                OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("insert into ses values(?,?)");

                String sesid = "abcdefghijklmnopqrstuvwxy" + Math.random();
                pr.setString(1, sesid);
                pr.setBytes(2, x);

                pr.execute();
                con.commit();
                pr.close();

            }

            long t2 = System.currentTimeMillis();

            LobStat.displayStats(con);

            con.close();

            System.out.println("time taken " + (t2 - t1));

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public  void selectTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL("jdbc:oracle:thin:@192.168.0.66:1521:ent11g2");
            pool.setUser("asanga");
            pool.setPassword("asa");

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            String[] sesids = new String[100];

            PreparedStatement pr1 = con.prepareStatement("select sesid from ses");
            ResultSet rs1 = pr1.executeQuery();
            int i = 0;
            while (rs1.next()) {

                sesids[i] = rs1.getString(1);

                i++;

            }
            rs1.close();
            pr1.close();
            con.close();

            con = pool.getConnection();
            LobStat.displayStats(con);

            OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("select SESOB from ses where sesid=?");

            long t1 = System.currentTimeMillis();
            for (String x : sesids) {


                pr.setString(1, x);
                ResultSet rs = pr.executeQuery();

                while (rs.next()) {

                    byte[] blob = rs.getBytes(1);

                }

                rs.close();
            }

            long t2 = System.currentTimeMillis();
            System.out.println("time taken " + (t2 - t1));

            LobStat.displayStats(con);

            pr.close();

            con.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

Tuesday, August 19, 2014

Nologging vs Logging for LOBs and enq: CF - contention

Changing the logging option is one of the possible performance tuning tasks when dealing with LOBs. However, use of nologging will make recovery of these LOB segments impossible, so whether it can be used also depends on the nature of the application data. For transient data, where recovery is not expected in case of a database failure, the nologging option would be suitable. When the logging option is not explicitly specified, the LOB segment inherits the tablespace's logging option. Refer to the Oracle documentation for more logging related information on basicfile and securefile.
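The caching and logging options of an existing LOB segment can also be changed after creation with ALTER TABLE ... MODIFY LOB. A minimal sketch is shown below, using the test table created later in this post; note that CACHE implies logging, so it cannot be combined with NOLOGGING.
-- Change an existing LOB column to nocache nologging
ALTER TABLE ses MODIFY LOB (sesob) (NOCACHE NOLOGGING);

-- Revert to cache (which implies logging)
ALTER TABLE ses MODIFY LOB (sesob) (CACHE);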
This post presents the results of a simple test comparing nologging vs logging for both basicfile and securefile LOB segments. The Java code used for the test is given at the end of the post. The first case is basicfile. Nologging is not possible if caching is enabled on the LOB segment; therefore the nocache option is chosen for both the logging and nologging tests. The table definition is given below.
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
 LOB
  (
    sesob
  )
  STORE AS object_lob_seg (
    TABLESPACE lob32ktbs
    DISABLE STORAGE IN ROW
    CHUNK 32K
    NOCACHE NOLOGGING
    --NOCACHE LOGGING
    PCTVERSION 0
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
      TABLESPACE lob32ktbs
      STORAGE (MAXEXTENTS UNLIMITED)
    )
  )
/
The test involves inserting an 800KB LOB into the table 100 times and later updating each row with a similar size LOB. Since the LOB is greater than 4K, the LOB column is created with disable storage in row. The chunk size is set to 32k and the LOB segment is stored in a tablespace of 32K block size (lob32ktbs), while the table resides in a different tablespace (users) of 8K block size. Pctversion is set to 0 as no consistent reads are expected. The only option that is changed is the logging option. The test database version is 11.2.0.3.
The redo size statistic and the log file sync wait event times are compared for the two test runs as these are directly related to the logging option. The graphs below show the comparison for each test run.

It's no surprise that the nologging option generates the lowest amount of redo. The logging test case generated around 83-84MB of redo for insert and update, which is roughly the same size as the LOBs inserted/updated (800KB x 100). There's minimal redo generation during the nologging test. Since redo is counted for the entire test, the redo seen could be the redo generated for the table data insert (as opposed to the LOBs), which is still on a tablespace with logging. Nevertheless a clear difference could be observed in the amount of redo generated when nologging is used. This is also reflected in the log file sync time for the two test cases, which dropped from several minutes to under a minute.
Next the same test was executed, but this time the LOB segment was stored as a securefile. The table DDL is given below. The only difference apart from the securefile keyword is that pctversion 0 has been replaced with retention none. All other settings are the same as for the basicfile test (tablespaces, database etc). The chunk size is deprecated in securefile and, when specified, is considered only as a guidance value.
create table ses(sesid varchar2(100), sesob blob) SEGMENT CREATION IMMEDIATE TABLESPACE users
 LOB
  (
    sesob
  )
  STORE AS securefile object_lob_seg (
    TABLESPACE lob32ktbs
    DISABLE STORAGE IN ROW
    CHUNK 32K
    NOCACHE NOLOGGING
    --NOCACHE LOGGING
    RETENTION NONE
    STORAGE (MAXEXTENTS UNLIMITED)
    INDEX object_lob_idx (
      TABLESPACE lob32ktbs
      STORAGE (MAXEXTENTS UNLIMITED)
    )
  )
/
Similar to the earlier test, nologging generated a low amount of redo compared to logging and resulted in short waits on the log file sync event.

Comparing redo size and log file sync time for the nologging option between basicfile and securefile shows a mixed bag of results. Basicfile performed well for inserts in reducing redo generated and log file sync, while securefile performed well for updates.

Comparing the IO types on the OEM console during the nologging test, it was noted that securefile uses predominantly large writes while basicfile uses small writes.
Comparing the test results with logging enabled, securefile outperforms basicfile for inserts and updates in terms of log file sync wait time. Both securefile and basicfile generate roughly the same amount of redo. It must be noted that nocache logging is the default for securefile.
The table below shows all the test results.




It seems that when the application permits, it's best to use nologging for LOBs, which reduces the amount of redo generated and log file sync waits. However there are some drawbacks to using nologging on LOBs which only come to light when multiple sessions are doing LOB related DMLs. Following is an AWR snippet from a load test on a pre-production system.
After CPU, the top wait event is log file sync, and a high portion of this wait event is due to a basicfile LOB related insert statement that stores some transient data. Changing the LOB segment option to nologging resulted in lower log file sync time but it also introduced high enq: CF - contention wait times.
According to 1072417.1, enq: CF - contention is normal and expected when doing DML on LOBs with nocache nologging. CF contention occurs as Oracle records the unrecoverable system change number (SCN) in the control file. From the application perspective the overall response time degraded after changing to nologging, and the option had to be reverted back to cache logging.
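The unrecoverable SCN recorded in the control file can be observed against the datafiles holding the nologging LOB segment; a simple check is sketched below.
-- Datafiles with nologging (unrecoverable) changes recorded in the control file
SELECT file#, unrecoverable_change#, unrecoverable_time
FROM   v$datafile
WHERE  unrecoverable_change# > 0;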

Useful metalink notes
Performance Degradation as a Result of 'enq: CF - contention' [ID 1072417.1]
LOB Performance Guideline [ID 268476.1]
LOBS - Storage, Redo and Performance Issues [ID 66431.1]
LOBS - Storage, Read-consistency and Rollback [ID 162345.1]
Master Note - RDBMS Large Objects (LOBs) [ID 1268771.1]
Performance problems on a table that has hundreds of columns including LOBs [ID 1292685.1]
POOR PERFORMANCE WITH LOB INSERTS [ID 978045.1]
Securefiles Performance Appears Slower Than Basicfile LOB [ID 1323933.1]


Java Code Used for Testing
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import oracle.jdbc.OraclePreparedStatement;
import oracle.jdbc.pool.OracleDataSource;

public class LobLoggingTest {

    final String URL = "jdbc:oracle:thin:@192.168.0.66:1521:ent11g2";
    final String USERNAME = "asanga";
    final String PASSWORD = "asa";

    public static void main(String[] args) {

        LobLoggingTest test = new LobLoggingTest();
        //Insert test
        test.insertTest();

        System.out.println("\n\n************* end of insert test **************\n\n");

        //Update test
        test.updateTest();

    }

    public void insertTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL(URL);
            pool.setUser(USERNAME);
            pool.setPassword(PASSWORD);

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            long t1 = System.currentTimeMillis();

            LobStat.displayStats(con);
            LobStat.displayWaits(con);

            byte[] x = new byte[800 * 1024];
            x[1] = 10;
            x[798 * 1024] = 20;

            for (int i = 0; i < 100; i++) {

                OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("insert into ses values(?,?)");

                String sesid = "abcdefghijklmnopqrstuvwxy" + Math.random();
                pr.setString(1, sesid);
                pr.setBytes(2, x);

                pr.execute();
                con.commit();
                pr.close();

            }

            long t2 = System.currentTimeMillis();

            LobStat.displayStats(con);
            LobStat.displayWaits(con);

            con.close();

            System.out.println("time taken " + (t2 - t1));

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

public void updateTest() {
        try {

            OracleDataSource pool = new OracleDataSource();
            pool.setURL(URL);
            pool.setUser(USERNAME);
            pool.setPassword(PASSWORD);

            Connection con = pool.getConnection();
            con.setAutoCommit(false);

            String[] sesids = new String[100];

            PreparedStatement pr1 = con.prepareStatement("select sesid from ses");
            ResultSet rs1 = pr1.executeQuery();
            int i = 0;
            while (rs1.next()) {

                sesids[i] = rs1.getString(1);
                i++;
            }
            rs1.close();
            pr1.close();
            con.close();

            con = pool.getConnection();
            LobStat.displayStats(con);
            LobStat.displayWaits(con);

            OraclePreparedStatement pr = (OraclePreparedStatement) con.prepareStatement("update ses set SESOB=? where sesid=?");

            byte[] xx = new byte[800 * 1024];
            xx[1] = 10;
            xx[798 * 1024] = 20;

            long t1 = System.currentTimeMillis();
            for (String x : sesids) {

                pr.setBytes(1, xx);
                pr.setString(2, x);
                pr.execute();
                con.commit();
            }

            long t2 = System.currentTimeMillis();
            System.out.println("time taken " + (t2 - t1));

            LobStat.displayStats(con);
            LobStat.displayWaits(con);

            pr.close();
            con.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}


import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LobStat {

    public static void displayStats(Connection con) {
        try {
            PreparedStatement pr = con.prepareStatement("select name,value from v$mystat,v$statname where v$mystat.statistic#=v$statname.statistic# "
                    + " and v$statname.name in ('CPU used when call started',"
                    + " 'CPU used by this session','db block gets','db block gets from cache','db block gets from cache (fastpath)',"
                    + " 'db block gets direct','consistent gets','consistent gets from cache','consistent gets from cache (fastpath)',"
                    + " 'consistent gets - examination','consistent gets direct','physical reads','physical reads direct',"
                    + " 'physical read IO requests','physical read bytes','consistent changes','redo size','redo writes',"
                    + " 'lob writes','lob writes unaligned','physical writes direct (lob)','physical writes','physical writes direct','physical writes from cache','physical writes direct temporary tablespace'"
                    + " ,'physical writes direct temporary tablespace','securefile direct read bytes','securefile direct write bytes','securefile direct read ops'"
                    + " ,'securefile direct write ops') order by 1");

            ResultSet rs = pr.executeQuery();


            while(rs.next()){

                System.out.println(rs.getString(1)+" : "+rs.getDouble(2));
            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }

    public static void displayWaits(Connection con){
        try {
            PreparedStatement pr = con.prepareStatement("select event,total_waits,TIME_WAITED_MICRO from  V$SESSION_EVENT where sid=SYS_CONTEXT ('USERENV', 'SID') and event in ('log file sync','enq: CF - contention')");
            ResultSet rs = pr.executeQuery();

            System.out.println("event : total waits : time waited micro");
            while(rs.next()){

                System.out.println(rs.getString(1)+" : "+rs.getLong(2)+" : "+rs.getLong(3));

            }

            rs.close();
            pr.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }

    }
}