Saturday, October 15, 2022

Testing Application Continuity in Oracle 19c RAC (19.12) using Java Application

Introduction to Application Continuity

Application Continuity (AC) is a feature available with the Oracle Real Application Clusters (RAC), Oracle RAC One Node and Oracle Active Data Guard (ADG) options that masks outages from end users and applications by recovering the in-flight database sessions following recoverable outages. Application Continuity performs this recovery beneath the application so that the outage appears to the application as a slightly delayed execution.

Application Continuity improves the user experience for both planned maintenance and unplanned outages. Application Continuity enhances the fault tolerance of systems and applications that use an Oracle database.

Oracle MAA is a collection of architecture, configuration, and life cycle best practices and blueprints. It provides Oracle’s customers with valuable insights and expert recommendations which have been validated and tested working with enterprise customers. This is also an outcome of ongoing communication, with the community of database architects, software engineers, and database strategists, that helps Oracle develop a deep and complete understanding of various kinds of events that can affect availability or data integrity. Over the years, this led to the development and natural evolution of an array of availability reference architectures.

Application Continuity extends the MAA architecture to the application tier.

Environment Setup:

Primary RAC Database: orcl (orcl1 and orcl2)
GRID Home : /u01/app/19.0.0/grid
RDBMS Home: /u01/app/oracle/product/19.0.0/dbhome_1
Version    : Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.12.0.0.0
Scan        : oel70p-scan

Pre-requisites for Application Continuity:

1. Database and client (driver) version >= Oracle 12c
2. Enable Application Continuity on the Service
[oracle@oel70-base1 ~]$ srvctl modify service -d ORCL -s acservice -failovertype TRANSACTION -commit_outcome TRUE
3. Use the recommended tnsnames.ora or JDBC URL
4. Use Orachk: ($orachk -accheck)
5. Check in AWR Reports: (User Calls in = User Calls Protected)

Before Application Continuity:

Database outages cause in-flight work to be lost, leaving users and applications in-doubt

1. Developer pains
2. Duplicate submissions
3. Cancelled work
4. Errors even when planned 
5. Restart applications and mid-tiers

Step-1: Login as 'oracle' user check the database status (orcl)

[oracle@oel70-base1 ~]$ . oraenv
ORACLE_SID = [orcl1] ?
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@oel70-base1 ~]$

[oracle@oel70-base1 ~]$ srvctl status database -d orcl
Instance orcl1 is running on node oel70-base1
Instance orcl2 is running on node oel70-base2
[oracle@oel70-base1 ~]$

Step-2: Create service (acservice) and check the status of the service

[oracle@oel70-base1 ~]$ srvctl add service -s acservice -d orcl -r orcl1,orcl2
[oracle@oel70-base1 ~]$ srvctl start service -s acservice -d orcl
[oracle@oel70-base1 ~]$ srvctl status service -s acservice -d orcl
Service acservice is running on instance(s) orcl1,orcl2
[oracle@oel70-base1 ~]$

Step-3: Modify the service to check 'Application Continuity' property

[oracle@oel70-base1 ~]$ srvctl modify service -d ORCL -s acservice -failovertype TRANSACTION -replay_init_time 300 -failoverretry 30 -failoverdelay 3 -notification TRUE -commit_outcome TRUE

[oracle@oel70-base1 ~]$

a. failovertype = AUTO or TRANSACTION for Application Continuity
b. commit_outcome = TRUE for Transaction Guard
c. failoverretry = Number of connection retries per replay
d. failoverdelay = Delay in seconds between connection retries


[oracle@oel70-base1 ~]$ srvctl config service -d  `srvctl config database` -s ac_service
Service name: ac_service
Server pool:
Cardinality: 1
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Global: false
Commit Outcome: true
Failover type: TRANSACTION
Failover method:
Failover retries:
Failover delay:
Failover restore: LEVEL1
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name: pdb1
Hub service:
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Preferred instances: orcl1
Available instances: orcl2
CSS critical: no
Service uses Java: false
[oracle@oel70-base1 ~]$

Step-4: Install JDK as a 'root' user and download ojdbc.jar file and place it to lib folder.

[root@oel70-base1 oracle]# rpm -ivh jdk-12.0.2_linux-x64_bin.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:jdk-12.0.2-2000:12.0.2-ga        ################################# [100%]
[root@oel70-base1 oracle]#
[root@oel70-base1 oracle]# chmod 755 ojdbc8.jar
[root@oel70-base1 oracle]# mv ojdbc8.jar /lib/

Step-5: Update bash_profile accordingly and execute the bash_profile

[oracle@oel70-base1 ~]$ cat .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
export TMP=/tmp
export TMPDIR=$TMP
export JAVA_HOME=/usr/java/jdk-12.0.2/bin
export ORACLE_HOSTNAME=oel70-base1.localdomain
export ORACLE_UNQNAME=orcl
export ORACLE_BASE=/u01/app/oracle
export GRID_HOME=/u01/app/19.0.0/grid
export DB_HOME=$ORACLE_BASE/product/19.0.0/dbhome_1
export ORACLE_HOME=$DB_HOME
export ORACLE_SID=orcl1
export ORACLE_TERM=xterm
export BASE_PATH=/usr/sbin:$PATH
export PATH=$ORACLE_HOME/bin:$BASE_PATH:$JAVA_HOME
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib:/lib/ojdbc8.jar
[oracle@oel70-base1 ~]$

Step-6: Create 'Actest.java' program to test the functionality of 'Application Continuity'

[oracle@oel70-base1 ~] cat AcTest.java
import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.*;
import oracle.jdbc.replay.*;
public class AcTest
{
public static void main(String[] args) throws SQLException,java.lang.InterruptedException
{
oracle.jdbc.replay.OracleDataSource  AcDatasource = oracle.jdbc.replay.OracleDataSourceFactory.getOracleDataSource();
AcDatasource.setURL("jdbc:oracle:thin:@racs-scan:1521/acservice");
AcDatasource.setUser("system");
AcDatasource.setPassword("oracle");
Connection conn = AcDatasource.getConnection();
conn.setAutoCommit(false);
PreparedStatement stmt = conn.prepareStatement("select instance_name from v$instance");
ResultSet rset = stmt.executeQuery();
while (rset.next())
{
System.out.println("You are Connected to RAC Instance - "+ rset.getString(1));
}
Thread.currentThread().sleep(60000);
((oracle.jdbc.replay.ReplayableConnection)conn).beginRequest();
PreparedStatement stmt1 = conn.prepareStatement("select instance_name from v$instance");
ResultSet rset1 = stmt1.executeQuery();
while (rset1.next())
{
System.out.println("After Replay Connected to RAC Instance - "+rset1.getString(1));
}
rset.close();
stmt.close();
rset1.close();
stmt1.close();
conn.close();
((oracle.jdbc.replay.ReplayableConnection)conn).endRequest();
}
}
[oracle@oel70-base1]#

Note: Configure connection string and sleep (60000) based on your environment/testing cycles.

Step-7: Login as 'root' user and compile 'Actest.java' program and execute the 'Actest.java' program using 'oracle' user.

[root@oel70-base1 ~]# javac /home/oracle/AcTest.java
[root@oel70-base1 ~]#

[oracle@oel70-base1 ~]$ java AcTest.java
You are Connected to RAC Instance - orcl1
- - -> Waiting 

Actest.java program connected to 'orcl1' instance. We will stop instance using srvctl with force option and check the 'Actest.java' program output in the other screen.

[oracle@oel70-base1 ~]$ srvctl stop instance -i orcl1 -d orcl -force

[oracle@oel70-base1 ~]$ java AcTest.java
You are Connected to RAC Instance - orcl1
After Replay Connected to RAC Instance - orcl2
[oracle@oel70-base1 ~]$

Note: It will be reconnected to 'orcl2' instance automatically based on 'Actest.java' program.

Step-8: Execute 'Actest.java' program again and it will be connected to 'orcl2' instance because 'orcl1' is not yet started. 

[oracle@oel70-base1 ~]$ java AcTest.java
You are Connected to RAC Instance - orcl2
Start the 'orcl1' instance and check the output of 'Actest.java' program.

[oracle@oel70-base1 ~]$ srvctl start instance -i orcl1 -d orcl

Note: The above command is taking time to start 'orcl1' instance mean time 'Actest.java' program connected to again 'orcl2' instance.

[oracle@oel70-base1 ~]$ java AcTest.java
You are Connected to RAC Instance - orcl2
After Replay Connected to RAC Instance - orcl2
[oracle@oel70-base1 ~]$

Now, both the instances are UP and try to connect 'Actest.java' program.

[oracle@oel70-base1 ~]$ srvctl status database -d orcl
Instance orcl1 is running on node oel70-base1
Instance orcl2 is running on node oel70-base2
[oracle@oel70-base1 ~]$

[oracle@oel70-base1 ~]$ java AcTest.java

You are Connected to RAC Instance - orcl2

So, it connected to 'orcl2' instance.

Additional notes:  Configuring your Service

When using Oracle Database on premise, you can create services on Oracle RAC that use Transparent Application Continuity, or Application Continuity, or TAF. You can use roles to distinguish whether the services are active on Active Data Guard or the primary database. The following examples illustrate this:

Basic Service Creation

$ srvctl add service -db mydb -service MYSERVICE –preferred inst1 -available
inst2 -notification TRUE -drain_timeout 300 -stopoption IMMEDIATE -role PRIMARY

Transparent Application Continuity

$ srvctl add service -db mydb -service GOLD -preferred serv1 -available serv2 -
failover_restore AUTO -failoverretry 30 -failoverdelay 10 -commit_outcome TRUE -
failovertype AUTO -replay_init_time 1800 -retention 86400 -notification TRUE -
drain_timeout 300 -stopoption IMMEDIATE

Application Continuity

$ srvctl add service -db mydb -service SILVER -preferred serv1 -available serv2
-failover_restore LEVEL1 -failoverretry 30 -failoverdelay 10 -commit_outcome
TRUE -failovertype TRANSACTION -replay_init_time 1800 -retention 86400 -
notification TRUE -drain_timeout 300 -stopoption IMMEDIATE

Transparent Application Failover

$ srvctl add service -db mydb -service BRONZE -preferred serv1 -available serv2
-failover_restore LEVEL1 -failoverretry 30 -failoverdelay 10 -commit_outcome
TRUE -failovertype SELECT -retention 86400 -notification TRUE -drain_timeout 300
-stopoption IMMEDIATE

To add with the Data Guard role, here is the TAC example:

$ srvctl add service -db mydb -service GOLD -preferred serv1 -available serv2 -
failover_restore AUTO -failoverretry 30 -failoverdelay 10 -commit_outcome TRUE -
failovertype AUTO -replay_init_time 1800 -retention 86400 -notification TRUE -
role PHYSICAL_STANDBY -drain_timeout 300 -stopoption IMMEDIATE

SQL Queries to check the status of services

Query-1: To report protection by PDB, use the following example:

set lines 85
col Service_name format a30 trunc heading "Service"
break on con_id skip1
col Total_requests format 999,999,9999 heading "Requests"
col Total_calls format 9,999,9999 heading "Calls in requests"
col Total_protected format 9,999,9999 heading "Calls Protected"
col Protected format 999.9 heading "Protected %"
select con_id, total_requests,
total_calls,total_protected,total_protected*100/NULLIF(total_calls,0) as Protected
from(
select * from
(select s.con_id, s.name, s.value
FROM GV$CON_SYSSTAT s, GV$STATNAME n
WHERE s.inst_id = n.inst_id
AND s.statistic# = n.statistic#
AND s.value != 0 )
pivot(
sum(value)
for name in ('cumulative begin requests' as total_requests, 'cumulative end requests' as
Total_end_requests, 'cumulative user calls in requests' as Total_calls, 'cumulative user
calls protected by Application Continuity' as total_protected)
))
order by con_id;

Query-2: To report protection by service, use the following example:

set pagesize 60
set lines 120
col Service_name format a30 trunc heading "Service"
break on con_id skip1
col Total_requests format 999,999,9999 heading "Requests"
col Total_calls format 9,999,9999 heading "Calls in requests"
col Total_protected format 9,999,9999 heading "Calls Protected"
col Protected format 999.9 heading "Protected %"
select con_id, service_name,total_requests,
total_calls,total_protected,total_protected*100/NULLIF(total_calls,0) as Protected
from(
select * from
(select a.con_id, a.service_name, c.name,b.value
FROM gv$session a, gv$sesstat b, gv$statname c
WHERE a.sid = b.sid
AND a.inst_id = b.inst_id
AND b.value != 0
AND b.statistic# = c.statistic#
AND b.inst_id = c.inst_id
AND a.service_name not in ('SYS$USERS','SYS$BACKGROUND'))
pivot(
sum(value)
for name in ('cumulative begin requests' as total_requests, 'cumulative end requests' as
Total_end_requests, 'cumulative user calls in requests' as Total_calls, 'cumulative user
calls protected by Application Continuity' as total_protected) ))
order by con_id, service_name; 

Query-3: To report protection history over last three days, use the following example:

set lines 85
col Service_name format a30 trunc heading"Service"
break on con_id skip1
col Total_requests format 999,999,9999 heading "Requests"
col Total_calls format 9,999,9999 heading "Calls in requests"
col Total_protected format 9,999,9999 heading "Calls Protected"
col Protected format 999.9 heading "Protected %"
set lines 85
col Service_name format a30 trunc heading"Service"
break on con_id skip1
col Total_requests format 999,999,9999 heading "Requests"
col Total_calls format 9,999,9999 heading "Calls in requests"
col Total_protected format 9,999,9999 heading "Calls Protected"
col Protected format 999.9 heading "Protected %"
select a.instance_number,begin_interval_time, total_requests, total_calls, total_protected,
total_protected*100/NULLIF(total_calls,0) as Protected
from(
select * from
(select a.snap_id, a.instance_number,a.stat_name, a.value
FROM dba_hist_sysstat a
WHERE a.value != 0 )
pivot(
sum(value)
for stat_name in ('cumulative begin requests' as total_requests, 'cumulative end requests' as
Total_end_requests, 'cumulative user calls in requests' as Total_calls, 'cumulative user calls
protected by Application Continuity' as total_protected)
)) a,
dba_hist_snapshot b
where a.snap_id=b.snap_id
and a.instance_number=b.instance_number
and begin_interval_time>systimestamp - interval '3' day
order by a.snap_id,a.instance_number;

Please refer more details:

https://www.youtube.com/watch?v=kBG5DxZ3szg
https://www.oracle.com/database/technologies/high-availability/app-continuity.html
https://www.youtube.com/watch?v=DOtNjigyjV4

3 comments: