I have a sql script with a procedure. I run it using
sqlplus -s @script.sql logfile.log myparam1 myparam2
But after ~ 2 hour and 10 mins, my script ends with
ERROR:
ORA-03114: not connected to ORACLE
DECLARE
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 12345
Session ID: 33 Serial number: 54321
I have following tcp settings
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_time (which is two hours)
7200
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
When the script was running, I checked that TCP connection was established at my end , but on the database side (database server machine) there was no such connection.
My theory is that somehow database server is dropping the connection. And when my system sends first keepalive probe after 2 hours (7200 seconds), it finds that connection is no more active and closes the connection and script returns.
I am unable to understand why would a database system drop a connection? Are there any settings at database end to increase verbosity? Or could this be related to some firewall settings? Also, in 2 hours and 10 mins, we can guess that 2 hours portion is coming from tcp_keepalive_time, what would be 10 mins portion. Any database side retries?
EDIT: DBA and I looked at the issue, I see the TCP connection as ESTABLISHED at my end and he didn't see any connection coming from my side.
We tried setting SQLNET.EXPIRE_TIME to 10 mins. But it didn't work. We bounced database server machine, but it still didn't work. Maybe some latest firewalls may not see DCD packets as a valid traffic as mentioned in the article mentioned in the article (also given by @user1683793 above). Finally, we changed keepalive time to 25 mins (on the client machine) so that there will be some traffic on the tcp connection. Fortunately firewall seems to consider keep alive packet as a traffic.
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_time
1500