SQL Server 2008 - How to determine the root cause of "Communication link failure. TCP Provider: The specified network name is no longer available"?
Here is my latest effort at revising my question. This time, I am trying to follow the counsel given by Oded in his article on getting answers on Stack Overflow.
I need to find out how I can determine the root cause of the following error:
Communication link failure. TCP Provider: The specified network name is no longer available.
From time to time, I see this error when running a set of SSIS packages. The error can occur when one or many packages are run from:
- A SQL Server Agent job
- A batch file
- Debug mode in BIDS
The full error message I see is as follows:
SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Communication link failure". An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "TCP Provider: The specified network name is no longer available.".
SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Protocol error in TDS stream". An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Communication link failure". An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "TCP Provider: An existing connection was forcibly closed by the remote host."
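When the same failure shows up across many job logs, it can help to break each message into its individual OLE DB records so the distinct low-level descriptions are easy to compare. The following is only a rough sketch: `parse_oledb_records`, `distinct_descriptions`, and the `SAMPLE` string are hypothetical names made up for illustration, and the regular expression assumes the `Source: "..." Hresult: ... Description: "..."` layout seen in the message above.

```python
import re

# Assumed record layout, taken from the error text quoted in the question:
#   Source: "<provider>" Hresult: 0x........ Description: "<text>"
RECORD_RE = re.compile(
    r'Source:\s*"(?P<source>[^"]*)"\s*'
    r'Hresult:\s*(?P<hresult>0x[0-9A-Fa-f]+)\s*'
    r'Description:\s*"(?P<description>[^"]*)"',
    re.IGNORECASE,
)

# Shortened sample message used only for illustration.
SAMPLE = (
    'SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. '
    'Error code: 0x80004005. An OLE DB record is available. '
    'Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 '
    'Description: "Communication link failure". An OLE DB record is available. '
    'Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 '
    'Description: "TCP Provider: The specified network name is no longer available. ".'
)


def parse_oledb_records(message):
    """Return one dict per OLE DB record found in an SSIS error message."""
    return [
        {
            "source": m["source"],
            "hresult": m["hresult"],
            "description": m["description"].strip(),
        }
        for m in RECORD_RE.finditer(message)
    ]


def distinct_descriptions(message):
    """Return the unique record descriptions, in first-seen order."""
    seen = []
    for record in parse_oledb_records(message):
        # Normalise trailing punctuation and case so duplicates collapse.
        key = record["description"].rstrip(". ").lower()
        if key not in seen:
            seen.append(key)
    return seen
```

Running `distinct_descriptions` over each logged failure gives a short list of the underlying network-level messages, which is what a network or virtualization administrator will want to see first.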
This is an overview of how I have designed the ETL process:
- Two servers
- Both are virtual machines
- The SSIS packages run on the application server
- The SQL Server database lives on the database server
I use an OLE DB connection manager to connect the SSIS packages on the application server to the SQL Server database on the database server.
The packages are run as a file system deployment on the application server, not as a database deployment on the database server.
The main reason is that the ETL is integrated with a set of tools that are not found on, and drives that are not accessible to, the database server. These tools include Apex Data Loader for Salesforce and pgAdmin III.
So far I cannot consistently reproduce the error. However, I have observed:
- The failure occurs more often during regular business hours
- The failure occurs less often during off hours
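One way to turn the business-hours observation into something measurable is to collect the timestamps of each failure and bucket them by hour. This is a minimal sketch under stated assumptions: the 09:00 to 17:00, Monday to Friday window is an assumed definition of "business hours", and `summarize_failures` is a hypothetical helper, not part of any SQL Server or SSIS tooling.

```python
from collections import Counter
from datetime import datetime

# Assumption: "regular business hours" means 09:00-17:00, Monday to Friday.
BUSINESS_START_HOUR = 9
BUSINESS_END_HOUR = 17


def is_business_hours(ts):
    """True if the timestamp falls inside the assumed business-hours window."""
    is_weekday = ts.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and BUSINESS_START_HOUR <= ts.hour < BUSINESS_END_HOUR


def summarize_failures(timestamps):
    """Count failures per hour of day and per business/off-hours bucket."""
    by_hour = Counter(ts.hour for ts in timestamps)
    business = sum(1 for ts in timestamps if is_business_hours(ts))
    return {
        "by_hour": dict(by_hour),
        "business": business,
        "off_hours": len(timestamps) - business,
    }
```

Raw counts only mean something relative to how many package runs happened in each bucket, so the same bucketing should be applied to successful runs before drawing conclusions about load on the network or the virtualization host.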
For a two-hour period on Friday morning I was able to reproduce the error on a specific package.
The error occurred during a large data flow if the child package call that precedes the large data flow was enabled.
The error did not occur during the same large data flow if the child package call that precedes it was disabled.
The child package in question calls the database to retrieve a tiny amount of information to use in an email body, and then sends the email.
It feels like maybe a resource limit is being exceeded?
Maybe a connection limit?
I am wondering what tools I should be using to try to determine the root cause of the error.
The technical details of the two servers involved are listed below:
SQL Server and database server info:
Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64) Jun 17 2011 00:54:03 Copyright (c) Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)
SSIS info:
Microsoft Visual Studio 2008 Version 9.0.30729.1 SP Microsoft .NET Framework Version 3.5 SP1
Application server info:
OS Name: Microsoft Windows Server 2008 R2 Standard Version: 6.1.7601 Service Pack 1 Build 7601
I have researched the error message online and found these, but I would like an expert's insight before proceeding:
How to disable TCP Chimney, TCPIP Offload Engine (TOE) or TCP Segmentation Offload (TSO).
Using Netsh commands to enable or disable TCP Chimney Offload
Any help is appreciated.
thanks
Update:
Further testing shows that this is not "an SSIS thing": the same error is seen at the same rate when using SQL Server Management Studio. The complexity of the query does not make the error more or less likely. In an attempt to resolve this, we have tried one fix (below):
#1 How to disable TCP Chimney, TCPIP Offload Engine (TOE) or TCP Segmentation Offload (TSO).
This was our first attempt. TCP Chimney was disabled on both the application server and the database server. Testing shows that the same error occurs at the same rate.
Where do we go from here? I am not sure. One option seemingly remains:
The SQL Server installations on the application server and the database server do not match:
Application server = SQL Server 2008 (SP1) - 10.0.2531.0 (X64)
Database server = SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64)
The plan is to upgrade the SQL Server installation on the application server. This is kind of a hit and hope, but at this point it seems to be the best option. Something in my brain tells me this might be solved by fixing a hardware issue (by which I mean repair or replace), and that there might not be a hardware or software configuration that can fix it.
However, I am still not sure how to go about determining the root cause. I am still left wondering what tools I should be using to diagnose the root cause.
- First, did you try removing the Large Send Offload setting on the NIC?
- Second, can you run Wireshark to capture the packets when you can reproduce the error?
- Third, did you try changing the vNIC of the VM? The model can cause issues (if you use VMXNET3, try E1000, and so on).
- Last, what sits between the two VMs: a vSwitch on the same host, a physical switch, something else? A badly configured switch can drop traffic. If both VMs are on the same host and the same vSwitch, that is the best test, because the traffic never leaves the server.
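Alongside a packet capture, a small probe run from the application server can record when the network path to the database server misbehaves, independent of SSIS or Management Studio. The sketch below is an assumption-laden illustration: `probe` is a hypothetical helper, and it only checks that a TCP connection can be opened, so it will not catch an established session being reset mid-query.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Try to open a TCP connection and report (ok, seconds_elapsed, error).

    Only connection establishment is tested; this does not exercise the
    TDS protocol or detect resets on long-lived connections.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution failures.
        return False, time.monotonic() - start, repr(exc)
```

Calling something like `probe("dbserver", 1433)` every few seconds and logging the result with a timestamp gives a timeline to line up against the Wireshark capture and the package failures. Here `dbserver` is a placeholder host name, and 1433 assumes the instance listens on the default SQL Server port.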