Note: this is a technical article for developers who work with SQL Server, SQL Azure, Windows Azure, ADO.NET, .netTiers and other geek technologies. Skip this if you’re not a developer – this will be really boring otherwise.
We have deployed a few web roles to Windows Azure and every once in a while we get the following exception when connecting to SQL Azure:
System.Data.SqlClient.SqlException: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An established connection was aborted by the software in your host machine.)
This does not happen on every call, but it happened over 70 times a few weeks ago as our QA team was testing our web role. Both the web roles and the SQL Azure database were in the same Windows Azure Data Center. We did set the maxconnection to 96 for connection pooling. We use .netTiers (http://nettiers.com/) for our ORM, it's similar to nHibernate.
I asked this question on the Windows Azure Forums
We finally figured out how to solve the issue when using .netTiers
Below I show you how you can avoid these exceptions – with a specific fix if you are using .netTiers (http://nettiers.com/) as your ORM. See below for the exciting details.
The Problem
As briefly described above, every once in a while we get an exception (a System.Data.SqlClient.SqlException) when one of our web or worker roles running on Windows Azure in the cloud attempts to connect to our SQL Azure database also running in the same Azure data center. The exception may look like this:
The Solution – write retry logic
It looks like the transport level errors are due to different timeouts between ADO.NET connection pools and SQL Azure. My guess is that the connections in the pool timeout at 10 minutes while SQL Azure times out at 5 minutes.
In threads found in the Windows Azure forums the current solution seems to be to write retry logic to work around this issue. This is different for every project, but here’s a sample solution from “DavidM51”:
1: Int16 retries = 0;
2:
3: TryAgain:
4:
5: try
6: {
7: using (SqlConnection conn = new SqlConnection(connectionString))
8: {
9: conn.Open();
10:
11: SqlCommand cmd = new SqlCommand();
12:
13: cmd.Connection = conn;
14: cmd.CommandText = "SPROC Name";
15: cmd.CommandType = CommandType.StoredProcedure;
16:
17: //Code for SPROC
18:
19: Int32 rowsAffected = cmd.ExecuteNonQuery();
20:
21: //Added conn.Close() to extraspecially make sure connection is closed!
22: conn.Close();
23: }
24: }
25: catch (Exception ex)
26: {
27: CommonFunctions.LogActivity("SQL Error (Retry): " + ex.Message);
28: if (retries < 1)
29: {
30: retries++;
31: //Added short sleep
32: System.Threading.Thread.Sleep(50);
33: goto TryAgain;
34: }
35: //Failed!
36: CommonFunctions.LogActivity("SQL Error (Failed): " + ex.Message);
37: }
The Solution if using .netTiers – fix the “Retry” feature
If you are using .netTiers, we have this retry logic out of the box. In CodeSmith open your NetTiers.cst template and under Properties, check out the “07 – CRUD – Advanced” section:
When my co-workers and I saw these properties, we said “great, we don’t need to write any retry logic of our own - .netTiers once again has it out of the box!”.
Unfortunately, even though we set the values above and re-created and re-compiled our data access layer, we still kept seeing the exceptions when connecting to SQL Azure.
Last week, my co-worker Lanh found the problem (thanks Lanh!): there’s a bug in the .netTiers 2.3 templates – when the exception is raised at runtime, the .netTiers data access layer is not properly catching and executing the retry logic.
To fix this bug, open your NetTiers\DataAccessLayer\Utility.cst template:
In line 459, change
catch (DataException)
to
catch (System.Data.SqlClient.SqlException)
Save your Utility.cst, re-build your data access layer, re-compile and re-deploy to the Windows Azure Cloud and you should no longer see the “System.Data.SqlClient.SqlException” errors.
Good times!
Below I copy the full threads from the Microsoft forums.
Exceptions connecting to SQL Azure: "An established connection was aborted by the software in your host machine
Here’s the text if anyone is doing a Google/Bing search -
On a web role that accesses a SQL Azure database, we keep getting the exception:
System.Data.SqlClient.SqlException: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An established connection was aborted by the software in your host machine.)
This does not happen on every call, but it happened over 70 times last week as our QA team was testing our web role.
Both the web role and the SQL Azure database are in the same data center.
We did set the maxconnection to 96 for connection pooling.
We use .netTiers (http://nettiers.com/) for our ORM, it's similar to nHibernate.
Does anyone have any ideas on why this keeps happening?
Here's the full exception -
MYMETHOD() Exception : System.Data.SqlClient.SqlException: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An established connection was aborted by the software in your host machine.) at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection) at
System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj) at
System.Data.SqlClient.TdsParserStateObject.WriteSni() at System.Data.SqlClient.TdsParserStateObject.ExecuteFlush() at
System.Data.SqlClient.TdsParser.TdsExecuteRPC(_SqlRPC[] rpcArray, Int32 timeout, Boolean inSchema, SqlNotificationRequest notificationRequest, TdsParserStateObject stateObj, Boolean isCommandProc) at
System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result) at
System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method) at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method) at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior) at
Microsoft.Practices.EnterpriseLibrary.Data.Database.DoExecuteReader(DbCommand command, CommandBehavior cmdBehavior) at Microsoft.Practices.EnterpriseLibrary.Data.Database.ExecuteReader(DbCommand command) at
MYPROJECT.DataAccessLayer.Utility.ExecuteReader(Database database, DbCommand dbCommand) in C:\vss\MYPROJECT\MYPROJECTBusiness\main\MYPROJECT.DataAccessLayer\Utility.cs:line 361 at
MYPROJECT.DataAccessLayer.SqlClient.SqlMycAccountUserProviderBase.GetPaged(TransactionManager transactionManager, String whereClause, String orderBy, Int32 start, Int32 pageLength, Int32& count) in C:\vss\MYPROJECT\MYPROJECTBusiness\main\MYPROJECT.DataAccessLayer.SqlClient\SqlMycAccountUserProviderBase.generated.cs:line 638 at
MYPROJECT.DataAccessLayer.Bases.EntityProviderBaseCore`2.GetTotalItems(TransactionManager mgr, String whereClause, Int32& count) in C:\vss\MYPROJECT\MYPROJECTBusiness\main\MYPROJECT.DataAccessLayer\Bases\EntityProviderBaseCore.generated.cs:line 241 at
MYPROJECT.DataAccessLayer.Bases.EntityProviderBaseCore`2.GetTotalItems(String whereClause, Int32& count) in C:\vss\MYPROJECT\MYPROJECTBusiness\main\MYPROJECT.DataAccessLayer\Bases\EntityProviderBaseCore.generated.cs:line 228 at
MYPROJECT.DeployManager.ClientResponseFacade.MYMETHOD(MYPROJECTPointClientRequest initialRequest) in C:\vss\MYPROJECT\MYPROJECT.Deploymanager\Main\ClientResponseFacade.cs:line 156
I searched the forums but found very few references to this error.
Sporadic transport-level connection errors from SQL Azure
http://social.msdn.microsoft.com/Forums/en-US/ssdsgetstarted/thread/0545004d-8fc4-405a-8fb8-a4f22f167d44?prof=required
Comments