Wednesday, December 05, 2007

Focusing on SSIS

Since our migration to Sql Server 2005 a couple of months ago, Integration Services has been a constant point of attention for our database development team. That's mainly because of the necessary, upcoming migration of our existing DTS packages, but also because of the constant trouble we experience with the technology.

Yep, we like it just as much as we hate it at times. Some random stuff from the past weeks...

* SSIS is in many ways much stricter than DTS, certainly when it comes to data transfer tasks. File encoding never bothered us in the past, but now we have to be very aware of how we handle and manipulate our imports. A data conversion task gave me some hassle when it hit a special character in an import, raising an error about truncation of data for that field. Although the content didn't exceed the length of the field, the task apparently couldn't determine the length correctly, so I had to ignore the so-called truncation in the error handling options.

* On our main production cluster we had to deploy a package. We store our packages in Sql Server (msdb) at this time, but we simply did not succeed in importing it on the server (with all the SP's and CU's installed). It produced the following error:

Element not found. (Exception from HRESULT: 0x8002802B
(TYPE_E_ELEMENTNOTFOUND)) (Microsoft.SqlServer.DTSRuntimeWrap)

In the end, we had to import the package using the client (SSMS) from another server.
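
For what it's worth, a quick way to verify that an import actually landed in msdb: SSIS 2005 keeps packages stored in Sql Server in the sysdtspackages90 table. A minimal check:

-- List the SSIS packages stored in msdb, most recent first.
SELECT name, createdate
FROM msdb.dbo.sysdtspackages90
ORDER BY createdate DESC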

Tuesday, November 27, 2007

Database snapshot files

Be aware that a database snapshot contains all the data of the referenced database as it was at a certain point in time. Not only the actual data, but also the metadata, Sql Server's internal bookkeeping, is present in the snapshot.

This can give unexpected results when you query tables like sys.database_files, for example. Instead of getting the set of files the snapshot consists of, you get the information that was originally in the referenced database: the files of that database, not of the snapshot. That information, and much more, is available through master.sys.master_files.
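
As a minimal illustration (the database and file names are made up, and MyDB is assumed to have a single data file with logical name MyDB_Data):

-- Create a snapshot of a hypothetical database MyDB.
CREATE DATABASE MyDB_Snapshot ON
( NAME = MyDB_Data, FILENAME = 'C:\Data\MyDB_Snapshot.ss' )
AS SNAPSHOT OF MyDB
GO
USE MyDB_Snapshot
GO
-- Returns the files of the referenced database MyDB, not those of the snapshot.
SELECT name, physical_name FROM sys.database_files
-- Returns the actual sparse file(s) backing the snapshot.
SELECT name, physical_name
FROM master.sys.master_files
WHERE database_id = DB_ID('MyDB_Snapshot')
GO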

Tuesday, October 16, 2007

SSIS, 64-bit and a breakpoint

Despite 2 service packs and the numerous hotfixes we installed since the release of Sql Server 2005, we still discover problems with SSIS much too often. This time, it was some weird message in our acceptance environment telling me that "Script files failed to load". I had thoroughly tested the package in our development environment, so I was amazed to hear it failed in another environment.

A quick search on the specialized forums returned some hits - mostly related to the Script task the package contained - but nothing really described the conditions I faced. The only difference I could think of was that the acceptance environment was a 64-bit edition, unlike development. In that case, the required Script task setting is quite simple: the PrecompileScriptIntoBinaryCode flag should be set to True, since on 64-bit the script can only run precompiled.

There was, however, a non-active breakpoint that caught my attention. I hadn't noticed it before because the Script task was not part of my changes to the package, and the tests had never halted at that breakpoint. After removing it, the package effectively ran without a problem.

It's a shame we had to discover this bug the hard way, but it proves that an acceptance environment should stay as close to production as possible. In this case, it paid off.

Monday, October 15, 2007

SSIS's interaction with DTC

I encountered some locking problems while developing an SSIS package today. The purpose of the package is to upload new content from a flat file into a table. It basically contains two control flow tasks that require a transaction: one clearing the table (TRUNCATE TABLE) and one filling it through a data flow task. But while executing, the package blocked at the start of the data flow task.

Investigating the blocking, I found locks owned by SPID -2 in sys.dm_tran_locks, blocking the data flow task from retrieving the necessary metadata from the database at the PreExecute event. I fiddled around with the TransactionOption and IsolationLevel settings of the different components, but nothing resolved my problem adequately. With the help of Profiler, however, I could determine that SSIS did not propagate the DTC transaction - started for the SQL task - to the data flow task.
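
For reference, a minimal diagnostic sketch of the kind of query involved; a session id of -2 denotes locks held by an orphaned distributed transaction:

-- Locks owned by orphaned DTC transactions show up with request_session_id = -2.
SELECT resource_type ,
DB_NAME(resource_database_id) AS database_name ,
request_mode ,
request_status
FROM sys.dm_tran_locks
WHERE request_session_id = -2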

The blocking can be resolved by either changing the ValidateExternalMetadata attribute of the OLE DB Destination or turning the TRUNCATE TABLE in the SQL task into a DELETE statement.

Why both of these solutions have such an impact remains unclear to me. It's true that each of these modifications affects the locks required by the package, but that doesn't explain why the Data Flow task was not enlisted appropriately into MSDTC.

Thursday, September 13, 2007

sys.indexes and table valued functions

Today, while lending a hand in the creation of a new operational reindexing job, we discovered the existence of primary keys on table valued functions. I had written a little script to gather all the clustered and non-clustered indexes, with their necessary attributes, from sys.indexes. I assumed that indexes only exist on physical tables, so I didn't make any checks on the type of object they were associated with. In our first test run, that turned out to be a mistake.

In the definition of the return value of a table valued function, you can define a primary key for that table. And although it isn't an actual table - just a table variable created at run-time - metadata about the index is kept just like for any user table.

USE tempdb
GO
CREATE FUNCTION dbo.udf_TableValuedFunction ()
RETURNS @Table TABLE ( id int PRIMARY KEY , comment varchar(200) )
AS
BEGIN
INSERT @Table VALUES ( 1, 'comment 1' )
INSERT @Table VALUES ( 2, 'comment 2' )
RETURN
END
GO
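-- The primary key defined on the return table surfaces in sys.indexes: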
SELECT *
FROM sys.objects o
JOIN sys.indexes i ON i.object_id = o.object_id
WHERE o.name = 'udf_TableValuedFunction'
GO
DROP FUNCTION dbo.udf_TableValuedFunction
GO
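
A minimal sketch of the kind of check that would have avoided the surprise, filtering on the object type so only user tables ('U') are considered:

-- Gather clustered and non-clustered indexes for user tables only;
-- 'TF' objects (table valued functions) are excluded by the type filter.
SELECT o.name AS table_name , i.name AS index_name , i.type_desc
FROM sys.indexes i
JOIN sys.objects o ON o.object_id = i.object_id
WHERE o.type = 'U' -- user tables only
AND i.index_id > 0 -- skip the heap entries
GO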

Thursday, May 24, 2007

Unexpected dump

Had some fun with SQL Server 2000 (SP4 + hotfixes) today. While tuning a certain stored procedure, I made a stupid variation with an obvious error (a wrong default value for a procedure parameter). Creating it caused no problem; it just dumps the connection in a very nasty way when you execute (and thus compile) the procedure. Don't try this on production systems ;-). Luckily, this problem gets intercepted at runtime in Sql Server 2005.

USE TempDB
GO
CREATE PROC proc_error
@Date DATETIME = blabla --no good!
AS
SELECT @Date + 1
GO
EXEC proc_error
GO
DROP PROC proc_error
GO

Msg 0, Level 19, State 0, Procedure proc_error, Line 4
SqlDumpExceptionHandler: Process 65 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.

Sunday, May 13, 2007

Plan Guides

In our production environment, we had a stored procedure that performed poorly in terms of reads and CPU usage. A quick peek using the available dmv's gave us insight into the plan and the parameters that were used to compile it. That made it easy to simulate the behavior in a development environment.
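
A hedged sketch of the kind of DMV peek meant here; the parameter values used at compile time can be read from the ParameterCompiledValue elements in the XML plan:

-- Find the cached plan and its runtime statistics for the procedure.
SELECT qs.execution_count ,
qs.total_logical_reads ,
qp.query_plan -- XML plan, contains ParameterCompiledValue
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE st.text LIKE '%p_GrantInheritedAccess%'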

Apparently, every morning, with a clean procedure cache, the procedure got called with certain parameters that resulted in a not very intuitive, but good plan with a cost of 3 reads. Subsequent calls with different parameters reused the cached plan, but for them it meant more than 8000 logical reads and quite some CPU usage. A more intuitive plan, compiled from typical parameters, had a constant cost of about 8 logical reads for both types of parameters. That's a really big performance boost, given that the procedure gets called more than 40,000 times a day and delays a vital business process.

Because I'm not allowed to modify anything in the stored procedure (yep, it's CRM), I had to use a plan guide to 'guide' the engine to the better performing plan. There are several options: specifying the whole plan, adding the parameters for the compilation, forcing a recompile... I chose the parameter hint.

--solution, add plan guide
exec sp_create_plan_guide
@name = N'Plan for procedure p_GrantInheritedAccess UPDATE'
, @stmt = N'UPDATE poa1
SET InheritedAccessRightsMask = poa1.InheritedAccessRightsMask | poa2.AccessRightsMask | poa2.InheritedAccessRightsMask | 0x08000000
FROM PrincipalObjectAccess poa1 JOIN PrincipalObjectAccess poa2 ON (poa1.PrincipalId = poa2.PrincipalId)
WHERE poa1.ObjectId = @ReferencingObjectId
AND poa1.ObjectTypeCode = @ReferencingObjectTypeCode
AND poa2.ObjectId = @ReferencedObjectId
AND poa2.ObjectTypeCode = @ReferencedObjectTypeCode'
, @type = N'OBJECT'
, @module_or_batch = N'p_GrantInheritedAccess'
, @params = NULL
, @hints = N'OPTION(OPTIMIZE FOR(@ReferencingObjectId=''D4F70FD5-41FA-DB11-9668-0011258C53FD'',@ReferencingObjectTypeCode=4200,@ReferencedObjectId=''2052793D-41FA-DB11-9668-0011258C53FD'',@ReferencedObjectTypeCode=4200))'
SELECT * FROM sys.plan_guides

The result of this simple operation is a clean and transparent solution. I haven't put it into production though, because it remains a last resort, but it's always good to know your options.
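
And because it remains a last resort, it's good to know the guide can be switched off or removed again with sp_control_plan_guide:

--disable the guide temporarily
exec sp_control_plan_guide N'DISABLE', N'Plan for procedure p_GrantInheritedAccess UPDATE'
--or drop it altogether
exec sp_control_plan_guide N'DROP', N'Plan for procedure p_GrantInheritedAccess UPDATE'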

Wednesday, May 09, 2007

MS CRM, from a database perspective

Some time ago, I was appointed as the database dude for our customized Microsoft CRM 3.0 implementation. The project had already been put into production, but there was major concern about the performance of the application, mainly on the database side. And that's when I came into the picture.

We started the cleanup by resolving some major issues in the custom code. That code had been written by rather inexperienced developers with little supervision over their creations, which resulted in some very interesting behavior in our production environment from time to time - at least if you're in for a challenge. The problems can be roughly categorized into two groups: locking issues and suboptimal plan usage, mostly due to wrong datatype usage.

After resolving the most urgent customization issues, I got the chance to take a look at the actual MSCRM database. And it isn't always a pretty sight. Sure, the system of a controlling Metabase and a generic, customizable CRM database is pretty neat, but the statements I see passing by in the Profiler traces are often quite unmanageable. In exchange, the application itself offers very flexible functionality, but it doesn't always seem to scale that well to the data volumes we are handling in our company.

Well, working with MS CRM 3.0 is proving to be an enriching experience. You actually shouldn't have to look at the database, and you should definitely not tamper with it, but Sql Server 2005 offers very interesting features to pinpoint the culprits and to help the engine a little in the right direction. More to come in, hopefully, slightly more technical posts ;-).

Thursday, January 25, 2007

SSIS BypassPrepare

Yesterday, we stumbled upon an annoying bug in the RTM version of SQL Server 2005. A new version of a previous project of mine was being released with a modified SSIS package. One of the modifications was the use of a mapped input parameter in an Execute SQL Task, which requires the BypassPrepare flag of that task to be set to True for the construct to work. It was successfully tested in development and acceptance, and then, in production, we suddenly got the following error.

Executing the query "…" failed with the following error: "Syntax error, permission violation, or other nonspecific error". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

This error lists almost everything that can possibly go wrong with an Execute SQL Task, without giving any indication of what actually went wrong. Sigh.
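
For context, a minimal sketch (with hypothetical object names) of the construct in question: a batch with a mapped ? parameter that the OLE DB provider fails to prepare, which is exactly what BypassPrepare = True is meant to skip.

-- Statement source of the Execute SQL Task; the ? marker is bound to a
-- package variable on the Parameter Mapping page. Preparing this batch
-- fails, so it only runs with BypassPrepare set to True.
DECLARE @LoadDate DATETIME
SET @LoadDate = ?
EXEC dbo.usp_DailyImport @LoadDate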

At first I thought the problem was caused by the fact that our production environment didn't have SP1 and the necessary hotfixes like it should have. That's an essential factor indeed, but there was more to the problem. The trouble with the RTM version is that it doesn't take BypassPrepare into account: whether you set the flag to true or false in the development environment, if you look at the actual XML that represents the package, you won't find a trace of the setting. This was fixed in SP1 (as correctly answered on the forums).

What I didn't know was that our operational DBA had exported and imported the package with his local client (SSMS), which didn't have SP1 either. Similar to the behavior in Visual Studio, the setting gets lost with an RTM version of SSMS. After a quick install and verification, everything worked smoothly.

For the upcoming SP2, it has been announced that the default of this setting will change to True, meaning that SSIS won't prepare the statement unless you explicitly tell it to do so. That could have saved us some trouble. Valuable lesson learned: always keep all environments, including the clients, up to date.