Tuesday, September 26, 2017

#SQLServer Just How Minimal Can That Problem Repro Get? Bit Column + outdated stats + histogram amendment

***** Update 2019/0408 *****
Wanted to point out that the kb4316948 fix for histogram amendments (quickstats) is now available in SQL Server 2016 SP2 CU5 as well as SQL Server 2017 CU8.
***** End Update *****
 
***** Update 2018/0713 *****

The Connect item link below is, of course, dead now.  No matter.  Because this issue was fixed in SQL Server 2017 CU8 by this kb.

FIX: Access violation when you compile a query and histogram amendment is enabled with default Cardinality Estimation in SQL Server 2017
https://support.microsoft.com/en-gb/help/4316948/fix-access-violation-when-you-compile-a-query-and-histogram-amendment 

***** End Update *****

Often among the hardest of my decisions is whether I should spend more time trying to simplify a given problem by eliminating additional factors, shrinking the data needed for a repro, etc... or just put all that effort into investigation purely aimed at understanding the behavior. I expect that to be a long-term challenge :-)

I was quite happy with the way this particular one worked out, though. It started as a maze... access violations generated on a SQL Server 2016 instance with dozens of databases. The access violation came from an insert query using a synonym - with the underlying table in another database! (I didn't know that was possible - or *ever* desirable - until I saw it in this context.) The AV was occurring during a multi-step data transfer process and it was hard to isolate the data flowing into and out of the two databases in question. But after some finagling, I got the problem repro pretty doggone small. Reproduced the AVs on several CUs of SQL Server 2016 and on SQL Server 2017 RC2.

Let me offer a disclaimer before the code to produce the AV. I don't *think* this will do any persistent harm other than a terminated session and dumps like these below.  Regardless, if you choose to reproduce the access violation, please only do so on a nonproduction system.  The dump the AV created on my SQL Server 2017 system...



Here's the setup.  Two 2-column tables in tempdb.  One of those tables with a single row.  The other table with 1 row - then stats auto-created - then a second row inserted.

-- error reproduced on SQL Server 2016 SP1-CU2, SP1-CU3, SP1-CU4
SELECT @@version    -- SQL Server 2017 RC2
SET NOCOUNT ON
USE tempdb;


DROP TABLE IF EXISTS #temp;
SELECT 1 AS key1, 1 AS key2 
INTO #temp;


DROP TABLE IF EXISTS #temp2;
SELECT CONVERT(INT, 1) AS key1, CONVERT(BIT, 0) as deleted 
INTO #temp2

-- auto create stats while only 1 row
SELECT * FROM #temp2 where deleted = CONVERT(BIT, 0); 
INSERT INTO #temp2 -- add a 2nd row with *different* value for key1
SELECT CONVERT(INT, 2), CONVERT(BIT, 0);

So that's the setup.

After the setup, the offending query is quite simple.


-- New CE (via T2312 or default) plus (T4139 or ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS) produces an Access Violation
SELECT #temp.key2
FROM #temp 
INNER JOIN #temp2 ON #temp.key1 = #temp2.key1
GROUP BY #temp.key2
HAVING MIN(CAST(#temp2.Deleted AS INT)) = 1
OPTION (USE HINT('ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS'), QUERYTRACEON 2312);

And the result...


Ouch.  That looks really painful.

An interesting sidenote: in the not-too-distant past, histogram amendments via trace flags 2389, 2390 and/or 4139 seemed to have been exclusively the realm of the legacy cardinality estimater.  This Access Violation is a strong indication that is no longer the case.

Filed a Connect item for this.  Fortunately, the workflow and data involved in this were amenable to a schema change to no longer use a column of type bit - allowed side-stepping this error for now.

SQL Server 2017 Access Violation - outdated bit column stats and histogram amendments
https://connect.microsoft.com/SQLServer/feedback/details/3141605

No comments:

Post a Comment