Export Query Store tables separately from the database tables
I'd like to get the Query Store tables from my customer's database for analysis. Due to security (SOC2) or EU regulation, I'm now allowed to get the customer's data.
Upvotes: 24<=-=Apr 22 2016 6:59AM=-=>
Queries and query plans can also include customer data. For example, if your customer has:
SELECT * FROM dbo.Customers WHERE LastName = ‘Tran’ AND FirstName = ‘Peter’ AND HasDeadlyIllness = 1
Then you’re going to get that identifying query in Query Store. Are you sure you’re allowed to get that out?<=-=Apr 22 2016 10:00AM=-=>
That’s a good point…can’t win them all. There’s nothing that can be done with literals in the predicate. I guess it wouldn’t really matter whether we remove the bind values, since this is still a security hole with the literals. We’ll need to make the customer aware of this if they’re providing us the Query Store tables.
In most cases (95+%) of our customers don’t care. They often give us the entire database for one reason or another and we have NDAs for all them, but a handful of customers are pretty strict and will not give us anything.
Thanks for your feedback.<=-=Apr 27 2016 2:00PM=-=>
I can’t modify the proposed solution, but remove the “additional request regarding the security request”.<=-=May 11 2016 12:36PM=-=>
I don’t know that I need the data to exist in a separate filegroup (that’s a “nice to have”), it’s more being able to export the Query Store data in some manner. And not just to get it from a client, but to use for comparison between copies of the database different environments (e.g. production vs. test). Export also assumes I have a way to import it back in to another copy of the database :)<=-=May 13 2016 1:27AM=-=>
Brent’s answer leads us to another tool that would be great : a query normalizer.
IT would basically remove hard-coded values (as in Brent’s example) and variables values (as it can show up in sp_executesql for example).
This in turns eases finding out offending queries based on their execution frequency.
I’ll second what Erin mentioned – good value in query store portability among instances for testing purposes.
What Brent mentions(privacy-protected data within query plans & query text) is a data privacy concern for me, too.
Important to note that this is not a new privacy concern. The long-standing practice of “clone database” with DDL & stats only presents this conundrum whenever a column typically considered private information (eg SSN, etc) is in the query predicate and there are relevant stats.
It can take forever to scrub or recreate stats so they can be shared outside of an NDA or business associate agreement.
Thank you for taking the time to post this issue! We understand that this could be an important issue for you.
Unfortunately, we do not plan to include a fix for this issue in the upcoming release, although, we might include it as an improvement in a future release.