RSS

PolyBase: I now Pronounce you SQL and Hadoop


 

Are you interested in Hadoop? With SQL Server 2016 just around the corner, a powerful new feature is being included called PolyBase.  To summarize what it is, just think about it as a marriage between SQL Server and Hadoop.  It is a way to store and query data on HDFS using SQL Server.

HDFS is a distributed file system that works differently than what we’re used to in the Windows OS side of things; the general principle is to use cheap commodity hardware that replicates data in order to account for availability and to prevent loss of data. With that in mind, it makes a great use case to store a lot of data cheaply for archiving purposes or can be used to store large quantities of data that been to be processed in large quantities as well.

For more information please visit: https://msdn.microsoft.com/en-us/library/mt143171.aspx

Now if you want to try it out for yourself, make sure you install the PolyBase Engine (from the SQL Server setup) and feel free to try the modified code sample below.

Read the rest of this entry »

 
3 Comments

Posted by on April 27, 2016 in Other, SQL Server 2016

 

Tags: , ,

SQL Server 2016: What’s the best new feature?


With SQL 2016 around the corner, and my blog without a post this year, I thought I would write about some of the new features.  However, I’m not sure which feature I like the most to talk about! So I’m going to summarize a few of the new features I have been exposed to and see which ones my readers like the best.

Stretch Database:

The idea is simple, what if you could “stretch” your database between your local on-premise server and Azure? That would be fantastic of course!  It’s a wonderful new feature in which you configure your database to push certain “cold” data into the Cloud (specifically Azure) while your “hot” data sits on your local on premise server.  This is an excellent way for archiving data.  But how do the users access the “cold” data? Simple, they just keep querying the same way the have been! From a logical perspective, this is very similar to Table Partitioning where the DB engine understands where the data is stored and the application does not have to change the way it behaves with the database.  Just keep in mind, querying older data will result in longer wait times for the data to be retrieved from Azure.

Always Encrypted:

In today’s world, security is a very big concern. And what is the gold that the technology thieves are after? Your DATA! Nothing else is more important than your data.  Now how does this feature differ from Transparent Data Encryption? Well in a lot of ways. First, TDE encrypts the entire database at rest. So if someone pulls the Database off the server, or it’s backup, they cannot access it without the proper certificate. If they are able to gain access to your server while the database is online, they can read all of the data right off of there as plain text.  Access is usually not a problem for expert hackers.

Now Always Encrypted works in a different way. Data is encrypted, column by column, using a certificate that is not stored inside the SQL Server Engine. Ok so you may think, well TDE encrypts the entire DB not just column by column. Yes, but it does not store the data encrypted as Always Encrypted does. Also, in most cases you do not need to encrypt the entire database, you just need to encrypt the sensitive data. Additionally, with SQL 2014 and above, you can can encrypt the backups without having the need for TDE. So when your backups are moved between locations, they are encrypted.

Let’s not get away from the main point here. Now with Always Encrypted, the certificate to Encrypt and Decrypt the data is used on the client side. So Encryption and Decryption is done outside the DB engine which allows it to scale a lot better.  Encryption can also be done deterministically instead of randomly.  Let me explain that a bit more.  If you have a unique ID like a national ID (let’s use 123456789 for example) you can query against that because the hash used to encrypt it is deterministic. So if you were to store the value twice in the same table (or a different table using the same certificate) you would get the same hash.  This improves query performance tremendously.  Deterministic encryption would not be wise for a column that has few values however, such as gender, because someone can quickly figure out the pattern in the data.

So to sum up this feature, those without access to the certificate cannot decrypt the data (including your DBAs). Data is encrypted even when the database is opened, unlike TDE, so if someone was to gain access to your system they would still need the certificate to decrypt the data. You can use different certificates for different columns, even within the same table, which can further increase security.

Dynamic Data Masking:

Please don’t confuse this with a security feature, because it really is not.  This feature is designed to display a “mask” over specific data. For example, for email it can display aXXXXXXX@XXX.com for the email. Now the data is not stored like that, and people with elevated privileges can see the data as clear text.  However, this is very easy to implement and does not require a lot of planning.  It can be used as a “quick fix” to potentially block data that you don’t want others to see on reports and webpages that they may have access to for which you are not aware.  It’s handy to keep not-so-sensitive data blocked, but is not a replacement for something like Always Encrypted.

Row Level Security:

This feature has been around in other database engines, but is new for SQL 2016.  It’s very simple, you programmatically allow for certain users to see certain data in a table.  It’s performance is better than using views or other types of logic to do row-by-row security since the engine is optimized to process the Row Level Security features.You can even block your admins from getting access to the data if they are not supposed to. A small disclaimer, they can always disable the functions that force the security and get by it so make sure you are auditing those actions.  This is a great way to allow for multi-tenancy within a database since you can exclude results to those that do not have permissions.

PolyBase:

I like to say it is a marriage between SQL and Hadoop.  PolyBase allows for you to query using your SQL Server Engine data that is stored in Hadoop.  It also allows for you to store data into Hadoop by creating External Tables and loading data into them.  It’s a great way for SQL Server to stretch out and touch HDFS while allowing for SQL professionals to access data on HDFS using regular T-SQL statements. It’s definitely a great bridge between the two worlds and will allow many data professionals to get closer to understanding and using other technologies outside the SQL Server landscape.

New SSRS 2016 Portal and Mobile Reports:

SSRS has finally gotten a face lift!  The portal looks much cleaner and now allows for two new types of objects to be placed. The first is the KPI (key performance indicator) which is basically just a tile on the Reports homepage that displays a value. It’s useful if you want to have specific information highlighted and readily available.  The other new object you can put up there is mobile reports.  With the acquisition of Datazen, Microsoft has basically added the functionality right into SSRS 2016. They look fantastic on a regular screen, tablet, and even phone.  It’s a great step forward in terms of making reports look great and easy to navigate.  The road map also has PowerBI reports being available to on-premise SSRS 2016 implementations. You can read all about the road map here: https://blogs.technet.microsoft.com/dataplatforminsider/2015/10/29/microsoft-business-intelligence-our-reporting-roadmap/

The Verdict?

My favorite feature is a battle between PolyBase and Always Encrypted. I love all the new features (even the ones I did not list here) but I feel like these two specifically push the boundaries of what a DBMS can do and I love it!

 
12 Comments

Posted by on April 19, 2016 in Other, SQL Server 2016

 

Tags:

Get valuable Virtual Log File (VLF) information for all your databases log files

Get valuable Virtual Log File (VLF) information for all your databases log files

I had this problem where I needed to gather Transaction Log information on multiple databases and check for valuable statistics on them. Running the command [DBCC Loginfo] brings back a number of rows for each Virtual Log File (VLF) in your Log File. It is really hard to do anything useful with that information on a larger scale. Each row returned gives you an estimate for the number of VLFs per Log File. Sure you could use the internal (and hidden) system stored procedure sp_msforeachdb to get the information for all database but it looks horrible. Here try it out for yourself before you read the rest of the post:


exec sp_MSforeachdb 'Use [?] select db_name(); DBCC LogInfo'

So why not make it better? That’s what I thought to myself, and I have recently been playing with storing DBCC command output to tables for analysis. I’ve put some together some code that allows you to capture the output of DBCC LogInfo into a Temp Table and then get some interesting information about the number of VLFs per database and other valuable information; see the comments for more information. Just by storing some of this data temporarily, I was able to write queries against it and discovered a major inconsistency in the size of my VLFs in a Log File that could potentially cause performance issues.

Feel free to create a permanent table for this data and run it on a regular basis to get an understanding of what your system is doing for troubleshooting. I also commented out the date field since I deemed it unnecessary, but if you’re looking for trending it maybe a good option to have that additional data.

Read the rest of this entry »

 

Tags: , , ,

SQL 2016 Row-Level Security Demo: Users with various clearance levels

SQL 2016 Row-Level Security Demo: Users with various clearance levels

SQL 2016 is right around the corner and one of the new security enhancements promised is Row-Level security for tables.  It’s a great new feature and pretty easy to implement.  I have created a simple demo that gives various users access to data based on specific clearance level to the data.  Feel free to modify the code and play around with it how to see fit.  There are many different ways to setup Row-Level security and this is just one scenario.  One of the things you’ll notice if you go through the scripts below is that the dbo user does not have access to the data after the the security policy is applied.  This is key for many environments where customers do not want administrators to have access to sensitive data.  Of course anyone with good coding skills and the proper permissions could circumvent that, but that’s why we put auditing measures in place🙂

I’ve broken up the code into three sections.  The first is for setting up the database and permissions.  The second section creates the tables in the database and puts test data in them.  The third section is for the creation of the function and security policy which enables Row-Level Security. After creating the function and security policy, go back to the second section and re-run the select statements to see the security policy in action.  This demo was created on SQL 2016 CTP 2.2.  If you are interested in learning more about Row-Level Security and to see some other demos please refer to this webinar from PASS.

Read the rest of this entry »

 

Tags: , , ,

Another Career Podcast – Working at Microsoft

Another Career Podcast – Working at Microsoft

I have to admit, I had a lot of fun at SQL Saturday in Philly this last June; I mean two podcasts recorded at one event!! It’s also convenient to do a Podcast with other professionals because they end up doing all the work😛

Before I post the information, I wanted to take a short break here and explain my “absence” from blogging recently. Currently with a new job, more of my kids going starting schooling, and moving, things have been busy. In fact the computer I use to record my video training material is in a box while we’re in between moves. I love to create material where I am satisfied with the quality, and so I’ve been taking a step back to realign myself and I hope to get back to blogging and creating video training soon. I hope that what I’ve provided for the community has been valuable so far and I hope to keep up the trend to help people better their own careers.

Now to the good stuff! This was another fun podcast to do with my good friend Carlos Chacon. I had the pleasure of doing the interview with my colleague and good friend Tim McAliley. Hopefully you enjoy listening as much as we enjoyed preparing this for you!

http://sqldatapartners.com/podcast/  (Episode 01)
Direct Download

 
2 Comments

Posted by on August 7, 2015 in Other

 

Tags:

Podcast about my career and the SQL Server Community with Chris Bell

Podcast about my career and the SQL Server Community with Chris Bell

During my time at SQL Saturday in Philadelphia (Early June), Chris Bell invited me to be a guest on his Podcast.  Here is the result of that fun interaction🙂

https://wateroxconsulting.com/archives/podcast/010-ayman-el-ghazali/

I just want to reiterate my thanks to everyone who ever helped me on my career path from my colleagues at work to my colleagues in the field. HUGE THANK YOU!

I hope you enjoy it!

 

Tags:

Working at Microsoft – A Year in Reflection

Working at Microsoft – A Year in Reflection

Last year after starting my new job with Microsoft as a Senior Consultant I wrote an article about my interviewing experience with many companies during my tenure at my last employer (Click here for article).  I realized it is now time to talk about my journey to Microsoft.

It has been a very busy year for me both on the personal and career front.  Having four growing boys is not an easy task to manager; I end up breaking fights regularly over who squirted who first with a water pistol.  Never-the-less it is exciting to be a father!  My year has been an interesting one of challenges and growth career wise. I spent a good number of years and several interviews in order to finally land a job at Microsoft. One of my interviews was with a SQL MCM which went decently well. Although I did not have all the answers for him, he appreciated my enthusiasm and method of trying to solve problems with the knowledge I have.

Read the rest of this entry »

 
1 Comment

Posted by on June 24, 2015 in Other

 

Tags:

 
Follow

Get every new post delivered to your Inbox.

Join 1,482 other followers

%d bloggers like this: