Data is the most valuable asset of the 21st century. Lately, collecting and analyzing data became not only desired, but mandatory for any business’ success. In order to answer relevant questions and evaluate outcomes, companies often reach out to data analysts and BI specialists who turn data into cognizable information. The challenge of data rests in the speed of its generation. According to Forbes, every day we produce 2.5 quintillion bytes of data which corresponds to 2.5 million terabytes, and the number grows daily. Although not all data is compatible for analysis, companies still end up with millions of records in a single table within a data warehouse. A large data warehouse will probably be accessed by numerous employees from different departments, from different countries around the globe, for various business intelligence operations. It is crucial to enable those employees ready access while still maintaining data integrity and security.
Aside from the physical security of a data warehouse and data encryption, there is another significant method used when securing data warehouses (and information systems in general) – logical access control. The main purpose of logical access control is the protection of resources (hardware and software) from unauthorized remote access.
Many vendors of data warehouses include features in their products that simplify design of data warehouse logical access control. In this article, I will introduce you to Row-level Security, a feature that Microsoft has built in their database and data warehouse solutions.
As people in Microsoft say, Row-Level Security (or RLS) is designed to help database administrators implement restrictions on data row access. For example, ensuring that employees access only to those data rows that are relevant to their department. Since the restriction logic is located on the database side, rather than in the application that accesses the database, the security system is more reliable and robust.
Row-level Security can be applied to SQL Server, Azure SQL Database and Azure SQL Data Warehouse. It supports two types of filter conditions that define row-level access to records in a table, so called security predicates:
In RLS implementation, restricted access to data rows is controlled by a security predicate defined as a function that is invoked and enforced by a security policy. Depending on the security predicate, users can’t access the filtered rows, delete, or update them, and if the block predicate is defined, they are not able to use write operations. The function that defines security predicate is table-valued function, meaning that it returns data of a table type.
Imagine the following scenario: You work as a database developer for a hospital that provides inpatient and outpatient care. Among the tons of data that needs to be collected, the hospital’s staff wants to keep track of their patients’ records. Every time a patient visits the hospital, they want to record the valuable information such as the date he arrived, diagnosis, treatment, etc. Because of the practice of maintaining the security and confidentiality of patients’ records, you need to ensure that nurses only have access to the patients they care and restrict them access to other patients’ records. For the sake of this demonstration, let’s keep it simple and say that there are two nurses and one doctor that admit patients on daily basis. Each of them can access the hospital database using their own username, because you have already created three corresponding users in the SQL Server database. Users are Doctor, Nurse1 and Nurse2.
You have also created the required table that holds patients’ records and you simply named it PatientRecords. For now, the doctor and two nurses can access all patients’ records since you have granted them select permission. Every patient is assigned to either Nurse1 or Nurse2 which is also documented in the table.
Before Microsoft introduced Row-level Security feature in SQL Server 2016, the lowest security level of a database was table-level security, meaning that users could be restricted to access a whole table. The need of row-level restriction was met by writing custom stored procedures or more often by creating views for different users. Those approaches get the work done, but lately it has become difficult to maintain them because of the growing amount of data, and growing number of users that require large numbers of views and stored procedures. Row-level Security, on the other hand, filters rows based on execution context using security policy, which makes the solution more flexible and dynamic. As a database developer you like to keep your solution in order, so for easier maintenance you create a new schema that will contain the RLS function with security predicate.
This function is used to compare incoming @NurseResponsible argument with the username of user that is currently executing the query. The function will return 1 only if the current user is @NurseResponsible or Doctor. Since the same doctor is treating all patients, you need to make sure he has access to all of the patient records. Next step is to create a security policy that will enforce created function with the security predicate. Since you need to restrict access to rows on a read level, you need to use filter predicate and call the function with security predicate on PatientRecords table. Don’t forget to parse the correct column to @NurseResponsible argument and activate the policy.
Once you’re done, you want to test your solution, so you execute a simple SELECT statement three times, each time as different user.
Doctor should see all patients’ records while Nurse1 and Nurse2 should have access to only those patients they are responsible for. Congratulations! You have just implemented simple but powerful RLS solution.
Data security has become one of the greatest challenges in data-driven world. It is essential to find a compromise between user access and data integrity, security and privacy. This data-people-security balance is often achieved using access control features like RLS. RLS enables more fine-grained access control execution process of a query. Since you can have multiple security predicates in a single security policy, the security logic is centralized, which benefits the application transparency and there are no changes needed at the application level. However, since RLS is using table-valued functions, it may negatively affect the performance. Interesting fact: Microsoft also implemented RLS logic in their reporting solution, PowerBI!