Data Protection Rules for Data Virtualization

Dean Compher

29 January 2021



If you are using Data Virtualization in a Cloud Pak for Data environment along with Watson Knowledge Catalog (WKC), you can create data protection rules in WKC to control access to virtual tables and columns. Data protection rules let you control access based on attributes of the virtual tables, instead of having someone restrict access one by one to each virtual table known to contain sensitive information. If you already use WKC data protection rules for other data in your Cloud Pak for Data environment, then this is a good choice for controlling access to your virtual tables as well.


If you want these rules applied to virtual tables and views, you must apply them directly to the virtual objects, even if the underlying physical tables are already protected by these rules. Further, for Cloud Pak for Data (CPD) to enforce data protection rules, the Data Virtualization (DV) administrator needs to enable policy enforcement in the DV Service Settings, and the DV instance must be added to the default catalog in WKC. All of these steps, plus details on how to protect your virtual objects with rules, are in the Governing virtual data page of the Knowledge Center. If you aren’t familiar with business terms, data classes and other attributes, then please see my getting started article. Please note that applying data protection rules to virtual objects is somewhat different from applying them to directly connected objects. You can also secure the data in your virtual tables without data protection rules, as described in my previous article.


In Data Virtualization (DV) you can create virtual tables based on the source tables and also create views on the virtual tables.  I will refer to these virtual tables and views built on them as virtual objects in this article. 


Creating Data Protection Rules


The CPD Knowledge Center explains in detail how to create a data protection rule, but I’ll briefly discuss it here. To view, edit or create data protection rules, click Governance | Rules from the main CPD menu. To view and edit an existing rule, click the rule; to create a new one, click the “New Rule” button. You will only see the button if you have the right permissions as described in the above rule page AND are a collaborator on at least one Category. One interesting point is that you don’t directly select the data assets to protect. Instead, you build the rule on attributes that can be assigned to the assets, and those attributes can be assigned to both physical and virtual assets. In Example 1 you can see the list of attributes that can be used in the protection rule. In this case you can select owner, business term, data class, tag or classification.


Example 1.  Data Asset Attributes for Data Protection Rules




In Example 2, you can see a rule I created to protect anything with a credit card number. Once this rule is active, it applies to any virtual table or view that has been set up using the steps below and that has the data class “Credit Card Number” on the object or on any of its columns. If anyone other than Fred or Dean tries to access such a virtual object, an error is returned and the data is not shown. Note that Condition 1 names the attribute used to identify the data asset, and Condition 2 lists the users who are allowed to see the information. The rule denies access to everyone else. The action I selected is to deny access to the entire table. As of CPD 3.5, redacting the data in the sensitive column is in Tech Preview mode only for virtual objects and is not recommended for production. In some cases the text in the Business definition will be displayed to a denied user, so make sure to add a useful description.


Example 2.  Data Protection Rule page



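To make the logic of the rule in Example 2 concrete, here is a minimal Python sketch of how such a deny rule could be evaluated. It is only a toy model and not how CPD or WKC implement enforcement. The class names `VirtualObject` and `DenyRule`, the column names, and the user "Alice" are hypothetical, and for brevity the sketch only looks at data classes assigned to columns.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualObject:
    """Hypothetical stand-in for a profiled virtual table or view."""
    name: str
    # Maps each column name to the set of data classes assigned to it.
    column_data_classes: dict = field(default_factory=dict)


@dataclass
class DenyRule:
    """Toy model of the rule: deny access unless the user is on the allow list."""
    data_class: str           # Condition 1: attribute that identifies the asset
    allowed_users: frozenset  # Condition 2: users who may still see the data

    def applies_to(self, obj: VirtualObject) -> bool:
        # The rule is triggered if any column carries the data class.
        return any(self.data_class in classes
                   for classes in obj.column_data_classes.values())

    def denies(self, user: str, obj: VirtualObject) -> bool:
        # Deny the whole object, mirroring the "deny access" action.
        return self.applies_to(obj) and user not in self.allowed_users


rule = DenyRule("Credit Card Number", frozenset({"Fred", "Dean"}))
cards = VirtualObject("CARD_TEST_DENY",
                      {"CARD_NO": {"Credit Card Number"}, "NAME": set()})
orders = VirtualObject("ORDERS", {"ORDER_ID": set()})

print(rule.denies("Alice", cards))   # True: sensitive column, user not allowed
print(rule.denies("Dean", cards))    # False: Dean is on the allow list
print(rule.denies("Alice", orders))  # False: rule does not apply to this object
```

Notice that the rule never names a table. It only names an attribute, so any object that later receives the “Credit Card Number” data class is covered without changing the rule.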



Configure Virtual Objects to Allow Protection


Now you may be asking, “How does CPD know the attributes of a virtual table or view, such as the data class?” The short answer is that the virtual object must be added to the WKC default catalog and profiled. I’ll now discuss how this is done. First, a DV admin user who is not excluded from viewing the source table by a data protection rule must virtualize the table as discussed in my earlier Data Virtualization article. When virtualizing the table, make sure to leave the “Submit to Catalog” box checked, or submit the object to the catalog later. After virtualizing the table, the user then needs to grant access to the virtual object using the Manage access option in the My virtualized data screen. Only those given access through Manage access can see the data in the table, regardless of any data protection rules. The rules limit access further. I’ll note here that access to new virtual objects can be given to all users by checking a box in the Manage access function, and in that case data protection rules become even more useful.
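The two layers described above can be pictured with a small Python sketch. This is only an illustration under simplifying assumptions, not the product's logic: the function `can_read`, the `"*"` marker for "all users", and the shape of the two inputs are all hypothetical. The user and table names are borrowed from the error message shown later in this article.

```python
def can_read(user, obj_name, grants, rule_denied):
    """Toy two-layer check.

    grants:      maps an object name to the set of users given access through
                 Manage access; "*" stands for the "all users" check box.
    rule_denied: set of (user, object name) pairs blocked by a data
                 protection rule.
    """
    granted = grants.get(obj_name, set())
    has_grant = "*" in granted or user in granted
    # A rule can only narrow access; it never grants access by itself.
    return has_grant and (user, obj_name) not in rule_denied


grants = {"USER1001.CARD_TEST_DENY": {"*"}}
rule_denied = {("USER1003", "USER1001.CARD_TEST_DENY")}

print(can_read("USER1002", "USER1001.CARD_TEST_DENY", grants, rule_denied))  # True
print(can_read("USER1003", "USER1001.CARD_TEST_DENY", grants, rule_denied))  # False
```

When access is granted to all users, the first layer no longer filters anyone, so the rule is the only thing standing between a user and the sensitive data.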


The Submit to Catalog function discussed above only makes a request to have the virtual object published to the catalog. The user who created the virtual table or view cannot publish their own requests. Another user with the appropriate authorities must then publish the request, as discussed in the Publishing requests instructions in the Knowledge Center. On my Cloud Pak for Data instance I go to the home screen and look at the “Requests” box there to view pending requests.


After the virtual object is published to the catalog, the publishing user should go into the default catalog and profile the virtual object. Rules will only be applied to objects in the default catalog. Profiling associates data classes and business terms with the virtual table or view. If you don’t perform this step, then data protection rules will not be applied. Profile a virtual object by following the Profiling data assets page in the Knowledge Center. I start by going into the default catalog, searching for and selecting the virtual object I published, and then clicking Profile on the menu. Profiling will apply any data classes that are in scope to the columns of the virtual table or view. You will see them in the preview of the virtual object if they exist. Doing this adds an extra level of protection by potentially finding sensitive data that you didn’t know was in the virtual object, so it isn’t a bad idea to profile the virtual object even if you don’t use data protection rules. If the data classes aren’t found, you can add them manually. Either way, data protection rules built on the data classes will be used when someone tries to access the virtual object.
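If you are wondering how profiling can find sensitive data you didn’t know about, the following Python sketch shows the general idea of suggesting a data class from sampled column values. It is a toy illustration only and does not represent how WKC classifies data. The function names, the digit-length pattern, the use of the Luhn checksum, and the 80% threshold are all assumptions made for this example, and the card numbers are well-known test numbers.

```python
import re

CARD_RE = re.compile(r"\d{13,19}")


def luhn_ok(number: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card(value: str) -> bool:
    compact = re.sub(r"[ -]", "", value)  # ignore spaces and dashes
    return bool(CARD_RE.fullmatch(compact)) and luhn_ok(compact)


def suggest_data_classes(columns, threshold=0.8):
    """Suggest a data class for columns whose sampled values mostly match."""
    suggestions = {}
    for name, values in columns.items():
        if not values:
            continue
        share = sum(looks_like_card(v) for v in values) / len(values)
        if share >= threshold:
            suggestions[name] = "Credit Card Number"
    return suggestions


sample = {
    "CARD_NO": ["4111 1111 1111 1111", "5500-0000-0000-0004"],
    "NAME": ["Fred", "Dean"],
}
print(suggest_data_classes(sample))  # {'CARD_NO': 'Credit Card Number'}
```

A column that picks up a data class this way is then covered by any rule built on that data class, which is why profiling matters even for tables you believe are harmless.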


If you don’t want to worry about data classes or business terms, you can just assign tags to any virtual objects you like. Maybe you have only a few sensitive virtual objects you want to protect. In that case, find the object in the default catalog and, instead of clicking the Profile tab, choose the Overview tab and add a tag called something like “2sensitive4U” to the object. If you do this, build the data protection rule on “Tag” instead of “Data Class”.


Possible Error Messages


Here are some of the error messages you may encounter when you are trying to access a virtual object and you are blocked by a data protection rule. 


I got this error when trying to select a virtual table from the DV SQL editor when I was denied by a data protection rule:


List virtual table details failed:

SQLExecute: {42501} [IBM][CLI Driver][DB2/LINUXX8664] SQL0551N The statement failed because the authorization ID does not have the required authorization or privilege to perform the operation. Authorization ID: "USER1003". Operation: "SELECT". Object: "USER1001.CARD_TEST_DENY". SQLSTATE=42501
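If you log or monitor these failures, it can help to pull the user, operation and object out of the message text. Here is a minimal Python sketch that does so for the message above. The function name `parse_sql0551n` and the regular expression are my own assumptions based on the wording of this one message. Keep in mind that SQL0551N is a general authorization error, so the message alone does not prove that a data protection rule was the cause.

```python
import re

# Pattern based on the wording of the SQL0551N message shown above.
SQL0551N_RE = re.compile(
    r'Authorization ID: "(?P<auth_id>[^"]+)"\. '
    r'Operation: "(?P<operation>[^"]+)"\. '
    r'Object: "(?P<object>[^"]+)"\. '
    r'SQLSTATE=(?P<sqlstate>\w+)'
)


def parse_sql0551n(message):
    """Return the key fields of an SQL0551N message, or None if no match."""
    if "SQL0551N" not in message:
        return None
    match = SQL0551N_RE.search(message)
    return match.groupdict() if match else None


message = (
    'SQLExecute: {42501} [IBM][CLI Driver][DB2/LINUXX8664] SQL0551N The '
    'statement failed because the authorization ID does not have the '
    'required authorization or privilege to perform the operation. '
    'Authorization ID: "USER1003". Operation: "SELECT". '
    'Object: "USER1001.CARD_TEST_DENY". SQLSTATE=42501'
)

details = parse_sql0551n(message)
print(details["auth_id"], details["operation"], details["object"])
# USER1003 SELECT USER1001.CARD_TEST_DENY
```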


Error when trying to view the object from within a project:


An error occurred attempting to preview this asset.

The data for this data asset can't be retrieved. There might be a problem reaching the source connection or repository for the data. Or, there might be a temporary server outage.

Need more help? Contact us with this support ID:5658f79a-d12e-4d43-9c4a-3d79c0289048



In this article I wanted to point out the basic steps of applying data protection rules to virtual objects. If there are other steps you think others would benefit from knowing, please share them on my db2Dean Facebook Page.

