<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=unicode">
<meta name=Generator content="Microsoft Word 15 (filtered)">
<style>
<!--
 /* Font Definitions */
 @font-face
	{font-family:"Cambria Math";
	panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	font-size:12.0pt;
	font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
	{color:#0563C1;
	text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
	{margin-top:0in;
	margin-right:0in;
	margin-bottom:0in;
	margin-left:.5in;
	font-size:12.0pt;
	font-family:"Calibri",sans-serif;}
.MsoChpDefault
	{font-size:10.0pt;
	font-family:"Calibri",sans-serif;}
@page WordSection1
	{size:8.5in 11.0in;
	margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
	{page:WordSection1;}
-->
</style>

</head>

<body lang=EN-US link="#0563C1" vlink="#954F72" style='word-wrap:break-word'>

<div class=WordSection1>

<p class=MsoNormal align=center style='text-align:center'><span
style='font-size:18.0pt;font-family:"Times New Roman",serif'>Data
Virtualization</span></p>

<p class=MsoNormal align=center style='text-align:center'><b><span
style='font-family:"Times New Roman",serif'>Dean Compher</span></b></p>

<p class=MsoNormal align=center style='text-align:center'><b><span
style='font-family:"Times New Roman",serif'>29 Sept 2020</span></b></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>


<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>The IBM
Data Virtualization tool allows you to have one place to query and join tables
that physically exist in a variety of places and vendors’ database
technologies.&nbsp; This makes it convenient for analysts and developers to
access many different data sources with one connection.&nbsp; Data
Virtualization is a database with tables you can query, but instead of the data
behind those tables being in files physically in the database, the data exists
in physical tables in other databases.&nbsp; Data Virtualization efficiently
queries underlying tables when you select from the <i>virtual</i> tables in
it.&nbsp; You can also create virtual tables over Excel and delimited files on remote
servers.&nbsp; Think of the tables in the Data Virtualization database as views
like those in any relational database, except that these “views” are built on
tables in remote physical databases or on files on remote servers.&nbsp; You can
use your favorite driver, such as JDBC, ODBC or CLI, to connect to the Data
Virtualization database from your application, BI tool, etc., and query these
virtual tables.&nbsp; Advantages of using Data Virtualization include letting
applications connect to one database to query and join tables from multiple
databases, and having one central place to control access to all of those
databases.&nbsp; Data Virtualization also provides ways to increase performance
by allowing you to cache data and to put agents at remote sites, which can
reduce the amount of data sent back to Data Virtualization.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>In this
article I’ll describe what Data Virtualization does and how it works so that
you can decide if it merits further investigation.&nbsp; I’ll frequently
shorten Data Virtualization to DV. </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>What is
Data Virtualization?&nbsp; It is similar to any relational database.&nbsp; All
relational databases have two main components, the engine or process that runs
on a server and the files that contain the data, indexes and other
objects.&nbsp; When the engine receives SQL to process, it reads the data in
the files and sends the result set to the client.&nbsp; Data Virtualization is
the same in that it has an engine that runs on a server, and when it receives
SQL, it processes it and sends the result set back to the client.&nbsp; The difference
is that instead of finding data in local files, it intelligently sends queries
to the source databases that actually have the data.&nbsp; It can use several
database technologies as sources, including Db2, MongoDB, SQL Server,
Oracle, Snowflake, Hive and others.&nbsp; You can see the full list by
going to the </span><a
href="https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/cpd/access/data-sources.html"><span
style='font-family:"Times New Roman",serif'>Supported data sources</span></a><span
style='font-family:"Times New Roman",serif'> page, expanding the data source
groups and looking at the Data Virtualization column.&nbsp; It can even make
Excel and CSV files on remote servers appear as relational tables, but you
need to run the DV agent on those servers to provide communication.&nbsp; </span></p>
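
<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>The idea
of an engine that holds no data of its own can be sketched in a few lines of
Python.&nbsp; This is only an analogy and not how DV is implemented: a SQLite
file stands in for a remote source database, a second SQLite connection stands
in for the DV engine, and a view stands in for a virtual table.&nbsp; The file
name sales_source.db and the names orders, sales_src and virtual_orders are
made up for the illustration.</span></p>

<pre>
```python
import os
import sqlite3
import tempfile

# Stand-in for a remote source database: a separate SQLite file.
# (Assumption: SQLite is used purely as an analogy for a remote source.)
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "sales_source.db")

source = sqlite3.connect(source_path)
source.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 19.99), (2, 5.00)])
source.commit()
source.close()

# Stand-in for the DV engine: a connection that stores no order data itself.
hub = sqlite3.connect(":memory:")
hub.execute("ATTACH DATABASE ? AS sales_src", (source_path,))

# The "virtual table": a name in the hub that resolves to the source table.
hub.execute(
    "CREATE TEMP VIEW virtual_orders AS "
    "SELECT order_id, amount FROM sales_src.orders"
)

# The client queries the hub; the data is read from the source at query time.
rows = hub.execute(
    "SELECT order_id, amount FROM virtual_orders ORDER BY order_id"
).fetchall()
print(rows)
hub.close()
```
</pre>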

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>As you
might imagine, the first thing you do after installing Data Virtualization is
to log in as the administrator and configure the connections to the data
sources.&nbsp; DV primarily uses JDBC connections, so you need to enter things
like host name, port and database name along with a user id and password for
the remote connection.&nbsp; Of course, only tables and views available to that
user in the remote source can be used by Data Virtualization.&nbsp; Even after
you configure the connections to your remote sources, there are no tables that
can be queried from Data Virtualization yet.&nbsp; To allow that, you need
to explicitly make tables available for query by creating virtual tables.&nbsp;
</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>Before you
start making tables available, you will probably want to create some schema
names to organize them.&nbsp; Creating schemas isn’t required, but it is a good
idea.&nbsp; Otherwise, the schema for each virtual table will be the user name
of the administrative user configuring access.&nbsp; These schemas do not
have to be related to the schema names in the source databases, and you can have
tables from many different databases available in one schema.</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>The
process of making a table or tables available for query is called
“Virtualize”.&nbsp; In this step the administrator searches the tables in the
sources, selects the ones to be virtualized, optionally changes the table
names, renames or excludes some columns, and optionally assigns different
schemas to those tables.&nbsp; This causes objects called virtual tables to be
built in DV.&nbsp; At this point only certain
users with a high level of authorization can see the virtual tables.&nbsp;
Finally, one of the administrators grants access to regular users by giving
authorities to individual users or to roles that contain several users.&nbsp; Once
this is done, individual users or applications can connect and query data.&nbsp;
Data can only be SELECTed:&nbsp; updates, inserts and deletes are not allowed
through DV.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>There are
four </span><a
href="https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/cpd/svc/dv/dv_user_management.html"><span
style='font-family:"Times New Roman",serif'>DV Roles</span></a><span
style='font-family:"Times New Roman",serif'> with different levels of
authority:</span></p>

<p class=MsoListParagraph style='text-indent:-.25in'><span style='font-family:
"Times New Roman",serif'>-</span><span style='font-size:7.0pt;font-family:"Times New Roman",serif'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span><span style='font-family:"Times New Roman",serif'>DV Admin</span></p>

<p class=MsoListParagraph style='text-indent:-.25in'><span style='font-family:
"Times New Roman",serif'>-</span><span style='font-size:7.0pt;font-family:"Times New Roman",serif'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span><span style='font-family:"Times New Roman",serif'>DV Engineer</span></p>

<p class=MsoListParagraph style='text-indent:-.25in'><span style='font-family:
"Times New Roman",serif'>-</span><span style='font-size:7.0pt;font-family:"Times New Roman",serif'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span><span style='font-family:"Times New Roman",serif'>DV User</span></p>

<p class=MsoListParagraph style='text-indent:-.25in'><span style='font-family:
"Times New Roman",serif'>-</span><span style='font-size:7.0pt;font-family:"Times New Roman",serif'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span><span style='font-family:"Times New Roman",serif'>DV Steward</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>Users can
be defined and authenticated locally in DV or can be users in your LDAP, AD or
other IAM system.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>DV also
lets you create views on your virtual tables.&nbsp; This is convenient if
certain tables are typically joined.&nbsp; In this case you can create views
that join tables across different databases, making things easier for end users
and developers.&nbsp; You can also add columns based on functions such as
substring, concatenate, or any number of mathematical functions.&nbsp; For
straightforward join views and union all views there are wizards that make
them easy to create.&nbsp; For more complex joins, DV provides an SQL editor
where you can issue the CREATE VIEW command.&nbsp; </span></p>
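
<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>As a
rough sketch of what such a view looks like, the Python below joins tables from
two separate databases and adds a concatenated column.&nbsp; It uses two
in-memory SQLite databases as stand-ins for two source systems, so the SQL
dialect and the names crm, billing, customers, invoices and customer_totals are
assumptions for illustration and not DV objects or DV syntax.</span></p>

<pre>
```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate source systems.
# (Assumption: an analogy only; DV would reach real remote databases.)
hub = sqlite3.connect(":memory:")
hub.execute("ATTACH DATABASE ':memory:' AS crm")
hub.execute("ATTACH DATABASE ':memory:' AS billing")

hub.execute(
    "CREATE TABLE crm.customers "
    "(cust_id INTEGER, first_name TEXT, last_name TEXT)"
)
hub.execute("CREATE TABLE billing.invoices (cust_id INTEGER, amount REAL)")
hub.execute("INSERT INTO crm.customers VALUES (1, 'Ada', 'Lovelace')")
hub.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)", [(1, 40.0), (1, 60.0)]
)

# A view that joins across both sources and adds a concatenated column.
hub.execute(
    """
    CREATE TEMP VIEW customer_totals AS
    SELECT c.first_name || ' ' || c.last_name AS full_name,
           SUM(i.amount) AS total_billed
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.cust_id = c.cust_id
    GROUP BY c.cust_id, c.first_name, c.last_name
    """
)

# End users query one object and never see that two databases are involved.
result = hub.execute(
    "SELECT full_name, total_billed FROM customer_totals"
).fetchall()
print(result)
```
</pre>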

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>You may be
thinking that selecting data through DV might be slower than directly querying
the sources, and in some cases you would be correct.&nbsp; However, the speed
for most uses will be adequate, and the performance features of DV can
mitigate performance issues or even make it faster than connecting to multiple
remote sources.&nbsp; First of all, DV has an optimization engine that is used
to intelligently query remote sources.&nbsp; IBM applied its years of experience
engineering Db2 Federation for performance to Data Virtualization.&nbsp; Next,
you can cache data in DV.&nbsp; You do this by giving DV queries that return
the data needed by the user queries that you want to accelerate, and DV caches
the results.&nbsp; You can have DV automatically refresh these results as frequently
as every hour.&nbsp; Queries launched against the cached data do not have to
exactly match the query used to create the cache.&nbsp; If the data in the
cache will help, DV will use it automatically.&nbsp; The DV console also allows
you to easily view which caches are being used and performance statistics for
them.</span></p>
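
<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>The
caching idea can be illustrated with a small Python sketch, again using SQLite
as a stand-in rather than DV itself.&nbsp; A cache query is run against the
“remote” table and its result is stored locally, and a different user query is
then answered from the stored result.&nbsp; The helper refresh_cache and the
names remote_src, sales and sales_by_region_cache are hypothetical.</span></p>

<pre>
```python
import sqlite3

# An attached in-memory database stands in for a remote source.
# (Assumption: an analogy only, not the DV cache implementation.)
hub = sqlite3.connect(":memory:")
hub.execute("ATTACH DATABASE ':memory:' AS remote_src")
hub.execute("CREATE TABLE remote_src.sales (region TEXT, amount REAL)")
hub.executemany(
    "INSERT INTO remote_src.sales VALUES (?, ?)",
    [("east", 10.0), ("east", 15.0), ("west", 7.5)],
)

# The query whose result is kept locally to speed up later user queries.
CACHE_QUERY = (
    "SELECT region, SUM(amount) AS total "
    "FROM remote_src.sales GROUP BY region"
)


def refresh_cache(conn):
    """Re-run the cache query and store its result locally (hypothetical helper)."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region_cache")
    conn.execute("CREATE TABLE sales_by_region_cache AS " + CACHE_QUERY)


refresh_cache(hub)

# A user query that differs from the cache query but can be answered from it,
# without touching the remote table again.
east_total = hub.execute(
    "SELECT total FROM sales_by_region_cache WHERE region = 'east'"
).fetchone()[0]
print(east_total)
```
</pre>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>One
difference is worth noting: in this sketch the user query names the cache table
explicitly, whereas DV decides on its own whether a cache can help.</span></p>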

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>Another very
interesting performance capability is the ability to put DV agents in remote
locations.&nbsp; If all of your source database systems are in the same data
center then the agents are not as useful, but they can increase performance
significantly when those sources are in remote data centers.&nbsp; The agent
can be placed on one or more source database servers.&nbsp; If you do not wish
to install an agent on any database servers, then the agent can be placed on a
server near your database servers.&nbsp; The agent at a remote data center
becomes especially useful when there are multiple database servers there whose
tables are frequently joined in queries.&nbsp; This is because the agent will query
the database servers near it, do the join and send only the needed results to
the central DV system.&nbsp; This not only reduces the amount of work that the
central DV server has to do but can greatly reduce the amount of data
transmitted between sites.&nbsp; As a bonus these agents can communicate with
each other and have the intelligence to learn the fastest routes back to the
central DV server and use the best routes.&nbsp; There is no additional license
charge for the agents.&nbsp; Finally, any CSV or Excel files on the server
where the agent is located can be configured to appear as virtual tables in
DV.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>All of the
administration can be done through the Data Virtualization Console as shown
below.&nbsp; The DV window here shows the main menu in the upper left over the
Connection details screen.&nbsp; This screen is where a user can find the
information needed to connect to the Data Virtualization database from a remote
client, and it even has a link to download the drivers needed.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'><img
border=0 width=468 height=268 src="DV1.fld/image001.png"
alt="Data Virtualization console showing the main menu over the Connection details screen"></span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>Users of
the virtual database include applications, business analysts creating reports
and dashboards, and data scientists.&nbsp; Administrators can also use a remote
client to do any administration task that can be done using SQL, including
creating schemas and creating views.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>Like a
number of IBM Data and AI products, Data Virtualization uses a common
administration layer for things such as user management, troubleshooting, and
deployment.&nbsp; Further, you do not buy a software part called Data
Virtualization.&nbsp; Instead, you buy enough Cloud Pak for Data to cover the Data
Virtualization workload that you want to run, and deploy just the components
needed for Data Virtualization.&nbsp; If you already own Cloud Pak for Data and
have the capacity in your system, Data Virtualization is one of the many
features you can deploy and use without acquiring licenses for an additional
product.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>In this
article I hit the highlights of the Data Virtualization features.&nbsp;
Additional features and instructions can be found in two sections of the
Knowledge Center.&nbsp; The </span><a
href="https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/cpd/svc/dv/administer-dv.html"><span
style='font-family:"Times New Roman",serif'>installation and administration
section</span></a><span style='font-family:"Times New Roman",serif'> describes
how to do things like manage users, create schemas and manage access to virtual
tables and views.&nbsp; The </span><a
href="https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_current/cpd/svc/dv/virtualizing_data.html"><span
style='font-family:"Times New Roman",serif'>Virtualizing Data</span></a><span
style='font-family:"Times New Roman",serif'> section covers adding data
sources, restrictions, creating objects and caches, and several other
items.&nbsp; </span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal align=center style='text-align:center'><span
style='font-family:"Times New Roman",serif'>***</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>This
article describes some of the useful features of Data Virtualization.&nbsp; If
you discover any interesting use cases, please tell us about them on my </span><a
href="https://www.facebook.com/db2Dean"><span style='font-family:"Times New Roman",serif'>db2Dean
Facebook Page</span></a><span style='font-family:"Times New Roman",serif'> and
share your thoughts about them.</span></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>

<p class=MsoNormal align=center style='text-align:center;text-autospace:none'><a
href="http://www.db2dean.com/"><b><span style='font-family:"Times New Roman",serif'>HOME</span></b></a><b><span
style='font-family:"Times New Roman",serif'> | </span></b><a
href="http://www.db2dean.com/Search.html"><b><span style='font-family:"Times New Roman",serif'>Search</span></b></a></p>

<p class=MsoNormal><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p>


</div>

</body>

</html>