Moving data from QVD to SQL Server
24 Jan 2021

I had to move some large data (40GB+) from the QVD file format (QlikView/Qlik Sense) into a database (SQL Server). I discuss some of the pitfalls I encountered in this post…
Filter Early
My QVD had data I didn't need:

- SELECT as few columns and as few rows as possible; only what you need.
- Don't try to JOIN in the Qlik Sense load script (Qlik Sense worked better when I loaded the tables separately).
- Use a `WHERE NOT EXISTS(...)` clause to ensure the “xref” table didn't load unnecessary rows, and likewise the linked table (which effectively `INNER JOIN`s the resulting rows).
- Write each separate table to CSV output.
- Use a `NullInterpret` or `IF(ISNULL(...))` technique to distinguish `NULL` values from empty string values (see the sketch after this list).
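Here is a minimal load-script sketch of the filtering and NULL handling above; the library, table, and field names are hypothetical stand-ins for my real data. (`NullInterpret` works in the other direction: e.g. `SET NullInterpret = '<NULL>';` turns that marker back into a real NULL when reading delimited text files.)

```
// Hypothetical names throughout (lib://Data, KeyField, Detail, ...).
Xref:
LOAD KeyField,
     OtherField
FROM [lib://Data/Xref.qvd] (qvd)
WHERE NOT EXISTS(KeyField);   // skip key values already loaded, so no unnecessary rows

Linked:
LOAD KeyField,
     // Write an explicit marker for NULLs so the exported CSV can keep them
     // distinct from genuine empty strings:
     IF(ISNULL(Detail), '<NULL>', Detail) AS Detail
FROM [lib://Data/Linked.qvd] (qvd)
WHERE EXISTS(KeyField);       // only keys seen above - effectively an INNER JOIN
```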
Store QVD Into CSV
Use the Qlik Sense load script to store each of the relevant QVD files (one “table” per QVD) into a separate CSV file; a minimal STORE sketch follows the list below.
- There is no simple way to use Qlik Sense load scripts to go directly from QVD -> SQL Server
- If we write to a CSV file, we can use the SQL Server Import Wizard / SSIS to import from CSV -> SQL Server
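A minimal sketch, assuming hypothetical library and table names:

```
// Hypothetical names; repeat per table.
Main:
LOAD * FROM [lib://Data/Main.qvd] (qvd);

STORE Main INTO [lib://Export/Main.csv] (txt);   // (txt) writes delimited text
DROP TABLE Main;                                 // free RAM before the next table
```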
CSV will not contain “schema” data, so to prepare for the next step, do this:
- Export the field information from Qlik Sense (use the `FieldList` API to get `NxFieldDescription` data for each field), including `qTags` (which tell us `$numeric` or `$ascii`); this helps us choose the right/“most efficient” datatype for each column in our database.
- Export a table showing the `=MAX(LEN([YourField]))` values, so if the field is `$ascii` in Qlik Sense (and therefore `varchar` in your database), you'll know what size to make your `varchar` column. This was a major stumbling block for me; I used `nvarchar(255)`, and for a table with many millions of rows, this took up unwieldy amounts of space on the database drive. (A sketch of the length survey follows this list.)
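A minimal sketch of the length survey, assuming a table `Main` is already loaded; the field names are hypothetical:

```
// One Max(Len(...)) per $ascii field you plan to map to varchar.
FieldLengths:
LOAD
    Max(Len(CustomerName)) AS MaxLen_CustomerName,
    Max(Len(ProductCode))  AS MaxLen_ProductCode
RESIDENT Main;

STORE FieldLengths INTO [lib://Export/FieldLengths.csv] (txt);
```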
SQL Server Import Wizard
Choose datatypes for the CSV “Input Columns”
- Choose Flat File as your input.
- Check the CodePage; for me I could use `Latin1` aka `1252`; change that on the first page of the Import Wizard.
- On the “Advanced” tab, choose the datatypes and char lengths for your CSV columns; use the most efficient datatype possible to speed up the import significantly (i.e. 10-100x faster) and decrease the size on the database.
- See StackOverflow for warnings about Truncation Errors; you may want to use a datatype like `DT_TEXT` (a text stream). This slowed down the process significantly for me, and by default used an inefficient datatype in the resulting database (`nvarchar(max)`); however, choosing the `DT_TEXT` datatype for one of your CSV columns does avoid those truncation errors.
- Use a Text Qualifier; the CSV exported from Qlik Sense load scripts may use quotes `"..."` to wrap your values if your values contain commas – until I used this, I ran into “Truncation Errors” many times.
Allow the Input Columns to determine the resulting Output Columns in a new table (just an easy way to get your table created automatically).
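The wizard is point-and-click, but the same settings (code page, text qualifier) can also be expressed in T-SQL with `BULK INSERT`. This is my own illustrative alternative, not something the wizard generates; the names and path are hypothetical, and `FIELDQUOTE` requires SQL Server 2017+:

```sql
-- Hypothetical table/path; create dbo.Main first, with the datatypes chosen above.
BULK INSERT dbo.Main
FROM 'C:\Export\Main.csv'
WITH (
    FORMAT = 'CSV',      -- honours the "..." text qualifier (SQL Server 2017+)
    FIELDQUOTE = '"',
    CODEPAGE = '1252',   -- Latin1, matching the wizard setting above
    FIRSTROW = 2         -- skip the header row
);
```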
JOIN in SQL Server
I use a `JOIN` in SQL Server (see my earlier comment; I ran into limits trying to JOIN directly in the Qlik Sense load script).
- Consider using `INSERT` instead of `SELECT ... INTO` (see the Stack Overflow answer about this); a sketch of both forms follows.
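A minimal sketch of the two forms, reusing the table names from the next section; the column names are my hypothetical stand-ins:

```sql
-- INSERT into a pre-created table keeps the datatypes you chose;
-- SELECT ... INTO infers them (and creates the table) for you.
INSERT INTO dbo.Combined (MainId, LookupValue)
SELECT m.MainId, l.LookupValue
FROM dbo.Main AS m
JOIN dbo.Bridge AS b ON b.MainBridgeKey = m.MainBridgeKey
JOIN dbo.Lookup AS l ON l.LookupId = b.LookupId;

-- The alternative, which creates dbo.Combined on the fly:
-- SELECT m.MainId, l.LookupValue
-- INTO dbo.Combined
-- FROM ... (same JOINs as above)
```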
Simplify the Key
In my case, I was joining a large table (“Main”) to a small lookup table (“Lookup”) via a “bridge” table (“Bridge”). The key joining the “Main” table to the “Bridge” (“MainBridgeKey”) was at a very low level of detail, and the resulting JOIN increased the rowcount from 300K -> 300M.
I used the `HASHBYTES` function to reduce the “Main” table into a “SimplifiedMain”, as sketched below.
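A minimal sketch of the idea; the columns being hashed and the aggregation are my assumptions, since they depend on your data:

```sql
-- Hash the fine-grained attributes into one key, then aggregate so
-- "SimplifiedMain" has one row per hash instead of one per low-level detail.
SELECT
    HASHBYTES('SHA2_256',
        CONCAT(m.AttributeA, '|', m.AttributeB, '|', m.AttributeC)) AS SimplifiedKey,
    SUM(m.Amount) AS Amount
INTO dbo.SimplifiedMain
FROM dbo.Main AS m
GROUP BY HASHBYTES('SHA2_256',
        CONCAT(m.AttributeA, '|', m.AttributeB, '|', m.AttributeC));
```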