LOAD DATA (SQL)
Synopsis
LOAD DATA FROM FILE filePath INTO table
LOAD DATA FROM FILE filePath INTO table (column, column2, ...)
LOAD DATA FROM FILE filePath COLUMNS (header type, header2 type2, ...)
INTO table ...
LOAD DATA FROM FILE filePath COLUMNS (header type, header2 type2, ...)
INTO table ... VALUES (header, header2, ...)
LOAD DATA FROM FILE filePath INTO table ... USING jsonOptions
LOAD DATA FROM JDBC CONNECTION jdbcConnection
TABLE jdbcTable INTO table
LOAD DATA FROM JDBC CONNECTION jdbcConnection
TABLE jdbcTable INTO table (column,column2, ...)
LOAD DATA FROM JDBC CONNECTION jdbcConnection
TABLE jdbcTable INTO table ... VALUES (header,header2 ...)
LOAD DATA FROM JDBC CONNECTION jdbcConnection
QUERY query INTO table (column,column2, ...)
LOAD DATA FROM JDBC URL path TABLE jdbcTable ...
LOAD BULK [ %NOINDEX ] DATA FROM ...
LOAD %NOJOURN DATA FROM ...
LOAD BULK %NOJOURN DATA FROM ...
LOAD [ load-option] DATA FROM ...
Description
The LOAD DATA command loads data from a source into a previously defined InterSystems IRIS® SQL table. The source can be a data file or a table accessed using JDBC. Use this command for the rapid population of a table with well-validated data.
If the table you are loading data into is empty, LOAD DATA populates the table with the source data rows. If the table already contains data, LOAD DATA inserts the source data rows into the table without overwriting any existing rows.
When you load data, the %ROWCOUNTOpens in a new tab variable indicates the number of rows successfully loaded. If a row in the input data contains an error, LOAD DATA skips loading this row and proceeds with loading the next row. SQLCODE does not report this as an error, but the %SQL_Diag.Result log indicates how many records failed to load. For more details, see View Diagnostic Logs of Loaded Data.
The LOAD DATA command uses an underlying Java-based engine that requires a Java Virtual Machine (JVM) installation on your server. If you already have a JVM set up and accessible in your PATH environment variable, then the first time you use LOAD DATA, InterSystems IRIS automatically uses that JVM to start an External Language Server. To customize your External Language Server to use a specific JVM, or to use a remote server, see Managing External Server Connections.
Load from File
Load from File Without Headers
Use these syntaxes if your source file does not contain a header row in the first line of the file. Otherwise, LOAD DATA loads the header row into the table.
-
LOAD DATA FROM FILE filePath INTO table loads the source data from the file specified by filePath into the target SQL table. By default, LOAD DATA matches the columns from the data source to the target table by position. LOAD DATA uses the SQL column order (SELECT * column order).
-
If the data source has more columns than the input table, then the excess columns are ignored and not loaded into the table.
-
If the data source has fewer columns the input table, none of the data is loaded into the table.
The LOAD DATA command expects that the data type of the loaded data matches the data type of the target table columns.
This statement loads all the columns from the countries CSV source file into the columns that are in the corresponding positions of the target Sample.Countries table.
LOAD DATA FROM FILE 'C://mydata/countries.csv' INTO Sample.Countries
-
-
LOAD DATA FROM FILE filePath ... INTO table (column, column2, ...) loads source data positionally for only the specified target table columns. If the data source has fewer columns than the input table, then those columns are empty in the inserted rows.
This statement loads the first three columns from the countries CSV file into the Name, Continent, and Region columns of the Sample.Countries table. Even if the table stores these columns in a different order, or if there are columns in between the three shown here, LOAD DATA still loads data only into Name, Continent, and Region.
LOAD DATA FROM FILE 'C://mydata/countries.csv' INTO Sample.Countries (Name,Continent,Region)
-
LOAD DATA FROM FILE filePath COLUMNS (header type, header2 type2, ...) INTO table enables you to load data from source files that have a different column order than the target table. The COLUMNS clause provides header names and data types for the columns in the source files. The header names must match the names of columns in the target table and the data type must be consistent with the data types of those table columns.
-
If the INTO table clause specifies target columns, then the columns named in the COLUMNS clause must also appear in the INTO table clause, but they can be in any order.
-
If the INTO table clause does not specify target columns, then LOAD DATA loads the source columns positionally into the table. The COLUMNS clause must name all columns that appear in the target table.
This statement loads three columns from the countries CSV file into corresponding columns in the Sample.Countries table. If the Sample.Countries table has a different column order than the source file (for example, Name, SurfaceArea, Continent instead of Name, Continent, SurfaceArea), the table column order does not change.
LOAD DATA FROM FILE 'C://mydata/countries.csv' COLUMNS ( Name VARCHAR(50), Continent VARCHAR(30), SurfaceArea Integer) INTO Sample.Countries (Name,Continent,SurfaceArea)
-
-
LOAD DATA FROM FILE filePath COLUMNS (header type, header2 type2, ...) INTO table ... VALUES (header, header2, ...) additionally enables you to load a subset of columns from the source file into the target columns. These column names do not need to match the target table column names.
The VALUES clause specifies the source columns, as named by the headers in the COLUMNS clause, to load into the target table.
-
If the INTO table clause specifies target columns, then LOAD DATA loads the source columns into the target columns in the order those columns are specified. The number of source column headers in VALUES must match the number of columns in the INTO table clause.
-
If the INTO table clause does not specify target columns, then LOAD DATA loads the source columns positionally into the table. The number of source headers in VALUESmust match the number of columns in the table.
If you specify the VALUES clause without the COLUMNS clause, then the VALUES clause is ignored.
This statement loads three columns from the countries CSV file into corresponding columns in the Sample.Countries table. The COLUMNS clause includes a header name for an additional column, src_continent, that is not loaded into the table. This column name is ignored, but it must be included so that LOAD DATA can load the data from the subsequent columns (src_region and src_surface_area) into the table.
LOAD DATA FROM FILE 'C://mydata/countries.csv' COLUMNS ( src_name VARCHAR(50), src_continent VARCHAR(30), src_region VARCHAR(30), src_surface_area INTEGER) INTO Sample.Countries (Name,SurfaceArea,Region) VALUES (src_name,src_surface_area,src_region)
You may also supply SQL functions or ODBC function extensions in a VALUES clause to apply a transformation to all loaded data before it is added to the table. Additionally, you may specify a literal value in the VALUES clause to add that value to every row; similarly, you may use SQL functions, such as NOW(), to populate columns with values that do not originate from the file source.
Input to the VALUES clause is not validated until data is inserted into the target table. Any field names that appear in the VALUES clause must appear in the COLUMNS clause.
The following statement uses SQL functions and literals modify and populate columns loaded from a file. In the resulting table, the contents of the Name column are all in upper case, the contents of the Address column are all '-----', and the contents of the DateAdded column are populated by the result of calling NOW().
LOAD DATA FROM FILE 'C://mydata/people.csv' COLUMNS ( Name VARCHAR(20), Address VARCHAR(50), DateAdded DATE ) INTO Sample.Transactions VALUES (UCASE(Name), '-----', NOW())
-
Load from File with Headers and Specify Options
Use this syntax if the first line of your source file contains a header row. Using this syntax, you can specify an option to skip the header row. Other options include changing the default column, skipping additional rows beyond the header, and changing the default escape character.
-
LOAD DATA FROM FILE filePath INTO table ... USING jsonOptions specifies loading options by using a JSON object or a string containing a JSON object.
This statement uses a JSON object to specify that the file contains a header row, so that LOAD DATA does not include this row in the table. In this statement, it is assumed that the header names in the countries CSV file match the header names of the Sample.Countries table columns.
LOAD DATA FROM FILE 'C://mydata/countries.csv' INTO Sample.Countries USING {"from":{"file":{"header":true}}}
Note:If the header text does not validate against the field data type, such as an integer field with a header named "Total", LOAD DATA might omit the header row anyway. However, this method of validation rejection is unreliable and is not recommend. Omit the header with a USING clause instead.
The statement loads data from three columns in the countries CSV file into the three corresponding columns in the Sample.Countries table. In this statement, the header names in the countries CSV file do not match the header names of the Sample.Countries table columns. The VALUES clause specifies the column header names obtained from the file. Data from these columns is then loaded into the table columns that are in the corresponding position of the INTO table clause.
LOAD DATA FROM FILE 'C://mydata/countries.csv' INTO Sample.Countries (Name,Region,SurfaceArea) VALUES (country_name,region_name,surface_area) USING {"from":{"file":{"header":true}}}
Load from JDBC Source
-
LOAD DATA FROM JDBC CONNECTION connection TABLE jdbcTable INTO table loads data from an external JDBC data source into the target table. The data source, jdbcTable, is a JDBC-compliant SQL table that you connect to by using a defined SQL Gateway Connection, connection. For more details, see Connecting the SQL Gateway via JDBC.
This statement loads all columns from the JDBC source table, countries, into the corresponding columns of the Sample.Countries table.
LOAD DATA FROM JDBC CONNECTION MyJDBCConnection TABLE countries INTO Sample.Countries
-
LOAD DATA FROM JDBC CONNECTION connection TABLE jdbcTable (column, column2, ...) loads JDBC source data positionally for only the specified target table columns. If the JDBC source has fewer columns than the input table, then those columns are empty in the inserted rows.
This statement loads the first three columns from the JDBC countries table into the Name, Continent, and Region columns of the Countries table. Even if the table stores these columns in a different order, or if there are columns in between the three shown here, LOAD DATA still loads data only into Name, Continent, and Region.
LOAD DATA FROM JDBC CONNECTION MyConnection TABLE countries INTO Sample.Countries (Name,Continent,Region)
-
LOAD DATA FROM JDBC CONNECTION connection TABLE jdbcTable ... INTO table ... VALUES (header,header2 ...) loads data form the JDBC source for only the columns that have the header names specified in the VALUES clause. Using this syntax, you can place column data at any position in the JDBC source table into columns at any position in the target table.
The number of columns in the INTO table clause must match the number of headers in the VALUES clause. If the INTO table clause does not specify any columns, then the number of headers in the VALUES clause must match the number of the headers in the table. In this case, the source data is loaded positionally into the table.
You may also supply SQL functions or ODBC function extensions in a VALUES clause to apply a transformation to all loaded data before it is added to the table. Additionally, you may specify a literal value in the VALUES clause to add that value to every row; similarly, you may use SQL functions, such as NOW(), to populate columns with values that do not originate from the file source.
This statement loads data from the name, surface_area, and region columns of the JDBC countries table into the corresponding columns of the Sample.Countries table.
LOAD DATA FROM JDBC CONNECTION MyConnection TABLE countries INTO Sample.Countries (Name,SurfaceArea,Region) VALUES (name,surface_area,region)
The way the VALUES clause matches SQL column names positionally is similar to INSERT command syntax.
You may also supply SQL functions or ODBC function extensions in a VALUES clause to apply a transformation to all loaded data before it is added to the table. Additionally, you may specify a literal value in the VALUES clause to add that value to every row; similarly, you may use SQL functions, such as NOW(), to populate columns with values that do not originate from the file source.
Input to the VALUES clause is not validated until data is inserted into the target table. Any field names that appear in the VALUES clause must appear in the source table.
The following statement uses SQL functions and literals modify and populate columns loaded from a file. In the resulting table, the contents of the Name column are all in upper case (pulling data from the Name column of the external transactions table), the contents of the Address column are all '-----', and the contents of the DateAdded column are the result of calling NOW().
LOAD DATA FROM JDBC CONNECTION MyConnection TABLE transactions INTO Sample.Transactions (Name,Address,DateAdded) VALUES (UCASE(Name), '-----', NOW())
-
LOAD DATA FROM JDBC CONNECTION jdbcConnection QUERY query INTO table (column,column2, ...) executes the specified query against the source and loads the result set into the specified table. The query is sent to the JDBC source database as is.
Any columns specified for the INTO table should match the column order specified positionally with the columns in the query. If there are more columns in the query than you are loading into the target table, the extra columns are ignored.
LOAD DATA FROM JDBC CONNECTION SQLSamples QUERY 'SELECT A,B,C FROM Sample.Table WHERE A < 18' INTO Letters.Count (A,B,C)
-
LOAD DATA FROM JDBC URL path TABLE jdbcTable INTO table loads data from an external JDBC data source into the target table. The data source, defined by path, is located at the connection URL for the data source. For more details, see Connecting the SQL Gateway via JDBC.
This statement loads all columns from the JDBC source table, countries, into the corresponding columns of the Sample.Countries table.
LOAD DATA FROM JDBC URL jdbc:oracle:thin:@//oraserver:1521/SID TABLE countries INTO Sample.Countries
Bulk Loading Options
These options disable common checks and operations performed when data is inserted into a table. Disabling these options can significantly speed up the loading of data with a large number of rows.
These bulk loading options can result in the loading of invalid data. Before using these options, make sure that the data is valid and is from a reliable source.
-
LOAD BULK [%NOINDEX] DATA FROM ... loads data with these INSERT %keyword options specified:
-
%NOCHECK — Disables unique value checking, foreign key referential integrity checking, NOT NULL constraints (required field checks), and validation for column data types, maximum column lengths, and column data constraints.
-
%NOINDEX — Disables the defining or building of index maps during INSERT processing. During the LOAD BULK DATA operation, SQL statements run against the target table might be incomplete or return incorrect results.
All index maps are marked as NotSelectable while the command is running. By default, the index maps are marked as Selectable when the statement finishes running and the indexes are built. If you additionally specify %NOINDEX with BULK, then all the index maps for the table are set to NotSelectable until a BUILD INDEX command is issued later by the user.
-
%NOLOCK — Disables locking of the row upon INSERT.
To use the BULK keyword, you must have %NOCHECK, %NOINDEX, and %NOLOCK administrative privileges, which you can set by using the GRANT command.
This statement loads bulk data from a file.
LOAD BULK DATA FROM FILE 'C://mydata/countries.csv' INTO Sample.Countries
-
-
LOAD %NOJOURN DATA FROM ... loads data with the %NOJOURN INSERT %keyword option specified, which suppresses journaling and disables transactions for the duration of the insert operations. To use the %NOJOURN option, you must have %NOJOURN SQL administrative privileges, which you can set by using the GRANT command.
This form of the LOAD DATA command acquires a table-level lock on the target table, but each row is inserted with %NOLOCK. The table level lock is released when the command completes.
This statement loads data from a table over a JDBC connection and disables journaling.
LOAD %NOJOURN DATA FROM JDBC CONNECTION MyJDBCConnection TABLE countries INTO Sample.Countries
-
LOAD %NOJOURN BULK DATA FROM ... loads data with the INSERT %keyword options from the previous syntaxes specified. You can specify %NOJOURN and BULK in either order.
-
LOAD [ load-option ] DATA FROM ... loads data with the load-option hints, which are any combination of %NOCHECK, %NOINDEX, %NOLOCK, and %NOJOURN. This enables you to employ certain optimizations of a bulk load as needed. You cannot specify BULK with %NOCHECK or %NOLOCK.
Arguments
filePath
The server-side location of a text file containing the data to load, specified as a complete file path enclosed in quotes.
-
Each line in the file specifies a separate row to be loaded into the table. Blank lines are ignored.
-
Data values in a row are separated by a column separator character. Comma is the default column separator character. All data fields must be indicated by column separators, including unspecified data indicated by placeholder column separators. You can define a different column separator character by specifying the columnseparator option in the USING jsonOptions clause.
-
By default, no escape character is defined. To include the column separator character as a literal in a data value, enclose the data value with quotation marks. To include a quotation mark in a quoted data value, double the quote character (""). You can define an escape character specifying the escapechar option in the USING jsonOptions clause.
-
By default, data values are specified in the order of the fields in the table (or view). You can use the COLUMNS clause to specify the data in a different order. You can use a view to load a data record to a table by supplying only values for the fields that are defined in the view.
-
All data in a data file record is validated against the table’s data criteria, including the number of data fields in the record, and the data type and data length of each field. A data file record that fails validation is passed over (not loaded). No error message is issued. Data loading continues with the next record.
Date or timestamp data should be written in ODBC timestamp format (‘yyyy-mm-dd hh:mm:ss’) to ensure validation.
table
The table to load the data into. A table name can be qualified (schema.tablename), or unqualified (tablename). An unqualified table name takes the default schema name. You can specify a view to load data in the table accessed through the view.
column
The table columns to load file data into, specified in the order of the columns in the file. This list of column names enables you to specify selected table columns and to match the order of the data file items to the columns in table. Unspecified columns that are defined in the table take their default values. If this clause is omitted, all user-defined fields in table must be represented in the data file.
header
A comma-separated list of header values used to identify columns to load from the data source.
-
When loading data from a file source that does not contain a header row, specify headers in the COLUMNS header type, header2 type2, ...) clause to name the columns.
-
If you include a VALUES clause, specify these header names in VALUES (header, header2) to select which columns to load into the table.
-
If you do not include a VALUES clause, then these header names must match the column names in the target table.
-
-
When loading data from a file source that contains a header row, specify headers in the VALUES (header, header2) clause to identify which headers in the source file to load data from. These header names must exist in the source file.
-
When loading data from a JDBC source, specify headers in the VALUES (header, header2) clause to identify which columns in the JDBC source table to load data from. These header names must exist in the JDBC source table.
type
The data type of the headers specified in the COLUMNS header type, header2 type2, ...) clause. The data type for each column must be compatible with the table’s data type. The table’s data length, not the COLUMNS data length, is used to validate the data.
jsonOptions
Loading options, specified in the USING clause as a JSON (JavaScript Object Notation) object or a string containing a JSON object. These syntaxes are equivalent:
USING {"from":{"file":{"header":true}}}
USING '{"from":{"file":{"header":true}}}'
Use these JSON objects to set loading options that cannot be set using SQL keywords. Specify these objects using nested key:value pair syntax, as described in JSON Values.
The primary use of this object is to set options of the loaded data that supplements the FROM FILE syntax, although there are options for parallelizing and permitting errors during the execution of the LOAD DATA. This example shows a sample JSON object with multiple options specified. The whitespace shown here is optional and is provided for readability only.
USING
{
"from": {
"file": {
"header": true,
"skip": 2,
"charset": "UTF-8",
"escapechar": "\\",
"columnseparator": "\t"
}
}
}
This table shows the options that you can specify. Unspecified options use the default values.
Option | Description | Example |
---|---|---|
from.file.header |
Set to true (1) to indicate that the first line of the source file is a header row. Column names in this header can then be specified and used in a VALUES clause, if no COLUMNS clause is specified. For more details, see Load from File with Headers and Specify Options. Default: false (0) |
{"from":{"file":{"header":true}}} |
from.file.skip |
Specify the number of lines at the start of the file to skip. If header is set to true, then skip indicates the number of lines to skip in addition to the header. Default: 0 |
{"from":{"file":{"skip":2}}} |
from.file.charset |
Specify the character set used to parse input data. Default: LOAD DATA uses the character set of the host operating system. |
{"from":{"file":{"charset":"UTF-8"}}} |
from.file.escapechar |
Specify the escape character used for literal values, such as column separator characters that are used within a column value. Default: None |
{"from":{"file":{"escapechar":"\\"}}} |
from.file.columnseparator |
Specify the column separator character. Default: "," |
{"from":{"file":{"columnseparator":";"}}} |
into.jdbc.threads |
Specify the number of threads to parallelize the JDBC writer across. This option may be used even when not loading data from a JDBC source. Each thread feeds a single server process performing INSERT commands. If it is important that data is loaded into the table in the exact order it is defined in the table, you should specify “threads”:1. In general, you should specify a lower value when multiple LOAD DATA commands run in parallel. Default: $System.Util.NumberOfCPUs() - 2 |
{"into":{"jdbc":{"threads":4}}} |
from.jdbc.fetchsize |
Specify the how many rows to fetch at once for loading into the table. Setting a large fetchsize often improves performance when loading large amounts of data. When the fetchsize is 0, the JDBC driver chooses the fetch size; most JDBC drivers chose a value of 10. The server must also support the fetchsize parameter for this setting to take effect; otherwise, it is ignored. Default: 0 |
{"from":{"jdbc":{"fetchsize":256}}} |
maxerrors |
The maximum number of errors that may arise during the LOAD DATA command before the entire operation is determined to be a failure, closing a transaction and rolling back all changes. Default: 0 |
{"maxerrors":5} |
jdbcConnection
A defined SQL Gateway Connection name used to load data from a JDBC source. For details on establishing a JDBC connection, see Connecting the SQL Gateway via JDBC.
jdbcTable
The external SQL data source table accessed over a JDBC connection. For details on establishing a JDBC connection, see Connecting the SQL Gateway via JDBC.
path
The SQL Gateway Connection URL used to load data from a JDBC source. For details on establishing a JDBC connection, see Connecting the SQL Gateway via JDBC.
query
A SELECT query that specifies which data to load into the table. The results of the query are added to the table.
load-option
One or more INSERT %keyword options, separated by a single space character. These options specify certain behaviors for the LOAD DATA command. For details about these options, see Bulk Loading Options.
Security and Privileges
LOAD DATA is a privileged operation that requires a user to have INSERT privileges to modify the table you are loading into and to access the JVM on your server.
INSERT Privileges
To execute LOAD DATA on a table, the user must have table-level or column-level privileges for that table. In particular, the user must have INSERT privilege on the table. The Owner (creator) of the table is automatically granted all privileges for that table. If you are not the Owner, you must be granted privileges for the table. If you do not have the proper privileges, InterSystems IRIS raises a SQLCODE -99 error.
Table-level privileges are equivalent, but not identical to, having column-level privileges on all columns of the table. If you only have column-level privileges on a subset of the table’s columns, you will only be able to load data into those columns. If you attempt to load data into a column that you do not have permissions for, InterSystems IRIS raises a SQLCODE -99 error.
To determine if you have the appropriate privileges, use %CHECKPRIV. To assign table privileges to a user, use GRANT. For more details, see Privileges.
Gateway Privileges
To execute LOAD DATA, the user must have access to the JVM on your server. As with access to any external language server in InterSystems IRIS, such a connection is privileged. Users need the %Gateway_Object:USE privilege to appropriately access the JVM.
Transaction Considerations
Atomicity
LOAD DATA is an atomic operation. Like other atomic operations, a LOAD DATA command is completely rolled back if it is not successful by making use of transactions by default; if the command cannot be completed, no data is inserted and the database reverts to its state before issuing the LOAD DATA. Unlike other atomic operations, there are notable exceptions to this rule. These exceptions are as follows:
-
LOAD BULK DATA and LOAD %NOJOURN DATA do not start transactions.
-
LOAD DATA is unique among atomic operations because it enables the use of jsonoption in a USING clause. With the maxerrors JSON option, you can specify an upper limit on insertion errors during a LOAD DATA command. If this limit is reached, the transaction will fail and the database reverts to its state before issuing the LOAD DATA. If the limit is not reached, the transaction succeeds and the successfully loaded data will appear in the database; however, data that failed to be loaded in will not appear in the database.
You can modify this default for the current process within SQL by invoking SET TRANSACTION %COMMITMODE. You can modify this default for the current process in ObjectScript by invoking the SetOption()Opens in a new tab method, using this syntax:
SET status=$SYSTEM.SQL.Util.SetOption("AutoCommit",intval,.oldval)
The following intval integer options are available:
-
1 or IMPLICIT (autocommit on — default) — Calling LOAD DATA initiates and completes its own transaction.
-
2 or EXPLICIT (autocommit off) — If no transaction is in progress, LOAD DATA automatically initiates a transaction, but you must explicitly COMMIT or ROLLBACK to end the transaction. In EXPLICIT mode, the number of database operations per transaction is user-defined.
-
0 or NONE (no auto transaction) — No transaction is initiated when you invoke LOAD DATA. A failed LOAD DATA operation can leave the database in an inconsistent state, with some rows inserted and some not inserted. To provide transaction support in this mode, you must use START TRANSACTION to initiate the transaction and COMMIT or ROLLBACK to end the transaction.
A sharded table is always in no auto-transaction mode, which means all inserts, updates, and deletes to sharded tables are performed outside the scope of a transaction.
To determine the atomicity setting for the current process, use the GetOption("AutoCommit")Opens in a new tab method, as shown in this ObjectScript example:
SET stat=$SYSTEM.SQL.Util.SetOption("AutoCommit",$RANDOM(3),.oldval)
IF stat'=1 {WRITE "SetOption failed:" DO $System.Status.DisplayError(qStatus) QUIT}
SET x=$SYSTEM.SQL.Util.GetOption("AutoCommit")
IF x=1 {
WRITE "Default atomicity behavior",!
WRITE "automatic commit or rollback" }
ELSEIF x=0 {
WRITE "No transaction initiated, no atomicity:",!
WRITE "failed DELETE can leave database inconsistent",!
WRITE "rollback is not supported" }
ELSE { WRITE "Explicit commit or rollback required" }
Examples
Load Data From CSV File into SQL Tables and Views
This example shows how to load data stored in a comma-separated value (CSV) file into an existing table and view.
Create the table to load data into. This table contains three fields specifying membership data: a member ID, the membership term length in months, and the US state where the member lives, using the two-character state abbreviation.
CREATE TABLE Sample.Members (
MemberId INT PRIMARY KEY,
MemberTerm INT DEFAULT 12,
MemberState CHAR(2))
Copy these data records into a text file. Save the file on your local machine and name it members.csv. This file specifies membership IDs and member state values. The second row is missing a value, as indicated by a placeholder comma inserted before where the value would be.
6138830,MA
1720936,
4293608,NH
Use LOAD DATA to load the data into the Sample.Members table. Replace the path shown here with the path where you saved the file.
LOAD DATA FROM FILE 'C://temp/members.csv' INTO Sample.Members (MemberId,MemberState)
Examine the data. The MemberId and MemberState columns have been populated. The source file did not contain data for the MemberTerm column, so these column values default to 12. The missing row value is loaded in as a NULL value.
SELECT * FROM Sample.Members
MemberId | MemberTerm | MemberState |
---|---|---|
6138830 | 12 | MA |
1720936 | 12 | |
4293608 | 12 | NH |
LOAD DATA does not report an SQLCODE error for the missing data because the overall SQLCODE result of LOAD DATA is 0 (success). A LOAD DATA operation is considered successful if:
-
LOAD DATA can access the source.
-
The target table exists.
-
The LOAD DATA operation is valid. For example, the operation specifies the correct number of columns and the column names exist in the target table.
To view the SQLCODE errors for individual rows, along with other information about the LOAD DATA operation, you can use the %SQL_Diag.Result and %SQL_Diag.Message tables. For more details, see View Diagnostic Logs of Loaded Data.
View the messages from the most recent LOAD DATA operation. The row with the missing state abbreviation reports an SQLCODE error of -104. The results shown are truncated for readability.
SELECT actor,message,severity,sqlcode
FROM %SQL_Diag.Message
WHERE diagResult =
(SELECT TOP 1 resultId
FROM %SQL_Diag.Result
ORDER BY resultId DESC)
actor | message | severity | sqlcode |
---|---|---|---|
server | {"resultid":"1","bufferrowcount":500, ... } | info | 0 |
FileReader |
Reader Complete: Total Input file read time: 23 ms, |
completed | 0 |
JdbcWriter |
[SQLCODE: <-104>:<Field validation failed in INSERT>] [%msg: ... (Varchar Value: 'state...' Length: 5) > maxlen: (2)>] |
error | -104 |
JdbcWriter |
Writer Complete: Total write time: 72 ms, |
completed | 0 |
If you create a view from a table, you can also load data into the table by using the view. Create a view that shows only the membership ID and state columns of the Sample.Members table.
CREATE VIEW Sample.VMem (MId,State) AS SELECT MemberId,MemberState FROM Sample.Members
Copy these additional data records into a text file. Save the file on your local machine and name it members2.csv.
6785674,VT
4564563,RI
4346756,ME
Use LOAD DATA to load this new CSV data into the table by using the view you created.
LOAD DATA FROM FILE 'C://temp/members2.csv' INTO Sample.VMem(MId,State)
View the data returned by the view, which includes the data from both loaded CSV files.
SELECT * FROM Sample.VMem
MId | State |
---|---|
6138830 | MA |
1720936 | |
4293608 | NH |
6785674 | VT |
4564563 | RI |
4346756 | ME |
View the data in the base table, which includes combined column data from both CSV files. The default value of 12 is applied to the values in the MemberTerm column for the second loaded CSV file as well.
SELECT * FROM Sample.Members
MemberId | MemberTerm | MemberState |
---|---|---|
6138830 | 12 | MA |
1720936 | 12 | |
4293608 | 12 | NH |
6785674 | 12 | VT |
4564563 | 12 | RI |
4346756 | 12 | ME |
Delete the view and table.
DROP VIEW Sample.VMem
DROP TABLE Sample.Members
Troubleshooting
View Diagnostic Logs of Loaded Data
Each call to LOAD DATA generates both an entry in the SQL Diagnostic Logs, which is viewable in the Management Portal at System Operation > System Logs > SQL Diagnostic Logs, and new row in the %SQL_Diag.Result table. This table contains diagnostic information about the operation. You can view these rows by using a SELECT query. For example, this query returns the five most recent LOAD DATA calls.
SELECT TOP 5 * FROM %SQL_Diag.Result ORDER BY createTime DESC
The returned table has these columns:
-
ID — Integer ID of the log entry. This value is the primary key of the table.
-
resultId — Same as ID.
-
createTime — Timestamp for when the LOAD DATA operation took place and the log entry row was created. Timestamps are in UTC (Coordinated Universal Time), not local time.
-
namespace — Namespace in which the LOAD DATA operation took place.
-
processId — Integer ID of the process that performed the LOAD DATA operation.
-
user — User who performed the LOAD DATA operation.
-
sqlCode — SQLCODE of the overall LOAD DATA operation.
-
inputRecordCount — Number of records successfully loaded.
-
errorCount — The number of errors that occurred. Includes errors that cause LOAD DATA command and failures to load or write individual rows of data.
-
maxErrorCount — Maximum number of row insertion errors that LOAD DATA tolerates before failing the operation.
-
status — Status of the LOAD DATA operation. While LOAD DATA is executing, the status is set to "In Progress". When the LOAD DATA operation is complete, the status is updated to "Complete". If the LOAD DATA execution produces an error, the status is updated to "Failed".
-
statement — Text of the SQL statement that was executed to produce this %SQL_Diag.Result record.
The %SQL_Diag.Message table provides detailed message data for each LOAD DATA operation logged in the %SQL_Diag.Result table. The diagResult column of %SQL_Diag.Message is a foreign key reference to the resultId column of %SQL_Diag.Result table, enabling you to access messages for individuals LOAD DATA operations.
For example, this query returns all error messages associated with the LOAD DATA operation that has a resultId of 29. You can use this data to diagnose which rows of the table failed to load.
SELECT *
FROM %SQL_Diag.Message
WHERE severity = 'error'
AND diagResult = 29
The returned table has these columns:
-
ID — Integer ID of the message. This value is the primary key of the table.
-
actor — Entity that reported the message, such as server, FileReader, or JdbcWriter.
-
diagResult — Row ID of the LOAD DATA log entry recorded in the %SQL_Diag.Result table.
-
message — Message data. For errors, this column includes SQLCODE values and %msg text.
-
messageTime — Timestamp of the message in UTC (Coordinated Universal Time), not local time.
-
severity — Level of severity of the message. severity is a logical integer that has a correspond display. Valid values are "completed", "info", "warning", "error", and "abort".
-
sqlcode — SQLCODE of the message. Messages with a severity of "completed" or "info"report an SQLCODE of 0. Messages with a severity of "warning"or "error"report the SQLCODE associated with that message. Messages with a severity of "abort" report an SQLCODE of -400.