SQL
Loader is an Oracle tool used to load data from flat files into oracle tables.
It allows the user to selectively load certain columns/rows. Although a
programmer could duplicate the functionality of SQL Loader by writing his/her
own load program, SQL Loader has the advantage of flexibility, ease of use, and
performance. It is often used for migrating data from a non-Oracle source
system to an Oracle database.
Two important components of
SQL Loader are as follows
Data File: It
is a flat file containing the data to be loaded in the oracle table. Each line
contains one record to be loaded in the table. Each field in the datafile is
separated by a separator character.
Below is a datafile containing 2 records, each row containing 3 columns separated by pipe (|).
Below is a datafile containing 2 records, each row containing 3 columns separated by pipe (|).
1|Ninad|Kamerkar
2|Steven|Clau
2|Steven|Clau
Control File: It is a text file written in a language that SQL Loader
understands. It tells SQL Loader where to find the data, how to parse and
interpret the data, where to insert the data, and more. In other words, we can
say that control file is a mapping file containing the mapping from the
datafile to oracle table.
Below is the syntax for the control file.
LOAD DATA
INFILE <dataFile>
APPEND INTO TABLE <tableName>
FIELDS TERMINATED BY ‘<separator>’
(<list of all attribute names to load>)
INFILE <dataFile>
APPEND INTO TABLE <tableName>
FIELDS TERMINATED BY ‘<separator>’
(<list of all attribute names to load>)
Step-by-step approach for
learning SQL Loader
Creating the Target Table
First we need to create a target oracle table in which SQL Loader will load the data from datafile. Execute below DDL on your oracle database schema.
First we need to create a target oracle table in which SQL Loader will load the data from datafile. Execute below DDL on your oracle database schema.
CREATE TABLE Employee (
Empno NUMBER(4),
Name VARCHAR2(10),
Age NUMBER(4),
Dob DATE,
Salary NUMBER(8,2),
Emp_status VARCHAR2(5),
Job_desgn VARCHAR2(20)
);
|
Creating the Data File
Below is the datafile that will be loaded into
the Employee table. Save this file as Employee.LST. First line in the datafile
is the header record. We need to ignore this first line while loading the file.
"NAME"|"AGE"|"SALARY"|"DOB"|"DEPARTMENT"|"JOB
DESIGNATION"
"George"|"28"|"50000.30"|"02/02/1957"|"Soft.
Dept."|"Soft. Eng."
"Steven"|"38"|"50000"|"08/11/1946"|"Accounts
Dept"|""
|
Creating the Control File
Below is the control file for the above datafile. Save this file as Employee.ctl
Below is the control file for the above datafile. Save this file as Employee.ctl
LOAD DATA
INFILE 'Employee.LST'
APPEND
INTO TABLE
Employee
FIELDS TERMINATED BY '|'
ENCLOSED BY '"'
TRAILING NULLCOLS
(
NAME CHAR
,AGE INTEGER
EXTERNAL
,SALARY DECIMAL
EXTERNAL
,DOB DATE
"MM/DD/YYYY"
,EMP_STATUS CONSTANT 'A'
,EMPNO SEQUENCE( MAX, 1)
,SKIP_FIELD FILLER
,JOB_DESGN "NVL(:JOB_DESGN,'New
Employee')"
)
|
Line 1: LOAD
DATA specifies the beginning of the control file.
Line 2: INFILE
specifies name of input files. An asterisk (*) following this keyword indicates
data records are in the control file.
Line 3:This
specifies the table operation to be performed. Below are all such optopns.
INSERT puts rows into an empty table.
APPEND adds rows to a table.
REPLACE and TRUNCATE delete all rows in a table and place the
current data records into the table.
INSERT puts rows into an empty table.
APPEND adds rows to a table.
REPLACE and TRUNCATE delete all rows in a table and place the
current data records into the table.
Line 4: INTO
TABLE specifies the target table to load data.
Line 5: FIELDS
TERMINATED BY specifies the delimiter which is used in the datafile. In our example
we are using pipe (|).
Line 6: ENCLOSED
BY specify the character with which each field value is enclosed. In our
example we are using double quotes (“). This is an optional parameter.
Line 7: TRAILING
NULLCOLS tells SQL Loader how to handle short records with missing data. If the
record ends before the field is found, then SQL Loader could either treat the
field as null or generate an error. It uses the presence or absence of the
TRAILING NULLCOLS clause to determine the action.
Line 9 – 16: These
lines contain the actual mapping between the datafile fields and target table
column. Here we say that the 1st field coming in the datafile should be put
into Name column of employee table, 2nd field coming in the datafile should be
put into Age column of employee table and so on.
Line 12: As
we receive date in the 4th field, we specify the date format in which the date
will come in the datafile.
Line 13: It
may happen that you might not receive some data in the datafile and you may
want to default some value into the target table column while inserting the
record. CONSTANT keyword followed by constant value helps us to default column
value.
Line 14: Here
we use inbuilt function from SQL Loader called SEQUENCE(MAX, 1), which
generates a sequence for the target table. It gets the max empno value from
employee table and adds 1 to it to generate new sequence value.
Line 15: We
might want to ignore some fields in the datafile. The keyword FILLER identifies
the department fields (5th fields in datafile) that are not being loaded into
the database. The name does not matter. You can change SKIP_FIELD to any
desired value.
Line 16: Here
we use oracle function NVL which check whether JOB_DESGN (6th field in
datafile) is empty or not. If it is empty we use a default text ‘New Employee’.
Loading Your Data
Once the datafile and control file are ready, we are ready to do the loading. We use sqlldr command to load the datafile as below.
Once the datafile and control file are ready, we are ready to do the loading. We use sqlldr command to load the datafile as below.
sqlldr
<database_user>/<database_password>@<SID>
SKIP=1 ERRORS=1000 ROWS=100 CONTROL=Employee.ctl
LOG=Employee.log BAD=Employee.bad
|
Below is explanation for parameters passed to SQLLDR command.
Parameter
Name
|
Description
|
SKIP
|
Number of logical records to skip (default: 0). SKIP=1
will skip the first record which is header.
|
ERRORS
|
Number of errors to terminate the load (default: 50)
|
ROWS
|
Number of rows in the conventional path bind array or
between direct path data saves (conventional path: 64, direct path: all)
|
CONTROL
|
Control filename.
|
LOG
|
Log filename.
|
BAD
|
Bad filename. During the load process if sql loader finds
records which could not be loaded, they are inserted in the file specified
with this parameter. The record can then be reviewed to find out the problem.
Conditions that may cause a record to be rejected include integrity
constraint violation, datatype mismatches, and other errors in field
processing.
|
DISCARD
|
Discard filename. SQL Loader provides options to
selectively reject records based on special criteria. This criteria is
defined in the control file as part of the when clause. If SQL Loader
encounters a record that fails a specified when clause, the record is placed
in a discard file.
|
DISCARDS
|
Number of discards to terminate the load (default: all).
|
When we execute the above command, data is loaded in the
Employee table. If sql loader encounters any problem while loading records, a
detailed log file containing summary of the load, any errors that occurred
during the loading is also produced.
Below is the sample log file generated after successful load
operation.
Advantages of using SQL
Loader
1. Has a powerful data parsing engine which puts little limitation
on the format of the data in the datafile.
2. Can load data from multiple data files during the same load
session.
3. Can load data into multiple tables during the same load session.
4. Is character set aware (you can specify the character set of the
data).
5. Can selectively load data (you can load records based on the
records’ values).
6. Can manipulate the data before loading it, using SQL functions.
7. Can generate unique sequential key values in specified columns.
8. Can use the operating system’s file system to access the
datafile(s).
9. Does sophisticated error reporting which greatly aids
troubleshooting.
Summary
Oracle SQL Loader processes the input data
files according to the directions in a text file called the control file which
specifies the names and locations of the source data files, the format of the
data to be loaded and the data transformations to be performed when loading.
As the input files are processed, any records that do not pass format checks are written to the bad file and any records that do not meet the specified selection criteria are written to the discard file.
As the input files are processed, any records that do not pass format checks are written to the bad file and any records that do not meet the specified selection criteria are written to the discard file.
Records that pass both format and selection
criteria are written to the specified target tables but they may still be
rejected because of, for example, constraint violations in which case they are
written to the bad file along with those records rejected for being invalid.
A log file containing a detailed summary of
the load, including a description of any errors that occurred during the load
is also produced.
No comments:
Post a Comment