sql oracle11g sql-insert create-table

Why is CREATE TABLE AS SELECT faster than INSERT with SELECT?


I wrote a query with an INNER JOIN that returns 12 million rows, and I want to store the result in a table. In my tests, creating the table with the AS SELECT clause was faster than creating the table first and then running an INSERT with SELECT. I don't understand why. Can somebody explain this to me? Thanks.


Solution

  • If you use 'create table as select' (CTAS)

    CREATE TABLE new_table AS 
        SELECT * 
        FROM old_table
    

    you automatically do a direct-path insert of the data. If you do an

    INSERT INTO new_table
        SELECT * 
        FROM old_table
    

    you do a conventional insert. If you want a direct-path insert instead, you have to use the APPEND hint:

    INSERT /*+ APPEND */ INTO new_table
        SELECT * 
        FROM old_table
    

    to get performance similar to 'CREATE TABLE AS SELECT'.
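
    One way to check that an insert really ran in direct-path mode, sketched here with the new_table and old_table names used above: after a direct-path insert, the same session cannot read the table again until it commits. If the hint was not honored (for example because the table has enabled triggers or foreign keys), the first query below simply succeeds.

    ```sql
    INSERT /*+ APPEND */ INTO new_table
        SELECT *
        FROM old_table;

    -- If the insert above ran as a direct-path insert, this query fails with
    -- ORA-12838: cannot read/modify an object after modifying it in parallel
    SELECT COUNT(*) FROM new_table;

    COMMIT;

    -- After the commit the table can be queried normally again.
    SELECT COUNT(*) FROM new_table;
    ```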

    How does the usual conventional insert work?

    Oracle checks the free list of the table for an already used block of the table segment that still has free space. If the block isn't in the buffer cache, it is read into the buffer cache, the rows are inserted there, and the modified block is eventually written back to disk. During this process undo for the block is written (only a small amount of data is necessary here), and data structures are updated, for example the free list, which resides in the segment header, if that is necessary. All these changes are written to the redo buffer, too.

    How does a direct-path insert work?

    The process allocates space above the high water mark of the table, that is, beyond the already used space. It writes the data directly to disk, bypassing the buffer cache. The changes are also written to the redo buffer. When the transaction is committed, the high water mark is raised beyond the newly written data, and this data becomes visible to other sessions.
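
    To see the difference between the two insert paths on your own system, you can compare the redo and undo your session generates. This is a minimal sketch and assumes your user is allowed to query V$MYSTAT and V$STATNAME; the two statistic names are standard Oracle statistics.

    ```sql
    -- Cumulative redo and undo volume (in bytes) generated by the current session.
    -- Run this before and after each insert and subtract the values.
    SELECT n.name, s.value
    FROM   v$mystat s
           JOIN v$statname n ON n.statistic# = s.statistic#
    WHERE  n.name IN ('redo size', 'undo change vector size');
    ```

    For the same number of rows, the undo figure in particular should grow much less for the direct-path insert than for the conventional insert.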

    How can I improve CTAS and direct-path inserts?

    • You can create the table in NOLOGGING mode; then no redo information is written for the inserted data. If you do this, you should make a backup of the tablespace that contains the table after the insert, otherwise you cannot recover the table if you ever need to.
    • You can do the select in parallel

    • You can do the insert in parallel

    • If you have to maintain indexes, constraints, or even triggers during an insert operation, this can slow the insert down drastically. So you should avoid this: create indexes after the insert, and consider creating constraints with NOVALIDATE.
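
    The points above could be combined roughly as follows. This is only a sketch: the degree of parallelism (4) is an arbitrary example value, and the columns id and status as well as the index and constraint names are hypothetical.

    ```sql
    -- CTAS without redo for the data, created and read in parallel.
    CREATE TABLE new_table
        NOLOGGING
        PARALLEL 4
        AS
        SELECT /*+ PARALLEL(o, 4) */ *
        FROM old_table o;

    -- Build indexes only after the data is loaded (hypothetical column id).
    CREATE INDEX new_table_id_ix ON new_table (id) NOLOGGING PARALLEL 4;

    -- Add a constraint without checking the rows that are already in the table
    -- (hypothetical column status).
    ALTER TABLE new_table
        ADD CONSTRAINT new_table_status_ck CHECK (status IS NOT NULL)
        ENABLE NOVALIDATE;

    -- Switch back to the usual settings for normal operation.
    ALTER TABLE new_table LOGGING;
    ALTER TABLE new_table NOPARALLEL;
    ```

    If the table already exists and you load it with INSERT instead, parallel DML has to be enabled first with ALTER SESSION ENABLE PARALLEL DML, and the insert then needs the APPEND hint together with a PARALLEL hint.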