Search code examples
sqlsql-serverimportssmsbulk

How to import a bunch of csv that share a directory and similar naming scheme into a single table in one go


I am a little inexperienced due to being self-taught so please bear with me.

I am using SQL Server 2016. I have made the following query to import some CSVs into SQL. The table you see is the table they are supposed to go into.

USE [open secrets]

CREATE TABLE [dbo].[Lobbyists] (
    [uniqID] [varchar] (36) NOT NULL,
    [lobbyist] [varchar] (50) NULL,
    [lobbyist_raw] [varchar] (50) NULL,
    [lobbyist_id] [char] (12) NULL,
    [year] [char] (4) NULL,
    [Official Position] [varchar] (254) NULL,
    [cid] [char] (9) NULL,
    [formercongmem] [char] (1) NULL
) ON [PRIMARY]

BULK
INSERT Lobbyists
FROM 'C:\aaa open secrets\Lobby\LOB_LOBBYIST\??????????????????'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO

SELECT*FROM Lobbyists

Here are the files I want to import:

screenshot

As you can see they all live in the directory and have a similar naming scheme. I need to import everything from lob_lobbyist000 all the way to lob_lobbyist123. There are no missing numbers.

My goal is to import them all at once instead of having to do so one at a time using only SQL Server Management Studio. Please show me the correct query to accomplish this. An explanation of what is going on in the query would also be much appreciated.

UPDATE current query:

USE [open secrets]

CREATE TABLE [dbo].[Lobbyists] (
    [uniqID] [varchar] (36) NOT NULL,
    [lobbyist] [varchar] (50) NULL,
    [lobbyist_raw] [varchar] (50) NULL,
    [lobbyist_id] [char] (12) NULL,
    [year] [char] (4) NULL,
    [Official Position] [varchar] (254) NULL,
    [cid] [char] (9) NULL,
    [formercongmem] [char] (1) NULL
) ON [PRIMARY]

--BULK
--INSERT Lobbyists
--FROM 'C:\aaa open secrets\Lobby\LOB_LOBBYIST\??????????????????'
--WITH
--(
--FIELDTERMINATOR = ',',
--ROWTERMINATOR = '\n'
--)
--GO

--SELECT*FROM Lobbyists


--SELECT * INTO Lobbyists_20170309 FROM Lobbyists


DECLARE @ALLFILENAMES TABLE (WHICHPATH VARCHAR(255),WHICHFILE varchar(255))
declare @filename varchar(255),
        @path     varchar(255),
        @sql      varchar(8000),
        @cmd      varchar(1000)


--get the list of files to process:
SET @path = '"C:\aaaopensecrets\LOB_LOBBYIST\'
SET @cmd = 'dir ' + @path + '*.txt" /b'
INSERT INTO  @ALLFILENAMES(WHICHFILE)
EXEC Master..xp_cmdShell @cmd
UPDATE @ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null

SELECT * FROM @ALLFILENAMES 

--cursor loop
declare c1 cursor 
for SELECT WHICHPATH,WHICHFILE 
FROM @ALLFILENAMES
open c1
fetch next from c1 into @path,@filename
While @@fetch_status <> -1
  begin
  --bulk insert won't take a variable name, so make a sql and execute it instead:
   set @sql = 'BULK INSERT Lobbyists FROM ''' + @path + @filename + ''' '
       + '     WITH ( 
               FIELDTERMINATOR = '','', 
               ROWTERMINATOR = ''\n''
            ) '
print @sql
exec (@sql)

  fetch next from c1 into @path,@filename
  end
close c1
deallocate c1

The result:

(125 row(s) affected)

(125 row(s) affected)

(125 row(s) affected)
BULK INSERT Lobbyists FROM '"C:\aaaopensecrets\LOB_LOBBYIST\lob_lobbyist000.txt'      WITH ( 
               FIELDTERMINATOR = ',', 
               ROWTERMINATOR = '\n'
            ) 
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file ""C:\aaaopensecrets\LOB_LOBBYIST\lob_lobbyist000.txt" could not be opened. Operating system error code 123(The filename, directory name, or volume label syntax is incorrect.).

the error loops on for every single file


Solution

  • Adapted from here:

    Import Multiple CSV Files to SQL Server from a Folder

    Before you run the code below, back up your table by running this command:

    SELECT * INTO Lobbyists_20170309 FROM Lobbyists
    

    Once you've done that, you have a copy of your table.

    Now try running this:

    DECLARE @ALLFILENAMES TABLE (WHICHPATH VARCHAR(255),WHICHFILE varchar(255))
    declare @filename varchar(255),
            @path     varchar(255),
            @sql      varchar(8000),
            @cmd      varchar(1000)
    
    
    --get the list of files to process:
    SET @path = '"C:\aaa open secrets\Lobby\LOB_LOBBYIST\'
    SET @cmd = 'dir ' + @path + '*.txt" /b'
    INSERT INTO  @ALLFILENAMES(WHICHFILE)
    EXEC Master..xp_cmdShell @cmd
    UPDATE @ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null
    
    SELECT * FROM @ALLFILENAMES 
    
    --cursor loop
    declare c1 cursor 
    for SELECT WHICHPATH,WHICHFILE 
    FROM @ALLFILENAMES
    open c1
    fetch next from c1 into @path,@filename
    While @@fetch_status <> -1
      begin
      --bulk insert won't take a variable name, so make a sql and execute it instead:
       set @sql = 'BULK INSERT Lobbyists FROM ''' + @path + @filename + ''' '
           + '     WITH ( 
                   FIELDTERMINATOR = '','', 
                   ROWTERMINATOR = ''\n''
                ) '
    print @sql
    exec (@sql)
    
      fetch next from c1 into @path,@filename
      end
    close c1
    deallocate c1