r tm

Get rid of the first column when converting a dtm matrix to a data frame


I've converted a Document Term Matrix to a data frame using this simple line:

dtm.df <- as.data.frame(inspect(dtm))

The problem is that I want to remove the first column (the filenames), but that column has no name.


Solution

  • There might be two different issues here: rownames vs. columns.

    head(mtcars)
                       mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
    

    Here you see a column printed without a name. These are the row names, which are not a column of the data frame at all; mpg is the first column. If we wanted to remove mpg without referring to its name, we could use

    mtcars <- mtcars[,-1]
    head(mtcars)
                      cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4           6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag       6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710          4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive      6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant             6  225 105 2.76 3.460 20.22  1  0    3    1
    

    On the other hand, if you are talking about the row names, you can reset them with the function rownames. Something is still printed in that position, but it is now the default row numbers rather than the original labels:

    rownames(mtcars) <- NULL
    head(mtcars)
      cyl disp  hp drat    wt  qsec vs am gear carb
    1   6  160 110 3.90 2.620 16.46  0  1    4    4
    2   6  160 110 3.90 2.875 17.02  0  1    4    4
    3   4  108  93 3.85 2.320 18.61  1  1    4    1
    4   6  258 110 3.08 3.215 19.44  1  0    3    1
    5   8  360 175 3.15 3.440 17.02  0  0    3    2
    6   6  225 105 2.76 3.460 20.22  1  0    3    1
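
Applied to the question, the filenames are most likely row names rather than a real column. The following is a minimal sketch of that case using base R only. The matrix `m`, its file names, and its terms are made up for illustration and stand in for the result of something like `as.matrix(dtm)`; they are not taken from the original data.

```r
# Hypothetical stand-in for a document-term matrix converted to a plain matrix:
# documents in rows, terms in columns. The file names are stored as row names.
m <- matrix(c(1, 0, 2,
              0, 3, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("doc1.txt", "doc2.txt"),
                            c("apple", "banana", "cherry")))

dtm.df <- as.data.frame(m)

# Only the term columns are counted; the file names are not among them.
ncol(dtm.df)

# Drop the file names by resetting the row names to the default 1, 2, ...
rownames(dtm.df) <- NULL
dtm.df
```

If the file names should be kept for later, `rownames(dtm.df)` can be saved to a separate vector before resetting it.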