rdata-import

How to import a dataset without a header in R and attach column names stored in a separate file?


I downloaded a dataset and its column names from https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer/:

  • breast-cancer.data: the data, without a column header
  • breast-cancer.names: the column names

I need to load both files into R and then attach the column names to the data frame. Please guide me through the whole procedure.


Solution

  • The .data file is simply a CSV without column headers. The .names file is not in a standard format (that I recognize), so I read the file and assigned the names manually.

    dat <- read.csv("~/Downloads/breast-cancer.data", header=FALSE)
    names(dat) <- c("class", "age", "menopause", "tumor_size", "inv_nodes", "node_caps", "deg_malig", "breast", "breast_quad", "irradiat")
    head(dat)
    #                  class   age menopause tumor_size inv_nodes node_caps deg_malig breast breast_quad irradiat
    # 1 no-recurrence-events 30-39   premeno      30-34       0-2        no         3   left    left_low       no
    # 2 no-recurrence-events 40-49   premeno      20-24       0-2        no         2  right    right_up       no
    # 3 no-recurrence-events 40-49   premeno      20-24       0-2        no         2   left    left_low       no
    # 4 no-recurrence-events 60-69      ge40      15-19       0-2        no         2  right     left_up       no
    # 5 no-recurrence-events 40-49   premeno        0-4       0-2        no         2  right   right_low       no
    # 6 no-recurrence-events 60-69      ge40      15-19       0-2        no         2   left    left_low       no
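
As a variation on the code above, the names can also be supplied while reading, through the `col.names` argument of `read.csv`, instead of being assigned afterwards. The sketch below is self-contained: it reads two inline sample rows through the `text` argument rather than the downloaded file. The first row is copied from the output above; the second row is made up for illustration. It also assumes that missing values in this dataset are coded as `?` (my recollection of the .names file, so check it against your copy) and maps them to `NA` with `na.strings`.

```r
# Column names transcribed by hand from breast-cancer.names (same as above).
col_names <- c("class", "age", "menopause", "tumor_size", "inv_nodes",
               "node_caps", "deg_malig", "breast", "breast_quad", "irradiat")

# Inline sample standing in for the downloaded file.
# Row 1 is taken from the head() output above; row 2 is a made-up row
# that illustrates a "?" entry (assumed missing-value code).
sample_rows <- "no-recurrence-events,30-39,premeno,30-34,0-2,no,3,left,left_low,no
recurrence-events,50-59,ge40,25-29,0-2,?,2,right,left_up,no"

dat2 <- read.csv(text = sample_rows,
                 header = FALSE,
                 col.names = col_names,  # name the columns while reading
                 na.strings = "?")       # treat "?" as NA

str(dat2)
```

To use it on the real file, replace `text = sample_rows` with the path to breast-cancer.data, for example `"~/Downloads/breast-cancer.data"` as in the answer above.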