
My dataframe in R shows that it holds values but is_empty and is_null return TRUE for all but the first row


I am attempting to import a plain-text file (.pgn - a summary of chess moves) into RStudio Cloud and extract only a handful of the rows for further analysis.

I import the file into a data frame with:

>pgn_df <- read.delim("test.pgn")

When I view the contents I see this:

>View(pgn_df)

                                                                X.Event.Live.Chess.
1                                                                  [Site Chess.com]
2                                                                 [Date 2022.06.14]
3                                                                         [Round -]
4                                                                [White dervogel09]
5                                                                 [Black SuperCarp]
6                                                                      [Result 0-1]
7     [CurrentPosition r1b1n1k1/1p1n4/p2pN1p1/3Pp3/1P2P1rq/3B4/P2Q1P1K/RN3R2 w - -]
8                                                                    [Timezone UTC]
9                                                                         [ECO B06]
10            [ECOUrl https://www.chess.com/openings/Modern-Defense-with-1-e4-2.d4]
11                                                             [UTCDate 2022.06.14]
12                                                               [UTCTime 12:12:11]
13                                                                  [WhiteElo 1268]
14                                                                  [BlackElo 1234]
15                                                             [TimeControl 900+10]
16                                         [Termination SuperCarp won by checkmate]
17                                                             [StartTime 12:12:11]
18                                                             [EndDate 2022.06.14]
19                                                               [EndTime 12:34:16]
20                               [Link https://www.chess.com/game/live/48947924239]
21 1. d4 g6 2. e4 Bg7 3. c4 d6 4. Nf3 Nf6 5. Bd3 e5 6. d5 c6 7. O-O cxd5 8. cxd5 a6
22  9. h3 Nbd7 10. Bg5 O-O 11. b4 h6 12. Be3 Ne8 13. Qd2 f5 14. Bxh6 f4 15. Bg5 Bf6
23 16. h4 Rf7 17. g3 Bxg5 18. Nxg5 Rf6 19. gxf4 Rxf4 20. Ne6 Rg4+ 21. Kh2 Qxh4# 0-1

However, when I try to extract some rows, it appears that only the first row contains data. I get the following results when I test:

>is_empty(pgn_df[1,1])
[1] FALSE
>is_empty(pgn_df[1,2])
TRUE

I get the same TRUE for all other rows. I am trying to extract just a handful of rows (white player, black player, opening moves, etc.), which I have done before with other plain-text files (not .pgn) that I imported into data frames, but for some reason I'm getting null values here.

When I try to extract a single row, for example the white player, I get:

>white_player <- row(pgn_df, 4)
>View(white_player)
      [,1]
 [1,] 1   
 [2,] 2   
 [3,] 3   
 [4,] 4   
 [5,] 5   
 [6,] 6   
 [7,] 7   
 [8,] 8   
 [9,] 9   
[10,] 10  
[11,] 11  
[12,] 12  
[13,] 13  
[14,] 14  
[15,] 15  
[16,] 16  
[17,] 17  
[18,] 18  
[19,] 19  
[20,] 20  
[21,] 21  
[22,] 22  
[23,] 23  
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

Solution

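  • To extract a single row, use pgn_df[4,]. To extract several, pass a vector of indices: pgn_df[c(1, 4, 7),]. Despite its name, row() does not select rows: it returns a matrix holding the row number of every cell, which is the output you are seeing. Your other example, is_empty(pgn_df[1,2]), does not work because it asks for the first row of the second column, and the data set has only one column; to look at the second row, use pgn_df[2,1]. Note also that read.delim treats the first line of the file as a header by default, which is why the Event tag became the column name X.Event.Live.Chess. instead of a row. There are good online resources on indexing data frames in R that are worth reviewing as well.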
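The indexing described above can be sketched as follows. This is a minimal illustration, not the asker's actual session: the temporary file is a hypothetical, shortened stand-in for test.pgn, and the names pgn_path, pgn_all and some_rows are made up for the example.

```r
# Hypothetical stand-in for test.pgn: only the first few tag lines
pgn_path <- tempfile(fileext = ".pgn")
writeLines(c(
  '[Event "Live Chess"]',
  '[Site "Chess.com"]',
  '[Date "2022.06.14"]',
  '[Round "-"]',
  '[White "dervogel09"]',
  '[Black "SuperCarp"]'
), pgn_path)

# By default read.delim() uses the first line as the column header,
# so the result has one row fewer than the file has lines.
pgn_df <- read.delim(pgn_path)
dim(pgn_df)  # rows, then columns; there is only one column to index

# A single cell: row 4 of column 1 (the White tag in this sample)
white_player <- pgn_df[4, 1]

# Several rows at once; drop = FALSE keeps the result a data frame
some_rows <- pgn_df[c(1, 4, 5), , drop = FALSE]

# Assumption: if the Event line should be kept as data, read without a header
pgn_all <- read.delim(pgn_path, header = FALSE)
```

With a single-column data frame, pgn_df[4, ] would return a plain vector rather than a data frame, which is why the sketch passes drop = FALSE when the data frame structure matters.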