Search code examples
apache-sparktaskstage

Interpretting Spark Stage Output Log


When running a spark job on an AWS cluster I believe that I've changed my code properly to distribute both the data and the work of the algorithm I am using. But the output looks like this:

[Stage 3:>                                                       (0 + 2) / 1000]
[Stage 3:>                                                       (1 + 2) / 1000]
[Stage 3:>                                                       (2 + 2) / 1000]
[Stage 3:>                                                       (3 + 2) / 1000]
[Stage 3:>                                                       (4 + 2) / 1000]
[Stage 3:>                                                       (5 + 2) / 1000]
[Stage 3:>                                                       (6 + 2) / 1000]
[Stage 3:>                                                       (7 + 2) / 1000]
[Stage 3:>                                                       (8 + 2) / 1000]
[Stage 3:>                                                       (9 + 2) / 1000]
[Stage 3:>                                                      (10 + 2) / 1000]
[Stage 3:>                                                      (11 + 2) / 1000]
[Stage 3:>                                                      (12 + 2) / 1000]
[Stage 3:>                                                      (13 + 2) / 1000]
[Stage 3:>                                                      (14 + 2) / 1000]
[Stage 3:>                                                      (15 + 2) / 1000]
[Stage 3:>                                                      (16 + 2) / 1000]

Am I correct to interpret the 0 + 2 / 1000 as only one two core processor carrying out one of the 1000 tasks at a time? With 5 nodes (10 processors) why wouldn't I see 0 + 10 / 1000?


Solution

  • This looks more like the output I wanted:

    [Stage 2:=======>                                             (143 + 20) / 1000]
    [Stage 2:=========>                                           (188 + 20) / 1000]
    [Stage 2:===========>                                         (225 + 20) / 1000]
    [Stage 2:==============>                                      (277 + 20) / 1000]
    [Stage 2:=================>                                   (326 + 20) / 1000]
    [Stage 2:==================>                                  (354 + 20) / 1000]
    [Stage 2:=====================>                               (405 + 20) / 1000]
    [Stage 2:========================>                            (464 + 21) / 1000]
    [Stage 2:===========================>                         (526 + 20) / 1000]
    [Stage 2:===============================>                     (588 + 20) / 1000]
    [Stage 2:=================================>                   (633 + 20) / 1000]
    [Stage 2:====================================>                (687 + 20) / 1000]
    [Stage 2:=======================================>             (752 + 20) / 1000]
    [Stage 2:===========================================>         (824 + 20) / 1000]
    

    In AWS EMR make sure that the --executor-cores option is set to the number of nodes you are using like this:enter image description here