For example, the first thing I noticed when obtaining the total number of students was that when I ran this cell of my code:
# Get the total number of students.
student_count = school_data_complete_df.count()
student_count
I get the following results, which match what I expected:
Even the sampled output shows the same numbers, so the count itself appears correct. However, when I pull the DataFrame values into a summary table, I get the following in the Total Students column:
This is what the correct sampled output is supposed to be:
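To make the anomaly easier to reproduce outside my notebook, here is a toy DataFrame (made-up columns and values, not the real dataset) showing that .count() on a whole DataFrame returns one count per column rather than a single number:
import pandas as pd

# Toy stand-in for school_data_complete_df (made-up values)
toy_df = pd.DataFrame({
    "Student ID": [0, 1, 2],
    "student_name": ["Ann", "Bob", "Cal"],
    "math_score": [82, 61, 95],
})

# .count() on the whole DataFrame gives a Series: one count per column
print(toy_df.count())

# .count() on a single column gives a plain number
print(toy_df["Student ID"].count())  # -> 3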
I am noticing similar anomalies in later sections when calculating the passing percentages for math and reading. To start off, I determine which students passed the math and reading assessments:
passing_math = school_data_complete_df["math_score"] >= 70
passing_reading = school_data_complete_df["reading_score"] >= 70
The output I am getting (which I only noticed just now) is slightly different from the expected output:
Here is the correct output:
The rest of my code runs normally:
# Get all the students that are passing math in a new DataFrame.
passing_math = school_data_complete_df[school_data_complete_df["math_score"] >= 70]
# Get all the students that are passing reading in a new DataFrame.
passing_reading = school_data_complete_df[school_data_complete_df["reading_score"] >= 70]
# Calculate the number of students passing math.
passing_math_count = passing_math["student_name"].count()
# Calculate the number of students passing reading.
passing_reading_count = passing_reading["student_name"].count()
print(passing_math_count)
print(passing_reading_count)
That is, until I reached this cell, which gave me an error:
# Calculate the percent that passed math.
passing_math_percentage = passing_math_count / float(student_count) * 100
# Calculate the percent that passed reading.
passing_reading_percentage = passing_reading_count / float(student_count) * 100
The information after this cell of code says the following:
However, when I tried to run the code, I received a TypeError: cannot convert the series to <class 'float'>. I worked around this by editing my cell of code to look like this:
# Calculate the percent that passed math.
passing_math_percentage = passing_math_count / student_count.astype("float") * 100
# Calculate the percent that passed reading.
passing_reading_percentage = passing_reading_count / student_count.astype("float") * 100
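For context, here is a minimal reproduction of that TypeError using made-up counts (not my real numbers). float() refuses a multi-element Series, while .astype("float") simply converts each element, so the division still returns a Series instead of one percentage:
import pandas as pd

# Stand-in for what df.count() returned: one count per column (made-up values)
counts = pd.Series({"Student ID": 39170, "student_name": 39170, "math_score": 39170})

# float(counts) raises: TypeError: cannot convert the series to <class 'float'>
# .astype("float") avoids the error, but the result is still a Series,
# so dividing by it produces one percentage per column instead of a single value
print(29370 / counts.astype("float") * 100)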
Now I am not getting an error, but this is what my overall table looks like after creating a district summary DataFrame:
# Adding a list of values with keys to create a new DataFrame.
district_summary_df = pd.DataFrame(
    [{"Total Schools": school_count,
      "Total Students": student_count,
      "Total Budget": total_budget,
      "Average Math Score": average_math_score,
      "Average Reading Score": average_reading_score,
      "% Passing Math": passing_math_percentage,
      "% Passing Reading": passing_reading_percentage,
      "% Overall Passing": overall_passing_percentage}])
district_summary_df
I only want the percentage values to appear in my table and a single number to appear in the Total Students column. The correct sampled output looks like this:
student_count should be a single number rather than a Series, so count a single column. Use sum() to count all the rows that are True; you are currently using count(), which counts all rows (both True and False).
# Count the number of students using a single column so the result is a single number.
student_count = school_data_complete_df["Student ID"].count()
# Get the passing math and reading masks (True/False for each student).
passing_math = school_data_complete_df["math_score"] >= 70
passing_reading = school_data_complete_df["reading_score"] >= 70
# Calculate the number of students passing math and reading.
passing_math_count = passing_math.sum()
passing_reading_count = passing_reading.sum()
# Calculate the percentages of students that passed math and reading.
passing_math_percentage = passing_math_count / student_count * 100
passing_reading_percentage = passing_reading_count / student_count * 100
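As a quick sanity check, here is a toy example (made-up scores, not your data) showing the difference between sum() and count() on a boolean mask:
import pandas as pd

scores = pd.Series([55, 70, 88, 64, 91])  # made-up scores
passing = scores >= 70

print(passing.sum())    # 3 -> counts only the rows that are True
print(passing.count())  # 5 -> counts every non-null row, True or False
With student_count as a single number, each cell in the district summary will hold one value instead of a whole column of counts.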