I have been asked by my school district to schedule students for non-elective classes next year based on their grade level. Unfortunately, we don't have access to some of the neat scheduling apps that can do this automatically. Rather than rostering every student by hand, I thought I'd try using Python. I've written code before to speed up some low-level admin tasks but this is proving far more challenging than I'd anticipated.
My code needs to do the following:
Here's an example of the output I'd like to see:
Class 1 Section 1 | Class 1 Section 2 |
---|---|
Student 1 | Student 3 |
Student 2 | Student 4 |
Here's another format that my code could produce:
Students | 1st Period | 2nd Period |
---|---|---|
Student 1 | Course 1 Section 1 | Course 3 Section 2 |
Student 2 | Course 2 Section 1 | Course 4 Section 1 |
Student 3 | Course 3 Section 1 | Course 1 Section 2 |
I have a CSV of 500 students (names replaced with "Student X" for confidentiality reasons) and their associated grade level. I've converted this CSV into a Pandas DataFrame and written the following code. The direction I went was to make each course (e.g., English 1, Algebra 1, etc.) into a Pandas Series that would be concatenated to a larger DataFrame. For this example, I limited it to just one course type (English 1, aka "ELA 1") as a test:
import pandas as pd
s_test = 'schedule_test.csv' # Roster of students and grade levels
df_s_test = pd.read_csv(s_test)
class_size_limit = 30
stop = 0
df_shuffle = df_s_test.sample(frac=1)
ela1 = {'ELA Section 1':[],'ELA Section 2':[],'ELA Section 3':[],'ELA Section 4':[],'ELA Section 5':[],'ELA Section 6':[]}
sections = ['Section 1','Section 2','Section 3','Section 4','Section 5','Section 6']
for s in df_shuffle.index:
for sec in sections:
if df_shuffle.loc[s,'Grade Level']=='9th':
if stop==0:
if df_shuffle.loc[s,'Student'] not in ela1[sec]:
if len(ela1[sec]) < class_size_limit:
ela1[sec]+=[df_shuffle.loc[s,'Student']]
stop+=1
stop = 0
ela1 =pd.Series(ela1)
df_ela1 = pd.DataFrame({})
df_ela1 = pd.concat([df_ela1,ela1.to_frame().T],ignore_index=True)
df_ela1.to_csv('test.csv',index=False)
Unfortunately, the DataFrame I've produced is composed of lists rather than individual strings of students:
ELA Section 1 | ELA Section 2 |
---|---|
[Student 1, Student 4...] | [Student 10, Student 152...] |
It also currently does not have a way to prevent a student from being rostered into a section that conflicts with a section they're already in. For example, my code must prevent Student 1 from being put into Algebra 1 Section 1 because it occurs during the same period as ELA Section 1.
This makes me think that I should organize my Pandas Series by the class sections in each period (e.g., all 9th grade 1st period classes, all 9th grade 2nd period classes, etc.) rather than the sections in each class type (e.g., all ELA courses, all Algebra courses, etc.).
I appreciate any help or guidance that you can offer.
I think the easiest way would be to have one data frame that will contain information about all students and all classess that they will attend. Once you will have such data frame you can get any information about attendences of the courses, such as which students attend "Course 1, Section 1" or what courses attend cirtain student. Final data frame will look like this:
Student Name Grade course 1 course 2 course 3
3 Student 4 9 Section 1 Section 1 Section 1
18 Student 19 9 Section 1 Section 1 Section 1
21 Student 22 9 Section 1 Section 1 Section 1
24 Student 25 9 Section 1 Section 1 Section 1
37 Student 38 9 Section 1 Section 1 Section 1
41 Student 42 9 Section 1 Section 1 Section 1
42 Student 43 9 Section 1 Section 1 Section 1
94 Student 95 9 Section 1 Section 1 Section 1
From where you can get information which student attends Course 1 Section1:
Student Name
3 Student 4
18 Student 19
21 Student 22
24 Student 25
37 Student 38
41 Student 42
42 Student 43
94 Student 95
Code
import pandas as pd
import random
#maximum number of grades
max_grade = 10
#create an example dataframe
students = pd.DataFrame({"Student Name": ["Student " + str(i+1) for i in range (100)], "Grade": [random.randint(1,
max_grade) for
i in range (100)]})
courses = ["course 1", "course 2", "course 3"]
sections = ['Section 1','Section 2','Section 3', "Section 4"]
grades = [i+1 for i in range(max_grade)]
class_size_limit = 30
#create columns with names of the courses and assign initial value "None"
for course in courses:
students[course] = "None"
#randomly assign students of grade 9 to different courses and different sections
for course in courses:
for sec in sections:
not_assigned_students = students[(students["Grade"] == 9) & (students[course] == "None")]
if class_size_limit < not_assigned_students.shape[0]:
sample_indexes = not_assigned_students.sample(n=class_size_limit).index
students.loc[sample_indexes, course] = sec
else:
sample_indexes = not_assigned_students.index
students.loc[sample_indexes, course] = sec
#get students assigned to the "Course1, Section 1"
students[students["course 1"]=="Section 1"][["Student Name"]]
#get number of students assigned to the "Course1, Section 1"
students[students["course 1"]=="Section 1"]["Student Name"].shape[0]