How to remove JavaScript from a string using Python and then parse the remaining string into a table?

I have this string that I scraped from a university website. I want to parse it into a table where each row consists of the text before and after a colon, ":".

This is the string.

'課程中文名稱 Title of Course in Chinese：論文課程英文名稱 Title of Course in English：Thesis (Projects) 應修系級 Major：法律學系博士班2 , 授課教師 Instructor：****** 選修類別 Required/Elective：必全半學年 Whole or Half of the Academic Year：半學年學分 Credit(s)：0 學分時數 Hour(s)：0 小時 (function(window, $) { var sheetID = "1qkUIt6x8ry7F-etZJLMNKmEtDr0mwYdV3RNWw8fmOko", // 試算表代號 gid = "0", // 工作表代號 sql = "select%20B,%20C,%20D,%20E,%20F%20where%20G%20=%20'M6106'", // SQL 語法 callback = "callback"; // 回呼函數名稱 $.getScript("https://spreadsheets.google.com/tq?tqx=responseHandler:" + callback + "&tq=" + sql + "&key=" + sheetID + "&gid=" + gid); window[callback] = function(json) { var rowArray = json.table.rows, colArray = json.table.cols, rowLength = rowArray.length, colLength = colArray.length, html = "", i, j, dataGroup, dataLength, colName = new Array(); for (i = 0; i < colLength; i++) { colName[i] = colArray[i].label.replace(/彈性授課方式\W/g,''); } for (i = 0; i < rowLength; i++) { dataGroup = rowArray[i].c; dataLength = dataGroup.length; for (j = 0; j < dataLength; j++) { if (!dataGroup[j]) { continue; } if(dataGroup[j].v == "Y") html += colName[j] + ","; else if(j == (dataLength - 2) && dataGroup[j].v !== null) html += colName[j] + "-" + dataGroup[j].v + ","; } //if (dataGroup[dataLength - 2].v !== null) { //html += colName[dataLength - 2] + "-" + dataGroup[dataLength - 2].v + ","; //} html = html.substring(0,html.length - 1); html += "
"; } $("#test").html(html); if(html != "") $("#highlight").show(); }; })(window, jQuery); 「請遵守智慧財產權」及「不得非法複製及影印」。授課老師尚未建置課程大綱，若有需要請直接洽該任課教師！'

I tried to remove the JavaScript by following this Stack Overflow page.

An ad hoc approach I tried was to split the string and pair up every two elements. This is the code:

spl = "the string"
spl = [spl[i:i + 2] for i in range(0, len(spl), 2)]

I know that I can access a lot of data if I execute the JavaScript from the browser DOM. My question is: how can I first strip out the JavaScript and then parse the remaining string into a table?

Solution

Try:

import requests
from bs4 import BeautifulSoup

url = "https://sea.cc.ntpu.edu.tw/pls/dev_stud/course_query.queryGuide?g_serial=U1382&g_year=109&g_term=2&show_info=part"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for tr in soup.body.table.select("tr"):
    print(tr.get_text(strip=True))
    print("-" * 80)

Prints:

...
--------------------------------------------------------------------------------
課程中文名稱 Title of Course in Chinese：大學英文1B課程英文名稱 Title of Course in English：College English應修系級 Major：語文通識1  ,中國文學系1  ,歷史學系1  ,休閒運動管理學系1  ,法律學系財經法組1  ,法律學系法學組1  ,法律學系司法組1  ,授課教師 Instructor：殷雅玲選修類別 Required/Elective：必向度類別 Classification：全半學年 Whole or Half of the Academic Year：全學年學　　分 Credit(s)：2學分時　　數 Hour(s)：2小時
--------------------------------------------------------------------------------
彈性授課方式：
--------------------------------------------------------------------------------
教師網址 Instructor's Website ：
--------------------------------------------------------------------------------
教師專長 Instructor's Specialty ：英語教學
--------------------------------------------------------------------------------
課綱附檔 Attachments ：
--------------------------------------------------------------------------------
先修科目 Prerequisites：High school English
--------------------------------------------------------------------------------

...and so on.
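
The loop above prints each row as one flattened string, so the two steps asked about in the question (dropping the JavaScript, then splitting on the colon) still remain. Here is a minimal sketch of both using only the standard library. The helper names strip_inline_js and row_to_pair are hypothetical, and the regex assumes the script is a single "(function(...) { ... })(window, jQuery);" block like the one in the question's string; it is not a general JavaScript remover.

```python
import re

FULLWIDTH_COLON = "\uff1a"  # the "：" that separates label and value in the scraped text

# Assumption: the inline script is one immediately-invoked function that starts
# with "(function(" and ends with "})(window, jQuery);", as in the question.
IIFE_PATTERN = re.compile(r"\(function\(.*?\}\)\(window, jQuery\);", re.DOTALL)


def strip_inline_js(text):
    """Remove the inline script block from an already-flattened string."""
    return IIFE_PATTERN.sub("", text)


def row_to_pair(row_text):
    """Split 'label：value' at the first full-width colon.

    Returns None when the row has no colon (for example a separator line).
    """
    label, sep, value = row_text.partition(FULLWIDTH_COLON)
    if not sep:
        return None
    return label.strip(), value.strip()


if __name__ == "__main__":
    rows = [
        "教師專長 Instructor's Specialty ：英語教學",
        "先修科目 Prerequisites：High school English",
        "彈性授課方式：",
    ]
    table = [pair for pair in map(row_to_pair, rows) if pair is not None]
    for label, value in table:
        print(repr(label), "->", repr(value))
```

This only handles rows that hold a single label and value. A row that packs several fields into one string, like the first row printed above, would probably need to be split per table cell instead, which depends on page markup that is not shown here.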