Search code examples
google-sheetsgoogle-sheets-formulagoogle-query-language

Extract all percentages present within a text imported by IMPORTXML (Google Sheets)


enter image description here

enter image description here

As these percentages are within a function that creates the bar chart, the only way I was able to extract this data was by importing the entire function with this formula:

=IMPORTXML(
"https://int.soccerway.com/teams/spain/real-madrid-club-de-futbol/2016/statistics/",
"//div[@id='charts']//script[@type='text/javascript']")

The result is the following value:

 (function(){ function drawScoringMinutesChart() { var hSize = 180, vSize = 
180; var opts = { width: hSize, height: vSize, axisTitlesPosition: 'none', 
legend: 'none', chartArea: { left: 0, top: 0, width: hSize, height: vSize 
}, backgroundColor: '#f4f4f4', colors: ['#f85f00'], hAxis: { textPosition: 
'none', viewWindowMode: 'explicit', viewWindow: { min: 0, max: 8 }, 
baselineColor: '#999999' }, vAxis: { textPosition: 'in' }, 
enableInteractivity: true }; var data_a = new 
google.visualization.DataTable(); data_a.addColumn('string', 'Minute'); 
data_a.addColumn('number', 'Goals'); data_a.addRows(6); data_a.setValue(0, 
0, '20.0%'); data_a.setValue(0, 1, 6); data_a.setValue(1, 0, '13.3%'); 
data_a.setValue(1, 1, 4); data_a.setValue(2, 0, '3.3%'); data_a.setValue(2, 
1, 1); data_a.setValue(3, 0, '23.3%'); data_a.setValue(3, 1, 7); 
data_a.setValue(4, 0, '13.3%'); data_a.setValue(4, 1, 4); 
data_a.setValue(5, 0, '26.7%'); data_a.setValue(5, 1, 8); var chart_a = new 
google.visualization.BarChart(document.getElementById('scoring_minutes_chart_a')); 
chart_a.draw(data_a, opts); } $scru.execute(drawScoringMinutesChart, 
['google:visualization']); })(); 

I would like to know if there is any formula that would be able to filter this text and deliver only the percentages, like this:

20.0%
13.3%
3.3%
23.3%
13.3%
26.7%

So that I can, in the end, put it like this in the spreadsheet:

0-15'      20.0%
15-30'     13.3%
30-45'     3.3%
45-60'     23.3%
60-75'     13.3%
75-90'     26.7%

Solution

  • try:

    =QUERY(FLATTEN(SPLIT(REGEXREPLACE(IMPORTXML(A1, 
     "//div[@id='charts']//script[@type='text/javascript']"), 
     "[^(\d+.\d+%)]", " "), " ")), 
     "where Col1 contains '%'", 0)
    

    enter image description here