'----
Airport SPQU :S16:20:25.6431 W071:34:22.3800 8338ft
Country Name="Peru"
State Name=""
City Name="Arequipa"
Airport Name="Rodriguez Ballon"
in file: ORBX\FTX_VECTOR\FTX_VECTOR_AEC\scenery\AEC_SPQU.bgl
----
Airport SPRF :S14:15:59.9484 W070:27:59.9997 14419ft
Country Name="Peru"
State Name=""
City Name="San Rafael"
Airport Name="San Rafael"
in file: Scenery\0304\scenery\APX29370.bgl
Start 12 : S14:15:40.9653 W070:28:38.3900 14419ft Hdg: 117.0T, Length 8760ft
Start 30 : S14:16:18.9314 W070:27:21.6092 14419ft Hdg: 297.0T, Length 8760ft
0120 Lat -14.261198 Long -70.477715 Alt 14419 Hdg 120 Len 8760 Wid 98
0300 Lat -14.272106 Long -70.455620 Alt 14419 Hdg 300 Len 8760 Wid 98
----
Airport TNCB :N12:08:25.5567 W068:16:34.3503 20ft
Country Name="Netherlands Antilles"
State Name=""
City Name="Bonaire I"
Airport Name="Flamingo"
in file: Scenery\0303\scenery\APX29270.bgl
Start 10 : N12:08:23.2891 W068:17:16.0525 20ft Hdg: 92.0T, Length 9448ft
Start 28 : N12:08:20.1144 W068:15:43.9767 20ft Hdg: 272.0T, Length 9448ft
0100 Lat 12.139818 Long -68.288246 Alt 20 Hdg 100 Len 9448 Wid 148
0280 Lat 12.138905 Long -68.261757 Alt 20 Hdg 280 Len 9448 Wid 148
----
Airport TNCC :N12:11:20.0649 W068:57:34.8897 29ft
Country Name="Netherlands Antilles"
State Name=""
City Name="Curacao I"
Airport Name="Willemstad-Hato Intl."
in file: Scenery\0303\scenery\APX29270.bgl
Start 11 : N12:11:30.5607 W068:58:24.9607 29ft Hdg: 102.1T, Length 11186ft
Start 29 : N12:11:08.2410 W068:56:38.2654 29ft Hdg: 282.1T, Length 11186ft
0110 Lat 12.191923 Long -68.974129 Alt 29 Hdg 111 Len 11186 Wid 197 ILS 111.90, Flags: GS DME BC
0290 Lat 12.185513 Long -68.943428 Alt 29 Hdg 291 Len 11186 Wid 197
----
Airport TNCE :N17:29:32.4738 W062:58:29.8992 129ft
Country Name="Netherlands Antilles"
State Name=""
City Name="St Eustatius I"
Airport Name="F.D. Roosevelt"
in file: ORBX\FTX_OLC\FTX_VECTOR_FixedAPT\scenery\APT_TNCE.BGL
Start 6 : N17:29:35.1949 W062:59:02.6666 129ft Hdg: 50.3T, Length 4268ft
Start 24 : N17:30:00.9808 W062:58:30.1439 129ft Hdg: 230.2T, Length 4268ft
0060 Lat 17.492956 Long -62.984272 Alt 129 Hdg 63 Len 4268 Wid 98
0240 Lat 17.500425 Long -62.974819 Alt 129 Hdg 243 Len 4268 Wid 98
----
Airport TNCM :N18:02:27.0378 W063:06:34.2595 13ft
Country Name="Netherlands Antilles"
State Name=""
City Name="St Maarten I"
Airport Name="Princess Juliana Intl"
in file: Scenery\0303\scenery\APX31250.bgl
Start 9 : N18:02:21.9843 W063:07:08.8215 13ft Hdg: 81.7T, Length 7150ft
Start 27 : N18:02:31.8322 W063:05:57.8823 13ft Hdg: 261.7T, Length 7150ft
0090 Lat 18.039392 Long -63.119469 Alt 13 Hdg 95 Len 7150 Wid 148
0270 Lat 18.042223 Long -63.099060 Alt 13 Hdg 275 Len 7150 Wid 148
----'
This is part of my text. I am trying to extract this part:
'----
Airport TNCB :N12:08:25.5567 W068:16:34.3503 20ft
Country Name="Netherlands Antilles"
State Name=""
City Name="Bonaire I"
Airport Name="Flamingo"
in file: Scenery\0303\scenery\APX29270.bgl
Start 10 : N12:08:23.2891 W068:17:16.0525 20ft Hdg: 92.0T, Length 9448ft
Start 28 : N12:08:20.1144 W068:15:43.9767 20ft Hdg: 272.0T, Length 9448ft
0100 Lat 12.139818 Long -68.288246 Alt 20 Hdg 100 Len 9448 Wid 148
0280 Lat 12.138905 Long -68.261757 Alt 20 Hdg 280 Len 9448 Wid 148
----'
I tried the following regex pattern, but the match starts at the very beginning of the text instead of at the block I want:
----.+?TNCB.+?----
The match ends correctly at the first "----" after "TNCB", but it does not start at the last "----" before "TNCB". How can I change the pattern so that the match begins at the nearest "----" preceding "TNCB"?
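import re

airport_tuple = ('TNCB', 'RPUJ', '00IS', 'WALQ')

def read_text():
    # Read the whole file as one string. Joining readlines() with " " would
    # prefix every line after the first with a space, so "Airport" would no
    # longer directly follow the "----" line and the pattern would not match.
    with open("symbols.txt", "r") as f:
        return f.read()

def main():
    text = read_text()
    # Group 1 holds the block between the two "----" lines.
    print(re.findall(r"(?m)^----\n(Airport\s+TNCB.*(?:\n.*)*?)\n----", text))

if __name__ == "__main__":
    main()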
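Your pattern matches from the start of the text because the regex engine always returns the leftmost possible match: the lazy .+? only limits how far the match extends, not where it begins. To start at the right block, require Airport TNCB immediately after the ---- line. You can use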
(?m)^----\n(Airport\s+TNCB.*(?:\n.*)*?)\n----
Details:
- (?m)^ - start of a line ((?m) is equal to re.M / re.MULTILINE)
- ----\n - ---- and a newline
- (Airport\s+TNCB.*(?:\n.*)*?) - Group 1:
    - Airport\s+TNCB - Airport, one or more whitespaces, TNCB
    - .* - the rest of the line
    - (?:\n.*)*? - zero or more occurrences (as few as possible) of a newline and then the rest of the line
- \n---- - a newline and a ---- substring.

In Python, you can use
re.findall(r'^----\n(Airport\s+TNCB.*(?:\n.*)*?)\n----', text, re.M)
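If you need the blocks for several airport codes (such as those in airport_tuple), one possible approach is to build the same pattern once per code. The following is only a sketch: extract_airport_blocks and the shortened sample text are hypothetical names and data made up for illustration, and the added \b is an assumption that a code is always followed by a non-word character (a space in your text).

```python
import re

def extract_airport_blocks(text, codes):
    """Return {code: block} for every code whose block is found in text.

    Hypothetical helper for illustration; it reuses the pattern above,
    substituting each code (escaped) for TNCB.
    """
    blocks = {}
    for code in codes:
        # \b keeps a code from matching as a prefix of a longer identifier.
        pattern = rf"^----\n(Airport\s+{re.escape(code)}\b.*(?:\n.*)*?)\n----"
        match = re.search(pattern, text, re.M)
        if match:
            blocks[code] = match.group(1)
    return blocks

# Shortened sample in the same layout as the question's text.
sample = (
    "----\n"
    "Airport SPRF :S14:15:59.9484 W070:27:59.9997 14419ft\n"
    'City Name="San Rafael"\n'
    "----\n"
    "Airport TNCB :N12:08:25.5567 W068:16:34.3503 20ft\n"
    'City Name="Bonaire I"\n'
    "----\n"
    "Airport TNCC :N12:11:20.0649 W068:57:34.8897 29ft\n"
    'City Name="Curacao I"\n'
    "----\n"
)

found = extract_airport_blocks(sample, ("TNCB", "RPUJ"))
print(found["TNCB"])
```

Codes that do not occur in the text (RPUJ here) are simply left out of the result. Searching once per code also sidesteps a subtlety of a single combined pattern: adjacent blocks share one ---- line, and a match that consumes the closing ---- cannot reuse it as the opening ---- of the next block.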