I need to save a file under the same name as the file that a given acquisition path points to.
Given a URL, I would like to parse it and extract the file name. Here's my code.
I read a JSON parameter and pass it to the URL-parsing function. The acquisition path is a string.
ParseUrl.py:
import os
from urllib.parse import urlparse as up
a = up(jtp["AcquisitionPath"]) # => http://127.0.0.1:8000/Users/YodhResearch/Desktop/LongCtrl10min.tiff
print(a)
print(os.path.basename(a))
Result:
ParseResult(scheme='http', netloc='127.0.0.1:8000', path='/Users/YodhResearch/Desktop/LongCtrl10min.tiff', params='', query='', fragment='')
[....]
TypeError: expected str, bytes or os.PathLike object, not ParseResult
As you can see, it parses the URL, but "LongCtrl10min.tiff" is not in the fragment field; it is all in the path field. Why is that happening? Is it because "AcquisitionPath" is a string and urlparse treats the whole thing as a single path?
EDIT:
a.path works, but I would like to know why the file name does not end up in the fragment field.
Here's another example:
import os
from urllib.parse import urlparse as up
string = "http://127.0.0.1:8000/GIULIO%20FERRARI%20FOLDER/Giulio%20_%20CSV/Py%20Script/sparse%20python/tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json"
a = up(string)
print(a)
print(os.path.basename(a))
Results:
ParseResult(scheme='http', netloc='127.0.0.1:8000', path='/GIULIO%20FERRARI%20FOLDER/Giulio%20_%20CSV/Py%20Script/sparse%20python/tiff_test.tiff_IDAnal', params='', query='', fragment='1_IDAcq#10_TEMP_.json')
See, now it doesn't return the fragment I expected, which should be: "tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json"
SOLUTION:
The fragment is only the part that follows a '#' symbol! Thanks to all.
There are two issues here: how to identify the components of a URL, and how to create the desired path from those components.
First, you are confused over what the fragment actually is. From RFC 3986:
The following are two example URIs and their component parts:
   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment
    |   _____________________|__
   / \ /                        \
   urn:example:animal:ferret:nose
The fragment is only the portion following the #, not the entire final component of the path.
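As a quick check of the breakdown above, here is a minimal sketch that feeds the RFC example URI to urlparse and prints each component. The variable names (parts, no_fragment) are only illustrative.

```python
from urllib.parse import urlparse

# Example URI taken from RFC 3986.
parts = urlparse("foo://example.com:8042/over/there?name=ferret#nose")

print(parts.scheme)    # foo
print(parts.netloc)    # example.com:8042 (the "authority")
print(parts.path)      # /over/there
print(parts.query)     # name=ferret
print(parts.fragment)  # nose

# A URL without '#' has an empty fragment; the file name stays in the path.
no_fragment = urlparse("http://127.0.0.1:8000/Users/YodhResearch/Desktop/LongCtrl10min.tiff")
print(repr(no_fragment.fragment))  # ''
print(no_fragment.path)            # /Users/YodhResearch/Desktop/LongCtrl10min.tiff
```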
Second, the urlparse() function from the urllib.parse module returns a ParseResult object, while the basename() function from os.path expects a str as its argument. That mismatch is what raises the TypeError.
What you probably want is to get the path from the ParseResult object. You get it with a.path (the path component of the URL you passed to urlparse is stored in the path attribute of the ParseResult object).
import os
from urllib.parse import urlparse as up
a = up("http://127.0.0.1:8000/Users/YodhResearch/Desktop/LongCtrl10min.tiff")
print(os.path.basename(a.path))
This will output:
LongCtrl10min.tiff
If you also want to include the fragment, you can add it explicitly. The fragment is stored in a separate attribute of the ParseResult object, a.fragment in your case:
import os
from urllib.parse import urlparse as up
a = up("http://127.0.0.1:8000/Users/YodhResearch/Desktop/LongCtrl10min.tiff#anyfragment")
print(os.path.basename(a.path) + "#" + a.fragment)
This will output:
LongCtrl10min.tiff#anyfragment
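Putting the two steps together, a possible sketch for the second URL from the question could look like this. The helper name filename_from_url and its keep_fragment flag are made up for illustration, and the percent-decoding with unquote is an addition that the question did not ask for. The last lines assume you control how the URL is built, in which case encoding '#' as %23 keeps it in the path.

```python
import posixpath
from urllib.parse import quote, unquote, urlparse


def filename_from_url(url, keep_fragment=False):
    """Hypothetical helper: return the last path segment of url, percent-decoded."""
    parts = urlparse(url)
    # URL paths always use '/', so posixpath behaves the same on every OS.
    name = unquote(posixpath.basename(parts.path))
    if keep_fragment and parts.fragment:
        # Re-attach everything after the first '#'.
        name += "#" + unquote(parts.fragment)
    return name


url = ("http://127.0.0.1:8000/GIULIO%20FERRARI%20FOLDER/Giulio%20_%20CSV/"
       "Py%20Script/sparse%20python/tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json")
print(filename_from_url(url))                      # tiff_test.tiff_IDAnal
print(filename_from_url(url, keep_fragment=True))  # tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json

# If you build the URL yourself, encoding '#' as %23 keeps it inside the path.
encoded = "http://127.0.0.1:8000/" + quote("tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json")
print(encoded)                     # http://127.0.0.1:8000/tiff_test.tiff_IDAnal%231_IDAcq%2310_TEMP_.json
print(filename_from_url(encoded))  # tiff_test.tiff_IDAnal#1_IDAcq#10_TEMP_.json
```

Using posixpath instead of os.path is a design choice here: os.path follows the local platform's separator rules, whereas URL paths use forward slashes everywhere.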