
What does netloc mean?


I'm learning to implement a login function with Flask-Login, and I came across this code in the tutorial I'm following:

@app.route('/login', methods = ['GET', 'POST'])
def login():
    if current_user.is_authenticated:
        return redirect(url_for('index'))
    form = LoginForm()
    if form.validate_on_submit():
        user = User.query.filter_by(username=form.username.data).first()
        if user is None or not user.check_password(form.password.data):
            flash('Invalid username or password')
            return redirect(url_for('login'))
        login_user(user, remember=form.remember_me.data)
        next_page = request.args.get('next')
        if not next_page or url_parse(next_page).netloc != '':  # what does this line mean?
            next_page = url_for('index')
        return redirect(next_page)
    return render_template('login.html', title='Sign In', form=form)

But I'm not sure what the line I commented above means, especially the word netloc. I know it stands for network location, but what is its purpose on that line?


Solution

  • From RFC 1808, Section 2.1, every URL should follow a specific format:

    <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
    

    Let's break this format down syntactically:

    • scheme: The protocol name (which you'll usually see as http/https)
    • netloc: Contains the network location, which includes the domain itself (and subdomain if present), the port number, and optional credentials in the form username:password. Together it may take the form username:password@cat.example:80.
    • path: Identifies the specific resource on that host, written as segments separated by /.
    • params: Extra parameters attached to the path, introduced by a semicolon. (optional)
    • query: Data passed to the resource, introduced by ? and usually written as key=value pairs. (optional)
    • fragment: Identifies a part within the resource itself, such as a section of a page, introduced by #. (optional)

    Let's take a very simple example to understand the above clearly:

    https://cat.example/list;meow?breed=siberian#pawsize
    

    In the above example:

    • https is the scheme (first element of a URL)
    • cat.example is the netloc (sits between the scheme and path)
    • /list is the path (between the netloc and params)
    • meow is the params (sits between the path and query)
    • breed=siberian is the query (between the params and fragment)
    • pawsize is the fragment (last element of a URL)

    This can be replicated programmatically using Python's urllib.parse.urlparse:

    >>> import urllib.parse
    >>> url ='https://cat.example/list;meow?breed=siberian#pawsize'
    >>> urllib.parse.urlparse(url)
    ParseResult(scheme='https', netloc='cat.example', path='/list', params='meow', query='breed=siberian', fragment='pawsize')
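
    The netloc can carry more than a bare hostname. As a small sketch, the URL below (a made-up one, with placeholder credentials and an explicit port) shows how urlparse keeps the whole netloc together while also exposing its pieces as attributes:

```python
from urllib.parse import urlparse

# Hypothetical URL with credentials and an explicit port, for illustration only.
parts = urlparse("https://username:password@cat.example:80/list")

print(parts.netloc)    # username:password@cat.example:80
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # cat.example
print(parts.port)      # 80 (an int, not a string)
```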
    

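    Now coming to your code: the if statement falls back to the index page when next_page is missing or when next_page has a non-empty netloc. Checking .netloc != '' detects an absolute URL, that is, one that names a host. A relative URL has a path but no hostname (and thus no netloc), so only relative URLs are accepted as redirect targets. The purpose is to keep the redirect inside your own application: without this check, someone could craft a login link whose next argument points to another site, and your app would send the user there right after logging in.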
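
    To see that check in isolation, here is a minimal sketch that uses the standard library's urlparse instead of Werkzeug's url_parse. The helper name safe_next_page, the "/index" default, and the example URLs are assumptions made for illustration; they are not part of the tutorial's code.

```python
from urllib.parse import urlparse


def safe_next_page(next_page, default="/index"):
    """Return next_page only if it is a relative URL; otherwise return default.

    Hypothetical helper that mirrors the condition in login().
    """
    # A missing value or a non-empty netloc (an absolute URL) is rejected.
    if not next_page or urlparse(next_page).netloc != "":
        return default
    return next_page


print(safe_next_page(None))                          # /index
print(safe_next_page("/profile"))                    # /profile
print(safe_next_page("https://evil.example/phish"))  # /index
print(safe_next_page("//evil.example/phish"))        # /index
```

    The last call shows that a scheme-less URL starting with // still has a netloc, so it is rejected as well. This is only a sketch of the idea, not a complete defense against every malformed redirect target.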