Will creating a thread for every user of my API scale well?

I am building an API that will interact with the Spotify API in Java Spark. I went with the Authorization Code Flow for the token management - which means that the token will be valid for an hour (for a given user), and then needs to refresh.

For every user that connects their Spotify account, I create a Timer that will check if the user was active after 50 minutes:

If yes -> I refresh the user's token. If no -> I delete the user together with the user's token (to save storage), which means that they will have to log in again if they want to use my service.

I also keep a HashMap with a User-object, with various information from every user like their profile names, images, playlists etc. This is also deleted from the HashMap if the check from the timer proves that this user was inactive.

The PROBLEM: Every Timer-object creates a new Thread. If there were, theoretically, thousands of users using my service, there would be thousands of threads... My intuition tells me that this is unacceptable, but I cannot seem to wrap my head around an alternative. How should I keep track of when 50 minutes have passed for each user, while keeping the number of threads as low as possible and not overloading the API? Any tips would be appreciated!

Code:

package Authentication;

import Spotify.Users.UserSessions;
import java.util.Date;
import java.util.Set;
import java.util.Timer;
import java.util.TimerTask;

public class RefreshTokens extends TimerTask {
    private UserSessions userSessions;
    private Authentication authentication;
    private String currentUserSession;
    private Timer timer = new Timer(true);

    public RefreshTokens(UserSessions userSessions, Authentication authentication, String currentUserSession) {
        this.userSessions = userSessions;
        this.authentication = authentication;
        this.currentUserSession = currentUserSession;
    }

    public void startAutomaticProcess() {
        timer.schedule(this, 20000, 20000); //runs every 20 seconds for testing purposes
    }

    @Override
    public void run() {
        System.out.println("Automatic process started: " + new Date());
        refresh();
    }

    private void refresh() {
        if (userSessions.contains(currentUserSession)) {
            if (userSessions.get(currentUserSession).isActive()) {
                authentication.refreshToken(userSessions.get(currentUserSession));
            } else {
                System.out.println("User was not active enough and has been removed from the server.");
                System.out.println("----------");
                System.out.println("Size of HashMap before: " + userSessions.getHashMap().size());
                userSessions.getHashMap().remove(currentUserSession);
                System.out.println("Size of HashMap after: " + userSessions.getHashMap().size());
                timer.cancel();
                timer.purge();
            }
        }
    }
}

I create a new instance of this class for every new user and call the startAutomaticProcess() method.

Solution

Will creating a thread for every user of my API scale well?

Clearly no.

Each thread has its own stack, which uses at least 64 KB and 1 MB by default; see:

My application has a lot of threads and is running out of memory, why?

So, if the number of users increases you will run out of memory. This is not scalable.

Furthermore, each thread needs to wake up each time that a refresh is performed. That entails 2 context switches and associated overheads.
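
Both points rest on the fact that each java.util.Timer owns a thread of its own, started in the constructor. A minimal sketch for checking this on your own JVM follows; the class name TimerThreadCount and the "user-timer-" name prefix are made up for this illustration and are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;

class TimerThreadCount {
    // Hypothetical prefix, used only so the timer threads can be recognised by name.
    static final String PREFIX = "user-timer-";

    // Counts the live threads whose name starts with PREFIX.
    static long liveTimerThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(PREFIX))
                .count();
    }

    // Creates one daemon Timer per simulated user, counts the resulting
    // threads, then cancels the timers so their threads can terminate.
    static long countAfterCreating(int users) {
        List<Timer> timers = new ArrayList<>();
        for (int i = 0; i < users; i++) {
            timers.add(new Timer(PREFIX + i, true));
        }
        long count = liveTimerThreads();
        for (Timer timer : timers) {
            timer.cancel();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Timer threads for 100 users: " + countAfterCreating(100));
    }
}
```

The count should grow one-for-one with the number of Timer objects, which is exactly the growth the question is worried about.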

Suggestion:

Create a UserToken class that represents each user token, and includes a timestamp for when the token was last checked.
Create a PriorityQueue<UserToken> that is ordered on the tokens' timestamps.
Use a single TimerTask to remove from the priority queue the UserToken objects that are due for checking.
When the check succeeds (i.e. the user is still active), update the timestamp and re-add the UserToken to the queue. When it fails, discard the UserToken along with the user's session data.
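
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in replacement: the names TokenRefresher, sweep, register and markActive are invented here, the "active" flag is assumed to be reset after each check, and the real call to authentication.refreshToken(...) is left as a comment. The current time is passed in as a parameter so the logic can be exercised without waiting 50 minutes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-user record; field names are assumptions.
class UserToken {
    final String sessionId;
    volatile boolean active;   // set by request handlers when the user does something
    long nextCheckMillis;      // only changed while the token is outside the queue

    UserToken(String sessionId, long nextCheckMillis) {
        this.sessionId = sessionId;
        this.nextCheckMillis = nextCheckMillis;
    }
}

class TokenRefresher {
    static final long CHECK_INTERVAL_MILLIS = 50 * 60 * 1000L;

    // Tokens ordered by due time, earliest first.
    private final PriorityQueue<UserToken> queue =
            new PriorityQueue<>(Comparator.comparingLong((UserToken t) -> t.nextCheckMillis));
    private final Map<String, UserToken> sessions = new ConcurrentHashMap<>();
    private Timer timer;

    synchronized void register(String sessionId, long now) {
        UserToken token = new UserToken(sessionId, now + CHECK_INTERVAL_MILLIS);
        sessions.put(sessionId, token);
        queue.add(token);
    }

    void markActive(String sessionId) {
        UserToken token = sessions.get(sessionId);
        if (token != null) {
            token.active = true;
        }
    }

    int size() {
        return sessions.size();
    }

    // Checks every token that is due at 'now'; returns the evicted session ids.
    synchronized List<String> sweep(long now) {
        List<String> evicted = new ArrayList<>();
        while (!queue.isEmpty() && queue.peek().nextCheckMillis <= now) {
            UserToken token = queue.poll();
            if (token.active) {
                // Assumption: the real code would call authentication.refreshToken(...) here.
                token.active = false;
                token.nextCheckMillis = now + CHECK_INTERVAL_MILLIS;
                queue.add(token);
            } else {
                sessions.remove(token.sessionId);
                evicted.add(token.sessionId);
            }
        }
        return evicted;
    }

    // One Timer, and therefore one thread, for all users.
    synchronized void start(long periodMillis) {
        timer = new Timer("token-refresher", true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                sweep(System.currentTimeMillis());
            }
        }, periodMillis, periodMillis);
    }

    public static void main(String[] args) {
        TokenRefresher refresher = new TokenRefresher();
        refresher.register("alice", 0L);
        refresher.register("bob", 0L);
        refresher.markActive("alice");
        System.out.println("Evicted: " + refresher.sweep(CHECK_INTERVAL_MILLIS));
    }
}
```

Note that the timestamp is only modified between poll() and add(); changing the ordering key of an element that is still inside a PriorityQueue would leave the heap in an inconsistent state. If the refresh call is slow (it is a network request), it may be worth handing it off to a small worker pool so that one slow refresh does not delay the rest of the sweep.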

This approach scales much better. Assume that N is the number of authenticated users:

There is just one thread rather than N threads and TimerTask objects.
The thread needs to wake up once every M minutes, rather than N threads all waking up once every M2 minutes.
It needs less than 500 bytes¹ per active user rather than 64K (minimum).
Priority queue insertion/re-insertion is cheap, and scales as O(logN).

¹ The space consists of the UserToken object and its subsidiary objects, and the internal "node" in the priority queue. 100 to 200 bytes is a better estimate, though this will be implementation specific.