I am running a simple web scraper written in Puppeteer / Node.js on a Raspberry Pi. It downloads data from a website at 6pm and 6am every day. Every so often, say once a week, I'd like to send a command or signal or something telling it to download the data immediately. This would presumably be a command from another terminal window. The question is how do I do this? I could write a small file and have the program constantly look out for the file, but this seems very crude. Is there a better way? Something simple please as I'm a bit of a novice!
Keeping in mind that I don't actually recommend doing this in most cases, as discussed in my previous answer, here's a contrived example to illustrate the readline approach.
const puppeteer = require("puppeteer"); // ^22.2.0
const readline = require("readline");
const {setTimeout} = require("node:timers/promises");

// Read commands line by line from stdin
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// Launch a browser, open the URL, and return the page title
const getPageTitle = async (
  url = "https://en.wikipedia.org/wiki/Special:Random"
) => {
  let browser;
  try {
    browser = await puppeteer.launch();
    const [page] = await browser.pages();
    await page.goto(url, {waitUntil: "domcontentloaded"});
    return await page.title();
  }
  finally {
    await browser?.close();
  }
};

// On-demand scrape: each line received on stdin is treated as a URL
rl.on("line", async line => {
  try {
    console.log(` Received command from stdin: \`${line}\``);
    console.log(" Command result:", await getPageTitle(line));
  }
  catch (err) {
    console.error(err);
  }
});

// Scheduled scrape: runs every 20 seconds, independently of stdin
(async () => {
  for (;;) {
    try {
      console.log(await getPageTitle());
    }
    catch (err) {
      console.error(err);
    }
    await setTimeout(20_000);
  }
})();

// execute with: `cat | node script.js` to enable the below behavior
console.log(
  `to send command: echo 'https://www.stackoverflow.com' > /proc/${process.pid}/fd/0`
);
console.log("_".repeat(80));
This prints a random Wikipedia page title every 20 seconds while also listening to stdin for additional URLs to print titles for. With the script running in a single terminal, type in `https://stackoverflow.com` to see it print `Stack Overflow - Where Developers Learn, Share, & Build Careers`, for example.
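To see the stdin-listening part in isolation, here is a minimal sketch without Puppeteer. The helper name `listenForCommands` is hypothetical (it is not part of the script above), and a `PassThrough` stream stands in for `process.stdin` so the example can run without a terminal.

```javascript
const readline = require("node:readline");
const {PassThrough} = require("node:stream");

// Hypothetical helper: calls `onCommand` once per non-empty line read from `input`
const listenForCommands = (input, onCommand) => {
  const rl = readline.createInterface({input});
  rl.on("line", line => {
    const command = line.trim();
    if (command) {
      onCommand(command);
    }
  });
  return rl;
};

// Demo: a PassThrough stream stands in for process.stdin (an assumption
// made purely so the example is self-contained)
const fakeStdin = new PassThrough();
const received = [];
const rl = listenForCommands(fakeStdin, command => received.push(command));

// Print everything that was received once the input stream ends
rl.on("close", () => console.log(received));

// Two URLs separated by a blank line; the blank line is skipped
fakeStdin.end("https://stackoverflow.com\n\nhttps://example.com\n");
```

In the real script you would pass `process.stdin` instead of the fake stream and call the scraper from the callback.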
Now, if you run it with `cat | node script.js` to provide a pipe as described here, you can send commands to it from another terminal with `echo 'https://www.stackoverflow.com' > /proc/{pid}/fd/0`, where `{pid}` is printed dynamically by the program at startup, or obtainable with `pgrep node`.