I have a new feature on my site where users can submit any text (I have already blocked all HTML entries) via a textarea. The main problem I still have is that they can type a link such as "http://somewhere.com", which I want to prevent. I also want to blacklist specific words. This is what I had before:
if (strpos($entry, "http://" or ".com" or ".net" or "www." or ".org" or ".co.uk" or "https://") !== true) {
    die ('Entries cannot contain links!');
}
However, that didn't work: it stopped users from submitting any text at all. So my question is simple: how can I do it?
This is a job for Regular Expressions.
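A quick note on why the original check blocked everything: `"http://" or ".com" or ...` is evaluated first and collapses to a single boolean, so `strpos()` never sees the individual strings. On top of that, `strpos()` returns an integer offset or `false`, never `true`, so `!== true` always holds and `die()` always runs. If plain substring checks are all you want, a minimal sketch could look like the following. The helper name `containsAny` and the fragment list are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: true if $text contains any of $needles (case-insensitive).
function containsAny(string $text, array $needles): bool
{
    foreach ($needles as $needle) {
        // stripos() returns an int offset or false, so compare against false explicitly.
        if (stripos($text, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Assumed list of fragments, taken from the question's attempt.
$linkFragments = array('http://', 'https://', 'www.', '.com', '.net', '.org', '.co.uk');

var_dump(containsAny('Visit www.example.com today', $linkFragments)); // bool(true)
var_dump(containsAny('Just some plain text', $linkFragments));        // bool(false)
```

Substring checks like this are blunt (they would also reject a sentence that merely mentions ".net"), which is why a regex is usually the better tool here.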
What you need to do is something like this:
// A list of words you don't allow
$disallowedWords = array(
    'these',
    'words',
    'are',
    'not',
    'allowed'
);
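// Search for disallowed words.
// \b is a word boundary, so the regex matches 'are' but not 'care' or 'stare',
// and it also works at the start or end of the entry and next to punctuation.
foreach ($disallowedWords as $word) {
    // preg_quote() escapes any regex metacharacters in the word
    if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $entry)) {
        die("The word '$word' is not allowed...");
    }
}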
// This variable should contain a regex that will match URLs.
// There are thousands out there, take your pick. I have just
// used an arbitrary one I found with Google.
// PCRE patterns need delimiters; '!' is used here because it does not
// appear in the pattern itself. Note that this pattern only matches URLs
// that start with http://, https:// or ftp://.
$urlRegex = '!(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*!i';
// Search for URLs
if (preg_match($urlRegex, $entry)) {
    die("URLs are not allowed...");
}
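Since the question also wanted to catch links typed without a scheme (for example "www.somewhere.com" or "somewhere.co.uk"), here is a rough sketch that bundles both checks into one function and returns a message instead of calling `die()` directly. The function name `findEntryProblem` and the simplified link pattern are my own assumptions, not a vetted URL matcher; it is deliberately loose and only knows the handful of TLDs from the question. The nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns an error message, or null when the entry is acceptable.
function findEntryProblem(string $entry, array $disallowedWords): ?string
{
    foreach ($disallowedWords as $word) {
        // \b is a word boundary; preg_quote() escapes regex metacharacters in the word.
        if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $entry)) {
            return "The word '$word' is not allowed...";
        }
    }

    // Assumed, simplified pattern: either a scheme or leading "www." followed by
    // non-whitespace, or a bare host name ending in one of a few common TLDs.
    $linkRegex = '~\b(?:(?:https?|ftp)://|www\.)\S+|\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|net|org|co\.uk)\b~i';
    if (preg_match($linkRegex, $entry)) {
        return 'URLs are not allowed...';
    }

    return null;
}

// Example usage with made-up input.
$problem = findEntryProblem('Check out somewhere.com please', array('spam'));
if ($problem !== null) {
    echo $problem, "\n";
}
```

Returning the message rather than dying inside the check leaves it up to the calling code whether to abort, re-display the form, or log the attempt.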