PHP strip_tags problem
I found that PHP's strip_tags function does not cleanly remove all markup from the content I pass it. First, if an anchor tag contains a line break, it is not removed correctly. Second, style information is not properly removed: the <style> tags themselves are stripped, but the CSS between them stays in the output. The following function also uses a regex to remove everything between <script> tags, which may or may not be necessary for your content.
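function strip_all_tags($content)
{
    // Replace line breaks with spaces so that tags spanning several lines end up on one line.
    $content = preg_replace('/\r\n|\r|\n/', ' ', $content);
    // Remove <script> and <style> blocks together with their contents
    // (U = ungreedy, i = case-insensitive so <SCRIPT> and <STYLE> match too).
    $content = preg_replace('/<script.*<\/script>/Ui', ' ', $content);
    $content = preg_replace('/<style.*<\/style>/Ui', ' ', $content);
    // Strip the remaining tags; strtolower() also lowercases the text itself.
    $content = strip_tags(strtolower($content));
    return $content;
}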
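The function replaces every line break with a space so that strip_tags has no trouble with tags that span several lines. Since strip_tags leaves the contents of <style> blocks in place, the function removes those blocks with a regex before calling strip_tags. Note that strtolower also converts the remaining text to lowercase.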
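If you would rather keep the original line breaks and letter case, one possible variation is to let the regex itself match across lines. The sketch below is only an illustration: the name strip_all_tags_multiline and the sample markup are made up for this example, and it assumes the script and style blocks are well formed (each opening tag has a matching closing tag).

```php
<?php
// Hypothetical variant of strip_all_tags(): the "s" modifier lets "." match
// line breaks, so newlines and letter case in the input are left untouched.
function strip_all_tags_multiline(string $content): string
{
    // Remove <script> and <style> blocks together with their contents
    // (.*? = lazy match, i = case-insensitive, s = dot matches newlines).
    $content = preg_replace('/<script\b.*?<\/script>/is', ' ', $content);
    $content = preg_replace('/<style\b.*?<\/style>/is', ' ', $content);
    // Remove whatever tags are left.
    return strip_tags($content);
}

// Made-up sample: a style block and an anchor tag broken across two lines.
$html = "<style>\np { color: red; }\n</style><p>Hello <a\nhref=\"#\">World</a></p>";

// strip_tags() alone drops the tags but keeps the CSS text.
echo strip_tags($html), "\n";

// The variant drops the CSS as well.
echo trim(strip_all_tags_multiline($html)), "\n";
```

The trade-off is that the output keeps any line breaks that were in the text, so you may still want to collapse whitespace afterwards.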