Web developer - Web analytics specialist

Remove Microsoft Word HTML tags

If you work with bloggers, you have probably run into problems caused by people pasting text straight from Microsoft Word. The result is badly formatted source code cluttered with Word-specific tags and attributes.
The following function strips the most common of these Word HTML tags and attributes and returns cleaner HTML that you can publish on the web.


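function cleanHTML($html) {
    // Remove Word's wrapper tags (<font>, <span>, <del>, <ins>), opening and closing, keeping their inner text.
    $html = preg_replace('#</?(font|span|del|ins)[^>]*>#', '', $html);

    // Each pass cuts a tag off at its last class/lang/style/size/face attribute,
    // so the pattern runs twice to handle tags carrying two of them.
    // Note: anything that follows the matched attribute inside the tag is dropped as well.
    $attr = '#<([^>]*)(class|lang|style|size|face)=("[^"]*"|\'[^\']*\'|[^>]+)([^>]*)>#';
    $html = preg_replace($attr, '<$1>', $html);
    $html = preg_replace($attr, '<$1>', $html);

    return $html;
}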
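
To see the idea in action, here is a minimal sketch of a variant together with a sample call. It is not the function above: the name stripWordMarkup and the sample input are made up for illustration, and it assumes you want to keep unrelated attributes such as href and to repeat the attribute pass until nothing is left to remove.

```php
<?php
// Hypothetical variant of cleanHTML(): removes only the listed attributes
// and leaves the rest of each tag (for example href) in place.
function stripWordMarkup(string $html): string
{
    // Remove <font>, <span>, <del> and <ins> tags, keeping their inner text.
    $html = preg_replace('#</?(?:font|span|del|ins)\b[^>]*>#i', '', $html);

    // One listed attribute with a double-quoted, single-quoted or bare value.
    $attr = '#<([^>]*?)\s+(?:class|lang|style|size|face)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)([^>]*)>#i';

    // Each pass removes one attribute per tag; repeat until nothing changes.
    do {
        $html = preg_replace($attr, '<$1$2>', $html, -1, $count);
    } while ($count > 0);

    return $html;
}

// Invented sample of Word-style markup.
$pasted = '<p class="MsoNormal" style="margin:0"><span lang="EN-US">Hello</span> <a href="/about" class="x">world</a></p>';
echo stripWordMarkup($pasted), "\n";
// prints: <p>Hello <a href="/about">world</a></p>
```

The loop replaces the "run it twice" trick, so tags with three or more Word attributes are handled too. Neither version is an HTML sanitizer; for untrusted input a real HTML parser or a dedicated purifier library is a safer choice.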
