Remove Microsoft Word HTML tags

If you work with bloggers, you have probably run into problems with people pasting text straight from Microsoft Word. This results in badly formatted source code, cluttered with Word-specific tags and attributes.
The following function strips some of these Word HTML tags and attributes and returns cleaner HTML that is better suited for use on the web.


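function cleanHTML($html) {
    // preg_replace is used here because ereg_replace was removed in PHP 7.
    // Strip <font>, <span>, <del> and <ins> tags (opening and closing), keeping their inner text.
    $html = preg_replace('#</?(font|span|del|ins)\b[^>]*>#i', '', $html);

    // Remove class, lang, style, size and face attributes, keeping the rest of the tag.
    // Each pass removes one attribute per tag, so repeat until nothing is left to remove.
    $pattern = '#<([^>]*?)\s+(class|lang|style|size|face)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)([^>]*)>#i';
    do {
        $html = preg_replace($pattern, '<$1$4>', $html, -1, $count);
    } while ($count > 0);

    return $html;
}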
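
A tag can carry several of these attributes at once, and a single regex pass only consumes one match per tag, which is why the attribute replacement is applied more than once. The snippet below is a minimal sketch of that behaviour, not part of the function itself: the sample markup, the `$pattern` and `$passes` variables are assumptions made up for illustration, and it uses `preg_replace` (the PCRE function available in current PHP versions) to count how many passes one paragraph needs.

```php
<?php
// Hypothetical Word-style markup, invented for this illustration.
$sample = '<p class="MsoNormal" style="margin:0cm" lang="EN-US">Hello</p>';

// Assumed pattern: matches ONE unwanted attribute inside a tag.
// $1 is the part of the tag before the attribute, $4 the part after it.
$pattern = '#<([^>]*?)\s+(class|lang|style|size|face)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)([^>]*)>#i';

$passes = 0;
do {
    // $count receives the number of replacements made in this pass.
    $sample = preg_replace($pattern, '<$1$4>', $sample, -1, $count);
    if ($count > 0) {
        $passes++;
    }
} while ($count > 0);

echo $sample, "\n";  // prints: <p>Hello</p>
echo $passes, "\n";  // prints: 3 (one pass per removed attribute)
```

With three unwanted attributes on the same tag, a fixed two passes would leave one behind, so looping until no replacement happens is the more robust choice.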
