Convert special characters to HTML in JavaScript

Does anyone know how to convert special characters to HTML in JavaScript?

Example:

  '&' (ampersand) becomes '&amp;'
  '"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
  ''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
  '<' (less than) becomes '&lt;'
  '>' (greater than) becomes '&gt;'

You need a function that does something like

 return mystring.replace(/&/g, "&amp;").replace(/>/g, "&gt;").replace(/</g, "&lt;").replace(/"/g, "&quot;"); 

but adjusted for the way you want single and double quotes to be handled.
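A minimal sketch of that idea, assuming a hypothetical `escapeHtml(text, quoteStyle)` helper. The helper name and the string flags `"ENT_COMPAT"`, `"ENT_QUOTES"` and `"ENT_NOQUOTES"` are illustrative only; they loosely mirror the PHP flags quoted in the question and do not come from any JavaScript library:

```javascript
// Hypothetical helper: quoteStyle is "ENT_COMPAT" (default, double quotes only),
// "ENT_QUOTES" (double and single quotes) or "ENT_NOQUOTES" (no quotes).
function escapeHtml(text, quoteStyle) {
  var style = quoteStyle || "ENT_COMPAT";
  // The ampersand goes first so that the entities added below are not re-escaped.
  var out = String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  if (style !== "ENT_NOQUOTES") {
    out = out.replace(/"/g, "&quot;");
  }
  if (style === "ENT_QUOTES") {
    out = out.replace(/'/g, "&#039;");
  }
  return out;
}

console.log(escapeHtml("<a href=\"x\">Tom & 'Jerry'</a>", "ENT_QUOTES"));
```

Passing the quote style as an argument keeps the ampersand and angle-bracket handling in one place, so only the quote behaviour varies between calls.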

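I think the best approach is to use the browser's built-in HTML escaping to handle many of the cases. To do this, simply create an element in the DOM tree and set the innerText of the element to your string. Then retrieve the innerHTML of the element. The browser will return an HTML-encoded string.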

 function HtmlEncode(s) { var el = document.createElement("div"); el.innerText = el.textContent = s; s = el.innerHTML; return s; } 

Test run:

 alert(HtmlEncode('&;\'><"')); 

Output:

 &amp;;'&gt;&lt;" 

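This method of escaping HTML is also used by the Prototype JS library, though differently from the simplistic sample I have given.

Note: you will still need to escape quotes (double and single) yourself. You can use any of the methods outlined by others here.

This generic function encodes every non-alphabetic character to its HTML code (numeric):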

 function HTMLEncode(str){ var i = str.length, aRet = []; while (i--) { var iC = str[i].charCodeAt(); if (iC < 65 || iC > 127 || (iC>90 && iC<97)) { aRet[i] = '&#'+iC+';'; } else { aRet[i] = str[i]; } } return aRet.join(''); } 

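From Mozilla …

Note that charCodeAt will always return a value that is less than 65,536. This is because the higher code points are represented by a pair of (lower valued) "surrogate" pseudo-characters which are used to comprise the real character. Because of this, in order to examine or reproduce the full character for individual characters of value 65,536 and above, it is necessary to retrieve not only charCodeAt(i) but also charCodeAt(i + 1) (as if examining or reproducing a string with two letters).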
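To make the surrogate-pair point concrete, here is a small sketch. The helper name `codePointFromPair` is made up for illustration; the arithmetic is the standard UTF-16 decoding formula:

```javascript
var plane = "\u2708";            // U+2708 fits in a single UTF-16 code unit
var tetragram = "\uD834\uDF06";  // U+1D306 is stored as a surrogate pair

console.log(plane.length);       // one code unit
console.log(tetragram.length);   // two code units for one visible character

// Hypothetical helper: combine a high and a low surrogate into one code point.
function codePointFromPair(high, low) {
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

var combined = codePointFromPair(tetragram.charCodeAt(0), tetragram.charCodeAt(1));
console.log(combined.toString(16)); // hexadecimal form of the full code point
```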

The best solution

 /**
  * (c) 2012 Steven Levithan <http://slevithan.com/>
  * MIT license
  */
 if (!String.prototype.codePointAt) {
   String.prototype.codePointAt = function (pos) {
     pos = isNaN(pos) ? 0 : pos;
     var str = String(this),
         code = str.charCodeAt(pos),
         next = str.charCodeAt(pos + 1);
     // If a surrogate pair
     if (0xD800 <= code && code <= 0xDBFF && 0xDC00 <= next && next <= 0xDFFF) {
       return ((code - 0xD800) * 0x400) + (next - 0xDC00) + 0x10000;
     }
     return code;
   };
 }

 /**
  * Encodes special html characters
  * @param string
  * @return {*}
  */
 function html_encode(string) {
   var ret_val = '';
   for (var i = 0; i < string.length; i++) {
     var code = string.codePointAt(i);
     if (code > 127) {
       ret_val += '&#' + code + ';';
       // A code point above 0xFFFF spans two code units; skip the low surrogate
       if (code > 0xFFFF) { i++; }
     } else {
       ret_val += string.charAt(i);
     }
   }
   return ret_val;
 }

Usage example:

 html_encode("✈"); 
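In an ES2015 or later environment, where codePointAt is native and strings iterate by code point, the same idea can be sketched without the polyfill. The function name `encodeNonAscii` is an assumption for illustration, not part of the answer above:

```javascript
// Sketch: encode every code point above 127 as a decimal character reference.
// for...of walks the string by code point, so a surrogate pair arrives as one item.
function encodeNonAscii(text) {
  var out = "";
  for (var ch of String(text)) {
    var cp = ch.codePointAt(0);
    out += cp > 127 ? "&#" + cp + ";" : ch;
  }
  return out;
}

console.log(encodeNonAscii("\u2708"));          // airplane symbol, U+2708
console.log(encodeNonAscii("a\uD834\uDF06b"));  // astral symbol between ASCII letters
```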

Create a function that uses string replace:

 function convert(str) { str = str.replace(/&/g, "&amp;"); str = str.replace(/>/g, "&gt;"); str = str.replace(/</g, "&lt;"); str = str.replace(/"/g, "&quot;"); str = str.replace(/'/g, "&#039;"); return str; } 
 function ConvChar(str) {
   var c = {'<':'&lt;', '>':'&gt;', '&':'&amp;', '"':'&quot;', "'":'&#039;',
            '#':'&#035;'};
   return str.replace(/[<&>'"#]/g, function(s) { return c[s]; });
 }

 alert(ConvChar('<-"-&-"->-<-\'-#-\'->'));

Result:

 &lt;-&quot;-&amp;-&quot;-&gt;-&lt;-&#039;-&#035;-&#039;-&gt;

In a textarea tag:

 <-"-&-"->-<-'-#-'->

If you only need to convert the characters in a long piece of code…

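In a PRE tag, and in most other HTML tags, the plain text of a batch file that uses the output redirection characters (< and >) will break the HTML. But here is my tip: anything goes inside a TEXTAREA element. It will not break the HTML, mainly because we are inside a control that is instanced and handled by the OS, so its content is not parsed by the HTML engine.

As an example, say I want to highlight the syntax of my batch file using JavaScript. I simply paste the code into a textarea without worrying about the HTML reserved characters, and have the script process the innerHTML property of the textarea, which evaluates to the text with the HTML reserved characters replaced by their corresponding ISO-8859-1 entities.

Browsers escape special characters automatically when you retrieve the innerHTML (and outerHTML) property of an element. Using a textarea (and who knows, maybe an input of type text) just saves you from doing the conversion, manually or through code.

I use this trick to test my syntax highlighter, and when I'm done authoring and testing, I simply hide the textarea from view.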

 function char_convert() {
   var chars = ["©","Û","®","ž","Ü","Ÿ","Ý","$","Þ","%","¡","ß","¢","à","£","á","À","¤","â","Á","¥","ã","Â","¦","ä","Ã","§","å","Ä","¨","æ","Å","©","ç","Æ","ª","è","Ç","«","é","È","¬","ê","É","\u00AD","ë","Ê","®","ì","Ë","¯","í","Ì","°","î","Í","±","ï","Î","²","ð","Ï","³","ñ","Ð","´","ò","Ñ","µ","ó","Õ","¶","ô","Ö","·","õ","Ø","¸","ö","Ù","¹","÷","Ú","º","ø","Û","»","ù","Ü","@","¼","ú","Ý","½","û","Þ","€","¾","ü","ß","¿","ý","à","‚","À","þ","á","ƒ","Á","ÿ","å","„","Â","æ","…","Ã","ç","†","Ä","è","‡","Å","é","ˆ","Æ","ê","‰","Ç","ë","Š","È","ì","‹","É","í","Œ","Ê","î","Ë","ï","Ž","Ì","ð","Í","ñ","Î","ò","‘","Ï","ó","’","Ð","ô","“","Ñ","õ","”","Ò","ö","•","Ó","ø","–","Ô","ù","—","Õ","ú","˜","Ö","û","™","×","ý","š","Ø","þ","›","Ù","ÿ","œ","Ú"];
   var codes = ["&copy;","&#219;","&reg;","&#158;","&#220;","&#159;","&#221;","&#36;","&#222;","&#37;","&#161;","&#223;","&#162;","&#224;","&#163;","&#225;","&Agrave;","&#164;","&#226;","&Aacute;","&#165;","&#227;","&Acirc;","&#166;","&#228;","&Atilde;","&#167;","&#229;","&Auml;","&#168;","&#230;","&Aring;","&#169;","&#231;","&AElig;","&#170;","&#232;","&Ccedil;","&#171;","&#233;","&Egrave;","&#172;","&#234;","&Eacute;","&#173;","&#235;","&Ecirc;","&#174;","&#236;","&Euml;","&#175;","&#237;","&Igrave;","&#176;","&#238;","&Iacute;","&#177;","&#239;","&Icirc;","&#178;","&#240;","&Iuml;","&#179;","&#241;","&ETH;","&#180;","&#242;","&Ntilde;","&#181;","&#243;","&Otilde;","&#182;","&#244;","&Ouml;","&#183;","&#245;","&Oslash;","&#184;","&#246;","&Ugrave;","&#185;","&#247;","&Uacute;","&#186;","&#248;","&Ucirc;","&#187;","&#249;","&Uuml;","&#64;","&#188;","&#250;","&Yacute;","&#189;","&#251;","&THORN;","&#128;","&#190;","&#252;","&szlig;","&#191;","&#253;","&agrave;","&#130;","&#192;","&#254;","&aacute;","&#131;","&#193;","&#255;","&aring;","&#132;","&#194;","&aelig;","&#133;","&#195;","&ccedil;","&#134;","&#196;","&egrave;","&#135;","&#197;","&eacute;","&#136;","&#198;","&ecirc;","&#137;","&#199;","&euml;","&#138;","&#200;","&igrave;","&#139;","&#201;","&iacute;","&#140;","&#202;","&icirc;","&#203;","&iuml;","&#142;","&#204;","&eth;","&#205;","&ntilde;","&#206;","&ograve;","&#145;","&#207;","&oacute;","&#146;","&#208;","&ocirc;","&#147;","&#209;","&otilde;","&#148;","&#210;","&ouml;","&#149;","&#211;","&oslash;","&#150;","&#212;","&ugrave;","&#151;","&#213;","&uacute;","&#152;","&#214;","&ucirc;","&#153;","&#215;","&yacute;","&#154;","&#216;","&thorn;","&#155;","&#217;","&yuml;","&#156;","&#218;"];
   for (var x = 0; x < chars.length; x++) {
     for (var i = 0; i < arguments.length; i++) {
       // split/join replaces every occurrence, not just the first one
       arguments[i].value = arguments[i].value.split(chars[x]).join(codes[x]);
     }
   }
 }
 char_convert(this);

As dragon mentioned, the cleanest way to do it is with jQuery:

 function HtmlEncode(s) { return $('<div>').text(s).html(); } function HtmlDecode(s) { return $('<div>').html(s).text(); } 
 var swapCodes = new Array(8211, 8212, 8216, 8217, 8220, 8221, 8226, 8230, 8482, 169, 61558, 8226, 61607); var swapStrings = new Array("--", "--", "'", "'", '"', '"', "*", "...", "&trade;", "&copy;", "&bull;", "&bull;", "&bull;"); var TextCheck = { doCWBind:function(div){ $(div).bind({ bind:function(){ TextCheck.cleanWord(div); }, focus:function(){ TextCheck.cleanWord(div); }, paste:function(){ TextCheck.cleanWord(div); } }); }, cleanWord:function(div){ var output = $(div).val(); for (i = 0; i < swapCodes.length; i++) { var swapper = new RegExp("\\u" + swapCodes[i].toString(16), "g"); output = output.replace(swapper, swapStrings[i]); } $(div).val(output); } } 

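Another one that we use now, and that works. In the version above I call a script instead, and it returns the converted code. It is only good for small textareas (meaning not a full article, blog post, etc.).

For the above: it works on most characters.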

 var swapCodes = new Array(8211, 8212, 8216, 8217, 8220, 8221, 8226, 8230, 8482, 61558, 8226, 61607,161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 338, 339, 352, 353, 376, 402); var swapStrings = new Array("--", "--", "'", "'", '"', '"', "*", "...", "&trade;", "&bull;", "&bull;", "&bull;", "&iexcl;", "&cent;", "&pound;", "&curren;", "&yen;", "&brvbar;", "&sect;", "&uml;", "&copy;", "&ordf;", "&laquo;", "&not;", "&shy;", "&reg;", "&macr;", "&deg;", "&plusmn;", "&sup2;", "&sup3;", "&acute;", "&micro;", "&para;", "&middot;", "&cedil;", "&sup1;", "&ordm;", "&raquo;", "&frac14;", "&frac12;", "&frac34;", "&iquest;", "&Agrave;", "&Aacute;", "&Acirc;", "&Atilde;", "&Auml;", "&Aring;", "&AElig;", "&Ccedil;", "&Egrave;", "&Eacute;", "&Ecirc;", "&Euml;", "&Igrave;", "&Iacute;", "&Icirc;", "&Iuml;", "&ETH;", "&Ntilde;", "&Ograve;", "&Oacute;", "&Ocirc;", "&Otilde;", "&Ouml;", "&times;", "&Oslash;", "&Ugrave;", "&Uacute;", "&Ucirc;", "&Uuml;", "&Yacute;", "&THORN;", "&szlig;", "&agrave;", "&aacute;", "&acirc;", "&atilde;", "&auml;", "&aring;", "&aelig;", "&ccedil;", "&egrave;", "&eacute;", "&ecirc;", "&euml;", "&igrave;", "&iacute;", "&icirc;", "&iuml;", "&eth;", "&ntilde;", "&ograve;", "&oacute;", "&ocirc;", "&otilde;", "&ouml;", "&divide;", "&oslash;", "&ugrave;", "&uacute;", "&ucirc;", "&uuml;", "&yacute;", "&thorn;", "&yuml;", "&#338;", "&#339;", "&#352;", "&#353;", "&#376;", "&#402;"); 
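The lookup-table idea behind swapCodes and swapStrings can also be sketched as a plain function with no jQuery or DOM dependency. This is only an illustration under assumptions: the `replacements` object holds a small subset of the table above, and `cleanWordText` is a made-up name:

```javascript
// Subset of the table above, keyed by decimal character code (illustrative only).
var replacements = {
  8211: "--", 8212: "--",        // en dash, em dash
  8216: "'", 8217: "'",          // curly single quotes
  8220: '"', 8221: '"',          // curly double quotes
  8226: "*", 8230: "...",        // bullet, ellipsis
  8482: "&trade;", 169: "&copy;"
};

// Replace each non-ASCII character that has an entry; leave everything else alone.
function cleanWordText(text) {
  return String(text).replace(/[^\x00-\x7F]/g, function (ch) {
    var hit = replacements[ch.charCodeAt(0)];
    return hit === undefined ? ch : hit;
  });
}

console.log(cleanWordText("\u201CHi\u201D \u2013 it\u2019s\u2026"));
```

A single pass with a callback avoids building one regular expression per table entry, which is what the loop over swapCodes does.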

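I created a JavaScript file that has a lot of functionality, including the above: JSChars.html

All the files needed are included. I added jQuery 1.4.4 simply because I saw issues in other versions and have yet to try them out.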

 Requires: jQuery & jQuery Impromptu from: http://trentrichardson.com/Impromptu/index.php

 1. Word Count
 2. Character Conversion
 3. Checks to ensure this is not passed: "notsomeverylongstringmissingspaces"
 4. Checks to make sure ALL IS NOT ALL UPPERCASE.
 5. Strip HTML

 // Word Counter
 $.getScript('js/characters.js', function(){
   $('#adtxt').bind("keyup click blur focus change paste", function(event){
     TextCheck.wordCount(30, "#adtxt", "#adtxt_count", event);
   });
   $('#adtxt').blur(function(event){
     TextCheck.check_length('#adtxt'); // ensures properly spaced text, not one long word
     TextCheck.doCWBind('#adtxt'); // char conversion
   });
   TextCheck.wordCount(30, "#adtxt", "#adtxt_count", false);
 });

 // HTML
 <textarea name="adtxt" id="adtxt" rows="10" cols="70" class="wordCount"></textarea>
 <div id="adtxt_count" class="clear"></div>

 // Just Character Conversions:
 TextCheck.doCWBind('#myfield');

 // Run through form fields in a form for case checking.
 // Alerts user when field is blur'd.
 var labels = new Array("Brief Description","Website URL","Contact Name","Website","Email","Linkback URL");
 var checking = new Array("descr","title","fname","website","email","linkback");
 TextCheck.check_it(checking, labels);

 // Extra security to check again, make sure form is not submitted
 var pass = TextCheck.validate(checking, labels);
 if (pass) {
   // do form actions
 }

 // Strip HTML
 <textarea name="adtxt" id="adtxt" rows="10" cols="70" onblur="TextCheck.stripHTML(this);"></textarea>

Workaround:

 var temp = $("<div>").text("<"); var afterEscape = temp.html(); // afterEscape == "&lt;"

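 <!doctype html>
 <html lang="en">
 <head>
   <meta charset="utf-8">
   <title>html</title>
   <script>
     // Wait for the DOM without relying on jQuery, which this page does not load
     document.addEventListener('DOMContentLoaded', function() {
       document.getElementById('test').innerHTML = "&amp;";
     });
   </script>
 </head>
 <body>
   <div id="test"></div>
 </body>
 </html>

You can simply use the code above to convert special characters to HTML.

Here are a couple of methods I use that do not need jQuery:

You can encode every character in your string: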

 function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})} 

Or just target the main characters that matter for safe encoding (&, line breaks, <, >, " and '):

 function encode(r){ return r.replace(/[\x26\x0A\<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"}) } test.value=encode('How to encode\nonly html tags &<>\'" nice & fast!'); /************* * \x26 is &ampersand (it has to be first), * \x0A is newline, *************/ 
 <textarea id=test rows="9" cols="55">www.WHAK.com</textarea> 

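If you need support for all standardized named character references, Unicode and ambiguous ampersands, the he library is the only 100% reliable solution I'm aware of!

Example use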

 he.encode('foo © bar ≠ baz 𝌆 qux');
 // Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

 he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
 // Output : 'foo © bar ≠ baz 𝌆 qux'
 function escape (text) { return text.replace(/[<>\&\"\']/g, function(c) { return '&#' + c.charCodeAt(0) + ';'; }); } alert(escape("<>&'\"")); 

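This doesn't answer your question directly, but if you are using innerHTML to write text within an element and you ran into encoding issues, just use textContent, i.e.: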

 var s = "Foo 'bar' baz <qux>"; var element = document.getElementById('foo'); element.textContent = s; // <div id="foo">Foo 'bar' baz <qux></div> 

Here's a good library I've found very useful in this context.

https://github.com/mathiasbynens/he

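According to its author:

It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would, has an extensive test suite, and, contrary to many other JavaScript solutions, he handles astral Unicode symbols just fine.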

 <html> <body> <script type="text/javascript"> var str= "&\"'<>"; alert('B4 Change:\n' + str); str= str.replace(/\&/g,'&amp;'); str= str.replace(/</g,'&lt;'); str= str.replace(/>/g,'&gt;'); str= str.replace(/\"/g,'&quot;'); str= str.replace(/\'/g,'&#039;'); alert('After change:\n' + str); </script> </body> </html> 

Use this to test: http://www.w3schools.com/js/tryit.asp?filename=tryjs_text

Yes, but if you need to insert the resulting string somewhere without it being converted back, you need to do:

 str.replace(/'/g,"&amp;#39;"); // and so on
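To illustrate why the ampersand ends up escaped as well: if the string goes through one more round of HTML decoding before it is displayed, it has to be escaped once more to survive. A small sketch, with `escapeOnce` as a hypothetical helper name:

```javascript
// Hypothetical helper: one round of entity escaping for the five usual characters.
function escapeOnce(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

var once = escapeOnce("it's");   // the apostrophe becomes a numeric reference
var twice = escapeOnce(once);    // the ampersand of that reference is escaped too

console.log(once);
console.log(twice);
```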

Check out the JavaScript htmlentities function: http://phpjs.org/functions/htmlentities:425

 public static string HtmlEncode (string text) { string result; using (StringWriter sw = new StringWriter()) { var x = new HtmlTextWriter(sw); x.WriteEncodedText(text); result = sw.ToString(); } return result; } 

Use the JavaScript function escape(), which lets you encode strings.

For example,

 escape("yourString");
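For comparison, a quick sketch of what escape() actually returns next to entity-style output. escape() is a legacy global that percent-encodes characters rather than producing HTML entities, so whether it fits depends on where the string ends up. `toEntities` is a hypothetical helper name used only for this comparison:

```javascript
var input = "<b>Tom & Jerry</b>";

// Legacy percent-encoding: "<" becomes "%3C", "&" becomes "%26", and so on.
var percentEncoded = escape(input);

// Entity-style escaping of the five usual characters, for comparison.
function toEntities(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return "&#" + ch.charCodeAt(0) + ";";
  });
}

console.log(percentEncoded);
console.log(toEntities(input));
```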