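With ordinary strings, substring is all you need to extract part of a string (multilingual text still has to be considered, of course).
With an HTML string, however, the same approach breaks the HTML tags and scrambles the page. After studying HtmlParser, I wrote a class that can truncate an HTML string.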
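To make the problem concrete, here is a small sketch of what a plain index-based cut does to markup. The class name `NaiveHtmlCut` and the sample string are made up for illustration and are not part of the class in this post.

```java
// Hypothetical demo class, only meant to show the failure mode.
final class NaiveHtmlCut {

    // Cuts by raw character index, exactly like String.substring, ignoring markup.
    static String cut(String html, int maxLength) {
        return html.length() <= maxLength ? html : html.substring(0, maxLength);
    }

    public static void main(String[] args) {
        String html = "<b>Hello world</b>";
        // The three characters of "<b>" count toward the limit and "</b>" is lost.
        System.out.println(cut(html, 8)); // prints "<b>Hello"
        // A cut can also land inside a tag and leave a fragment of it behind.
        System.out.println(cut(html, 2)); // prints "<b"
    }
}
```

Both results leave the page with unbalanced or broken markup, which is why the truncation has to count text characters only and close whatever tags are still open.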
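You can improve it to suit your own situation; if you need to handle multiple languages, you will also have to change how the string length is calculated. In short, use it flexibly and adapt it as needed rather than copying it verbatim.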
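One possible reading of "changing the length calculation" is to measure display width instead of `char` count. The sketch below assumes that Han, Hiragana, Katakana and Hangul characters take two columns and everything else takes one. The class `TextWidth` and that width rule are assumptions for illustration, not something taken from the class in this post, and full-width punctuation is not handled.

```java
// Hypothetical helper: measures and cuts text by an assumed display width.
final class TextWidth {

    // Assumed rule: CJK scripts are two columns wide, everything else one.
    static int widthOf(int codePoint) {
        switch (Character.UnicodeScript.of(codePoint)) {
            case HAN:
            case HIRAGANA:
            case KATAKANA:
            case HANGUL:
                return 2;
            default:
                return 1;
        }
    }

    // Sums the width of every code point, so surrogate pairs count once.
    static int displayWidth(String text) {
        int width = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            width += widthOf(cp);
            i += Character.charCount(cp);
        }
        return width;
    }

    // Returns the longest prefix whose width does not exceed maxWidth;
    // the cut never falls in the middle of a surrogate pair.
    static String cutByWidth(String text, int maxWidth) {
        int width = 0;
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int w = widthOf(cp);
            if (width + w > maxWidth) {
                break;
            }
            width += w;
            i += Character.charCount(cp);
        }
        return text.substring(0, i);
    }

    public static void main(String[] args) {
        String mixed = "\u4e2d\u6587ab"; // two Han characters followed by "ab"
        System.out.println(displayWidth(mixed)); // prints 6
        System.out.println(cutByWidth(mixed, 5).length()); // prints 3
    }
}
```

A method like `displayWidth` could stand in for the `length()` calls when deciding where to stop, as long as the cut position is computed with the same rule.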
The class is posted below; it uses the open-source project Html Parser.
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.htmlparser.*;
    import org.htmlparser.tags.CompositeTag;
    import org.htmlparser.util.NodeIterator;
    import org.htmlparser.util.NodeList;

    /**
     * Functions for HTML.
     *
     * @author Scud http://www.javascud.org Date: Nov 3, 2006 10:22:20 AM
     */
    public class HtmlSubstring {

        private static Log log = LogFactory.getLog(HtmlSubstring.class);

        /**
         * Get the parser used for substring.
         *
         * @return Parser
         */
        public static Parser getMyParser() {
            Parser parser = new Parser();
            PrototypicalNodeFactory factory = new PrototypicalNodeFactory();
            // register tags that HtmlParser does not provide
            factory.registerTag(new StrongTag());
            factory.registerTag(new BoldTag());
            factory.registerTag(new ItalicTag());
            factory.registerTag(new UnderlineTag());
            factory.registerTag(new CenterTag());
            factory.registerTag(new FontTag());
            parser.setNodeFactory(factory);
            return parser;
        }

        /**
         * Substring for Html String.
         *
         * @param htmlString Html string
         * @param maxlength  maximum number of text characters to keep
         * @return truncated Html string (empty or partial if parsing fails)
         */
        public static String substring(String htmlString, int maxlength) {
            StringBuffer htmlOut = new StringBuffer();
            StringBuffer stringOut = new StringBuffer();
            try {
                Parser parser = getMyParser();
                parser.setInputHTML(htmlString);
                NodeIterator nit = parser.elements();
                boolean breaked = false;
                while (nit.hasMoreNodes()) {
                    Node node = nit.nextNode();
                    if (node instanceof Text) {
                        breaked = dealText(node, stringOut, htmlOut, maxlength);
                    } else if (node instanceof Tag) {
                        Tag tag = (Tag) node;
                        breaked = dealTag(tag, stringOut, htmlOut, maxlength);
                    } else if (node instanceof Remark) {
                        // nothing to do
                    }
                    if (breaked) {
                        break;
                    }
                }
            } catch (Exception e) {
                log.error("Error occurred when parsing Html String", e);
            }
            return htmlOut.toString();
        }

        // Appends the text node; returns true once maxlength has been reached.
        private static boolean dealText(Node node, StringBuffer stringOut, StringBuffer htmlOut, int maxlength) {
            String currentText = node.getText();
            int previousLength = stringOut.length();
            if (previousLength + currentText.length() >= maxlength) {
                String cutString = currentText.substring(0, maxlength - previousLength);
                stringOut.append(cutString);
                htmlOut.append(cutString);
                log.debug(cutString);
                return true;
            }
            stringOut.append(currentText);
            htmlOut.append(currentText);
            log.debug(currentText);
            return false;
        }

        // Writes the start tag, recurses into the children, then always writes
        // the end tag so the output stays balanced even after truncation.
        private static boolean dealTag(Tag aTag, StringBuffer stringOut, StringBuffer htmlOut, int maxlength) throws Exception {
            NodeList list = aTag.getChildren();
            log.debug(getStartTagString(aTag));
            htmlOut.append(getStartTagString(aTag));
            boolean breaked = false;
            if (list != null) {
                NodeIterator it = list.elements();
                while (it.hasMoreNodes()) {
                    Node node = it.nextNode();
                    if (node instanceof Text) {
                        breaked = dealText(node, stringOut, htmlOut, maxlength);
                    } else if (node instanceof Tag) {
                        Tag tag = (Tag) node;
                        breaked = dealTag(tag, stringOut, htmlOut, maxlength);
                    } else if (node instanceof Remark) {
                        // nothing to do
                    }
                    if (breaked) {
                        break;
                    }
                }
            }
            Tag endTag = aTag.getEndTag();
            if (endTag != null) {
                htmlOut.append(endTag.toHtml());
                log.debug(endTag.toHtml());
            }
            return breaked;
        }

        // Rebuilds the start tag from its attributes (the tag name is the first one).
        private static String getStartTagString(Tag aTag) {
            StringBuffer start = new StringBuffer("<");
            for (Object o : aTag.getAttributesEx()) {
                Attribute ab = (Attribute) o;
                start.append(ab.toString());
            }
            start.append(">");
            return start.toString();
        }
    }

    class StrongTag extends CompositeTag {
        private static final String[] mIds = new String[]{"STRONG"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }

    class BoldTag extends CompositeTag {
        private static final String[] mIds = new String[]{"B"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }

    class ItalicTag extends CompositeTag {
        private static final String[] mIds = new String[]{"I"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }

    class UnderlineTag extends CompositeTag {
        private static final String[] mIds = new String[]{"U"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }

    class CenterTag extends CompositeTag {
        private static final String[] mIds = new String[]{"CENTER"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }

    class FontTag extends CompositeTag {
        private static final String[] mIds = new String[]{"FONT"};
        public String[] getIds() { return mIds; }
        public String[] getEnders() { return mIds; }
        public String[] getEndTagEnders() { return new String[0]; }
    }
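For readers who want to try the idea without the Html Parser dependency, the same principle (count only text characters, then close whatever tags are still open) can be sketched with a hand-written scanner. This is not the class above and not Html Parser's API: `SimpleHtmlCut`, its `VOID_TAGS` list and the sample input are assumptions made for illustration. It expects reasonably well-formed markup, copies comments through instead of dropping them, and may split an entity such as `&amp;`.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical, dependency-free sketch of tag-aware truncation.
final class SimpleHtmlCut {

    // Elements assumed never to have an end tag, so they are not tracked.
    private static final Set<String> VOID_TAGS =
            new HashSet<>(Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    static String cut(String html, int maxLength) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // currently open tag names
        int textCount = 0;                       // text characters copied so far
        int i = 0;
        while (i < html.length() && textCount < maxLength) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i) : -1;
            if (close < 0) {
                // Plain text (or a '<' with no closing '>'): copy and count it.
                out.append(c);
                textCount++;
                i++;
                continue;
            }
            // Copy the whole tag without counting it toward the limit.
            String tag = html.substring(i, close + 1);
            out.append(tag);
            i = close + 1;
            if (tag.startsWith("<!") || tag.endsWith("/>")) {
                continue; // comment, doctype or self-closing tag
            }
            String name = tagName(tag);
            if (tag.startsWith("</")) {
                if (name.equals(open.peek())) {
                    open.pop();
                }
            } else if (!VOID_TAGS.contains(name)) {
                open.push(name);
            }
        }
        // Close the tags that were still open when the limit was reached.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    // Extracts the lower-cased element name from "<name ...>" or "</name>".
    private static String tagName(String tag) {
        int start = tag.startsWith("</") ? 2 : 1;
        int end = start;
        while (end < tag.length() && Character.isLetterOrDigit(tag.charAt(end))) {
            end++;
        }
        return tag.substring(start, end).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // prints "<p>Hello <b>wo</b></p>"
        System.out.println(cut("<p>Hello <b>world</b>!</p>", 8));
    }
}
```

A real parser remains the safer choice for arbitrary pages; the scanner only illustrates why the end tags have to be emitted after the cut.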
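It doesn't truncate...
Just passing by to have a look~
A beginner here, learning from you!
A beginner here, learning from you! QQ 185939643
Please give me an example.
Well done!
Wouldn't it be much simpler to use Java regular expressions to match and extract the part you want?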