Filtering HTML tags out of an NSString in Objective-C
Source: Internet  Editor: 程序博客网  Date: 2024/06/06 02:37
// Method 1: scan with NSScanner. This comes from the well-known post below, although the link no longer opens.
// Source: http://rudis.net/content/2009/01/21/flatten-html-content-ie-strip-tags-cocoaobjective-c
- (NSString *)removeHTML:(NSString *)html {
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    while ([theScanner isAtEnd] == NO) {
        NSString *text = nil;
        // find start of tag
        [theScanner scanUpToString:@"<" intoString:NULL];
        // find end of tag
        [theScanner scanUpToString:@">" intoString:&text];
        // replace the found tag with a space
        // (you can filter multi-spaces out later if you wish)
        if (text != nil) {
            html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@" "];
        }
    }
    return html;
}
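Because the scanner method replaces every tag with a space, its result usually contains runs of spaces. The following is a minimal sketch of the "filter multi-spaces out later" step mentioned in its comment, assuming it is acceptable to collapse every whitespace run (including newlines) into one space. `CollapseWhitespace` is a hypothetical helper name and is not part of the original post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collapses runs of spaces, tabs and newlines into
// single spaces and drops leading/trailing whitespace.
static NSString *CollapseWhitespace(NSString *text) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    for (NSString *part in [text componentsSeparatedByCharactersInSet:whitespace]) {
        // Adjacent separators produce empty strings; skip them.
        if (part.length > 0) {
            [words addObject:part];
        }
    }
    return [words componentsJoinedByString:@" "];
}
```

For example, `CollapseWhitespace(@" Hello   world ")` yields `@"Hello world"`.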
// Method 2: split the string with NSString's built-in componentsSeparatedByCharactersInSet:
- (NSString *)removeHTML2:(NSString *)html {
    NSArray *components = [html componentsSeparatedByCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"<>"]];
    NSMutableArray *componentsToKeep = [NSMutableArray array];
    // even-indexed components lie outside the tags; odd-indexed ones are the tag contents
    for (NSUInteger i = 0; i < [components count]; i = i + 2) {
        [componentsToKeep addObject:[components objectAtIndex:i]];
    }
    NSString *plainText = [componentsToKeep componentsJoinedByString:@""];
    return plainText;
}
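To see why the splitting method keeps every second component, it may help to trace one input. The sketch below repeats the same logic as a free function (`StripTagsBySplitting` is a hypothetical name used only for this illustration) and notes in comments how the components fall. It also shows the main limitation: the approach assumes `<` and `>` only ever appear as tag delimiters, so a stray `>` in the text shifts the even/odd pattern.

```objc
#import <Foundation/Foundation.h>

// Hypothetical free-function version of the splitting approach.
static NSString *StripTagsBySplitting(NSString *html) {
    NSCharacterSet *brackets = [NSCharacterSet characterSetWithCharactersInString:@"<>"];
    NSArray<NSString *> *components = [html componentsSeparatedByCharactersInSet:brackets];
    NSMutableArray<NSString *> *kept = [NSMutableArray array];
    // For "<p>Hello <b>world</b></p>" the components are:
    //   "", "p", "Hello ", "b", "world", "/b", "", "/p", ""
    // Even indices (0, 2, 4, ...) are the text between tags.
    for (NSUInteger i = 0; i < components.count; i += 2) {
        [kept addObject:components[i]];
    }
    return [kept componentsJoinedByString:@""];
}
```

With well-formed input, `StripTagsBySplitting(@"<p>Hello <b>world</b></p>")` gives `@"Hello world"`. With a stray bracket, `StripTagsBySplitting(@"a > b <i>c</i>")` gives `@"a i/i"`, which is why this method is only safe when the text itself contains no angle brackets.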
Reposted from: http://bbs.9ria.com/thread-244433-1-1.html
- (NSString *)flattenHTML:(NSString *)html trimWhiteSpace:(BOOL)trim
{
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    NSString *text = nil;
    while ([theScanner isAtEnd] == NO) {
        // find start of tag
        [theScanner scanUpToString:@"<" intoString:NULL];
        // find end of tag
        [theScanner scanUpToString:@">" intoString:&text];
        // replace the found tag with an empty string
        html = [html stringByReplacingOccurrencesOfString:
                    [NSString stringWithFormat:@"%@>", text]
                                               withString:@""];
    }
    return trim ? [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] : html;
}
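Both scanner versions search and replace over the whole string once per tag. A possible single-pass alternative is sketched below: it appends the text found between tags to a mutable string and never rewrites the input. This is an illustration and not the original author's code. `StripTagsSinglePass` is a hypothetical name, and the sketch sets `charactersToBeSkipped` to nil on the assumption that whitespace between tags should be preserved exactly.

```objc
#import <Foundation/Foundation.h>

// Hypothetical single-pass variant of the NSScanner approach.
static NSString *StripTagsSinglePass(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // By default NSScanner skips whitespace and newlines; disable that so
    // the spaces around tags survive.
    scanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString string];
    while (![scanner isAtEnd]) {
        NSString *text = nil;
        // Copy everything up to the next tag.
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Consume the tag itself: "<", its contents, and ">" if present.
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return result;
}
```

An unterminated tag such as `@"a <b"` is dropped up to the end of the string, so the result is `@"a "`. Like the methods above, this does not understand comments, scripts, or `>` inside attribute values.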
Method 3: a third-party library solves this problem easily: https://github.com/mwaterfall/MWFeedParser
MWFeedParser — An RSS and Atom web feed parser for iOS
MWFeedParser is an Objective-C framework for downloading and parsing RSS (1.* and 2.*) and Atom web feeds. It is a very simple and clean implementation that reads the following information from a web feed:
Feed Information
- Title
- Link
- Summary
Feed Items
- Title
- Link
- Author name
- Date (the date the item was published)
- Updated date (the date the item was updated, if available)
- Summary (brief description of item)
- Content (detailed item content, if available)
- Enclosures (i.e. podcasts, mp3, pdf, etc)
- Identifier (an item's guid/id)
If you use MWFeedParser on your iPhone/iPad app then please do let me know, I'd love to check it out :)
Important: This free software is provided under the MIT licence (X11 license) with the addition of the following condition:
This Software cannot be used to archive or collect data such as (but not limited to) that of events, news, experiences and activities, for the purpose of any concept relating to diary/journal keeping.
The full licence can be found at the end of this document.
Demo / Example App
There is an example iPhone application within the project which demonstrates how to use the parser to display the title of a feed, list all of the feed items, and display an item in more detail when tapped.
Setting up the parser
Create parser:
// Create feed parser and pass the URL of the feed
NSURL *feedURL = [NSURL URLWithString:@"http://images.apple.com/main/rss/hotnews/hotnews.rss"];
feedParser = [[MWFeedParser alloc] initWithFeedURL:feedURL];
Set delegate:
// Delegate must conform to `MWFeedParserDelegate`
feedParser.delegate = self;
Set the parsing type. Options are ParseTypeFull, ParseTypeInfoOnly, ParseTypeItemsOnly. Info refers to the information about the feed, such as its title and description. Items are the individual items or stories.
// Parse the feed's info (title, link) and all feed items
feedParser.feedParseType = ParseTypeFull;
Set whether the parser should connect and download the feed data synchronously or asynchronously. Note, this only affects the download of the feed data, not the parsing operation itself.
// Connection type
feedParser.connectionType = ConnectionTypeSynchronously;
Initiate parsing:
// Begin parsing
[feedParser parse];
The parser will then download and parse the feed. If at any time you wish to stop the parsing, you can call:
// Stop feed download / parsing
[feedParser stopParsing];
The stopParsing method will stop the downloading and parsing of the feed immediately.
Reading the feed data
Once parsing has been initiated, the delegate will receive the feed data as it is parsed.
- (void)feedParserDidStart:(MWFeedParser *)parser; // Called when data has downloaded and parsing has begun
- (void)feedParser:(MWFeedParser *)parser didParseFeedInfo:(MWFeedInfo *)info; // Provides info about the feed
- (void)feedParser:(MWFeedParser *)parser didParseFeedItem:(MWFeedItem *)item; // Provides info about a feed item
- (void)feedParserDidFinish:(MWFeedParser *)parser; // Parsing complete or stopped at any time by `stopParsing`
- (void)feedParser:(MWFeedParser *)parser didFailWithError:(NSError *)error; // Parsing failed
MWFeedInfo and MWFeedItem contain properties (title, link, summary, etc.) that will hold the parsed data. View MWFeedInfo.h and MWFeedItem.h for more information.
Important: There are some occasions where feeds do not contain some information, such as titles, links or summaries. Before using any data, you should check to see if that data exists:
NSString *title = item.title ? item.title : @"[No Title]";
NSString *link = item.link ? item.link : @"[No Link]";
NSString *summary = item.summary ? item.summary : @"[No Summary]";
The method feedParserDidFinish: will only be called when the feed has successfully parsed, or has been stopped by a call to stopParsing. To determine whether the parsing completed successfully, or was stopped, you can call isStopped.
For a usage example, please see RootViewController.m in the demo project.
Available data
Here is a list of the available properties for feed info and item objects:
MWFeedInfo
- info.title (NSString)
- info.link (NSString)
- info.summary (NSString)
MWFeedItem
- item.title (NSString)
- item.link (NSString)
- item.author (NSString)
- item.date (NSDate)
- item.updated (NSDate)
- item.summary (NSString)
- item.content (NSString)
- item.enclosures (NSArray of NSDictionary with keys url, type and length)
- item.identifier (NSString)
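As an illustration of the enclosure structure in the list, the sketch below collects the url values of enclosures whose type starts with a given prefix. Only the key names url, type and length come from the list. That url and type are stored as NSString, the sample data in the usage note, and the helper name `EnclosureURLsWithTypePrefix` are all assumptions made for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the "url" of every enclosure whose "type"
// begins with the given prefix (for example @"audio/").
static NSArray<NSString *> *EnclosureURLsWithTypePrefix(NSArray<NSDictionary *> *enclosures,
                                                        NSString *prefix) {
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    for (NSDictionary *enclosure in enclosures) {
        id type = enclosure[@"type"];
        id url = enclosure[@"url"];
        // Assumption: both values are strings; anything else is skipped.
        if ([type isKindOfClass:[NSString class]] &&
            [url isKindOfClass:[NSString class]] &&
            [(NSString *)type hasPrefix:prefix]) {
            [urls addObject:(NSString *)url];
        }
    }
    return urls;
}
```

With an MWFeedItem this might be called as `EnclosureURLsWithTypePrefix(item.enclosures, @"audio/")`. Passing nil for the array is safe and returns an empty array.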
Using the data
All properties of MWFeedInfo and MWFeedItem return the raw data as provided by the feed. This content may or may not include HTML and encoded entities. If the content does include HTML, you could display the data within a UIWebView, or you could use the provided NSString category (NSString+HTML) which will allow you to manipulate this HTML content. The methods available for your convenience are:
// Convert HTML to Plain Text
// - Strips HTML tags & comments, removes extra whitespace and decodes HTML character entities.
- (NSString *)stringByConvertingHTMLToPlainText;
// Decode all HTML entities using GTM.
- (NSString *)stringByDecodingHTMLEntities;
// Encode all HTML entities using GTM.
- (NSString *)stringByEncodingHTMLEntities;
// Minimal unicode encoding will only cover characters from table
// A.2.2 of http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_Special_characters
// which is what you want for a unicode encoded webpage.
- (NSString *)stringByEncodingHTMLEntities:(BOOL)isUnicode;
// Replace newlines with <br /> tags.
- (NSString *)stringWithNewLinesAsBRs;
// Remove newlines and white space from string.
- (NSString *)stringByRemovingNewLinesAndWhitespace;
// Wrap plain URLs in <a href="..." class="linkified">...</a>
// - Ignores URLs inside tags (any URL beginning with =")
// - HTTP & HTTPS schemes only
// - Only works in iOS 4+ as we use NSRegularExpression (returns self if not supported so be careful with NSMutableStrings)
// - Expression: (?<!=")\b((http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)
// - Adapted from http://regexlib.com/REDetails.aspx?regexp_id=96
- (NSString *)stringByLinkifyingURLs;
An example of this would be:
// Display item summary which contains HTML as plain text
NSString *plainSummary = [item.summary stringByConvertingHTMLToPlainText];
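Stripping tags by hand leaves encoded entities such as `&amp;` in the text, which is one reason to prefer the category. For readers who only need the most common cases, a minimal sketch of entity decoding follows. It is not the library's implementation (which the comments say uses GTM). `DecodeBasicEntities` is a hypothetical name, and it handles only five entities, on the assumption that numeric and other named entities do not occur in the input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes &lt; &gt; &quot; &#39; and &amp; only.
static NSString *DecodeBasicEntities(NSString *text) {
    NSDictionary<NSString *, NSString *> *entities = @{
        @"&lt;": @"<",
        @"&gt;": @">",
        @"&quot;": @"\"",
        @"&#39;": @"'"
    };
    NSMutableString *result = [text mutableCopy];
    for (NSString *entity in entities) {
        [result replaceOccurrencesOfString:entity
                                withString:entities[entity]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, result.length)];
    }
    // Decode &amp; last, so that "&amp;lt;" becomes "&lt;" and not "<".
    [result replaceOccurrencesOfString:@"&amp;"
                            withString:@"&"
                               options:NSLiteralSearch
                                 range:NSMakeRange(0, result.length)];
    return result;
}
```

For example, `DecodeBasicEntities(@"Tom &amp; Jerry &lt;3")` gives `@"Tom & Jerry <3"`. Decode entities only after the tags have been removed, otherwise a decoded `&lt;` would be mistaken for the start of a tag.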
Debugging problems
If for some reason the parser doesn't seem to be working, try enabling Debug Logging in MWFeedParser.h. This will log error messages to the console and help you diagnose the problem. Error codes and their descriptions can be found at the top of MWFeedParser.h.
Other information
MWFeedParser is not currently thread-safe.
Adding to your project
Method 1: Use CocoaPods
CocoaPods is great. If you are using CocoaPods, simply add pod 'MWFeedParser' to your Podfile and run pod install. You're good to go! Here's an example Podfile:
platform :ios, '7'
pod 'MWFeedParser'
If you are just interested in using the HTML and/or InternetDateTime categories in your app, you can just specify those in your Podfile with pod 'MWFeedParser/NSString+HTML' or pod 'MWFeedParser/NSDate+InternetDateTime'.
Method 2: Including Source Directly Into Your Project
- Open MWFeedParser.xcodeproj.
- Drag the MWFeedParser & Categories groups into your project, ensuring you check Copy items into destination group's folder.
- Import MWFeedParser.h into your source as required.
Outstanding and suggested features
- Demonstrate the previewing of formatted item summary/content (HTML with images, paragraphs, etc) within a UIWebView in demo app.
- Provide functionality to list available feeds when given the URL to a webpage with one or more web feeds associated with it.
- Support for the Media RSS extension (from Flickr, etc.)
- Support for the GeoRSS extension.
- Look into web feed icons.
- Look into supporting/detecting images in feed items.
Feel free to get in touch and suggest/vote for other features.