HDU OJ Problem 4018: Parsing URL
HDU OJ Problem 4018, Parsing URL.
Parsing URL
Problem Description
In computing, a Uniform Resource Locator or Universal Resource Locator (URL) is a character string that specifies where a known resource is available on the Internet and the mechanism for retrieving it.
The syntax of a typical URL is:
scheme://domain:port/path?query_string#fragment_id
In this problem, the scheme, domain is required by all URL and other components are optional. That is, for example, the following are all correct urls:
http://dict.bing.com.cn/#%E5%B0%8F%E6%95%B0%E7%82%B9
http://www.mariowiki.com/Mushroom
https://mail.google.com/mail/?shva=1#inbox
http://en.wikipedia.org/wiki/Bowser_(character)
ftp://fs.fudan.edu.cn/
telnet://bbs.fudan.edu.cn/
http://mail.bashu.cn:8080/BsOnline/
Your task is to find the domain for all given URLs.
Input
There are multiple test cases in this problem. The first line of input contains a single integer denoting the number of test cases. For each of test case, there is only one line contains a valid URL.
Output
For each test case, you should output the domain of the given URL.
Sample Input
3
http://dict.bing.com.cn/#%E5%B0%8F%E6%95%B0%E7%82%B9
http://www.mariowiki.com/Mushroom
https://mail.google.com/mail/?shva=1#inbox
Sample Output
Case #1: dict.bing.com.cn
Case #2: www.mariowiki.com
Case #3: mail.google.com
Source
The 36th ACM/ICPC Asia Regional Shanghai Site —— Warmup
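Approach: this is simple string parsing with no real difficulty. The one thing to watch is that the port number must not be printed. Java's regular expressions handle it easily.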
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        Scanner cin = new Scanner(System.in);
        // Group 2 is the domain: everything after "scheme://" up to the
        // first ':' (port) or '/' (path). The tail is optional, so a URL
        // with nothing after the domain still matches.
        Pattern pattern = Pattern.compile("([A-Za-z]+://)([^:/]+)([:/].*)?");
        int n = cin.nextInt();
        cin.nextLine(); // discard the rest of the first line
        for (int i = 1; i <= n; i++) {
            String url = cin.nextLine();
            Matcher matcher = pattern.matcher(url);
            if (matcher.matches()) {
                System.out.println("Case #" + i + ": " + matcher.group(2));
            }
        }
    }
}
C works too, if you prefer it. In principle, C can use the GNU regular expression functions (regex.h).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <regex.h>

typedef int COUNT;
#define MAX_LENGTH 1000

int main (void)
{
    COUNT i;
    int n;
    char url[MAX_LENGTH];
    regmatch_t pmatch[4];
    regex_t match_regex;

    /* Group 2 is the domain; the tail after it is optional. */
    regcomp( &match_regex, "([A-Za-z]+://)([^:/]+)([:/].*)?", REG_EXTENDED );
    scanf( "%d", &n );
    for ( i = 1 ; i <= n ; i ++ )
    {
        scanf( "%999s", url );
        if ( regexec( &match_regex, url, 4, pmatch, 0 ) != 0 )
            continue; /* no match: nothing to print */
        /* Cut the string at the end of the domain, then print from its start. */
        url[pmatch[2].rm_eo] = '\0';
        printf( "Case #%d: %s\n", i, &(url[pmatch[2].rm_so]) );
    }
    regfree( &match_regex );
    return EXIT_SUCCESS;
}
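However, the HDU OJ runs on a Windows server and its gcc is the MinGW gcc, which does not support GNU regular expressions. So if you write the solution in C, you have to parse the string yourself. The C code is as follows: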
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

typedef int COUNT;
#define MAX_LENGTH 1000

int main (void)
{
    COUNT i, j;
    int n;
    bool starturl;
    char url[MAX_LENGTH];
    char outputurl[MAX_LENGTH];
    int len;

    scanf( "%d", &n );
    for ( i = 1 ; i <= n ; i ++ )
    {
        starturl = false;
        scanf( "%s", url );
        sprintf (outputurl, "Case #%d: ", i );
        len = strlen( outputurl );
        for ( j = 0 ; url[j] != '\0' ; j ++ )
        {
            if ( !starturl )
            {
                /* First '/' of "://": skip the second one as well. */
                if ( url[j] == '/' )
                {
                    j ++;
                    starturl = true;
                }
            }
            else
            {
                /* The domain ends at a port ':' or a path '/'. */
                if ( url[j] == ':' || url[j] == '/' || url[j] == '\0' )
                    break;
                outputurl[len++] = url[j];
            }
        }
        outputurl[len] = '\0';
        puts( outputurl );
    }
    return EXIT_SUCCESS;
}
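The hand-written scan can also be expressed with two standard C library functions, which may be easier to check by eye: strstr finds the "://" separator, and strcspn measures how many characters follow before the first ':' or '/' (or the end of the string). The sketch below is only an illustration. The helper name extract_domain and its return convention are assumptions and do not come from the original solution.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the original solution): copies the
 * domain of `url` into `out`, whose capacity `cap` includes the
 * terminating '\0'. Returns 0 on success, or -1 if "://" is missing,
 * the domain is empty, or the buffer is too small.
 */
int extract_domain(const char *url, char *out, size_t cap)
{
    const char *start = strstr(url, "://");
    size_t len;

    if (start == NULL)
        return -1;
    start += 3; /* skip past "://" */

    /* Length of the leading run that contains neither ':' nor '/'. */
    len = strcspn(start, ":/");
    if (len == 0 || len >= cap)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Under these assumptions, calling extract_domain on "http://mail.bashu.cn:8080/BsOnline/" leaves "mail.bashu.cn" in the output buffer, so the port is dropped as the problem requires. A URL with nothing after the domain is handled as well, because strcspn stops at the end of the string.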