c4 Internet Address - Some Useful Programs
Source: Internet  Editor: 程序博客网  Time: 2024/06/16 11:08
SpamCheck
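Many servers need to monitor spam and tell a client whether the host it is about to contact is a known spammer. A real-time blackhole list like this has to respond as quickly as possible, and the load is heavy, possibly millions of queries.
Solving this calls for the fastest possible response, ideally with caching, and the load can be spread across distributed servers. The service could be built on a web server, SOAP, UDP, a custom protocol, and so on. In practice, this kind of service can be implemented with DNS.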
import java.net.InetAddress;
import java.net.UnknownHostException;

public class SpamCheck {

    public static final String BLACKHOLE = "sbl.spamhaus.org";

    public static void main(String[] args) {
        for (String arg : args) {
            if (isSpammer(arg)) {
                System.out.println(arg + " is a known spammer.");
            } else {
                System.out.println(arg + " appears legitimate.");
            }
        }
    }

    // IPv4 only: the octets are reversed and prepended to the blackhole domain.
    private static boolean isSpammer(String arg) {
        try {
            InetAddress address = InetAddress.getByName(arg);
            byte[] quad = address.getAddress();
            String query = BLACKHOLE;
            for (byte octet : quad) {
                int unsignedByte = octet < 0 ? octet + 256 : octet;
                query = unsignedByte + "." + query;
            }
            // The lookup succeeds only if the address is on the list.
            InetAddress listed = InetAddress.getByName(query);
            System.out.println(listed.getHostName());
            System.out.println(listed.getHostAddress());
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
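To make the query name concrete: a DNS blackhole lookup reverses the four octets of the IPv4 address and prepends them to the list's domain, so 207.87.34.17 is checked by resolving 17.34.87.207.sbl.spamhaus.org. The sketch below isolates that string-building step so it can be checked without touching the blackhole server. The names DnsblQuery and buildQuery are made up for illustration, and the helper assumes IPv4 addresses only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class DnsblQuery {

    // Hypothetical helper: builds the name to resolve for a raw IPv4 address.
    static String buildQuery(byte[] ipv4, String zone) {
        if (ipv4.length != 4) {
            throw new IllegalArgumentException("IPv4 address expected");
        }
        StringBuilder query = new StringBuilder();
        // Walk the octets from last to first so they come out reversed.
        for (int i = ipv4.length - 1; i >= 0; i--) {
            query.append(ipv4[i] & 0xFF).append('.'); // & 0xFF reads the byte as unsigned
        }
        return query.append(zone).toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A dotted-quad literal is parsed locally; no DNS lookup happens here.
        byte[] raw = InetAddress.getByName("207.87.34.17").getAddress();
        System.out.println(buildQuery(raw, "sbl.spamhaus.org"));
        // prints 17.34.87.207.sbl.spamhaus.org
    }
}
```

Keeping the name construction separate from the lookup also makes it easy to point the same code at a different blackhole zone.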
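When using this technique, keep in mind that the blackhole list and its addresses have to be maintained. You also need to plan for failure modes, such as the server coming under attack and refusing to answer any request.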
Processing Web Server Logfiles
205.160.186.76 unknown - [17/Jun/2013:22:53:58 -0500] "GET /bgs/greenbg.gif HTTP 1.0" 200 50
The entry above says that a browser at 205.160.186.76 requested the resource /bgs/greenbg.gif, that the request succeeded, and that the resource was 50 bytes long.
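The fields of such an entry can be pulled apart with a regular expression. The following is a minimal sketch that assumes the layout shown in the sample entry (host, two identity fields, bracketed timestamp, quoted request, status code, byte count); the class name LogLine and its fields are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLine {

    // host ident user [timestamp] "request" status bytes
    private static final Pattern ENTRY = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    final String host;
    final String request;
    final int status;
    final long bytes;

    private LogLine(String host, String request, int status, long bytes) {
        this.host = host;
        this.request = request;
        this.status = status;
        this.bytes = bytes;
    }

    // Hypothetical parser: rejects anything that does not match the assumed layout.
    static LogLine parse(String entry) {
        Matcher m = ENTRY.matcher(entry);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized log entry: " + entry);
        }
        // Some servers write "-" when no body was sent; treat that as 0 bytes.
        long bytes = m.group(7).equals("-") ? 0L : Long.parseLong(m.group(7));
        return new LogLine(m.group(1), m.group(5), Integer.parseInt(m.group(6)), bytes);
    }

    public static void main(String[] args) {
        LogLine e = parse("205.160.186.76 unknown - [17/Jun/2013:22:53:58 -0500] "
                + "\"GET /bgs/greenbg.gif HTTP 1.0\" 200 50");
        System.out.println(e.host + " -> " + e.status + ", " + e.bytes + " bytes");
    }
}
```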
import java.io.*;
import java.net.*;

public class Weblog {

    public static void main(String[] args) {
        try (FileInputStream fin = new FileInputStream(args[0]);
             Reader in = new InputStreamReader(fin);
             BufferedReader bin = new BufferedReader(in)) {
            for (String entry = bin.readLine(); entry != null; entry = bin.readLine()) {
                // separate out the IP address
                int index = entry.indexOf(' ');
                String ip = entry.substring(0, index);
                String theRest = entry.substring(index);
                // Ask DNS for the hostname and print it out
                try {
                    InetAddress address = InetAddress.getByName(ip);
                    System.out.println(address.getHostName() + theRest);
                } catch (UnknownHostException ex) {
                    System.err.println(entry);
                }
            }
        } catch (IOException ex) {
            System.out.println("Exception: " + ex);
        }
    }
}
InetAddress caches lookup results, so a repeated IP address does not trigger another DNS query.
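How long those cached answers live is controlled by Java security properties rather than by InetAddress methods. The sketch below sets networkaddress.cache.ttl (successful lookups) and networkaddress.cache.negative.ttl (failed lookups), both in seconds. The class name DnsCacheSettings and the values are arbitrary examples, and the properties should be set before the first lookup to be sure they take effect.

```java
import java.security.Security;

final class DnsCacheSettings {

    // Hypothetical helper: sets the lifetime of cached DNS answers, in seconds.
    static void apply(int positiveSeconds, int negativeSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeSeconds));
    }

    public static void main(String[] args) {
        // Example values only: keep successful lookups for an hour, failures for 10 seconds.
        apply(3600, 10);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

For a log file that repeats the same addresses many times, a long positive lifetime keeps repeated entries cheap.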
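The Weblog program can be reworked to run much faster, because it spends most of its time waiting for DNS to respond. This is exactly the kind of problem multithreading solves: one thread reads the log entries and hands each one to other threads for processing. A log file can contain a huge number of entries, though, and starting one thread per entry would bring the VM to its knees in short order, so a thread pool is used instead.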
import java.net.InetAddress;
import java.util.concurrent.Callable;

public class LookupTask implements Callable<String> {

    private final String line;

    public LookupTask(String line) {
        this.line = line;
    }

    @Override
    public String call() {
        try {
            // separate out the IP address
            int index = line.indexOf(' ');
            String address = line.substring(0, index);
            String theRest = line.substring(index);
            String hostname = InetAddress.getByName(address).getHostName();
            // theRest still starts with the separating space
            return hostname + theRest;
        } catch (Exception ex) {
            // fall back to the unmodified entry if anything goes wrong
            return line;
        }
    }
}
import java.io.*;
import java.util.*;
import java.util.concurrent.*;

// Requires Java 7 for try-with-resources and multi-catch
public class PooledWeblog {

    private final static int NUM_THREADS = 4;

    public static void main(String[] args) throws IOException {
        ExecutorService executor = Executors.newFixedThreadPool(NUM_THREADS);
        Queue<LogEntry> results = new LinkedList<LogEntry>();

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(args[0]), "UTF-8"))) {
            for (String entry = in.readLine(); entry != null; entry = in.readLine()) {
                LookupTask task = new LookupTask(entry);
                Future<String> future = executor.submit(task);
                LogEntry result = new LogEntry(entry, future);
                results.add(result);
            }
        }

        // Start printing the results. This blocks each time a result isn't ready.
        for (LogEntry result : results) {
            try {
                System.out.println(result.future.get());
            } catch (InterruptedException | ExecutionException ex) {
                System.out.println(result.original);
            }
        }

        executor.shutdown();
    }

    private static class LogEntry {
        String original;
        Future<String> future;

        LogEntry(String original, Future<String> future) {
            this.original = original;
            this.future = future;
        }
    }
}
By a not entirely scientific measurement, the pooled version is 10 to 50 times faster than the first version.
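PooledWeblog still has a design downside. The log file may be very large, so the queue grows very large too, and the program consumes a lot of memory. One way to avoid this is to move output into its own thread that shares the queue with the input thread, so that entries processed early can be printed right away instead of waiting until every entry has been queued. That raises another problem: you need a separate signal to tell the output thread that the work is finished, because an empty queue does not prove it. The simplest approach is to count the entries read and stop once the number of entries printed matches.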
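The shared-queue design described here can be sketched as follows. This is one possible arrangement under stated assumptions, not the original author's code: the class name StreamingWeblog, the queue capacity of 100, and the 100 ms poll interval are all invented for illustration, and the DNS lookup is passed in as a function so the sketch runs without a network. A bounded queue makes the reader block when output falls behind, and a counter that starts at -1 tells the output thread how many entries to expect once input has finished.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;

final class StreamingWeblog {

    private static final int NUM_THREADS = 4;
    private static final int QUEUE_CAPACITY = 100; // arbitrary bound on entries held in memory

    static void process(Iterable<String> entries, Function<String, String> lookup,
                        Consumer<String> out) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(NUM_THREADS);
        BlockingQueue<Future<String>> pending = new ArrayBlockingQueue<>(QUEUE_CAPACITY);
        // -1 means "input still running"; afterwards it holds the number of entries queued.
        AtomicInteger total = new AtomicInteger(-1);

        Thread printer = new Thread(() -> {
            int printed = 0;
            try {
                while (total.get() < 0 || printed < total.get()) {
                    Future<String> next = pending.poll(100, TimeUnit.MILLISECONDS);
                    if (next == null) {
                        continue; // an empty queue alone does not mean the work is done
                    }
                    try {
                        out.accept(next.get());
                    } catch (ExecutionException ex) {
                        out.accept("lookup failed: " + ex.getCause());
                    }
                    printed++;
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        printer.start();

        int queued = 0;
        try {
            for (String entry : entries) {
                Callable<String> task = () -> lookup.apply(entry);
                pending.put(executor.submit(task)); // blocks while the queue is full
                queued++;
            }
        } finally {
            total.set(queued); // the completion signal: the printer now knows when to stop
            executor.shutdown();
        }
        printer.join();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> entries = Arrays.asList("10.0.0.1 first", "10.0.0.2 second");
        // Stand-in for the DNS lookup so the example needs no network access.
        process(entries, line -> "resolved " + line, System.out::println);
    }
}
```

Because the queue holds futures in submission order, results are still printed in the order the entries were read, even though lookups finish in any order.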