Introduction to the Fork/Join Framework III: Searching a Folder and Its Subfolders for Files with a Given Extension

Source: Internet  Editor: 程序博客网  Time: 2024/04/29 15:00

Running Fork/Join Tasks Asynchronously

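A ForkJoinTask can be executed in a ForkJoinPool either synchronously or asynchronously. With synchronous execution, the method that sends the task to the Fork/Join pool does not return until the task has finished. With asynchronous execution, the method that sends the task to the executor returns immediately, while the task carries on running.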

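There is an important difference between the two approaches. When a task calls a synchronous method (for example, invokeAll()), that task is suspended until the tasks it sent to the Fork/Join pool have finished. This lets the ForkJoinPool class use the work-stealing algorithm to assign a new task to the worker thread that was running the suspended task. In contrast, when a task calls an asynchronous method (for example, fork()), the task keeps running, so the ForkJoinPool class cannot use work stealing at that point to improve the application's performance. In that case, the ForkJoinPool class can use work stealing only once the task calls join() or get() to wait for another task to finish.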
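The difference between the two styles can be sketched with a small task that sums a range of integers. This is only an illustration, not part of the original recipe: the RangeSum and SyncVsAsyncDemo classes, the THRESHOLD value, and the async flag are all hypothetical names introduced here. With the flag off, the task hands both halves to invokeAll() and is suspended until they finish; with the flag on, it calls fork() and keeps running until it reaches join().

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical task: sums the integers in [from, to) by splitting the range in half. */
class RangeSum extends RecursiveTask<Long> {
    private static final long serialVersionUID = 1L;
    private static final int THRESHOLD = 1_000; // assumed cut-off for direct computation

    private final int from;
    private final int to;
    private final boolean async;

    RangeSum(int from, int to, boolean async) {
        this.from = from;
        this.to = to;
        this.async = async;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid, async);
        RangeSum right = new RangeSum(mid, to, async);
        if (async) {
            // Asynchronous: fork() returns at once and this task keeps running.
            left.fork();
            right.fork();
        } else {
            // Synchronous: invokeAll() returns only after both subtasks have finished.
            invokeAll(left, right);
        }
        // join() waits if needed; the most recently forked task is joined first.
        return right.join() + left.join();
    }
}

class SyncVsAsyncDemo {
    /** Runs the summation in a fresh pool and returns the total. */
    static long sum(int n, boolean async) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new RangeSum(0, n, async));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("invokeAll: " + sum(100_000, false));
        System.out.println("fork/join: " + sum(100_000, true));
    }
}
```

Both variants compute the same total; what changes is the moment at which the parent task stops and waits for its subtasks.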

Code Example

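This example uses the asynchronous methods of the ForkJoinPool and ForkJoinTask classes to manage tasks. We will implement a program that
searches a folder and its subfolders for files with a given extension.
The ForkJoinTask implementation processes the contents of one folder. For each subfolder in that folder, the task asynchronously sends a new task to the ForkJoinPool. For each file in the folder, the task checks the file's extension and, if it matches, adds the file to the result list.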

Main.java

package com.packtpub.java7.concurrency.chapter5.recipe03.core;

import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

import com.packtpub.java7.concurrency.chapter5.recipe03.task.FolderProcessor;

public class Main {

    /**
     * Main method of the example
     */
    public static void main(String[] args) {
        // Create the pool
        ForkJoinPool pool = new ForkJoinPool();

        // Create three FolderProcessor tasks for three different folders
        FolderProcessor system = new FolderProcessor("C:\\Windows", "log");
        FolderProcessor apps = new FolderProcessor("C:\\Program Files", "log");
        FolderProcessor documents = new FolderProcessor("C:\\Documents And Settings", "log");

        // Execute the three tasks in the pool
        pool.execute(system);
        pool.execute(apps);
        pool.execute(documents);

        // Write statistics of the pool until the three tasks end
        do {
            System.out.printf("******************************************\n");
            System.out.printf("Main: Parallelism: %d\n", pool.getParallelism());
            System.out.printf("Main: Active Threads: %d\n", pool.getActiveThreadCount());
            System.out.printf("Main: Task Count: %d\n", pool.getQueuedTaskCount());
            System.out.printf("Main: Steal Count: %d\n", pool.getStealCount());
            System.out.printf("******************************************\n");
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } while ((!system.isDone()) || (!apps.isDone()) || (!documents.isDone()));

        // Shut down the pool
        pool.shutdown();

        // Write the number of results calculated by each task
        List<String> results;

        results = system.join();
        System.out.printf("System: %d files found.\n", results.size());

        results = apps.join();
        System.out.printf("Apps: %d files found.\n", results.size());

        results = documents.join();
        System.out.printf("Documents: %d files found.\n", results.size());
    }
}

FolderProcessor.java

package com.packtpub.java7.concurrency.chapter5.recipe03.task;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveTask;

/**
 * Task that processes a folder. It launches a new FolderProcessor task for each
 * subfolder. For each file in the folder, it checks whether the file has the
 * extension it is looking for. If so, it adds the file's path to the list of results.
 */
public class FolderProcessor extends RecursiveTask<List<String>> {

    /**
     * Serial version UID of the class. It is needed because the
     * ForkJoinTask class implements the Serializable interface
     */
    private static final long serialVersionUID = 1L;

    /**
     * Path of the folder this task is going to process
     */
    private String path;

    /**
     * Extension of the files the task is looking for
     */
    private String extension;

    /**
     * Constructor of the class
     * @param path Path of the folder this task is going to process
     * @param extension Extension of the files this task is looking for
     */
    public FolderProcessor(String path, String extension) {
        this.path = path;
        this.extension = extension;
    }

    /**
     * Main method of the task. It launches an additional FolderProcessor task
     * for each subfolder in this folder. For each file in the folder, it compares
     * its extension with the extension it is looking for. If they are equal, it
     * adds the full path of the file to the list of results
     */
    @Override
    protected List<String> compute() {
        List<String> list = new ArrayList<>();
        List<FolderProcessor> tasks = new ArrayList<>();

        File file = new File(path);
        File[] content = file.listFiles();
        if (content != null) {
            for (int i = 0; i < content.length; i++) {
                if (content[i].isDirectory()) {
                    // If it is a directory, launch a new task to process it
                    FolderProcessor task = new FolderProcessor(content[i].getAbsolutePath(), extension);
                    task.fork();
                    tasks.add(task);
                } else {
                    // If it is a file, compare its extension with the extension
                    // the task is looking for
                    if (checkFile(content[i].getName())) {
                        list.add(content[i].getAbsolutePath());
                    }
                }
            }

            // If this task launched more than 50 subtasks, write a message
            if (tasks.size() > 50) {
                System.out.printf("%s: %d tasks ran.\n", file.getAbsolutePath(), tasks.size());
            }

            // Include the results of the subtasks
            addResultsFromTasks(list, tasks);
        }
        return list;
    }

    /**
     * Adds the results of the tasks launched by this task to the list this
     * task will return. Uses the join() method to wait for the tasks to finish
     * @param list List of results
     * @param tasks List of tasks
     */
    private void addResultsFromTasks(List<String> list, List<FolderProcessor> tasks) {
        for (FolderProcessor item : tasks) {
            list.addAll(item.join());
        }
    }

    /**
     * Checks whether a file name has the extension the task is looking for
     * @param name Name of the file
     * @return true if the name ends with the extension, false otherwise
     */
    private boolean checkFile(String name) {
        return name.endsWith(extension);
    }
}

Output


Main: Parallelism: 4
Main: Active Threads: 4
C:\Windows: 59 tasks ran.
Main: Task Count: 91
Main: Steal Count: 0


C:\Windows\assembly\GAC_MSIL: 294 tasks ran.
C:\Windows\assembly\NativeImages_v2.0.50727_32: 122 tasks ran.
C:\Windows\assembly\NativeImages_v2.0.50727_64: 114 tasks ran.
C:\Windows\assembly\NativeImages_v4.0.30319_32: 147 tasks ran.
C:\Windows\assembly\NativeImages_v4.0.30319_64: 141 tasks ran.
C:\Windows\Microsoft.NET\assembly\GAC_MSIL: 193 tasks ran.


Main: Parallelism: 4
Main: Active Threads: 38
Main: Task Count: 1030
Main: Steal Count: 615


C:\Windows\SysWOW64: 88 tasks ran.
C:\Windows\System32: 91 tasks ran.


Main: Parallelism: 4
Main: Active Threads: 24
Main: Task Count: 521
Main: Steal Count: 1362


C:\Windows\System32\DriverStore\FileRepository: 264 tasks ran.


Main: Parallelism: 4
Main: Active Threads: 9
Main: Task Count: 1852
Main: Steal Count: 2113



Main: Parallelism: 4
Main: Active Threads: 4
Main: Task Count: 1341
Main: Steal Count: 2268



Main: Parallelism: 4
Main: Active Threads: 4
Main: Task Count: 1313
Main: Steal Count: 4623


C:\Windows\winsxs: 13303 tasks ran.


Main: Parallelism: 4
Main: Active Threads: 5
Main: Task Count: 870
Main: Steal Count: 4623



Main: Parallelism: 4
Main: Active Threads: 2
Main: Task Count: 0
Main: Steal Count: 13058



Main: Parallelism: 4
Main: Active Threads: 2
Main: Task Count: 0
Main: Steal Count: 13058


System: 21 files found.
Apps: 6 files found.
Documents: 0 files found.

How It Works

The key to this example is the FolderProcessor class. Each task processes the contents of one folder. A folder contains two kinds of elements:

  • files;
  • other folders.
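When a task finds a folder, it creates another task object to process that folder and sends it to the pool by calling fork(). The pool runs the new task if it has a free worker thread or can create a new one. fork() returns immediately, so the task can continue with the rest of the folder's contents. For each file, the task compares the file's extension with the one being searched for and, if they match, adds the file's full path to the result list.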

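Once a task has processed all the contents of its folder, it calls join() on each subtask it sent to the pool and waits for them to finish. Called on a task, join() waits for that task to complete and returns the value returned by its compute() method. The task merges the subtasks' results into its own result list and returns that list as the return value of its own compute() method.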

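The ForkJoinPool class also allows tasks to be executed asynchronously. The example calls execute() to send the three initial tasks to the pool. The Main class also prints information about the state of the pool and how it changes while the tasks run, then shuts the pool down with shutdown(). The ForkJoinPool class offers further methods that are useful for this kind of monitoring.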
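The execute-then-monitor pattern can be tried without touching the file system. The sketch below is an assumption-laden illustration rather than the recipe's code: SlowAnswer and ExecuteDemo are hypothetical names, and the task simply sleeps for 200 ms before returning a value. It shows that execute() returns before the task is done and that the pool can be polled with the same getters Main uses.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a long-running task: sleeps, then returns 42. */
class SlowAnswer extends RecursiveTask<Integer> {
    private static final long serialVersionUID = 1L;

    @Override
    protected Integer compute() {
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }
}

class ExecuteDemo {
    /** Submits the task asynchronously, polls the pool until it is done, returns the result. */
    static int runAndMonitor() {
        ForkJoinPool pool = new ForkJoinPool();
        SlowAnswer task = new SlowAnswer();
        pool.execute(task); // returns immediately; the task runs in a worker thread
        try {
            while (!task.isDone()) {
                System.out.printf("parallelism=%d active=%d queued=%d steals=%d%n",
                        pool.getParallelism(), pool.getActiveThreadCount(),
                        pool.getQueuedTaskCount(), pool.getStealCount());
                TimeUnit.MILLISECONDS.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown(); // no new tasks are accepted; running tasks may finish
        }
        return task.join(); // waits if the task is still running, then returns its result
    }

    public static void main(String[] args) {
        System.out.println("Result: " + runAndMonitor());
    }
}
```

The numbers printed while polling depend on timing and on the machine, so they will differ from run to run.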

There's More

This example uses join() to wait for tasks to finish and obtain their results. The following two versions of get() can be used for the same purpose.

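  • get(): returns the value returned by compute() if the ForkJoinTask has finished; otherwise it waits until the task finishes.
  • get(long timeout, TimeUnit unit): if the result is not yet available, this version waits for the specified time. If the time elapses and the result is still not available, the method throws a TimeoutException.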

TimeUnit is an enum with the following constants: DAYS, HOURS, MICROSECONDS, MILLISECONDS, MINUTES, NANOSECONDS, and SECONDS.
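As a rough illustration of the timed get(), the following sketch submits a task that sleeps for a configurable time and waits for it with a deadline. SleepyTask and TimedGetDemo are hypothetical names introduced for this example. Per the ForkJoinTask Javadoc, get(long, TimeUnit) throws TimeoutException when the wait elapses before the result is ready, so the code catches that exception instead of checking for a null return.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical task that sleeps for the given time and then returns "done". */
class SleepyTask extends RecursiveTask<String> {
    private static final long serialVersionUID = 1L;
    private final long millis;

    SleepyTask(long millis) {
        this.millis = millis;
    }

    @Override
    protected String compute() {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "done";
    }
}

class TimedGetDemo {
    /** Waits at most timeoutMillis for a task that takes about taskMillis. */
    static String tryGet(long taskMillis, long timeoutMillis) {
        ForkJoinPool pool = new ForkJoinPool();
        ForkJoinTask<String> task = pool.submit(new SleepyTask(taskMillis));
        try {
            return task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out"; // the result was not ready within the deadline
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        } finally {
            pool.shutdownNow(); // abandon any task that is still running
        }
    }

    public static void main(String[] args) {
        System.out.println(tryGet(10, 2_000));  // fast task, generous timeout
        System.out.println(tryGet(1_000, 50));  // slow task, short timeout
    }
}
```

Any TimeUnit constant can be passed as the second argument; MILLISECONDS is used here only to keep the demo short.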

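There are also two main differences between get() and join():

  • join() cannot be interrupted: interrupting the thread that called join() does not make it return early, and join() does not throw InterruptedException. get(), by contrast, throws InterruptedException if the waiting thread is interrupted.
  • If the task throws an unchecked exception, get() throws an ExecutionException that wraps it, whereas join() rethrows the exception itself as a RuntimeException.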
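The exception behavior of the two methods can be checked with a task that always fails. This is a minimal sketch under stated assumptions: FailingTask and GetVsJoinDemo are hypothetical names, and the task throws an IllegalStateException purely for demonstration. Each helper returns the exception it caught so the two cases can be compared.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

/** Hypothetical task that always fails with an unchecked exception. */
class FailingTask extends RecursiveTask<Integer> {
    private static final long serialVersionUID = 1L;

    @Override
    protected Integer compute() {
        throw new IllegalStateException("boom");
    }
}

class GetVsJoinDemo {
    /** Returns the exception thrown by join(), or null if none was thrown. */
    static Throwable viaJoin() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            ForkJoinTask<Integer> task = pool.submit(new FailingTask());
            task.join();
            return null;
        } catch (RuntimeException e) {
            return e; // join() rethrows the task's unchecked exception
        } finally {
            pool.shutdown();
        }
    }

    /** Returns the exception thrown by get(), or null if none was thrown. */
    static Throwable viaGet() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            ForkJoinTask<Integer> task = pool.submit(new FailingTask());
            task.get();
            return null;
        } catch (ExecutionException e) {
            return e; // get() wraps the failure; getCause() gives the original
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // only get() can be interrupted this way
            return e;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("join(): " + viaJoin());
        System.out.println("get():  " + viaGet());
    }
}
```

Note that only get() forces the caller to handle checked exceptions, which is one practical reason join() is the usual choice inside compute().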

Reposted from: http://ifeve.com/java7-concurrency-cookbook-4/
