OpenCV Basic Operations


Today, while browsing the OpenCV wiki, I came across a good article. It walks through the basic operations of OpenCV and shows how to combine image processing algorithms with Visual C++; it is well worth a read.

The images could not be carried over; the source address is given at the bottom of the article.

 

Programming computer vision applications:

 

A step-by-step guide to the use of

Microsoft Visual C++

and the Intel OpenCV library

 

Robert Laganière

VIVA lab

University of Ottawa

The objective of this tutorial is to teach you how to program computer vision applications, i.e. applications where you have to process images and (video) sequences of images. You will learn how to use MS Visual C++ and Intel OpenCV to build your applications.

Since this is a beginner’s guide, efforts have been made to describe in detail all the steps required to obtain the results shown. Special emphasis has been put on good programming principles through recourse to the Object Oriented paradigm and the use of some design patterns. All the source code presented in the tutorial is available for download.

Your comments and questions are welcome; however, because of the number of emails I receive, I cannot guarantee that they will all be answered.

OpenCV version 1.0 has been used to produce the examples below, with Microsoft Visual Studio 2005 under Windows XP.

Note: this tutorial replaces a previous version that remains online at its original address.

0. The OpenCV library

 

OpenCV is an open source library offered by Intel under a BSD license and now widely used in the computer vision community. OpenCV can easily be installed from SourceForge.net. The installer will create an OpenCV directory under your Program Files. This directory contains all the files needed to create your applications. Look at the docs directory; there you will find very useful documentation.

1. Creating a Dialog-based application

 

All applications presented here will be simple dialog-based applications. This kind of application can easily be created with Visual C++ and is a good way to obtain a pleasant interactive application. On your Visual C++ menu bar, select the File|New|Project… option, choose MFC Application and select a name for your application (here cvision).

With Visual C++, you can construct a solution made of several projects (each project is basically one program). So if you build a multi-program application, for example a client and a server application, solutions are very useful because you can group your projects together and have them share files and libraries. Usually you create one master directory for your solution that contains all the directories of your projects. In our case, we will incrementally build one project, so I chose to uncheck the Create directory for solution option, which means that the solution and its unique project will be put into a single directory. Throughout this tutorial, you will have access to the different versions of this project. When you become more familiar with VC++, you should take advantage of multi-project solutions.

Once you click OK, you will be brought to the MFC Application Wizard, which lets you select different options for your GUI. At this point simply select the Dialog-based option. Other options are available and I invite you to explore them, but for now the standard settings are mostly fine. Note however that the Unicode option, which is turned on by default in VC++ 2005, should be unchecked. If you leave that option checked, your application will use the larger 16-bit Unicode character set (giving you access to many international characters); this can be useful, but since many functions require char * (ASCII strings), you would have to convert your strings from Unicode to ASCII, which can be painful. It is therefore simpler to turn that option off for now. Remember that if you get errors like these:

      cannot convert parameter 1 from 'CString' to 'const char *'

      cannot convert from 'const char [11]' to 'LPCWSTR'

it means that you have conversion problems between Unicode and multi-byte strings.

VC++ should create a simple OK/Cancel dialog for you. The class whose name ends in Dlg (here cvisionDlg) will contain the member functions that control the widgets of the dialog. Never touch the other files.

The first task will be to open and display an image. To do this, we will first add a button that will allow us to select the file that contains the image. Go to the Resource View and drag a button onto the dialog. You can also resize the dialog and take the time to look at all the widgets available in the toolbox. Change the caption of the button (see the Properties panel) to Open Image. Right-click on the button and select Add Event Handler…; this will allow you to specify the name of the handler method that will be called when the user clicks this button.

Now if you compile and run the application, the dialog should look like this:

You probably compiled and ran the application in Debug mode; this is why a Debug directory has been created inside your project directory. The Debug mode is there to help you create and debug your application. It is a more protected environment, but it generates slower executable files. Once your application is ready, do not forget to compile it in Release mode, which produces the real executable that your users will run. When you do so, a Release directory will appear inside your project.

Let us now use the CFileDialog class to create a file dialog. It will show up once we add the following code to the OnOpen member function:

void CcvisionDlg::OnOpen()
{
  CFileDialog dlg(TRUE, _T("*.bmp"), NULL,
    OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
    _T("image files (*.bmp; *.jpg) |*.bmp;*.jpg|All Files (*.*)|*.*||"),NULL);

  dlg.m_ofn.lpstrTitle= _T("Open Image");

  if (dlg.DoModal() == IDOK) {
    CString path= dlg.GetPathName();  // contains the selected filename
  }
}

Note how the extensions of interest (here .bmp and .jpg) for the files to be opened are specified using the fourth argument of the CFileDialog constructor. Just for your information, in C++, when you write "Open Image" you generate a regular narrow ASCII string; if you want a wide Unicode string to be generated instead, you have to write L"Open Image". There is, however, a better solution: if you use _T("Open Image"), you tell the compiler to format the string using the current character set, which is much more flexible. In our case, we use the ASCII byte format, so the _T conversions are not strictly required.

Now, by clicking on the Open Image button, the following dialog appears:

2. Loading and displaying an image

 

Now that we have learnt how to select a file, let’s load and display the corresponding image. The Intel libraries will help us accomplish this task. In particular, the HighGui component of OpenCV will be put to use; it contains the functions required to load, save and display images under the Windows environment.

Since we will be using these libraries in all the examples to follow, we will first see how to set up our VC++ projects so that the libraries are linked to our application. One option would be to go under Project|Properties…, but then you would have to repeat this sequence for all your projects. It is therefore a good idea to set up your environment so that it always remembers where to find the OpenCV files to include and link into your projects. Go to Tools|Options…, select the VC++ Directories tab and the category Include Files. Add the following directories to the additional include directories:

      C:/Program Files/OpenCV/cv/include

      C:/Program Files/OpenCV/cxcore/include

      C:/Program Files/OpenCV/otherlibs/highgui

      C:/Program Files/OpenCV/filters/ProxyTrans (you will need ProxyTrans when you process video sequences)

Now select the Library Files tab and add this library path:

      C:/Program Files/OpenCV/lib

With these global settings, only the names of the library modules need to be specified when a new project is created. Go to Project|Properties… and enter the names of the three main components of the OpenCV library:

      cxcore.lib, which mainly contains the basic data structures;

      cv.lib, which contains the computer vision functions;

      highgui.lib, a basic tool for displaying and saving images.

Now if you want to create an application that uses the OpenCV classes and functions, you must first include the header file highgui.h in your xxxDlg.h file. Note that highgui.h already includes cxcore.h. Here is a modified version of the OnOpen handler:

void CcvisionDlg::OnOpen()
{
  CFileDialog dlg(TRUE, _T("*.bmp"), NULL,
    OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
    _T("image files (*.bmp; *.jpg) |*.bmp;*.jpg|All Files (*.*)|*.*||"),NULL);

  dlg.m_ofn.lpstrTitle= _T("Open Image");

  if (dlg.DoModal() == IDOK) {
    CString path= dlg.GetPathName();      // contains the selected filename
    IplImage *image;                      // this is the image pointer
    image= cvLoadImage(path);             // load the image
    cvShowImage("Original Image", image); // display it
  }
}

The function names starting with cv are OpenCV functions.  IplImage is the data structure that contains the image under OpenCV. With this modification, your program should load and display an image. Note that the cvNamedWindow function that creates the window should be called only once which means that you can put it in the OnInitDialog method of your xxxDlg.cpp file.

BOOL CcvisionDlg::OnInitDialog()
{
  CDialog::OnInitDialog();
  .
  .
  .
  // TODO: Add extra initialization here
  cvNamedWindow("Original Image");  // create the window on which
                                    // the image will be displayed
  return TRUE;  // return TRUE unless you set the focus to a control
}

Also, for your program to terminate cleanly, you must add a call to cvDestroyAllWindows(), which closes all the highgui windows you may have created. This call should be made in the OK and Cancel button handlers (you create these by double-clicking on the widgets in the resource view).

void CcvisionDlg::OnBnClickedOk()
{
  // TODO: Add your control notification handler code here
  cvDestroyAllWindows();
  OnOK();
}

void CcvisionDlg::OnBnClickedCancel()
{
  // TODO: Add your control notification handler code here
  cvDestroyAllWindows();
  OnCancel();
}

When running this application and selecting an image, you should obtain:

Check point #1: source code of the above example.

3. Processing an image

 

Now let’s try to call one of the OpenCV functions. First, as mentioned above, the computer vision functions are found in the cv.lib component. The header file cv.h must therefore be included. We then rewrite the handler function as follows:

void CcvisionDlg::OnOpen()
{
  CFileDialog dlg(TRUE, _T("*.bmp"), NULL,
    OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
    _T("image files (*.bmp; *.jpg) |*.bmp;*.jpg|All Files (*.*)|*.*||"),NULL);

  dlg.m_ofn.lpstrTitle= _T("Open Image");

  if (dlg.DoModal() == IDOK) {
    CString path= dlg.GetPathName();       // contains the selected filename
    IplImage *image;                       // this is the image pointer
    image= cvLoadImage(path);              // load the image
    cvErode(image,image,0,3);              // process it
    cvShowImage("Processed Image", image); // display it
  }
}

In this example, the processing consists in the application of a simple morphological operation, erosion (cvErode). And the result is:

This example is particularly simple because, for the operator used, the processing can be done in place (the same image is used for input and output). In general, you need a distinct output image. Therefore, when you process an image, you will typically i) open an image; ii) create an output image of the same size; and iii) process the image, writing the result into the output image. In addition, you must release the memory you dynamically allocated when creating the output image. In our example, this procedure is realized as follows:

void CcvisionDlg::OnOpen()
{
  CFileDialog dlg(TRUE, _T("*.bmp"), NULL,
    OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
    _T("image files (*.bmp; *.jpg) |*.bmp;*.jpg|All Files (*.*)|*.*||"),NULL);

  dlg.m_ofn.lpstrTitle= _T("Open Image");

  if (dlg.DoModal() == IDOK) {
    CString path= dlg.GetPathName();        // contains the selected filename
    IplImage *image;                        // image pointer (input)
    IplImage *output;                       // image pointer (output)
    image= cvLoadImage(path);               // load the image
    cvShowImage("Original Image", image);   // display it
    // output image memory allocation
    output= cvCreateImage(cvSize(image->width,image->height),
                          image->depth, image->nChannels);
    cvErode(image,output,0,3);              // process it
    cvShowImage("Processed Image", output); // display it
    cvReleaseImage(&output);
  }
}

When you create an image with OpenCV, you have to specify three parameters. The first one is the image size, specified using a struct called CvSize that contains the width and the height. The second one is the depth of the image, which specifies the data type associated with each pixel. Normally, an image is made of 8-bit pixels (i.e. values from 0 to 255), but you can also create images made of integers or of floating point values. The different types are specified using defined constants, the main ones being IPL_DEPTH_8U (unsigned char), IPL_DEPTH_16S (signed integer) and IPL_DEPTH_32F (single precision floating point number). Finally, the third parameter is the number of channels, which is 1 for a gray-level image and 3 for a color image. In the case of a color image, the channels are interleaved, meaning that the image data is arranged such that the 3 channel values of a pixel are given in sequence, i.e. Blue channel of pixel 0, Green channel of pixel 0, Red channel of pixel 0, followed by B of 1, G of 1, R of 1, then B of 2, G of 2, R of 2, etc. In the code above, the output image is formatted to be identical to the input image.

Check point #2: source code of the above example.

4. Processing an image using the Strategy design pattern

 

The preceding example has several problems. First, the output image is allocated inside the OnOpen method using a local variable. This means that the image also has to be de-allocated inside this same method; it would therefore be awkward to perform additional processing on it. Also, when you process several images, the output image is allocated and de-allocated for each input image. This is a waste of resources if all the images have the same size (in that case, the same output image should be reused). More importantly, the processing is done inside the GUI class, which violates a fundamental principle of good programming design: the processing aspects of your program should be separated from the GUI management aspects.

A separate class will therefore be created as a container of the image processing task. The Strategy Pattern is a software design pattern that is used to encapsulate an algorithm into a class. The pattern is often used as a mechanism to select an algorithm at run-time. In our case, it will facilitate the interchange and the deployment of our image processing algorithms inside more complex computer vision systems.

Here is then the general structure of our processing classes:

#if !defined PROCESSOR
#define PROCESSOR

#include "cv.h"

class Processor {
  private:
    // private attributes
  public:
    // empty constructor
    Processor() {
      // default parameter initialization here
    }
    // add here all getters and setters for the parameters
    // checks if an initialization is required
    virtual bool isInitialized(IplImage *image)=0;
    // for all memory allocation
    virtual void initialize(IplImage *image)=0;
    // the processing of the image
    virtual void process(IplImage *image)=0;
    // the method that checks for initialization
    // and then processes the image
    inline void processImage(IplImage *image) {
      if (!isInitialized(image)) {
        initialize(image);
      }
      process(image);
    }
    // memory de-allocation
    virtual void release()=0;
    ~Processor() {
      release();
    }
};

#endif

First, we start with a 0-parameter default constructor. This makes program initialization much easier, because all objects can then be created without requiring any a priori knowledge. The constructor simply makes sure that the object is in a valid state by initializing all the parameters to their default values. You also include setters and getters for your parameters so that the user can change them at run-time, using the GUI for example.

Memory allocation has to be accomplished by a separate method. The reason is that, to allocate memory for the images, we have to know the size of the input image. The goal of the isInitialized method is to check whether an initialization is required; this can happen for two reasons: i) memory allocation has not been performed yet (all pointers are set to NULL), or ii) the new image has a different size from the previously processed image, in which case we need to de-allocate all previously allocated memory and then allocate new memory. Note that in the current design, the user has the choice to call isInitialized explicitly when required and then process, or to let the class systematically check whether an initialization is required by calling the processImage method.

The Processor class could be used as a base class for the image processing classes to be created. However, in computer vision, and especially in video processing, computational efficiency is a must. The cost of calling a virtual method can be too high (the overhead is on the order of 10% to 20%). So we will instead use this class as a model. For erosion, the class would be written as:

class Eroder {
  private:
    // private attributes
    IplImage *output;
    int nIterations;
  public:
    // empty constructor
    Eroder() : output(0), nIterations(DEFAULT_NITERATIONS) {
      // default parameter initialization here
    }
    // getters and setters
    void setNumberOfIterations(int n) {
      nIterations= n;
    }
    int getNumberOfIterations() {
      return nIterations;
    }
    IplImage* getOutputImage() {
      return output;
    }
    // checks if an initialization is required
    bool isInitialized(IplImage *image) {
      return output && (output->width == image->width)
                    && (output->height == image->height);
    }
    // for all memory allocation
    void initialize(IplImage *image) {
      cvReleaseImage(&output);
      output= cvCreateImage(cvSize(image->width,image->height),
                            image->depth, image->nChannels);
    }
    // the processing of the image
    void process(IplImage *image) {
      cvErode(image, output, 0, nIterations);
    }
    // the method that checks for initialization
    // and then processes the image
    inline void processImage(IplImage *image) {
      if (!isInitialized(image)) {
        initialize(image);
      }
      process(image);
    }
    // memory de-allocation
    void release() {
      cvReleaseImage(&output);
    }
    ~Eroder() {
      release();
    }
};

The corresponding object is created in the application simply as an instance variable in the dialog class (file xxxDlg.h); the same goes for the input image pointer.

      IplImage *image;  // this is the image pointer (input)
      Eroder eroder;    // the image processor as an automatic variable

An automatic variable is used here for the processor class (which implies automatic object instantiation); dynamic allocation could also have been used. We also decided to modify the GUI to separate image loading from image processing.

Then the handlers of the two buttons are:

void CcvisionDlg::OnOpen()
{
  CFileDialog dlg(TRUE, _T("*.bmp"), NULL,
    OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
    _T("image files (*.bmp; *.jpg) |*.bmp;*.jpg|All Files (*.*)|*.*||"),NULL);

  dlg.m_ofn.lpstrTitle= _T("Open Image");

  if (dlg.DoModal() == IDOK) {
    CString path= dlg.GetPathName();       // contains the selected filename
    image= cvLoadImage(path);              // load the image
    cvShowImage("Original Image", image);  // display it
  }
}

and:

void CcvisionDlg::OnProcess()
{
  eroder.processImage(image);
  cvShowImage("Processed Image", eroder.getOutputImage());
}

Test the application with images of different sizes. Make sure you understand how the output image is reallocated when necessary. Also, we use here the default value for the (unique) parameter of the processor; a call to setNumberOfIterations would be made if one wanted to change this value.

Check point #3: source code of the above example.

The encapsulation of the algorithm into a class is there to facilitate its deployment; proper initialization and memory management are done by the class. There is, however, one major flaw in this design: the method getOutputImage returns a pointer to a dynamically allocated instance variable. This is unsafe because this pointer can be deleted by the class’s destructor, and a returned pointer that had been stored in some variable of the application would then dangle; or worse, the application could release the allocated memory without permission. There are two solutions to this problem. The first one, which is the one adopted by OpenCV, consists in requiring the user of an algorithm to provide all the memory buffers that are needed. In our case, this would mean passing an output image together with the input image when calling the process method. This approach works well in a procedural paradigm, but under the object-oriented paradigm it is not very convenient: we want the class to manage all aspects of the algorithm itself. Encapsulation and information hiding are, after all, two key principles of object-oriented programming.

The second solution would consist in using smart pointers. Smart pointers are special classes that take care of memory de-allocation; in a sense, they play the role of a garbage collector. However, in order to keep this tutorial as simple as possible, we will not use this solution here. We will simply accept our ‘returned pointer’ design flaw and work under the (unreasonable) assumption that programmers will make good use of the class and of its returned pointers.

5. Accessing the pixels of an image

 

So far, we have used an OpenCV function to perform the processing on our image. To apply our own processing algorithms, we need to be able to access each pixel of an image, read its values and possibly assign new values to it. This is easily achieved because all the pixels of an IplImage are simply stored in an array of bytes: the image buffer. The size of this buffer is simply the width times the height of the image times the number of channels.

In the following example, the number of colors in an image is reduced by using a divisor that subdivides the RGB color space into cubes of equal size. Each color is assigned the value corresponding to the middle point of the cube that contains it. We implement this algorithm in place (i.e. the original image is modified). The processing class is therefore very simple, since no initialization is required. Here is the processing method:

void ColorProcessor::process(IplImage *image) {
  int nl= image->height;                   // number of lines
  int nc= image->width * image->nChannels; // total number of elements per line
  int step= image->widthStep;              // effective width
  // get the pointer to the image buffer
  unsigned char *data= reinterpret_cast<unsigned char *>(image->imageData);

  for (int i=0; i<nl; i++) {
    for (int j=0; j<nc; j+= image->nChannels) {
      // process each pixel ---------------------
      data[j]= data[j]/div * div + div/2;
      data[j+1]= data[j+1]/div * div + div/2;
      data[j+2]= data[j+2]/div * div + div/2;
      // end of pixel processing ----------------
    } // end of line
    data+= step;  // next line
  }
}

Using the default divider value of 64 (4x4x4 colors) the following image is obtained:

Note that for efficient use of the MMX capabilities of the processor, the line length of an image should always be a multiple of 8 bytes. To ensure that this condition is always met, an image created under OpenCV is automatically padded with dummy pixels if necessary. This explains the role of the widthStep attribute: the nominal width of the image is given by width, but if the image has been extended to become quad-word aligned, the effective line length is larger, as given by widthStep.

Check point #4: source code of the above example.

Although this is the most efficient way to scan an image, the process can be error-prone. To simplify this frequent task, an image iterator can be introduced. The role of this iterator template is to take care of the pointer manipulation involved in the processing of an image. The template is as follows:

template <class PEL>
class IplImageIterator {
  int i, j, i0;
  PEL* data;
  int step;
  int nl, nc;
  int nch;

 public:
  /* constructor */
  IplImageIterator(IplImage* image,
     int x=0, int y=0, int dx= 0, int dy=0) :
       i(x), j(y), i0(0) {
    data= reinterpret_cast<PEL*>(image->imageData);
    step= image->widthStep / sizeof(PEL);
    CvRect rect= cvGetImageROI(image);
    nl= rect.height;
    nc= rect.width;
    x+= rect.x;
    y+= rect.y;
    if ((y+dy)>0 && (y+dy)<nl) nl= y+dy;
    if (y<0 || y>=nl) j=0;
    data+= step*j;
    if ((x+dx)>0 && (x+dx)<nc) nc= x+dx;
    nc*= image->nChannels;
    if (x>0 && x<nc) i0= x*image->nChannels;
    i= i0;
    nch= image->nChannels;
  }

  /* has next? */
  bool operator!() const { return j < nl; }

  /* next pixel or next color component */
  IplImageIterator& operator++() {
    i++;
    if (i >= nc) {
      i=i0;
      j++;
      data+= step;
    }
    return *this;
  }
  const IplImageIterator operator++(int) {
    IplImageIterator<PEL> copy(*this);
    ++(*this);
    return copy;
  }
  IplImageIterator& operator+=(int s) {
    i+= s;
    if (i >= nc) {
      i=i0;
      j++;
      data+= step;
    }
    return *this;
  }

  /* pixel access */
  PEL& operator*() {
    return data[i];
  }
  const PEL operator*() const {
    return data[i];
  }
  const PEL neighbor(int dx, int dy) const {
    return *(data+dy*step+i+dx*nch);
  }
  PEL* operator&() const {
    return data+i;
  }

  /* current pixel coordinates */
  int column() const {
    return i/nch;
  }
  int line() const {
    return j;
  }
};

template <class PEL>
class IplMultiImageIterator {
  int i, j, i0;
  PEL** data;
  int step;
  int nl, nc;
  int nch;
  int nimages;

 public:
  /* constructor */
  IplMultiImageIterator(IplImage** images, int n,
     int x=0, int y=0, int dx= 0, int dy=0) :
       i(x), j(y), i0(0) {
    nimages= n;
    data= new PEL*[nimages];
    for (int k=0; k<nimages; k++) {
      data[k]= reinterpret_cast<PEL*>(images[k]->imageData);
    }
    step= images[0]->widthStep / sizeof(PEL);
    CvRect rect= cvGetImageROI(images[0]);
    nl= rect.height;
    nc= rect.width;
    x+= rect.x;
    y+= rect.y;
    if ((y+dy)>0 && (y+dy)<nl) nl= y+dy;
    if (y<0 || y>=nl) j=0;
    for (int k=0; k<nimages; k++) {
      data[k]+= step*j;
    }
    if ((x+dx)>0 && (x+dx)<nc) nc= x+dx;
    nc*= images[0]->nChannels;
    if (x>0 && x<nc) i0= x*images[0]->nChannels;
    i= i0;
    nch= images[0]->nChannels;
  }
  ~IplMultiImageIterator() {
    delete[] data;
  }

  /* has next? */
  bool operator!() const { return j < nl; }

  /* next pixel or next color component */
  IplMultiImageIterator& operator++() {
    i++;
    if (i >= nc) {
      i=i0;
      j++;
      for (int k=0; k<nimages; k++) {
        data[k]+= step;
      }
    }
    return *this;
  }
  const IplMultiImageIterator operator++(int) {
    IplMultiImageIterator<PEL> copy(*this);
    ++(*this);
    return copy;
  }
  IplMultiImageIterator& operator+=(int s) {
    i+= s;
    if (i >= nc) {
      i=i0;
      j++;
      for (int k=0; k<nimages; k++) {
        data[k]+= step;
      }
    }
    return *this;
  }

  /* pixel access */
  PEL& operator[](int n) {
    return data[n][i];
  }
  const PEL neighbor(int n, int dx, int dy) const {
    return *(data[n]+dy*step+i+dx*nch);
  }

  /* current pixel coordinates */
  int column() const {
    return i/nch;
  }
  int line() const {
    return j;
  }
};

An iterator of this type can be declared by specifying the type of the pixels in the image and by giving a pointer to the IplImage as argument to the iterator constructor, e.g.:

IplImageIterator<unsigned char> it(image);

Once the iterator is constructed, two operators can be used to iterate over an image: the ! operator tells whether we have reached the end of the image, and the * operator gives access to the current pixel. A typical loop will therefore look like this:

while (!it) {
  if (*it < 50) {
    *it= 0xFF;   // 255
  }
  ++it;
}

Note that if the image contains more than one channel, each iteration gives access to one channel of one pixel. This means that in the case of a color image, you have to iterate three times per pixel. In order to access all the components of a pixel at once, the & operator can be used; it returns a pointer to the current pixel’s channel values. For example, the previous color reduction example then looks like this (note how the iterator is incremented this time to make sure that we go from one pixel to the next):

void ColorProcessor::process(IplImage *image) {
  IplImageIterator<unsigned char> it(image);
  unsigned char* data;

  while (!it) {
    data= &it; // get pointer to current pixel
    data[0]= data[0]/div * div + div/2;
    data[1]= data[1]/div * div + div/2;
    data[2]= data[2]/div * div + div/2;
    it+= 3; // next pixel
  }
}

The use of image iterators is as efficient as directly looping with pointers. This is true as long as you set the compiler to optimize for speed (Project|Properties|C++|Optimization). By default, there is no optimization in Debug mode and the code is optimized for speed in Release mode.

When the processing involves more than one image, more than one iterator can be used. This is illustrated in the following example (a Sobel edge detector):

void Sobel::process(IplImage *image) {
  IplImageIterator<unsigned char>
    src(image,1,1,image->width-2,image->height-2);
  IplImageIterator<unsigned char>
    res(output,1,1,image->width-2,image->height-2);
  int sobel;

  while (!src) {
    sobel= abs(  src.neighbor(-1,-1) -   src.neighbor(1,-1) +
               2*src.neighbor(-1,0)  - 2*src.neighbor(1,0) +
                 src.neighbor(-1,1)  -   src.neighbor(1,1));
    sobel+=abs(  src.neighbor(-1,-1) -   src.neighbor(-1,1) +
               2*src.neighbor(0,-1)  - 2*src.neighbor(0,1) +
                 src.neighbor(1,-1)  -   src.neighbor(1,1));
    *res= sobel > 255 ? 255 : sobel;
    ++src;
    ++res;
  }
}

Since the processing here involves not only the current pixel but also its neighbors, the neighbor method defined by the iterator is used. In addition, a window is specified when the iterators are created (here it excludes a 1-pixel strip around the image in which no processing is undertaken); this is required because otherwise the neighbors of the border pixels would fall outside the image. The resulting image is:

Check point 5: source code of the above example.
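For reference, the same 3×3 Sobel magnitude (|Gx| + |Gy|, saturated at 255, 1-pixel border left untouched) can be written against a plain grayscale buffer, independent of OpenCV and the iterator class (a sketch, not the tutorial's code):

```cpp
#include <cassert>
#include <cstdlib>

// Sobel gradient magnitude on a width x height grayscale buffer.
// Offsets relative to the current pixel p: -width-1 is up-left, +width+1
// is down-right, matching src.neighbor(dx,dy) in the text.
inline void sobelMagnitude(const unsigned char* src, unsigned char* dst,
                           int width, int height) {
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            const unsigned char* p = src + y * width + x;
            int gx = abs(  p[-width-1] -   p[-width+1]
                       + 2*p[-1]       - 2*p[1]
                       +   p[width-1]  -   p[width+1]);
            int gy = abs(  p[-width-1] -   p[width-1]
                       + 2*p[-width]   - 2*p[width]
                       +   p[-width+1] -   p[width+1]);
            int g = gx + gy;
            dst[y * width + x] = (unsigned char)(g > 255 ? 255 : g);
        }
    }
}
```

A flat region produces 0, and a strong vertical edge saturates to 255, as in the output image of the tutorial.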

 

5. Processing several images

 

Now, to process a series of images, we need to be able to select multiple images from the dialog box. This can be done by specifying the OFN_ALLOWMULTISELECT flag when constructing the CFileDialog. However, the buffer created to hold the selected filenames is quite small by default, so it is preferable to allocate this buffer yourself. A pointer to the buffer is held in the member variable lpstrFile; you therefore dynamically allocate a suitable amount of memory for it. You can base its size on the MAX_PATH constant, which specifies the maximal number of characters a path may contain. The new OnOpen method is therefore as follows:

void CcvisionDlg::OnOpen()

{

  CFileDialog dlg(TRUE, _T("*.avi"), "",                   

   OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY|OFN_ALLOWMULTISELECT ,

   "Image files (*.bmp;*.jpg;*.jpeg;*.png;*.pgm;*.ppm;*.tif;*.tiff) |*.bmp;*.jpg;*.jpeg;*.png;*.pgm;*.ppm;*.tif;*.tiff|All Files (*.*)|*.*||",NULL);

  char title[]= {"Open Image Sequence"};

  const int maxFiles = 200; // you can select a maximum of 200 files

  // MAX_PATH is a constant specifying

  // the maximal number of characters in a path

  const int bufferSize = (maxFiles * (MAX_PATH + 1)) + 1; 

  dlg.m_ofn.lpstrTitle= title;

  dlg.m_ofn.lpstrFile= new TCHAR[bufferSize];

  dlg.m_ofn.lpstrFile[0]= '\0';

  dlg.m_ofn.nMaxFile = bufferSize;  // size of the lpstrFile buffer, in characters

  if (dlg.DoModal() == IDOK) {

      std::vector<CString> files;

      // iterate over the buffer and extract each filename

      POSITION pos= dlg.GetStartPosition();

      while (pos) {

            files.push_back(dlg.GetNextPathName(pos));

      }

      // do something with the vector of filename

  }

  delete[] dlg.m_ofn.lpstrFile;

}

Then, when you select multiple files using the dialog, all the filenames are inserted into the buffer. The CFileDialog class also offers functions to iterate over the buffer and then extract all the individual filenames. In our example, these filenames are inserted into a vector of CString in order to facilitate their manipulation. The next step is to go over this vector, open each corresponding image and process it. Following our good programming practice, we will create a class that will accomplish this specific task.
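Under the hood, with OFN_ALLOWMULTISELECT (Explorer style) the dialog fills lpstrFile with a sequence of NUL-separated strings terminated by a double NUL: the directory first, then each selected filename; with a single selection the buffer holds one full path. GetStartPosition/GetNextPathName iterate over exactly this layout; here is a sketch of how such a buffer could be unpacked by hand (hypothetical helper, independent of MFC):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Parse the double-NUL-terminated buffer filled by the common open dialog
// when OFN_ALLOWMULTISELECT is set: first string is the directory, the
// following ones are filenames; a single selection is just one full path.
inline std::vector<std::string> parseMultiSelect(const char* buf) {
    std::vector<std::string> parts;
    while (*buf) {                        // stops at the double NUL
        parts.push_back(buf);
        buf += parts.back().size() + 1;   // skip the string and its terminator
    }
    std::vector<std::string> paths;
    if (parts.size() <= 1) {              // single selection: one full path
        paths = parts;
    } else {                              // directory + filenames
        for (size_t i = 1; i < parts.size(); i++)
            paths.push_back(parts[0] + "\\" + parts[i]);
    }
    return paths;
}
```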

A vector of strings is given to this class (using the corresponding setter method) and, by calling its run() method, the processing of each image is started. For now, let us simply display these images. Also, since the processing can be quite fast in some cases, we have added a pause parameter that makes the process sleep for a given duration between images. Here is the class:

class Multim {

        std::vector<CString> files;

        DWORD pause;

  public:

        Multim() : pause(0) {

              cvNamedWindow("Image");

        }

        ~Multim() {

              cvDestroyWindow("Image");

        }

        void setFiles(std::vector<CString> &f) {

              files= f;

        }

        void setPause(DWORD p) {

              pause= p;

        }

        DWORD getPause() {

              return pause;

        }

        void run() {

            IplImage *image;

            for (int i=0; i<files.size(); i++) {

                  image= cvLoadImage(files[i],-1);

                  cvShowImage( "Image", image );

                  HWND hWnd= (HWND)cvGetWindowHandle("Image");

                  UpdateWindow(hWnd); // force the window to repaint

                  Sleep(pause);

                  cvReleaseImage(&image);

            }

        }

};

Note how, in the run() method, the image display window is forced to repaint. This is required because window repainting has a low priority and would normally not occur until the processing loop has completed.

Check point 6: source code of the above example.

The Multim class allows us to loop over a series of images. In order to process these images we must add a call to a processing function. We will pass this function to the Multim class through a function pointer. Let’s add a setter method in the class to set this function pointer:

        void Multim::setImageProcess(void (*p)(IplImage *)) {

              process= p;

        }

where process is defined as:

        void (*process)(IplImage *);

We then add the call to process in the run() method:

        void run() {

            IplImage *image;

            for (int i=0; i<files.size(); i++) {

                  image= cvLoadImage(files[i],-1);

                  process(image); // process the image

                  cvShowImage( "Image", image );

                  HWND hWnd= (HWND)cvGetWindowHandle("Image");

                  UpdateWindow(hWnd); // force the window to repaint

                  Sleep(pause);

                  cvReleaseImage(&image);

            }

        }

The process function can be defined as follows:

Sobel *sobel;

void process(IplImage* img) {

      sobel->processImage(img);

      cvShowImage( "Result", sobel->getOutputImage());

      HWND hWnd= (HWND)cvGetWindowHandle("Result");

      UpdateWindow(hWnd); // force the window to repaint

}

Since the instance of the processing class (here Sobel) must be accessible from both the process function and the dialog class, we use a global variable (argh!…) that points to an instance of the Sobel class. This instance is created in the dialog (in the OnInitDialog method). The dialog is also responsible for setting the parameters of the processing classes; there are none in the case of the Sobel class, but the Multim class has some parameters to set:

BOOL CcvisionDlg::OnInitDialog()

{

      .

      .

      .

      sobel= new Sobel();

      multi.setImageProcess(process);

      multi.setPause(1000); // number of ms between each image processing

      return TRUE;

}

 

Obviously, most of the time, we would like the user to set (some of) the parameter values interactively. With Visual C++, an edit control can be added to the dialog through which the user can enter the desired values. Just go to the Resource view and, using the Toolbox, drag and drop an Edit Control. You can also drop a Static Text to give this text field an appropriate label.

You can associate a variable with this Edit Control by right-clicking on it and selecting the Add Variable… option. Select Value for the Category, and also specify the Variable Type and the Variable Name of the variable to be created. In our case, we will give the user the possibility to specify the pause duration between images, so we select the int type.

Once you click finish, go to the xxxDialog.h file and you will notice that a new variable has been created.

class CcvisionDlg : public CDialog

{

.

.

.

public:

      afx_msg void OnOpen();

      afx_msg void OnBnClickedOk();

      afx_msg void OnBnClickedCancel();

      afx_msg void OnProcess();

      int duration;

};

This variable and the Edit Control are linked together, and the function UpdateData moves the content of one into the other. Calling it with false as argument displays the value of the variable in the Edit Control. This is useful to provide a default value that appears when the dialog starts. For example, with this in OnInitDialog:

      duration= 1000; // default value

      UpdateData(false); // display the content of the variable in Edit Control

You obtain:

Conversely, calling UpdateData with true copies the value currently in the Edit Control into the variable. You do this before performing the processing, i.e. when the Process button is pushed:

void CcvisionDlg::OnProcess()

{

      UpdateData(true); // move the value from Edit Control to variable

      multi.setPause(duration); // number of ms between each image processing

      multi.run();

}

Check point 7: source code of the above example.

We will now redesign the preceding example. One of the problems with the previous program lies in the fact that the application is intrinsically coupled with the graphical interface. A better approach consists in creating a controller class responsible for mapping user actions onto the application model. The controller class is a bridge between the user interface and the core application; it defines the application behavior and the way users can interact with it (that is, the API). It greatly facilitates the maintenance of both the GUI and the application, and it is part of the more general Model-View-Controller design pattern.

If we continue with our application example that computes the Sobel of a series of images, the corresponding controller would offer the user the possibility to specify the filenames to be processed, to set (and get) the pause between each processing, and to start the processing loop. Here is this controller:

class SobelController {

  private:

      Multim *multi;

      Sobel *sobel;

  public:

      SobelController() {

              //setting up the application

              multi= new Multim();

              sobel= new Sobel();

              multi->setImageProcess(process);

      }

        void setPause(int ms) {

              multi->setPause(ms);

        }

        int getPause() {

              return multi->getPause();

        }

        void setFiles(std::vector<CString> &f) {

              multi->setFiles(f);

        }

        void run() {

              multi->run();

        }

        // to be called by the callback

        inline void processImage(IplImage *image) {

              sobel->processImage(image);

        }

        inline IplImage* getOutputImage() {

              return sobel->getOutputImage();

        }

        ~SobelController() {

              delete multi;

              delete sobel;

        }

};

As you can see, this class is also responsible for constructing the class instances of the application. All the methods are short: they simply delegate the user request to the appropriate class of the application, which is exactly the responsibility of a controller. Note that one of the methods, processImage, is meant to be called by the process callback function and not by the user.

Now, to avoid the inelegant global variable of the previous example, we will use another well-known design pattern: the singleton. Using a singleton also guarantees that only one controller exists, as it should. The key idea is simple: the constructor of the class is made private so that no instance can be created by an external class or function. A static variable, here called singleton, contains the address of the unique instance, which is accessible through a public static method called getInstance:

        static Controller *getInstance() {

              if (singleton == 0)

                  singleton= new Controller;

              return singleton;

        }

This method creates the class instance when called for the first time and simply returns it on subsequent calls. Once added to our controller, the complete class looks as follows:

#if !defined CNTRLLR

#define CNTRLLR

#include "cv.h"

#include "multim.h"

#include "sobel.h"

#include "process.h"

#include <vector> 

class SobelController {

  private:

      static SobelController *singleton; // pointer to the singleton

      Multim *multi;

      Sobel *sobel;

      SobelController() { // private constructor

              //setting up the application

              multi= new Multim();

              sobel= new Sobel();

            multi->setImageProcess(process);

      }

  public:

        void setPause(int ms) {

              multi->setPause(ms);

        }

        int getPause() {

              return multi->getPause();

        }

        void setFiles(std::vector<CString> &f) {

              multi->setFiles(f);

        }

        void run() {

              multi->run();

        }

        // to be called by the callback

        inline void processImage(IplImage *image) {

              sobel->processImage(image);

        }

        inline IplImage* getOutputImage() {

              return sobel->getOutputImage();

        }

        ~SobelController() {

              delete multi;

              delete sobel;

        }

        // Singleton static members

        static SobelController *getInstance() {

              if (singleton == 0)

                  singleton= new SobelController;

              return singleton;

        }

        static void destroy() {

              if (singleton != 0) {

                    delete singleton;

                    singleton= 0;

              }

        }

};

#endif

The process function then becomes (note how the singleton class is accessed):

void process(IplImage* img) {

      SobelController::getInstance()->processImage(img);

      cvShowImage( "Result",

                  SobelController::getInstance()->getOutputImage());

      HWND hWnd= (HWND)cvGetWindowHandle("Result");

      UpdateWindow(hWnd); // force the window to repaint

}

And the OnProcess method of the GUI becomes:

void CcvisionDlg::OnProcess()

{

      UpdateData(true); // move the value from the Edit Control to the variable

      SobelController::getInstance()->setPause(duration); // number of ms between each image processing

      SobelController::getInstance()->run();

}

And it is important to call the destroy method before the application is closed:

      SobelController::getInstance()->destroy();

Check point 8: source code of the above example.

6. Processing a video sequence

 

Processing an image sequence is relatively simple using the OpenCV video library. The first step is to specify the filename of the video sequence to be processed:

CvCapture* capture = cvCaptureFromAVI(filename);

If a camera is used instead of a video file, the function cvCreateCameraCapture should be called instead. Once this is done, you can process each frame by calling the cvQueryFrame function, which is in fact made of the two following calls:

if(cvGrabFrame(capture))

      IplImage* img=cvRetrieveFrame(capture);          

Note that the image thus obtained should not be released. At the end of the processing you simply call the releasing function:

cvReleaseCapture(&capture);

It is as simple as that. In practice, however, you will want to run the video processing function in a separate thread; otherwise your application will hang until the processing terminates. With a separate thread you can keep interacting with the user interface and eventually stop, pause, or resume the video processing. This is what we will do now.

But first, when you build an application with more than one thread, you must protect your data against concurrent access. That is, when the data of a class is manipulated by several threads, we must make sure that each manipulation is performed by one thread at a time. The standard way to ensure this protection is to use a mutex. Under MS Visual C++, this is achieved with the WaitForSingleObject function. First the mutex must be created:

HANDLE mut;

mut= CreateMutex(NULL, FALSE, NULL);

Then, for each portion of code you want to protect, you enclose the block of statements between the following two calls:

WaitForSingleObject(mut, INFINITE);

delay= d;

ReleaseMutex(mut);

You then have the guarantee that the enclosed portion of code will be executed by one and only one thread at a time. Be aware, however, that the use of a mutex can slow down your code. When the mutex is no longer needed, its handle must be closed (ReleaseMutex only releases ownership; CloseHandle destroys the handle):

CloseHandle(mut);
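For readers outside Win32, the same protect-with-a-mutex pattern can be sketched with the C++11 standard library, std::mutex playing the role of the Win32 mutex handle (a substitution for illustration, not the tutorial's code):

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Shared counter protected by a mutex: each lock/unlock pair plays the role
// of WaitForSingleObject / ReleaseMutex in the Win32 code above.
struct Shared {
    std::mutex mut;
    long counter;
    Shared() : counter(0) {}
    void increment(int times) {
        for (int i = 0; i < times; i++) {
            mut.lock();     // WaitForSingleObject(mut, INFINITE);
            counter++;      // protected section
            mut.unlock();   // ReleaseMutex(mut);
        }
    }
};

// Run two threads against the same counter.
inline long runTwoThreads(int times) {
    Shared s;
    std::thread t1(&Shared::increment, &s, times);
    std::thread t2(&Shared::increment, &s, times);
    t1.join();
    t2.join();
    return s.counter;
}
```

Without the lock/unlock pair the two threads would race on counter and the final value would be unpredictable.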

Let’s now see how a new thread can be created. This one is started as follows:

HANDLE hThread = (HANDLE)_beginthreadex( NULL, 0, processImages,

                                           reinterpret_cast<LPVOID>(this), 0, &threadID );

The third parameter is a pointer to the function that will be run as a new thread (here processImages). The fourth parameter is an argument passed to that function; it must be a void pointer. Usually you pass a pointer to an object containing the data the function requires. Remember that the call to _beginthreadex starts a new thread and is therefore non-blocking: execution continues with the next statements concurrently with the execution of the thread function. When that function terminates, it is recommended to end it with:

_endthreadex( 0 );

Finally, do not forget to release the thread handle to delete the thread:

CloseHandle( hThread );

We can now present the class responsible for managing the video stream. Its run method creates the thread that reads each frame of the sequence. The callback function called for each frame is specified through the setImageProcess method, while the name of the video file is specified using the setFile method. Here are the signatures of the different methods of this class:

class AviProcessor {

      .    

      .

      .   

  public:

        // Constructor

        AviProcessor();

        // Destructor

        ~AviProcessor();

        // set the callback function that will be called for each frame

        void setImageProcess(void (*p)(IplImage *));

        // the getter and setters to determine if you want the

        // callback to be called

        void callProcess();

        void dontCallProcess();

        bool callingProcess();

        // the getter and setter to determine if you want to display

        // the processed frames

        void setDisplay(bool b);

        bool isDisplayed();

        // the getter and setter to introduce a delay between each frame

        void setDelay(DWORD d);

        DWORD getDelay();

        // a count is kept of the current frame number

        long getFrameNumber();

        // size of the frames of the sequence

        CvSize getFrameSize();

        // set the name of the video file

        void setFile(char *filename);

        void setFile(CString filename);

        // to grab (and process) only one frame

        bool grabOneFrame();

        // to grab (and process) the frame sequence

        bool run();

        // to restart with a new video

        bool restart();

        // to stop the capture

        void stopIt();

        bool isStopped();

};

The numerous setters and getters allow controlling the behavior of the class. For example, you may want the current frame to be displayed or not (setDisplay)  or to be processed or not (callProcess, dontCallProcess) .  The run method is the one that creates the thread; it is simply defined as:

        bool run() {

          // make sure that a thread is not already running

          if (!isStopped())

                  return false;

          // destroy any previously created thread

          if (hThread != 0)

             CloseHandle( hThread );

          stop= false;

          // start the thread

          hThread = (HANDLE)_beginthreadex( NULL, 0,

                  processImages,reinterpret_cast<LPVOID>(this), 0, &threadID );

          return true;

        }

The new thread starts the execution of the function processImages (declared as a friend of class AviProcessor). This function is defined as follows:

unsigned __stdcall processImages( void *g) {

      AviProcessor *proc= reinterpret_cast<AviProcessor *>(g);

      while (!proc->isStopped()) {

            if(cvGrabFrame(proc->capture)) {

                  proc->img=cvRetrieveFrame(proc->capture);          

                  // calling the process function

                  if (proc->callingProcess())

                        proc->process(proc->img);

                  else

                        Sleep(proc->getDelay());

                  // displays image and frame number

                  if (proc->isDisplayed()) {

                        CvScalar sc;

                        char text[50];

                        sprintf(text,"Frame #%6d",proc->getFrameNumber());

                        sc= cvGet2D(proc->img,20,20);

                        if (sc.val[1]>128)

                              cvPutText(proc->img, text, cvPoint(20,20), &(proc->font), cvScalar(0,0,0));

                        else

                              cvPutText(proc->img, text, cvPoint(20,20), &(proc->font), cvScalar(255,255,255));

                        cvShowImage( "Image (AviProcessor)", proc->img );

                        HWND hWnd= (HWND)cvGetWindowHandle("Image (AviProcessor)");

                        ::SendMessage(hWnd,WM_PAINT,NULL,NULL); // force the window to repaint

                  }

                  // increment frame number

                  proc->incFrameNumber();

            } else {

                  proc->stopIt();

            }

      }

    _endthreadex( 0 );

      return 0;

}

Essentially, the function is a loop that grabs and retrieves video frames (cvGrabFrame, cvRetrieveFrame) until the process is stopped or no frame remains. For each frame, the processing function is called (proc->process(proc->img)). A series of statements then displays the current frame and overlays the frame number on it. There is also a very similar method in AviProcessor that captures only one frame and then returns; this is especially useful when you need to know the size of the frames in the video before starting the processing. This method is as follows:

        bool grabOneFrame() {

          // make sure that a thread is not already running

          if (!isStopped())

                  return false;

            if(cvGrabFrame(capture)) {

                  // gets the current image

                  img=cvRetrieveFrame(capture);          

                  // calling the process function

                  if (callingProcess())

                        process(img);

                  if (display) {

                        .

                        .

                        .

                  }

                  // increments current frame number

                  incFrameNumber();

            }

            return true;

        }

As can be seen, it is essentially the same as before but without the loop. Finally, the stopIt method stops the running thread by setting the stop variable to true. This is the classic way of stopping a thread; it lets the thread finish its current loop iteration before terminating cleanly.

        void stopIt() {

          WaitForSingleObject(mut, INFINITE);

              stop= true;

          ReleaseMutex(mut);

        }
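The cooperative-stop idea behind stopIt can likewise be sketched with the standard library, a std::atomic&lt;bool&gt; standing in for the mutex-protected stop variable (again a substitution for the Win32 code):

```cpp
#include <cassert>
#include <atomic>
#include <thread>

// Worker loop that re-checks a shared stop flag on every turn and exits
// cleanly when the flag is raised from another thread.
struct Worker {
    std::atomic<bool> stop;
    std::atomic<long> iterations;
    Worker() : stop(false), iterations(0) {}
    void run() {
        while (!stop)          // one check per loop turn
            iterations++;
    }
    void stopIt() { stop = true; }   // request a clean exit
};

// Start the worker, let it spin a little, then stop it.
inline long stopDemo() {
    Worker w;
    std::thread t(&Worker::run, &w);
    while (w.iterations < 100) {}    // busy-wait until some work was done
    w.stopIt();                      // raised from the "GUI" thread
    t.join();                        // worker exits at its next flag check
    return w.iterations;
}
```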

It is also possible to restart a new processing thread by releasing the previous CvCapture object and creating a new one:

        bool restart() {

          // make sure that a thread is not already running

          if (!isStopped())

                  return false;

            // destroy any previously created thread

            if (hThread != 0)

             CloseHandle( hThread );

            fnumber=0;

            stop= false;

          cvReleaseCapture(&capture);

            capture = cvCaptureFromAVI(aviName.GetBuffer());

            return true;

        }

If you call run instead, the processing resumes at the point where it previously stopped.

Now, to create an application using this video processing class, we will again use a controller. As before, we illustrate the video processing with the Sobel class. The controller therefore creates the AviProcessor and Sobel instances:

      SobelVideoController() { // private constructor

              //setting up the application

              aviproc= new AviProcessor();

              sobel= new Sobel();

              aviproc->setImageProcess(process);

              aviproc->callProcess();

              aviproc->setDisplay(true);

      }

The only controls we need are those that let us specify the video filename and start and stop the processing.

class SobelVideoController {

  private:

      static SobelVideoController *singleton; // pointer to the singleton

      AviProcessor *aviproc;

      Sobel *sobel;

        void run() {

              aviproc->run();

        }

        void stop() {

              aviproc->stopIt();

        }

        inline void setFile(CString filename) {

              aviproc->setFile(filename);

        }

      .

      .

      .

We also add two methods that will be used in the callback: one that performs the processing (i.e. call the appropriate method in Sobel) and the other to obtain the current frame:

        inline void processImage(IplImage *image) {

              sobel->processImage(image);

        }

        inline IplImage* getOutputImage() {

              return sobel->getOutputImage();

        }

The callback is then defined as follows:

void process(IplImage* img) {

      SobelVideoController::getInstance()->processImage(img);

      cvShowImage( "Result",

                  SobelVideoController::getInstance()->getOutputImage());

      HWND hWnd= (HWND)cvGetWindowHandle("Result");

      UpdateWindow(hWnd); // force the window to repaint

}

Now, the user interface is simply the following:

With this architecture, the two main buttons (Process and Stop) are trivially defined as:

void CcvisionDlg::OnProcess()

{

      SobelVideoController::getInstance()->run();

}

void CcvisionDlg::OnStop()

{

      SobelVideoController::getInstance()->stop();

}

Check point 9: source code of the above example.

Original article: http://www.site.uottawa.ca/~laganier/tutorial/opencv+directshow/cvision.htm
