
 Abstract.  We use a region growing segmentation technique to segment the DCT image. Based on the segmented regions, we select the region sizes in the compressed domain to construct indexing keys. Applying region growing to the DCT image reduces the data to be indexed to the segmented regions only. From these regions we construct the indexing keys, by calculating the distances between regions, and use them to match images. This reduces the time needed to construct the indexing keys. Recursive region growing is not a new technique, but its application to DCT images for building indexing keys is fairly new and has not yet been presented by many authors.

1   Introduction

In the field of digital imaging, image segmentation plays a vital role as a preliminary step for high-level image processing. To understand an image, one needs to isolate the objects in it and find the relations among them. The process of partitioning an image is referred to as image segmentation [8]. In other words, segmentation is used to pull the significant objects out of the image.

Deng and Manjunath [27] proposed the JSEG algorithm, which segments an image based on multiscale ‘J-images’. The ‘J-images’ are images that correspond to measurements of local homogeneity at different scales. The system can segment colour-textured images without supervision. First, the colours in the image are quantized into several classes. The pixels are then replaced by their corresponding colour class labels, which form the class map of the image. A region growing method is then used to segment the image based on the multiscale ‘J-images’.

Histogram thresholding is one of the common techniques for monochrome image segmentation [14,26].  This technique assumes that an image consists of different regions corresponding to grey level ranges. The histogram of an image can be separated using the peaks (modes) corresponding to the different regions. A threshold value corresponding to the valley between two adjacent peaks can be used to separate these objects [17]. One weakness of this method is that it ignores the spatial relationships between pixels. Guterman [4] proposed a neural network based adaptive thresholding segmentation algorithm for monochrome images. The main advantage of this method is that it does not require a priori knowledge of the number of objects in the image.

To humans, an image is not just a random collection of pixels; it is a meaningful arrangement of regions and objects. There also exists a variety of images, such as natural scenes and paintings. Despite the large variations among these images, humans have no problem interpreting them. Considering the large databases on the WWW and in our personal photograph folders, robust and automatic image analysis would be welcome. Image segmentation is the first step in image analysis and pattern recognition. It is a critical and essential component of an image analysis system, is one of the most difficult tasks in image processing, and determines the quality of the final result of the analysis. Image segmentation is the process of dividing an image into different regions such that each region is homogeneous.

Many content-based image retrieval (CBIR) systems have been developed since the early nineties. A recent article by Smeulders et al. [19] reviewed more than 200 references in this ever-changing field. Readers are referred to that article and some additional references [2, 8, 16, 20, 23, 24, 28] for more information. Most CBIR projects aimed at general-purpose image indexing and retrieval focus on searching for images that are visually similar to a query image or a query sketch. They cannot automatically assign comprehensive textual descriptions to pictures, because of the great difficulty of recognizing a large number of objects.

Many researchers have attempted to use machine-learning techniques for image indexing and retrieval [13,25]. A system developed by Minka and Picard included a learning component. The system internally generated many segmentations or groupings of each image’s regions based on different combinations of features, then learned which combinations best represented the semantic categories given as examples by the user. The system requires supervised training on various parts of the image.

The remainder of the paper is organized as follows. Section 2 describes the basics of JPEG compression. Section 3 discusses the indexing key method based on region growing segmentation. Section 4 describes the experimental results. Section 5, the final section, presents conclusions and some remarks on future work.

2   JPEG image basics

To this point, we have defined functions to compute the DCT of a list of length n=8 and the 2D DCT of an 8 x 8 array.  We have restricted our attention to this case partly for simplicity of exposition, and partly because the DCT is typically restricted to this size when it is used for image compression. Rather than transforming the image as a whole, the DCT is applied separately to 8 x 8 blocks of the image. We call this a blocked DCT.

To compute a blocked DCT, we do not actually have to divide the image into blocks. Since the 2D DCT is separable, we can partition each row into lists of length 8, apply the DCT to them, rejoin the resulting lists, and then transpose the whole image and repeat the process.
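The row-then-column procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in our system: the function names are hypothetical, the image is assumed to be a list of rows whose width and height are multiples of 8, and the orthonormal DCT-II scaling is an assumption. A final transpose is added so that the result has the same orientation as the input.

```python
import math

N = 8  # block length used by the blocked DCT


def dct_1d(values):
    """Orthonormal DCT-II of a short list (scaling is an assumption)."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out


def dct_rows_blocked(image):
    """Split every row into lists of length 8, transform each, and rejoin."""
    result = []
    for row in image:
        new_row = []
        for start in range(0, len(row), N):
            new_row.extend(dct_1d(row[start:start + N]))
        result.append(new_row)
    return result


def transpose(image):
    return [list(col) for col in zip(*image)]


def blocked_dct(image):
    """Blocked 2D DCT: rows, transpose, rows again, transpose back."""
    rows_done = dct_rows_blocked(image)
    cols_done = dct_rows_blocked(transpose(rows_done))
    return transpose(cols_done)


# A constant 16 x 16 image: each 8 x 8 block keeps only its DC coefficient.
flat = [[10.0] * 16 for _ in range(16)]
coeffs = blocked_dct(flat)
```

Because the 2D DCT is separable, no explicit 8 x 8 block extraction is needed; the chunking of rows (and then of columns, after the transpose) produces the same coefficients.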

DCT-based image compression relies on two techniques to reduce the data required to represent the image. The first is quantization of the DCT coefficients; the second is entropy coding of the quantized coefficients. Quantization is the process of reducing the number of possible values of a quantity, thereby reducing the number of bits needed to represent it. Entropy coding is a technique for representing the quantized data as compactly as possible.  A function can then be developed to quantize images and to calculate the level of compression provided by different degrees of quantization.

JPEG [22] is a joint CCITT and ISO standard for compressing images, developed by the Joint Photographic Experts Group. The JPEG scheme is rather complex, but a brief outline is all that is needed to understand our use of it for image retrieval. JPEG uses a combination of spatial-domain and frequency-domain coding. The image is divided into 8 x 8 blocks, each of which is transformed into the frequency domain using the discrete cosine transform (DCT). Each block of the image is thus represented by 64 frequency components. The signal tends to concentrate in the lower spatial frequencies, so that high-frequency components, many of which are usually zero, can be discarded without substantially affecting the appearance of the image.

The main source of information loss in JPEG is the quantization of the DCT coefficients. A table of quantization values is used, one per coefficient, usually related to human perception of the different frequencies. The quantized coefficients are ordered in a “zig-zag” sequence, starting at the upper left (the DC component), since most of the energy lies in the first few coefficients. The final step is entropy coding of the coefficients, using either Huffman coding or arithmetic coding.
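The quantization and zig-zag steps can be illustrated with a short sketch. The function names are hypothetical, and the uniform table used in the example is an assumption made for illustration only; it is not the table from the JPEG standard.

```python
def quantize(block, table):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(block[r][c] / table[r][c]) for c in range(8)]
            for r in range(8)]


def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order, starting at the DC term."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left up to top-right.
            diagonal.reverse()
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten an 8 x 8 block into a list of 64 values in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical example: a block with only a DC term and a uniform table.
block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 80.0
uniform_table = [[16] * 8 for _ in range(8)]
sequence = zigzag_scan(quantize(block, uniform_table))
```

After the scan, the non-zero values cluster at the front of the sequence and the long run of zeros at the end is what the entropy coder compresses well.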

3   Region growing based image indexing

Image segmentation is the first key process in numerous computer vision applications. It partitions the image into meaningful regions with homogeneous characteristics, using discontinuities or similarities of the image components, and the subsequent processes depend on its performance.  In most cases, segmentation of a colour image proves more useful than segmentation of a monochrome image, because a colour image expresses many more image features than a monochrome one. In fact, each pixel is characterized by a great number of combinations of R, G, B chromatic components.  However, more complicated segmentation techniques are required to deal with the rich chromatic information in colour images.

A variety of segmentation techniques have been proposed in the literature. However, most of them are a kind of “dimensional extension” directly inherited from monochrome image segmentation [3].  Spatial compactness and colour homogeneity are two desirable properties in unsupervised segmentation, and they lead to image-domain and feature-space based segmentation techniques, respectively. According to their spatial grouping strategy, image-domain techniques include split-and-merge, region growing, and edge detection techniques.

The segmentation of images has always been a key problem in computer vision. Up to the early nineties bottom-up techniques like edge detection and split-and-merge algorithms were the primary focus of research. However, by that time people realized that “perfect” segmentation would not be possible without incorporation of higher level knowledge. Thus the focus shifted towards model based techniques like snakes [9] and methods based on geometric models [15].

Region growing algorithms start from an initial, incomplete segmentation and try to aggregate the yet unlabelled pixels to one of the given regions. The initial regions are usually called seed regions or seeds. The decision whether a pixel should join a region or not is based on some fitness function which reflects the similarity between the region and the candidate pixel. As proposed in [1], the order in which the pixel is processed is determined by a global priority queue which sorts all candidate pixels by their fitness values. This approach elegantly mixes local (fitness) and global (pixel order) information.
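A minimal sketch of this priority-queue scheme is given below. It is an illustration under stated assumptions rather than the algorithm of [1] in full: the image is a grayscale list of rows, each seed is a single pixel, pixels are 4-connected, and the fitness value is assumed to be the absolute difference between the candidate pixel and the current mean of the region (smaller is better). The function name is hypothetical.

```python
import heapq


def seeded_region_growing(image, seeds):
    """Grow regions from seed pixels; seeds maps label -> (row, col)."""
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    sums, counts = {}, {}
    heap = []      # global priority queue of candidate pixels
    counter = 0    # tie-breaker so heap entries never compare beyond it

    def push_neighbours(r, c, label):
        nonlocal counter
        mean = sums[label] / counts[label]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] is None:
                fitness = abs(image[nr][nc] - mean)  # assumed fitness function
                heapq.heappush(heap, (fitness, counter, nr, nc, label))
                counter += 1

    for label, (r, c) in seeds.items():
        labels[r][c] = label
        sums[label] = image[r][c]
        counts[label] = 1
    for label, (r, c) in seeds.items():
        push_neighbours(r, c, label)

    while heap:
        _, _, r, c, label = heapq.heappop(heap)
        if labels[r][c] is not None:
            continue  # already claimed by a better-fitting region
        labels[r][c] = label
        sums[label] += image[r][c]
        counts[label] += 1
        push_neighbours(r, c, label)
    return labels


# Hypothetical example: dark left half, bright right half, one seed in each.
image = [[10, 10, 200, 200] for _ in range(4)]
result = seeded_region_growing(image, {0: (0, 0), 1: (0, 3)})
```

The queue always releases the best-fitting candidate in the whole image first, which is how the local fitness and the global processing order are combined.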

There is an abundance of literature on image segmentation, and a number of review articles survey it [10,23].  Methods have also been defined for post-processing the low-level segmentation to further regularize the segmentation output, such as Markov Random Fields [12].

Automatic image segmentation, one of the primary problems of early computer vision, has been intensively studied in the past [18]. The existing automatic image segmentation techniques can be classified into four approaches: thresholding techniques, boundary-based methods, region-based methods, and hybrid techniques.

Region-based techniques rely on the assumption that adjacent pixels in the same region have similar visual features, such as grey level, colour value, or texture. A well-known technique of this kind is split and merge [6, 7]. Obviously, the performance of this approach depends largely on the selected homogeneity criterion. Instead of tuning homogeneity parameters, the seeded region growing (SRG) technique is controlled by a number of initial seeds [5,8]. Given the seeds, SRG tries to find an accurate segmentation of the image into regions with the property that each connected component of a region meets exactly one of the seeds. Moreover, high-level knowledge of the image components can be exploited through the choice of seeds. This property is very attractive for semantic object extraction in content-based image database applications. However, SRG suffers from another problem: how to select the initial seeds automatically so as to provide a more accurate segmentation.  The region growing segmentation algorithm can be described as follows:

Input: image I

create an (empty) set S of segments

stage 0:  i := 0
    for all pixels P in I
        create a new segment Rp of level 0 (consisting only of P)
        put Rp in S

repeat
    stage i:
    for all segments Ri of level i in S
        repeat
            find a segment R'j of level j ≤ i in S such that Ri and R'j are neighbours and Ri ∪ R'j is homogeneous enough
            remove Ri and R'j from S
            redefine Ri := Ri ∪ R'j of level i+1
        until no such R'j can be found
        add Ri to S
    i := i+1
until stage i-1 has created no new segment
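The merging loop above can be sketched in simplified form. The sketch below is an assumption-laden illustration, not our implementation: it does not track segment levels, it uses 4-connectivity, and it takes "homogeneous enough" to mean that the grey level range (maximum minus minimum) of the merged segment does not exceed a threshold. The function name and the threshold parameter are hypothetical.

```python
def merge_regions(image, threshold):
    """Merge neighbouring segments while their union stays homogeneous."""
    rows, cols = len(image), len(image[0])
    # Stage 0: every pixel is its own segment, labelled by its index.
    label = [[r * cols + c for c in range(cols)] for r in range(rows)]
    # Per-segment (min, max) grey level, used by the homogeneity test.
    stats = {r * cols + c: (image[r][c], image[r][c])
             for r in range(rows) for c in range(cols)}

    changed = True
    while changed:  # repeat until a full pass creates no new segment
        changed = False
        for r in range(rows):
            for c in range(cols):
                for nr, nc in ((r + 1, c), (r, c + 1)):
                    if nr >= rows or nc >= cols:
                        continue
                    a, b = label[r][c], label[nr][nc]
                    if a == b:
                        continue
                    low = min(stats[a][0], stats[b][0])
                    high = max(stats[a][1], stats[b][1])
                    if high - low <= threshold:  # assumed criterion
                        # Redefine segment a as the union of a and b.
                        for row in label:
                            for i, value in enumerate(row):
                                if value == b:
                                    row[i] = a
                        stats[a] = (low, high)
                        del stats[b]
                        changed = True
    return label


# Hypothetical example: a dark 2 x 2 area next to a bright column.
image = [[10, 12, 200],
         [11, 13, 205]]
segments = merge_regions(image, threshold=10)
```

The relabelling step is written for clarity rather than speed; a union-find structure would be the usual choice for larger images.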

Markov chain Monte Carlo algorithm for image segmentation has drawn considerable attention due to its ability to integrate texture, colour, and edge information in an optimal manner to devise a robust labeling of the image into homogeneous regions [11, 21]. These methods still depend on the assumption that the pixels belonging to the object of interest share a common set of low-level image attributes, thereby allowing the object to be extracted as a single entity. If an object is composed of multiple regions of differing texture or colour then the object is divided into regions corresponding to each of these, and these sub regions must then be re-assembled through some contextual-based post processing to segment the complete object from the image. By employing an additional constraint upon the segmentation that encourages it to find a human, it would be possible to only extract the regions corresponding to the human in the image. This additional constraint can be provided through information regarding the desired shape of the final retained region.

Fig. 1.  The query process of the region growing based image retrieval system

The query process is as follows. First, the user submits an RGB query image to the system, and the RGB image is converted to a grayscale image. Second, the region growing algorithm segments this image into regions. Finally, based on these regions, the minimum distance between them is calculated and compared with the image regions in the database. Fig. 1 shows a diagram of the query process of our proposed image retrieval system.

Once a query is specified, we score each segmented image based on how closely it satisfies the query. The score for each atomic query (segmented image) is calculated using equation (1).

                                                                 (1)

where Hq and Hk are the query indexing key and the image indexing key, respectively. The distance is equal to 0 if the images are identical in all regions. We then rank the images according to their overall score and return the twenty best matches.
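The scoring and ranking step can be sketched as follows. The exact form of equation (1) is not reproduced here, so this sketch assumes a Euclidean distance between keys, and it assumes that an indexing key is a list of region sizes; both are assumptions made for illustration, as are the function names. The only property taken from the text is that identical keys give a distance of 0 and that the twenty best matches are returned.

```python
import math


def key_distance(hq, hk):
    """Assumed Euclidean distance between a query key and an image key."""
    n = max(len(hq), len(hk))
    a = list(hq) + [0] * (n - len(hq))  # pad the shorter key with zeros
    b = list(hk) + [0] * (n - len(hk))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_images(query_key, database, top=20):
    """Return the names of the `top` images with the smallest distance."""
    scored = sorted((key_distance(query_key, key), name)
                    for name, key in database.items())
    return [name for _, name in scored[:top]]


# Hypothetical database of indexing keys (region sizes per image).
database = {"a": [4, 2], "b": [1, 1], "c": [4, 3]}
best = rank_images([4, 2], database)
```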

4   Experimental results

In our experiment we use 3,500 JPEG images in 10 classes: “bear”, “bike”, “building”, “car”, “cat”, “flower”, “model”, “mountain”, “sky”, and “texture”. We evaluate only the top twenty images ranked by the similarity measure, using precision and recall.  Precision is the ratio of the number of relevant images retrieved to the total number of images retrieved (relevant and irrelevant), whilst recall is the ratio of the number of relevant images retrieved to the total number of relevant images in the database.

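$$\text{Precision} = \frac{\text{number of relevant images retrieved}}{\text{total number of images retrieved}} \qquad (2)$$

$$\text{Recall} = \frac{\text{number of relevant images retrieved}}{\text{total number of relevant images in the database}} \qquad (3)$$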
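As a small worked example of these two measures, the sketch below computes them for one query. The image names are hypothetical, and the assumption that the class holds 350 images (3,500 images split evenly over 10 classes) is made for illustration only.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against the relevant set."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant images retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: 20 images returned, 15 of them from the right class,
# and 350 relevant images assumed to exist in the database.
relevant = {f"bear_{i}" for i in range(350)}
retrieved = [f"bear_{i}" for i in range(15)] + [f"cat_{i}" for i in range(5)]
precision, recall = precision_recall(retrieved, relevant)
```

With only twenty images returned per query, recall is bounded by twenty divided by the class size, which is why recall values stay small even when precision is high.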

Fig. 2 illustrates the precision and recall of image retrieval for the ten image classes with two methods: retrieval on grayscale images, and retrieval on images segmented with the region growing technique.  Fig. 3 summarizes the precision and recall statistics of the two methods.  The segmented images give a better average precision, 0.75, than the grayscale images, 0.65.

 

 

Fig.  2. The effectiveness of image retrieval with grayscale and segmented methods.

 

Applying the region growing technique gives an excellent precision of 0.98 for the bear class and its worst precision of 0.25 for the texture class.  Interestingly, the grayscale method also shows its best precision, 0.88, for the bear class, and its lowest precision, 0.32, for the cat class, as shown in Table 2.

Fig. 3. Mean, maximum, and minimum of precision and recall with grayscale and segmented methods.

The experimental results show that our proposed method on segmented images gives good precision, higher than 0.50 for all classes except the texture class. Further detail is given in Table 1.

 

Table 1. Precision and recall with grayscale and segmented methods

Class     Grayscale images         Segmented images
          Precision    Recall      Precision    Recall
Bear      0.88         0.03        0.98         0.01
Bike      0.70         0.08        0.92         0.02
Build     0.60         0.10        0.88         0.03
Cars      0.58         0.11        0.93         0.04
Cat       0.25         0.19        0.78         0.06
Flower    0.47         0.13        0.72         0.07
Model     0.67         0.08        0.70         0.08
Mount     0.48         0.13        0.68         0.09
Sky       0.72         0.07        0.53         0.12
Text      0.52         0.12        0.33         0.17

Table 2 lists the highest (maximum), lowest (minimum), and average (mean) precision and recall over all queries. The region growing technique (segmented images) gives better maximum and average precision over the ten classes in the database.

Table 2. Precision and recall statistics with grayscale and segmented methods.

Statistic   Segmented                Grayscale
            Precision    Recall      Precision    Recall
Mean        0.75         0.07        0.65         0.14
Maximum     0.98         0.17        0.88         0.29
Minimum     0.25         0.01        0.32         0.03

5   Concluding remarks

A new approach has been proposed for an image retrieval system based on region growing segmentation in the DCT compressed domain. It offers a different way to build image indexes, using DCT texture descriptors.  The method was applied to a database of compressed images to verify its performance on the standard JPEG stream.

From our experiments, we conclude that segmentation, while imperfect, is an essential step and is very useful for building indexing keys.  In summary, this indexing key method is a promising method for image retrieval on segmented images in the compressed domain.  The approach could also be used for image indexing with other segmentation methods.

In the near future, we will apply this method to a single DCT coefficient, the DC coefficient, as a representation of the whole image, in order to simplify the algorithm and speed up image indexing.

 

[Figure: the query image followed by the images retrieved at ranks 0 to 9 with the grayscale method and with the segmented method]

Fig. 5. Examples of segmented images retrieved with an RGB image query: the RGB image is converted to grayscale, and the grayscale image is then partitioned by the region growing technique.

References

1. R. Adams, L. Bischof: Seeded Region Growing. PAMI 16(6), pp. 641-647, 1994.

2. A. Berman and L.G. Shapiro: Efficient Image Retrieval with Multiple Distance Measures. Proc. SPIE, vol. 3022, pp. 12-21, Feb. 1997.

3. B. Bhanu, Ed.: Genetic Learning for Adaptive Image Segmentation. Norwell, MA: Kluwer, 1994.

4. V. Boskovitz, H. Guterman: An adaptive neuro fuzzy system for automatic image segmentation and edge detection. IEEE Trans. Fuzzy Systems, 10(2), pp. 247-262, 2002.

5. Y. L. Chang and X. Li: Adaptive image region-growing. IEEE Trans. Image Processing, vol. 3, pp. 868-872, 1994.

6. R. M. Haralick and L. G. Shapiro: Survey: Image segmentation techniques. Computer Vision Graph. Image Processing, vol. 29, pp. 100-132, 1985.

7. S. A. Hojjatoleslami and J. Kittler: Region growing: A new approach.  IEEE Trans. Image Processing, vol. 7, pp. 1079-1084, 1998.

8. A.K. Jain: Fundamentals of Digital Image Processing. Upper Saddle River, NJ: Prentice Hall, 1989.

9. M. Kass, A. Witkin, D. Terzopoulos: Snakes: Active Contour Models. Proc. 1st Intl. Conf. on Computer Vision 1987, pp. 259-269.

10. D. Koller and M. Sahami: Toward optimal feature selection. Proceeding 13th International Conference Machine Learning, 1996, pp. 197-243.

11. M. W. Lee and I. Cohen: Proposal maps driven MCMC for estimating human body pose in static images. Proceeding IEEE Conference Computer Vision and Pattern Recognition,pp. 334-341, 2004.

12. J. Luo and C. Guo: Perceptual grouping of segmented regions in colour images. Pattern Recognition, vol. 36, pp. 2781-2792, 2003.

13. T.P. Minka and R.W. Picard: Interactive Learning Using a Society of Models. Pattern Recognition, vol. 30, no. 3, p. 565, 1997.

14. S.K. Pal, A. Rosenfeld: Image enhancement and thresholding by optimization of fuzzy compactness. Pattern Recognition Letter, 7: 77-86, 1988.

15. A. R. Pope: Model-Based Object Recognition: A Survey of Recent Research. Univ. of British Columbia, Dept. of Computer Science, Techn. Report CS-TR 94-04

16. S. Ravela and R. Manmatha: Image Retrieval by Appearance. Proc. SIGIR, pp. 278-285, July 1997.

17. P.K. Sahoo: A Survey on Thresholding Techniques. Computer Vision Graphics Image Processing, 41, pp. 233-260, 1988.

18. M. Sonka, V. Hlavac, R. Boyle: Image Processing Analysis and Machine Vision.  London, U.K.: Chapman & Hall, 1999.

19. A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain: Content-Based Image Retrieval at the End of the Early Years. IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, Dec. 2000.

20. G. Sheikholeslami, S. Chatterjee, and A. Zhang: WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases. Proc. Very Large Data Bases Conf., pp. 428-439, Aug. 1998.

21. Z. Tu and S.-C. Zhu: Image Segmentation by Data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 657-673, May 2002.

22. G.K. Wallace: The JPEG Still Picture Compression Standard. Comm. ACM, vol. 34, pp. 30-44, Apr. 1991.

23. J.Z. Wang, J. Li, and G. Wiederhold: SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries. IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, September 2001.

24. J.Z. Wang, G. Wiederhold, O. Firschein, and X.W. Sha: Content-Based Image Indexing and Searching Using Daubechies’ Wavelets. International Journal Digital Libraries (IJODL), vol. 1, no. 4, pp. 311-328, 1998.

25.  J.Z. Wang and M.A. Fischer: Visual Similarity, Judgmental Certainty and Stereo Correspondence. Proc. DARPA image understanding Workshop, G. Lukes, ed., Vol. 2,  Nov, 1998, pp. 1237-1248.

26. J.S. Weszka: A Survey of Threshold Selection Techniques. Computer Graphics Image process, 7, pp. 259-265, 1978.

27. Y. Deng, B.S. Manjunath: Unsupervised Segmentation of Colour Texture Region in Images and Video. pp. 1-25, 2001.

28.  Q. Zhang, S.A. Goldman, W. Yu, and J.E. Fritts: Content-Based Image Retrieval Using Multiple-Instance Learning. Proceeding International Conference. Machine Learning, 2002.
