Video Compression - Video Indexing & Segmentation

English-Language Materials
*Authors*	*Article Title*	*Description*	*Rating*
Edouard François and Bertrand Chupeau	Depth-Based Segmentation	Abstract—The tool presented in this paper performs an automatic segmentation of stereoscopic image sequences, based on the modeling of distance maps obtained by image processing. Two three-dimensional (3-D) image analysis algorithms are combined: i) estimation of dense depth maps from stereoscopic image sequences and ii) depth-based object segmentation. A Markovian statistical approach for such a segmentation of a dense depth map into arbitrarily shaped and oriented planar surfaces is described in detail. The simulation results on sequences “Fun fair” and “Tunnel,” provided on video tape to the MPEG-4 tests of November 1995, are discussed. RAR 89 KB	?
Demin Wang	Unsupervised Video Segmentation Based on Watersheds and Temporal Tracking	Abstract—This paper presents a technique for unsupervised video segmentation. This technique consists of two phases: initial segmentation and temporal tracking, similar to a number of existing techniques. However, new algorithms for spatial segmentation, marker extraction, and modified watershed transformation are proposed for the present technique. The new algorithms make this technique differ from existing techniques by the following features: 1) it can effectively track fast moving objects, 2) it can detect the appearance of new objects as well as the disappearance of existing objects, and 3) it is computationally efficient because of the use of watershed transformations and a fast motion estimation algorithm. Simulation results demonstrate that the proposed technique can efficiently segment video sequences with fast moving, newly appearing, or disappearing objects in the scene. RAR 1535 KB	?
A. Aydın Alatan, Levent Onural, Michael Wollborn, Roland Mech, Ertem Tuncel, and Thomas Sikora	Image Sequence Analysis for Emerging Interactive Multimedia Services—The European COST 211 Framework	Abstract—Flexibility and efficiency of coding, content extraction, and content-based search are key research topics in the field of interactive multimedia. Ongoing ISO MPEG-4 and MPEG-7 activities are targeting standardization to facilitate such services. European COST Telecommunications activities provide a framework for research collaboration. COST 211bis and COST 211ter activities have been instrumental in the definition and development of the ITU-T H.261 and H.263 standards for videoconferencing over ISDN and videophony over regular phone lines, respectively. The group has also contributed significantly to the ISO MPEG-4 activities. At present a significant effort of the COST 211ter group activities is dedicated toward image and video sequence analysis and segmentation—an important technological aspect for the success of emerging object-based MPEG-4 and MPEG-7 multimedia applications. The current work of COST 211 is centered around the test model, called the Analysis Model (AM). The essential feature of the AM is its ability to fuse information from different sources to achieve a high-quality object segmentation. The current information sources are the intermediate results from frame-based (still) color segmentation, motion vector based segmentation, and change-detection-based segmentation. Motion vectors, which form the basis for the motion vector based intermediate segmentation, are estimated from consecutive frames. A recursive shortest spanning tree (RSST) algorithm is used to obtain intermediate color and motion vector based segmentation results. A rule-based region processor fuses the intermediate results; a postprocessor further refines the final segmentation output. The results of the current AM are satisfactory; it is expected that there will be further improvements of the AM within the COST 211 project. RAR 495 KB	?
Takashi Ida and Yoko Sambonsugi	Image Segmentation and Contour Detection Using Fractal Coding	Abstract—Fractal coding was applied to image segmentation and contour detection. The encoding method was the same as in conventional fractal coding, and the compressed code, which we call fractal code, was used for image segmentation and contour detection instead of image reconstruction. An image can be segmented by calculating the basin of attraction on a mapping that is a set of local maps from the domain block to the range block. The local maps are parameterized using the fractal code, and contours of the objects in the image are detected by the inverse mapping from the range block to the domain block. Some objects in the test image Lena were segmented, and the contours were detected well. The proposed methods are expected to enable compressed codes to be used directly for image processing. RAR 796 KB	?
Shih-Fu Chang, William Chen, Horace J. Meng, Hari Sundaram, and Di Zhong	A Fully Automated Content-Based Video Search Engine Supporting Spatiotemporal Queries	Abstract—The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ (demo available at http://www.ctr.columbia.edu/VideoQ/), is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease. RAR 475 KB	?
Chee Sun Won	A Block-Based MAP Segmentation for Image Compressions	Abstract—In this paper, a novel block-based image segmentation algorithm using the maximum a posteriori (MAP) criterion is proposed. The conditional probability in the MAP criterion, which is formulated by the Bayesian framework, is in charge of classifying image blocks into edge, monotone, and textured blocks. On the other hand, the a priori probability is responsible for edge connectivity and homogeneous region continuity. After a few iterations to achieve a deterministic MAP optimization, we can obtain a block-based segmented image in terms of edge, monotone, or textured blocks. Then, using a connected block-labeling algorithm, we can assign a number to all connected homogeneous blocks to define an interior of a region. Finally, uncertainty blocks, which are not given any region number yet, are assigned to one of neighboring homogeneous regions by a block-based region-growing method. During this process, we can also check the balance between the accuracy and the cost of the contour coding by adjusting the size of the uncertainty blocks. Experimental results show that the proposed algorithm yields larger homogeneous regions which are suitable for the object-based image compression. RAR 284 KB	?
Ioannis Kompatsiaris, Dimitrios Tzovaras, and Michael G. Strintzis	3-D Model-Based Segmentation of Videoconference Image Sequences	Abstract—This paper describes a three-dimensional (3-D) model-based unsupervised procedure for the segmentation of multiview image sequences using multiple sources of information. The 3-D model is initialized by accurate adaptation of a two-dimensional wireframe model to the foreground object of one of the views. The articulation procedure is based on the homogeneity of parameters, such as rigid 3-D motion, color, and depth, estimated for each subobject, which consists of a number of interconnected triangles of the 3-D model. The rigid 3-D motion of each subobject for subsequent frames is estimated using a Kalman filtering algorithm, taking into account the temporal correlation between consecutive frames. Information from all cameras is combined during the formation of the equations for the rigid 3-D motion parameters. The threshold used in the object segmentation procedure is updated at each iteration using the histogram of the subobject parameters. The parameter estimation for each subobject and the 3-D model segmentation procedures are interleaved and repeated iteratively until a satisfactory object segmentation emerges. The performance of the resulting segmentation method is evaluated experimentally. RAR 555 KB	?
Jae Gark Choi, Si-Woong Lee, and Seong-Dae Kim	Spatio-Temporal Video Segmentation Using a Joint Similarity Measure	Abstract—This paper presents a new morphological spatiotemporal segmentation algorithm. The algorithm incorporates luminance and motion information simultaneously and uses morphological tools such as morphological filters and the watershed algorithm. The procedure toward complete segmentation consists of three steps: joint marker extraction, boundary decision, and motion-based region fusion. First, the joint marker extraction identifies the presence of homogeneous regions in both motion and luminance, where a simple joint marker extraction technique is proposed. Second, the spatio-temporal boundaries are decided by the watershed algorithm. For this purpose, a new joint similarity measure is proposed. Finally, an elimination of redundant regions is done using motion-based region fusion. By incorporating spatial and temporal information simultaneously, we can obtain visually meaningful segmentation results. Simulation results demonstrate the efficiency of the proposed method. RAR 356 KB	?
Joo-Hee Moon, Gwang-Hoon Park, Sung-Moon Chun, and Seok-Rim Choi	Shape-Adaptive Region Partitioning Method for Shape-Assisted Block-Based Texture Coding	Abstract—In the content-based image coding scheme, segmentation information of the arbitrarily shaped regions may be available for both encoder and decoder. The shape-assisted block-based texture coding methodologies, such as shape-adaptive discrete cosine transform (SADCT), can use this segmentation information to improve coding efficiency. In this paper, we introduce the shape-adaptive region partitioning (SARP) methods which can reduce the number of coding blocks that partition the arbitrarily shaped region by modifying the block positions. By simply adding the SARP method to aid the SADCT, the coded texture bits can be reduced by 5–10%, in comparison with the SADCT using the common block-based coding infrastructure which is usually used in MPEG-1/2 and H.263. RAR 207 KB	?
Ullas Gargi, Rangachar Kasturi, and Susan H. Strayer	Performance Characterization of Video-Shot-Change Detection Methods	Abstract—A number of automated shot-change detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. Many of these methods use color histograms or features computed from block motion or compression parameters to compute frame differences. It is important to evaluate and characterize their performance so as to deliver a single set of algorithms that may be used by other researchers for indexing video databases. We present the results of a performance evaluation and characterization of a number of shot-change detection methods that use color histograms, block motion matching, or MPEG compressed data. RAR 285 KB	?
Qian Huang, Atul Puri, and Zhu Liu	Multimedia Search and Retrieval: New Concepts, System Implementation, and Application	Abstract—We first present new concepts applicable to the design of multimedia search and retrieval schemes in general, and to MPEG-7 in particular, the multimedia description standard in progress. Raw multimedia data is assumed to exist in the form of programs that typically consist of a combination of media types such as visual, audio, and text. We partition each such media stream into smaller units based on actual physical events. These physical events within each media stream can then be effectively indexed for retrieval. The concept of logical events is introduced next; we define logical events as those that can provide different “views” of the content as may be desired by a user. Such events usually result from either the correlation of events that cross different media types, or by merging recursively chosen events from a lower level within each media type. We then address the related issue of how to develop a practical multimedia information retrieval system that exploits the aforementioned concepts of physical and logical events as well as other aspects such as storage, representation and indexing to enable efficient search, retrieval, and browsing. Finally, we implement the proposed concepts and solutions within a multimedia system that addresses a real application, effective browsing of broadcast news, and evaluate its performance. RAR 1127 KB	?
Jungwoo Lee and Bradley W. Dickinson	Hierarchical Video Indexing and Retrieval for Subband-Coded Video	Abstract—In this paper, we present a multiresolution approach for video indexing and feature matching of subband-coded video databases. Four different scene-change detectors were tested; scene-change detection is applied only on the lowest subband for computational efficiency. Two kinds of scene changes, abrupt and smoothly accumulated, mark the beginning of new scene segments. The index for each scene segment is the pair of the histograms of two representative frames, the first and the last frame of the scene. Using the approach of query by example, the index-matching algorithm takes a multiresolution approach by hierarchically comparing histograms at different resolutions. The search algorithm for the match between example query and its target scene segment starts from the coarsest resolution and moves to the next finer resolution until the finest resolution is reached. Experimental results are presented, and the proposed indexing technique appears to be promising for its speed and its inherent hierarchical search procedure. RAR 2215 KB	?
Niels Haering, Richard J. Qian, and M. Ibrahim Sezan	A Semantic Event-Detection Approach and Its Application to Detecting Hunts in Wildlife Video	Abstract—We propose a three-level video-event detection methodology and apply it to animal-hunt detection in wildlife documentaries. The first level extracts color, texture, and motion features, and detects shot boundaries and moving object blobs. The mid-level employs a neural network to determine the object class of the moving object blobs. This level also generates shot descriptors that combine features from the first level and inferences from the mid-level. The shot descriptors are then used by the domain-specific inference process at the third level to detect video segments that match the user-defined event model. The proposed approach has been applied to the detection of hunts in wildlife documentaries. Our method can be applied to different events by adapting the classifier at the intermediate level and by specifying a new event model at the highest level. Event-based video indexing, summarization, and browsing are among the applications of the proposed approach. RAR 610 KB	?
Thomas Meier and King N. Ngan	Video Segmentation for Content-Based Coding	Abstract—The extensive use of discrete transforms in image and video coding suggests the investigation on filtering before downsampling (FBDS) and filtering after upsampling (FAUS) methods directly acting on the transform domain. In this paper, we describe the “transform-domain resolution translation” technique that gives flexibility to resize windows of each video conferencing session for server compositing without explicit decompression, spatial domain processing, and compression. We generalize transform-domain filtering (TDF) to include nonuniform and multirate cases to implement the transform-domain resolution translator. The former is defined as a TDF problem in which the original transform domain is of different size from the target one, while the latter considers the implementation of sampling rate conversion in the transform domain. The implementation architecture is based on a pipeline that involves matrix–vector product blocks and vector addition, but is not limited to particular hardware. Such techniques are particularly useful for fast algorithms for processing compressed images and video where transform coding is extensively used (e.g., in JPEG, H.261, MPEG-1, MPEG-2, and H.263). RAR 697 KB	?
P. Salembier and F. Marqués	Region-Based Representations of Image and Video: Segmentation Tools for Multimedia Services	Abstract—This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval. RAR 1016 KB	?
Gene K. Wu and Todd R. Reed	Image Sequence Processing Using Spatiotemporal Segmentation	Abstract—We investigate the improvements that can be obtained in several conventional video-processing algorithms through the incorporation of three-dimensional (3-D) (spatiotemporal) segmentation information. Four classes of image sequence processing techniques are considered: low-pass filtering, high-pass filtering, high-frequency emphasis, and 3-D Sobel filtering. It is demonstrated that segmentation information can improve the performance of these techniques substantially so that this approach may be promising for other applications (e.g., deinterlacing and resolution conversion) as well. RAR 1300 KB	?
Peter van Beek, A. Murat Tekalp, Ning Zhuang, Işıl Celasun, and Minghui Xia	Hierarchical 2-D Mesh Representation, Tracking, and Compression for Object-Based Video	Abstract—This paper proposes methods for designing, tracking and coding hierarchical two-dimensional (2-D) content-based mesh representations. The design procedure consists of constructing a fine-to-coarse hierarchy of Delaunay meshes, using image- and shape-based criteria for mesh geometry simplification. Hierarchical tracking employs a coarse-to-fine strategy with mesh-based motion vector optimization. We introduce new techniques to maintain the initial mesh hierarchy and topology during tracking by imposing certain constraints at each stage of the procedure. The hierarchical compression technique is based on a nearest neighbor ordering of mesh node points. This ordering serves to identify the mesh boundary nodes as well as establish spatial predictors for differential coding of node coordinates and motion vectors. The proposed hierarchical mesh representation, which has applications in object-based video manipulation, indexing, and compression, provides improved tracking performance (compared to a nonhierarchical representation) and allows progressive (scalable) transmission of the object geometry (including shape) and motion information, as well as variable level-of-detail rendering. Experimental results are presented to compare the tracking and compression performance of hierarchical versus nonhierarchical mesh representations and to demonstrate the tradeoff between image quality and mesh bit rate for 2-D mesh-based video object rendering. RAR 686 KB	?
Douglas Chai and King N. Ngan	Face Segmentation Using Skin-Color Map in Videophone Applications	Abstract—This paper addresses our proposed method to automatically segment out a person’s face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, reliable, and effective algorithm that exploits the spatial distribution characteristics of human skin color. A universal skin-color map is derived and used on the chrominance component of the input image to detect pixels with skin-color appearance. Then, based on the spatial distribution of the detected skin-color pixels and their corresponding luminance values, the algorithm employs a set of novel regularization processes to reinforce regions of skin-color pixels that are more likely to belong to the facial regions and eliminate those that are not. The performance of the face-segmentation algorithm is illustrated by some simulation results carried out on various head-and-shoulders test images. The use of face segmentation for video coding in applications such as videotelephony is then presented. We explain how the face-segmentation results can be used to improve the perceptual quality of a videophone sequence encoded by the H.261-compliant coder. RAR 707 KB	?
Soo-Chang Pei and Ching-Min Cheng	Extracting Color Features and Dynamic Matching for Image Data-Base Retrieval	Abstract—Color-based indexing is an important tool in image data-base retrieval. Compared with other features of the image, color features are less sensitive to noise and background complication. Based on the human visual system’s perception of color information, this paper presents a dependent scalar quantization approach to extract the characteristic colors of an image as color features. The characteristic colors are suitably arranged in order to obtain a sequence of feature vectors. Using this sequence of feature vectors, a dynamic matching method is then employed to match the query image with data-base images for a nonstationary identification environment. The empirical results show that the characteristic colors are reliable color features for image data-base retrieval. In addition, the proposed matching method has acceptable accuracy of image retrieval compared with existing methods. RAR 582 KB	?
Hyun Sung Chang, Sanghoon Sull, and Sang Uk Lee	Efficient Video Indexing Scheme for Content-Based Retrieval	Abstract—Extracting a small number of key frames that can abstract the content of video is very important for efficient browsing and retrieval in video databases. In this paper, the key frame extraction problem is considered from a set-theoretic point of view, and systematic algorithms are derived to find a compact set of key frames that can represent a video segment for a given degree of fidelity. The proposed extraction algorithms can be hierarchically applied to obtain a tree-structured key frame hierarchy that is a multilevel abstract of the video. The key frame hierarchy enables an efficient content-based retrieval by using the depth-first search scheme with pruning. Intensive experiments on a variety of video sequences are presented to demonstrate the improved performance of the proposed algorithms over the existing approaches. RAR 688 KB	?
Ebroul Izquierdo M.	Disparity/Segmentation Analysis: Matching with an Adaptive Window and Depth-Driven Segmentation	Abstract—Most of the emerging content-based multimedia technologies are based on efficient methods to solve machine early vision tasks. Among other tasks, object segmentation is perhaps the most important problem in single image processing, whereas pixel-correspondence estimation is the crucial task in multiview image analysis. The solution of these two problems is the key technology for the development of the majority of leading-edge interactive video-communication technologies and telepresence systems. In this paper, we present a robust framework comprised of joined pixel-correspondence estimation and image segmentation in video sequences taken simultaneously from different perspectives. An improved concept for stereo-image analysis based on block matching with a local adaptive window is introduced. The size and shape of the reference window are calculated adaptively according to the degree of reliability of disparities estimated previously. Considerable improvements are obtained just within object borders or image areas that become occluded by applying the proposed block-matching model. An initial object segmentation is obtained by merging neighboring sampling positions with disparity vectors of similar size and direction. Starting from this initial segmentation, true object borders are detected using a contour-matching algorithm. In this process, the contour of the initial segmentation is taken as a reference pattern, and the edges extracted from the original images, by applying a multiscale algorithm, are the candidates for the true object contour. The performance of the introduced methods has been verified by computer simulations using synthetic data and several natural stereo sequences. RAR 1927 KB	?
Alan Hanjalic, Reginald L. Lagendijk, and Jan Biemond	Automated High-Level Movie Segmentation for Advanced Video-Retrieval Systems	Abstract—We present a newly developed strategy for automatically segmenting movies into logical story units. A logical story unit can be understood as an approximation of a movie episode, which is a high-level temporal movie segment, characterized either by a single event (dialog, action scene, etc.) or by several events taking place in parallel. Since we consider a whole event and not a single shot to be the most natural retrieval unit for the movie category of video programs, the proposed segmentation is the crucial first step toward a concise and comprehensive content-based movie representation for browsing and retrieval purposes. The automation aspect is becoming increasingly important with the rising amount of information to be processed in video archives of the future. The segmentation process is designed to work on MPEG-DC sequences, where we have taken into account that at least a partial decoding is required for performing content-based operations on MPEG compressed video streams. The proposed technique allows for carrying out the segmentation procedure in a single pass through a video sequence. RAR 191 KB	?
Stephan Herrmann, Hubert Mooshofer, Harald Dietrich, and Walter Stechele	A Video Segmentation Algorithm for Hierarchical Object Representations and Its Implementation	Abstract—This paper describes a segmentation algorithm for generating hierarchical object representations of images and image sequences. Starting from an object model, we describe the structure of the corresponding segmentation algorithm including all analysis methods applied. Besides the well-known color and motion analysis, we also show how to utilize shape information. Furthermore, we discuss the tradeoff between reducing the computational complexity and the quality of the segmentation results. Last, we present the implementation concept for our analysis model, which uses a special toolbox model. The toolbox provides a set of addressing schemes that are needed by low-level video processing tools. The low-level tools are functions that apply a single operation to all pixels in one frame. Using these addressing functions makes it easy to implement new video processing tools, which, when combined, form new analysis methods. The toolbox exists in C-code and is partially transferred into VHDL. Because the applications and their requirements are not well defined, there is a need for modular and flexible segmentation algorithms. In consequence, a set of mid- and low-level tools is required to perform the tasks of image segmentation and feature extraction. The combination of these tools forms a high-level segmentation algorithm. RAR 653 KB	?
Roberto Castagno, Touradj Ebrahimi, and Murat Kunt	Video Segmentation Based on Multiple Features for Interactive Multimedia Applications	Abstract—In this paper, we present a scheme for interactive video segmentation. A key feature of the system is the distinction between two levels of segmentation, namely, regions and object segmentation. Regions are homogeneous areas of the images, which are extracted automatically by the computer. Semantically meaningful objects are obtained through user interaction by grouping of regions according to the specific application. This splitting relieves the computer of ill-posed semantic problems, and allows a higher level of flexibility of the method. The extraction of regions is based on the multidimensional analysis of several image features by a spatially constrained fuzzy C-means algorithm. The local level of reliability of the different features is taken into account in order to adaptively weight the contribution of each feature to the segmentation process. Results on the extraction of regions as well as on the tracking of spatiotemporal objects are presented. RAR 287 KB	?
Fabio Dell’Acqua and Paolo Gamba	Simplified Modal Analysis and Search for Reliable Shape Retrieval	Abstract—In this work, we present the application of a simplified shape analysis technique based on a modal representation of the object shape and useful for improving the efficiency and effectiveness of shape-driven searches in image databases. The proposed method computes the representation of an object by means of modes very similar to the deformation modes of a mechanical system, but in a numerically more stable way than the usual finite-element method approach. Moreover, to make the technique for the visual search more effective, many different definitions of similarity indexes are introduced and discussed. The problems related to the comparison between objects represented by a very different number of feature points are also discussed. Finally, to prove the effectiveness of the approach, the indexes are studied in a simple case study (a small database of character shapes). However, their performance on a larger image database is also addressed, as well as the ability of the method to efficiently assess the problem of retrieving images similar to a user-defined sketch. RAR 311 KB	?
Ru-Shang Wang and Yao Wang	Multiview Video Sequence Analysis, Compression, and Virtual Viewpoint Synthesis	Abstract—This paper considers the problem of structure and motion estimation in multiview teleconferencing-type sequences and its application for video-sequence compression and intermediate-view generation. First, we introduce a new approach for structure estimation from a stereo pair acquired by two parallel cameras. It is based on a 2-D mesh representation of both views of the imaged scene and a parameterization of the structure information by the disparity between corresponding nodes in the image pair. Next, we describe a novel image alignment approach which can convert images captured using nonparallel cameras to coplanar-like images. This approach greatly eases the computational burden incurred by the nonparallel camera geometry, where one must consider both horizontal and vertical disparities. Finally, we present a coder for multiview sequences, which exploits the proposed alignment and structure estimation algorithm. By extracting the foreground objects and estimating the disparity field between a selected view and a reference view, the coder can compress the image pair very efficiently. In the meantime, by using the coded structure information, the decoder can generate virtual viewpoints between decoded views, which can be very helpful for telepresence applications. RAR 1323 KB	?
Thomas Meier, King N. Ngan, and Gregory Crebbin	Reduction of Blocking Artifacts in Image and Video Coding	Abstract—The discrete cosine transform (DCT) is the most popular transform for image and video compression. Many international standards such as JPEG, MPEG, and H.261 are based on a block-DCT scheme. High compression ratios are obtained by discarding information about DCT coefficients that is considered to be less important. The major drawback is visible discontinuities along block boundaries, commonly referred to as blocking artifacts. These often limit the maximum compression ratios that can be achieved. Various postprocessing techniques have been published that reduce these blocking effects, but most of them introduce unnecessary blurring, ringing, or other artifacts. In this paper, a novel postprocessing algorithm based on Markov random fields (MRF’s) is proposed. It efficiently removes blocking effects while retaining the sharpness of the image and without introducing new artifacts. The degraded image is first segmented into regions, and then each region is enhanced separately to prevent blurring of dominant edges. A novel texture detector allows the segmentation of images containing both texture and monotone areas. It finds all texture regions in the image before the remaining monotone areas are segmented by an MRF segmentation algorithm that has a new edge component incorporated to detect dominant edges more reliably. The proposed enhancement stage then finds the maximum a posteriori estimate of the unknown original image, which is modeled by an MRF and is therefore Gibbs distributed. A very efficient implementation is presented. Experiments demonstrate that our proposed postprocessor gives excellent results compared to other approaches, from both a subjective and an objective viewpoint. Furthermore, it will be shown that our technique also works for wavelet encoded images, which typically contain ringing artifacts. RAR 861 KB	?
Emmanuel Reusens, Touradj Ebrahimi, Corinne Le Buhan, Roberto Castagno, Vincent Vaerman, Laurent Piron, Carmen de Solà Fàbregas, Sushil Bhattacharjee, Frank Bossen, and Murat Kunt	Dynamic Approach to Visual Data Compression	Abstract—This paper presents the Swiss Federal Institute of Technology (EPFL) proposal to MPEG-4 video coding standardization activity [1]. The proposed technique is based on a novel approach to audio-visual data compression entitled dynamic coding. The newly born multimedia environment supports a plethora of applications which cannot be covered adequately by a single compression technique. Dynamic coding offers the opportunity to combine several compression techniques and segmentation strategies. Given a particular application, these two degrees of freedom can be constrained and assembled in order to produce a particular profile which meets the set of specifications dictated by the application. The basic principles of this approach are presented together with the data representation system. The major characteristics of dynamic coding are reviewed, along with simulation results showing the performance of such an approach in a very low bit-rate video coding environment. RAR 872 KB	?



Prepared by Sergey Grishin and Dmitriy Vatolin
