Books in the Synthesis Lectures on Image, Video, and Multimedia Processing series

Joint Source-Channel Video Transmission

by Fan Zhai

388,-

This book deals with the problem of joint source-channel video transmission, i.e., the joint optimal allocation of resources at the application layer and the other network layers, such as data rate adaptation, channel coding, power adaptation in wireless networks, quality of service (QoS) support from the network, and packet scheduling, for efficient video transmission. Real-time video communication applications, such as videoconferencing, video telephony, and on-demand video streaming, have gained increased popularity. However, a key problem in video transmission over the existing Internet and wireless networks is the incompatibility between the nature of the network conditions and the QoS requirements (in terms, for example, of bandwidth, delay, and packet loss) of real-time video applications. To deal with this incompatibility, a natural approach is to adapt the end-system to the network. The joint source-channel coding approach aims to efficiently perform content-aware cross-layer resource allocation, thus increasing the communication efficiency of multiple network layers. Our purpose in this book is to review the basic elements of the state-of-the-art approaches toward joint source-channel video transmission for wired and wireless systems. In this book, we present a general resource-distortion optimization framework, which is used throughout the book to guide our discussions on various techniques of joint source-channel video transmission. In this framework, network resources from multiple layers are assigned to each video packet according to its level of importance. It provides not only an optimization benchmark against which the performance of other sub-optimal systems can be evaluated, but also a useful tool for assessing the effectiveness of different error control components in practical system design. 
This book is therefore written to be accessible to researchers, expert industrial R&D engineers, and university students who are interested in the cutting-edge technologies in joint source-channel video transmission. Contents: Introduction / Elements of a Video Communication System / Joint Source-Channel Coding / Error-Resilient Video Coding / Channel Modeling and Channel Coding / Internet Video Transmission / Wireless Video Transmission / Conclusions
- Book
Image Fusion in Remote Sensing

by Nasser Kehtarnavaz & Arian Azarang

333,-
- Book
Multimodal Learning toward Micro-Video Understanding

by Liqiang Nie

766,-

Micro-videos, a new form of user-generated content, have been spreading widely across various social platforms, such as Vine, Kuaishou, and TikTok. Unlike traditional long videos, micro-videos are usually recorded by smart mobile devices at any place within a few seconds. Due to their brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for high-order micro-video understanding. Micro-video understanding is, however, non-trivial due to the following challenges: (1) how to represent micro-videos that convey only one or a few high-level themes or concepts; (2) how to utilize the hierarchical structure of the venue categories to guide the micro-video analysis; (3) how to alleviate the influence of low quality caused by complex surrounding environments and camera shake; (4) how to model the multimodal sequential data, i.e., textual, acoustic, visual, and social modalities, to enhance micro-video understanding; and (5) how to construct large-scale benchmark datasets for the analysis. These challenges have been largely unexplored to date. In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. Particularly, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and outline future research directions in multimodal learning toward micro-video understanding.
- Book
Dictionary Learning in Visual Computing

by Qiang Zhang

585,-

The last few years have witnessed fast development on dictionary learning approaches for a set of visual computing tasks, largely due to their utilization in developing new techniques based on sparse representation. Compared with conventional techniques employing manually defined dictionaries, such as Fourier Transform and Wavelet Transform, dictionary learning aims at obtaining a dictionary adaptively from the data so as to support optimal sparse representation of the data. In contrast to conventional clustering algorithms like K-means, where a data point is associated with only one cluster center, in a dictionary-based representation, a data point can be associated with a small set of dictionary atoms. Thus, dictionary learning provides a more flexible representation of data and may have the potential to capture more relevant features from the original feature space of the data. One of the early algorithms for dictionary learning is K-SVD. In recent years, many variations/extensions of K-SVD and other new algorithms have been proposed, with some aiming at adding discriminative capability to the dictionary, and some attempting to model the relationship of multiple dictionaries. One prominent application of dictionary learning is in the general field of visual computing, where long-standing challenges have seen promising new solutions based on sparse representation with learned dictionaries. With a timely review of recent advances of dictionary learning in visual computing, covering the most recent literature with an emphasis on papers after 2008, this book provides a systematic presentation of the general methodologies, specific algorithms, and examples of applications for those who wish to have a quick start on this subject.
- Book
Combating Bad Weather Part II

by Sudipta Mukhopadhyay

330,-

Every year lives and property are lost in road accidents. About one-fourth of these accidents are due to low visibility in foggy weather. At present, there is no algorithm that is specifically designed for the removal of fog from videos. Applying a single-image fog removal algorithm to each video frame is time-consuming and costly. It is demonstrated that, with the intelligent use of temporal redundancy, fog removal algorithms designed for a single image can be extended to real-time video applications. Results confirm that the presented framework for extending image fog removal algorithms to videos can reduce the complexity to a great extent with no loss of perceptual quality. This paves the way for the real-life application of video fog removal. To remove fog, an efficient fog removal algorithm using anisotropic diffusion is developed. The presented algorithm uses a new dark channel assumption and anisotropic diffusion for the initialization and refinement of the airlight map, respectively. The use of anisotropic diffusion helps to obtain a better estimate of the airlight map. The algorithm requires a single image captured by an uncalibrated camera system and can be applied in both RGB and HSI color spaces. This book shows that the use of the HSI color space reduces the complexity further. The algorithm requires pre- and post-processing steps for better restoration of the foggy image. These steps have either data-driven or constant parameters, which avoids user intervention. The presented algorithm is independent of the intensity of the fog, so it performs well even in heavy fog. Qualitative and quantitative results confirm that the presented fog removal algorithm outperformed previous algorithms in terms of perceptual quality, color fidelity and execution time. The work presented in this book can find wide application in entertainment industries, transportation, tracking and consumer electronics.
- Book
Combating Bad Weather Part I

by Sudipta Mukhopadhyay

522,-

Current vision systems are designed to perform in normal weather conditions. However, severe weather conditions cannot always be avoided. Bad weather reduces scene contrast and visibility, which degrades the performance of various computer vision algorithms such as object tracking, segmentation and recognition. Thus, current vision systems must include mechanisms that enable them to perform well in bad weather conditions such as rain and fog. Rain causes spatial and temporal intensity variations in images or video frames. These intensity changes are due to the random distribution and high velocities of the raindrops. Fog causes low contrast and whiteness in the image and leads to a shift in the color. This book studies rain and fog from the perspective of vision. The book has two main goals: 1) removal of rain from videos captured by a moving or a static camera, and 2) removal of fog from images and videos captured by a single moving uncalibrated camera system. The book begins with a literature survey. Pros and cons of selected prior art algorithms are described, and a general framework for the development of an efficient rain removal algorithm is explored. Temporal and spatiotemporal properties of rain pixels are analyzed and, using these properties, two rain removal algorithms for videos captured by a static camera are developed. For the removal of rain, the temporal and spatiotemporal algorithms require fewer consecutive frames, which reduces buffer size and delay. These algorithms make no assumptions about the shape, size and velocity of raindrops, which makes them robust to different rain conditions (i.e., heavy, light and moderate rain). In a practical situation, there is no ground truth available for rain video. Thus, a no-reference quality metric is very useful in measuring the efficacy of rain removal algorithms. Temporal variance and spatiotemporal variance are presented in this book as no-reference quality metrics. An efficient rain removal algorithm using meteorological properties of rain is developed. The relation among the orientation of the raindrops, wind velocity and terminal velocity is established. This relation is used in the estimation of shape-based features of the raindrop. Meteorological property-based features help to discriminate between rain and non-rain pixels. Most of the prior art algorithms are designed for videos captured by a static camera. The use of global motion compensation with all rain removal algorithms designed for videos captured by a static camera results in better accuracy for videos captured by a moving camera. Qualitative and quantitative results confirm that the probabilistic temporal, spatiotemporal and meteorological algorithms outperformed other prior art algorithms in terms of perceptual quality, buffer size, execution delay and system cost. The work presented in this book can find wide application in entertainment industries, transportation, tracking and consumer electronics. Table of Contents: Acknowledgments / Introduction / Analysis of Rain / Dataset and Performance Metrics / Important Rain Detection Algorithms / Probabilistic Approach for Detection and Removal of Rain / Impact of Camera Motion on Detection of Rain / Meteorological Approach for Detection and Removal of Rain from Videos / Conclusion and Scope of Future Work / Bibliography / Authors' Biographies
- Book
Image Understanding using Sparse Representations

by Jayaraman J. Thiagarajan

388,-

Image understanding has been playing an increasingly crucial role in several inverse problems and computer vision. Sparse models form an important component in image understanding, since they emulate the activity of neural receptors in the primary visual cortex of the human brain. Sparse methods have been utilized in several learning problems because of their ability to provide parsimonious, interpretable, and efficient models. Exploiting the sparsity of natural signals has led to advances in several application areas including image compression, denoising, inpainting, compressed sensing, blind source separation, super-resolution, and classification. The primary goal of this book is to present the theory and algorithmic considerations in using sparse models for image understanding and computer vision applications. To this end, algorithms for obtaining sparse representations and their performance guarantees are discussed in the initial chapters. Furthermore, approaches for designing overcomplete, data-adapted dictionaries to model natural images are described. The development of theory behind dictionary learning involves exploring its connection to unsupervised clustering and analyzing its generalization characteristics using principles from statistical learning theory. An exciting application area that has benefited extensively from the theory of sparse representations is compressed sensing of image and video data. Theory and algorithms pertinent to measurement design, recovery, and model-based compressed sensing are presented. The paradigm of sparse models, when suitably integrated with powerful machine learning frameworks, can lead to advances in computer vision applications such as object recognition, clustering, segmentation, and activity recognition. 
Frameworks that enhance the performance of sparse models in such applications by imposing constraints based on the prior discriminatory information and the underlying geometrical structure, and kernelizing the sparse coding and dictionary learning methods are presented. In addition to presenting theoretical fundamentals in sparse learning, this book provides a platform for interested readers to explore the vastly growing application domains of sparse representations.
- Book
Contextual Analysis of Videos

by Myo Thida

388,-

Video context analysis is an active and vibrant research area, which provides means for extracting, analyzing and understanding behavior of a single target and multiple targets. Over the last few decades, computer vision researchers have been working to improve the accuracy and robustness of algorithms to analyse the context of a video automatically. In general, the research work in this area can be categorized into three major topics: 1) counting number of people in the scene 2) tracking individuals in a crowd and 3) understanding behavior of a single target or multiple targets in the scene. This book focusses on tracking individual targets and detecting abnormal behavior of a crowd in a complex scene. Firstly, this book surveys the state-of-the-art methods for tracking multiple targets in a complex scene and describes the authors' approach for tracking multiple targets. The proposed approach is to formulate the problem of multi-target tracking as an optimization problem of finding dynamic optima (pedestrians) where these optima interact frequently. A novel particle swarm optimization (PSO) algorithm that uses a set of multiple swarms is presented. Through particles and swarms diversification, motion prediction is introduced into the standard PSO, constraining swarm members to the most likely region in the search space. The social interaction among swarm and the output from pedestrians-detector are also incorporated into the velocity-updating equation. This allows the proposed approach to track multiple targets in a crowded scene with severe occlusion and heavy interactions among targets. The second part of this book discusses the problem of detecting and localising abnormal activities in crowded scenes. We present a spatio-temporal Laplacian Eigenmap method for extracting different crowd activities from videos. 
This method learns the spatial and temporal variations of local motions in an embedded space and employs representatives of different activities to construct the model which characterises the regular behavior of a crowd. This model of regular crowd behavior allows for the detection of abnormal crowd activities both in local and global context and the localization of regions which show abnormal behavior.
- Book
Wavelet Image Compression

by William Pearlman

353,-

This book explains the stages necessary to create a wavelet compression system for images and describes state-of-the-art systems used in image compression standards and current research. It starts with a high level discussion of the properties of the wavelet transform, especially the decomposition into multi-resolution subbands. It continues with an exposition of the null-zone, uniform quantization used in most subband coding systems and the optimal allocation of bitrate to the different subbands. Then the image compression systems of the FBI Fingerprint Compression Standard and the JPEG2000 Standard are described in detail. Following that, the set partitioning coders SPECK and SPIHT, and EZW are explained in detail and compared via a fictitious wavelet transform in actions and number of bits coded in a single pass in the top bit plane. The presentation teaches that, besides producing efficient compression, these coding systems, except for the FBI Standard, are capable of writing bit streams that have attributes of rate scalability, resolution scalability, and random access decoding. Many diagrams and tables accompany the text to aid understanding. The book is generous in pointing out references and resources to help the reader who wishes to expand his knowledge, know the origins of the methods, or find resources for running the various algorithms or building his own coding system. Table of Contents: Introduction / Characteristics of the Wavelet Transform / Generic Wavelet-based Coding Systems / The FBI Fingerprint Image Compression Standard / Set Partition Embedded Block (SPECK) Coding / Tree-based Wavelet Transform Coding Systems / Rate Control for Embedded Block Coders / Conclusion
- Book
Remote Sensing Image Processing

by Gustavo Camps-Valls

388,-

Earth observation is the field of science concerned with the problem of monitoring and modeling the processes on the Earth surface and their interaction with the atmosphere. The Earth is continuously monitored with advanced optical and radar sensors. The images are analyzed and processed to deliver useful products to individual users, agencies and public administrations. To deal with these problems, remote sensing image processing is nowadays a mature research area, and the techniques developed in the field allow many real-life applications with great societal value. For instance, urban monitoring, fire detection or flood prediction can have a great impact on economical and environmental issues. To attain such objectives, the remote sensing community has turned into a multidisciplinary field of science that embraces physics, signal theory, computer science, electronics and communications. From a machine learning and signal/image processing point of view, all the applications are tackled under specific formalisms, such as classification and clustering, regression and function approximation, data coding, restoration and enhancement, source unmixing, data fusion or feature selection and extraction. This book covers some of the fields in a comprehensive way. Table of Contents: Remote Sensing from Earth Observation Satellites / The Statistics of Remote Sensing Images / Remote Sensing Feature Selection and Extraction / Classification / Spectral Mixture Analysis / Estimation of Physical Parameters
- Book
The Structure and Properties of Color Spaces and the Representation of Color Images

by Eric Dubois

388,-

This lecture describes the author's approach to the representation of color spaces and their use for color image processing. The lecture starts with a precise formulation of the space of physical stimuli (light). The model includes both continuous spectra and monochromatic spectra in the form of Dirac deltas. The spectral densities are considered to be functions of a continuous wavelength variable. This leads into the formulation of color space as a three-dimensional vector space, with all the associated structure. The approach is to start with the axioms of color matching for normal human viewers, often called Grassmann's laws, and to develop the resulting vector space formulation. However, once the essential defining element of this vector space is identified, it can be extended to other color spaces, perhaps for different creatures and devices, and dimensions other than three. The CIE spaces are presented as main examples of color spaces. Many properties of the color space are examined. Once the vector space formulation is established, various useful decompositions of the space can be established. The first such decomposition is based on luminance, a measure of the relative brightness of a color. This leads to a direct-sum decomposition of color space where a two-dimensional subspace identifies the chromatic attribute, and a third coordinate provides the luminance. A different decomposition involving a projective space of chromaticity classes is then presented. Finally, it is shown how the three types of color deficiencies present in some groups of humans lead to a direct-sum decomposition of three one-dimensional subspaces that are associated with the three types of cone photoreceptors in the human retina. Next, a few specific linear and nonlinear color representations are presented. The color spaces of two digital cameras are also described. Then the issue of transformations between different color spaces is addressed. 
Finally, these ideas are applied to signal and system theory for color images. This is done using a vector signal approach where a general linear system is represented by a three-by-three system matrix. The formulation is applied to both continuous and discrete space images, and specific problems in color filter array sampling and displays are presented for illustration. The book is mainly targeted to researchers and graduate students in fields of signal processing related to any aspect of color imaging.
- Book
Super Resolution of Images and Video

by Aggelos K. Katsaggelos

388,-

This book focuses on the super resolution of images and video. The authors use the term super resolution (SR) to describe the process of obtaining a high resolution (HR) image, or a sequence of HR images, from a set of low resolution (LR) observations. This process has also been referred to in the literature as resolution enhancement (RE). SR has been applied primarily to spatial and temporal RE, but also to hyperspectral image enhancement. This book concentrates on motion-based spatial RE, although the authors also describe motion-free and hyperspectral image SR problems. Also examined is the very recent research area of SR for compression, which consists of the intentional downsampling, during pre-processing, of a video sequence to be compressed and the application of SR techniques, during post-processing, on the compressed sequence. It is clear that there is a strong interplay between the tools and techniques developed for SR and a number of other inverse problems encountered in signal processing (e.g., image restoration, motion estimation). SR techniques are being applied to a variety of fields, such as obtaining improved still images from video sequences (video printing), high definition television, high performance color Liquid Crystal Display (LCD) screens, improvement of the quality of color images taken by one CCD, video surveillance, remote sensing, and medical imaging. The authors believe that the SR/RE area has matured enough to develop a body of knowledge that can now start to provide useful and practical solutions to challenging real problems and that SR techniques can be an integral part of an image and video codec and can drive the development of new coder-decoders (codecs) and standards.
- Book
Tensor Voting

by Philippos Mordohai

388,-

This lecture presents research on a general framework for perceptual organization that was conducted mainly at the Institute for Robotics and Intelligent Systems of the University of Southern California. It is not written as a historical recount of the work, since the sequence of the presentation is not in chronological order. It aims at presenting an approach to a wide range of problems in computer vision and machine learning that is data-driven, local and requires a minimal number of assumptions. The tensor voting framework combines these properties and provides a unified perceptual organization methodology applicable in situations that may seem heterogeneous initially. We show how several problems can be posed as the organization of the inputs into salient perceptual structures, which are inferred via tensor voting. The work presented here extends the original tensor voting framework with the addition of boundary inference capabilities; a novel re-formulation of the framework applicable to high-dimensional spaces and the development of algorithms for computer vision and machine learning problems. We show complete analysis for some problems, while we briefly outline our approach for other applications and provide pointers to relevant sources.
- Book
Light Field Sampling

by Cha Zhang

388,-

Light field is one of the most representative image-based rendering techniques that generate novel virtual views from images instead of 3D models. The light field capture and rendering process can be considered as a procedure of sampling the light rays in the space and interpolating those in novel views. As a result, light field can be studied as a high-dimensional signal sampling problem, which has attracted a lot of research interest and become a convergence point between computer graphics and signal processing, and even computer vision. This lecture focuses on answering two questions regarding light field sampling, namely how many images are needed for a light field, and if such number is limited, where we should capture them. The book can be divided into three parts. First, we give a complete analysis on uniform sampling of IBR data. By introducing the surface plenoptic function, we are able to analyze the Fourier spectrum of non-Lambertian and occluded scenes. Given the spectrum, we also apply the generalized sampling theorem on the IBR data, which results in better rendering quality than rectangular sampling for complex scenes. Such uniform sampling analysis provides general guidelines on how the images in IBR should be taken. For instance, it shows that non-Lambertian and occluded scenes often require a higher sampling rate. Next, we describe a very general sampling framework named freeform sampling. Freeform sampling handles three kinds of problems: sample reduction, minimum sampling rate to meet an error requirement, and minimization of reconstruction error given a fixed number of samples. When the to-be-reconstructed function values are unknown, freeform sampling becomes active sampling. Algorithms of active sampling are developed for light field and show better results than the traditional uniform sampling approach. 
Third, we present a self-reconfigurable camera array that we developed, which features a very efficient algorithm for real-time rendering and the ability of automatically reconfiguring the cameras to improve the rendering quality. Both are based on active sampling. Our camera array is able to render dynamic scenes interactively at high quality. To the best of our knowledge, it is the first camera array that can reconfigure the camera positions automatically.
- Book
Real-Time Image and Video Processing

by Nasser Kehtarnavaz

411,-

This book presents an overview of the guidelines and strategies for transitioning an image or video processing algorithm from a research environment into a real-time constrained environment. Such guidelines and strategies are scattered in the literature of various disciplines including image processing, computer engineering, and software engineering, and thus have not previously appeared in one place. By bringing these strategies into one place, the book is intended to serve the greater community of researchers, practicing engineers, industrial professionals, who are interested in taking an image or video processing algorithm from a research environment to an actual real-time implementation on a resource constrained hardware platform. These strategies consist of algorithm simplifications, hardware architectures, and software methods. Throughout the book, carefully selected representative examples from the literature are presented to illustrate the discussed concepts. After reading the book, the readers are exposed to a wide variety of techniques and tools, which they can then employ to design a real-time image or video processing system.
- Book
MPEG-4 Beyond Conventional Video Coding

by Mihaela Van Der Schaar

353,-

An important merit of the MPEG-4 video standard is that it not only provided tools and algorithms for enhancing the compression efficiency of existing MPEG-2 and H.263 standards but also contributed key innovative solutions for new multimedia applications such as real-time video streaming to PCs and cell phones over Internet and wireless networks, interactive services, and multimedia access. Many of these solutions are currently used in practice or have been important stepping-stones for new standards and technologies. In this book, we do not aim at providing a complete reference for MPEG-4 video as many excellent references on the topic already exist. Instead, we focus on three topics that we believe formed key innovations of MPEG-4 video and that will continue to serve as an inspiration and basis for new, emerging standards, products, and technologies. The three topics highlighted in this book are object-based coding and scalability, Fine Granularity Scalability, and error resilience tools. This book is aimed at engineering students as well as professionals interested in learning about these MPEG-4 technologies for multimedia streaming and interaction. Finally, it is not aimed as a substitute or manual for the MPEG-4 standard, but rather as a tutorial focused on the principles and algorithms underlying it.
- Book
Modern Image Quality Assessment

by Zhou Wang

388,-

This Lecture book is about objective image quality assessment, where the aim is to provide computational models that can automatically predict perceptual image quality. The early years of the 21st century have witnessed a tremendous growth in the use of digital images as a means for representing and communicating information. A considerable percentage of this literature is devoted to methods for improving the appearance of images, or for maintaining the appearance of images that are processed. Nevertheless, the quality of digital images, processed or otherwise, is rarely perfect. Images are subject to distortions during acquisition, compression, transmission, processing, and reproduction. To maintain, control, and enhance the quality of images, it is important for image acquisition, management, communication, and processing systems to be able to identify and quantify image quality degradations. The goals of this book are as follows: a) to introduce the fundamentals of image quality assessment, and to explain the relevant engineering problems, b) to give a broad treatment of the current state of the art in image quality assessment, by describing leading algorithms that address these engineering problems, and c) to provide new directions for future research, by introducing recent models and paradigms that significantly differ from those used in the past. The book is written to be accessible to university students curious about the state of the art of image quality assessment, expert industrial R&D engineers seeking to implement image/video quality assessment systems for specific applications, and academic theorists interested in developing new algorithms for image quality assessment or using existing algorithms to design or optimize other image processing applications.
- Book
- 388
Recognition of Humans and Their Activities Using Video

by Rama Chellappa

353,-

The recognition of humans and their activities from video sequences is currently a very active area of research because of its applications in video surveillance, design of realistic entertainment systems, multimedia communications, and medical diagnosis. In this lecture, we discuss the use of face and gait signatures for human identification and recognition of human activities from video sequences. We survey existing work and describe some of the better-known methods in these areas. We also describe our own research and outline future possibilities. In the area of face recognition, we start with the traditional methods for image-based analysis and then describe some of the more recent developments related to the use of video sequences, 3D models, and techniques for representing variations of illumination. We note that the main challenge facing researchers in this area is the development of recognition strategies that are robust to changes due to pose, illumination, disguise, and aging. Gait recognition is a more recent area of research in video understanding, although it has been studied for a long time in psychophysics and kinesiology. The goal for video scientists working in this area is to automatically extract the parameters for representation of human gait. We describe some of the techniques that have been developed for this purpose, most of which are appearance based. We also highlight the challenges involved in dealing with changes in viewpoint and propose methods based on image synthesis, visual hull, and 3D models. In the domain of human activity recognition, we present an extensive survey of various methods that have been developed in disciplines such as artificial intelligence, image processing, pattern recognition, and computer vision. We then outline our method for modeling complex activities using 2D and 3D deformable shape theory.
The wide application of automatic human identification and activity recognition methods will require the fusion of different modalities like face and gait, dealing with the problems of pose and illumination variations, and accurate computation of 3D models. The last chapter of this lecture deals with these areas of future research.
- Book
- 353,-
Biomedical Image Analysis

by Nilanjan Ray & Scott Acton

388
- Book
- 388

© 2025 Tales.no