Morning Tutorials

T1: Signal-Dependent and Correlated Noise in Imaging: from Modelling to Parameter Estimation and Practical Filtering

T2: Perceptual Metrics for Image and Video Quality in a Broader Context: From Perceptual Transparency to Structural Equivalence

T4: Interpretable Deep Learning: Towards Understanding & Explaining Deep Neural Networks

T5: Drone Vision for Cinematography and Media Production

Afternoon Tutorials

T6: Restoration and Recognition of Blurred Images

T7: Geometry-preserving Embeddings: Dimensionality Reduction Techniques for Information Representation

T9: Machine Learning meets Computational Imaging: Big Data Analytics for Earth Observation

Morning Session

T1: Signal-Dependent and Correlated Noise in Imaging: from Modelling to Parameter Estimation and Practical Filtering

Presenter: Alessandro Foi (Tampere University of Technology, Finland)

Alessandro Foi received the M.Sc. degree in Mathematics from the Università degli Studi di Milano, Italy, in 2001, the Ph.D. degree in Mathematics from the Politecnico di Milano in 2005, and the D.Sc.Tech. degree in Signal Processing from Tampere University of Technology, Finland, in 2007. He is an Associate Professor of Signal Processing at Tampere University of Technology, leading the Signal and Image Restoration group.

His research interests include mathematical and statistical methods for signal processing, functional and harmonic analysis, and computational modelling of the human visual system. His work focuses on spatially adaptive (anisotropic, nonlocal) algorithms for the restoration and enhancement of digital images, on noise modelling for imaging devices, and on the optimal design of statistical transformations for the stabilization, normalization, and analysis of random data.

He is a Senior Member of the IEEE, Member of the Image, Video, and Multidimensional Signal Processing Technical Committee of the IEEE Signal Processing Society, and an Associate Editor for the IEEE Transactions on Computational Imaging and for the SIAM Journal on Imaging Sciences.


The additive white Gaussian noise (AWGN) model is ubiquitous in signal processing. This model is often justified by central-limit theorem (CLT) arguments. However, whereas the CLT may support a Gaussian distribution for the errors, it does not provide any justification for the assumed additivity and whiteness. As a matter of fact, data acquired in real applications can seldom be described with good approximation by the AWGN model, especially because errors are typically correlated and not additive. Failure to model the noise accurately leads to misleading analysis, ineffective filtering, and distortion or even failure in the estimation.

This tutorial provides an introduction to both signal-dependent and correlated noise and to the relevant models and methods for the analysis and practical processing of signals corrupted by these types of noise. Special emphasis is placed on effective techniques for noise suppression, and in particular on the recent developments in the optimal design of forward and inverse variance-stabilizing transformations, and their application to extreme low-energy acquisition under spatially correlated noise.
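As a rough illustration of what a forward variance-stabilizing transformation does, the sketch below applies the classical Anscombe transform to simulated Poisson data. The Anscombe transform is used here only as a familiar stand-in; it is not the optimized forward and inverse transformations the tutorial refers to, and the function names `anscombe` and `stabilized_variances` are assumptions made for this example.

```python
import numpy as np


def anscombe(x):
    """Classical Anscombe transform: maps Poisson counts to data whose
    variance is approximately 1, regardless of the underlying mean."""
    return 2.0 * np.sqrt(np.asarray(x, dtype=float) + 3.0 / 8.0)


def stabilized_variances(means, n_samples=200_000, seed=0):
    """For each Poisson mean, return (raw variance, variance after Anscombe).

    The raw variance grows with the mean (signal-dependent noise), while the
    transformed variance should stay close to 1 for means that are not too small.
    """
    rng = np.random.default_rng(seed)
    results = {}
    for m in means:
        samples = rng.poisson(lam=m, size=n_samples)
        results[m] = (samples.var(), anscombe(samples).var())
    return results


if __name__ == "__main__":
    for mean, (raw, stab) in stabilized_variances([10, 50, 200]).items():
        print(f"mean={mean}: raw variance={raw:.2f}, stabilized variance={stab:.3f}")
```

After stabilization, a filter designed for homoscedastic Gaussian noise can be applied, followed by an inverse transformation. The plain algebraic inverse of the Anscombe transform is biased at low counts, which is one reason the design of the inverse matters in low-energy acquisition.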

The distribution families covered as leading examples in the tutorial include Poisson, Rayleigh, Rice, multiplicative families, as well as doubly censored distributions. We consider various forms of noise correlation, encompassing pixel and read-out cross-talk, fixed-pattern noise, column noise, etc., as well as related issues like photo-response and gain non-uniformities, and processing-induced noise correlation. Consequently, the introduced models and techniques are applicable to several important imaging scenarios and technologies, such as raw data from digital camera sensors, various types of radiation imaging relevant to security and to biomedical imaging, ultrasound and seismic sensing, magnetic resonance imaging, synthetic aperture radar imaging, photon-limited imaging of faint astronomical sources, microbolometer arrays for long-wavelength infrared imaging in thermography, etc.

The tutorial is accompanied by numerous experimental examples where the presented methods are applied to competitive signal processing problems, often achieving the state of the art in image and multidimensional data restoration.

T2: Perceptual Metrics for Image and Video Quality in a Broader Context: From Perceptual Transparency to Structural Equivalence

Presenters: Thrasyvoulos N. Pappas (Northwestern University, USA), Sheila S. Hemami (Draper, USA)

Thrasyvoulos N. Pappas received the S.B., S.M., and Ph.D. degrees in electrical engineering and computer science from MIT in 1979, 1982, and 1987, respectively. From 1987 until 1999, he was a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ. He is currently a professor in the Department of Electrical and Computer Engineering at Northwestern University, which he joined in 1999. His research interests are in image and video quality and compression, image and video analysis, content-based retrieval, perceptual models for multimedia processing, model-based halftoning, and tactile and multimodal interfaces. Prof. Pappas has served as Vice-President Publications, IEEE Signal Processing Society (2015-2017), editor-in-chief of the IEEE Transactions on Image Processing (2010-12), elected member of the Board of Governors of the Signal Processing Society of IEEE (2004-06), chair of the IEEE Image and Multidimensional Signal Processing (now IVMSP) Technical Committee (2002-03), technical program co-chair of ICIP-01 and ICIP-09, and co-chair of the 2011 IEEE IVMSP Workshop on Perception and Visual Analysis. He has also served as co-chair of the 2005 SPIE/IS&T Electronic Imaging Symposium and co-chair of the SPIE/IS&T Conference on Human Vision and Electronic Imaging (1997-2018). He is currently co-editor-in-chief of the IS&T Journal of Perceptual Imaging. Dr. Pappas is a Fellow of IEEE, SPIE, and IS&T.

Sheila S. Hemami received the B.S.E.E. degree from the University of Michigan in 1990, and the M.S.E.E. and Ph.D. degrees from Stanford University in 1992 and 1994, respectively. She was with Hewlett-Packard Laboratories in Palo Alto, California in 1994 and was with the School of Electrical Engineering at Cornell University from 1995 to 2013. From 2013 to 2016 she was Professor and Chair of the Department of Electrical & Computer Engineering at Northeastern University in Boston, MA. She is currently Director of Strategic Technical Opportunities at Draper Lab. Dr. Hemami's research interests broadly concern communication of visual information from the perspectives of both signal processing and psychophysics. She was elected a Fellow of the IEEE in 2009 for her contributions to robust and perceptual image and video communications. Dr. Hemami has held various visiting positions, most recently at the University of Nantes, France and at Ecole Polytechnique Federale de Lausanne, Switzerland. She has received numerous university and national teaching awards, including Eta Kappa Nu's C. Holmes MacDonald Award. She served as Vice-President Publications Products and Services, IEEE (2015-2016). She was a Distinguished Lecturer for the IEEE Signal Processing Society in 2010-11 and was editor-in-chief of the IEEE Transactions on Multimedia from 2008 to 2010. She has held various technical leadership positions in the IEEE.


We will examine objective criteria for the evaluation of image quality that are based on models of visual perception. Our primary emphasis will be on image fidelity, i.e., how close an image is to a given original or reference image, but we will broaden the scope of image fidelity to include structural equivalence. We will also discuss no-reference and limited-reference metrics. We will examine a variety of applications with special emphasis on image and video compression. We will examine near-threshold perceptual metrics, which explicitly account for human visual system (HVS) sensitivity to noise by estimating thresholds above which the distortion is just-noticeable, and supra-threshold metrics, which attempt to quantify visible distortions encountered in high compression applications or when there are losses due to channel conditions. We will also consider metrics for structural equivalence, whereby the original and the distorted image have visible differences but both look natural and are of equally high visual quality. We will also take a close look at procedures for evaluating the performance of quality metrics, including database design, models for generating realistic distortions for various applications, and subjective procedures for metric development and testing. Throughout the course we will discuss both the state of the art and directions for future research.

T4: Interpretable Deep Learning: Towards Understanding & Explaining Deep Neural Networks

Presenters: Wojciech Samek (Fraunhofer HHI, Germany), Grégoire Montavon (TU Berlin, Germany), Klaus-Robert Müller (TU Berlin, Germany)

Wojciech Samek is head of the Machine Learning Group at Fraunhofer Heinrich Hertz Institute, Berlin, Germany. He studied computer science at Humboldt University of Berlin, Heriot-Watt University and University of Edinburgh from 2004 to 2010, was a visiting researcher at NASA Ames Research Center, Mountain View, CA, in 2009, and received the Ph.D. degree from the Technische Universität Berlin in 2014. He is associated with the Berlin Big Data Center and is an editorial board member of Digital Signal Processing. His current research interests include deep learning, interpretability & robustness, and computer vision.

Grégoire Montavon received the master's degree in communication systems from the École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, in 2009, and the Ph.D. degree in machine learning from the Technische Universität Berlin (TU Berlin), Berlin, in 2013. He is currently a Research Associate with the Machine Learning Group, TU Berlin. His current research interests include neural networks, machine learning, and data analysis.

Klaus-Robert Müller received the degree in physics from Karlsruhe in 1989 and the Ph.D. degree in computer science from Technische Universität Karlsruhe in 1992. He has been a Professor of computer science with Technische Universität Berlin since 2006; he was the Director of the Bernstein Focus on Neurotechnology Berlin until 2014 and after that became co-director of the Berlin Big Data Center. After completing a post-doctoral position at GMD FIRST, Berlin, he was a Research Fellow with The University of Tokyo from 1994 to 1995. In 1995, he founded the Intelligent Data Analysis Group, GMD-FIRST (later Fraunhofer FIRST) and directed it until 2008. From 1999 to 2006, he was a Professor with the University of Potsdam. He received the 1999 Olympus Prize from the German Pattern Recognition Society, DAGM, the SEL Alcatel Communication Award in 2006, the Science Prize of Berlin from the Governing Mayor of Berlin in 2014 and the Vodafone Innovation Award in 2017. He was elected a member of the German National Academy of Sciences-Leopoldina and the Berlin Brandenburg Academy of Sciences in 2012 and 2017, respectively, and was elected a scientific member of the Max-Planck Society in 2017. His research interests are machine learning and its broad application to the sciences (neuroscience, medicine, physics and chemistry) and industry.


Powerful machine learning algorithms such as deep neural networks (DNNs) are now able to harvest very large amounts of training data and to convert them into highly accurate predictive models. DNN models have reached state-of-the-art accuracy in a wide range of practical applications. At the same time, DNNs are generally considered as black boxes, because given their nonlinearity and deeply nested structure it is difficult to intuitively and quantitatively understand their inference, e.g. what made the trained DNN model arrive at a particular decision for a given data point. This is a major drawback for applications where interpretability of the decision is an inevitable prerequisite.

For instance, in medical diagnosis incorrect predictions can be lethal, thus simple black-box predictions cannot be trusted by default. Instead, the predictions should be made interpretable to a human expert for verification.

In the sciences, deep learning algorithms are able to extract complex relations between physical or biological observables. The design of interpretation methods to explain these newly inferred relations can be useful for building scientific hypotheses, or for detecting possible artifacts in the data/model. From an engineer’s perspective, too, interpretability is a crucial feature, because it allows one to identify the most relevant features and parameters and, more generally, to understand the strengths and weaknesses of a model. This feedback can be used to improve the structure of the model or to speed up training.

Recently, the transparency problem has been receiving more attention in the deep learning community. Several methods have been developed to understand what a DNN has learned. Some of this work is dedicated to visualizing particular neurons or neuron layers, while other work focuses on methods that visualize the impact of particular regions of a given input image. An important question for the practitioner is how to objectively measure the quality of an explanation of the DNN prediction and how to use these explanations for improving the learned model.

Our tutorial will present recently proposed techniques for interpreting, explaining and visualizing deep models and explore their practical usefulness in computer vision.

T5: Drone Vision for Cinematography and Media Production

Presenters: Ioannis Pitas (Aristotle University of Thessaloniki, Greece), J. R. Martínez-de Dios (University of Seville, Spain), Anastasios Tefas (Aristotle University of Thessaloniki, Greece)

I. Pitas (PhD, IEEE fellow, IEEE Distinguished Lecturer, EURASIP fellow) is Professor at the Department of Informatics, Aristotle University of Thessaloniki, Greece. His current interests are in the areas of image/video processing, intelligent digital media, machine learning, human centered interfaces, affective computing, computer vision, 3D imaging and biomedical imaging. He has published over 850 papers, contributed to 44 books in his areas of interest and edited or (co-)authored another 11 books. He has also been a member of the program committee of many scientific conferences and workshops. In the past he served as Associate Editor or co-Editor of eight international journals and General or Technical Chair of four international conferences. He was General Chair of ICIP 2001. He participated in 68 R&D projects, primarily funded by the European Union and is/was principal investigator/researcher in 40 such projects. He has 26700+ citations (Google Scholar) to his work and h-index 78 (Google Scholar).

J. R. Martínez-de Dios is a Professor with the Robotics, Vision and Control Group at the University of Seville. His main research topics are multi-robot systems, aerial robot perception, cooperative perception and robot-sensor network systems. In these topics he has coordinated 14 R&D projects and has participated in 55 other projects, including 18 EU-funded in FPs IV, V, VI, VII and H2020 programmes. He has also coordinated and participated in 20 other technology transfer projects to companies such as BOEING, AIRBUS and IBERDROLA, among others. He is author or co-author of over 100 publications on multi-robot systems, cooperative perception and sensor fusion. He has served as co-chair and TPC member in 15 international conferences and is a member of the editorial board of 4 indexed international journals. His R&D activities have obtained 5 international awards including one “2010 EURON/EUROP Technology Transfer Award” and one “2017 EU Drone Award”.

Anastasios Tefas received the B.Sc. in informatics in 1997 and the Ph.D. degree in Informatics in 2002, both from the Aristotle University of Thessaloniki, Greece. Since 2017 he has been an Associate Professor at the Department of Informatics, Aristotle University of Thessaloniki. From 2008 to 2016 he was a Lecturer, Assistant Professor at the same University. From 2006 to 2008, he was an Assistant Professor at the Department of Information Management, Technological Institute of Kavala. From 2003 to 2004, he was a temporary lecturer in the Department of Informatics, University of Thessaloniki. Dr. Tefas participated in 15 research projects financed by national and European funds. He has co-authored 72 journal papers, 177 papers in international conferences and contributed 8 chapters to edited books in his area of expertise. Over 4000 citations have been recorded to his publications and his H-index is 33 according to Google Scholar. His current research interests include computational intelligence, deep learning, pattern recognition, statistical machine learning, digital signal and image processing and computer vision, biometrics and security.


The aim of drone cinematography is to develop innovative intelligent single- and multiple-drone platforms for media production. Such systems should be able to cover outdoor events (e.g., sports) that are typically distributed over large expanses, ranging, for example, from a stadium to an entire city. The drone or drone team, to be managed by the production director and his/her production crew, shall have: a) increased multiple drone decisional autonomy, hence allowing event coverage in the time span of around one hour in an outdoor environment and b) improved multiple drone robustness and safety mechanisms (e.g., communication robustness/safety, embedded flight regulation compliance, enhanced crowd avoidance and emergency landing mechanisms), enabling it to carry out its mission against errors or crew inaction and to handle emergencies. Such robustness is particularly important, as the drones will operate close to crowds and/or may face environmental hazards (e.g., wind). Therefore, it must be contextually aware and adaptive, towards maximizing shooting creativity and productivity, while minimizing production costs. Drone vision and machine learning play an important role towards this end, covering the following topics: a) drone localization, b) drone visual analysis for target/obstacle/crowd/point of interest detection, c) 2D/3D target tracking, d) privacy protection technologies in drones (e.g. face de-identification).

The tutorial will offer an overview of all the above plus other related topics such as: a) current state of the art on the use of drones in cinematography, advertisement, news coverage, sports and media production in general, b) multiple drone mission planning and flight control, c) communication issues in drones (e.g. video streaming), d) security and ethics issues in drones.

Afternoon Session

T6: Restoration and Recognition of Blurred Images

Presenters: Filip Sroubek, Jan Flusser, Barbara Zitova (Inst. of Information Theory and Automation, Acad. of Sciences, Czech Republic)

Filip Sroubek received the MS degree in computer science from the Czech Technical University, Prague, Czech Republic in 1998 and the PhD degree in computer science from Charles University, Prague, Czech Republic in 2003. From 2004 to 2006, he held a postdoctoral position at the Instituto de Optica, CSIC, Madrid, Spain. In 2010 and 2011, he was a Fulbright Visiting Scholar at the University of California, Santa Cruz. He is currently with the Institute of Information Theory and Automation, the Czech Academy of Sciences, as the vice-head of the image processing department, and gives a graduate course on variational methods in image processing at the Czech Technical University and Charles University.

His research covers all aspects of image processing, in particular, image restoration (denoising, blind deconvolution, super-resolution) and image fusion (multimodal, multifocus). He is an author of 8 book chapters and over 60 journal and conference papers. In addition, he co-authored several tutorials at major international conferences (ICIP'05, EUSIPCO'07, CVPR'08, ICCV'15, ICPR'16) and was a keynote speaker at SPIE-IS&T'15 and ICIIP'13. He is a co-inventor of two patents.

His scientific achievements have been recognized with several national prizes: the Josef Hlavka Student Prize, the Otto Wichterle Premium of the Czech Academy of Sciences for excellent young scientists, and the Czech Science Foundation Award.

Jan Flusser received the M.Sc. degree in mathematical engineering from the Czech Technical University, Prague, Czech Republic in 1985 and the Ph.D. degree in computer science from the Czechoslovak Academy of Sciences in 1990. Since 1985 he has been with the Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague. From 1995 to 2007 he was head of the Department of Image Processing. In 2007 he was appointed Director of the Institute. Since 1991 he has also been affiliated with the Faculty of Mathematics and Physics, Charles University, Prague and with the Czech Technical University, Prague (full professorship in 2004), where he gives undergraduate and graduate courses on Digital Image Processing and Pattern Recognition and a specialized graduate course on Invariants and Wavelets. He has research and teaching experience from many universities and institutions worldwide.

Jan Flusser has 25 years of experience in basic and applied research in image analysis, pattern recognition, and related fields. He has been involved in applications in remote sensing, medicine, and astronomy. He has authored and co-authored more than 200 research publications in these areas. He has presented more than 20 tutorials and invited/keynote talks at international conferences (ICIP'05, ICIP'07, EUSIPCO'07, CVPR'08, FUSION'08, SPPRA'09, SCIA'09, ICIP'09, SPPRA'10, COMPSTAT'06, WIO'06, DICTA'07, AIA'14, ICPR'16, and others). Slides of selected tutorials are available at http://zoi.utia.cas.cz/tutorials. Some of his journal papers have become classics and are frequently cited (Google Scholar reports more than 10 000 citations of J. Flusser's publications).

J. Flusser has received several national and international scientific awards and prizes (Scopus 1000 Award, Felber Medal, Czech Science Foundation Award, The Czech Academy of Sciences Prize, and several best paper awards). His book "Moments and Moment Invariants in Pattern Recognition", Wiley, 2009, has become a worldwide textbook and the main reference in the field of moment-based image analysis.

Barbara Zitova received the M.Sc. degree in computer science and the Ph.D. degree in software systems from Charles University, Prague, Czech Republic, in 1995 and 2000, respectively. Since 1995, she has been with the Institute of Information Theory and Automation, Czech Academy of Sciences. Since 2008, she has been the Head of the Department of Image Processing. She gives undergraduate and graduate courses on digital image processing and wavelets in image processing with the Czech Technical University and Charles University. Her research interests include geometric invariants, image enhancement, image registration and image fusion, and image processing applications in cultural heritage and medical imaging.

She has authored/co-authored over 60 research publications in these areas, including the monographs Moments and Moment Invariants in Pattern Recognition (Wiley, 2009) and 2D and 3D Image Analysis by Moments (Wiley, 2016), and tutorials at major conferences (ICIP'05, ICIP'07, EUSIPCO'07, CVPR'08, ICIP'09, ICPR'16). Some of her journal papers have become classics and are frequently cited (Google Scholar reports more than 7 000 citations of B. Zitova's publications).

She has received several awards: the Josef Hlavka Student Prize, the Otto Wichterle Premium of the Czech Academy of Sciences for excellent young scientists, the Czech Science Foundation Award, The Czech Academy of Sciences Prize, several best paper awards, and the SCOPUS 1000 Award for more than 1000 citations of a single paper in 2010.


Blur is an inevitable, and typically unwanted, phenomenon that is present in all digital images. It smooths high-frequency details, which makes image analysis difficult. Heavy blur may degrade the image so seriously that neither automatic analysis nor visual interpretation of the content is possible. There are also situations in which the blur is not a nuisance, because it conveys information about its source. For example, motion blur gives us hints about the camera and/or object motion. Two major approaches to handling blurred images exist. They are complementary rather than competing; each of them is appropriate for different tasks and employs different mathematical methods and algorithms.

Image restoration is one of the oldest areas of image processing. It appeared as early as the 1960s and 1970s in the work of the pioneers A. Rosenfeld, H. Andrews, B. Hunt, and others. In the last ten years, this area has received new impulses and has developed quickly. We have witnessed the appearance of multichannel techniques, blind techniques, and superresolution enhancement resolved by means of variational calculus in very high-dimensional spaces. A common point of all these methods is that they suppress or even remove the blur from the input image and produce an image of high visual quality. However, image restoration problems are often ill-posed and ill-conditioned, and the methods that solve them are time consuming.

In contrast, the blur-invariant approach, originally proposed in 1995, works directly with the blurred data without any preprocessing. The blurred image is described by features that are invariant with respect to convolution with some group of kernels. Image analysis is then performed in the feature space. This approach is suitable for object recognition, template matching, and other tasks where we want to recognize/localize objects rather than to restore the complete image. The mathematics behind it is based on projection operators and moment invariants.

In this tutorial, we will focus on both approaches. We start with blur modelling and an analysis of potential sources of blur in real images. In the image restoration part of the tutorial we review traditional as well as modern deconvolution techniques, including blind deconvolution, space variant deconvolution, and multichannel deconvolution. The next part covers invariants to image blurring. The tutorial will be completed with numerous demonstrations and practical examples. The tutorial draws on the speakers' 20 years of experience in image restoration, deconvolution, invariants, and related fields.

T7: Geometry-preserving Embeddings: Dimensionality Reduction Techniques for Information Representation

Presenters: Petros Boufounos (MERL, USA), Laurent Jacques (UCL, Belgium)

Petros T. Boufounos is Senior Principal Member of Research Staff and the Computational Sensing Team Leader at Mitsubishi Electric Research Laboratories (MERL), and a visiting scholar at the Rice University Electrical and Computer Engineering department. Dr. Boufounos completed his undergraduate and graduate studies at MIT. He received the S.B. degree in Economics in 2000, the S.B. and M.Eng. degrees in Electrical Engineering and Computer Science (EECS) in 2002, and the Sc.D. degree in EECS in 2006. Between September 2006 and December 2008, he was a postdoctoral associate with the Digital Signal Processing Group at Rice University. Dr. Boufounos joined MERL in January 2009. Dr. Boufounos' immediate research focus includes signal acquisition and processing, frame theory, quantization and data representations. He is also interested in how signal acquisition interacts with other fields that use sensing extensively, such as machine learning, robotics and mechatronics. Dr. Boufounos is a Senior Area Editor at IEEE Signal Processing Letters and a member of the IEEE Signal Processing Society Theory and Methods technical committee.

Laurent Jacques received the B.Sc. in Physics, the M.Sc. in Mathematical Physics and the PhD in Mathematical Physics from the Université catholique de Louvain (UCL), Belgium. He was a Postdoctoral Researcher with the Communications and Remote Sensing Laboratory of UCL in 2005–2006. In 2006 he obtained four-year postdoctoral funding from the Belgian FRS-FNRS in the same lab. He was a visiting Postdoctoral Researcher, in spring 2007, at Rice University (DSP/ECE, Houston, TX, USA), and from 2007 to 2009, at the Swiss Federal Institute of Technology (LTS2/EPFL, Switzerland). Formerly funded by Belgian Science Policy (Return Grant, BELSPO, 2010-2011), and as a F.R.S.-FNRS Scientific Research Worker (2011-2012) in the ICTEAM institute of UCL, he has been an FNRS Research Associate since 2012. His research focuses on Sparse Representations of signals (1-D, 2-D, sphere), Compressed Sensing theory (reconstruction, quantization) and applications, Inverse Problems (in Optics), and Computer Vision. Since 1999, Laurent Jacques has co-authored 35 papers in international journals, 65 conference proceedings and presentations in signal and image processing conferences, and 4 book chapters.


Recent developments in compressed sensing, machine learning and dimensionality reduction have reinvigorated interest in the theory and applications of embeddings. Embeddings are transformations of signals and sets of signals that approximately preserve some aspects of the geometry of the set, while reducing the complexity of handling such signals. For example, Johnson-Lindenstrauss (JL) embeddings [20], one of the earliest and most celebrated embedding constructions, provide tools for significant dimensionality reduction, agnostic to the data. Recent literature has significantly expanded the range of embedding constructions available, often departing from the linearity of JL’s method, to achieve different goals. This recent literature has shown that properly designed embeddings are a powerful tool for efficient information representation.
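To make the idea of a data-agnostic, distance-preserving embedding concrete, here is a minimal sketch of a JL-style linear embedding using a Gaussian random projection. It is an illustration under stated assumptions, not the presenters' construction: the function names `jl_embed`, `pairwise_distances` and `max_distortion`, and the dimensions used in the usage example, are chosen only for this example.

```python
import numpy as np


def jl_embed(points, k, seed=0):
    """Project rows of `points` (n x d) to k dimensions with a random
    Gaussian matrix whose entries have variance 1/k, so that squared
    norms are preserved in expectation."""
    rng = np.random.default_rng(seed)
    d = points.shape[1]
    projection = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return points @ projection


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    return np.sqrt(np.maximum(sq_dists, 0.0))


def max_distortion(points, k, seed=0):
    """Largest relative change in any pairwise distance after embedding."""
    original = pairwise_distances(points)
    embedded = pairwise_distances(jl_embed(points, k, seed))
    off_diagonal = ~np.eye(len(points), dtype=bool)
    return np.max(np.abs(embedded[off_diagonal] / original[off_diagonal] - 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(size=(50, 2000))  # 50 points in 2000 dimensions
    print("max relative distortion at k=500:", max_distortion(data, k=500))
```

The projection matrix is drawn without looking at the data, which is what "agnostic to the data" means here; the distortion shrinks as the target dimension k grows.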

Our goal with this tutorial is to present this body of work, the available solutions, the theoretical underpinnings and practical considerations, as well as the problems still open in the field. We will provide a broad treatment of the topic, drawn both from our extensive work in the area and from other literature available in the field. Our objective is also to provide a well-balanced presentation of both theory and practice, guided by intuitive explanations, assuming only minimal background on dimensionality reduction and embedding theory.

More precisely, the tutorial aims to overview the fundamentals of embedding constructions, starting with the foundational work of Johnson and Lindenstrauss, and developing the general framework of randomized constructions. We will discuss different embedding goals, including distance and local geometry preservation, locality sensitive hashing (LSH), kernel machine implementation, and feature quantization, among others. We will also explore different applications in, e.g., classification, detection, and feature compression. Although our focus is on randomized, universal and data-agnostic constructions, we will also explore JL-style constructions that learn from the data geometry, to further improve performance at the expense of universality. The tutorial will conclude with discussion of the open problems in the area, future trends and promising research directions.

T9: Machine Learning meets Computational Imaging: Big Data Analytics for Earth Observation

Presenter: Mihai Datcu (DLR, Germany)

Mihai Datcu received the M.S. and Ph.D. degrees in Electronics and Telecommunications from the University Politechnica Bucharest UPB, Romania, in 1978 and 1986. In 1999 he received the title Habilitation à diriger des recherches in Computer Science from University Louis Pasteur, Strasbourg, France. Since 1981 he has been Professor with the Department of Applied Electronics and Information Engineering, Faculty of Electronics, Telecommunications and Information Technology (ETTI), UPB, working in image processing and Electronic Speckle Interferometry. Since 1993, he has been a scientist with the German Aerospace Center (DLR), Oberpfaffenhofen. He is developing algorithms for model-based information retrieval from high complexity signals and methods for scene understanding from Very High Resolution Synthetic Aperture Radar (SAR) and Interferometric SAR data. He is engaged in research related to information theoretical aspects and semantic representations in advanced communication systems. Currently he is Senior Scientist and Image Mining research group leader with the Remote Sensing Technology Institute (IMF) of DLR, Oberpfaffenhofen. Since 2011 he has also been leading the Immersive Visual Information Mining research lab at the Munich Aerospace Faculty and he is director of the Research Center for Spatial Information at UPB. His interests are in Bayesian inference, information and complexity theory, stochastic processes, model-based scene understanding, and image information mining, for applications in information retrieval and understanding of high resolution SAR and optical observations.
He has held Visiting Professor appointments with the University of Oviedo, Spain, the University Louis Pasteur and the International Space University, both in Strasbourg, France, University of Siegen, Germany, University of Innsbruck, Austria, University of Alcala, Spain, University Tor Vergata, Rome, Italy, Universidad Pontificia de Salamanca, campus de Madrid, Spain, University of Camerino, Italy, and the Swiss Center for Scientific Computing (CSCS), Manno, Switzerland. From 1992 to 2002 he had a longer Invited Professor assignment with the Swiss Federal Institute of Technology, ETH Zurich. In 2001 he initiated, and has since led, the Competence Centre on Information Extraction and Image Understanding for Earth Observation at ParisTech, Paris Institute of Technology, Telecom Paris, a collaboration of DLR with the French Space Agency (CNES). He has been holder of the DLR-CNES Chair at ParisTech, Paris Institute of Technology, Telecom Paris. He initiated the European frame of projects for Image Information Mining (IIM) and is involved in research programs for information extraction, data mining, knowledge discovery, and data understanding with the European Space Agency (ESA), NASA, and in a variety of national and European projects. He is a member of the European Big Data From Space Coordination Group (BiDS). He and his team have developed, and continue to develop, the operational IIM processor in the Payload Ground Segment systems for the German missions TerraSAR-X and TanDEM-X and for the ESA Sentinel 1 and 2 missions. He is the author of more than 400 scientific publications, among them about 90 journal papers, and a book on number theory. He has served as a co-organizer of international conferences and workshops, and as guest editor of special issues of IEEE and other journals.
In 2006 he received the Best Paper Award, IEEE Geoscience and Remote Sensing Society Prize; in 2008 the National Order of Merit with the rank of Knight, for outstanding international research results, awarded by the President of Romania; and in 1987 the Romanian Academy Prize Traian Vuia for the development of the SAADI image analysis system and his activity in image processing. He is an IEEE Fellow of the Signal Processing, Computer, and Geoscience and Remote Sensing societies. In 2017 he was awarded a Blaise Pascal Chair for international recognition in the field of Data Science in Earth Observation, with the Centre d’Etudes et de Recherche en Informatique (CEDRIC) at the Conservatoire National des Arts et Métiers (CNAM) in Paris.


The deluge of Earth Observation (EO) images, amounting to hundreds of Terabytes per day, needs to be converted into meaningful information, largely impacting the socio-economic-environmental triangle. Multispectral and microwave EO sensors are unceasingly streaming millions of samples per second, which must be analysed to extract semantics or physical parameters for understanding Earth spatio-temporal patterns and phenomena. Typical EO multispectral sensors acquire images in several spectral channels covering the visible and infrared spectra, while Synthetic Aperture Radar (SAR) images are represented as complex values capturing modulations in amplitude, frequency, phase or polarization of the collected radar echoes. An important particularity of EO images is their “instrument” nature: in addition to the spatial information, they sense physical parameters, and they mainly sense outside of the visual spectrum.

Machine and deep learning methods are mainly used for image classification or object segmentation, usually applied to one single image at a time and associated with visual perception. The tutorial presents specific solutions for the EO sensory and semantic gap.

The tutorial therefore aims to broaden the concepts of image processing by introducing models and methods for physically meaningful feature extraction, enabling high-accuracy characterization of any structure in large volumes of EO images. The tutorial presents the advancement of the paradigms for stochastic and Bayesian inference, evolving to the methods of deep learning and generative adversarial networks. Since the data sets are an organic part of the learning process, EO dataset biases pose new challenges. The tutorial addresses open questions on relative data bias and cross-dataset generalization for very specific EO cases, such as multispectral and SAR observations with a large variability of imaging parameters and semantic content. The challenge of very limited and highly complex training data sets is addressed by introducing paradigms to minimize the amount of computation and to learn jointly with the amount of known available data, using cognitive primitives for grasping the behaviour of the observed objects or processes.

To implement these techniques in practice, a current trend in Big Data processing is to bring the algorithms to the data on the cloud, instead of downloading large datasets and running algorithms on local servers. EO, however, demands a more advanced paradigm: bringing the algorithms to the sensor. The sensor is the source of the big data, and the tutorial analyses the methods of computational imaging to optimize EO information sensing. It covers the most advanced methods in synthetic aperture, coded aperture, compressive sensing, data compression, and ghost imaging, as well as the basics of quantum sensing. The overall theoretical trends are summarized from the perspective of practical applications.