Abstract: In this work, we propose using camera arrays coupled with coherent illumination as an effective method of improving spatial resolution in long-distance imaging by a factor of ten and beyond. Recent advances in ptychography have demonstrated that one can image beyond the diffraction limit of the objective lens in a microscope. We demonstrate a similar imaging system to image beyond the diffraction limit in long-range imaging. We emulate a camera array with a single camera attached to an X-Y translation stage. We show that an appropriate phase-retrieval-based reconstruction algorithm can be used to effectively recover the lost high-resolution details from the multiple acquired low-resolution images. We analyze the effects of noise, the required degree of image overlap, and the effect of increasing synthetic aperture size on the reconstructed image quality. We show that coherent camera arrays have the potential to greatly improve imaging performance. Our simulations show that resolution gains of 10x and more are achievable. Furthermore, experimental results from our proof-of-concept systems show resolution gains of 4x to 7x for real scenes. Finally, we introduce and analyze in simulation a new strategy to capture macroscopic Fourier Ptychography images in a single snapshot, albeit using a camera array.
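The recovery step described above can be illustrated with a minimal alternating-projection phase-retrieval loop. This is a sketch, not the paper's exact algorithm: the function name, the assumption that each capture is given as an amplitude image on the full high-resolution grid, and the per-position boolean Fourier aperture masks are ours.

```python
import numpy as np

def recover_phase_retrieval(captured_mags, masks, hi_shape, n_iters=50):
    """Alternating-projection phase retrieval (illustrative sketch).

    captured_mags: amplitude images, one per camera position, sampled on
                   the full high-resolution grid (a simplifying assumption)
    masks:         boolean Fourier-domain aperture masks, one per position
    """
    F_est = np.ones(hi_shape, dtype=complex)  # high-res Fourier estimate
    for _ in range(n_iters):
        for mag, mask in zip(captured_mags, masks):
            sub = F_est * mask                          # keep this aperture's band
            img = np.fft.ifft2(np.fft.ifftshift(sub))   # back to image domain
            img = mag * np.exp(1j * np.angle(img))      # enforce measured amplitude
            F_new = np.fft.fftshift(np.fft.fft2(img))
            F_est = F_est * (~mask) + F_new * mask      # update only this band
    return np.abs(np.fft.ifft2(np.fft.ifftshift(F_est)))
```

Each camera position contributes one band of the synthetic aperture; iterating over overlapping bands stitches a Fourier estimate wider than any single lens admits.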
Abstract: In this paper, we advocate for Generalized Assorted Camera (GAC) arrays for multi-modal imaging—i.e., a camera array with filters of different characteristics placed in front of each camera aperture. GAC provides three distinct advantages over generalized assorted pixel (GAP) arrays: ease of implementation; flexible, application-dependent imaging, since the filters are external and can be changed; and depth information that can be used to enable novel applications (e.g., post-capture refocusing). The primary challenge in GAC arrays is that, since the different modalities are obtained from different viewpoints, accurate and efficient cross-channel registration is required. Traditional approaches such as SSD, SAD, and mutual information all result in multi-modal registration errors. Here, we propose a robust cross-channel matching cost function, based on aligning normalized gradients, that allows us to compute cross-channel sub-pixel correspondences for scenes exhibiting non-trivial geometry. We highlight the promise of GAC arrays with our cross-channel normalized gradient cost for several applications such as low-light imaging, post-capture refocusing, skin perfusion imaging using RGB+NIR, and hyperspectral imaging.
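The key idea of the normalized gradient cost can be sketched as follows. The function name, the exact normalization, and the sum-of-squared-differences aggregation are our assumptions for illustration; the paper's formulation may differ in detail.

```python
import numpy as np

def normalized_gradient_cost(patch_a, patch_b, eps=1e-6):
    """Matching cost between patches from two spectral channels (sketch).

    Gradients are normalized to unit magnitude so the cost compares edge
    structure rather than channel-specific intensities; eps guards flat
    regions against division by zero.
    """
    def unit_gradients(p):
        gy, gx = np.gradient(np.asarray(p, dtype=float))
        mag = np.hypot(gx, gy) + eps
        return gx / mag, gy / mag

    ax, ay = unit_gradients(patch_a)
    bx, by = unit_gradients(patch_b)
    return np.mean((ax - bx) ** 2 + (ay - by) ** 2)
```

Because the gradients are normalized, a patch and a gain-and-offset-transformed copy of it (as seen through a different filter) yield near-zero cost, which is exactly the invariance cross-channel stereo matching needs.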
Abstract: Video cameras are invariably bandwidth limited, and this results in a trade-off between spatial and temporal resolution. In this paper, we show that a simple coded exposure modulation is sufficient to reconstruct high-speed videos. We propose the Flutter Shutter Video Camera (FSVC), in which each exposure of the sensor is temporally coded using an independent pseudo-random sequence. Such exposure coding is easily achieved in modern sensors and is already a feature of several machine vision cameras. We also develop two algorithms for reconstructing the high-speed video: the first based on minimizing the total variation of the spatio-temporal slices of the video, and the second based on a data-driven, dictionary-based approximation. We evaluate our system on simulated videos and real data to illustrate its robustness.
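The forward model of such per-frame coded exposure can be sketched in a few lines. The function name and the normalization by the number of open code slots are our assumptions; reconstruction then inverts this model under a prior such as total variation or a learned dictionary.

```python
import numpy as np

def coded_exposure_capture(video, codes):
    """Forward model of per-frame coded exposure (illustrative sketch).

    video: (T, H, W) high-speed subframes; codes: (K, n) binary flutter
    sequences with T == K * n. Captured frame k integrates the subframes
    of its exposure window at the instants where the code is open.
    """
    T, H, W = video.shape
    K, n = codes.shape
    assert T == K * n, "codes must tile the subframe sequence"
    frames = np.empty((K, H, W))
    for k in range(K):
        window = video[k * n:(k + 1) * n]       # subframes of exposure k
        open_slots = max(int(codes[k].sum()), 1)
        frames[k] = np.tensordot(codes[k], window, axes=1) / open_slots
    return frames
```

An independent pseudo-random code per exposure makes the combined measurement operator well-conditioned for recovering all T subframes from only K captured frames.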
Abstract: SocialSync is a sub-frame synchronization protocol for capturing images simultaneously using a smartphone camera network. By synchronizing image captures to within a frame period, multiple smartphone cameras, which are often in use in social settings, can be used for a variety of applications including light field capture, depth estimation, and free viewpoint television. Currently, smartphone camera networks are limited to capturing static scenes due to motion artifacts caused by frame misalignment. To overcome this synchronization challenge, we first characterize frame capture on Android devices by analyzing the statistics of camera setup latency and frame delivery to the software application. Next, we develop the SocialSync protocol to achieve sub-frame synchronization between devices by estimating frame capture timestamps with millisecond accuracy. Finally, we demonstrate the effectiveness of SocialSync on mobile devices by reducing motion-induced artifacts when recovering the light field.
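Once per-frame capture timestamps are estimated, the residual sub-frame misalignment between two devices reduces to a phase difference modulo the frame period. The sketch below is our illustration of that idea, not the protocol's actual estimator; it assumes timestamps on a shared clock and phases stable within each stream.

```python
import numpy as np

def subframe_offset_ms(ts_a, ts_b, period_ms):
    """Sub-frame phase offset between two cameras (illustrative sketch).

    ts_a, ts_b: estimated capture timestamps (ms) from two devices;
    period_ms: common frame period. Each stream's phase is taken as its
    mean timestamp modulo the period (adequate when phases are stable and
    away from wrap-around); the result is the smallest shift needed to
    align the two streams.
    """
    phase_a = float(np.mean(np.mod(ts_a, period_ms)))
    phase_b = float(np.mean(np.mod(ts_b, period_ms)))
    d = (phase_b - phase_a) % period_ms
    return min(d, period_ms - d)  # alignment error magnitude, in ms
```

A protocol can then restart or delay one camera's preview stream until this offset falls below the desired sub-frame tolerance.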
Abstract: Screen-to-camera visible light communication links are fundamentally limited by inter-symbol interference, in which the camera receives multiple overlapping symbols in a single capture exposure. By determining the interference constraints, we are able to decode symbols with multi-bit depth across all three color channels. We present Styrofoam, a coding scheme that optimally satisfies these constraints by inserting blank frames into the transmission pattern. The coding scheme improves upon the state of the art in camera-based visible-light communication by: (1) ensuring a decode given at least a half-exposure of colored multi-bit symbols, (2) limiting decode latency to two transmission frames, and (3) transmitting 0.4 bytes per grid block at the slowest camera's frame rate. In doing so, we outperform peer unsynchronized VLC transmission schemes by 2.9x. Our implementation on smartphone displays and cameras achieves 69.1 kbps.
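The blank-frame idea can be sketched minimally: blanks separate consecutive data symbols so that an unsynchronized camera exposure never mixes two of them. The function name, parameters, and blank marker below are ours for illustration; the actual schedule (how many blanks, and where) is derived from the interference constraints analyzed in the paper.

```python
def insert_blank_frames(symbols, blanks_per_symbol=1, blank="BLANK"):
    """Interleave blank frames into a transmission pattern (sketch).

    symbols: data symbols to transmit, in order; each symbol is followed
    by blanks_per_symbol blank frames so an exposure spanning a frame
    boundary overlaps at most one data symbol.
    """
    pattern = []
    for s in symbols:
        pattern.append(s)
        pattern.extend([blank] * blanks_per_symbol)
    return pattern
```

The blanks trade raw frame rate for decodability: each exposure contains at most one attenuated symbol plus blank content, rather than a mixture of two symbols.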
Abstract: A tenet of object classification is that accuracy improves with an increasing number (and variety) of spectral channels available to the classifier. Hyperspectral images provide hundreds of narrowband measurements over a wide spectral range and offer classification performance superior to that of color images. However, hyperspectral data is highly redundant. In this paper, we suggest that only six measurements are needed to obtain classification results comparable to those realized using hyperspectral data. We present classification results for a natural scene using three imaging modalities: 1) three broadband color filters (RGB) and three narrowband samples, 2) six narrowband samples, and 3) six commonly available optical filters. If these results hold for larger datasets of natural images, recently proposed multispectral image sensors can be used to offer material classification results equal to those of hyperspectral data.