Luminance monitoring within the field of view (FOV) is required for assessing visual comfort and overall visual preferences, but it is practically challenging and intrusive. As a result, real-time, human-centered daylighting operation remains a challenge. This paper presents a novel deep-learning-based framework to demonstrate that meaningful features in the occupant's visual field can be extracted without invasive measurements. It is the first proof of concept to show that it is feasible to monitor luminance distributions as perceived by people, using a non-intrusive camera integrated with deep learning neural networks. A Conditional Generative Adversarial Network (CGAN), pix2pix, is used to transfer information from non-intrusive images to FOV images. Two datasets were collected in an open-plan office with compact, low-cost High Dynamic Range Image (HDRI) cameras installed at two alternative locations (a wall or a monitor), to separately train two pix2pix models with the same target FOV images. The results show that the generated FOV images closely resemble the measured FOV images in terms of pixelwise luminance errors, mean luminance, and structural similarity. The main errors occur in bright regions visible through windows and are confined to a very limited number of pixels. Overall, this work establishes a basis for future studies to assess the effect of the visual environment on human perception using non-intrusive measurements. It also provides the theoretical foundation for a connected paper (Lu et al., 2025), which demonstrates that non-intrusive measurements and deep learning techniques can be used to discover daylight preferences and enable AI-assisted daylighting operation.
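To make the reported evaluation metrics concrete, the minimal sketch below compares a generated FOV luminance map against a measured one using pixelwise luminance error, mean luminance, and structural similarity. It is an illustration only, assuming calibrated luminance maps stored as 2-D floating-point arrays in cd/m²; it is not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): compare a pix2pix-generated FOV
# luminance map with the measured FOV map using the metrics named in the abstract.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def compare_luminance_maps(generated: np.ndarray, measured: np.ndarray) -> dict:
    """Both inputs: 2-D luminance maps (cd/m^2) of identical shape."""
    eps = 1e-6                                   # avoid division by zero in dark pixels
    rel_err = np.abs(generated - measured) / (measured + eps)
    data_range = float(max(generated.max(), measured.max()))
    return {
        "mean_pixelwise_rel_error": float(rel_err.mean()),
        "mean_luminance_generated": float(generated.mean()),
        "mean_luminance_measured": float(measured.mean()),
        "ssim": float(ssim(generated, measured, data_range=data_range)),
    }
```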
This content will become publicly available on January 1, 2027
Toward AI-assisted, human-centered daylighting operation: Non-invasive daylighting preference evaluation using deep learning
Lu et al. (2025) demonstrated that HDRI camera sensors at different viewpoints can capture consistent and transferable luminance patterns in daylit spaces through Conditional Generative Adversarial Networks (CGANs). Building on that work, this paper validates that non-intrusive luminance monitoring can be used to evaluate daylighting preferences, using experimental datasets collected with human subjects at different seating locations in a real open-plan office. To apply paired comparisons for effective learning, subjects compared successive pairs of different visual conditions and indicated their visual preferences through online surveys. Meanwhile, ten small, low-cost, calibrated cameras captured luminance maps from both the field of view (FOV) of each occupant and non-intrusive viewpoints (on computer monitors, the luminaire/ceiling, and desks) under various sky conditions and interior luminance distributions. Convolutional Neural Network (CNN) models were developed and trained on luminance similarity index maps (generated from pixel-wise comparisons between successive luminance maps captured by the FOV and non-intrusive cameras separately) to classify each subject's daylight visual preferences. The results showed that the models trained on luminance distributions measured by monitor-mounted and ceiling-mounted cameras produced preference predictions consistent with those derived from FOV cameras and could reliably learn visual preferences (83–94% accuracy) in all cases except for locations furthest from the windows. Overall, this study is the first to demonstrate that daylight preferences can be learned non-invasively by employing the full potential of HDRI and deep learning techniques, marking a significant milestone toward practical, AI-assisted, human-centered daylighting operation.
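As a rough illustration of the pipeline described above, the sketch below builds a pixel-wise luminance similarity index map from two successive luminance maps and feeds it to a small CNN classifier. The exact similarity index formula, network architecture, and two-class output are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch: luminance similarity index map -> CNN preference classifier.
import numpy as np
import torch
import torch.nn as nn

def similarity_index_map(lum_a: np.ndarray, lum_b: np.ndarray) -> np.ndarray:
    """Placeholder pixel-wise comparison of successive luminance maps (cd/m^2)."""
    eps = 1e-6
    return np.minimum(lum_a, lum_b) / (np.maximum(lum_a, lum_b) + eps)

class PreferenceCNN(nn.Module):
    """Toy binary classifier: does the occupant prefer the second condition?"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8 * 8, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Usage: one similarity map (1 x 1 x H x W) -> preference logits.
sim = similarity_index_map(np.random.rand(128, 128), np.random.rand(128, 128))
logits = PreferenceCNN()(torch.from_numpy(sim).float()[None, None])
```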
- Award ID(s): 2434194
- PAR ID: 10652348
- Publisher / Repository: Elsevier
- Date Published:
- Journal Name: Building and Environment
- Volume: 287
- Issue: PA
- ISSN: 0360-1323
- Page Range / eLocation ID: 113761
- Subject(s) / Keyword(s): Convolutional neural network; Comparative preference learning; Daylighting; HDRI sensors; Luminance similarity index; Non-intrusive luminance monitoring
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
About 40% of the energy produced globally is consumed within buildings, primarily for providing occupants with comfortable work and living spaces. However, despite the significant environmental impacts of such energy consumption, the lack of thermal comfort among occupants is a common problem that can lead to health complications and reduced productivity. To address this problem, it is particularly important to understand occupants' thermal comfort in real time to dynamically control the environment. This study investigates an infrared thermal camera network to extract skin temperature features and predict occupants' thermal preferences at flexible distances and angles. This study is distinguished from existing methods in two ways: (1) the proposed method is a non-intrusive data collection approach that does not require human participation or personal devices; (2) it uses low-cost thermal cameras and RGB-D sensors, which can be rapidly reconfigured to adapt to various settings and have little or no hardware infrastructure dependency. The proposed camera network is verified using facial skin temperature collected from 16 subjects in a multi-occupancy experiment. The results show that all 16 subjects exhibited a statistically higher skin temperature as the room temperature increased. The variations in skin temperature also correspond to the distinct comfort states reported by the subjects. The post-experiment evaluation suggests that the networked thermal cameras cause minimal interruption to building occupants. The proposed approach demonstrates the potential to transition human physiological data collection from an intrusive, wearable-device-based approach to a truly non-intrusive and scalable one.
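As a hedged illustration of the feature-extraction step described above, the sketch below computes simple facial skin-temperature statistics from a thermal frame, given a face bounding box (e.g., located with an RGB-D sensor), and maps them to a toy preference label. The feature set, thresholds, and decision rule are assumptions, not the study's method.

```python
# Illustrative sketch only: facial skin-temperature features from a thermal frame.
import numpy as np

def skin_temperature_features(thermal_frame: np.ndarray, face_box: tuple) -> dict:
    """thermal_frame: 2-D array of temperatures (deg C); face_box: (row, col, h, w)."""
    r, c, h, w = face_box
    roi = thermal_frame[r:r + h, c:c + w]
    return {
        "mean_temp": float(roi.mean()),
        "max_temp": float(roi.max()),
        "p90_temp": float(np.percentile(roi, 90)),
    }

def toy_preference_rule(features: dict, cool: float = 33.5, warm: float = 35.5) -> str:
    """Hypothetical mapping from mean skin temperature to a thermal preference."""
    if features["mean_temp"] >= warm:
        return "prefer cooler"
    if features["mean_temp"] <= cool:
        return "prefer warmer"
    return "no change"
```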
Most studies of developing visual attention are conducted using screen-based tasks in which infants move their eyes to select where to look. However, real-world visual exploration entails active movements of both eyes and head to bring relevant areas into view. Thus, relatively little is known about how infants coordinate their eyes and heads to structure their visual experiences. Infants were tested every 3 months from 9 to 24 months as they played with their caregiver and three toys while sitting in a highchair at a table. Infants wore a head-mounted eye tracker that measured eye movement toward each of the visual targets (caregiver's face and toys) and how targets were oriented within the head-centered field of view (FOV). With age, infants increasingly aligned novel toys in the center of their head-centered FOV at the expense of their caregiver's face. Both faces and toys were better centered in view during longer looking events, suggesting that infants of all ages aligned their eyes and head to sustain attention. The bias in infants' head-centered FOV could not be accounted for by manual action: held toys were more poorly centered compared with non-held toys. We discuss developmental factors (attentional, motoric, cognitive, and social) that may explain why infants increasingly adopted biased viewpoints with age.
Group meetings can suffer from serious problems that undermine performance, including bias, "groupthink", fear of speaking, and unfocused discussion. To better understand these issues, propose interventions, and thus improve team performance, we need to study human dynamics in group meetings. However, this process currently depends heavily on manual coding and video cameras. Manual coding is tedious, inaccurate, and subjective, while active video cameras can affect the natural behavior of meeting participants. Here, we present a smart meeting room that combines microphones and unobtrusive ceiling-mounted Time-of-Flight (ToF) sensors to understand group dynamics in team meetings. We automatically process the multimodal sensor outputs with signal, image, and natural language processing algorithms to estimate participant head pose, visual focus of attention (VFOA), non-verbal speech patterns, and discussion content. We derive metrics from these automatic estimates and correlate them with user-reported rankings of emergent group leaders and major contributors to produce accurate predictors. We validate our algorithms and report results on a new dataset of lunar survival tasks performed by 36 individuals across 10 groups, collected in the multimodal-sensor-enabled smart room.
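The sketch below illustrates one common way to assign visual focus of attention (VFOA) from an estimated head pose: pick the target whose direction deviates least from the head's facing vector. The geometry, target names, and angular threshold are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch: head-pose-based VFOA assignment.
import numpy as np

def estimate_vfoa(head_pos, head_dir, targets, max_angle_deg=30.0):
    """head_pos: (3,) position; head_dir: (3,) unit facing vector;
    targets: dict of name -> (3,) position. Returns the attended target or None."""
    best_name, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        to_target = np.asarray(pos, float) - np.asarray(head_pos, float)
        to_target /= np.linalg.norm(to_target)
        angle = np.degrees(np.arccos(np.clip(np.dot(head_dir, to_target), -1.0, 1.0)))
        if angle < best_angle:
            best_name, best_angle = name, angle
    return best_name

# Example: a participant facing roughly toward another participant rather than the screen.
print(estimate_vfoa([0, 0, 1.2], [1, 0, 0],
                    {"participant_B": [1.5, 0.1, 1.2], "screen": [0.0, 2.0, 1.5]}))
```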
Image stitching involves combining multiple images of the same scene captured from different viewpoints into a single image with an expanded field of view. While this technique has various applications in computer vision, traditional methods rely on the successive stitching of image pairs taken from multiple cameras. This approach is effective for organized camera arrays, but it can pose challenges for unstructured ones, especially when handling scene overlaps. This paper presents a deep learning-based approach for stitching images from large unstructured camera sets covering complex scenes. Our method processes images concurrently by using the SandFall algorithm to transform data from multiple cameras into a reduced fixed array, thereby minimizing data loss. A customized convolutional neural network then processes these data to produce the final image. By stitching images simultaneously, our method avoids the potential cascading errors seen in sequential pairwise stitching while offering improved time efficiency. In addition, we detail an unsupervised training method for the network that utilizes metrics from Generative Adversarial Networks, supplemented with supervised learning. Our testing revealed that the proposed approach runs in roughly one-seventh the time of many traditional methods on both CPU and GPU platforms, achieving results consistent with established methods.
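The sketch below shows, as a placeholder only, one naive way to reduce a variable number of camera views into a fixed-size array that a stitching CNN could consume; it is not the SandFall algorithm, whose details are not given here.

```python
# Placeholder (NOT SandFall): reduce N same-sized views to a fixed number of slots.
import numpy as np

def views_to_fixed_array(views, out_slots=8):
    """views: list of H x W x 3 float arrays (same size). Returns out_slots x H x W x 3."""
    stacked = np.stack(views)                       # N x H x W x 3
    slots = np.zeros((out_slots,) + stacked.shape[1:], dtype=stacked.dtype)
    for i in range(out_slots):
        members = stacked[i::out_slots]             # views assigned to this slot
        if len(members):
            slots[i] = members.mean(axis=0)         # empty slots remain zero
    return slots
```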