1 Introduction

Smoke detection plays a crucial role in fire alert systems, as smoke often precedes flames. While indoor smoke sensors effectively detect fire heat or particles, they are unsuitable for outdoor use because of the delay before smoke reaches the sensor. Additionally, such discrete sensors are inefficient for wide-area monitoring.

Video-based smoke detection systems offer a promising solution by efficiently capturing information from wider areas [1]. These systems detect smoke via visual features such as its color, movement, texture, and shape. However, challenges remain in visually recognizing smoke, including its limited visual characteristics compared to flames [2], fluctuations in smoke density, and variable smoke shapes [3], which make it harder to separate smoke from backgrounds of similar color. New visual techniques are needed to overcome these challenges.

Forest fires pose a significant threat to the environment, economy, and health [4]. Video cameras sensitive to smoke in the visual spectrum are commonly used to monitor forest fires. Various visual techniques have been developed for forest fire detection [5,6,7].

In contrast, urban fire camera systems have received less attention [8, 9]. One major difference is the presence of numerous moving objects in urban environments, such as vehicles and wind turbines, in addition to trees and clouds. Therefore, advanced methods are required to eliminate these objects for effective smoke detection in urban areas.

This paper focuses on developing a daytime smoke detection technique for fire disaster mitigation in urban cities. The paper is organized as follows. Section 2 introduces related studies and the background and objectives of this study. Section 3 presents our proposed smoke detection method. Section 4 evaluates the method and defines evaluation criteria for comparison with previous studies. We conclude in Sect. 5.

2 Background and Objectives of this Study

2.1 Related Studies

Various methods for capturing the characteristics of smoke have been studied. Piccinini et al. [10] detected smoke by analyzing energy changes in the wavelet domain together with a smoke color model, employing a Bayesian approach to classify the obtained features. Tian et al. [11] proposed a blended image model that combines background and smoke components to accurately identify smoke; by solving for the opacity of smoke, they achieved improved detection results. Labati et al. [12] presented algorithms specifically designed for smoke detection under various wildfire environmental conditions, using computational intelligence techniques to adaptively detect smoke in frame sequences.

Terada et al. [13] introduced a method for detecting fire smoke using optical flow. Their approach was robust against different image acquisition environments and focused on early detection of fire incidents. By first detecting the region of the flame in the images, they then extracted characteristic quantities that specifically represented smoke, ensuring accurate smoke detection. Similarly, Chunyu et al. [14] proposed a video smoke detection method that incorporated both color and motion features. They utilized optical flow to approximate the motion field and estimated candidate smoke regions using background estimation and a color-based decision rule. The optical flow results were further processed to calculate motion features, allowing for differentiation between smoke and other moving objects.

When it comes to fire detection using machine learning, the focus is primarily on flames, with limited examples of application to smoke alone [15]. One challenge of machine-learning-based smoke detection is its significant computational cost: Luo et al. [16] employed an NVIDIA Tesla K40C GPU and achieved processing speeds of 6-7 frames per second at a frame size of \(320 \times 240\) pixels.

Although the aforementioned studies made significant contributions to smoke detection, a common limitation of most of them is that they target large smoke areas, which may not be practical for real-world outdoor camera systems. In real field situations, where cameras monitor rural or urban areas for smoke, the spatial extent of the smoke to be detected is relatively small, as seen in Figure 1. Additionally, it is important to exclude similar objects that may be mistaken for smoke. The motivation of this study is therefore to develop a technique that overcomes these practical issues and contributes to effective disaster mitigation in real-world scenarios.

2.2 Background

The authors have been developing Visual IoT systems, as introduced in the Appendix. Using a video transmission protocol specialized for wireless networks [17], modern IP network cameras with PTZ (pan-tilt-zoom) functions controlled by a remote operation protocol [18], single-board computers on the edge side, and supercomputers on the cloud side, these systems have the potential for highly functional smoke detection. While this paper focuses on the detection algorithm, the ultimate goal is to implement cutting-edge smoke image processing techniques within Visual IoT systems. After smoke detection, the smoke area is locked on and enlarged; by capturing zoomed smoke images, more accurate fire information can be obtained. This PTZ function plays an important role in the discussion of the F-score in Sect. 4.1.

2.3 Objectives of this Study

Figure 1

Examples of original frame images obtained by outdoor IP network cameras: (a) an urban camera at Kitakyushu city at 10:00 AM JST on August 2, 2021, and (b) a rural camera at Chikuma city [19] at 7:10 AM JST on February 16, 2022. The white frame area in (a) corresponds to Figure 3

This study aims to develop a practical real-time daytime smoke detection application integrated with Visual IoT technologies. The geographic location of a smoke outbreak should be quickly detected by live cameras deployed in the field.

To achieve this, the authors propose a smoke detection algorithm that utilizes optical flow. Outdoor cameras with network connectivity continuously capture and provide high-resolution, high-frame-rate footage. High frame rates are important for tracking motion with optical flow, since the properties of smoke (shape, color, and motion) change within a second.

Using multiple frames in optical flow processing makes it possible to emphasize smoke-specific motion. This study employs footage with a resolution of \(1920 \times 1080\) pixels and a frame rate of 25 fps, recorded every 10 min for a duration of 5 s. These parameters align with typical, practical settings for modern IP network cameras on mobile networks. Smoke examples obtained from IP network cameras in Kitakyushu city and Chikuma city, Japan, are shown in Figure 1. We use OpenCV 4.6.0 for computer vision processing, including optical flow and contour drawing.
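As a concrete starting point, the following is a minimal sketch of loading one such 5-s recording with OpenCV and converting it to the grayscale frame sequence used in Sect. 3; the file name is a hypothetical placeholder, not an actual dataset name.

```python
import cv2

# A minimal sketch of loading one 5-s, 25-fps recording; the file name is
# a hypothetical placeholder for footage delivered by an IP network camera.
cap = cv2.VideoCapture("kitakyushu_20210802_1000.mp4")
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Optical flow operates on single-channel images.
    frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
cap.release()
print(f"{len(frames)} frames loaded")  # about 121 frames per footage (Sect. 3.3)
```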

3 Smoke Area Detection Method

Figure 2

A flow chart of the urban smoke detection method proposed in this study; the figures corresponding to each process are indicated. Contour colors in the figures change after every threshold discrimination

Figure 3

(a) An extracted frame image, as indicated by the white frame in Figure 1. (b) The same image with annotations selected by hand. The annotation frames (bounding boxes) in (b) are identical to those in Figures 4, 5, 6, 7, and 8

3.1 Flow Chart to Detect Urban Smokes

This section presents our proposed smoke area detection method, illustrated in Figure 2. We begin by introducing a general optical flow technique in Sect. 3.2. Then, we propose our original smoke detection algorithm in Sects. 3.3, 3.4, 3.5, and 3.6.

Figure 3a is a sample image of smoke in an urban city taken at 10:00 AM JST on August 2, 2021, which will be discussed in this section. It is an \(800 \times 800\) pixel region extracted from Figure 1a, selected because multiple high-optical-flow objects are concentrated there. Figure 3b displays the manually identified positions of moving objects, including smoke from a factory on the right-hand side. Other moving objects, such as wind turbines (in the middle) and cars on the roads (on the left-hand side), are also identified.

3.2 Optical Flow

Optical flow is a method used to detect motion between image frames by analyzing the changing status of each pixel. It provides information about the direction and speed of motion, making it suitable for analyzing motion characteristics.

There are two main assumptions in optical flow calculations. First, the brightness of a moving point is assumed to remain constant over a short period. Second, the pixels around a point are assumed to move in a similar manner. The Lucas-Kanade method [20] and the Farneback method [21] are typical optical flow methods. In this study, we utilize the Farneback method for its higher accuracy, despite its increased computational cost.

The Farneback method approximates the brightness value of each pixel using a quadratic polynomial and estimates the amount of movement by comparing coefficients between frames. When the brightness is \(f_{t} ({\mathbf{x}}) \in [0,1]\) at coordinates \(\mathbf{x}\) at time t, the brightness of neighboring points is expressed by the quadratic polynomial

$$\begin{aligned} f_t(\mathbf{x})=\mathbf{x}^T \mathbf{A}_t \mathbf{x} + \mathbf{b}_t^T \mathbf{x} + c_t \end{aligned}$$
(1)

where \(\mathbf{A}_t\) is a symmetric matrix, \(\mathbf{b}_t\) a column vector, and \(c_t\) a scalar. The coefficients are obtained by a weighted least-squares fit over the neighborhood region. If the displacement of point \(\mathbf{x}\) from time t to \(t+1\) is \(\mathbf{d}_t\), then from \(f_t (\mathbf{x})= f_{t+1} (\mathbf{x}+\mathbf{d}_t)\) it follows that

$$\begin{aligned} \mathbf{d}_t= -\frac{1}{2} {\mathbf{A}_t}^{-1} (\mathbf{b}_{t+1}-\mathbf{b}_t) \end{aligned}$$
(2)

Ideally, \(\mathbf{A}_t = \mathbf{A}_{t+1}\), but in reality, the following approximation is used.

$$\begin{aligned} {\hat{\mathbf{A}}}_t = \frac{{\mathbf{A}}_t + {\mathbf{A}}_{t+1}}{2} \end{aligned}$$
(3)

As a result, the following constraints are obtained.

$$\begin{aligned} {\hat{\mathbf{A}}}_t {\mathbf{d}}_t = \Delta {\mathbf{b}}_t \end{aligned}$$
(4)

Here,

$$\begin{aligned} \Delta \mathbf{b}_t =-\frac{1}{2}(\mathbf{b}_{t+1} -\mathbf{b}_t). \end{aligned}$$
(5)

This equation gives a solution point by point, even though it is noisy. Therefore, assuming that the displacement changes gradually, the information in the neighborhood of each pixel is integrated. The energy in the neighborhood of point \(\mathbf{x}\) is expressed as follows.

$$\begin{aligned} \sum _{\Delta \mathbf{x}\in I}^{} w(\Delta \mathbf{x})\Vert \mathbf{A}(\mathbf{x}+\Delta \mathbf{x})\mathbf{d}(\mathbf{x}) - \Delta \mathbf{b}(\mathbf{x}+\Delta \mathbf{x})\Vert ^2 \end{aligned}$$
(6)

By differentiating this expression with respect to the motion \(\mathbf{d}_t(\mathbf{x})\), the \(\mathbf{d}_t (\mathbf{x})\) that minimizes the energy is determined. In the Farneback method, the gradient can be obtained stably by approximating the local region of the image with a quadric surface. We implement this algorithm using the OpenCV calcOpticalFlowFarneback function.
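As a minimal sketch, the dense flow field between two consecutive grayscale frames (from the loading sketch in Sect. 2.3) can be computed as below; the Farneback parameter values are illustrative assumptions, not settings reported in this study.

```python
import cv2

# Dense Farneback flow between two consecutive grayscale frames.
# The parameter values here are illustrative assumptions.
flow = cv2.calcOpticalFlowFarneback(
    frames[0], frames[1], None,
    pyr_scale=0.5,   # scale between image pyramid levels
    levels=3,        # number of pyramid levels
    winsize=15,      # averaging window for the neighborhood integration (Eq. 6)
    iterations=3,    # iterations at each pyramid level
    poly_n=5,        # neighborhood size for the polynomial expansion (Eq. 1)
    poly_sigma=1.2,  # Gaussian sigma for the polynomial expansion
    flags=0)
# flow[..., 0] and flow[..., 1] hold the per-pixel displacements (X_t, Y_t).
```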

3.3 Optical Flow Summation

Figure 4

Optical flow magnitude summed over (a) two frames and (b) 31 frames. Annotation frames are identical to those in Figure 3b

In this section, we emphasize smoke motion by summing optical flow values over the time series of frames in a footage. Since the source location of smoke remains stationary, the optical flow vectors of pixels in the smoke area exhibit minimal change within one footage. The optical flow at a pixel (i, j) between times t and \(t + 1\) is expressed as a Cartesian vector \((X_t(i, j), Y_t (i, j))\).

By summing the vector values, we obtain the optical flow across multiple frames as follows:

$$\begin{aligned} X_{all}(i, j) = \sum _{t=1}^{T} X_t (i, j), Y_{all}(i, j) = \sum _{t=1}^{T} Y_t (i, j) \end{aligned}$$
(7)

We define \(R_{all} (i, j)\) and \(\Theta _{all} (i, j)\) as the polar coordinate transformation of \((X_{all} (i, j), Y_{all} (i, j))\), as below.

$$\begin{aligned} R_{all}(i, j)&= \sqrt{X_{all} (i, j)^2 + Y_{all}(i, j)^2} \end{aligned}$$
(8)
$$\begin{aligned} \Theta _{all}(i, j)&= \tan ^{-1} \frac{Y_{all}(i, j)}{X_{all} (i, j)} \end{aligned}$$
(9)

\(R_{all} (i, j)\) and \(\Theta _{all} (i, j)\) are the magnitude and angle of the summed optical flow at pixel (i, j), respectively. This summation reduces noise and emphasizes stable optical flow movement.

Figure 4a shows the optical flow magnitude of every pixel computed from two consecutive frames, while Figure 4b shows the summed optical flow magnitude over 31 frames. The color intensity in both panels of Figure 4 indicates the amount of motion at each pixel. Note that the motion of the smoke, turbines, and cars is emphasized by stacking the flows, and that the high-magnitude car areas expand over time because the car positions change. The summation of optical flow is limited to 31 frames, since the direction of smoke may change over time and degrade the summed value. To reduce processing cost, optical flow is computed only every fourth frame, skipping the three frames in between; we thus obtain 31 optical flow frames out of the 121 frames (5 s) in the footage.
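A sketch of this summation, assuming the grayscale frame list from Sect. 2.3 and the every-fourth-frame sampling described above, could read:

```python
import cv2
import numpy as np

def summed_flow(frames, step=4):
    """Sum optical flow over one footage, computing flow every `step` frames
    (31 flow fields from 121 frames, as described in Sect. 3.3)."""
    h, w = frames[0].shape
    x_all = np.zeros((h, w), np.float32)
    y_all = np.zeros((h, w), np.float32)
    for t in range(0, len(frames) - step, step):
        flow = cv2.calcOpticalFlowFarneback(frames[t], frames[t + step], None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        x_all += flow[..., 0]                             # Eq. (7)
        y_all += flow[..., 1]
    r_all, theta_all = cv2.cartToPolar(x_all, y_all)      # Eqs. (8) and (9)
    return r_all, theta_all
```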

Figure 5

A grid block example of (a) optical flow magnitude \(R_{all} (i, j)\), (b) binary values B(i, j) and (c) grids after small area removal

At this point, the magnitude \(R_{all} (i, j)\) is an \(800 \times 800\) array of independent pixel values. To recognize them as continuous areas, pixels above a certain magnitude threshold are extracted, and adjoining pixels are regarded as one area. We measure the smoke areas in the following manner. A threshold is set on the magnitude of the optical flow, and the binary value B(i, j), obtained by classifying \(R_{all} (i, j)\) according to this threshold, is defined by the following equation.

$$\begin{aligned} B (i, j)= {\left\{ \begin{array}{ll} \; 0,\quad R_{all} (i, j) < T_r \\ \; 1,\quad R_{all} (i, j) \ge T_r \end{array}\right. } \end{aligned}$$
(10)

Here, \(T_r\) is a threshold value to be set arbitrarily. Neighboring grids with \(B (i, j)=1\) are then connected to form blocks of high magnitude. This process is schematically depicted in Figure 5: the optical flow magnitude \(R_{all}(i,j)\) in Figure 5a is binarized into Figure 5b, where the contours of high-magnitude areas are drawn as orange lines. This process uses the OpenCV findContours function. On the assumption that a smoke area has a certain minimum size, small areas are excluded from the candidates; if an area is too small, the optical flow method is unreliable there anyway. This step is depicted in Figure 5c with green contours. The area selection procedure is similar to the erode-dilate operation used for image denoising [22].

Figure 6a shows the case where a threshold of \(T_r=10\) is empirically set for the optical flow of Figure 4b. The areas above the threshold magnitude are enclosed by green contours. Let each area be \(C_l\) (\(l \in [1,L]\)) and \(M_l\) be the number of pixels in area \(C_l\); in the schematic example of Figure 5, \(L = 3\). We set a threshold \(T_m\) and exclude areas with \(M_l \le T_m\); Figure 5c shows the result for \(T_m = 1\). Each remaining area is defined as \(C'_l\) (\(l \in [1,L']\)), with \(M'_l\) pixels in \(C'_l\). For the Kitakyushu city footage, \(T_m = 200\) is set empirically. Figure 6b shows the result after removing the small areas: 23 candidate smoke areas are extracted in green, which also include turbines and cars. In the case of cars, the entire trajectory is recognized as one high-motion region by the optical flow summation.
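A sketch of this candidate extraction, using the thresholds above and the summed magnitude \(R_{all}\) from Sect. 3.3 (here `cv2.contourArea` stands in for the pixel count \(M_l\)):

```python
import cv2
import numpy as np

T_R = 10    # optical flow magnitude threshold (empirical, Figure 6a)
T_M = 200   # minimum candidate size in pixels (Kitakyushu footage)

binary = (r_all >= T_R).astype(np.uint8)      # B(i, j) of Eq. (10)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
# Keep only areas C'_l larger than T_M as smoke candidates.
candidates = [c for c in contours if cv2.contourArea(c) > T_M]
```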

Since high-magnitude areas occasionally include objects other than smoke, we must extract only the areas with smoke characteristics. We use optical flow variance and HSV (hue-saturation-value) color characteristics to extract smoke from the candidates, as described in Sects. 3.4 and 3.5, respectively.

Figure 6

Contours surrounding (a) all areas and (b) large-size areas (candidate areas) of high-magnitude optical flow. Annotation frames are identical to those in Figure 3b

Figure 7

Characteristics of optical flow. (a) Magnitude and (b) variance in grey scale drawn with green contour lines (candidate areas defined in Figure 6b). (c) Detected areas after thresholding of (a) and (b). Annotation frames in (a) and (b) are identical to those in Figure 3b

Figure 8

Characteristics of HSV color. (a) Saturation and (b) values in grey scale drawn with green contour lines (candidate areas defined in Figure 6b). (c) Detected areas after thresholding of (a) and (b). Annotation frames in (a) and (b) are identical to those in Figure 3b

3.4 Extraction of Smoke Area by Variance of Optical Flow

To detect moving objects, it is reasonable to find areas with large optical flow magnitudes. Figure 7a shows the optical flow magnitude drawn in grey scale, identical to \(R_{all} (i, j)\) in Sect. 3.3. This alone is not enough to distinguish smoke from other moving objects.

One characteristic of smoke motion is that it generally moves upward from the source location. When wind blows, the whole smoke plume moves accordingly in one direction over a short period. We therefore assume in this study that the direction of motion is mostly the same within a smoke region; in other words, the variance of optical flow vectors in smoke regions is small. Taking advantage of the high-frame-rate footage available in the system described in Sect. 2, we utilize the variance of optical flow vectors across multiple frames, which reflects changes in the extracted areas.

In this method, variances within areas are calculated not only in space but also in time. The variance \({s_l}^2\) of area \(C'_l\) is given below.

$$\begin{aligned} {s_l}^2 = \frac{1}{T \cdot M'_l}\sum _{t=1}^{T}\sum _{m=1}^{M'_l} ((X_t (C'_l(m)) - \bar{X})^2 +(Y_t (C'_l(m)) - \bar{Y} )^2 ) \end{aligned}$$
(11)

Here, \(\bar{X}\) and \(\bar{Y}\) are the mean flow components over the area and the frames. To visualize this, Figure 7b shows the variance derived from the changes over time: the variance of the optical flow vectors at each pixel over the 31 frames is represented in grey scale. The per-pixel variance s(i, j) is derived as follows.

$$\begin{aligned} s(i, j)^2 = \frac{1}{T}\sum _{t=1}^{T} ((X_t (i, j) - \bar{X} (i, j))^2 +(Y_t (i, j) - \bar{Y} (i, j))^2 ) \end{aligned}$$
(12)

In Figure 7b, it can be observed that the variances of the turbines and cars are larger than those of the smoke areas. For cars, the high-magnitude areas correspond to the entire tracks of the cars: the motion within such an area is temporally large only while a moving vehicle is present and small at other times. In contrast, the positions of the turbines are stable in time, but the directions of movement vary within the region due to blade rotation, which makes the variance larger. For these reasons, smoke and other high-motion objects can be distinguished by combining the magnitude and variance of optical flow.

Figure 7a and b represent the average magnitude and the variance along the temporal dimension, respectively. Figure 7c shows the detected areas after thresholding both (a) and (b) in the following manner: for each candidate area, the average magnitude across the temporal and spatial dimensions and the variance over all pixels in the area are calculated. If the pair of magnitude and variance meets the threshold criteria, the area is selected as a detected area, as described by the blue contours. Appropriate thresholds for magnitude and variance must be determined from real smoke footage, as discussed in Sect. 4.2.
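The per-pixel temporal variance and the area-level selection can be sketched as follows, assuming `flows` stacks the 31 flow fields into an array of shape (T, H, W, 2); the magnitude-to-variance criterion anticipates the threshold derived later in Sect. 4.2.

```python
import numpy as np

def temporal_flow_variance(flows):
    """Per-pixel variance of the flow vectors over time, as in Eq. (12).
    `flows` is assumed to have shape (T, H, W, 2), stacking (X_t, Y_t)."""
    mean = flows.mean(axis=0, keepdims=True)               # per-pixel (Xbar, Ybar)
    return ((flows - mean) ** 2).sum(axis=-1).mean(axis=0)

def passes_flow_test(mean_magnitude, variance, ratio=9.0):
    """Area-level criterion: smoke keeps a high magnitude-to-variance ratio,
    unlike turbines and cars (threshold mag/var = 9, Sect. 4.2)."""
    return mean_magnitude / variance >= ratio
```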

3.5 Extraction of Smoke Area by Characteristics of HSV Colors

In parallel with the variance of optical flow, we use color discrimination to exclude non-smoke areas, as shown in Figure 2. In general, foreground image frames in footage are denoted by RGB intensities, and a set of rules is applied to each color for discrimination. However, despite these rules, images often suffer from nonlinear visual perception and illumination dependency [14]. Appana et al. performed color segmentation by identifying the pixels that match the color of smoke in a frame [23]. They utilized HSV color analysis, transforming the RGB color space and thresholding the saturation and value (brightness) components of the HSV (hue-saturation-value) color space.

Figure 8 displays a frame image with the saturation (S) and value (V) components shown in grey scale. The color scale consists of 256 gradations, with higher numbers corresponding to lighter colors. The candidate areas derived in Figure 6b are highlighted in green. Our objective in this section is to narrow down the smoke candidate areas using both the S and V parameters. As indicated in Figure 2, we apply a thresholding method, with thresholds derived in Sect. 4.2, to each smoke candidate area. Equation (13) is used for thresholding the smoke areas in this study,

$$\begin{aligned} F_{color} (l)= {\left\{ \begin{array}{ll} \; 1,&\quad \text{if} \; (\bar{S},\bar{V})_l \in Th(s,v) \\ \; 0, &\quad \text{otherwise} \end{array}\right. } \end{aligned}$$
(13)

where Th(s, v) is the bounded threshold region in the S-V plane and \((\bar{S},\bar{V})_l\) are the S and V values averaged over the pixels of candidate area \(C'_l\). The blue contour areas in Figure 8c represent the detected areas after thresholding.

The bounded region should be defined for each target city. In [23], the thresholds on saturation and value are given independently as constants. We propose a thresholding model that couples saturation and value for more finely tuned discrimination, derived from many smoke images of the target city. The definition of the bounded region in the case of Kitakyushu city is discussed in Sect. 4.2.
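A sketch of this color test for one candidate area, using the Kitakyushu bounded region derived in Sect. 4.2 (`area_mask` is an assumed boolean mask of the pixels in \(C'_l\)):

```python
import cv2

def passes_color_test(frame_bgr, area_mask):
    """Eq. (13) sketch: check whether the mean (S, V) of a candidate area
    falls inside the bounded region derived for Kitakyushu in Sect. 4.2."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    s_mean = hsv[..., 1][area_mask].mean()
    v_mean = hsv[..., 2][area_mask].mean()
    # Region: V > (256-64)/160 * S + 64  and  V < -256/160 * S + 256
    return ((256 - 64) / 160 * s_mean + 64 < v_mean
            < -256 / 160 * s_mean + 256)
```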

3.6 Combined Smoke Detection

The final discrimination of smoke areas is performed by combining the detected areas derived from optical flow (Sect. 3.4) and HSV color (Sect. 3.5). Figure 9a and b correspond to the areas detected by optical flow (Figure 7c) and by HSV color (Figure 8c), respectively. The smoke candidate areas are highlighted in blue.

The smoke area is identified by overlaying the detected areas and extracting the overlapping ones. In Figure 9c, the red and blue contours correspond to the overlapped and the combined areas of the detections in Figure 9a and b, respectively. The final result, shown in Figure 9d, indicates that the red box corresponds to the smoke area manually selected in Figure 3b, demonstrating the successful extraction of the smoke area from the multiple candidate areas in Figure 6b.
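Assuming binary uint8 masks `flow_mask` and `color_mask` built from the areas passing Sects. 3.4 and 3.5, the overlap extraction reduces to a single mask intersection:

```python
import cv2

# Overlap of the two detections (the red areas in Figure 9c);
# non-overlapping detections are discarded from the final result.
smoke_mask = cv2.bitwise_and(flow_mask, color_mask)
```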

Figure 9

(a) The detected areas by the optical flow characteristics (identical to Figure 7c). (b) The detected areas by the HSV color characteristics (identical to Figure 8c). (c) The common area of (a) and (b) in red and the combined areas of both in blue. The candidate areas in green in (a), (b) and (c) are identical to Figure 6b. (d) The final result of smoke detection, indicated in red. Annotation frames in (d) are identical to those in Figure 3b (Color figure online)

4 Evaluations of the Proposed Method

4.1 Evaluation Criteria

We use recall and precision as evaluation indices for the smoke detection method introduced in Sect. 3. Recall is the ratio of correctly predicted positive samples to the total number of actual positive samples, calculated as \(\text{TP} / (\text{TP} + \text{FN})\), where TP represents true positives and FN false negatives; it is commonly used to evaluate how many detections are missed. Precision, on the other hand, measures the ratio of correctly predicted positive samples to the total number of predicted positive samples, calculated as \(\text{TP} / (\text{TP} + \text{FP})\), where FP represents false positives. Precision and recall have a trade-off relationship, and the F-score is commonly used to evaluate both. The balanced F-score (\(F_1\) score) is the harmonic mean of precision and recall, described as follows.

$$\begin{aligned} F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \end{aligned}$$
(14)

In real-time monitoring with a PTZ camera, it is possible to zoom in on smoke candidates and re-verify them on higher-resolution frames. This makes it more important to reduce false negatives than false positives for practical fire prevention, since re-verification allows false positives to be discarded. Therefore, in the evaluation of the F-score, we place greater importance on recall. \(F_\beta\) is a generalized F-score in which recall is weighted \(\beta\) times as heavily as precision. We choose \(\beta = 2\), represented as below.

$$\begin{aligned} F_2 = \frac{5 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{4 \cdot \mathrm{Precision} + \mathrm{Recall}} \end{aligned}$$
(15)
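For reference, a small helper makes the weighting explicit; with `beta=1` it reduces to Eq. (14) and with `beta=2` to Eq. (15).

```python
def f_beta(precision, recall, beta=2.0):
    """Generalized F-score: recall is weighted beta times as much as
    precision (Eqs. 14 and 15 for beta = 1 and 2, respectively)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```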

4.2 Threshold

Figure 10

Relationship of variance with averaged magnitude of optical flow on November 4, 2021

Figure 11

Relationship of saturation with value of HSV color

To evaluate the effectiveness of the proposed method in Sect. 3, we apply it to a set of 5-s footage datasets observed in Kitakyushu city in 2021, captured on August 1, August 9, and November 4. We first define the threshold values from the footage obtained on these days.

Kitakyushu city is known as a large-scale industrial zone and has many factories with chimneys. As mentioned in Sect. 2.3, we assume the use of IP network cameras to capture wide-view images of urban cities. As seen in Figure 1a, the size of the smoke area is less than \(50 \times 50\) pixels, while the image size is full HD (1080p). Our technique in Sect. 3 is applicable to such low-resolution smoke areas.

The footage from the three days is processed following Figure 2, and the detection rates are evaluated. To obtain thresholds for extracting smoke only, we investigated the characteristics of the various high-magnitude optical flow areas: the areas obtained from the three days of footage were manually classified into four categories (smoke, turbines, cars, and others) and analyzed in terms of their properties.

Figure 10 shows the relationship between the variance and averaged magnitude of optical flow in high-magnitude areas during daytime (6:00 AM to 5:00 PM JST) on November 4, 2021. In the figure, smoke exhibits a higher magnitude-to-variance ratio than cars and turbines, suggesting that smoke can be distinguished from other high-magnitude areas using variance and magnitude. We define the threshold as \(mag/var = 9\), represented by the green line in Figure 10.

Figure 11 shows the relationship between the averaged color saturation and value of the high-magnitude areas on November 4, 2021. The color labels are the same as in Figure 10. Smoke and turbines share the same region (they have the same color characteristics) but are distinguishable from cars and other objects in this scatter plot. As in Figure 10, the threshold is represented by the green lines; it can be written as \(\{V > (256-64)/160 \cdot S + 64\} \, \cap \, \{V < -256/160 \cdot S + 256\}\).

We then verify whether the obtained thresholds are applicable to data from other days. Considering that the color of smoke may be affected by weather conditions or the time of day, we examine the dependence of the HSV saturation and value on weather conditions, represented by solar radiation values. Figure 12 shows the solar radiation on the three days, with values obtained from the AMATERASS system [24]. The AMATERASS dataset [25] provides solar radiation estimates acquired by the Himawari satellite [26, 27]. Figure 12 shows changes in solar radiation due to weather conditions: August 1 and November 4, 2021 are sunny days, and August 9, 2021 is a cloudy day.

Figure 12

Solar radiation on the three days from the AMATERASS dataset [24]: 1-Aug-21 and 4-Nov-21 are sunny days, and 9-Aug-21 is a cloudy day

Figure 13

Relationship of solar radiation with (a) saturation and (b) value of HSV color

Figure 13 shows the relationship of solar radiation with the HSV color saturation and value for the three-day datasets. For smoke, the correlation with solar radiation is 0.01 for saturation and \(-0.08\) for value. We thus conclude that neither the saturation nor the value of smoke depends on the intensity of solar radiation.

Table 1 Recall Values of the Proposed Method (Kitakyushu City)

4.3 Evaluation and Case Studies

We introduced a thresholding method and evaluated a set of threshold values in Sect. 4.2; these thresholds are applicable to any footage obtained at the same location. The \(F_2\) values were obtained using the aforementioned thresholds on the three days. As shown in Table 1, the proposed method achieves \(F_2\) values larger than 90% independently of weather conditions. It should be noted that the smoke areas in the footage from the outdoor cameras in this study are limited in size, varying from 300 to 2000 pixels, smaller than the areas of around 4000 pixels in Figure 15. Considering this limitation, we conclude that the proposed method is practical.

Figure 14

Sample cases of smoke detection on three days

Figure 14 shows other typical cases of smoke detection on the three days of Sect. 4.2. The detected smoke areas are highlighted in red frames, as in Figure 9d. Smoke is successfully detected in both Figure 14a and b, but cloud motion is also partially detected in Figure 14b: a case of false positive. Small fragments of clouds are recognized as smoke when the shape of the cloud changes. However, our Visual IoT system can easily avoid cloud misdetection by using geometric information, such as masking the sky area above the skyline. Figure 14c represents a case of false negative, where smoke on the left-hand side is not detected due to cloudy and dark conditions, making it difficult to distinguish the smoke from the clouds. Foggy conditions also pose challenges for smoke recognition, especially as the distance from the camera increases.

4.4 Comparison with Previous Method

Figure 15

The detection results with the proposed method using Bilkent examples [28] named (a) sBehindtheFence, (b) sBtFence2, (c) sWasteBasket, and (d) sWindow. The red annotation frames represent smoke

To demonstrate the effectiveness of the proposed method in Sect. 3, we compare it with a previous study using a set of reference footage: four videos from the Bilkent sample website [28], with a spatial resolution of \(320 \times 240\) pixels and frame rates of 10 or 15 fps.

The thresholds used here differ from those of the previous section. Based on the smoke characteristics in the sWasteBasket video, the thresholds for optical flow and HSV color were obtained in the same manner as described in Sect. 4.2 and applied to the other videos. The defined thresholds are \(mag/var = 5\) and \(\{S< 90 \} \, \cap \, \{ 60< V < 255 \}\).

The detection results are shown in Figure 15, and a comparison between the proposed method and the previous method [23] is summarized in Table 2. In most cases, the proposed method outperforms the previous one. The lower recall in Figure 15a is attributed to the rapid change in smoke orientation; in real-world monitoring scenarios, the detection rate is expected to be higher, since smoke is typically observed in a more stable manner.

Table 2 Comparison Results of the Two Methods (Figure 15)

4.5 Discussion

Among forest fire detection methods using optical sensors and digital cameras, Alkhatib [5] reviewed four different types. Among them, ForestWatch [30] is a notable video camera system because of its modern equipment: it adopts PTZ cameras and an image sampling engine in association with mobile networks, and a bundled application supports decision-making.

To further enhance the speed and accuracy of fire detection, continued development of such modern techniques is necessary. The Appendix presents Visual IoT [17] as a new IoT technology with numerous achievements reported in various outdoor applications (e.g., [31, 32]). One advantage of the Visual IoT system is its use of IP network cameras with PTZ functions [17]; this type of camera enables autonomous direction changes and zooming in on smoke (Figure 16). We believe future smoke detection applications should be closely tied to modern systems like Visual IoT (Figure 17), where PTZ plays a crucial role.

Figure 16

A zooming process to obtain a higher-resolution image using the PTZ function

Figure 17

Concept of the Visual IoT system

In practical disaster mitigation, recall is more important than precision, as any missed fire detection can lead to a serious disaster. Our study achieved notably high recall values (Tables 1 and 2). To improve the system further, we focus on reducing errors, which can be achieved using PTZ cameras: higher-resolution footage improves both precision and recall. In this study, we visually confirmed that the smoke size ranged from 300 to 2000 pixels, so we excluded areas smaller than 200 pixels. In actual fire monitoring, however, smoke may occupy smaller areas due to distance. To deal with such cases, using PTZ cameras to autonomously sweep the viewing area with a slight zoom and capture high-definition video is crucial.

In future studies, we plan to evaluate the smoke detection rate during nighttime and rainy conditions. At night, detecting smoke in urban areas poses challenges due to low light conditions. Streetlights can assist in detection, and the increasing availability of ultra-high sensitivity CMOS sensors is expected to improve detection rates. During rainfall, the detection rate is expected to decrease due to unclear camera images. Although the likelihood of a fire occurring in such conditions is low, it is important to examine the quantified relationship between rainfall and detection rate.

While we exclusively used smoke footage from factories in this study, future research will incorporate actual fire footage or color-adjusted substitute footage. The primary distinction between factory smoke and fire smoke lies in their color; substitute footage can be created by processing images of factory smoke into black smoke so that it resembles that of a fire.

5 Conclusion

In this paper, we proposed a novel method for detecting daytime smoke using outdoor cameras in urban cities, focusing on the specific properties of smoke. Optical flow is applied to a set of sequential frames in footage to extract smoke areas. However, standard optical flow alone is insufficient, as it also detects other moving objects such as cars and wind turbines. A new approach is thus introduced that applies both the variance of optical flow and HSV color characteristics to the footage.

Our algorithm is developed for high-quality footage with high resolution (e.g., 1080p) and high frame rate (e.g., 25 fps) obtained by modern IP network cameras. The algorithm shows better performance than previously proposed ones: an \(F_2\) score above 90% is achieved using high-quality outdoor camera footage. Accurate setting of threshold values is crucial in this type of algorithm, as detection accuracy often depends on them; we proposed methods for deriving appropriate threshold values, particularly through color analysis. Using solar radiation datasets, we found no significant dependence of smoke detection on weather conditions.

Our technique in this study has the potential for even higher smoke detection capabilities when combined with IP network cameras equipped with PTZ functions. This integration can further enhance the effectiveness and efficiency of smoke detection in urban areas.