Wildfires have significant economic and environmental impacts on nearby communities. They have become more frequent over the years and are predicted to become more dangerous. As a result, predictive measures are needed to contain fires early, and one popular method is to apply machine learning to datasets of previous fires. However, those datasets are typically very large, which makes it harder for machine learning to be effective. This paper investigates using machine learning to reduce the complexity of wildfire datasets. Two machine learning models are trained on two years of a large US forest fire database to classify wildfire sizes: one using all of the roughly 300 attributes, and one using 20 selected attributes. Balanced accuracy decreases by only 0.7%, indicating that the reduced attribute set retains most of the predictive power. Additionally, the number of fire stations near a fire’s origin emerges as a notable factor among the 20 selected attributes, a finding that could help improve wildfire prevention systems.
Andrew Lu studied this question.
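The comparison above is reported in balanced accuracy. As a minimal sketch, assuming the usual definition (the mean of per-class recall), the metric can be computed as follows. The labels `small` and `large`, the helper name `balanced_accuracy`, and the class proportions are hypothetical and chosen only for illustration; the abstract does not state the paper's size classes or how it computes the metric.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one sample")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical example: 8 small fires and 2 large fires (assumed imbalance).
y_true = ["small"] * 8 + ["large"] * 2

# A trivial classifier that always predicts the majority class.
always_small = ["small"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, always_small)) / len(y_true)
bal = balanced_accuracy(y_true, always_small)
```

In this made-up example the trivial classifier is correct on 8 of 10 samples, yet its balanced accuracy is only one half, because it never identifies a large fire. That weighting of each class equally is one plausible reason to prefer balanced accuracy when size classes are unevenly represented.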