What question did this study set out to answer?

The aim is to create a detailed dataset of crop distributions in Xinjiang using multi-year satellite imagery.

March 25, 2026 | Open Access

A 30 m Multi-Year Dataset of Major Crop Distributions in Xinjiang, China (2013–2024) Based on Harmonized Landsat–Sentinel-2 Data

Key Points

Aimed to build a 30 m, multi-year (2013–2024) dataset of major crop distributions (cotton, maize, wheat, and rice) in Xinjiang from satellite imagery.
Used the Google Earth Engine cloud platform for data processing.
Fitted harmonic models to NDVI and LSWI time series to characterize crop phenological trajectories.
Applied a Random Forest algorithm to identify crop types.
Used a pre-extracted cropland mask to reduce interference from non-crop vegetation and background land cover.
Achieved overall accuracies of 0.90 and 0.93 (Kappa 0.86 and 0.89) against independent validation samples from 2018 and 2019.
Producer accuracies ranged from 0.83 to 0.99.
User accuracies ranged from 0.83 to 0.96.
Estimated prefecture-level crop areas were consistent with official statistical yearbooks.
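
For readers less familiar with the accuracy terms in the key points, the sketch below shows how overall accuracy, producer and user accuracies, and the Kappa coefficient are conventionally derived from a confusion matrix. It is an illustration only: the function name `accuracy_metrics`, the matrix layout (rows as reference classes, columns as mapped classes), and the counts are assumptions made for this example, not the study's code or validation data. It also assumes every class has at least one reference and one mapped sample.

```python
def accuracy_metrics(matrix):
    """Derive standard accuracy metrics from a square confusion matrix.

    Assumed layout: matrix[i][j] is the number of validation samples whose
    reference class is i and whose mapped class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    reference_totals = [sum(matrix[i]) for i in range(n)]
    mapped_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    # Overall accuracy: share of samples on the diagonal.
    overall = sum(diagonal) / total
    # Producer accuracy: correct samples divided by reference totals.
    producer = [diagonal[i] / reference_totals[i] for i in range(n)]
    # User accuracy: correct samples divided by mapped totals.
    user = [diagonal[j] / mapped_totals[j] for j in range(n)]
    # Kappa: agreement beyond what the marginal totals would give by chance.
    expected = sum(r * c for r, c in zip(reference_totals, mapped_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producer, user, kappa


# Made-up two-class counts for illustration (not the study's data).
example = [
    [45, 5],
    [10, 40],
]
overall, producer, user, kappa = accuracy_metrics(example)
print(round(overall, 2), [round(p, 2) for p in producer], round(kappa, 2))
```

Producer accuracy reflects omission errors (reference samples the map missed), while user accuracy reflects commission errors (mapped samples that are wrong), which is why the two ranges are reported separately.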

Abstract

Accurate and timely information on the spatial distribution of crops is essential for ensuring food security, achieving sustainable agricultural management, and understanding ecosystem interactions. However, in large arid regions such as Xinjiang, China, constructing high-spatial-resolution, continuous, multi-year crop distribution datasets remains a significant challenge because of complex terrain, sparse ground observations, and limited computational resources. In this study, we developed a robust crop classification framework on the Google Earth Engine (GEE) cloud platform. The framework integrates all available NASA Harmonized Landsat–Sentinel-2 (HLSL30) imagery to construct harmonic models based on the NDVI and LSWI indices, which characterize crop phenological trajectories. These features are combined with a Random Forest (RF) algorithm to identify major crop types. To minimize interference from non-crop vegetation and background land cover, we applied a pre-extracted cropland mask. Using this approach, we generated a 30 m resolution dataset of major crops (cotton, maize, wheat, and rice) across Xinjiang for the period 2013–2024. Accuracy assessments using independent validation samples from 2018 and 2019 yielded producer accuracies of 0.83–0.99 and user accuracies of 0.83–0.96. The overall accuracy reached 0.90 and 0.93, with Kappa coefficients of 0.86 and 0.89, respectively. Furthermore, the estimated crop areas at the prefecture level are highly consistent with official statistical yearbooks and align well with existing distribution maps of cotton, maize, and wheat. This dataset provides a systematic characterization of the long-term spatial dynamics of major crops in Xinjiang, offering reliable data support for regional agricultural monitoring, food security assessment, policy formulation, and environmental change research.
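
As a rough illustration of the harmonic-model idea described in the abstract, the sketch below computes NDVI and LSWI from surface reflectance with their standard band-ratio definitions and fits a first-order harmonic curve to one pixel's time series by least squares. Several things here are assumptions for the example: the abstract does not state the harmonic order or fitting procedure, the helper names (`ndvi`, `lswi`, `solve_linear`, `fit_harmonic`) are hypothetical, and the time series is synthetic. The fit also assumes observations are spread well enough through the year for the system to be solvable. The study itself runs on Google Earth Engine rather than in plain Python.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def lswi(nir, swir1):
    """Land Surface Water Index from NIR and shortwave-infrared reflectance."""
    return (nir - swir1) / (nir + swir1)


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_harmonic(times, values):
    """Least-squares fit of y = c0 + c1*cos(2*pi*t) + c2*sin(2*pi*t).

    times are fractional years; returns the coefficients plus the amplitude
    and phase, which could serve as per-pixel classification features.
    """
    rows = [[1.0, math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)] for t in times]
    # Normal equations: (A^T A) c = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    c0, c1, c2 = solve_linear(ata, aty)
    amplitude = math.hypot(c1, c2)
    phase = math.atan2(c2, c1)
    return c0, c1, c2, amplitude, phase


# Synthetic NDVI series: mean 0.4, seasonal amplitude 0.3, peak at t = 0.55 yr.
times = [i / 24 for i in range(24)]
values = [0.4 + 0.3 * math.cos(2 * math.pi * (t - 0.55)) for t in times]
c0, c1, c2, amplitude, phase = fit_harmonic(times, values)
peak_time = (phase / (2 * math.pi)) % 1.0  # fraction of the year at the fitted peak
```

In a workflow of this kind, the fitted coefficients (mean, amplitude, phase) for each index would typically be stacked per pixel and passed to the classifier, so crops with different seasonal timing become separable even when individual cloud-free observations are sparse.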
