Snow, Ice and Environment Over the Tibetan Plateau Zone II Versions EN1 Vol 2 (2) : 0 2017
Phenological metrics dataset, land cover types map for the Tibetan Plateau and grassland biomass dataset for Qinghai Lake Basin
: 2017 - 03 - 29
: 2017 - 9 - 26
Abstract & Keywords
Abstract: With its diverse and extensive alpine vegetation, the Tibetan Plateau is a focal region for research on the spatial distribution and phenological changes of alpine vegetation. Using the MODIS Surface Reflectance Product as the main data source, supplemented by Landsat images, meteorological data and field measurements, we produced a phenological metrics dataset of alpine grasslands and a land cover types map for the Tibetan Plateau, and an alpine aboveground grassland biomass dataset for the Qinghai Lake Basin. The major methods and outcomes of this study are as follows. ① Based on NDVI time series derived from the MODIS Surface Reflectance Product, a threshold method was used together with an asymmetric Gaussian function to extract phenological metrics. The resulting phenological metrics dataset of alpine grasslands (2000 – 2010) shows more spatial and temporal detail than phenological products derived from AVHRR GIMMS. ② With phenological and biophysical metrics extracted from sample data, we applied a support vector machine (SVM) to MODIS images to classify land cover types over the Tibetan Plateau, thereby creating a land cover types map for the Tibetan Plateau (2010). The map has a classification accuracy of 93%, higher than that of the published National Vegetation Atlas. In addition, the new land cover types map shows a clearer and smoother transition from alpine meadow to alpine steppe. ③ Based on MODIS and Landsat TM images, the alpine aboveground grassland biomass dataset (2000 – 2015) for the Qinghai Lake Basin was obtained using the STARFM data fusion algorithm and a non-parametric SVM biomass estimation model. The dataset has an estimated precision of 82%. The three datasets can serve as fundamental data for studies of vegetation over the Tibetan Plateau and provide a reference for related scientific research.
Keywords: Tibetan Plateau; vegetation; phenology; land cover types; grassland biomass
Dataset Profile
In each entry below, (1), (2) and (3) refer to the phenological metrics dataset, the land cover map and the biomass dataset, respectively.
English title
(1) Phenological metrics dataset of alpine grasslands over the Tibetan Plateau (2000 – 2010)
(2) Land cover map for Tibetan Plateau (2010)
(3) Alpine aboveground biomass dataset for Qinghai Lake Basin (2000 – 2015)
Corresponding authors
(1) Wang Cuizhen (cwang@mailbox.sc.edu), Zhang Li (zhangli@radi.ac.cn)
(2) Wang Cuizhen (cwang@mailbox.sc.edu), Zhang Li (zhangli@radi.ac.cn)
(3) Zhang Li (zhangli@radi.ac.cn)
Data authors
(1) Wang Cuizhen, Zhang Li, Liu Shuangyu, Qiu Yubao
(2) Wang Cuizhen, Zhang Li, Qiu Yubao, Zhang Yili
(3) Zhang Binghua, Zhang Li
Time range
(1) 2000 – 2010
(2) 2010
(3) 2000 – 2015
Geographical scope
(1) Tibetan Plateau: 26°00′12″ N – 39°46′50″ N, 73°18′52″ E – 104°46′59″ E; specific areas include: Kunlun Mountains, Karakoram Mountains, Tanggula Mountains, Himalaya Mountains, and other main mountain ranges.
(2) Tibetan Plateau: 26°00′12″ N – 39°46′50″ N, 73°18′52″ E – 104°46′59″ E; specific areas include: Kunlun Mountains, Karakoram Mountains, Tanggula Mountains, Himalaya Mountains, and other main mountain ranges.
(3) Qinghai Lake Basin: 37°17′ N – 38°19′ N, 90°48′ E – 101°11′ E; formed by fault subsidence between Datong, Riyue, and Qinghai Nanshan mountains.
Spatial resolution
(1) 500 m
(2) 500 m
(3) 30 m
Data format
(1) *.GeoTIFF
(2) *.GeoTIFF
(3) *.GeoTIFF
Data volume
(1) 4.08 GB
(2) 70.6 MB
(3) 93.7 GB
Data service system
(1) <http://www.sciencedb.cn/dataSet/handle/397>
(2) <http://www.sciencedb.cn/dataSet/handle/398>
(3) <http://www.sciencedb.cn/dataSet/handle/399>
Sources of funding
(1) Open Research Foundation of the Key Laboratory of Digital Earth Sciences, Institute of Remote Sensing and Digital Earth Sciences, Chinese Academy of Sciences (2013LDE002), International Cooperation and Exchange Project of the National Natural Science Foundation of China (41120114001), National Natural Science Foundation of China (41271372)
(2) International Cooperation and Exchange Project of the National Natural Science Foundation of China (41120114001), National Key Basic Research Program of China (973 Program, No. 2009CB723906)
(3) National Natural Science Foundation of China (41271372), International Partnership Program of Chinese Academy of Sciences (131C11KYSB20160061)
Dataset composition
(1) The phenological metrics dataset of the alpine grasslands over the Tibetan Plateau (2000 – 2010) includes the start and end of vegetation growth season, the season length, the NDVI peak date (pheno_change folder); it has a spatial resolution of 500 m, and a temporal resolution of 1 year. The vegetation phenological metrics trend data (pheno_trend folder) has a spatial resolution of 500 m.
(2) The land cover map for the Tibetan Plateau (LandCoverTypesMap file), with a spatial resolution of 500 m.
(3) The alpine aboveground biomass dataset for Qinghai Lake Basin (2000 – 2015) (biomass result folder), with a spatial resolution of 30 m, and a temporal resolution of 8 days (vegetation growth season, May – September each year).
1.   Introduction
The Tibetan Plateau is the highest plateau in the world. It is called the "roof of the world" and the "Third Pole". The Tibetan Plateau is bordered by the Himalayas to the south, the Kunlun Mountains, the Altyn Mountains and the Qilian Mountains to the north, the Pamir Mountains to the west and the Hengduan Mountains to the east. With an average elevation of over 4000 meters and an area of approximately 2,500,000 square kilometers, the Tibetan Plateau possesses rich natural resources and alpine wildlife resources, with alpine vegetation being an important component of its biodiversity. However, due to its distinctive geographical location, cold arid climate, the snowpack and snowmelt effect [1], and the vulnerability of alpine vegetation, the vegetation of the Tibetan Plateau is very sensitive to climate change. The vegetation phenology, growth condition, and distribution are easily affected by climate change [2–4], with the alpine grasslands showing significant degradation [5]. In view of ongoing global climate change, studies of the phenological, physicochemical and biochemical metrics of vegetation and their relationship with climatic factors have become a popular research topic.
Due to the plateau's vastness, complex climate, and harsh environment, field investigation and observation of this region has been extremely limited, making it difficult for researchers to carry out relevant studies. The predicament has been overcome by remote sensing technology due to its many advantages, including the wide spatial coverage, the high temporal resolution, and the large amount of information acquired.
The moderate resolution imaging spectroradiometer (MODIS) surface spectral reflectance product is an important basic data product that is widely used in vegetation index calculation, surface reflectance retrieval, change monitoring, and other types of research. By establishing a mathematical model, and entering the data to carry out data retrieval and analysis, we can obtain the spatial distribution and variation trends of the vegetation, and the correlation between the metrics in target area, thereby providing data support for research on the Tibetan Plateau.
This article introduces three remote sensing datasets for the Tibetan Plateau alpine vegetation: the phenological metrics dataset of the alpine grasslands over the Tibetan Plateau (2000 – 2010), the land cover map for the Tibetan Plateau (2010) and the alpine aboveground biomass dataset for the Qinghai Lake Basin (2000 – 2015). It also describes the retrieval methods and data accuracy of these datasets, with the purpose of providing basic geographic information data for researchers.
2.   Data collection and processing
2.1   Data collection
The raw data of the three remote sensing datasets were mainly obtained from the MODIS product, the Landsat remote sensing imagery, field observations and other supplementary data. An overview of the data sources is provided in the following sections.
2.1.1   MODIS product and Landsat remote sensing imagery
The datasets were produced using MODIS products MOD09A1 and MCD43A4. MOD09A1 is a 500-m, 8-day composite surface spectral reflectance product, with 46 images collected annually from 2000 to 2010. MODIS product tiles h25v05, h25v06, h26v05, h26v06, and portions of tiles hv05 and h24v05 were combined to provide raw data for the Tibetan Plateau vegetation phenological metrics dataset. Moreover, the 2010 data was also used as the raw data for the Tibetan Plateau land cover map. MCD43A4 is a 500-m, 16-day composite surface reflectance product, covering the vegetation growth season (May – September) from 2000 to 2015. MODIS product tiles hv056v05 and h25v05 were combined to provide raw data for the Qinghai Lake Basin grassland biomass dataset. MODIS products can be downloaded at https://ladsweb.modaps.eosdis.nasa.gov/search/.
The Landsat remote sensing imagery was mainly derived from the Landsat 5 TM data, which can be downloaded at http://espa.cr.usgs.gov/index/.
2.1.2   Field observation data
Various field observation data were used to create the land cover types map for the Tibetan Plateau (2010) and the alpine aboveground biomass dataset for the Qinghai Lake Basin (2000 – 2015). For the former dataset, we used field observation data of the Tibetan Plateau collected in 2011, recorded onsite at 28 alpine meadows, 54 alpine steppes and 2 alpine deserts, for a total of 84 sites, each covering 1.5 km × 1.5 km. These data served as training samples for the dataset generation process. For the latter dataset, we used biomass field observation data collected between 2013 and 2015 during the vegetation growth season (May – September). Based on the collection method, the field observations were classified as fixed and unfixed. For the fixed observations, we selected transect lines as the observation experimental areas and collected their vegetation coverage data 1 – 2 times per month during the growth season (May – September) each year. The unfixed observations were collected in quadrats, with the transect information being encoded. The encoded information for each quadrat included coordinates, main species, grassland altitude, grassland coverage, aboveground biomass, surface temperature and spectral information. The biomass was obtained by harvesting grass samples and recording their fresh weights. Over the three years to the end of 2015, we collected grassland biomass data from a total of 409 quadrats (from the end of July to the beginning of August each year) and grassland coverage data from 1,125 quadrats (May – September each year), including effective biomass data for the Qinghai Lake Basin from 287 quadrats.
2.1.3   Other data
The sample point datasets used during the process of creating the land cover map for the Tibetan Plateau (2010) were based on the following:
(1) 2-m spatial resolution images captured by the aerial hyperspectral imaging experiment conducted during August 11 – 13, 2011 in the central region of Tibet (the experimental design involved 62 flight tracks, 28 of which were parallel to the four routes shown in Figure 1). Using visual interpretation, we randomly selected 33 alpine meadows and 35 alpine steppes, using them to verify the classification accuracy.
(2) The 1:1,000,000-scale Vegetation Atlas of China (ACAS), published in 2001, and other resources published by the Chinese Academy of Sciences [7], from which we randomly selected 22 alpine desert training sample points, using them to produce the land cover map.
(3) The field observation data provided by Zhang Yili et al., from which we selected the 2012 – 2013 alpine desert sample points from 4 sites in the Nima-Gaize region of western Tibet, using them to test the classification accuracy.
(4) The MODIS surface spectral reflectance product (day 217 of 2010), from which we selected 30 perennial snowpacks, 33 lakes, and 38 bare land training sample points to produce the land cover map. Using visual interpretation, we independently selected 50 lakes and 50 snowpacks as sample points to use for testing the classification accuracy [8].
The distribution of partial sample data points is shown in Figure 1.


Figure 1   Map of the 2011 aerial photography flight path, field data and partial distribution of the land cover type sample data
The supplementary data used to produce the alpine aboveground biomass dataset for Qinghai Lake Basin (2000 – 2015) includes the following:
(1) The Landsat-based, 1:100,000-scale land use data for Qinghai Province (2000), obtained from the "Macro-scale survey and dynamic study of natural resources and environment of China by remote sensing", a key project of the Chinese Academy of Sciences during the Eighth Five-Year Plan. The data mainly cover water bodies, construction land, forested land, and agricultural land. It is currently the most precise land use type product in China [9], and was used as a mask for non-grassland areas during the biomass dataset production process.
(2) The global 30-m spatial resolution DEM obtained from the SRTM (Shuttle Radar Topography Mission) program, jointly developed by the U.S. National Aeronautics and Space Administration (NASA) and the National Imagery and Mapping Agency of the U.S. Department of Defense. It was used to calculate the altitude information of the study area. It can be downloaded at http://glcf.umd.edu/data/.
(3) The meteorological data from the dataset of the daily climate data from Chinese surface stations (V3.0) provided by the China Meteorological Science Data Sharing Center, obtained mainly from the national ground station observation data.
2.2   Data processing
2.2.1   Vegetation phenology data processing
The study of the vegetation phenology in the Tibetan Plateau can be divided into three parts: data preprocessing, vegetation phenological metrics extraction and phenological trend analysis [10,11].
(1) Data preprocessing: Using the MODIS land surface reflectance product, we obtained the NDVI time series images and curves for 2000 – 2010. After discarding the significantly low-value pixels, we used the Savitzky-Golay filter to smooth the NDVI time-series curves and reduce cloud contamination.
(2) Vegetation phenological metrics extraction: We used the asymmetric Gaussian function to fit the smoothed curves in order to extract four phenological metrics, namely, the start of growth season, the end of growth season, the season length, and the NDVI peak date, thus providing the vegetation phenological metrics dataset for the Tibetan Plateau (2000 – 2010). The four phenological metrics correspond to the four phenological metric images in the pheno_change folder, respectively. Each image consists of 11 bands, corresponding to the annual vegetation phenological metrics from 2000 to 2010, respectively.
(3) Vegetation phenology trend analysis: Vegetation phenological trends can be obtained using linear regression. However, because autocorrelation exists between the metrics, the residuals of a linear regression do not satisfy the required assumptions, such as zero mean and constant variance. We therefore analyzed the trends using the non-parametric Mann-Kendall method, which yields a more accurate vegetation phenological metric trends dataset for 2000 – 2010. The four phenological metrics correspond to the four phenological metrics trend images in the pheno_trend folder, respectively. This dataset reflects the overall variation tendency of each phenological metric during 2000 – 2010.
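As an illustration of the trend analysis in step (3), the following is a minimal sketch of the Mann-Kendall test for a single pixel. It is not the authors' code: the function name `mann_kendall`, the normal approximation without tie correction, and the example day-of-year values are all assumptions made for illustration.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test.

    Simplifying assumption: no tie correction is applied to the variance.
    """
    n = len(series)
    # S sums the signs of all pairwise differences x[j] - x[i] with i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Hypothetical start-of-season dates (day of year) for one pixel, 2000 - 2010.
start_of_season = [128, 130, 127, 131, 133, 132, 135, 134, 137, 136, 139]
s, z, p = mann_kendall(start_of_season)
trend = "delay" if (p < 0.05 and s > 0) else "advance" if (p < 0.05 and s < 0) else "no change"
```

A positive S with a small p-value would correspond to a date delay, a negative S to a date advancement, and a large p-value to no significant change. The 0.05 significance level is likewise an assumption.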
2.2.2   Plateau land cover data processing
The data processing flow consists of data preprocessing, metrics extraction, and land cover classification [8].
(1) Data preprocessing: Using MODIS surface spectral reflectance product (MOD09A1), we analyzed 46 images of the NDVI time series for the Tibetan Plateau during 2010. Please refer to Section 2.2.1 for the data preprocessing method used for the NDVI curve smoothing and denoising.
(2) Phenological and biophysical metrics extraction: In addition to the four phenological metrics extracted from the smoothed NDVI time series curves, we used the method described in Section 2.2.1 to extract two other biophysical metrics from the NDVI curves, namely, the NDVI peak value and the NDVI growth season sum. These two metrics, which reflect the growth condition and the growth cycle of the grassland, can be used to distinguish various grassland types in different types of land cover.
(3) Land cover classification: Based on the field observation data and the supplementary data, while using the biophysical and phenological metrics of various types of grasslands as training samples, we used a support vector machine (SVM), in combination with a multi-threshold classification method (Figure 2) to divide the Tibetan Plateau into six types of land cover: alpine desert, alpine steppe, alpine meadow, bare land, lake, and snowpack, thus obtaining the land cover map for the Tibetan Plateau (2010).
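The metrics used as classification inputs in steps (2) and (3) can be sketched for a single smoothed NDVI curve as follows. This is a simplified illustration only: it applies a plain amplitude threshold directly to the curve instead of fitting an asymmetric Gaussian function, and the function name `season_metrics` and the 20% threshold fraction are assumptions rather than values taken from the text.

```python
def season_metrics(ndvi, doys, fraction=0.2):
    """Extract simple phenological and biophysical metrics from one NDVI curve.

    ndvi: smoothed NDVI values of one year; doys: matching days of year.
    fraction: share of the seasonal amplitude used as threshold (assumed).
    """
    peak_value = max(ndvi)
    peak_idx = ndvi.index(peak_value)
    base = min(ndvi)
    threshold = base + fraction * (peak_value - base)
    # Start of season: first composite up to the peak at or above the threshold.
    start_idx = next(i for i in range(peak_idx + 1) if ndvi[i] >= threshold)
    # End of season: last composite from the peak onward at or above the threshold.
    end_idx = next(
        i for i in range(len(ndvi) - 1, peak_idx - 1, -1) if ndvi[i] >= threshold
    )
    return {
        "start_doy": doys[start_idx],
        "end_doy": doys[end_idx],
        "length_days": doys[end_idx] - doys[start_idx],
        "peak_doy": doys[peak_idx],
        "peak_value": peak_value,  # biophysical metric: NDVI peak value
        "season_sum": sum(ndvi[start_idx:end_idx + 1]),  # NDVI growth season sum
    }
```

For a MOD09A1-style year, `doys` would hold the 46 composite dates (1, 9, ..., 361). The six returned values could then form the feature vector of one pixel for a classifier such as an SVM.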


Figure 2   Flowchart illustrating the classification of land cover on the Tibetan Plateau
2.2.3   Grassland biomass data processing
The biomass data processing mainly consisted of the STARFM fusion scheme and the establishment of the grassland biomass estimation model [12]. The dataset production workflow is shown in Figure 3.
(1) Data fusion based on the STARFM algorithm: After comparing a variety of fusion schemes using the MODIS MCD43A4 and Landsat TM data, and depending on the acquisition conditions, we selected as input the remote sensing data with an interval of less than one year or of one to two years. Subsequently, using the STARFM fusion algorithm, we fused the vegetation indices calculated separately from the MODIS and Landsat images to create a new set of vegetation indices. Neglecting atmospheric errors and image registration errors, the STARFM algorithm assumes that, for a single object type, a Landsat image pixel can be represented by the weighted average of the MODIS image pixels of the same phase. However, due to the effect of pixel variation, land cover types, phenology, and other factors, it is very difficult to represent a single Landsat pixel using the MODIS imagery. Therefore, in practice, the value of a Landsat pixel is initially predicted using similar pixels adjacent to it. Subsequently, with the predicted pixel as the center of a fixed window, the candidate pixels are screened based on the degree of similarity, the spatial and temporal distance, and other factors, and a weight is assigned to every similar pixel. Finally, the value of the central predicted pixel is obtained as the weighted result. The window is moved across the entire image to predict the pixel values for the entire image. During this process, a 350-m window was used for cultivated land, a 950-m window for forested land, and a 750-m window for grassland and other land types. Finally, we generated NDVI time series images that combine the high temporal resolution of MODIS with the medium spatial resolution of Landsat TM.
(2) Establishment of the grassland biomass estimation model: By combining the NDVI time series produced using data fusion and the field observation data, we separately established parametric and non-parametric biomass estimation models. Based on the accuracy verification and analysis, we selected the support vector regression model as the optimal model, using it to estimate the grassland biomass. We created an 8-day, 30-m grassland biomass dataset for the growth season (May – September) of 2000 – 2015, determined the average biomass value, and created a 16-scene image for the grassland biomass average value of the Qinghai Lake Basin during the growth season of 2000 – 2015.
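The weighting idea behind step (1) can be sketched as follows for a single window. This is a heavily simplified illustration, not the STARFM implementation used for the dataset: it skips the screening of spectrally similar pixels, and the function name `starfm_predict`, the distance scaling and the small constant `eps` are assumptions.

```python
import math


def starfm_predict(fine_tk, coarse_tk, coarse_t0, eps=1e-6):
    """Predict the fine-resolution centre pixel at date t0 from one square window.

    fine_tk:   fine (Landsat-like) values on the base date tk
    coarse_tk: coarse (MODIS-like) values on tk, resampled to the fine grid
    coarse_t0: coarse values on the prediction date t0, same grid
    All three are equally sized square 2-D lists with an odd side length.
    """
    size = len(fine_tk)
    centre = size // 2
    scale = size / 2.0  # assumed scaling of the spatial distance term
    weights, candidates = [], []
    for i in range(size):
        for j in range(size):
            spectral = abs(fine_tk[i][j] - coarse_tk[i][j])    # fine vs. coarse difference
            temporal = abs(coarse_tk[i][j] - coarse_t0[i][j])  # change between the dates
            distance = 1.0 + math.hypot(i - centre, j - centre) / scale
            # Smaller combined differences give larger weights; eps avoids division by zero.
            weights.append(1.0 / ((spectral + eps) * (temporal + eps) * distance))
            # Each neighbour proposes: coarse value at t0 plus its fine-coarse offset at tk.
            candidates.append(coarse_t0[i][j] + fine_tk[i][j] - coarse_tk[i][j])
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, candidates)) / total
```

Moving such a window over every pixel, with a window size chosen per land cover type, would yield a predicted fine-resolution image for each MODIS date.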


Figure 3   Flowchart showing the acquisition of grassland biomass dataset for Qinghai Lake Basin
3.   Sample description
3.1   Phenological metrics dataset of alpine grasslands over Tibetan Plateau (2000 – 2010)
Figure 4 shows the spatial distribution of the Tibetan Plateau vegetation phenological metrics in 2010, while Figure 5 shows the spatial distribution of the phenological metrics trends during 2000 – 2010. Numbers 1 – 6 in Figure 5 represent the various typical ecological and climatic regions. Region 1 represents the east Qinghai-Qilian steppe, which has a cold, semi-arid climate; Region 2 represents the Naqu-Yushu meadow, which has a cold, semi-humid climate; Region 3 represents the south Tibet steppe, which has a warm, semi-arid climate; Region 4 represents the Ali alpine desert, which has a cool, arid climate; Region 5 represents the Qiangtang steppe, which has a cold, semi-arid climate; and Region 6 represents the south Qinghai steppe, which has a cold, semi-arid climate.
Images (a) – (d) in Figure 4 show the start of vegetation growth season (day-of-year), the end of growth season (day-of-year), the season length (days), and the NDVI peak date (day-of-year), respectively, with the days of year being counted from January 1st. Images (a) – (d) in Figure 5 show the day-of-year variation trends of the start of growth season, the end of growth season, the season length, and the NDVI peak date. No change means that there was no significant day-of-year advancement or delay in the corresponding phenological metrics. Date delay indicates that the arrival date of the beginning and the end of the growth season, and the NDVI peak date during the 11-year period were gradually delayed, whereas date advancement indicates that the arrival times of the phenological metrics were gradually advanced. Accordingly, the season length trend is shown either as prolonged or shortened.


Figure 4   Vegetation phenological metrics for the Tibetan Plateau (2010): (a) Start of vegetation growth season; (b) End of vegetation growth season; (c) Season length; (d) NDVI peak date


Figure 5   Vegetation phenological metrics trends for the Tibetan Plateau (2000 – 2010): (a) Start of vegetation growth season; (b) End of vegetation growth season; (c) Season length; (d) NDVI peak date [11]
During 2000 – 2010, nearly 40% of the Tibetan Plateau showed significant phenological variations. The beginning and the end of the growth season, and the NDVI peak date were delayed in the western region, while being advanced in the eastern region. Accordingly, the season length was shortened in the west, while being lengthened in the east. This change is in good agreement with the climatic change occurring in the Tibetan Plateau, namely, the decrease in precipitation in the western part and the increase in precipitation in the eastern part during the growth season. Therefore, with the increase in overall temperature in the Tibetan Plateau, the precipitation trends during the growth season might become the main driving factor behind the changes in the alpine vegetation types and distribution.
3.2   Land cover map for the Tibetan Plateau (2010)
The 2010 land cover map of the Tibetan Plateau in Figure 6 shows that the cover type of the region gradually transitions from alpine meadow in the east to alpine desert in the west. Alpine meadow accounts for one-third of the total area of the Tibetan Plateau and is mainly distributed in the eastern and northeastern parts of the region, which have warm semi-humid and cold semi-arid climates due to the monsoon effect during the growth season. The alpine steppe cover type is mainly distributed in the semi-arid climate of the central and southwestern parts of the Tibetan Plateau. In the western part of the Tibetan Plateau, the land cover types gradually shift to heat- and drought-resistant alpine desert types. The Tibetan Plateau has several lakes. While its northern part mainly consists of desert and bare land, the Karakoram Mountains to the west and the Himalayas to the south are covered by perennial snowpacks and glaciers. Some non-alpine vegetation land cover types, including forest, shrubs and agricultural land, are scattered in the southeastern part of the Tibetan Plateau.


Figure 6   Land cover map for the Tibetan Plateau (2010)
3.3   Alpine aboveground biomass dataset for Qinghai Lake Basin (2000 – 2015)
After obtaining the NDVI time series and entering them into the non-parametric support vector machine model, we estimated the grassland biomass of the Qinghai Lake Basin and obtained the alpine aboveground biomass dataset for the Qinghai Lake Basin (2000 – 2015). Figure 7 shows the spatial variations of the Qinghai Lake Basin grassland biomass at 8-day intervals during the vegetation growth season in 2015. It can be seen from Figure 7 that the biomass generated using the fused vegetation indices shows more spatial heterogeneity. In 2015, the Qinghai grassland vegetation displayed relatively low biomass in May, followed by more vigorous growth in June, with the growth peak lasting from the beginning of July to the beginning of August. At the end of July, the average biomass of the entire basin was approximately 250 g/m², with the vegetation biomass exceeding 500 g/m² in some areas, mainly in the alpine meadow areas on the northern and southern sides of Qinghai Lake and in the central part of the lake basin. At the end of August, the biomass began to decrease gradually until the end of September, when the biomass of the entire basin had fallen to the same level as in May.


Figure 7   Spatial distribution at 8-day interval of the Qinghai Lake Basin grassland biomass during the growth season in 2015
4.   Quality control and assessment
4.1   For the vegetation phenological dataset
4.1.1   Quality control
In order to increase the accuracy and quality of the vegetation phenology dataset, we used the following measures:
(1) To ensure the accuracy of the extracted phenological data, we filtered the NDVI time series images obtained from the surface spectral reflectance products, designating pixels whose maximum NDVI value was less than 0.1 as bare land and excluding them from the metrics extraction process.
(2) We smoothed the original NDVI time-series curves to remove cloud contamination and noise, allowing for a more efficient and accurate extraction of the phenological metrics.
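The two measures above can be sketched as follows. This is an illustrative sketch rather than the authors' processing chain: the 5-point window and quadratic polynomial of the Savitzky-Golay filter are assumptions (the text does not state the filter settings), and the function names are hypothetical.

```python
def is_bare_land(ndvi_series, min_peak=0.1):
    """Flag a pixel as bare land when its maximum NDVI stays below min_peak."""
    return max(ndvi_series) < min_peak


# Savitzky-Golay convolution coefficients for a 5-point window and a
# quadratic polynomial (window and order are assumptions, not from the text).
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol_smooth(ndvi_series):
    """Smooth a series with a 5-point quadratic Savitzky-Golay filter.

    The two values at each end are left unchanged for simplicity.
    """
    smoothed = list(ndvi_series)
    for k in range(2, len(ndvi_series) - 2):
        window = ndvi_series[k - 2:k + 3]
        smoothed[k] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


def preprocess_pixel(ndvi_series):
    """Return None for bare-land pixels, otherwise the smoothed NDVI curve."""
    if is_bare_land(ndvi_series):
        return None
    return savgol_smooth(ndvi_series)
```

A single cloud-contaminated composite appears as a sharp drop in the curve; the filter pulls that value back toward its neighbours while preserving the overall seasonal shape.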
4.1.2   Data assessment
The phenological metrics and the trends present in this dataset were consistent with the climatic variation recorded by 109 weather stations on the Tibetan Plateau. Based on the Tibetan Plateau precipitation data recorded by the weather stations, the precipitation trends and the phenological metrics trends showed consistency in their geospatial distribution, coinciding with the response of vegetation to climate, thus verifying the reliability of the phenological metrics of this dataset.
4.2   For the land cover map
4.2.1   Quality control
In order to increase the accuracy and quality of the land cover map, we used the following measures:
(1) Since we used the vegetation phenological metrics as classification indices during the land cover map generation, in order to ensure the accuracy of the extracted metrics, it was necessary to use the same quality control measures as the ones described in Section 4.1.1 during data processing.
(2) We used a multi-level, comprehensive classification method that combines the phenological and biophysical metrics of vegetation. This reduces the misclassification caused by the spatial and spectral homogeneity of the various grassland types, thus improving the data quality.
4.2.2   Data assessment
In order to evaluate the data quality of this dataset, it was necessary to verify the accuracy of the land cover classification results. We selected several sample points, referred to in Section 2.1.3, as verification points to generate the confusion matrix (Table 1). If the alpine desert steppe and bare land types are ignored, the classification accuracy of this dataset can reach 93%. Among the alpine vegetation cover types, only 1 out of the 33 alpine meadow sample points was mistakenly identified as alpine steppe, corresponding to a user accuracy of 97%, with 5 out of the 35 alpine steppe sample points being mistakenly identified as alpine meadows, corresponding to a user accuracy of 86%. The snowpack and lake classification accuracy was high, as their training sample points and verification sample points were obtained from the same MODIS image scenes.
Table 1   Confusion matrix for land cover map sample verification [8]
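The per-class accuracies quoted above for alpine meadow and alpine steppe follow directly from the sample counts given in the text. The short check below reproduces that arithmetic; the helper name `class_accuracy` is hypothetical, and the counts for the other classes are not stated in the text and are therefore not included.

```python
def class_accuracy(correct, total):
    """Percentage of a class's verification points assigned to the right class."""
    return 100.0 * correct / total


meadow = class_accuracy(33 - 1, 33)  # 1 of 33 meadow points labelled as steppe
steppe = class_accuracy(35 - 5, 35)  # 5 of 35 steppe points labelled as meadow
print(round(meadow), round(steppe))  # prints: 97 86
```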
4.3   For the grassland biomass dataset
4.3.1   Quality control
In order to increase the accuracy and quality of the grassland biomass dataset, we adopted the following measures:
(1) To ensure the data fusion quality, we selected three types of surface reflectance data during the dataset generation process, namely, MOD09A1, MOD09Q1 and MCD43A4, and compared two fusion strategies for verification: calculating NDVI from the fused reflectance data, and calculating NDVI separately from each set of reflectance data before fusing the NDVI images. Moreover, we used the root mean square error (RMSE) and the relative root mean square error (RMSEr) as accuracy evaluation indices for obtaining the optimal data fusion scheme. The formula for the relative root mean square error (RMSEr) is shown in Equation (1) below.
\[\mathrm{RMSE}_{r}=\frac{\sqrt{\frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(p_{(i,j)}-q_{(i,j)}\right)^{2}}}{\bar{p}}\times 100\% \tag{1}\]
where M and N are the numbers of image rows and columns, respectively, p is the original image before fusion, q is the image after fusion, and \(\bar{p}\) is the average pixel value of the original image. RMSEr represents the relative error between the fused and the observed images. The smaller the RMSEr, the better the fusion effect.
(2) During the process of establishing the biomass estimation model, we selected the optimal estimation model by comparing the accuracy of linear, power function, exponential, logarithmic, polynomial, SVM and various other parametric/non-parametric estimation models.
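Equation (1) can be evaluated for a pair of images as in the minimal sketch below. Images are represented as plain nested lists for illustration, and the function names are hypothetical.

```python
import math


def rmse(original, fused):
    """Root mean square error between two equally sized images (2-D lists)."""
    total, count = 0.0, 0
    for row_p, row_q in zip(original, fused):
        for p, q in zip(row_p, row_q):
            total += (p - q) ** 2
            count += 1
    return math.sqrt(total / count)


def relative_rmse(original, fused):
    """RMSEr in percent: RMSE divided by the mean pixel value of the original image."""
    pixel_count = sum(len(row) for row in original)
    mean_p = sum(sum(row) for row in original) / pixel_count
    return rmse(original, fused) / mean_p * 100.0


# Small made-up NDVI images, for illustration only.
observed = [[0.4, 0.6], [0.5, 0.5]]
fused = [[0.5, 0.5], [0.5, 0.5]]
error_percent = relative_rmse(observed, fused)
```

The candidate fusion scheme with the lowest RMSE and RMSEr against the observed image would be preferred.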
4.3.2   Data assessment
Before generating the grassland biomass dataset, we divided the 287 valid field samples of biomass data from the experimental region into two groups. The first group, consisting of 215 points, was used as the training sample for the biomass estimation models. The second group, consisting of 72 points, was used as the model accuracy verification data. The accuracy verification indicators selected for the quality assessment included the correlation coefficient (r), the root mean square error (RMSE), and the relative root mean square error (RMSEr). The larger the r and the smaller the RMSE and RMSEr, the closer the estimated model values are to the observed values; an RMSE or RMSEr of 0 indicates a perfect model with no errors. For the training samples, the retrieval model yielded a correlation coefficient (r) of 0.85, an RMSE of 74.45 g/m², and an RMSEr of 34.5%. For the verification samples, the model yielded a correlation coefficient (r) of 0.82, an RMSE of 78.28 g/m², and an RMSEr of 37.05%. These results are consistent with those of a previous study conducted in the Xilinguole region of Inner Mongolia [13], suggesting high accuracy of the biomass estimation model and the reliability of the dataset results.
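The sample split and the correlation coefficient used in this assessment can be sketched as follows. The text does not say how the 215 training and 72 verification points were chosen, so the plain ordered split below is an assumption, and both function names are hypothetical.

```python
import math


def split_samples(samples, n_train):
    """Split samples into a training part and a verification part.

    Assumption: a plain ordered split; shuffle the list first if a random
    split is wanted.
    """
    return samples[:n_train], samples[n_train:]


def pearson_r(observed, estimated):
    """Pearson correlation coefficient between observed and estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


# Indices stand in for the 287 valid field samples.
training, verification = split_samples(list(range(287)), 215)
```

The correlation coefficient would be computed once between observed and estimated biomass for the training group and once for the verification group, alongside RMSE and RMSEr.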
5.   Value and Significance
In this article, we compared the phenological dataset, the land cover map and the biomass dataset of the Tibetan Plateau with other similar datasets in China and elsewhere. The results show that the three datasets in this study are advantageous in terms of temporal and spatial resolution, as well as product accuracy.
(1) Various researchers have used the AVHRR GIMMS NDVI product to monitor the growth conditions of vegetation on the Tibetan Plateau, and their results varied. For example, some researchers found an increase in vegetation greenness in recent decades [14–17], while others found no significant change in vegetation greenness [18]. Some researchers even concluded that there was a certain level of vegetation degradation on the Tibetan Plateau, with a delay in the start of the alpine steppe growth season [19–21]. Such controversies reflect the large uncertainties of the AVHRR GIMMS NDVI product. The phenological metrics dataset of this study, derived from the MODIS NDVI product, was verified using the method described in Section 4.1, which demonstrated the reliability of both the dataset and the product. Moreover, the MODIS NDVI product used to extract the phenological metrics has a higher temporal and spatial resolution than the AVHRR GIMMS NDVI product.
(2) The land cover map, derived from the MODIS images, was compared with the ACAS vegetation map of the Tibetan Plateau. Using the sample points shown in Table 1, we verified the ACAS map; the resulting confusion matrix is shown in Table 2. Table 1 shows that the overall accuracy of the MODIS image-based classification reached 93%, which was higher than the overall accuracy of 79% for ACAS. Meanwhile, the demarcation between alpine meadows and steppes in the map generated by this study is clearer than that in the ACAS map, with a smoother transition and a more complete boundary region. In addition, the MODIS map has a higher accuracy than the ACAS map in the classification of alpine steppes, alpine meadows, alpine deserts, small lakes, and other surface types [8]. This is mainly because the MODIS image data used in this study were acquired in 2010, and the field data points used for the verification were obtained in 2011 (ground survey and hyperspectral aerial survey). If vegetation changes between image acquisition and field verification are neglected, this dataset has a higher overall classification accuracy. On the other hand, as the original images used for the ACAS atlas were derived from the Landsat TM imagery of the 1980s and 1990s, changes in surface vegetation in recent decades as a result of global climate change have affected the accuracy verification of the ACAS atlas. Because we were unable to obtain field samples from the ACAS period to verify its classification accuracy at the time of dataset production, we selected large plots of vegetation as the verification points to minimize verification errors.
Table 2   Confusion matrix for verifying ACAS map samples [8]
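As a reminder of how an overall accuracy figure is obtained from a confusion matrix, the following Python sketch computes the overall and per-class accuracy. The 3 x 3 matrix is entirely hypothetical and does not reproduce Table 2; the sketch also assumes that rows hold the reference (field) classes and columns hold the mapped classes.

```python
def overall_accuracy(matrix):
    """Share of verification samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_accuracy(matrix):
    """Correct samples of each reference class divided by its row total.

    ASSUMPTION: rows are reference classes, columns are mapped classes.
    """
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

# Hypothetical confusion matrix for three classes (not the values of Table 2)
confusion = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
oa = overall_accuracy(confusion)
class_acc = per_class_accuracy(confusion)
```

With these invented counts, 132 of the 150 samples lie on the diagonal, so the overall accuracy is 0.88.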
(3) Biomass is one of the most important indices for evaluating the condition of plateau grasslands, so estimating it supports the effective management and protection of grassland resources. However, a single remote sensing data source cannot satisfy the high spatial-temporal resolution requirements of grazing grassland surveys [12], which makes fusing the high-spatial resolution TM images with the high-temporal resolution MODIS images clearly advantageous. Figure 8 compares the fused NDVI images with the 500-m MODIS NDVI dataset for samples in the Qinghai Lake Basin. The mean values and standard deviations show no significant discrepancies between the two types of NDVI images, indicating that the fused NDVI image series not only reflect the 500-m MODIS NDVI information effectively, but also combine the high spatial resolution of the Landsat TM images with the high temporal resolution of the MODIS images. In addition, compared with the 500-m MODIS NDVI images, the fused NDVI images better reflect the spatial-temporal variations in vegetation and therefore characterize the distribution and growth conditions of vegetation around the lakeshore more effectively.


Figure 8   Comparison of the fused NDVI and the MODIS NDVI for Qinghai Lake Basin samples in June 2015 [12]
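The comparison of means and standard deviations mentioned above can be sketched in a few lines of Python. The NDVI values are hypothetical, and the sketch assumes that the fused 30-m values have already been aggregated to the sample locations used for the 500-m MODIS values; it only reports the differences and does not test their statistical significance.

```python
from statistics import mean, stdev

def compare_ndvi(fused, modis):
    """Difference in mean and sample standard deviation between two NDVI sample sets."""
    return {
        "mean_difference": mean(fused) - mean(modis),
        "std_difference": stdev(fused) - stdev(modis),
    }

# Hypothetical NDVI values at four sample locations
fused_ndvi = [0.42, 0.55, 0.61, 0.48]
modis_ndvi = [0.40, 0.56, 0.60, 0.50]
summary = compare_ndvi(fused_ndvi, modis_ndvi)
```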
6.   Usage notes
In this article, we have introduced three groups of results for the Tibetan Plateau vegetation study, which was based on phenological metrics extraction, remote sensing data fusion and estimation model building. The study produced a phenological metrics dataset for alpine grasslands (2000 – 2010), a land cover map (2010), and an alpine aboveground biomass dataset for the Qinghai Lake Basin (2000 – 2015).
The phenological metrics dataset for alpine grasslands (2000 – 2010) is divided into two files: pheno_trend and pheno_change. The Tibetan Plateau vegetation phenology thematic map (2000 – 2010), in the pheno_change file, includes four images, representing the start of the vegetation growth season, the end of the growth season, the NDVI peak date and the season length. Each image has 11 bands, one for each year. The unit for the first three phenological metrics is day-of-year, while the unit for the season length is days. The dataset has a spatial resolution of 500 m and a temporal resolution of 1 year. The four vegetation phenological trend images in the pheno_trend file, with a spatial resolution of 500 m, contain the information shown in Figure 5. The images in this dataset are in the GeoTIFF format, which can be opened with the ENVI or ArcGIS software.
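Because the start, end and peak metrics are stored as day-of-year values, users will often want to convert them to calendar dates. The Python sketch below shows one way to do so. It assumes that band 1 corresponds to 2000 and band 11 to 2010 (the band order is not stated in this article and should be checked against the file metadata); the function names are hypothetical.

```python
from datetime import date, timedelta

FIRST_YEAR = 2000  # ASSUMPTION: band 1 holds 2000, band 11 holds 2010
BAND_COUNT = 11

def band_to_year(band):
    """Map a 1-based band index to the year it is assumed to represent."""
    if not 1 <= band <= BAND_COUNT:
        raise ValueError(f"band must be between 1 and {BAND_COUNT}")
    return FIRST_YEAR + band - 1

def doy_to_date(year, day_of_year):
    """Convert a day-of-year value (1 = 1 January) to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

# A start-of-season value of 140 read from band 6 (assumed to be 2005)
start_of_season = doy_to_date(band_to_year(6), 140)
```

The season length is already expressed in days and needs no conversion.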
The land cover map for the Tibetan Plateau (2010), which can be found in the SVM_2010_3x3major_Plateau file within the LandCoverTypesMap folder, has a spatial resolution of 500 m. Within the dataset, class 11 is outside the study area. The images in this dataset are in the GeoTIFF format, which can be opened with the ENVI or ArcGIS software.
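When summarizing the land cover map, pixels of class 11 should be excluded because they lie outside the study area. The Python sketch below tallies the area of each remaining class from a small in-memory grid. The grid and the class codes 1 to 3 are hypothetical, and the area per pixel is taken as the nominal 500 m x 500 m; the true ground area of a pixel depends on the map projection.

```python
from collections import Counter

OUTSIDE_STUDY_AREA = 11        # class code given in the usage notes
PIXEL_AREA_KM2 = 0.5 * 0.5     # nominal 500 m x 500 m pixel

def class_areas_km2(pixels):
    """Area per land cover class in km2, ignoring pixels outside the study area."""
    counts = Counter(
        value for row in pixels for value in row if value != OUTSIDE_STUDY_AREA
    )
    return {cls: n * PIXEL_AREA_KM2 for cls, n in sorted(counts.items())}

# Hypothetical 3 x 4 excerpt of the classified map
grid = [
    [11, 1, 1, 2],
    [11, 1, 2, 2],
    [11, 11, 3, 2],
]
areas = class_areas_km2(grid)
```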
The alpine aboveground biomass dataset for the Qinghai Lake Basin (2000 – 2015) is located in the biomass result folder. The dataset contains the Qinghai Lake Basin grassland biomass data for the growth season (May – September) of 2000 – 2015. It consists of 16 images, one per year, with 20 bands per image, and biomass is given in g/m2. The images have a spatial resolution of 30 m and a temporal resolution of 8 days within the growth season. The images in this dataset are in the GeoTIFF format, which can be opened with the ENVI or ArcGIS software.
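The article does not state which date each of the 20 bands represents. The Python sketch below shows one plausible mapping, assuming that the bands follow the standard MODIS 8-day compositing grid and that band 1 starts on day-of-year 121; under that assumption, 20 steps of 8 days run from day 121 to day 273, which is 1 May to 30 September in a non-leap year. The constants and the function name are hypothetical and should be checked against the file metadata before use.

```python
from datetime import date, timedelta

FIRST_DOY = 121        # ASSUMPTION: band 1 starts on day-of-year 121
STEP_DAYS = 8          # temporal resolution given in the usage notes
BANDS_PER_IMAGE = 20   # bands per yearly image given in the usage notes

def band_start_date(year, band):
    """Assumed start date of the 8-day period stored in a 1-based band index."""
    if not 1 <= band <= BANDS_PER_IMAGE:
        raise ValueError(f"band must be between 1 and {BANDS_PER_IMAGE}")
    day_of_year = FIRST_DOY + (band - 1) * STEP_DAYS
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

first_period = band_start_date(2015, 1)
last_period = band_start_date(2015, 20)
```

In a leap year such as 2000, the same day-of-year values fall one calendar day earlier.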
The above-mentioned datasets can provide reference information for studies on the relationship between vegetation phenology and climate, and on the vegetation distribution and trends of the Tibetan Plateau. The datasets can also be used for a more comprehensive assessment of the ecological response of alpine vegetation to global warming and regional warming on the Qinghai-Tibet Plateau [22]. They can also help to manage grassland resources, maintain the stability of the steppe ecosystem, and study climate change on the Tibetan Plateau or globally on the basis of phenological changes in vegetation.
Acknowledgments
The authors would like to thank the National Natural Science Foundation of China – Dynamic Prediction Model of Grassland Biomass Based on Multi-source Remote Sensing Collaboration (41271372), and the International Cooperation Bureau of the Chinese Academy of Sciences (131C11KYSB20160061) for providing the funding for this dataset. Special thanks to USGS for providing the MODIS and Landsat image data, and to the Chinese Meteorological Data Sharing Service System for providing the observation data of the Tibetan Plateau meteorological stations.
1. Wang K, Zhang L, Qiu Y et al. Snow effects on alpine vegetation in the Qinghai-Tibetan Plateau. International Journal of Digital Earth 8 (2015): 56 – 73.
2. Hou X, Zhang L, Zhang B et al. Research progress in responses of vegetation ecosystem to the climate change in Qinghai-Tibetan Plateau. Journal of Anhui Agricultural Sciences 44 (2016): 230 – 235, 244.
3. Zhang L, Guo H, Wang C et al. The long-term trends (1982 – 2006) in vegetation greenness of the alpine ecosystem in the Qinghai-Tibetan Plateau. Environmental Earth Sciences 72 (2014): 1827 – 1841.
4. Zhang L, Guo H, Ji L et al. Vegetation greenness trend (2000 to 2009) and the climate controls in the Qinghai-Tibetan Plateau. Journal of Applied Remote Sensing 7 (2013): 1 – 17.
5. Li Y, Zhang L, Liao J et al. Remote sensing monitoring of grassland degradation in the central of the northern Tibet. Remote Sensing Technology and Application 28 (2013): 1069 – 1075.
6. Li B, Zhang L, Yan Q et al. Application of piecewise linear regression in the detection of vegetation greenness trends in the Tibetan Plateau. International Journal of Remote Sensing 35 (2014): 1526 – 1539.
7. Zhang Y, Bai W, Liu L et al. Land use and land cover change in the Tibetan Plateau. Workshop on the Third Pole Environment, LUCC and Climate Adaption in Tibetan Plateau. Beijing, 2010.
8. Wang C, Guo H, Zhang L et al. Improved alpine grassland mapping in the Tibetan Plateau with MODIS time series: A phenology perspective. International Journal of Digital Earth 8 (2013): 131 – 150.
9. Liu J, Liu M, Zhuang D et al. Study on spatial pattern of land-use change in China during 1995 – 2000. Science in China 46 (2003): 373 – 384.
10. Liu S, Zhang L, Wang C et al. Vegetation phenology in the Tibetan Plateau using MODIS data from 2000 to 2010. Remote Sensing Information 29 (2014): 25 – 30.
11. Wang C, Guo H, Zhang L et al. Assessing phenological change and climatic control of alpine grasslands in the Tibetan Plateau with MODIS time series. International Journal of Biometeorology 59 (2015): 11 – 23.
12. Zhang B. Application of Data Fusion Technique and SVM on Biomass Estimation in Qinghai Lake Basin Using Remote Sensing Data. Master's Dissertation, University of Chinese Academy of Sciences, 2016.
13. Zhang B, Zhang L, Xie D et al. Application of synthetic NDVI time series blended from Landsat and MODIS data for grassland biomass estimation. Remote Sensing 8 (2016): 10.
14. Yang Y & Piao S. Variations in grassland vegetation cover in relation to climatic factors on the Tibetan Plateau. Journal of Plant Ecology 30 (2006): 1 – 8.
15. Xu X, Chen H & Levy J. Spatiotemporal vegetation cover variations in the Qinghai-Tibet Plateau under global climate change. Chinese Science Bulletin 53 (2008): 915 – 922.
16. Zhong L, Ma Y, Salama M et al. Assessment of vegetation dynamics and their response to variations in precipitation and temperature in the Tibetan Plateau. Climatic Change 103 (2010): 519 – 535.
17. Zhang G, Zhang Y, Dong J et al. Green-up dates in the Tibetan Plateau have continuously advanced from 1982 to 2011. PNAS 110 (2013): 4309 – 4314.
18. Zhang J, Yao F, Zheng L et al. Evaluation of grassland dynamics in the Northern-Tibet Plateau of China using remote sensing and climate data. Sensors 7 (2007): 3312 – 3328.
19. Liang S, Chen J, Jin X et al. Regularity of vegetation coverage changes in the Tibetan Plateau in the past 21 years. Advances in Earth Science 22 (2007): 33 – 40.
20. Mao F, Zhang Y, Hou Y et al. Evaluation of grassland degradation in Naqu Northern Tibet. Chinese Journal of Applied Ecology 19 (2008): 278 – 284.
21. Yu H, Luedeling E & Xu J. Winter and spring warming result in delayed spring phenology on the Tibetan Plateau. PNAS Early Edition (2010): 1 – 6.
22. Wang C. A remote sensing perspective of alpine grasslands on the Tibetan Plateau: Better or worse under "Tibet Warming"? Remote Sensing Applications: Society and Environment 3 (2016): 36 – 44.
Data citations
1. Wang C, Zhang L, Liu S et al. Phenological metrics dataset of alpine grasslands over the Tibetan Plateau (2000 – 2010). Science Data Bank. DOI: 10.11922/sciencedb.397
2. Wang C, Zhang L, Qiu Y et al. Land cover map for Tibetan Plateau (2010). Science Data Bank. DOI: 10.11922/sciencedb.398
3. Zhang B & Zhang L. Alpine aboveground biomass dataset for Qinghai Lake Basin (2000 – 2015). Science Data Bank. DOI: 10.11922/sciencedb.399
Article and author information
How to cite this article
Zhang L, Wang C, Yang H et al. Phenological metrics dataset, land cover types map for the Tibetan Plateau and grassland biomass dataset for Qinghai Lake Basin. China Scientific Data 2 (2017). DOI: 10.11922/csdata.170.2017.0132
Zhang Li
zhangli@radi.ac.cn
Zhang Li, Professor; research area: vegetation ecology and remote sensing. Contribution: design of the grassland biomass estimation scheme and modeling work, production of the vegetation phenology dataset and the land cover map.
Wang Cuizhen
Wang Cuizhen, PhD, Associate Professor; research area: vegetation environment remote sensing and environmental monitoring. Contribution: generation of the vegetation phenological dataset and the land cover map.
Yang Haoxiang
Yang Haoxiang, Master's student; research area: vegetation ecology and remote sensing. Contribution: data processing and analysis.
Zhang Binghua
Zhang Binghua, Master's student; research area: vegetation ecology and remote sensing. Contribution: establishment of the grassland biomass estimation model and data retrieval.
Zheng Yi
Zheng Yi, Master's student; research area: vegetation ecology and remote sensing. Contribution: data processing and analysis.
Publication records
Published: Sept. 27, 2017 (Version EN1)
Released: June 19, 2017 (Version ZH1)
Published: Sept. 27, 2017 (Version ZH2)