{"id":1578,"date":"2018-06-25T13:25:35","date_gmt":"2018-06-25T11:25:35","guid":{"rendered":"https:\/\/blogs.fu-berlin.de\/reseda\/?page_id=1578"},"modified":"2018-10-15T11:30:29","modified_gmt":"2018-10-15T09:30:29","slug":"landsat-8-preprocessing","status":"publish","type":"page","link":"https:\/\/blogs.fu-berlin.de\/reseda\/landsat-8-preprocessing\/","title":{"rendered":"Landsat 8 Preprocessing"},"content":{"rendered":"<p>Landsat 8 ships as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Tar_(computing)\" target=\"_blank\">tar-archived file<\/a> with the spectral bands as individual georeferenced tif images. We want to stack these images into a single geotiff-file, i.e., into a so-called raster stack. Afterwards, it is much easier to work with such a raster stack. While this could also be done in QGIS, we will use R for this preprocessing, because it is easier to automate things this way. This results in the following \tintermediate steps we have to check off:<\/p>\n<ol>\n<li>unzip your downloaded L8 files<\/li>\n<li>put together the spectral band files of your choice into a rasterstack<\/li>\n<li>save this rasterstack to your hard drive<\/li>\n<li>(optional) delete all redundant data <\/li>\n<li>(optional) create pyramid layers for a better visualization in QGIS <\/li>\n<\/ol>\n<p>We will practice everything exemplarily on the basis of a single L8 Level-2 data set. There will be an exhaustive explanation for each line of code. Based on that we will develop a script that will automatically do everything for you in the future. <\/p>\n<p><\/br><\/p>\n<h1>Prerequisite<\/h1>\n<p>The following content requires that you have either successfully downloaded some Landsat 8 scenes as part of the <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/earthexplorer-exercise\/\">Download Section Exercise<\/a>, or that you have acquired some datasets from the USGS EarthExplorer by your own. If this is not the case, look into the <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/usgs-earth-explorer\/\">chapter Landsat \/ Earthexplorer<\/a>!<\/p>\n<p>Done? &#8211; Then start <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/r-studio\/\">R-Studio<\/a> now!<\/p>\n<p><\/br><a name=\"1\"><\/a><\/p>\n<h1>Preprocess a single dataset<\/h1>\n<p>We recommend writing the following code in the script window of R-Studio and executing it from there (see <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/r-studio\/\">chapter R-Studio<\/a>).<br \/>\nThe example below assumes that you have one or more Landsat 8 scenes in a &#8220;landsatdata&#8221;-folder located in the exchange folder of your VM. Otherwise, adjust the folder according to your data!<\/p>\n<p>We will use the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/raster\/raster.pdf\" target=\"_blank\">raster library<\/a> to write a raster file later. Additional libraries should always be loaded first, using the function <span class=\"crayon-inline theme:amityreseda\">library()<\/span>:<\/p>\n<pre class=\"theme:amityreseda\">\r\nlibrary(raster)\r\n## Loading required package: sp\r\n<\/pre>\n<p>A few libraries make use of other libraries. So does the raster library with the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/sp\/sp.pdf\" target=\"_blank\">sp-package<\/a>, which will be loaded automatically.<\/p>\n<p>Then define the Landsat 8 product of your choice with its entire file path and save it as the string variable <span class=\"crayon-inline theme:amityreseda\">product<\/span>:<\/p>\n<pre class=\"theme:amityreseda\">\r\nproduct &lt;- &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412.tar.gz&quot;\r\n\r\nfile.exists(product)\r\n## [1] TRUE\r\n<\/pre>\n<p><strong>NOTE:<\/strong> this is just an example &#8211; you have to change the file and its path according to your own settings!<br \/>\nYou can use the function <span class=\"crayon-inline theme:amityreseda\">file.exists() <\/span> to check whether the file can be found on your system or not. If it returns <span class=\"crayon-inline theme:amityreseda\">FALSE<\/span>, make sure you did not mess up the file name.<\/p>\n<p>We want to unpack the file into a folder with the same name as the file. The character variable <span class=\"crayon-inline theme:amityreseda\">product<\/span> already contains the full name of the Landsat scene. We just have to get rid of the suffix &#8220;.tar.gz&#8221;. Therefore we can use the function <span class=\"crayon-inline theme:amityreseda\">substr()<\/span> to delete the last seven characters (=&#8221;.tar.gz&#8221;) of the string:<\/p>\n<pre class=\"theme:amityreseda\">\r\nproductname &lt;- substr(product, 1, nchar(product) - 7) \r\n\r\nproductname\r\n## [1] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412&quot;\r\n<\/pre>\n<p>The substring function <span class=\"crayon-inline theme:amityreseda\">substr()<\/span> accepts three arguments here: a string (the entire file path with the suffix), and two integer values &#8211; one for a start (&#8220;1&#8221; = start with the first character) and one for a stop position within the given string. In order to define the stop position, we need to count the number of all characters of the file path via <span class=\"crayon-inline theme:amityreseda\">nchar(product)<\/span> (which is 77 in our case) and substract 7.<\/p>\n<p>Unzip the Landsat product using the <span class=\"crayon-inline theme:amityreseda\">untar()<\/span> method and define the directory to which all data will be extracted by setting the argument <span class=\"crayon-inline theme:amityreseda\">exdir = productname<\/span>:<\/p>\n<pre class=\"theme:amityreseda\">\r\nuntar(product, exdir = productname)\r\n<\/pre>\n<p>You should notice a new folder being created next to your initial data product. This could take a short moment. During processing, a red exclamation point can be seen in the upper right corner of the console window. <strong>NOTE:<\/strong> If this step fails, your data is corrupt. In this case, you will need to download the file again because something went wrong during data transfer.<br \/>\nAfter unpacking is complete, we want to have a look in the newly extracted folder and save the file names of all files in this folder into a vector <span class=\"crayon-inline theme:amityreseda\">productfiles<\/span> using the function <span class=\"crayon-inline theme:amityreseda\">list.files<\/span>. By providing the argument <span class=\"crayon-inline theme:amityreseda\">full.names = TRUE<\/span>, the full paths are returned for each file:<\/p>\n<pre class=\"theme:amityreseda\">\r\nproductfiles &lt;- list.files(productname, full.names = TRUE)\r\n\r\nproductfiles\r\n##  [1] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_ANG.txt&quot;       \r\n##  [2] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_MTL.txt&quot;       \r\n##  [3] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_pixel_qa.tif&quot;  \r\n##  [4] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_radsat_qa.tif&quot; \r\n##  [5] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_aerosol.tif&quot;\r\n##  [6] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band1.tif&quot;  \r\n##  [7] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band2.tif&quot;  \r\n##  [8] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band3.tif&quot;  \r\n##  [9] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band4.tif&quot;  \r\n## [10] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band5.tif&quot;  \r\n## [11] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band6.tif&quot;  \r\n## [12] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band7.tif&quot;  \r\n## [13] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1.xml&quot;\r\n<\/pre>\n<p>These are all 13 files that come with a Level-2 product (see <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/landsat-8\/\">Landsat 8<\/a> for more information). As a double check, you can have a look at the identical files in your file explorer:<\/p>\n<p><a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/06\/preproc_01.png\"><img decoding=\"async\" src=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/06\/preproc_01.png\" class=\"aligncenter\" \/><\/a><\/p>\n<p>Now we want to select all the spectral bands that we want to put into our new data stack. Have a deeper look at the files in <span class=\"crayon-inline theme:amityreseda\">productfiles<\/span>. The files are named after the corresponding spectral channels at the end of the file name, e.g., &#8220;band1&#8221;, &#8220;band2&#8221;, and so on. We use these terms to extract the bands by utilizing the <span class=\"crayon-inline theme:amityreseda\">grep()<\/span> function. The following code looks a little bloated. Maybe it is. But he gives you the freedom to exclude individual bands that you may not need. <strong>NOTE:<\/strong> This is an example of level 2 data. Landsat Level-1 scenes potentially have 11 channels. In the case of Level 1 data, you could easily enter the remaining lines here.<\/p>\n<pre class=\"theme:amityreseda\">\r\nbands &lt;- c(grep(&#039;band1&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band2&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band3&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band4&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band5&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band6&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band7&#039;, productfiles, value=TRUE)\r\n           ) \r\n\r\nbands\r\n## [1] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band1.tif&quot;\r\n## [2] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band2.tif&quot;\r\n## [3] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band3.tif&quot;\r\n## [4] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band4.tif&quot;\r\n## [5] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band5.tif&quot;\r\n## [6] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band6.tif&quot;\r\n## [7] &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412\/LC08_L1TP_193023_20170602_20170615_01_T1_sr_band7.tif&quot;\r\n<\/pre>\n<p>The <span class=\"crayon-inline theme:amityreseda\">grep()<\/span> function searches for the string, which is given as the first argument, e.g., &#8216;band1&#8217;, in the vector of strings (2nd argument). Normally, when the function finds a match, it only returns the position of this find in the vector <span class=\"crayon-inline theme:amityreseda\">productfiles<\/span>. By setting the argument <span class=\"crayon-inline theme:amityreseda\">value=TRUE<\/span> we can write out the content of the vector at the respective position instead. By doing so, we create an vector <span class=\"crayon-inline theme:amityreseda\">bands<\/span> containing all file paths via the standard combine-function <span class=\"crayon-inline theme:amityreseda\">c()<\/span>. <\/p>\n<p>Now we can create a rasterstack based on the vector of bands in the variable <span class=\"crayon-inline theme:amityreseda\">bands<\/span>. In order to do so, we use the function <span class=\"crayon-inline theme:amityreseda\">stack()<\/span>, which is part of the raster-library we loaded in the beginning! This <a href=\"https:\/\/www.rdocumentation.org\/packages\/raster\/versions\/2.6-7\/topics\/stack\" target=\"_blank\">rasterstack function<\/a> creates a rasterstack object based on a list of filenames. You will learn more about the beauty of rasterstacks in the <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/visualization\/\">chapter Visualization<\/a>. <\/p>\n<pre class=\"theme:amityreseda\">\r\nrasterstack &lt;- stack( bands )\r\n<\/pre>\n<p>We can save this rasterstack and store it on your hard drive via <span class=\"crayon-inline theme:amityreseda\">writeRaster<\/span>:<\/p>\n<pre class=\"theme:amityreseda\">\r\nwriteRaster(rasterstack, \r\n            paste0(productname, &quot;.tif&quot;), \r\n            format = &quot;GTiff&quot;,\r\n            overwrite = TRUE\r\n            )\r\n<\/pre>\n<p>The <a href=\"https:\/\/www.rdocumentation.org\/packages\/raster\/versions\/2.6-7\/topics\/writeRaster\" target=\"_blank\">writerRaster fuction<\/a>  is a powerful tool, which is also provided by the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/raster\/raster.pdf\" target=\"_blank\">raster package<\/a>. In line 1 we set our rasterstack object as the first argument. In line 2 we define the output name of the new file we will create. To do this, just add the suffix &#8220;.tif&#8221; to our filename using the handy function <span class=\"crayon-inline theme:amityreseda\">paste0<\/span>, which just put all the strings together to one string. Line 3 explicitly defines the output format &#8220;GTiff&#8221;. How should I know this string &#8220;GTiff&#8221;, you ask? These strings are fixed by the function and are also listed for other data formats in the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/raster\/raster.pdf\" target=\"_blank\">raster documentation<\/a> on page 220. The fourth argument in line 4 only gives the authority to overwrite existing files.<\/p>\n<p><strong>DONE!<\/strong> A new file containing all spectral bands is now written directly next to the initial packed file you downloaded! <\/p>\n<p>There are still two useful additional things left to do:<br \/>\nFirst, we can now delete the folder we created by uncompressing your Landsat data because it is no longer needed. So, if you want to save disk space, use the command <span class=\"crayon-inline theme:amityreseda\">unlink()<\/span> to simply delete the folder. By setting the argument <span class=\"crayon-inline theme:amityreseda\">recursive=TRUE<\/span>, all files within the folder will be deleted:<\/p>\n<pre class=\"theme:amityreseda\">\r\nunlink(productname, recursive=TRUE)\r\n<\/pre>\n<p>Second, it is advisable to create so-called <a href=\"http:\/\/desktop.arcgis.com\/en\/arcmap\/latest\/manage-data\/raster-and-images\/raster-pyramids.htm\" target=\"_blank\">pyramid layers<\/a> for each Landsat dataset. Pyramid layers are used to improve performance. They are a downsampled version of the original raster and speed up the display of raster data by retrieving only the data at a specified resolution that is required for the display. We can automatically create them by using the <a href=\"http:\/\/www.gdal.org\/\" target=\"_blank\">Geospatial Data Abstraction Library<\/a> (GDAL). Initially, GDAL has nothing to do with R, but it can be operated via R using <a href=\"https:\/\/www.rdocumentation.org\/packages\/rgdal\/versions\/1.3-2\" target=\"_blank\">rgdal<\/a>. However, we use the standalone installation of GDAL at this point, or more precisely, the function <a href=\"http:\/\/www.gdal.org\/gdaladdo.html\" target=\"_blank\">gdaladdo<\/a>. If you do not use our VM, you have to install GDAL on your machine first. We build the appropriate command to run <span class=\"crayon-inline theme:amityreseda\">gdaladdo<\/span> using <span class=\"crayon-inline theme:amityreseda\">paste0()<\/span> and pass that command to the <span class=\"crayon-inline theme:amityreseda\">system()<\/span> function. The <span class=\"crayon-inline theme:amityreseda\">system()<\/span> function executes a command exactly as if you entered it yourself in the command window of Linux. Try it!<\/p>\n<pre class=\"theme:amityreseda\">\r\nmakeOVR &lt;- paste0(&quot;gdaladdo -ro &quot;, productname, &quot;.tif 2 4 8 16&quot;)\r\n\r\nmakeOVR\r\n## [1] &quot;gdaladdo -ro \/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412.tif 2 4 8 16&quot;\r\n\r\nsystem( makeOVR )\r\n<\/pre>\n<p>Run the code and you will create a .ovr-file next to your .tif-file. This ovr-file holds the pyramid information, which will prove useful later in QGIS.<\/p>\n<p>Phew! This was A LOT of (explanatory) text by now. Fortunately, the code is actually much shorter! Here is the complete code for preprocessing one exemplary L8 Level-2 file:<\/p>\n<pre class=\"theme:amityreseda\">\r\nlibrary(raster)\r\n\r\nproduct &lt;- &quot;\/media\/sf_exchange\/landsatdata\/LC081930232017060201T1-SC20180613160412.tar.gz&quot;\r\n\r\nproductname &lt;- substr(product, 1, nchar(product) - 7) \r\n\r\nuntar(product, exdir = productname)\r\n\r\nproductfiles &lt;- list.files(productname, full.names = TRUE)\r\n\r\nbands &lt;- c(grep(&#039;band1&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band2&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band3&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band4&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band5&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band6&#039;, productfiles, value=TRUE),\r\n           grep(&#039;band7&#039;, productfiles, value=TRUE)\r\n           ) \r\n\r\nrasterstack &lt;- stack( bands )\r\n\r\nwriteRaster(rasterstack, \r\n            paste0(productname, &quot;.tif&quot;), \r\n            format = &quot;GTiff&quot;,\r\n            overwrite = TRUE\r\n            )\r\n\r\nunlink(productname, recursive=TRUE)\r\n\r\nmakeOVR &lt;- paste0(&quot;gdaladdo -ro &quot;, productname, &quot;.tif 2 4 8 16&quot;)\r\nsystem( makeOVR )\r\n<\/pre>\n<p>Learn how to automate things for many datasets in the next section!<\/p>\n<hr style=\"height:4px;background-color:#6b9e1f\">\n<a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/landsat-8-bulk-process\/\"><br \/>\n<button style=\"width:100%;text-align:right;padding: 10 0;background-color:white;margin:-55px 0 0 0\"><\/p>\n<div style=\"font-family: 'Noto Sans',sans-serif;line-height: 1.2\">\n<span style=\"font-size: 12px;color:#bfbfbf\"><strong><em>NEXT<\/em><\/strong><\/span><br \/>\n<span style=\"font-size: 30px;color:#6b9e1f\"><strong><em>Landsat 8 Bulk Preprocessing<\/em><\/strong><\/span>\n<\/div>\n<p><\/button><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Landsat 8 ships as a tar-archived file with the spectral bands as individual georeferenced tif images. We want to stack these images into a single geotiff-file, i.e., into a so-called raster stack. Afterwards, it is much easier to work with such a raster stack. While this could also be done in QGIS, we will use &hellip; <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/landsat-8-preprocessing\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Landsat 8 Preprocessing&#8221;<\/span><\/a><\/p>\n","protected":false},"author":3237,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1578","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1578","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/users\/3237"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/comments?post=1578"}],"version-history":[{"count":49,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1578\/revisions"}],"predecessor-version":[{"id":2842,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1578\/revisions\/2842"}],"wp:attachment":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/media?parent=1578"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}