{"id":1514,"date":"2018-06-22T11:16:08","date_gmt":"2018-06-22T09:16:08","guid":{"rendered":"https:\/\/blogs.fu-berlin.de\/reseda\/?page_id=1514"},"modified":"2018-10-15T11:23:35","modified_gmt":"2018-10-15T09:23:35","slug":"sentinel-big-data-download","status":"publish","type":"page","link":"https:\/\/blogs.fu-berlin.de\/reseda\/sentinel-big-data-download\/","title":{"rendered":"Sentinel BIG DATA"},"content":{"rendered":"<p>Once again it is time for a BIG DATA download, which is useful when you need dozens or hundreds of scenes. People have come up with some Python (e.g., <a href=\"https:\/\/pypi.org\/project\/sentinelsat\/\" target=\"_blank\">sentinelsat<\/a> or <a href=\"https:\/\/github.com\/olivierhagolle\/Sentinel-download\" target=\"_blank\">Sentinel_download<\/a>) and R libraries (e.g., <a href=\"https:\/\/github.com\/IVFL-BOKU\/sentinel2\" target=\"_blank\">sentinel2<\/a>) in recent years to automatically copy Sentinel data from the ESA servers. All of these libraries are based on two Application Programming Interfaces (APIs) provided by the ESA Data Hub, which are used for browsing and downloading the data stored in the rolling archives.<br \/>\nWe will take a closer look at ESA&#8217;s underlying <a href=\"https:\/\/github.com\/SentinelDataHub\/Scripts\/blob\/master\/dhusget_0.3.5.sh\" target=\"_blank\">official script<\/a> and use it to automatically download ALL records of any of your search queries. The script is named <strong>dhusget.sh<\/strong> and is already stored in our VM (or <a href=\"https:\/\/box.fu-berlin.de\/s\/e7dGWGKnJPJi47Y\" rel=\"noopener\" target=\"_blank\">available for download here<\/a>). 
You need a Linux environment with <a href=\"https:\/\/wiki.ubuntuusers.de\/wget\/\" target=\"_blank\">wget<\/a> installed (such as <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/get-your-vm\/\" target=\"_blank\">our VM<\/a>) in order to use the script, which is a Unix shell script (.sh).<br \/>\nFirst of all, let us have a look at a blueprint of the command, which we will then adapt to your needs:<\/p>\n<pre class=\"theme:amityresedaterminal nums:false\">\r\n\/home\/student\/Documents\/dhusget.sh -d https:\/\/scihub.copernicus.eu\/dhus -l 100 -P 1 -C \/media\/sf_exchange\/product_list.csv -u user123 -p password123 -o product -O \/media\/sf_exchange\/sentineldata -R \/media\/sf_exchange\/sentinel_data\/failed.txt -F 'query'\r\n<\/pre>\n<p>Explanation: That is a lot of options! Each option is separated from the next by a space. However, you do not need to modify most of them:<\/p>\n<ul>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">\/home\/student\/Documents\/dhusget.sh<\/span>: location of the shell script in the VM<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-d https:\/\/scihub.copernicus.eu\/dhus<\/span>: the URL of the Data Hub Service<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-l 100<\/span>: number of results per page (100 max.)<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-P 1<\/span>: page number (each page contains a maximum of 100 results)<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-C \/media\/sf_exchange\/product_list.csv<\/span>: write all products found to a list and save it as a CSV text file<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-u user123<\/span>: insert your ESA SciHUB username here<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-p password123<\/span>: insert your ESA SciHUB password here<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-o product<\/span>: 
activate download for the Product ZIP files<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-O \/media\/sf_exchange\/sentineldata<\/span>: folder where downloaded Sentinel data will be stored<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-R \/media\/sf_exchange\/sentinel_data\/failed.txt<\/span>: file listing products which failed to download<\/li>\n<li><span class=\"crayon-inline theme:amityresedaterminal\">-F &#8216;query&#8217;<\/span>: this is the full text query you can copy\/paste from the ESA SciHUB website<\/li>\n<\/ul>\n<p>If you use our VM, you only need to replace the placeholder username <span class=\"crayon-inline theme:amityresedaterminal\">-u user123<\/span> and password <span class=\"crayon-inline theme:amityresedaterminal\">-p password123<\/span> with your own login information. As a last step, insert your search query into <span class=\"crayon-inline theme:amityresedaterminal\">-F &#8216;query&#8217;<\/span>. To do so, visit the ESA SciHUB portal and set up your data query as shown in the <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/esa-scihub\/#3\">Perform a Search<\/a> section. The query can be obtained in text form in the upper part of the result window. 
This is an example of a query returning all Sentinel-2 products sensed globally on 1 and 2 June 2018:<\/p>\n<p><a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/05\/ESA_020.png\"><img decoding=\"async\" src=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/05\/ESA_020.png\" class=\"aligncenter\" \/><\/a><\/p>\n<div style=\"margin: -30px 0 20px 0;text-align: center\"><span style=\"color: #686868;font-size: small\">Free text OpenSearch query example and number of products available<\/span><\/div>\n<p>Select and copy the whole query expression and paste it between the single quotes of the <span class=\"crayon-inline theme:amityresedaterminal\">-F &#8216;query&#8217;<\/span> option, like so:<\/p>\n<pre class=\"theme:amityresedaterminal nums:false\">-F ' ( beginPosition:[2018-06-01T00:00:00.000Z TO 2018-06-02T23:59:59.999Z] AND endPosition:[2018-06-01T00:00:00.000Z TO 2018-06-02T23:59:59.999Z] ) AND (platformname:Sentinel-2) '\r\n<\/pre>\n<p>Open a terminal (taskbar in our VM) and type in the complete command, e.g.:<\/p>\n<p><a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/05\/ESA_021.png\"><img decoding=\"async\" src=\"https:\/\/blogs.fu-berlin.de\/reseda\/files\/2018\/05\/ESA_021.png\" class=\"aligncenter\" \/><\/a><\/p>\n<div style=\"margin: -30px 0 20px 0;text-align: center\"><span style=\"color: #686868;font-size: small\">Terminal with an example download command for 100 Sentinel-2 products<\/span><\/div>\n<p>Pressing Enter starts the download of up to 100 scenes in one run. Previously downloaded scenes will not be downloaded again, and if the internet connection is lost, the download will resume from where it stopped when you restart the script. Once a scene has been fully downloaded, its <a href=\"https:\/\/en.wikipedia.org\/wiki\/MD5\" target=\"_blank\">MD5 checksum<\/a> is verified, telling you whether the file is complete or not. 
If not, the scene is recorded in the text file specified with <span class=\"crayon-inline theme:amityresedaterminal\">-R \/media\/sf_exchange\/sentinel_data\/failed.txt<\/span> to notify you.<\/p>\n<p>When you run a search query on the ESA SciHUB website, you can also see the number of resulting products available (a staggering 24,360 in our example &#8211; you probably do not need that many). However, for some reason ESA capped the maximum number of scenes you can query\/download at once at 100. Because of that, you will only download search results 1-100. In order to download search results 101-200 you have to change the result page with the option <span class=\"crayon-inline theme:amityresedaterminal\">-P 2<\/span>, for search results 201-300 set this option to <span class=\"crayon-inline theme:amityresedaterminal\">-P 3<\/span>, and so on. If required, this could also be automated, either by a shell script or in R.<\/p>\n<p><\/br><\/br><\/p>\n<hr style=\"height:4px;background-color:#6b9e1f\">\n<a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/preprocessing\/\"><br \/>\n<button style=\"width:100%;text-align:right;padding: 10 0;background-color:white;margin:-55px 0 0 0\"><\/p>\n<div style=\"font-family: 'Noto Sans',sans-serif;line-height: 1.2\">\n<span style=\"font-size: 12px;color:#bfbfbf\"><strong><em>NEXT<\/em><\/strong><\/span><br \/>\n<span style=\"font-size: 30px;color:#6b9e1f\"><strong><em>Preprocess<\/em><\/strong><\/span>\n<\/div>\n<p><\/button><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Once again it is time for a BIG DATA download, which is useful when you need dozens or hundreds of scenes. People have come up with some Python (e.g., sentinelsat or Sentinel_download) and R libraries (e.g., sentinel2) in recent years to automatically copy Sentinel data from the ESA servers. 
All of these libraries are based &hellip; <a href=\"https:\/\/blogs.fu-berlin.de\/reseda\/sentinel-big-data-download\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Sentinel BIG DATA&#8221;<\/span><\/a><\/p>\n","protected":false},"author":3237,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1514","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1514","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/users\/3237"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/comments?post=1514"}],"version-history":[{"count":10,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1514\/revisions"}],"predecessor-version":[{"id":2841,"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/pages\/1514\/revisions\/2841"}],"wp:attachment":[{"href":"https:\/\/blogs.fu-berlin.de\/reseda\/wp-json\/wp\/v2\/media?parent=1514"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}