Week 5

Data formats, groups and sources in geospatial technology; main data portals of NOAA, NASA, USGS, GIS Clearinghouses

 

 

================================================================

Basic Common GIS Formats

Below is a list of common and less common data formats used in GIS analysis. Some formats are used solely in GIS (e.g. shapefiles, coverages) and some are used solely in remote sensing and earth science (e.g. NetCDF, HDF). Modern software now makes it easier to integrate these data.

Used abbreviations:

ESRI: Environmental Systems Research Institute, Redlands, CA

USGS: United States Geological Survey

NOAA: National Oceanic and Atmospheric Administration

NASA: National Aeronautics and Space Administration

 

 

Related to ArcGIS software (ESRI Inc.):

  1. Coverage (vector), original ESRI format, not much in use

  2. Interchange Format (or Export Format, .e00), not much in use

  3. Shapefile (vector), widely used by ESRI and other GIS software

  4. ESRI GRID (raster), still in use in many ArcGIS applications

  5. ESRI Geodatabases (file .gdb and personal, .mdb), standard compact format

USGS

 

  1. Digital Line Graph (DLG) (vector), USGS, can still be found on many USGS sites, not much in general use

  2. Spatial Data Transfer Standard (SDTS), USGS, can still be found on many USGS sites, not much in general use

  3. Digital Raster Graph (DRG) (raster), USGS, can still be found on many USGS sites, not much in general use

NOAA, NASA

 

  1. NetCDF (.nc)

  2. HDF (.hdf5)

Both formats are very common in meteorological, oceanographic and other earth science data applications.
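Because file extensions on downloaded data are not always reliable, it can help to check the first few bytes of a file. The sketch below is only an illustration: the function name sniff_format is made up for these notes, and it checks only the start of the file (an HDF5 signature can also sit at a later offset if the file has a user block, which this sketch does not handle). Note that NetCDF-4 files are stored as HDF5, so they are reported as HDF5.

```python
# Leading "magic" bytes that identify each container format.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
HDF4_MAGIC = b"\x0e\x03\x13\x01"
NETCDF_CLASSIC_VERSIONS = (b"\x01", b"\x02", b"\x05")


def sniff_format(path):
    """Guess the container format of a file from its first 8 bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        # NetCDF-4 is built on HDF5, so such files land here too.
        return "HDF5 (or NetCDF-4)"
    if head.startswith(HDF4_MAGIC):
        return "HDF4"
    if head[:3] == b"CDF" and head[3:4] in NETCDF_CLASSIC_VERSIONS:
        return "NetCDF classic"
    return "unknown"
```

A call such as sniff_format("D:/yuri/data/sample.nc") (a hypothetical path) returns one of the four labels above without needing any NetCDF or HDF library installed.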

General standard raster/image formats:

Various image formats: .bil, .jpg, .tiff, .sid, etc. These are very commonly used to store raster data in ArcGIS and images from NASA and USGS (e.g. Landsat data).

 

================================================================

 

ESRI FORMATS:

 

Coverage (vector format) and GRID (raster format):

Both data formats have separate folders for spatial data and attribute tables

 

Example: ESRI coverage named “towns” stores municipal boundaries of a township.

Its structure consists of two sub-folders:               /info        /towns

/info is a standard folder for storing non-spatial information (i.e. attribute tables or “what is there?”)

/towns is a standard folder containing spatial information (i.e. features; or “where is it?”)

A directory containing both /info and /towns is called a “Workspace”.

A workspace can hold multiple coverages and grids; however, they all share one common /info directory, which stores their attribute data.

Copying, renaming and deleting datasets can be done only through the ArcInfo interface or ArcCatalog.

NO DRAGGING AND MOVING of these directories!!! You will lose the internal links between the data, and the software will not be able to work with them.
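The folder layout described above can be inspected (read-only, nothing is moved or renamed) with a few lines of Python. This is a rough heuristic sketch, not an ESRI tool: it assumes that any folder with an "info" subfolder is a workspace and reports every other subfolder as a candidate coverage or grid. The function name list_workspace_datasets is invented for this example.

```python
from pathlib import Path


def list_workspace_datasets(folder):
    """Return candidate coverage/grid names, or None if not a workspace.

    Heuristic only: a workspace is any folder with an "info" subfolder;
    all other subfolders are listed as candidate datasets.
    """
    subdirs = [p.name for p in Path(folder).iterdir() if p.is_dir()]
    if "info" not in (name.lower() for name in subdirs):
        return None  # no shared attribute folder, so not a workspace
    return sorted(name for name in subdirs if name.lower() != "info")
```

For a workspace that holds only the “towns” coverage, the function would return ['towns'].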

 

================================================================

 

EXAMPLE OF FILE MANAGER FOLDER WITH WORKSPACE AND COVERAGES/GRIDS.

 

 

 

 

What is the workspace name?

What is the name of the first coverage or grid in the file manager window?

Why do we consider this subdirectory a workspace?

 

================================================================

SHAPEFILE FORMAT

Shapefile characteristics:

  1. Vector format

  2. Has separate files for spatial and non-spatial (attribute table) data

  3. Does not follow or maintain topological rules

  4. Does not automatically calculate arc lengths and polygon areas

  5. Requires a projection definition file (.prj) that stores the spatial reference of your data; you can still view the data without it, but the software will issue a warning message and you will not be able to use them properly together with other data that have a defined spatial reference.
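Since a shapefile does not store lengths and areas for you (characteristic 4), they have to be computed from the vertex coordinates, which GIS software does on request. As a hand-worked illustration only, the sketch below applies the standard shoelace formula to one polygon ring. It assumes planar (projected) coordinates such as metres, ignores holes, and uses made-up function names.

```python
import math


def ring_area(ring):
    """Planar area of a polygon ring (list of (x, y)) via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Sum of straight-line segment lengths around the ring."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(ring_area(square))       # 100.0
print(ring_perimeter(square))  # 40.0
```

With geographic coordinates (degrees) these numbers would be meaningless, which is one more reason the spatial reference in item 5 matters.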

Example: ESRI shapefile that stores municipal boundaries and is called “towns”.

 

towns.shp      (spatial component, main file)

towns.shx      (index file)

towns.dbf       (non-spatial component, attribute dBase table; can be opened with Excel)

towns.prj        (projection definition (spatial reference) file)

The minimum number of shapefile components needed to view it in ArcGIS is three: .shp, .shx and .dbf.
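A quick way to confirm that a downloaded shapefile is complete is to look for its companion files. The following is a minimal sketch using only the Python standard library; the function name check_shapefile and the returned dictionary keys are assumptions made for this example, and the check is case-sensitive on file systems that are (e.g. "TOWNS.SHX" would not match ".shx" on Linux).

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")  # the three components needed for viewing
PRJ = ".prj"                         # spatial reference, strongly recommended


def check_shapefile(shp_path):
    """Report which companion files exist next to a .shp file."""
    shp = Path(shp_path)
    missing = [ext for ext in REQUIRED if not shp.with_suffix(ext).exists()]
    return {
        "missing_required": missing,
        "has_prj": shp.with_suffix(PRJ).exists(),
    }
```

For a complete “towns” shapefile the result would be {'missing_required': [], 'has_prj': True}.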

Files can be zipped (WinZip), RARed (WinRAR), dragged and moved together, but NOT SEPARATED!!!

When compressing the files, make sure that ALL relevant files are inside your .zip or .rar file.

Especially critical are .dbf, .shp and .shx.
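To avoid leaving a component behind when zipping by hand, the bundling can be scripted. The sketch below is one possible approach, assuming that every file whose name starts with the shapefile's base name plus a dot (towns.shp, towns.shx, towns.dbf, towns.prj, towns.shp.xml, ...) belongs to it. The function name zip_shapefile is invented for this example.

```python
import zipfile
from pathlib import Path


def zip_shapefile(shp_path, zip_path):
    """Write every file sharing the shapefile's base name into one .zip.

    Returns the list of file names that were added.
    """
    shp = Path(shp_path)
    out = Path(zip_path)
    prefix = shp.stem + "."
    members = sorted(
        p for p in shp.parent.iterdir()
        if p.is_file()
        and p.name.startswith(prefix)
        and p.resolve() != out.resolve()  # never zip the archive into itself
    )
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in members:
            zf.write(p, arcname=p.name)  # store without folder names
    return [p.name for p in members]
```

Checking that the returned list contains at least the .shp, .shx and .dbf names before sending the archive is a cheap safeguard.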

 

More Info on shapefiles: http://en.wikipedia.org/wiki/Shapefile

 

================================================================

 

EXAMPLE OF FILE MANAGER FOLDER WITH SHAPEFILE

 

 

Name the spatial and non-spatial components of the shapefile.

Name the shapefile component that stores the spatial reference.

 

 

================================================================

INTERCHANGE FILE (aka Export):

 

 1. Compressed version of ESRI Coverage or Grid, similar to TAR, ZIP, RAR, etc.

 2. Combines in one file (.e00) spatial data folders and attribute tables from /info folder.

 

  

 

 3. Helps transfer COVERAGES or GRIDS easily over the internet

 4. Contains only ESRI coverages or grids

 5. Has to be IMPORTED to restore the original coverage or grid. See below.

 Example: ESRI export file that stores municipal boundaries and is called “towns”.

            towns.e00

Can be moved, dragged, etc. outside of ESRI interfaces; however, it has to be converted back to a coverage or grid via ArcCatalog, ArcMap, ArcTools or ArcInfo.

Interchange files are rarely used now, but you can still find them on some government servers.

 

================================================================

 

ESRI GEODATABASES:

 

  1. Consist of data imported from a variety of sources, such as shapefiles, coverages, grids, text files, images, etc. Like a closet.

  2. Compact storage and handling of datasets, easy to transfer

  3. Very efficient for data processing, because a set of rules and properties can be defined and applied to all data within them.

File Geodatabases – Stored as folders in a file system (.gdb). Each dataset is held as a file that can scale up to 1 TB in size.

Personal Geodatabases – All datasets are stored within a Microsoft Access data file (.mdb), which is limited in size to 2 GB. More vulnerable to corruption since it depends on (and is only supported by) Microsoft Windows file system management and security; however, it is very compact and can be easily transferred as one file.
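The two geodatabase types can be told apart on disk by what they are: a .gdb is a folder, while a .mdb is a single file. The sketch below illustrates that distinction and reports how much of the 2 GB limit a personal geodatabase has used. It is a simplification: the function name is made up, 2 GB is interpreted here as 2 x 1024^3 bytes (an assumption), and the suffix check does not verify that the contents are really a geodatabase.

```python
from pathlib import Path

PERSONAL_GDB_LIMIT = 2 * 1024**3  # assumed byte value for the 2 GB limit


def describe_geodatabase(path):
    """Classify a path as a file or personal geodatabase by suffix and type."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".gdb" and p.is_dir():
        return "file geodatabase"
    if suffix == ".mdb" and p.is_file():
        used = p.stat().st_size / PERSONAL_GDB_LIMIT
        return f"personal geodatabase ({used:.0%} of the 2 GB limit)"
    return "not a geodatabase"
```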

Additional Resources:

 

On ArcCatalog and Managing Databases:

http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/A_quick_tour_of_ArcCatalog/006m00000001000000/

 

================================================================

 

GEODATABASE EXAMPLE

 

 

 

================================================================

 

GENERAL GIS DATA PORTALS

 

GIS Data Depot: http://data.geocomm.com/

 US Data by state: http://libinfo.uark.edu/GIS/us.asp

 International Data (by country):  http://libinfo.uark.edu/GIS/international.asp?Category=Geospatial+and+Attribute+Links&Listing=International&SubListing=By+Country

UN Data Portal:  http://geodata.grid.unep.ch/

 WebGIS:  http://www.webgis.com/

 Census Bureau Cartographic Data:  http://www.census.gov/geo/www/cob/

 New York State GIS Clearinghouse: https://gis.ny.gov/

US EPA (Wetlands, Oceans, Watersheds):  http://www.epa.gov/owow/data.html

USGS Water Resources Maps and GIS Information: http://water.usgs.gov/maps.html

USGS Earth Explorer: https://earthexplorer.usgs.gov/

USGS Global Visualization Viewer:  http://glovis.usgs.gov/

Geospatial Platform: https://ckan.geoplatform.gov/

==================================================================

 

PORTALS FOR ENVIRONMENTAL DATA ACCESS:

 

Precipitation:

 

  1. For individual monitoring stations and models use HydroDesktop: http://hydrodesktop.codeplex.com/

  2. For continuous precipitation data in equatorial and sub-equatorial zones use TRMM (only available until 2015): http://trmm.gsfc.nasa.gov/

  3. For global precipitation data use the Global Precipitation Measurement (GPM) mission (http://www.nasa.gov/mission_pages/GPM/main/index.html); to get the data, go to http://disc.sci.gsfc.nasa.gov/, choose Precipitation and then follow with Data Holding, etc.

 

Temperature:

 

  1. For individual monitoring stations and models use HydroDesktop: http://hydrodesktop.codeplex.com/

  2. For continuous global data use Landsat (Band 10 in Landsat 8): http://landsat.usgs.gov/Landsat_Search_and_Download.php

 

Global Topography:

 

  1. Shuttle Radar Topography Mission (SRTM): http://www2.jpl.nasa.gov/srtm/cbanddataproducts.html

  2. Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER): http://asterweb.jpl.nasa.gov/gdem.asp

 

Soils:

 

  1. HARMONIZED WORLD SOIL DATABASE: http://webarchive.iiasa.ac.at/Research/LUC/External-World-soil-database/HTML/

  2. US, NRCS, Web Soil Survey: http://websoilsurvey.nrcs.usda.gov/app/WebSoilSurvey.aspx

 

Land Cover:

 

  1. Global Land Cover Facility: http://glcfapp.glcf.umd.edu:8080/esdi/index.jsp

  2. UN, FAO, Global Land Cover-SHARE of year 2014 – Beta-Release 1.0: http://www.glcn.org/databases/lc_glcshare_en.jsp

 

Climatic Data:

 

National Climatic Data Center: http://www.ncdc.noaa.gov/cdo-web/search

National Weather Service: http://water.weather.gov/precip/

NOAA/NWS Cooperative Observer Network: http://www.wrcc.dri.edu/climatedata/inventory/

New Jersey Climate Data: http://climate.rutgers.edu/stateclim_v1/njclimdata.html

 

Population Data:

 

  1. The WorldPop project: http://www.worldpop.org.uk/

  2. Gridded Population of the World: http://sedac.ciesin.columbia.edu/data/collection/gpw-v3

  3. Global Rural-Urban Mapping Project (GRUMP), v. 1: http://sedac.ciesin.columbia.edu/data/collection/grump-v1

  4. Oak Ridge National Lab, Landscan: http://web.ornl.gov/sci/landscan/

 

Stream Data:

 

USGS, National Hydrography Dataset: http://nhd.usgs.gov/

 

NASA DAACs, GPM, HDF & NetCDF links:

DAAC

Distributed Active Archive Centers:

https://earthdata.nasa.gov/about/daacs

EOS

Earth Observing System:

http://eospso.nasa.gov/

EOSDIS

Earth Observing System Data and Information System:

https://worldview.earthdata.nasa.gov/

https://earthdata.nasa.gov/about

GPM

Global Precipitation Measurement:

http://www.nasa.gov/mission_pages/GPM/main/

IMERG

Integrated Multi-satellitE Retrievals for GPM

http://disc.sci.gsfc.nasa.gov/datareleases/imerg_data_release

HDF

Hierarchical Data Format:

https://eosweb.larc.nasa.gov/HBDOCS/hdf.html

NetCDF

Network Common Data Form:

https://en.wikipedia.org/wiki/NetCDF

 

================================================================

  • The majority of GIS data are distributed in compressed formats: .rar, .zip, .gz, .tar, etc.; NASA stores data mainly in .hdf (HDF5) and .nc (NetCDF)

  • You need to learn the available compression tools for .rar, .zip and .tar.

  • Most of the time I use WinRAR because so far it has proved to be the best at handling binary data compression (I have never had corrupted files using WinRAR); download it from here: http://www.rarlab.com/download.htm

  • When you download a compressed file, keep in mind that you first need to uncompress it and place its contents in a specific directory (e.g. D:/yuri/data )

  • Outside of regular GIS and image formats, ArcGIS can recognize .txt, .csv, .xls and .xlsx files. It does not recognize compressed formats and does not uncompress your downloaded compressed data!!! (I had students spending virtually hours trying to read .rar files in ArcGIS.)
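The uncompress step can also be scripted. The sketch below uses Python's standard shutil.unpack_archive, which handles .zip, .tar and .tar.gz (among others) but not .rar and not a plain single-file .gz; those still need a tool such as WinRAR. The function name unpack_download and the example paths are assumptions for illustration.

```python
import shutil
from pathlib import Path


def unpack_download(archive, target_dir):
    """Extract a .zip/.tar/.tar.gz archive and list the extracted files.

    Returned paths are relative to target_dir, with forward slashes.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(target))
    return sorted(
        p.relative_to(target).as_posix()
        for p in target.rglob("*") if p.is_file()
    )
```

For example, unpack_download("towns.zip", "D:/yuri/data") would place the contents in that directory, after which the extracted .shp file can be added in ArcGIS.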

Creative Commons License
Unless otherwise noted, this work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.