For detailed descriptions of the first part of the project, please see the two Master's theses by Burger (2015) and Cantzler (2015) (both in German) and the paper by Kolbe et al. (2015) (in English). This section briefly describes the most important aspects of the first part of the project.


Challenges and Tasks

The generation of a semantic 3D city model for NYC faces a number of challenges. First, there are issues concerning data transformation and integration. All spatial datasets on the NYC Open Data Portal have 2D geometries, except for the DTM, which is 2.5D. 3D geometries must be generated from these 2D geometries, but the methods differ substantially between feature types. In some cases new 3D objects have to be created from the given 2D representation (e.g. volumetric building shapes from footprints, or areal 3D street surfaces from 2D center lines). The geometries of the source datasets use different coordinate reference systems (CRS). To generate an integrated 3D city model with aligned 3D geometries, a common CRS should be used.

Another major challenge is the semantic transformation from the separate datasets, which are defined and provided by the different departments of the NYC administration, to the common semantic model of CityGML. The difficulty here is to define the correct mappings from the source data models to that of CityGML. A preliminary investigation showed that many mappings are not just 1:1, but often 1:n and sometimes even n:m, meaning that n objects from the NYC datasets have to be mapped to m CityGML objects. Since we intend to enrich all objects of the 3D city model with thematic attributes, suitable and relevant datasets need to be identified first.

Last but not least, the handling of the huge data volumes must be addressed explicitly. New York City is very large in both its areal and its vertical extent. The methods and tools must therefore be able to cope with large files and a huge number of spatial objects. Geoobjects with large spatial extents may need to be subdivided so that they can be handled and used efficiently.

Considered Object Types
  • Digital Terrain Model (DTM)
    The NYC Open Data Portal provides a DTM covering the entire city territory. It consists of a single rectangular raster dataset of 158,100 x 156,100 cells with a cell resolution of 1 x 1 ft, covering an area of around 2,300 km². The file size is 121 GB, which makes it difficult to work with outside a geoinformation system or a geodatabase management system. The DTM is based on data acquired in an airborne LiDAR measurement campaign carried out in 2010. Since we intend to integrate street and water surfaces with the DTM, we decided to triangulate the raster-based DTM according to a regular tiling schema, which creates CityGML TINRelief components that can easily be handled by users and applications. With a tile size of 250 m x 250 m, a total of 35,153 tiles were created. To ensure that the height profiles of neighboring tiles match along their boundaries, the height profiles of the bounding boxes of all tiles were computed in the form of 3D lines.
  • Buildings
    Building objects are generated from a building footprint dataset. The footprints are first elevated to a base height by projecting each 2D polygon onto the DTM and selecting the height value of the lowest polygon point. This ensures that all wall surfaces are completely grounded on the terrain. The footprint polygons are then extruded vertically according to the measured height value that comes with each polygon, effectively creating an LOD1 3D solid geometry. The original building footprint dataset contains 1,082,005 polygons with 15 attributes each. Garages and sheds are also represented by their footprints. Some of the building attributes, such as the building name, usage, function, and measured height, are mapped to the respective predefined attributes of the CityGML building model. The others, such as the building identification number (BIN) and the borough block lot number (BBL), are represented by generic attributes of the CityGML building objects. For each building the volume of the 3D solid geometry is computed and added as an attribute, which enables simple subsequent queries and computations without further 3D geometric operations.
  • Streets and Roads
    The NYC Open Data Portal provides street geodata within the so-called LION dataset of the Department of City Planning. It consists of street segments which are geometrically represented by 2D center lines. Furthermore, it contains 2D point objects representing street crossings. The two datasets establish a geometric-topological network, i.e. a graph in which street line segments represent the arcs and street crossing points the nodes. Both the street center lines and the crossing points come with a number of thematic attributes such as street name, traffic direction, height level, and priority for snow removal. We wanted the 3D model to be as realistic as possible. Thus, instead of buffering the center lines by some default street width, we determined the widths of all street segments individually. For this purpose we used the land cover classification map that is also available as Open Data. The employed method creates sample points along the center line in regular steps. For each sample point, the orthogonal distances from the center line to the last cell of the land cover map that is classified as street area are then determined for the left and the right side. Since street widths generally vary around street crossings, distances from these areas are filtered out. A histogram analysis is performed on the sampled distances, and the distance value class with the highest number of occurrences is considered to represent the street width. The center line is then buffered according to the width of the street segment. All street segments have two attributes indicating the qualitative height level at the start and the end of the segment, respectively. The level information is given by one of 17 different letters, which means that 17 different height levels relative to the ground are distinguished. Since no quantitative height information is given, we assumed a vertical distance of 4 m between two consecutive levels.
  • Other Feature Types
    Besides the DTM, buildings, and streets, further feature types were generated. From the NYC MAPPLUTO dataset of the Department of Finance, the land lots, which are geometrically represented by 2D polygons, were transformed to 866,853 CityGML LandUse objects. Each object has 75 thematic attributes, including land ownership and tax assessment information. NYC parks were also transformed to CityGML LandUse objects. Their 2D spatial extents were mapped onto the DTM, resulting in triangulated 3D surface geometries. Each of the 16,159 objects has 10 thematic attributes. The dataset on street trees contains 623,920 entries, but it turned out that the same trees were contained multiple times and that the 2D center points typically lie in the middle of the streets and not on either street side. Nevertheless, a 3D tree model in the shape of a 'lollipop' was created at the location of each tree. All trees are represented by 277,108 CityGML SolitaryVegetation objects with 16 thematic attributes each. Water bodies are provided in the NYC Open Data Portal as attributed 2D polygons. Since rivers are often represented by just one polygon, they were segmented into smaller extents. The 2D polygons were then mapped onto the DTM, generating 9,542 CityGML 3D WaterSurface objects.
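
The DTM figures given above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the project's own tooling: it assumes that "ft" means the international foot (0.3048 m) and that the tiling grid covers the full raster rectangle. The function names are made up for this illustration.

```python
import math

# Assumption: "ft" is read as the international foot (0.3048 m).
FT_TO_M = 0.3048

def raster_extent_m(cols, rows, cell_ft=1.0):
    """Return (width, height) of the raster in metres."""
    return cols * cell_ft * FT_TO_M, rows * cell_ft * FT_TO_M

def tile_grid(width_m, height_m, tile_m=250.0):
    """Number of tile columns and rows needed to cover the full rectangle."""
    return math.ceil(width_m / tile_m), math.ceil(height_m / tile_m)

# Raster dimensions and tile size as stated in the text.
width_m, height_m = raster_extent_m(158_100, 156_100)
area_km2 = width_m * height_m / 1e6
tiles_x, tiles_y = tile_grid(width_m, height_m)
cells_per_tile_side = 250.0 / FT_TO_M

print(f"extent: {width_m:.0f} m x {height_m:.0f} m, area: {area_km2:.0f} km2")
print(f"full grid: {tiles_x} x {tiles_y} = {tiles_x * tiles_y} tiles")
print(f"about {cells_per_tile_side:.0f} raster cells per tile side")
```

Under these assumptions the rectangle covers about 2,293 km², which agrees with the stated "around 2,300 km²". The full grid has 193 x 191 = 36,863 tiles, somewhat more than the 35,153 tiles reported; the text does not say which tiles were left out.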
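
The building generation described above (base height from the lowest footprint point on the DTM, vertical extrusion by the measured height, volume as an attribute) can be sketched as follows. This is an illustration under simplifying assumptions, not the FME workspace used in the project: the terrain is a plain function `terrain_height(x, y)`, only the footprint vertices are sampled, the footprint is a simple polygon without holes, and all names are hypothetical.

```python
def polygon_area(ring):
    """Area of a simple 2D polygon (shoelace formula); ring is a list of (x, y)."""
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def extrude_footprint(ring, terrain_height, measured_height):
    """Build a simple LOD1 block from a footprint.

    Assumption: the terrain is sampled at the footprint vertices only.
    """
    # Base height: terrain height at the lowest footprint point.
    base_z = min(terrain_height(x, y) for x, y in ring)
    top_z = base_z + measured_height
    return {
        "floor": [(x, y, base_z) for x, y in ring],
        "roof": [(x, y, top_z) for x, y in ring],
        # Volume of a vertical prism: footprint area times height.
        "volume": polygon_area(ring) * measured_height,
    }

# Example: a 10 x 20 footprint on a gently sloping, made-up terrain.
footprint = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0), (0.0, 20.0)]
block = extrude_footprint(footprint, lambda x, y: 5.0 + 0.1 * x, measured_height=30.0)
print(block["floor"][0][2], block["roof"][0][2], block["volume"])
```

Because the solid is a vertical prism, the volume follows directly from the footprint area and the height, which is why it can be stored once as an attribute and reused in later queries.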
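
The street width estimation described above can be illustrated with a small histogram-mode computation. The sketch below makes several assumptions that the text does not specify: each sample is a tuple of (distance along the segment, left distance, right distance), the left and right distances are summed to a width before binning, samples within a fixed margin of the segment ends are treated as crossing areas, and ties are resolved toward the narrower class. The parameter values are placeholders.

```python
from collections import Counter

def estimate_street_width(samples, segment_length,
                          crossing_margin=15.0, class_width=0.5):
    """Return the centre of the most frequent width class, or None.

    samples: list of (station, left_distance, right_distance) tuples.
    crossing_margin and class_width are illustrative values, not from the source.
    """
    # Filter out samples close to the crossings at both segment ends.
    widths = [
        left + right
        for station, left, right in samples
        if crossing_margin <= station <= segment_length - crossing_margin
    ]
    if not widths:
        return None
    # Histogram: count the widths per class of size class_width.
    classes = Counter(int(w // class_width) for w in widths)
    # Most frequent class; ties go to the narrower class (assumption).
    best_class = min(classes, key=lambda c: (-classes[c], c))
    return (best_class + 0.5) * class_width

# Made-up samples every 10 m along a 100 m segment; the ends widen at crossings.
samples = [
    (0.0, 20.0, 20.0), (10.0, 15.0, 15.0),
    (20.0, 6.0, 6.1), (30.0, 6.1, 6.1), (40.0, 6.0, 6.2),
    (50.0, 7.0, 7.0), (60.0, 6.1, 6.2), (70.0, 4.5, 4.5),
    (80.0, 6.0, 6.1),
    (90.0, 15.0, 15.0), (100.0, 20.0, 20.0),
]
street_width = estimate_street_width(samples, segment_length=100.0)
buffer_distance = street_width / 2.0  # buffer the center line by half the width
print(street_width, buffer_distance)
```

Taking the most frequent class rather than the mean keeps isolated outliers, such as a parking bay or a misclassified cell, from distorting the result.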
Implementation and Results

All of the described transformation processes were implemented as workspaces for the spatial Extract, Transform, and Load (ETL) tool Feature Manipulation Engine (FME). For the storage, intermediate processing, and management of the 3D city model, the Open Source 3D geodatabase 3DCityDB (version 3.0) was employed. In a first step, the NYC datasets were processed, integrated, brought into the same coordinate reference system, and transformed to CityGML (all with base height 0). In a second step, the resulting CityGML files were imported into a 3DCityDB account. To integrate the features with the DTM and to cope with the large data volumes and the huge number of objects, an ETL master process was defined that runs its subprocesses only on tiles of the stored geodata. The data inside each tile were then transformed to 3D, enriched with further attributes, and re-imported into the 3DCityDB. In general, all CityGML objects are enriched with external references whose URLs point to the download addresses of the original datasets, together with the object IDs within the respective datasets.
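
The tile-wise master process can be pictured as a loop that hands one tile's objects at a time to a subprocess. The sketch below is a simplified stand-in, not the FME workspace or a 3DCityDB API: the objects are plain dictionaries held in memory, the grouping step takes the place of a spatial query per tile, and the names `tile_key`, `run_master_process`, and `lift_to_terrain` are hypothetical.

```python
from collections import defaultdict

def tile_key(x, y, tile_m=250.0):
    """Column and row of the tile containing the point (x, y)."""
    return int(x // tile_m), int(y // tile_m)

def run_master_process(objects, process_tile, tile_m=250.0):
    """Group objects by tile and run the subprocess once per tile."""
    by_tile = defaultdict(list)
    for obj in objects:
        by_tile[tile_key(obj["x"], obj["y"], tile_m)].append(obj)
    processed = []
    for key in sorted(by_tile):
        # Only one tile's objects are handed to the subprocess at a time.
        processed.extend(process_tile(key, by_tile[key]))
    return processed

def lift_to_terrain(key, tile_objects, terrain_height=lambda x, y: 10.0):
    """Example subprocess: replace base height 0 by a terrain height.

    The constant terrain function is a placeholder for a DTM lookup.
    """
    return [
        dict(obj, z=terrain_height(obj["x"], obj["y"]), tile=key)
        for obj in tile_objects
    ]

objects = [
    {"id": "a", "x": 10.0, "y": 10.0, "z": 0.0},
    {"id": "b", "x": 260.0, "y": 10.0, "z": 0.0},
    {"id": "c", "x": 20.0, "y": 240.0, "z": 0.0},
]
result = run_master_process(objects, lift_to_terrain)
print([(o["id"], o["tile"], o["z"]) for o in result])
```

Assigning each object to exactly one tile by a reference point means no object is processed twice; how objects that span several tiles were handled in the project is not stated in the text.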

The following table gives an overview of the number of objects in the source datasets and in the resulting CityGML model, together with the number of thematic attributes included in the latter. The file sizes are also given.


The generated 3D city model is complete in the sense that it covers the entire area of New York City. The following image shows a screenshot of the represented street, building, and lot objects, visualized in Google Earth together with semantic information.



A Web-Map-Client Demo of the results can be viewed here.