The detailed representation of the street space is tested using the example of New York City. The NYC Open Data Portal provides a large number of datasets with geometric as well as semantic information on street space objects for the entire city, which makes it suitable for detailed street space modelling. All FME Workspaces created in the course of this project will be made available for download on the project's GitHub page. The implementations are based on the OGC standard CityGML 2.0, which is the valid version at the time of writing (July 2017), but already take conceptual ideas into account (as far as they are compliant with this standard).
General Workflow
FME Workspaces
| Number | FME Workspace name | Short description | Used for |
|---|---|---|---|
| 1 | NYC_CenterlinesMergeAttributes | Add additional attributes to the street centerline data generated in project phase 1. Information from different datasets is merged using corresponding attributes such as 'segmentID' or 'gml_name'. | entire city |
| 2 | NYC_filter_complex_Centerlines | Centerlines representing complex interchanges of motorway junctions are filtered out. | entire city |
| 3 | NYC_add_Track_Centerline_to_lod0Network | A lod0Network is created using the centerlines generated in (1) and adding sidewalk centerlines as 'Track' features. | entire city |
| 4 | NYC_Streetspace_PlanimetricData_to_CityGML_LoD2 | Data from the Planimetric Database is manipulated semantically and geometrically. Relevant information contained within the centerline data (generated in 1) is transferred using a spatial correlation method. | entire city |
| 5 | NYC_Streetspace_Planimetric_addTexture | Suitable textures are added to each object generated in (4). | entire city |
| 6 | NYC_advancedDataStructure_Streetspace_PlanimetricData_to_CityGML_LoD2 | An advanced data structure is implemented by creating 'TrafficAreas' as well as 'AuxiliaryTrafficAreas' and assigning these objects to corresponding 'Road', 'Track' and 'Square' features. | central Manhattan |
| 7 | NYC_advancedDataStructure_addTexture_and_2Dto3D | The output from (6) is textured and extruded to 3D objects. | central Manhattan |
The approach employed to generate a detailed street space model for the entire city of New York is described in the following section (FME Workspaces 4 and 5). A slightly modified approach was used to generate a semantically more detailed model for an example area in central Manhattan (FME Workspaces 6 and 7). This second approach is described in the second to last section of this page.
Data Sources
Three main data sources are used to generate a detailed street space model of New York City. Within a study project conducted by the Chair of Geoinformatics of the Technical University of Munich, a semantic 3D city model of New York City was generated (Kolbe et al. 2015). The resulting CityGML datasets are available for download on the project website. The provided road dataset includes a LoD0 line network representation and is the basis for the street model generated in this work. The second major data source is the 'NYC Planimetric Database' provided in the New York City Open Data Portal. It contains representations of a variety of features such as roadbeds, sidewalks, or parking lots in the form of areal Shapefile data. Additional data was gathered from the websites of the NYC Department of Transportation and the NYC Department of City Planning. This includes information such as speed limits and pavement ratings, as well as guidelines on the physical dimensions and materials of street space objects described in the NYC Street Design Manual.
NYC Street Centerlines CityGML LoD0 (Results from Project Phase 1)
The CityGML compliant street centerlines generated in an earlier stage of the project and used as a foundation to create this detailed street space model can be downloaded from the project's website (Download Section → Download: LoD1 datasets for the entire city (previous datasets, generated in 2015) ).
Planimetric Database (Downloaded from the NYC Open Data Portal)
Planimetric Database: https://data.cityofnewyork.us/Transportation/NYC-Planimetrics/wt4d-p43d/data
Individual datasets used in this project can be downloaded from the following web links:
| Data | Sub feature classes | Link to data source | Geometry | Nr. of objects |
|---|---|---|---|---|
| Roadbed | Roadbed, Intersection | https://data.cityofnewyork.us/City-Government/Roadbed/xgwd-7vhd/data | Polygon | 90'396 |
| Sidewalk | Interior Sidewalk, Row Sidewalk | https://data.cityofnewyork.us/City-Government/Sidewalk/vfx9-tbb6/data | Polygon | 49'479 |
| Curbs | - | https://data.cityofnewyork.us/dataset/Curbs/ikvd-dex8/data | Polyline | 211'868 |
| Median | Median Painted, Median Raised, Median Grass | https://data.cityofnewyork.us/Transportation/Median/27b5-th78/data | Polygon | 14'846 |
| Parking Lots | - | https://data.cityofnewyork.us/City-Government/Parking-Lots/e2f7-cs7i/data | Polygon | 20'714 |
| Plazas | - | https://data.cityofnewyork.us/Transportation/Plazas/m4mp-ji5y | Polygon | 1'360 |
Detailed documentation for these datasets, including information on how the data was gathered, is available on the following GitHub page:
Planimetric Database: Capture Rules: https://github.com/CityOfNewYork/nyc-planimetrics/blob/master/Capture_Rules.md
Other datasets and information sources
| Data | Link to data source | Metadata |
|---|---|---|
| Speed Limits | http://www.nyc.gov/html/dot/html/about/vz_datafeeds.shtml#speed | http://www.nyc.gov/html/dot/downloads/pdf/vision-zero-view-metadata.pdf |
| Pavement Rating | https://data.cityofnewyork.us/Transportation/Street-Pavement-Rating/2cav-chmn/data | http://www.nyc.gov/html/dot/downloads/pdf/street-pavement-rating-metadata.pdf |
| Updated Centerline Data (LION Single Line Street Base Map 2016) | https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-lion.page | http://www.nyc.gov/html/dot/downloads/pdf/street-pavement-rating-metadata.pdf |
| NYC Street Design Manual | http://www.nyc.gov/html/dot/html/pedestrians/streetdesignmanual.shtml | |
In addition, suitable textures in the form of JPEG images were used.
Preparations
The original data is pre-processed in several ways before the actual street space model generation can be executed. These first steps are described in the following section. All data transformations as well as geometric and semantic manipulations are executed using the software 'Feature Manipulation Engine' FME (2016.1) by Safe Software.
Coordinate transformation
The datasets contained in the Planimetric Database are delivered via an ESRI geodatabase in New York State Plane Coordinates, Long Island East Zone, NAD83, US foot.
All relevant datasets are transformed into the coordinate reference system (EPSG:32118) using a CsmapReprojector transformer. Specifications for this CRS are given in the table below.
| CRS | Name | Description | Geodetic Datum | Ellipsoid | Unit |
|---|---|---|---|---|---|
| 2D | EPSG:32118 | NAD83 New York State Planes, Long Island, meter | NAD83 | GRS1980 | meter |
| 1D | EPSG:5703 | NAVD88 height | NAVD88 | | meter |
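The unit change involved in this step can be illustrated with a small sketch. It is not the CsmapReprojector transformation itself. It assumes that the source system is the US survey foot variant of the same State Plane zone with otherwise identical projection parameters, in which case the projected coordinates differ only by the unit factor. The function name `ftus_to_meter` is hypothetical.

```python
# Illustration only: the workflow itself uses FME's CsmapReprojector.
# Assumption: source and target CRS share all projection parameters and
# differ only in the linear unit (US survey foot vs. meter).

# One US survey foot is defined as exactly 1200/3937 meter.
FTUS_NUMERATOR = 1200.0
FTUS_DENOMINATOR = 3937.0


def ftus_to_meter(x_ft, y_ft):
    """Scale projected coordinates from US survey feet to meters."""
    x_m = x_ft * FTUS_NUMERATOR / FTUS_DENOMINATOR
    y_m = y_ft * FTUS_NUMERATOR / FTUS_DENOMINATOR
    return (x_m, y_m)


if __name__ == "__main__":
    # Hypothetical input coordinate in US survey feet.
    print(ftus_to_meter(984250.0, 200000.0))
```

For production data a proper reprojection tool should be preferred, since it also validates the datum and zone definitions instead of assuming them.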
Selection of test areas
One of the fundamental challenges of this project is the size of New York City and thus the huge amount of data to work with. In order to get manageable computing times, smaller areas are selected. These 'test areas' are used to try different approaches and to optimize the data manipulation process. Two main test areas are selected: one in central Manhattan (representative of most of New York City) and another one around the 'Bruckner Interchange' (complex junctions and motorways). The corner coordinates of those test areas in the coordinate reference system EPSG:32118 are shown below.
| Test area | Coordinate | South-West | North-West | North-East | South-East |
|---|---|---|---|---|---|
| Central Manhattan | X (meter) | 299854.488 | 299854.488 | 301951.503 | 301951.503 |
| | Y (meter) | 63072.872 | 64606.695 | 64606.695 | 63072.872 |
| Bruckner Interchange | X (meter) | 312445.738 | 312445.738 | 315045.738 | 315045.738 |
| | Y (meter) | 72253.563 | 74853.563 | 74853.563 | 72253.563 |
Entire city of New York and the selected test area in central Manhattan
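Since both test areas are axis-aligned rectangles, selecting features for them reduces to a bounding-box check. The following minimal sketch uses the corner coordinates from the table above. The function names and the representation of a feature as an `(id, x, y)` tuple with one representative point are assumptions made for illustration; the project itself performs the clipping in FME.

```python
# Bounding boxes (min_x, min_y, max_x, max_y) in EPSG:32118, meters,
# taken from the corner coordinates of the two test areas.
TEST_AREAS = {
    "Central Manhattan": (299854.488, 63072.872, 301951.503, 64606.695),
    "Bruckner Interchange": (312445.738, 72253.563, 315045.738, 74853.563),
}


def in_test_area(name, x, y):
    """Return True if the point (x, y) lies inside the named test area."""
    min_x, min_y, max_x, max_y = TEST_AREAS[name]
    return min_x <= x <= max_x and min_y <= y <= max_y


def select_features(name, features):
    """Keep hypothetical (id, x, y) features whose point is in the area."""
    return [f for f in features if in_test_area(name, f[1], f[2])]


if __name__ == "__main__":
    sample = [("a", 300500.0, 64000.0), ("b", 313000.0, 73000.0)]
    print(select_features("Central Manhattan", sample))
```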
Test for data consistency
The datasets used to generate a detailed street space model for New York City need to be checked for their consistency as they originate from different providers and data sources. The main task of this project is to integrate semantic information on streets contained in a CityGML compliant centerline dataset with the exact geometric information provided by areal data on 'Roadbeds' and 'Intersections' (with no or very little semantic information). Those datasets are therefore checked for their consistency by spatial overlay of the transformed but otherwise unmanipulated original datasets. The following tables show the number of centerlines overlapping with one specific object of the 'Roadbed' or 'Intersection' datasets.
For example, 283 of a total of 327 'Roadbed' objects (86,5 %) overlap with exactly one street centerline from the CityGML compliant dataset of central Manhattan. This means semantic information can be transferred from those centerlines to the corresponding areal objects without any further manipulation. For all 'Roadbed' objects that do not overlap with exactly one centerline, a corresponding centerline must be determined. For example, 23 'Roadbed' objects overlap with exactly 2 centerlines, and so on. In these cases, the centerline most likely corresponding to the respective 'Roadbed' object has to be determined. This process is described in the section 'Data Manipulation'. Most 'Intersection' objects overlap with 4 individual centerlines (4-way junction).
The first table shows the results for the test area in central Manhattan, while the second table shows the results for the entire city of New York.
| Number of centerlines overlapping with an areal object (Central Manhattan) | Roadbed | Intersection |
|---|---|---|
| 0 | 1 (0,3 %) | 0 |
| 1 | 283 (86,5 %) | 0 |
| 2 | 23 (7,0 %) | 0 |
| 3 | 10 (3,1 %) | 1 (0,7 %) |
| 4 | 2 (0,6 %) | 114 (85,1 %) |
| 5 | 1 (0,3 %) | 1 (0,7 %) |
| 6 | 3 (0,9 %) | 0 |
| >6 | 4 (1,2 %) | 18 (13,4 %) |
| Total | 327 (100 %) | 134 (100 %) |
| Number of centerlines overlapping with an areal object (Complete NYC) | Roadbed | Intersection |
|---|---|---|
| 0 | 4’320 (6,4%) | 373 (1,6%) |
| 1 | 43’218 (63,9%) | 52 (0,2%) |
| 2 | 3’441 (5,1%) | 32 (0,1%) |
| 3 | 6’603 (9,8%) | 311 (1,3%) |
| 4 | 1’653 (2,5%) | 19’504 (85,3%) |
| 5 | 2’181 (3,2%) | 689 (3,0%) |
| 6 | 772 (1,1%) | 224 (0,9%) |
| >6 | 5’344 (7,9%) | 1’669 (7,3%) |
| Total | 67’542 (100%) | 22’854 (100%) |
As expected, the agreement between the centerline data and the areal objects is better for the test area in central Manhattan than for the entire area of New York City.
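The tallies in the two tables can be reproduced from the overlay result with a simple histogram. The sketch below assumes that the overlay has already been computed and is available as a mapping from each areal object to the list of centerline ids it overlaps. That data structure and the function name are hypothetical; the overlay itself is done in FME.

```python
from collections import Counter


def overlap_histogram(overlaps, max_bin=6):
    """Count areal objects by the number of centerlines they overlap.

    overlaps: dict mapping an areal object id to a list of centerline ids.
    Returns a dict mapping a bin label ('0', '1', ..., '>6') to a tuple
    (number of objects, share in percent rounded to one decimal).
    """
    counts = Counter()
    for centerline_ids in overlaps.values():
        n = len(set(centerline_ids))  # ignore duplicate ids
        label = str(n) if n <= max_bin else ">" + str(max_bin)
        counts[label] += 1
    total = sum(counts.values())
    return {
        label: (count, round(100.0 * count / total, 1))
        for label, count in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical overlay result for four 'Roadbed' objects.
    demo = {
        "rb1": ["c1"],
        "rb2": ["c2", "c3"],
        "rb3": [],
        "rb4": ["c%d" % i for i in range(7)],
    }
    print(overlap_histogram(demo))
```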
Segmentation of complex streets and motorways
Because some streets are very complex, a clear correspondence between street centerlines and areal 'Roadbed' / 'Intersection' objects is not always possible. In order to reduce the number of problematic cases, the street centerline data is filtered for complex street segments such as motorway interchanges. The following images show the result of this segmentation. The left image displays all resulting 'simple' centerlines. The right image shows all 'complex' centerlines including interchanges and motorways.
Semantic information contained in the 'simple' centerlines displayed in the left image is then transferred to areal 'Roadbed' objects using a spatial correlation method. This process is described in the following section. The filtered complex centerlines, on the other hand, are used to generate a 3D street model by using information on height as well as the number of driving lanes.
Data Manipulation (FME Workspaces 1, 2, 4 and 5)
First, the CityGML compliant street centerlines generated in project phase 1 are enriched with additional semantic information. This is achieved by matching attributes from different datasets to the centerline data using identical attributes. For example, the dataset 'Pavement Ratings' contains information on road surface conditions for each centerline segment. Those segments also contain an attribute called 'segment_id', which is also present in the CityGML compliant dataset. Information on pavement ratings can thus be matched using corresponding segment IDs. The same procedure is applied to the datasets 'Speed Limits' and 'LION Single Line Street Base Map 2016'. The attributes contained in this 'new' CityGML centerline dataset are then assigned to corresponding areal 'Roadbed' and 'Intersection' objects using a spatial correlation method.
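The attribute matching described above is essentially a key-based join. The following sketch shows the idea with plain dictionaries. Apart from 'segment_id', all attribute names (such as `rating` and `speed_limit`) and the function name are hypothetical, and existing centerline attributes are deliberately not overwritten, which is a design assumption rather than documented behaviour of the FME Workspace.

```python
def merge_attributes(centerlines, extra_rows, key="segment_id"):
    """Copy attributes from extra_rows onto centerlines with the same key.

    centerlines, extra_rows: lists of dicts (one dict per feature).
    Attributes already present on a centerline are kept unchanged.
    """
    lookup = {row[key]: row for row in extra_rows if key in row}
    merged = []
    for line in centerlines:
        enriched = dict(line)  # do not modify the input feature
        match = lookup.get(line.get(key))
        if match is not None:
            for name, value in match.items():
                if name != key:
                    enriched.setdefault(name, value)
        merged.append(enriched)
    return merged


if __name__ == "__main__":
    lines = [{"segment_id": 101, "gml_name": "Broadway"},
             {"segment_id": 102, "gml_name": "7 Avenue"}]
    ratings = [{"segment_id": 101, "rating": 8}]
    limits = [{"segment_id": 101, "speed_limit": 25},
              {"segment_id": 102, "speed_limit": 25}]
    result = merge_attributes(merge_attributes(lines, ratings), limits)
    print(result)
```

Chaining the call once per source dataset mirrors the way the pavement ratings, speed limits and LION attributes are added one after the other.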
Spatial correlation method
This method is used to transfer semantic information from data of street centerlines in New York City to corresponding areal 'Roadbed' and 'Intersection' objects.
The image below shows the spatial correlation method used to transfer semantic information stored in the street centerline data (black lines) to areal information on the exact shape of 'Roadbed' objects (light red). (1) shows an overlay of the original datasets for a small example area around 'Prospect Park West'. First, using a 'LineOnAreaOverlay' transformer, a centerline is split wherever it intersects the border of a 'Roadbed' shape. Then the 'centerline per Roadbed object ratio' is checked. If a 'Roadbed' object contains exactly one centerline, relevant attributes (e.g. street name, number of lanes or pavement rating) are transferred to the corresponding areal object. This is the case for all 'Roadbed' objects colored green in (2). Areal objects colored orange or yellow contain more than one centerline (in the yellow object, the two touching centerlines are not visible in the image). This means that at first it is unclear which of these centerlines should serve as attribute provider. Next, all centerlines shorter than 18 m are deleted. For all remaining centerlines the 'centerline per Roadbed object ratio' is checked again and semantic information is transferred if there is exactly one centerline per areal object. This leads to the result shown in (3). With the exception of the yellow 'Roadbed', all areal objects now contain the semantic information of the corresponding centerline. The yellow object contains two centerlines with different street names touching in the middle of the object. In order to take this into account, the areal 'Roadbed' is divided into small triangles. Each of these small pieces is then assigned to the nearest street centerline. The yellow object is thus split into two new objects, indicated by the red line in (4). This procedure also enables the assignment of centerlines to corresponding 'Roadbed' objects located next to each other but not overlapping.
This method works very well for most areas of New York City. However, in the case of complex interchanges or motorway bridges, where several centerlines run one above the other, the assignment often leads to faulty results. This is why those centerlines have been filtered out in advance.
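The decision logic of the spatial correlation method can be sketched as follows. The sketch assumes that the centerlines have already been split at the 'Roadbed' borders, so that each 'Roadbed' comes with a list of `(centerline_id, length_in_meters)` pieces, and that the triangle pieces are represented by one point each (for example the centroid). These data structures and all function names are hypothetical; only the 18 m threshold and the nearest-centerline rule are taken from the description above.

```python
import math

MIN_LENGTH_M = 18.0  # centerline pieces shorter than this are discarded


def pick_centerline(pieces, min_length=MIN_LENGTH_M):
    """Return the single matching centerline id, or None if ambiguous.

    pieces: list of (centerline_id, length_m) tuples inside one 'Roadbed'.
    """
    ids = {cid for cid, _ in pieces}
    if len(ids) == 1:
        return next(iter(ids))
    long_ids = {cid for cid, length in pieces if length >= min_length}
    if len(long_ids) == 1:
        return next(iter(long_ids))
    return None  # still ambiguous: fall back to nearest_centerline()


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_centerline(point, centerlines):
    """Return the id of the centerline closest to point.

    centerlines: dict mapping an id to a list of (x, y) vertices.
    """
    best_id, best_dist = None, math.inf
    for cid, vertices in centerlines.items():
        for a, b in zip(vertices, vertices[1:]):
            dist = point_segment_distance(point, a, b)
            if dist < best_dist:
                best_id, best_dist = cid, dist
    return best_id
```

A 'Roadbed' for which `pick_centerline` returns `None` corresponds to the yellow object in the example: its triangle pieces would each be passed to `nearest_centerline` and regrouped by the returned id.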
Semantic and geometric manipulation of other object classes
- Intersections
Intersection objects are also enriched with semantic information contained in the street centerline data by spatial overlay. Multiple intersecting street names are inherited along with a number of other relevant attributes.
- Sidewalks
The dataset 'Sidewalk' contains so-called 'Row Sidewalks' along streets and 'Interior Sidewalks' within parks. The 'Row Sidewalk' data mostly comes in the form of closed ring polygons. These 'donuts' are split into small pieces and enriched with relevant semantic information (e.g. street name) of the nearest 'Roadbed' object. These objects are also extruded by 0.1524 meter (~6 inch). This is specified as the minimum height of every sidewalk in NYC by the 'New York City Street Design Manual'.
- Curbs
The dataset 'Curbs' contains Polyline objects running exactly between 'Row Sidewalk' and 'Roadbed' objects. These polylines are buffered by 0.2 meter and then cut out of the sidewalk objects, thus inheriting relevant attributes such as street name or corresponding segment. Curb objects are also extruded by 0.1524 meter (~6 inch).
- Dividing Strips (Median Raised)
Dividing strips originate from a 'Median' sub feature class called 'Median Raised'. Each dividing strip is enriched with the street name attribute of the nearest 'Roadbed' object and extruded by 0.3048 meter (~12 inch), a value again derived from the NYC Street Design Manual.
- Road Markings (Median Painted)
Road Markings originate from the 'Median' sub feature class 'Median Painted'. Relevant attributes such as street names are easily assigned by cutting those 'Road Marking' objects from overlapping 'Roadbed' objects.
- Green areas (Median Grass)
Green areas also originate from the 'Median' sub feature class called 'Median Grass'.
- Plazas
The dataset 'Plaza' contains all public squares and other sealed areas used by pedestrians in New York City. These objects are also extruded by 0.1524 meter.
- Parking Lots
Parking lots are given as Polygon objects. Every parking lot can be reached from at least one street. This is taken into account by assigning an attribute called 'entrance street name' to each parking lot object.
- Parking Lot Entrances
Parking lot entrances are not contained in any specific dataset within the 'Planimetric Database'. These objects are generated by overlapping 'Parking Lot' with 'Row Sidewalk' data. These areas are assigned multiple citygml_function attributes (e.g. 1, 2 and 6) indicating their affiliation to multiple semantic objects. The citygml_usage attribute can also be assigned multiple values, indicating that parking lot entrances can be used by cars as well as by pedestrians. Each parking lot entrance also contains information on the adjacent street name.
For all objects additional relevant information such as 'area in square meter' or 'volume in cubic meter' is calculated and suitable textures are assigned.
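The extrusion heights and the derived 'area' and 'volume' attributes can be illustrated with a short sketch. The heights in inches are the values stated above; the class labels used as dictionary keys, the function names and the representation of a footprint as a list of `(x, y)` vertices without holes are assumptions for illustration only.

```python
INCH_IN_M = 0.0254  # one inch is exactly 0.0254 meter

# Extrusion heights in inches as stated in the text.
EXTRUSION_INCH = {
    "Sidewalk": 6,
    "Curb": 6,
    "Plaza": 6,
    "Dividing Strip": 12,
}


def extrusion_height_m(object_class):
    """Extrusion height in meters; 0.0 for classes that are not extruded."""
    return EXTRUSION_INCH.get(object_class, 0) * INCH_IN_M


def polygon_area(ring):
    """Area of a simple polygon (list of (x, y) tuples) via shoelace."""
    doubled = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        doubled += x1 * y2 - x2 * y1
    return abs(doubled) / 2.0


def prism_volume(ring, object_class):
    """Volume of the footprint extruded by the class-specific height."""
    return polygon_area(ring) * extrusion_height_m(object_class)


if __name__ == "__main__":
    # Hypothetical 10 m x 2 m sidewalk piece.
    footprint = [(0.0, 0.0), (10.0, 0.0), (10.0, 2.0), (0.0, 2.0)]
    print(polygon_area(footprint), prism_volume(footprint, "Sidewalk"))
```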
Summary of the street space objects generated and their information sources
Please note that, with the exception of the LoD2 buildings, the data provided currently has a base height of 0 m, i.e. the heights of roads and lots have not yet been adapted to the terrain. Due to the large amount of data, the output is split thematically into separate CityGML files, one per street space object class. These comprise 11 thematic object classes with a total of 508'660 objects, each one assigned to the most fitting of the 3 possible subclasses 'Road', 'Square', or 'Track'.
| Input: data source (geometry) | Input: data source (semantics) | Output: CityGML class | Output: object class | Nr. of objects * | Data size ** |
|---|---|---|---|---|---|
| Curb | Curb, Roadbed | Road | Curb | 169'626 * | 2.02 (GB) |
| Row Sidewalk, Parking Lot | Row Sidewalk, Parking Lot, Roadbed | Road | Entrance | 24'185 | 5.5 (MB) |
| Intersection | Roadbed, Street Centerlines | Road | Intersection | 22'854 | 7.9 (MB) |
| Median Grass | Median Grass | Road | Grass | 258 | 0,3 () |
| Median Painted | Median Painted, Roadbed | Road | Road Marking | 7'826 | 3.9 (MB) |
| Median Raised | Median Raised, Roadbed | Road | Dividing Strips | 8'841 * | 74.8 (MB) |
| Roadbed | Roadbed, Street Centerlines | Road | Roadbed | 72'580 | 134.9 (MB) |
| Row Sidewalk | Row Sidewalk, Roadbed | Road | Sidewalk | 169'056 * | 1.3 (GB) |
| Parking Lot | Parking Lot, Roadbed | Square | Parking Lot | 19'951 | 32.2 (MB) |
| Plaza | Plaza | Square | Plaza | 1'360 * | 5.5 (MB) |
| Interior Sidewalk | Interior Sidewalk | Track | Interior Sidewalk | 6'205 | 15.8 (MB) |
| Total | | | | 508'660 | |
* Objects marked with ' * ' consist of two parts. E.g. each individual 'Sidewalk' object consists of a top surface and a side surface. This means that for those objects a dataset contains twice as many surfaces as objects.
** Compressed CityGML files (.zip)
Note that the data structure of the objects generated for the entire city is not ideal, as each individual object is saved as a CityGML top-level feature (e.g. Road, Square or Track). This is allowed by the current CityGML 2.0 standard. However, it would be more elegant to create individual 'TrafficAreas' and 'AuxiliaryTrafficAreas', each assigned to corresponding 'Road', 'Square' or 'Track' objects. This is tested and implemented for a smaller area in central Manhattan.
Approach to generate a semantically more detailed street space model (FME Workspaces 6 and 7)
The implementation used for the entire city produces individual 'Road', 'Square' and 'Track' features for each street space object. In order to enhance the data structure, 'TrafficAreas' and 'AuxiliaryTrafficAreas' connected to corresponding 'Road', 'Square' and 'Track' features via parent ids are generated for a smaller area in central Manhattan. This was only possible because of the high consistency of the different datasets (as shown before). The implementation method used to generate this semantically enhanced street space model is described in the following section. The main parts of this implementation remain unchanged compared to the method described before. However, each street space object is assigned to the CityGML classes 'TrafficArea' and 'AuxiliaryTrafficArea'. The difficulty then lies in creating suitable Top-Level-Features (Road, Square, Track) and linking the Sub-Level-Features (TrafficAreas, AuxiliaryTrafficAreas) correspondingly.
Each object class is assigned the most suitable feature role attribute (Sub-Level-Feature), which in turn is assigned to the most fitting Top-Level-Feature. This assignment is shown in the following table.
In order to implement this assignment, Top-Level-Features (Road, Square, Track) have to be created first. This is achieved by dissolving the generated TrafficAreas (1) and AuxiliaryTrafficAreas (2) into areas based on identical name attributes (4). E.g. Roadbeds, Sidewalks, Curbs, Road Markings and Dividing Strips all contain an attribute called 'gml_name' (3). All objects with an identical 'gml_name' value are dissolved (4). Objects also need to touch in order to be combined. This leads to the result displayed in (6). Intersections are handled separately by creating 'Road' objects for each Intersection area. The same procedure is executed for 'Squares' and 'Tracks'. For each of these Top-Level-Features a citygml_function attribute (e.g. 1000 = Road, 1100 = Intersection) as well as a unique id attribute (gml_id) is created. Then an overlay of the original 'TrafficAreas' and 'AuxiliaryTrafficAreas' with the newly created 'Road', 'Square' and 'Track' features is performed (5). In the course of this operation, the 'gml_id' attribute of each Top-Level-Feature is passed on to the overlying Sub-Level-Features and saved to a 'gml_parent_id' attribute. In the last step the geometry of the created Top-Level-Features is deleted, leaving only the semantic information in the citygml_function, gml_name and gml_id attributes for every Top-Level-Feature. Now every individual 'TrafficArea' or 'AuxiliaryTrafficArea' has a 'gml_parent_id' attribute, linking it to exactly one Top-Level-Feature, indicated by red lines in (7). With this connection between Top- and Sub-Level-Features, FME is able to create XML documents such as the one shown below. All of the mentioned transformations are executed with 2D objects. These semantically enriched objects then need to be textured and extruded into 3D objects. This is achieved using a separate FME Workspace.
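The grouping behind this step can be sketched without geometry. Instead of the dissolve and overlay operations used in FME, the sketch below swaps in a union-find structure over a precomputed list of touching object pairs: objects that touch and share the same 'gml_name' end up in the same group, and each group receives one Top-Level-Feature id. The input structures, the id scheme `Road_<n>` and the function name are hypothetical; the function code 1000 for 'Road' is taken from the text.

```python
import itertools


def link_to_parents(objects, touches, function_code=1000):
    """Group touching objects with equal 'gml_name' under one parent.

    objects: dict mapping a sub-level object id to {"gml_name": ...}.
    touches: iterable of (id_a, id_b) pairs of objects that touch.
    Returns (top_level_features, gml_parent_id_by_object).
    """
    parent = {oid: oid for oid in objects}  # union-find forest

    def find(oid):
        while parent[oid] != oid:
            parent[oid] = parent[parent[oid]]  # path halving
            oid = parent[oid]
        return oid

    for a, b in touches:
        if objects[a]["gml_name"] == objects[b]["gml_name"]:
            parent[find(a)] = find(b)

    counter = itertools.count(1)
    top_level = {}
    links = {}
    for oid in objects:
        root = find(oid)
        if root not in top_level:
            top_level[root] = {
                "gml_id": "Road_%d" % next(counter),  # hypothetical ids
                "gml_name": objects[oid]["gml_name"],
                "citygml_function": function_code,
            }
        links[oid] = top_level[root]["gml_id"]
    return list(top_level.values()), links


if __name__ == "__main__":
    objs = {
        "roadbed_1": {"gml_name": "Broadway"},
        "sidewalk_1": {"gml_name": "Broadway"},
        "roadbed_2": {"gml_name": "Broadway"},  # not touching the others
        "roadbed_3": {"gml_name": "7 Avenue"},
    }
    pairs = [("roadbed_1", "sidewalk_1"), ("roadbed_1", "roadbed_3")]
    print(link_to_parents(objs, pairs))
```

In the example, the isolated 'Broadway' roadbed gets its own parent because sharing a name is not sufficient; the objects also have to touch.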
The following screenshot shows an excerpt of the generated CityGML XML document. One specific 'Road' object is displayed. This Top-Level-Feature contains a unique 'gml:id' attribute and a 'tran:function' element (1000 = Road). This 'Road' object consists of 2 'TrafficAreas' and 1 'AuxiliaryTrafficArea' (and some more parts not shown in this screenshot). Each of these Sub-Level-Features contains semantic information such as external references or other attributes.
In theory this method could also be used for the entire city. However, as there are many different, sometimes very complex scenarios, this will most likely lead to 'faulty' (in the sense of: not accurate) results.
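The nesting described here can be reproduced with a small sketch that builds the bare skeleton of such a 'Road' feature using the CityGML 2.0 transportation namespace. It is not the FME writer output: geometry, external references and generic attributes are omitted, and all ids as well as the function name `build_road` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces of GML 3.1.1 and the CityGML 2.0 transportation module.
NS = {
    "gml": "http://www.opengis.net/gml",
    "tran": "http://www.opengis.net/citygml/transportation/2.0",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)


def qname(prefix, tag):
    """Build a namespace-qualified name in ElementTree notation."""
    return "{%s}%s" % (NS[prefix], tag)


def build_road(road_id, traffic_area_ids, auxiliary_area_ids, function="1000"):
    """Create a tran:Road element with nested sub-level features."""
    road = ET.Element(qname("tran", "Road"), {qname("gml", "id"): road_id})
    ET.SubElement(road, qname("tran", "function")).text = function
    for area_id in traffic_area_ids:
        prop = ET.SubElement(road, qname("tran", "trafficArea"))
        ET.SubElement(prop, qname("tran", "TrafficArea"),
                      {qname("gml", "id"): area_id})
    for area_id in auxiliary_area_ids:
        prop = ET.SubElement(road, qname("tran", "auxiliaryTrafficArea"))
        ET.SubElement(prop, qname("tran", "AuxiliaryTrafficArea"),
                      {qname("gml", "id"): area_id})
    return road


if __name__ == "__main__":
    # Hypothetical ids for one road with 2 TrafficAreas and 1 auxiliary area.
    road = build_road("Road_1", ["TA_1", "TA_2"], ["ATA_1"])
    print(ET.tostring(road, encoding="unicode"))
```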




