This document describes the specification for 3D Tiles, an open standard for streaming massive heterogeneous 3D geospatial datasets.
The following are keywords to be used by search engines and document catalogues.
terrain, geospatial, gis, point cloud, spatial data, vector data, photogrammetry, gltf, 3d models, 3d tiles, metadata, implicit tiling
Bringing techniques from graphics research, the movie industry, and the game industry to 3D geospatial, 3D Tiles defines a spatial data structure and a set of tile formats designed for 3D, and optimized for streaming and rendering.
IV. Security considerations
No security considerations have been made for this document.
V. Submitting Organizations
The following organizations submitted this Document to the Open Geospatial Consortium (OGC):
- Cesium GS Inc.
|Patrick Cozzi||Cesium GS, Inc.||Yes|
|Sean Lilley||Cesium GS, Inc.||Yes|
VII. Future Work
The 3D Tiles community anticipates that revisions to this Community Standard will be required to prescribe content appropriate to meet new use cases. These use cases may arise from either (or both) the external user and developer community or from OGC review and comments. Further, future revisions will be driven by any submitted change requests that document community use cases and requirements.
3D Tiles Specification
3D Tiles is designed for streaming and rendering massive 3D geospatial content such as Photogrammetry, 3D Buildings, BIM/CAD, Instanced Features, and Point Clouds. It defines a hierarchical data structure and a set of tile formats which deliver renderable content. 3D Tiles does not define explicit rules for visualization of the content; a client may visualize 3D Tiles data however it sees fit.
Annex A of this document describes the Objects and Properties required to implement 3D Tiles. Conformance is assessed against these elements, as partly expressed in the associated 3D Tiles JSON Schema.
All figures, examples, notes, and background information are informative.
3. Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
EPSG: 4978, 2020. https://epsg.org/crs_4978/WGS-84.html
EPSG: 4979, 2020. https://epsg.org/crs_4979/WGS-84.html
L. Masinter: IETF RFC 2397, The “data” URL scheme. (1998). https://www.rfc-editor.org/info/rfc2397.
F. Yergeau: IETF RFC 3629, UTF-8, a transformation format of ISO 10646. (2003). https://www.rfc-editor.org/info/rfc3629.
T. Berners-Lee, R. Fielding, L. Masinter: IETF RFC 3986, Uniform Resource Identifier (URI): Generic Syntax. (2005). https://www.rfc-editor.org/info/rfc3986.
Khronos Group: glTF 2.0, 2021. https://www.khronos.org/registry/glTF/specs/2.0/glTF-2.0.html
Roger Lott: OGC 18-010r7, Geographic information — Well-known text representation of coordinate reference systems. Open Geospatial Consortium (2019). https://docs.ogc.org/is/18-010r7/18-010r7.html.
W3C css-color-3, CSS Color Module Level 3. https://www.w3.org/TR/css-color-3/.
4. Terms and definitions
This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.
This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.
For the purposes of this document, the following additional terms and definitions apply.
4.1. Availability
Data specifying which tiles/contents/child subtrees exist within a single subtree of an implicit tileset
4.2. Bitstream
A boolean array stored as a sequence of bits rather than bytes.
4.3. Bounding Volume
A closed volume completely containing the union of a set of geometric objects. See Wikipedia: Bounding volume
4.4. Child subtree
A subtree reachable from an available tile in the bottommost row of a subtree of an implicit tileset
4.5. Entity
An entity, in the context of 3D Tiles metadata, is an instance of a metadata class, populated with property values conforming to a metadata class definition
4.6. Feature
In 3D Tiles, an individual component of a tile, such as a 3D model in a Batched 3D Model or a point in a Point Cloud, which contains position, appearance, and metadata properties.
4.7. Geometric Error
The difference, in meters, between a tile’s simplified representation and its source geometry. It is used to calculate the screen-space error introduced if a tile’s content is rendered and its children’s are not.
4.8. glTF
An API-neutral runtime asset delivery format for 3D assets.
4.9. Hierarchical Level of Detail (HLOD)
Decreasing the complexity of a 3D representation according to metrics such as object importance or distance from the tile to the observation point or camera position. Generally, higher levels of detail provide greater visual fidelity. See Wikipedia: Level of detail
4.10. Implicit tiling
A description of a tileset using recursive subdivision.
4.11. Implicit root tile
A tile that contains the implicitTiling property, and therefore denotes the root of an implicit tileset.
4.12. Metadata
In the context of 3D Tiles, this term refers to any association of 3D content with entities and properties, such that entities represent meaningful units within an overall structure.
4.13. Metadata class
The description of the structure of an entity that contains 3D Tiles metadata, consisting of multiple metadata properties
4.14. Metadata property
An element of a metadata class that defines a name and a type for the corresponding element of a metadata entity.
4.15. Octree
A 3D subdivision scheme that divides each bounding volume into 8 smaller bounding volumes along the midpoint of the x, y, and z axes.
4.16. Quadtree
A 2D subdivision scheme that divides each bounding volume into 4 smaller bounding volumes along the midpoint of the x and y axes.
4.17. Schema
A set of metadata classes and enums that define the structure and type of metadata
4.18. Screen-Space Error (SSE)
The difference, in pixels, between a tile’s simplified representation and its source geometry that is introduced if a tile’s content is rendered and its children’s are not.
4.19. Spatial Coherence
The union of all content of the child tiles is completely inside the parent tile’s bounding volume
4.20. Style
A set of expressions to be evaluated which modify how each feature in a tileset is displayed
4.21. Subtree
A fixed-size section of the tree of an implicit tileset that contains availability information.
4.22. Subtree file
A binary file storing information about a specific subtree of an implicit tileset
4.23. Subdivision scheme
A recursive pattern of dividing a parent tile into smaller child tiles occupying the same area. This is done by uniformly dividing the bounding volume of the parent tile with an implicit tileset.
4.24. Template URI
A URI pattern containing tile coordinates for directly addressing tiles in an implicit tileset.
4.25. Tile
In 3D Tiles, a subset of a tileset containing a reference to renderable content and the metadata, such as the content’s bounding volume, which is used by a client to determine if the content is rendered.
4.26. Tile Content
A binary blob containing information necessary to render a tile which is an instance of a specific tile format (Batched 3D Model, Instanced 3D Model, Point Clouds, Composite, or glTF).
4.27. Tile Format
The structure or layout of tile content data (Batched 3D Model, Instanced 3D Model, Point Clouds, Composite, or glTF).
4.28. Tileset
In 3D Tiles, a collection of 3D Tiles tile instances organized into a spatial data structure and additional metadata, such that the aggregation of these tiles represents some 3D content at various levels of detail.
5. Conventions
No conventions are specified in this document.
6. 3D Tiles Format Specification
In 3D Tiles, a tileset is a set of tiles organized in a spatial data structure, the tree. A tileset is described by at least one tileset JSON file containing tileset metadata and a tree of tile objects, each of which may reference renderable content.
glTF 2.0 is the primary tile format for 3D Tiles. glTF is an open specification designed for the efficient transmission and loading of 3D content. A glTF asset includes geometry and texture information for a single tile, and may be extended to include metadata, model instancing, and compression. glTF may be used for a wide variety of 3D content including:
- Heterogeneous 3D models. E.g. textured terrain and surfaces, 3D building exteriors and interiors, massive models
- 3D model instances. E.g. trees, windmills, bolts
- Massive point clouds
See glTF Tile Format for more details.
Tiles may also reference the legacy 3D Tiles 1.0 formats listed below. These formats were deprecated in 3D Tiles 1.1 and may be removed in a future version of 3D Tiles.
Table 1 — Legacy tile formats and common uses
|Format|Uses|
|---|---|
|Batched 3D Model (b3dm)|Heterogeneous 3D models|
|Instanced 3D Model (i3dm)|3D model instances|
|Point Cloud (pnts)|Massive number of points|
|Composite (cmpt)|Concatenate tiles of different formats into one tile|
A tile’s content is an individual instance of a tile format. A tile may have multiple contents.
The content references a set of features, such as 3D models representing buildings or trees, or points in a point cloud. Each feature has position and appearance properties and additional application-specific properties. A client may choose to select features at runtime and retrieve their properties for visualization or analysis.
Tiles are organized in a tree which incorporates the concept of Hierarchical Level of Detail (HLOD) for optimal rendering of spatial data. Each tile has a bounding volume, an object defining a spatial extent completely enclosing its content. The tree has spatial coherence; the content for child tiles is completely inside the parent’s bounding volume.
Figure 1 — A tree of tiles
A tileset may use a 2D spatial tiling scheme similar to raster and vector tiling schemes (like a Web Map Tile Service (WMTS) or XYZ scheme) that serve predefined tiles at several levels of detail (or zoom levels). However, since the content of a tileset is often non-uniform or may not easily be organized in only two dimensions, the tree can be any spatial data structure with spatial coherence, including k-d trees, quadtrees, octrees, and grids. Implicit tiling defines a concise representation of quadtrees and octrees.
Application-specific metadata may be provided at multiple granularities within a tileset. Metadata may be associated with high-level entities like tilesets, tiles, contents, or features, or with individual vertices and texels. Metadata conforms to a well-defined type system described by the 3D Metadata Specification, which may be extended with application- or domain-specific semantics.
Optionally, a 3D Tiles Style, or style, may be applied to a tileset. A style defines expressions to be evaluated which modify how each feature is displayed.
6.2. File Extensions and Media Types
3D Tiles uses the following file extensions and Media Types.
Tileset files should use the .json extension and the application/json Media Type.
Tile content files should use the file extensions and Media Type specific to their tile format specification.
Metadata schema files should use the .json extension and the application/json Media Type.
Tileset style files should use the .json extension and the application/json Media Type.
JSON subtree files should use the .json extension and the application/json Media Type.
Binary subtree files should use the .subtree extension and the application/octet-stream Media Type.
Files representing binary buffers should use the .bin extension and application/octet-stream Media Type.
Explicit file extensions are optional. Valid implementations may ignore them and identify a content’s format by the magic field in its header.
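As an illustration of identifying content by its magic field, the following Python sketch inspects the first four bytes of a payload. The function name `sniff_format` and the returned labels are hypothetical. The magic values are taken from the respective format specifications (the legacy tile formats, binary glTF, and binary subtree files) rather than from this section, and treating a payload that starts with `{` as JSON is an assumption of this sketch.

```python
# Hypothetical lookup table: 4-byte magic values and a human-readable label.
KNOWN_MAGICS = {
    b"b3dm": "Batched 3D Model",
    b"i3dm": "Instanced 3D Model",
    b"pnts": "Point Cloud",
    b"cmpt": "Composite",
    b"glTF": "Binary glTF",
    b"subt": "Binary subtree",
}


def sniff_format(payload: bytes) -> str:
    """Guess the format of a payload from its leading bytes."""
    magic = payload[:4]
    if magic in KNOWN_MAGICS:
        return KNOWN_MAGICS[magic]
    # Assumption: JSON-based files (tileset JSON, JSON subtrees, glTF JSON)
    # start with an opening brace, possibly after whitespace.
    if payload.lstrip()[:1] == b"{":
        return "JSON"
    return "unknown"
```

A client could call `sniff_format` on the first bytes of a response before choosing a parser, and fall back to the file extension or Media Type only when the result is `"unknown"`.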
6.3. JSON encoding
3D Tiles has the following restrictions on JSON formatting and encoding.
JSON shall use UTF-8 encoding without BOM.
All strings defined in this spec (property names, enums) use only the ASCII charset and shall be written as plain text, without JSON escaping.
Non-ASCII characters that appear as property values in JSON may be escaped.
Names (keys) within JSON objects shall be unique, i.e., duplicate keys aren’t allowed.
Some properties are defined as integers in the schema. Such values may be stored as decimals with a zero fractional part or by using exponent notation, as defined in RFC 8259, Section 6.
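The restrictions above can be checked when loading a file. The sketch below uses Python's standard `json` module; the helper names `load_tileset_json`, `reject_duplicate_keys`, and `as_integer` are hypothetical, and raising an exception on a violation is one possible policy rather than behavior required by this document.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key: {key!r}")
        obj[key] = value
    return obj


def as_integer(value):
    """Accept 3, 3.0 or 3e0 for an integer-typed property; reject 3.5."""
    if isinstance(value, bool):
        raise TypeError("boolean is not an integer")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise ValueError(f"not an integer value: {value!r}")


def load_tileset_json(raw: bytes):
    """Decode UTF-8 JSON, rejecting a byte order mark and duplicate keys."""
    text = raw.decode("utf-8")  # the plain 'utf-8' codec keeps a BOM as U+FEFF
    if text.startswith("\ufeff"):
        raise ValueError("BOM is not allowed")
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)
```

Since `json.loads` returns a `float` for `2e1` or `3.0`, a reader that expects integers has to normalize such values explicitly, which is the role of `as_integer` here.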
6.4. URIs
3D Tiles uses URIs to reference tile content. These URIs may point to relative external references (RFC3986) or be data URIs that embed resources in the JSON. Embedded resources use the “data” URL scheme (RFC2397).
When the URI is relative, its base is always relative to the referring tileset JSON file.
Client implementations are required to support relative external references and embedded resources. Optionally, client implementations may support other schemes (such as http://). All URIs shall be valid and resolvable.
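The two required URI forms can be handled with Python's standard library, as in the following sketch. The helper names `resolve_uri` and `decode_data_uri` and the `example.com` URLs are hypothetical; the data URI parsing is deliberately minimal and ignores the media type and any other parameters.

```python
import base64
from urllib.parse import unquote_to_bytes, urljoin


def resolve_uri(tileset_url: str, uri: str) -> str:
    """Resolve a content URI against the URL of the referring tileset JSON."""
    return urljoin(tileset_url, uri)


def decode_data_uri(uri: str) -> bytes:
    """Return the bytes embedded in a 'data:' URI (RFC 2397)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    # Without ';base64' the payload is percent-encoded text.
    return unquote_to_bytes(payload)
```

For example, resolving `tiles/0/0.glb` against `https://example.com/city/tileset.json` yields `https://example.com/city/tiles/0/0.glb`.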
6.5. Units
The unit for all linear distances is meters.
All angles are in radians.
6.6. Coordinate reference system (CRS)
3D Tiles uses a right-handed Cartesian coordinate system; that is, the cross product of x and y yields z. 3D Tiles defines the z axis as up for local Cartesian coordinate systems. A tileset’s global coordinate system will often be in a WGS 84 Earth-centered, Earth-fixed (ECEF) reference frame (EPSG 4978), but it doesn’t have to be, e.g., a power plant may be defined fully in its local coordinate system for use with a modeling tool without a geospatial context.
The CRS of a tileset may be defined explicitly, as part of the tileset metadata. The metadata for the tileset can contain a property that has the TILESET_CRS_GEOCENTRIC semantic, which is a string that represents the EPSG Geodetic Parameter Dataset identifier.
An additional tile transform may be applied to transform a tile’s local coordinate system to the parent tile’s coordinate system.
The region bounding volume specifies bounds using a geographic coordinate system (latitude, longitude, height), specifically, EPSG 4979. The reference ellipsoid is assumed to be the same as the reference ellipsoid of the tileset.
Tiles consist of metadata used to determine if a tile is rendered, a reference to the renderable content, and an array of any children tiles.
6.7.1.1. Tile Content
A tile can be associated with renderable content. A tile can either have a single tile.content object, or multiple content objects, stored in a tile.contents array. The latter allows for flexible tileset structures: for example, a single tile may contain multiple representations of the same geometry data.
The content.uri of each content object refers to the tile’s content in one of the tile formats defined in the Tile format specifications, or to another tileset JSON to create a tileset of tilesets (see External tilesets).
The content.group property assigns the content to a group. Contents of different tiles or the contents of a single tile can be assigned to groups in order to categorize the content. Additionally, each group can be associated with Metadata.
Each content can be associated with a bounding volume. While tile.boundingVolume is a bounding volume that encloses all contents of the tile, each individual content.boundingVolume is a tightly fit bounding volume enclosing just the respective content. More details about the role of tile and content bounding volumes are given in the bounding volume section.
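A client may find it convenient to treat the single-content and multiple-content cases uniformly. The sketch below works on tiles that have already been parsed into Python dictionaries. The helper names, the example URIs, and the group values are hypothetical, and rejecting a tile that carries both `content` and `contents` is this sketch's reading of "either ... or" above.

```python
from collections import defaultdict


def tile_contents(tile: dict) -> list:
    """Return the tile's content objects as a list (possibly empty)."""
    if "content" in tile and "contents" in tile:
        raise ValueError("tile has both 'content' and 'contents'")
    if "content" in tile:
        return [tile["content"]]
    return list(tile.get("contents", []))


def contents_by_group(tiles) -> dict:
    """Collect content URIs per content.group value across several tiles."""
    groups = defaultdict(list)
    for tile in tiles:
        for content in tile_contents(tile):
            # Contents without a group are collected under the key None.
            groups[content.get("group")].append(content["uri"])
    return dict(groups)


# Hypothetical tiles: one with a single content, one with two contents.
example_tiles = [
    {"content": {"uri": "buildings.glb", "group": 0}},
    {"contents": [{"uri": "trees.glb", "group": 1},
                  {"uri": "roads.glb", "group": 0}]},
]
```

Calling `contents_by_group(example_tiles)` groups the three URIs by their `group` value, which is one way to categorize content spread over different tiles.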
6.7.1.2. Geometric error
Tiles are structured into a tree incorporating Hierarchical Level of Detail (HLOD). At runtime, a client implementation needs to determine whether a tile is sufficiently detailed for rendering and whether its content should be successively refined by children tiles of higher resolution. To do so, an implementation considers a maximum allowed Screen-Space Error (SSE), the error measured in pixels.
A tile’s geometric error defines the selection metric for that tile. Its value is a nonnegative number that specifies the error, in meters, of the tile’s simplified representation of its source geometry. Generally, the root tile will have the largest geometric error, and each successive level of children will have a smaller geometric error than its parent, with leaf tiles having a geometric error of or close to 0.
In a client implementation, geometric error is used with other screen space metrics (e.g., distance from the tile to the camera, screen size, and resolution) to calculate the SSE introduced if this tile is rendered and its children are not. If the introduced SSE exceeds the maximum allowed, then the tile is refined and its children are considered for rendering.
The geometric error is formulated based on a metric like point density, mesh or texture decimation, or another factor specific to that tileset. In general, a higher geometric error means a tile will be refined more aggressively, and children tiles will be loaded and rendered sooner.
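This document does not prescribe how the SSE is computed. As an illustration only, the sketch below uses one formula commonly applied with a perspective camera, which projects the geometric error onto the screen at the tile's distance. The function names, the parameters, and the choice of formula are assumptions of this sketch.

```python
import math


def screen_space_error(geometric_error: float, distance: float,
                       viewport_height_px: float, fov_y: float) -> float:
    """Approximate SSE in pixels for a perspective camera.

    geometric_error: tile geometric error in meters
    distance: distance from the camera to the tile in meters
    viewport_height_px: height of the viewport in pixels
    fov_y: vertical field of view in radians
    """
    if geometric_error == 0.0:
        return 0.0  # nothing left to refine
    if distance <= 0.0:
        return math.inf  # camera is inside or touching the tile
    return (geometric_error * viewport_height_px) / (
        2.0 * distance * math.tan(fov_y / 2.0))


def should_refine(geometric_error: float, distance: float,
                  viewport_height_px: float, fov_y: float,
                  max_sse: float) -> bool:
    """True if the SSE exceeds the maximum allowed, so children are considered."""
    sse = screen_space_error(geometric_error, distance, viewport_height_px, fov_y)
    return sse > max_sse
```

With this formula, halving the distance to a tile doubles its SSE, so tiles near the camera are refined before distant tiles with the same geometric error.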
6.7.1.3. Refinement
Refinement determines the process by which a lower resolution parent tile renders when its higher resolution children are selected to be rendered. Permitted refinement types are replacement ("REPLACE") and additive ("ADD"). If the tile has replacement refinement, the children tiles are rendered in place of the parent, that is, the parent tile is no longer rendered. If the tile has additive refinement, the children are rendered in addition to the parent tile.
A tileset can use replacement refinement exclusively, additive refinement exclusively, or any combination of additive and replacement refinement.
A refinement type is required for the root tile of a tileset; it is optional for all other tiles. When omitted, a tile inherits the refinement type of its parent.
If a tile uses replacement refinement, when refined it renders its children in place of itself.
Table 2 — A tile and a refined tile using replacement refinement
If a tile uses additive refinement, when refined it renders itself and its children simultaneously.
Table 3 — A tile and a refined tile using additive refinement
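The inheritance rule and the two refinement types can be sketched as a small recursive traversal over tiles parsed into Python dictionaries. The property names `refine` and `children` follow the tileset JSON schema rather than this passage, the function names and the `needs_refinement` callback are hypothetical, and the traversal ignores culling and content loading, which a real client would also handle.

```python
def effective_refine(tile: dict, parent_refine=None) -> str:
    """Return the tile's own refinement type, or the one inherited from its parent."""
    refine = tile.get("refine", parent_refine)
    if refine not in ("ADD", "REPLACE"):
        raise ValueError("refinement type missing or invalid")
    return refine


def tiles_to_render(tile: dict, needs_refinement, parent_refine=None) -> list:
    """Select tiles to render, starting at `tile`.

    needs_refinement is a caller-supplied predicate, for example an SSE test.
    """
    refine = effective_refine(tile, parent_refine)
    children = tile.get("children", [])
    if not children or not needs_refinement(tile):
        return [tile]
    # ADD keeps the parent alongside its children; REPLACE drops the parent.
    selected = [tile] if refine == "ADD" else []
    for child in children:
        selected.extend(tiles_to_render(child, needs_refinement, refine))
    return selected
```

Because the root tile has no parent to inherit from, `effective_refine` raises an error when the root omits its refinement type.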
6.7.1.4. Bounding volumes
A bounding volume defines the spatial extent enclosing a tile or a tile’s content. To support tight fitting volumes for a variety of datasets such as regularly divided terrain, cities not aligned with a line of latitude or longitude, or arbitrary point clouds, the bounding volume types include an oriented bounding box, a bounding sphere, and a geographic region defined by minimum and maximum latitudes, longitudes, and heights.
Table 4 — Different bounding volume types for a tile
|Bounding box||Bounding sphere||Bounding region|
The boundingVolume.region property is an array of six numbers that define the bounding geographic region with latitude, longitude, and height coordinates with the order [west, south, east, north, minimum height, maximum height]. Latitudes and longitudes are in the WGS 84 datum as defined in EPSG 4979 and are in radians. Heights are in meters above (or below) the WGS 84 ellipsoid.
NOTE The latitude and longitude values are given in radians, deviating from the EPSG 4979 definition, where they are given in degrees. The choice of using radians is due to internal computations usually taking place in radians — for example, when converting cartographic to Cartesian coordinates.
Figure 2 — A bounding region
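The conversion from cartographic to Cartesian coordinates mentioned in the note can be sketched with the standard closed-form formula for the WGS 84 ellipsoid. The function names are hypothetical. The eight corner points returned by `region_corners_ecef` are for illustration only and do not by themselves bound the region, because the region's surfaces are curved.

```python
import math

# WGS 84 ellipsoid parameters.
WGS84_A = 6378137.0                    # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def cartographic_to_ecef(lon: float, lat: float, height: float):
    """Convert longitude/latitude (radians) and height (meters) to ECEF meters."""
    sin_lat = math.sin(lat)
    cos_lat = math.cos(lat)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * sin_lat * sin_lat)
    x = (n + height) * cos_lat * math.cos(lon)
    y = (n + height) * cos_lat * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height) * sin_lat
    return (x, y, z)


def region_corners_ecef(region):
    """Return the 8 corner points of a boundingVolume.region array in ECEF."""
    west, south, east, north, min_height, max_height = region
    return [cartographic_to_ecef(lon, lat, h)
            for h in (min_height, max_height)
            for lat in (south, north)
            for lon in (west, east)]
```

Since the region already stores angles in radians, its values can be passed to the trigonometric functions directly, without a degree conversion.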
The boundingVolume.box property is an array of 12 numbers that define an oriented bounding box in a right-handed 3-axis (x, y, z) Cartesian coordinate system where the z-axis is up. The first three elements define the x, y, and z values for the center of the box. The next three elements (with indices 3, 4, and 5) define the x-axis direction and half-length. The next three elements (indices 6, 7, and 8) define the y-axis direction and half-length. The last three elements (indices 9, 10, and 11) define the z-axis direction and half-length.
NOTE The representation that is used for an oriented bounding box in 3D Tiles is versatile and compact: In addition to the center position, the array contains the elements of a 3×3 matrix. The columns of this matrix are the images of unit vectors under a transformation, and therefore uniquely and compactly define the scaling and orientation of the bounding box.
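To make the 12-number layout concrete, the sketch below derives the eight corners of a box and tests whether a point lies inside it. The function names are hypothetical. The containment test assumes that the three half-axis vectors are mutually orthogonal and have non-zero length, which this sketch does not verify.

```python
from itertools import product


def box_corners(box):
    """Return the 8 corners of a boundingVolume.box array as (x, y, z) tuples."""
    if len(box) != 12:
        raise ValueError("box must have 12 numbers")
    center = box[0:3]
    half_axes = (box[3:6], box[6:9], box[9:12])
    corners = []
    # Each corner is the center plus or minus each of the three half-axes.
    for signs in product((-1.0, 1.0), repeat=3):
        corners.append(tuple(
            center[i] + sum(s * axis[i] for s, axis in zip(signs, half_axes))
            for i in range(3)))
    return corners


def box_contains(box, point) -> bool:
    """Test whether a point is inside the box (orthogonal half-axes assumed)."""
    center = box[0:3]
    offset = [point[i] - center[i] for i in range(3)]
    for axis in (box[3:6], box[6:9], box[9:12]):
        length_sq = sum(a * a for a in axis)
        projection = sum(o * a for o, a in zip(offset, axis))
        # Inside along this axis if |offset . axis| <= |axis|^2.
        if abs(projection) > length_sq:
            return False
    return True
```

For an axis-aligned box such as `[0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 1]`, the half-axes are `(2, 0, 0)`, `(0, 3, 0)`, and `(0, 0, 1)`, so the box spans 4 meters in x, 6 meters in y, and 2 meters in z.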