Work Item Draft

OGC Community Standard

3D Tiles Specification
Patrick Cozzi (Editor), Sean Lilley (Editor)
Version: 1.1.0

Document number: 22-025r2
Document type: OGC Community Standard
Document subtype:
Document stage: Work Item Draft
Document language: English

License Agreement

Permission is hereby granted by the Open Geospatial Consortium, (“Licensor”), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications. This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.



I.  Abstract

This document describes the specification for 3D Tiles, an open standard for streaming massive heterogeneous 3D geospatial datasets.

II.  Keywords

The following are keywords to be used by search engines and document catalogues.

terrain, geospatial, gis, point cloud, spatial data, vector data, photogrammetry, gltf, 3d models, 3d tiles, metadata, implicit tiling


III.  Preface

Bringing techniques from graphics research, the movie industry, and the game industry to 3D geospatial, 3D Tiles defines a spatial data structure and a set of tile formats designed for 3D, and optimized for streaming and rendering.

IV.  Security considerations

No security considerations have been made for this document.

V.  Submitting Organizations

The following organizations submitted this Document to the Open Geospatial Consortium (OGC):

VI.  Submitters

Name | Affiliation | OGC member
Patrick Cozzi | Cesium GS, Inc. | Yes
Sean Lilley | Cesium GS, Inc. | Yes

VII.  Future Work

The 3D Tiles community anticipates that revisions to this Community Standard will be required to prescribe content appropriate to meet new use cases. These use cases may arise from the external user and developer community, from OGC review and comments, or from both. Further, future revisions will be driven by any submitted change requests that document community use cases and requirements.

3D Tiles Specification

1.  Scope

3D Tiles is designed for streaming and rendering massive 3D geospatial content such as Photogrammetry, 3D Buildings, BIM/CAD, Instanced Features, and Point Clouds. It defines a hierarchical data structure and a set of tile formats which deliver renderable content. 3D Tiles does not define explicit rules for visualization of the content; a client may visualize 3D Tiles data however it sees fit.

2.  Conformance

Annex A of this document describes the Objects and Properties required to implement 3D Tiles. Conformance is relative to these elements, as partly expressed via the associated 3D Tiles JSON Schema.

All figures, examples, notes, and background information are informative.

3.  Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

EPSG: 4978, 2020. https://epsg.org/crs_4978/WGS-84.html

EPSG: 4979, 2020. https://epsg.org/crs_4979/WGS-84.html

L. Masinter: IETF RFC 2397, The “data” URL scheme. (1998). https://www.rfc-editor.org/info/rfc2397.

F. Yergeau: IETF RFC 3629, UTF-8, a transformation format of ISO 10646. (2003). https://www.rfc-editor.org/info/rfc3629.

T. Berners-Lee, R. Fielding, L. Masinter: IETF RFC 3986, Uniform Resource Identifier (URI): Generic Syntax. (2005). https://www.rfc-editor.org/info/rfc3986.

T. Bray (ed.): IETF RFC 8259, The JavaScript Object Notation (JSON) Data Interchange Format. (2017). https://www.rfc-editor.org/info/rfc8259.

Khronos Group: glTF 2.0, 2021. https://www.khronos.org/registry/glTF/specs/2.0/glTF-2.0.html

Roger Lott: OGC 18-010r7, Geographic information — Well-known text representation of coordinate reference systems. Open Geospatial Consortium (2019). https://docs.ogc.org/is/18-010r7/18-010r7.html.

W3C css-color-3, CSS Color Module Level 3. https://www.w3.org/TR/css-color-3/.

4.  Terms and definitions

This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.

This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.

For the purposes of this document, the following additional terms and definitions apply.

4.1. Availability

Data specifying which tiles/contents/child subtrees exist within a single subtree of an implicit tileset.

4.2. Bitstream

A boolean array stored as a sequence of bits rather than bytes.

4.3. Bounding Volume

A closed volume completely containing the union of a set of geometric objects. See Wikipedia: Bounding volume

4.4. Child subtree

A subtree reachable from an available tile in the bottommost row of a subtree of an implicit tileset.

4.5. Entity

An entity, in the context of 3D Tiles metadata, is an instance of a metadata class, populated with property values conforming to a metadata class definition.

4.6. Feature

In 3D Tiles, an individual component of a tile, such as a 3D model in a Batched 3D Model or a point in a Point Cloud which contains position, appearance, and metadata properties.

4.7. Geometric Error

The difference, in meters, between a tile’s simplified representation and its source geometry. It is used to calculate the screen-space error introduced if a tile’s content is rendered and its children’s content is not.

4.8. glTF

An API-neutral runtime asset delivery format for 3D assets.

4.9. Hierarchical Level of Detail (HLOD)

Decreasing the complexity of a 3D representation according to metrics such as object importance or distance from the tile to the observation point or camera position. Generally, higher levels of detail provide greater visual fidelity. See Wikipedia: Level of detail

4.10. Implicit tiling

A description of a tileset using recursive subdivision.

4.11. Implicit root tile

A tile that contains the implicitTiling property, and therefore denotes the root of an implicit tileset.

4.12. Metadata

In the context of 3D Tiles, this term refers to any association of 3D content with entities and properties, such that entities represent meaningful units within an overall structure.

4.13. Metadata class

The description of the structure of an entity that contains 3D Tiles metadata, consisting of multiple metadata properties.

4.14. Metadata property

An element of a metadata class that defines a name and a type for the corresponding element of a metadata entity.

4.15. Octree

A 3D subdivision scheme that divides each bounding volume into 8 smaller bounding volumes along the midpoint of the x, y, and z axes.

4.16. Quadtree

A 2D subdivision scheme that divides each bounding volume into 4 smaller bounding volumes along the midpoint of the x and y axes.

4.17. Schema

A set of metadata classes and enums that define the structure and type of metadata.

4.18. Screen-Space Error (SSE)

The difference, in pixels, between a tile’s simplified representation and its source geometry, introduced if a tile’s content is rendered and its children’s content is not.

4.19. Spatial Coherence

The union of all content of the child tiles is completely inside the parent tile’s bounding volume.

4.20. Style

A set of expressions to be evaluated which modify how each feature in a tileset is displayed.

4.21. Subtree

A fixed-size section of the tree of an implicit tileset that contains availability information.

4.22. Subtree file

A binary file storing information about a specific subtree of an implicit tileset.

4.23. Subdivision scheme

A recursive pattern of dividing a parent tile into smaller child tiles occupying the same area. This is done by uniformly dividing the bounding volume of the parent tile with an implicit tileset.

4.24. Template URI

A URI pattern containing tile coordinates for directly addressing tiles in an implicit tileset.

4.25. Tile

In 3D Tiles, a subset of a tileset containing a reference to renderable content and the metadata, such as the content’s bounding volume, which is used by a client to determine if the content is rendered.

4.26. Tile Content

A binary blob containing information necessary to render a tile which is an instance of a specific tile format (Batched 3D Model, Instanced 3D Model, Point Clouds, Composite, or glTF).

4.27. Tile Format

The structure or layout of tile content data (Batched 3D Model, Instanced 3D Model, Point Clouds, Composite, or glTF).

4.28. Tileset

In 3D Tiles, a collection of 3D Tiles tile instances organized into a spatial data structure and additional metadata, such that the aggregation of these tiles represents some 3D content at various levels of detail.

5.  Conventions

No conventions are specified in this document.

6.  3D Tiles Format Specification

6.1.  Introduction

In 3D Tiles, a tileset is a set of tiles organized in a spatial data structure, the tree. A tileset is described by at least one tileset JSON file containing tileset metadata and a tree of tile objects, each of which may reference renderable content.

glTF 2.0 is the primary tile format for 3D Tiles. glTF is an open specification designed for the efficient transmission and loading of 3D content. A glTF asset includes geometry and texture information for a single tile, and may be extended to include metadata, model instancing, and compression. glTF may be used for a wide variety of 3D content including:

  • Heterogeneous 3D models. E.g. textured terrain and surfaces, 3D building exteriors and interiors, massive models

  • 3D model instances. E.g. trees, windmills, bolts

  • Massive point clouds

See glTF Tile Format for more details.

Tiles may also reference the legacy 3D Tiles 1.0 formats listed below. These formats were deprecated in 3D Tiles 1.1 and may be removed in a future version of 3D Tiles.

Table 1 — Legacy tile formats and common uses

Legacy Format | Uses
Batched 3D Model (b3dm) | Heterogeneous 3D models
Instanced 3D Model (i3dm) | 3D model instances
Point Cloud (pnts) | Massive number of points
Composite (cmpt) | Concatenate tiles of different formats into one tile

A tile’s content is an individual instance of a tile format. A tile may have multiple contents.

The content references a set of features, such as 3D models representing buildings or trees, or points in a point cloud. Each feature has position and appearance properties and additional application-specific properties. A client may choose to select features at runtime and retrieve their properties for visualization or analysis.

Tiles are organized in a tree which incorporates the concept of Hierarchical Level of Detail (HLOD) for optimal rendering of spatial data. Each tile has a bounding volume, an object defining a spatial extent completely enclosing its content. The tree has spatial coherence; the content of child tiles is completely inside the parent’s bounding volume.

Figure 1 — A tree of tiles

A tileset may use a 2D spatial tiling scheme similar to raster and vector tiling schemes (like a Web Map Tile Service (WMTS) or XYZ scheme) that serve predefined tiles at several levels of detail (or zoom levels). However, since the content of a tileset is often non-uniform or may not easily be organized in only two dimensions, the tree can be any spatial data structure with spatial coherence, including k-d trees, quadtrees, octrees, and grids. Implicit tiling defines a concise representation of quadtrees and octrees.

Application-specific metadata may be provided at multiple granularities within a tileset. Metadata may be associated with high-level entities like tilesets, tiles, contents, or features, or with individual vertices and texels. Metadata conforms to a well-defined type system described by the 3D Metadata Specification, which may be extended with application- or domain-specific semantics.

Optionally a 3D Tiles Style, or style, may be applied to a tileset. A style defines expressions to be evaluated which modify how each feature is displayed.

6.2.  File Extensions and Media Types

3D Tiles uses the following file extensions and Media Types.

Explicit file extensions are optional. Valid implementations may ignore them and identify a content’s format by the magic field in its header.

6.3.  JSON encoding

3D Tiles has the following restrictions on JSON formatting and encoding.

  1. JSON shall use UTF-8 encoding without BOM.

  2. All strings defined in this spec (property names, enums) use only the ASCII charset and shall be written as plain text, without JSON escaping.

  3. Non-ASCII characters that appear as property values in JSON may be escaped.

  4. Names (keys) within JSON objects shall be unique, i.e., duplicate keys aren’t allowed.

  5. Some properties are defined as integers in the schema. Such values may be stored as decimals with a zero fractional part or by using exponent notation, as defined in RFC 8259, Section 6.
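
This example is informative. As a minimal sketch of rule 5, assuming a JavaScript client, the following check accepts a value that the schema defines as an integer however it was written in the JSON text. The helper name isSchemaInteger is hypothetical and not part of 3D Tiles.

```javascript
// Hypothetical helper: accepts any JSON number whose value is integral,
// whether it was written as 2, 2.0, or 2e0 in the JSON text.
function isSchemaInteger(value) {
  return typeof value === "number" && Number.isInteger(value);
}

// Sample JSON text made up for illustration.
const parsed = JSON.parse('{"a": 2, "b": 2.0, "c": 2e0, "d": 2.5}');

const results = {
  a: isSchemaInteger(parsed.a),
  b: isSchemaInteger(parsed.b),
  c: isSchemaInteger(parsed.c),
  d: isSchemaInteger(parsed.d),
};
```

In this sketch a, b, and c all parse to the same integer value, while d is rejected because it has a nonzero fractional part.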

6.4.  URIs

3D Tiles uses URIs to reference tile content. These URIs may point to relative external references (RFC3986) or be data URIs that embed resources in the JSON. Embedded resources use the “data” URL scheme (RFC2397).

When the URI is relative, its base is always relative to the referring tileset JSON file.

Client implementations are required to support relative external references and embedded resources. Optionally, client implementations may support other schemes (such as http://). All URIs shall be valid and resolvable.
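
This example is informative. The sketch below shows one way a client might resolve a content URI against the referring tileset JSON file, using the standard URL class available in browsers and Node.js. The function name resolveContentUri and the example.com location are assumptions made for illustration only.

```javascript
// Resolve a content URI against the URL of the tileset JSON that refers to it.
// Data URIs are already absolute, so the URL parser ignores the base for them.
function resolveContentUri(contentUri, tilesetUrl) {
  return new URL(contentUri, tilesetUrl).href;
}

// Hypothetical tileset location, used only for illustration.
const tilesetUrl = "https://example.com/data/city/tileset.json";

const resolved = resolveContentUri("tiles/0/0.glb", tilesetUrl);
const embedded = resolveContentUri(
  "data:application/octet-stream;base64,AAAA",
  tilesetUrl
);
```

Here resolved points next to the tileset JSON file, while embedded is returned unchanged because it carries its payload inline.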

6.5.  Units

The unit for all linear distances is meters.

All angles are in radians.

6.6.  Coordinate reference system (CRS)

3D Tiles uses a right-handed Cartesian coordinate system; that is, the cross product of x and y yields z. 3D Tiles defines the z axis as up for local Cartesian coordinate systems. A tileset’s global coordinate system will often be in a WGS 84 Earth-centered, Earth-fixed (ECEF) reference frame (EPSG 4978), but it doesn’t have to be. For example, a power plant may be defined fully in its local coordinate system for use with a modeling tool without a geospatial context.

The CRS of a tileset may be defined explicitly, as part of the tileset metadata. The metadata for the tileset can contain a property that has the TILESET_CRS_GEOCENTRIC semantic, which is a string that represents the EPSG Geodetic Parameter Dataset identifier.

An additional tile transform may be applied to transform a tile’s local coordinate system to the parent tile’s coordinate system.

The region bounding volume specifies bounds using a geographic coordinate system (latitude, longitude, height), specifically, EPSG 4979. The reference ellipsoid is assumed to be the same as the reference ellipsoid of the tileset.

6.7.  Concepts

6.7.1.  Tiles

Tiles consist of metadata used to determine if a tile is rendered, a reference to the renderable content, and an array of any children tiles.

6.7.1.1.  Tile Content

A tile can be associated with renderable content. A tile can either have a single tile.content object, or multiple content objects, stored in a tile.contents array. The latter allows for flexible tileset structures: for example, a single tile may contain multiple representations of the same geometry data.
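
This example is informative. A client that wants to treat both forms uniformly could normalize them to an array, as in the following sketch. The helper name getTileContents and the sample tiles are hypothetical.

```javascript
// Hypothetical helper: return a tile's contents as an array, whether the tile
// uses the single `content` object or the `contents` array.
function getTileContents(tile) {
  if (Array.isArray(tile.contents)) {
    return tile.contents;
  }
  return tile.content !== undefined ? [tile.content] : [];
}

// Sample tiles made up for illustration.
const single = { content: { uri: "building.glb" } };
const multiple = { contents: [{ uri: "a.glb" }, { uri: "b.glb" }] };
const empty = {};

const counts = [single, multiple, empty].map((t) => getTileContents(t).length);
```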

The content.uri of each content object refers to the tile’s content in one of the tile formats that are defined in the Tile format specifications, or to another tileset JSON to create a tileset of tilesets (see External tilesets).

The content.group property assigns the content to a group. Contents of different tiles or the contents of a single tile can be assigned to groups in order to categorize the content. Additionally, each group can be associated with Metadata.

Each content can be associated with a bounding volume. While tile.boundingVolume is a bounding volume that encloses all contents of the tile, each individual content.boundingVolume is a tightly fit bounding volume enclosing just the respective content. More details about the role of tile- and content bounding volumes are given in the bounding volume section.

6.7.1.2.  Geometric error

Tiles are structured into a tree incorporating Hierarchical Level of Detail (HLOD) so that at runtime a client implementation will need to determine if a tile is sufficiently detailed for rendering and if the content of tiles should be successively refined by children tiles of higher resolution. An implementation will consider a maximum allowed Screen-Space Error (SSE), the error measured in pixels.

A tile’s geometric error defines the selection metric for that tile. Its value is a nonnegative number that specifies the error, in meters, of the tile’s simplified representation of its source geometry. Generally, the root tile will have the largest geometric error, and each successive level of children will have a smaller geometric error than its parent, with leaf tiles having a geometric error of or close to 0.

In a client implementation, geometric error is used with other screen space metrics (e.g., distance from the tile to the camera, screen size, and resolution) to calculate the SSE introduced if this tile is rendered and its children are not. If the introduced SSE exceeds the maximum allowed, then the tile is refined and its children are considered for rendering.

The geometric error is formulated based on a metric like point density, mesh or texture decimation, or another factor specific to that tileset. In general, a higher geometric error means a tile will be refined more aggressively, and children tiles will be loaded and rendered sooner.
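
This example is informative. 3D Tiles does not prescribe how the SSE is computed. The sketch below uses one formulation that is common for perspective projections; the function names, the parameter names, and the formula itself are assumptions rather than requirements of this document.

```javascript
// Approximate screen-space error in pixels for a perspective camera.
//   geometricError: tile.geometricError in meters
//   distance:       distance from the camera to the tile in meters (> 0)
//   viewportHeight: height of the render target in pixels
//   fovY:           vertical field of view in radians
function screenSpaceError(geometricError, distance, viewportHeight, fovY) {
  const sseDenominator = 2.0 * Math.tan(0.5 * fovY);
  return (geometricError * viewportHeight) / (distance * sseDenominator);
}

// A tile is refined when the error it would introduce exceeds the maximum.
function shouldRefine(tile, distance, viewportHeight, fovY, maximumSse) {
  const sse = screenSpaceError(tile.geometricError, distance, viewportHeight, fovY);
  return sse > maximumSse;
}

// Sample values made up for illustration.
const sse = screenSpaceError(32, 1000, 1080, Math.PI / 2);
const refineParent = shouldRefine({ geometricError: 32 }, 1000, 1080, Math.PI / 2, 16);
const refineLeaf = shouldRefine({ geometricError: 0 }, 10, 1080, Math.PI / 2, 16);
```

Under this formulation a leaf tile with a geometric error of 0 never triggers refinement, which matches the expectation that leaves are rendered as-is.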

6.7.1.3.  Refinement

Refinement determines the process by which a lower resolution parent tile renders when its higher resolution children are selected to be rendered. Permitted refinement types are replacement (“REPLACE”) and additive (“ADD”). If the tile has replacement refinement, the children tiles are rendered in place of the parent, that is, the parent tile is no longer rendered. If the tile has additive refinement, the children are rendered in addition to the parent tile.

A tileset can use replacement refinement exclusively, additive refinement exclusively, or any combination of additive and replacement refinement.

A refinement type is required for the root tile of a tileset; it is optional for all other tiles. When omitted, a tile inherits the refinement type of its parent.
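
This example is informative. The inheritance rule can be sketched as a top-down traversal that records the effective refinement type of every tile. The function name resolveRefinement and the sample tree are hypothetical.

```javascript
// Walk the tile tree top-down and record each tile's effective refinement,
// inheriting the parent's value when `refine` is omitted.
function resolveRefinement(tile, inherited, out = new Map()) {
  const refine = tile.refine !== undefined ? tile.refine : inherited;
  out.set(tile, refine);
  for (const child of tile.children || []) {
    resolveRefinement(child, refine, out);
  }
  return out;
}

// Sample tree made up for illustration: the root defines "REPLACE",
// one child overrides it with "ADD", and the others inherit.
const leafA = {};
const leafB = {};
const childAdd = { refine: "ADD", children: [leafB] };
const root = { refine: "REPLACE", children: [leafA, childAdd] };

const effective = resolveRefinement(root, undefined);
```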

6.7.1.3.1.  Replacement

If a tile uses replacement refinement, when refined it renders its children in place of itself.

Table 2 — A tile and a refined tile using replacement refinement

Parent Tile | Refined

6.7.1.3.2.  Additive

If a tile uses additive refinement, when refined it renders itself and its children simultaneously.

Table 3 — A tile and a refined tile using additive refinement

Parent Tile | Refined

6.7.1.4.  Bounding volumes

A bounding volume defines the spatial extent enclosing a tile or a tile’s content. To support tight fitting volumes for a variety of datasets such as regularly divided terrain, cities not aligned with a line of latitude or longitude, or arbitrary point clouds, the bounding volume types include an oriented bounding box, a bounding sphere, and a geographic region defined by minimum and maximum latitudes, longitudes, and heights.

Table 4 — Different bounding volume types for a tile

Bounding box | Bounding sphere | Bounding region

6.7.1.4.1.  Region

The boundingVolume.region property is an array of six numbers that define the bounding geographic region with latitude, longitude, and height coordinates with the order [west, south, east, north, minimum height, maximum height]. Latitudes and longitudes are in the WGS 84 datum as defined in EPSG 4979 and are in radians. Heights are in meters above (or below) the WGS 84 ellipsoid.

NOTE  The latitude and longitude values are given in radians, deviating from the EPSG 4979 definition, where they are given in degrees. The choice of using radians is due to internal computations usually taking place in radians — for example, when converting cartographic to Cartesian coordinates.

Figure 2 — A bounding region

"boundingVolume": {
  "region": [
    -1.3197004795898053,
    0.6988582109,
    -1.3196595204101946,
    0.6988897891,
    0,
    20
  ]
}
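
This example is informative. The following sketch converts the south-west corner of the region above from radians and ellipsoidal height to Earth-centered, Earth-fixed Cartesian coordinates, using the standard closed-form conversion on the WGS 84 ellipsoid. The function and constant names are assumptions made for illustration.

```javascript
// WGS 84 ellipsoid constants.
const WGS84_A = 6378137.0; // semi-major axis in meters
const WGS84_F = 1.0 / 298.257223563; // flattening
const WGS84_E2 = WGS84_F * (2.0 - WGS84_F); // first eccentricity squared

// Convert longitude and latitude (radians) and ellipsoidal height (meters)
// to Earth-centered, Earth-fixed Cartesian coordinates in meters.
function cartographicToEcef(longitude, latitude, height) {
  const sinLat = Math.sin(latitude);
  const cosLat = Math.cos(latitude);
  // Radius of curvature in the prime vertical.
  const n = WGS84_A / Math.sqrt(1.0 - WGS84_E2 * sinLat * sinLat);
  return [
    (n + height) * cosLat * Math.cos(longitude),
    (n + height) * cosLat * Math.sin(longitude),
    (n * (1.0 - WGS84_E2) + height) * sinLat,
  ];
}

// Region from the example: [west, south, east, north, minHeight, maxHeight].
const region = [
  -1.3197004795898053, 0.6988582109,
  -1.3196595204101946, 0.6988897891,
  0, 20,
];
const [west, south, , , minimumHeight] = region;
const southwestCorner = cartographicToEcef(west, south, minimumHeight);
```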

6.7.1.4.2.  Box

The boundingVolume.box property is an array of 12 numbers that define an oriented bounding box in a right-handed 3-axis (x, y, z) Cartesian coordinate system where the z-axis is up. The first three elements define the x, y, and z values for the center of the box. The next three elements (with indices 3, 4, and 5) define the x-axis direction and half-length. The next three elements (indices 6, 7, and 8) define the y-axis direction and half-length. The last three elements (indices 9, 10, and 11) define the z-axis direction and half-length.

NOTE  The representation that is used for an oriented bounding box in 3D Tiles is versatile and compact: In addition to the center position, the array contains the elements of a 3×3 matrix. The columns of this matrix are the images of unit vectors under a transformation, and therefore uniquely and compactly define the scaling and orientation of the bounding box.

Figure 3 — A bounding box

"boundingVolume": {
  "box": [
    0,   0,   10,
    100, 0,   0,
    0,   100, 0,
    0,   0,   10
  ]
}
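
This example is informative. The sketch below expands the 12-number array into the eight corner points of the box by adding or subtracting each half-axis from the center. The function name boxCorners is hypothetical; the sample values are those of the box above.

```javascript
// Compute the eight corners of an oriented bounding box given as
// [centerX, centerY, centerZ, xAxis (3), yAxis (3), zAxis (3)],
// where each axis vector encodes a direction and a half-length.
function boxCorners(box) {
  const [cx, cy, cz] = box;
  const axes = [box.slice(3, 6), box.slice(6, 9), box.slice(9, 12)];
  const corners = [];
  for (const sx of [-1, 1]) {
    for (const sy of [-1, 1]) {
      for (const sz of [-1, 1]) {
        corners.push([
          cx + sx * axes[0][0] + sy * axes[1][0] + sz * axes[2][0],
          cy + sx * axes[0][1] + sy * axes[1][1] + sz * axes[2][1],
          cz + sx * axes[0][2] + sy * axes[1][2] + sz * axes[2][2],
        ]);
      }
    }
  }
  return corners;
}

const corners = boxCorners([0, 0, 10, 100, 0, 0, 0, 100, 0, 0, 0, 10]);
```

For this box the corners range from (-100, -100, 0) to (100, 100, 20).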

6.7.1.4.3.  Sphere

The boundingVolume.sphere property is an array of four numbers that define a bounding sphere. The first three elements define the x, y, and z values for the center of the sphere in a right-handed 3-axis (x, y, z) Cartesian coordinate system where the z-axis is up. The last element (with index 3) defines the radius in meters.

Figure 4 — A bounding sphere

"boundingVolume": {
  "sphere": [
    0,
    0,
    10,
    141.4214
  ]
}

6.7.1.4.4.  Content Bounding Volume

The bounding volume can be given for each tile, via the tile.boundingVolume property. Additionally, it is possible to specify the bounding volume for each tile content individually. The content.boundingVolume may be a more tight-fitting bounding volume. This enables tight view frustum culling, excluding from rendering any content not in the volume of what is potentially in view. When it is not defined, the tile’s bounding volume is still used for culling (see Grids).

The screenshot below shows the bounding volumes for the root tile for Canary Wharf. The tile.boundingVolume, shown in red, encloses the entire area of the tileset; content.boundingVolume, shown in blue, encloses just the four features (models) in the root tile.

Figure 5 — Bounding volumes for the root tile of a tileset. Building data from CyberCity3D. Imagery data from Bing Maps

6.7.1.4.5.  Extensions

Other bounding volume types are supported through extensions.

6.7.1.5.  Viewer request volume

A tile’s viewerRequestVolume can be used for combining heterogeneous datasets, and can be combined with external tilesets.

The following example has a point cloud inside a building. The point cloud tile’s boundingVolume is a sphere with a radius of 1.25. It also has a larger sphere with a radius of 15 for the viewerRequestVolume. Since the geometricError is zero, the point cloud tile’s content is always rendered (and initially requested) when the viewer is inside the large sphere defined by viewerRequestVolume.

{
  "children": [{
    "transform": [
      4.843178171884396,   1.2424271388626869, 0,                  0,
      -0.7993325488216595,  3.1159251367235608, 3.8278032889280675, 0,
      0.9511533376784163, -3.7077466670407433, 3.2168186118075526, 0,
      1215001.7612985559, -4736269.697480114,  4081650.708604793,  1
    ],
    "boundingVolume": {
      "box": [
        0,     0,    6.701,
        3.738, 0,    0,
        0,     3.72, 0,
        0,     0,    13.402
      ]
    },
    "geometricError": 32,
    "content": {
      "uri": "building.glb"
    }
  }, {
    "transform": [
      0.968635634376879,    0.24848542777253732, 0,                  0,
      -0.15986650990768783,  0.6231850279035362,  0.7655606573007809, 0,
      0.19023066741520941, -0.7415493329385225,  0.6433637229384295, 0,
      1215002.0371330238,  -4736270.772726648,   4081651.6414821907, 1
    ],
    "viewerRequestVolume": {
      "sphere": [0, 0, 0, 15]
    },
    "boundingVolume": {
      "sphere": [0, 0, 0, 1.25]
    },
    "geometricError": 0,
    "content": {
      "uri": "points.glb"
    }
  }]
}
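
This example is informative. The check implied by the example above can be sketched as follows. It handles only sphere volumes and assumes the viewer position has already been expressed in the tile’s local coordinate system; the function names are hypothetical.

```javascript
// Return true when `position` lies inside a sphere given as [x, y, z, radius].
// Assumes `position` is in the same coordinate system as the sphere.
function sphereContains(sphere, position) {
  const dx = position[0] - sphere[0];
  const dy = position[1] - sphere[1];
  const dz = position[2] - sphere[2];
  return dx * dx + dy * dy + dz * dz <= sphere[3] * sphere[3];
}

// Hypothetical request check: a tile without a viewerRequestVolume may always
// be requested; otherwise the viewer has to be inside the volume.
function viewerMayRequest(tile, viewerPositionLocal) {
  const volume = tile.viewerRequestVolume;
  if (volume === undefined) {
    return true;
  }
  if (volume.sphere !== undefined) {
    return sphereContains(volume.sphere, viewerPositionLocal);
  }
  // Box and region volumes are left out of this sketch.
  throw new Error("Unsupported viewerRequestVolume type in this sketch");
}

// The point cloud tile from the example, reduced to the relevant properties.
const pointCloudTile = {
  viewerRequestVolume: { sphere: [0, 0, 0, 15] },
  boundingVolume: { sphere: [0, 0, 0, 1.25] },
  geometricError: 0,
};

const insideVolume = viewerMayRequest(pointCloudTile, [10, 0, 0]);
const outsideVolume = viewerMayRequest(pointCloudTile, [20, 0, 0]);
```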

For more on request volumes, see the sample tileset.

6.7.1.6.  Transforms

6.7.1.6.1.  Tile transforms

To support local coordinate systems, each tile has an optional transform property. For example, a building tileset inside a city tileset can be defined in its own coordinate system, and a point cloud tileset inside the building could, again, be defined in its own coordinate system.

The transform property is a 4×4 affine transformation matrix, stored in column-major order, that transforms from the tile’s local coordinate system to the parent tile’s coordinate system, or to the tileset’s coordinate system in the case of the root tile.

NOTE  The storage of the transform matrix in column-major order follows the conventions that are common in graphics programming APIs like OpenGL, meaning that the elements in the transform array directly correspond to the entries of a 4×4 matrix in these systems.

The transform property applies to

  • tile.content

    • Each feature’s position.

    • Each feature’s normal should be transformed by the top-left 3×3 matrix of the inverse-transpose of transform to account for correct vector transforms when scale is used.

    • content.boundingVolume, except when content.boundingVolume.region is defined, which is explicitly in EPSG:4979 coordinates.

  • tile.boundingVolume, except when tile.boundingVolume.region is defined, which is explicitly in EPSG:4979 coordinates.

  • tile.viewerRequestVolume, except when tile.viewerRequestVolume.region is defined, which is explicitly in EPSG:4979 coordinates.

The transform property scales the geometricError by the largest scaling factor from the matrix.
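
This example is informative. The sketch below applies a column-major transform to a position and scales a geometric error. Taking the largest scaling factor as the length of the longest column of the upper-left 3×3 matrix is an assumption of this sketch, as are the function names. The sample matrix is the building tile transform from the viewer request volume example.

```javascript
// Transform a 3D point by a 4x4 matrix stored in column-major order.
function transformPoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

// Largest scale factor, taken here as the longest column of the
// upper-left 3x3 part of the matrix (an assumption of this sketch).
function maximumScale(m) {
  return Math.max(
    Math.hypot(m[0], m[1], m[2]),
    Math.hypot(m[4], m[5], m[6]),
    Math.hypot(m[8], m[9], m[10])
  );
}

const buildingTransform = [
  4.843178171884396, 1.2424271388626869, 0, 0,
  -0.7993325488216595, 3.1159251367235608, 3.8278032889280675, 0,
  0.9511533376784163, -3.7077466670407433, 3.2168186118075526, 0,
  1215001.7612985559, -4736269.697480114, 4081650.708604793, 1,
];

// The local origin maps to the translation stored in the last column.
const originInParent = transformPoint(buildingTransform, [0, 0, 0]);
// The building tile's geometricError of 32, scaled by the transform.
const scaledGeometricError = 32 * maximumScale(buildingTransform);
```

For this matrix every column of the upper-left 3×3 has a length of approximately 5, so the scaled geometric error is approximately 160.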

When transform is not defined, it defaults to the identity matrix:

[
  1.0, 0.0, 0.0, 0.0,
  0.0, 1.0, 0.0, 0.0,
  0.0, 0.0, 1.0, 0.0,
  0.0, 0.0, 0.0, 1.0
]

The transformation from each tile’s local coordinate system to the tileset’s global coordinate system is computed by a top-down traversal of the tileset and by post-multiplying a child’s transform with its parent’s transform like a traditional scene graph or node hierarchy in computer graphics.

6.7.1.6.2.  glTF transforms

glTF defines its own node hierarchy and uses a y-up coordinate system. Any transforms specific to a tile format and the tile.transform property are applied after these transforms are resolved.

6.7.1.6.2.1.  glTF node hierarchy

First, glTF node hierarchy transforms are applied according to the glTF specification.

6.7.1.6.2.2.  y-up to z-up

Next, for consistency with the z-up coordinate system of 3D Tiles, glTFs shall be transformed from y-up to z-up at runtime. This is done by rotating the model about the x-axis by π/2 radians. Equivalently, apply the following matrix transform (shown here as row-major):

[
  1.0, 0.0,  0.0, 0.0,
  0.0, 0.0, -1.0, 0.0,
  0.0, 1.0,  0.0, 0.0,
  0.0, 0.0,  0.0, 1.0
]
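To check the effect of this matrix, the following sketch applies it to points given in the y-up convention of glTF. The names `Y_UP_TO_Z_UP_ROW_MAJOR` and `transformPointRowMajor` are illustrative only. The array is laid out in row-major order, exactly as printed in this section, so it would need to be transposed before being passed to an API that expects column-major storage.

```javascript
// The y-up to z-up matrix from this section, in row-major order.
const Y_UP_TO_Z_UP_ROW_MAJOR = [
  1.0, 0.0,  0.0, 0.0,
  0.0, 0.0, -1.0, 0.0,
  0.0, 1.0,  0.0, 0.0,
  0.0, 0.0,  0.0, 1.0,
];

// Hypothetical helper: multiply a row-major 4x4 matrix by the point (x, y, z, 1).
function transformPointRowMajor(m, [x, y, z]) {
  return [
    m[0] * x + m[1] * y + m[2] * z + m[3],
    m[4] * x + m[5] * y + m[6] * z + m[7],
    m[8] * x + m[9] * y + m[10] * z + m[11],
  ];
}

// The glTF up direction (+y) ends up pointing along +z.
const up = transformPointRowMajor(Y_UP_TO_Z_UP_ROW_MAJOR, [0, 1, 0]);
```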

More broadly the order of transformations is:

  1. glTF node hierarchy transformations

  2. glTF y-up to z-up transform

  3. Tile transform

NOTE  When working with source data that is inherently z-up, such as data in WGS 84 coordinates or in a local z-up coordinate system, a common workflow is:

  • Mesh data, including positions and normals, are not modified — they remain z-up.

  • The root node matrix specifies a column-major z-up to y-up transform. This transforms the source data into a y-up coordinate system as required by glTF.

  • At runtime the glTF is transformed back from y-up to z-up with the matrix above. Effectively the transforms cancel out.

Example glTF root node:

"nodes": [
 {
   "matrix": [1,0,0,0,0,0,-1,0,0,1,0,0,0,0,0,1],
   "mesh": 0,
   "name": "rootNode"
 }
]
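
That the two transforms cancel can be checked numerically. The sketch below multiplies the runtime y-up to z-up matrix (transposed here into column-major order) by the example root node matrix. `multiplyColumnMajor` is a hypothetical helper, not part of any glTF or 3D Tiles API.

```javascript
// Hypothetical helper: product a * b of two column-major 4x4 matrices.
function multiplyColumnMajor(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; ++col) {
    for (let row = 0; row < 4; ++row) {
      let sum = 0;
      for (let k = 0; k < 4; ++k) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// Root node matrix from the example glTF (column-major, z-up to y-up).
const Z_UP_TO_Y_UP = [1, 0, 0, 0, 0, 0, -1, 0, 0, 1, 0, 0, 0, 0, 0, 1];
// Runtime y-up to z-up matrix, transposed into column-major order.
const Y_UP_TO_Z_UP_COLUMN_MAJOR = [1, 0, 0, 0, 0, 0, 1, 0, 0, -1, 0, 0, 0, 0, 0, 1];

// Applying the node matrix first and the runtime transform second
// yields the identity matrix.
const combined = multiplyColumnMajor(Y_UP_TO_Z_UP_COLUMN_MAJOR, Z_UP_TO_Y_UP);
```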

6.7.1.6.3.  Example

For an example of the computed transforms (transformToRoot in the implementation example below) for a tileset, consider:

Figure 6 — Structure of an example tileset with tiles that contain glTF content

The computed transform for each tile is:

  • T0: [T0]

  • T1: [T0][T1]

  • T2: [T0][T2]

  • T3: [T0][T1][T3]

  • T4: [T0][T1][T4]

The full computed transforms, taking into account the glTF y-up to z-up transform and glTF transforms, are:

  • T0: [T0]

  • T1: [T0][T1]

  • T2: [T0][T2][glTF y-up to z-up][glTF transform]

  • T3: [T0][T1][T3][glTF y-up to z-up][glTF transform]

  • T4: [T0][T1][T4][glTF y-up to z-up][glTF transform]

6.7.1.6.4.  Implementation example

This section is informative

The following JavaScript code shows how to compute this using Cesium’s Matrix4 and Matrix3 types.

function computeTransforms(tileset) {
  const root = tileset.root;
  const transformToRoot = defined(root.transform) ? Matrix4.fromArray(root.transform) : Matrix4.IDENTITY;

  computeTransform(root, transformToRoot);
}

function computeTransform(tile, transformToRoot) {
  // Apply 4x4 transformToRoot to this tile's positions and bounding volumes

  // The upper-left 3x3 block is a Matrix3, so the result parameter is a Matrix3 as well
  let normalTransform = Matrix4.getRotation(transformToRoot, new Matrix3());
  normalTransform = Matrix3.inverseTranspose(normalTransform, normalTransform);
  // Apply 3x3 normalTransform to this tile's normals

  const children = tile.children;
  if (defined(children)) {
    const length = children.length;
    for (let i = 0; i < length; ++i) {
      const child = children[i];
      let childToRoot = defined(child.transform) ? Matrix4.fromArray(child.transform) : Matrix4.clone(Matrix4.IDENTITY);
      childToRoot = Matrix4.multiplyTransformation(transformToRoot, childToRoot, childToRoot);
      computeTransform(child, childToRoot);
    }
  }
}

6.7.1.7.  Tile JSON

A tile JSON object consists of the following properties.

Figure 7 — Elements of a tile JSON object

The following example shows one non-leaf tile.

{
  "boundingVolume": {
    "region": [
      -1.2419052957251926,
      0.7395016240301894,
      -1.2415404171917719,
      0.7396563300150859,
      0,
      20.4
    ]
  },
  "geometricError": 43.88464075650763,
  "refine": "ADD",
  "content": {
    "boundingVolume": {
      "region": [
        -1.2418882438584018,
        0.7395016240301894,
        -1.2415422846940714,
        0.7396461198389616,
        0,
        19.4
      ]
    },
    "uri": "2/0/0.glb"
  },
  "children": [...]
}

The boundingVolume defines a volume enclosing the tile, and is used to determine which tiles to render at runtime. The above example uses a region volume, but other bounding volumes, such as box or sphere, may be used.

The geometricError property is a nonnegative number that defines the error, in meters, introduced if this tile is rendered and its children are not. At runtime, the geometric error is used to compute Screen-Space Error (SSE), the error measured in pixels. The SSE determines if a tile is sufficiently detailed for the current view or if its children should be considered, see Geometric error.
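One way to turn geometric error into SSE for a perspective projection is sketched below. The function and parameter names are illustrative assumptions, and the formula is one commonly used metric; engines are free to compute SSE differently.

```javascript
// Hypothetical helper: screen-space error in pixels for a perspective camera.
//   geometricError: the tile's geometric error, in meters
//   tileDistance:   distance from the camera to the tile, in meters
//   screenHeight:   height of the rendering surface, in pixels
//   fovy:           vertical field of view, in radians
function screenSpaceError(geometricError, tileDistance, screenHeight, fovy) {
  return (geometricError * screenHeight) / (tileDistance * 2 * Math.tan(fovy / 2));
}

// Hypothetical helper: a tile's children are considered when the tile's SSE
// exceeds the maximum the application tolerates.
function shouldRefine(sse, maximumScreenSpaceError) {
  return sse > maximumScreenSpaceError;
}
```

With a geometric error of 10 meters, a distance of 100 meters, a 1000-pixel-high viewport, and a 90 degree field of view, this gives an SSE of roughly 50 pixels.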

The optional viewerRequestVolume property (not shown above) defines a volume, using the same schema as boundingVolume, that the viewer shall be inside of before the tile’s content will be requested and before the tile will be refined based on geometricError. See the Viewer request volume section.

The refine property is a string that is either “REPLACE” for replacement refinement or “ADD” for additive refinement, see Refinement. It is required for the root tile of a tileset; it is optional for all other tiles. A tileset can use any combination of additive and replacement refinement. When the refine property is omitted, it is inherited from the parent tile.
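The inheritance rule can be sketched as a top-down traversal that carries the parent's effective value. `resolveRefine` is a hypothetical helper, and the returned structure is only for illustration.

```javascript
// Hypothetical helper: compute the effective refine value for every tile.
// A tile without its own refine property inherits the parent's value.
function resolveRefine(tile, parentRefine) {
  const refine = tile.refine ?? parentRefine;
  if (refine !== "ADD" && refine !== "REPLACE") {
    // Happens, for example, when the root tile omits refine.
    throw new Error('refine must resolve to "ADD" or "REPLACE"');
  }
  return {
    refine,
    children: (tile.children ?? []).map((child) => resolveRefine(child, refine)),
  };
}
```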

The content property is an object that describes the tile content. A file extension is not required for content.uri. A content’s tile format can be identified by the magic field in its header, or else as an external tileset if the content is JSON.
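A minimal sketch of such content sniffing follows. It assumes the payload is available as a `Uint8Array`; `identifyContent` is a hypothetical helper. Treating a JSON document as an external tileset only when it has a `root` property is an assumption made here to tell it apart from other JSON documents.

```javascript
// Magic values of binary glTF and the tile formats from 3D Tiles 1.0.
const TILE_FORMAT_MAGICS = new Set(["glTF", "b3dm", "i3dm", "pnts", "cmpt"]);

// Hypothetical helper: classify a content payload without relying on
// the file extension of content.uri.
function identifyContent(bytes) {
  const magic = String.fromCharCode(...bytes.subarray(0, 4));
  if (TILE_FORMAT_MAGICS.has(magic)) {
    return magic;
  }
  try {
    const json = JSON.parse(new TextDecoder().decode(bytes));
    // Assumption: a JSON object with a root tile is an external tileset.
    if (json !== null && typeof json === "object" && "root" in json) {
      return "external tileset";
    }
  } catch {
    // Not valid JSON; fall through.
  }
  return "unknown";
}
```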

The content.boundingVolume property defines an optional bounding volume similar to the top-level tile.boundingVolume property. But unlike the top-level boundingVolume property, content.boundingVolume is a tightly fit bounding volume enclosing just the tile’s content.

It is also possible to define multiple contents for a tile: The contents property (not shown above) is an array containing one or more contents. contents and content are mutually exclusive. When a tile has a single content it should use content for backwards compatibility with engines that only support 3D Tiles 1.0. Multiple contents allow for different representations of the tile content — for example, one as a triangle mesh and one as a point cloud:

Figure 8 — An example of a tile that defines multiple contents
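A loader can hide the difference between content and contents behind a single accessor. The sketch below uses a hypothetical helper, `getTileContents`, which also rejects tiles that define both properties.

```javascript
// Hypothetical helper: return the contents of a tile as an array,
// whether the tile uses "content", "contents", or neither.
function getTileContents(tile) {
  const hasContent = tile.content !== undefined;
  const hasContents = tile.contents !== undefined;
  if (hasContent && hasContents) {
    throw new Error("content and contents are mutually exclusive");
  }
  if (hasContents) {
    return tile.contents;
  }
  return hasContent ? [tile.content] : [];
}
```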

Contents can also be arranged into groups, using the content.group property:

{
  "root": {
    "refine": "ADD",
    "geometricError": 0.0,
    "boundingVolume": {
      "region": [-1.707, 0.543, -1.706, 0.544, 203.895, 253.113]
    },
    "contents": [
      {
        "uri": "buildings.glb",
        "group": 0
      },
      {
        "uri": "trees.glb",
        "group": 1
      },
      {
        "uri": "cars.glb",
        "group": 2
      }
    ]
  }
}

These groups can be associated with group metadata: The value of the content.group property is an index into the array of groups that are defined in a top-level array of the tileset. Each element of this array is a metadata entity, as defined in the metadata section. This allows applications to perform styling or filtering based on the group that the content belongs to:

Figure 9 — Illustration of rendering options based on content groups
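A minimal sketch of such group-based filtering is shown below. It assumes the top-level array is available as `tileset.groups`; the helper `visibleContentUris`, the `layer` class, and its `name` property are hypothetical and serve only as an illustration.

```javascript
// Hypothetical helper: list the URIs of contents whose group metadata
// satisfies an application-defined predicate.
function visibleContentUris(tileset, tile, isGroupVisible) {
  return tile.contents
    .filter((content) => content.group !== undefined)
    .filter((content) => isGroupVisible(tileset.groups[content.group]))
    .map((content) => content.uri);
}

// Illustrative data: class and property names are made up for this sketch.
const exampleTileset = {
  groups: [
    { class: "layer", properties: { name: "Buildings" } },
    { class: "layer", properties: { name: "Trees" } },
    { class: "layer", properties: { name: "Cars" } },
  ],
  root: {
    contents: [
      { uri: "buildings.glb", group: 0 },
      { uri: "trees.glb", group: 1 },
      { uri: "cars.glb", group: 2 },
    ],
  },
};

// Hide everything in the "Cars" group.
const shownUris = visibleContentUris(
  exampleTileset,
  exampleTileset.root,
  (group) => group.properties.name !== "Cars"
);
```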

The optional transform property (not shown above) defines a 4×4 affine transformation matrix that transforms the tile’s content, boundingVolume, and viewerRequestVolume as described in the Tile transform section.

The optional implicitTiling property (not shown above) defines how the tile is subdivided and where to locate content resources. See Implicit Tiling.

The children property is an array of objects that define child tiles. Each child tile’s content is fully enclosed by its parent tile’s boundingVolume, and each child tile generally has a geometricError less than its parent tile’s geometricError. For leaf tiles, the length of this array is zero, and children may be left undefined. See the Tileset JSON section below.

The full JSON schema can be found in tile.schema.json.

6.7.2.  Tileset JSON

3D Tiles uses one main tileset JSON file as the entry point to define a tileset. Neither entry nor external tileset JSON files are required to follow a specific naming convention.

Here is a subset of the tileset JSON used for Canary Wharf:

{
  "asset" : {
    "version": "1.1",
    "tilesetVersion": "e575c6f1-a45b-420a-b172-6449fa6e0a59"
  },
  "properties": {
    "Height": {
      "minimum": 1,
      "maximum": 241.6
    }
  },
  "geometricError": 494.50961650991815,
  "root": {
    "boundingVolume": {
      "region": [
        -0.0005682966577418737,
        0.8987233516605286,
        0.00011646582098558159,
        0.8990603398325034,
        0,
        241.6
      ]
    },
    "geometricError": 268.37878244706053,
    "refine": "ADD",
    "content": {
      "uri": "0/0/0.glb",
      "boundingVolume": {
        "region": [
          -0.0004001690908972599,
          0.8988700116775743,
          0.00010096729722787196,
          0.8989625664878067,
          0,
          241.6
        ]
      }
    },
    "children": [...]
  }
}

The tileset JSON has four top-level properties: asset, properties, geometricError, and root.

asset is an object containing metadata about the entire tileset. The asset.version property is a string that defines the 3D Tiles version, which specifies the JSON schema for the tileset and the base set of tile formats. The tilesetVersion property is an optional string that defines an application-specific version of a tileset, e.g., for when an existing tileset is updated.

NOTE  The tilesetVersion can be used as a query parameter when requesting content to avoid using outdated content from a cache.
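A sketch of this cache-busting approach, using the standard URL API, is given below. The helper `versionedContentUrl` and the query parameter name `v` are assumptions; the specification does not prescribe a parameter name.

```javascript
// Hypothetical helper: resolve a content URI against the tileset URL and
// append the tileset version as a query parameter (name "v" is an assumption).
function versionedContentUrl(contentUri, tilesetUrl, tilesetVersion) {
  const url = new URL(contentUri, tilesetUrl);
  if (tilesetVersion !== undefined) {
    url.searchParams.set("v", tilesetVersion);
  }
  return url.toString();
}
```

For example, resolving "0/0/0.glb" against "https://example.com/tiles/tileset.json" with version "e575" would yield "https://example.com/tiles/0/0/0.glb?v=e575".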

properties is an object containing objects for each per-feature property in the tileset. This tileset JSON snippet is for 3D buildings, so each tile has building models, and each building model has a Height property (see Batch Table). The name of each object in properties matches the name of a per-feature property, and its value defines its minimum and maximum numeric values, which are useful, for example, for creating color ramps for styling.
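As an illustration of the color ramp use case, the sketch below maps a Height value onto a blue-to-red ramp using the minimum and maximum from properties. The helper `heightToColor` and the choice of ramp are assumptions, not part of the specification.

```javascript
// Hypothetical helper: map a property value to an RGB color using the
// minimum and maximum declared in the tileset's "properties" object.
function heightToColor(height, range) {
  const span = range.maximum - range.minimum;
  // Normalize to [0, 1], clamping values outside the declared range.
  const t = span > 0 ? Math.min(1, Math.max(0, (height - range.minimum) / span)) : 0;
  // Blue for the lowest values, red for the highest.
  return [Math.round(255 * t), 0, Math.round(255 * (1 - t))];
}

// Range taken from the tileset JSON example in this section.
const heightRange = { minimum: 1, maximum: 241.6 };
const lowest = heightToColor(1, heightRange);
const highest = heightToColor(241.6, heightRange);
```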

geometricError is a nonnegative number that defines the error, in meters, that determines if the tileset is rendered. At runtime, the geometric error is used to compute Screen-Space Error (SSE), the error measured in pixels. If the SSE does not exceed a required minimum, the tileset should not be rendered, and none of its tiles should be considered for rendering, see Geometric error.

root is an object that defines the root tile using the tile JSON described in the above section. root.geometricError is not the same as the tileset’s top-level geometricError. The tileset’s geometricError is used at runtime to determine the SSE at which the tileset’s root tile renders; root.geometricError is used at runtime to determine the SSE at which the root tile’s children are rendered.

6.7.2.1.  External tilesets

To create a tree of trees, a tile’s content.uri can point to an external tileset (the uri of another tileset JSON file). This enables, for example, storing each city in a tileset and then having a global tileset of tilesets.

Figure 10 — A tileset that refers to other tilesets

When a tile points to an external tileset, the tile:

  • Cannot have any children; tile.children shall be omitted

  • Cannot be used to create cycles, for example, by pointing to the same tileset file containing the tile or by pointing to another tileset file that then points back to the initial file containing the tile.

  • Applies its own transform to the external tileset, in addition to the transform of the external tileset’s root tile. For example, in the following tileset referencing an external tileset, the computed transform for T3 is [T0][T1][T2][T3].

Figure 11 — The chain of transforms for a tileset that refers to another tileset

If an external tileset defines asset.tilesetVersion, this overrides the value from the parent tileset. If the external tileset does not define asset.tilesetVersion, the value is inherited from the parent tileset (if defined).
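The prohibition on cycles can be checked with a depth-first traversal over tileset references. In the sketch below, `tilesets` is a hypothetical in-memory Map from absolute URL to parsed tileset JSON that stands in for network requests; a content URI is treated as an external tileset when it resolves to a key of that Map.

```javascript
// Hypothetical helper: report whether following external tileset references
// from entryUrl ever leads back to a tileset that is still being traversed.
function hasExternalCycle(tilesets, entryUrl) {
  const visiting = new Set();

  function visitTileset(url) {
    if (visiting.has(url)) {
      return true; // Reached a tileset that is already on the current path.
    }
    visiting.add(url);
    const found = visitTile(tilesets.get(url).root, url);
    visiting.delete(url);
    return found;
  }

  function visitTile(tile, baseUrl) {
    const uri = tile.content?.uri;
    if (uri !== undefined) {
      const resolved = new URL(uri, baseUrl).toString();
      if (tilesets.has(resolved) && visitTileset(resolved)) {
        return true;
      }
    }
    return (tile.children ?? []).some((child) => visitTile(child, baseUrl));
  }

  return visitTileset(entryUrl);
}
```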

6.7.2.2.  Bounding volume spatial coherence

As described above, the tree has spatial coherence; each tile has a bounding volume completely enclosing its content, and the content for child tiles is completely inside the parent’s bounding volume. This does not imply that a child’s bounding volume is completely inside its parent’s bounding volume. For example:

Figure 12 — Bounding sphere for a terrain tile.

Figure 13 — Bounding spheres for the four child tiles. The children’s content is completely inside the parent’s bounding volume, but the children’s bounding volumes are not since they are not tightly fit.

6.7.2.3.  Spatial data structures

3D Tiles incorporates the concept of Hierarchical Level of Detail (HLOD) for optimal rendering of spatial data. A tileset is composed of a tree, defined by root and, recursively, its children tiles, which can be organized by different types of spatial data structures.

A runtime engine is generic and will render any tree defined by a tileset. Any combination of tile formats and refinement approaches can be used, enabling flexibility in supporting heterogeneous datasets, see Refinement.

A tileset may use a 2D spatial tiling scheme similar to raster and vector tiling schemes (like a Web Map Tile Service (WMTS) or XYZ scheme) that serve predefined tiles at several levels of detail (or zoom levels). However, since the content of a tileset is often non-uniform or may not easily be organized in only two dimensions, other spatial data structures may be better suited.

Included below is a brief description of how 3D Tiles can represent various spatial data structures.

6.7.2.3.1.  Quadtrees

A quadtree is created when each tile has four uniformly subdivided children (e.g., using the center latitude and longitude), similar to typical 2D geospatial tiling schemes. Empty child tiles can be omitted.

Figure 14 — Classic quadtree subdivision
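The uniform subdivision can be sketched for a region bounding volume, given as [west, south, east, north, minimum height, maximum height]. `subdivideRegion` is a hypothetical helper, and the sketch assumes the region does not cross the antimeridian.

```javascript
// Hypothetical helper: split a region into four children at the center
// longitude and latitude. Heights are passed through unchanged.
function subdivideRegion([west, south, east, north, minHeight, maxHeight]) {
  const centerLongitude = (west + east) / 2;
  const centerLatitude = (south + north) / 2;
  return [
    [west, south, centerLongitude, centerLatitude, minHeight, maxHeight], // southwest
    [centerLongitude, south, east, centerLatitude, minHeight, maxHeight], // southeast
    [west, centerLatitude, centerLongitude, north, minHeight, maxHeight], // northwest
    [centerLongitude, centerLatitude, east, north, minHeight, maxHeight], // northeast
  ];
}
```

A tileset generator would typically drop any child whose region contains no data, in line with the rule that empty child tiles can be omitted.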

3D Tiles enables quadtree variations such as non-uniform subdivision and tight bounding volumes (as opposed to bounding, for example, the full 25% of the parent tile, which is wasteful for sparse datasets).

Figure 15 — Quadtree with tight bounding volumes around each child

For example, here is the root tile and its children for Canary Wharf. Note the bottom left, where the bounding volume does not include the water on the left where no buildings will appear:

Figure 16 — Building data from CyberCity3D. Imagery data from Bing Maps

3D Tiles also enables other quadtree variations such as loose quadtrees, where child tiles overlap but spatial coherence is still preserved, i.e., a parent tile completely encloses all of its children. This approach can be useful to avoid splitting features, such as 3D models, across tiles.

Figure 17 — Quadtree with non-uniform and overlapping tiles

Below, the green buildings are in the left child and the purple buildings are in the right child. Note that the tiles overlap so the two green and one purple building in the center are not split.

Figure 18 — Building data from CyberCity3D. Imagery data from Bing Maps

6.7.2.3.2.  K-d trees

A k-d tree is created when each tile has two children separated by a splitting plane parallel to the x, y, or z axis (or latitude, longitude, height). The split axis is often round-robin rotated as levels increase down the tree, and the splitting plane may be selected using the median split, surface area heuristics, or other approaches.

Figure 19 — Example k-d tree. Note the non-uniform subdivision
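A minimal sketch of k-d tree construction with a round-robin split axis and a median split is shown below. `buildKdTree`, the point representation as [x, y, z] arrays, and the leaf size are assumptions for illustration; a tileset generator would additionally compute a bounding volume and geometric error for each node.

```javascript
// Hypothetical helper: build a k-d tree over [x, y, z] points.
// The split axis cycles x, y, z with depth; the split plane is the median.
function buildKdTree(points, depth = 0, maxPointsPerLeaf = 2) {
  if (points.length <= Math.max(1, maxPointsPerLeaf)) {
    return { points, children: [] };
  }
  const axis = depth % 3;
  const sorted = [...points].sort((a, b) => a[axis] - b[axis]);
  const median = Math.floor(sorted.length / 2);
  return {
    axis,
    split: sorted[median][axis],
    children: [
      buildKdTree(sorted.slice(0, median), depth + 1, maxPointsPerLeaf),
      buildKdTree(sorted.slice(median), depth + 1, maxPointsPerLeaf),
    ],
  };
}
```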

Note that a k-d tree does not have uniform subdivision like typical 2D geospatial tiling schemes and, therefore, can create a more balanced tree for sparse and non-uniformly distributed datasets.

3D Tiles enables variations on k-d trees such as multi-way k-d trees where, at each leaf of the tree, there are multiple splits along an axis. Instead of having two children per tile, there are n children.

6.7.2.3.3.  Octrees

An octree extends a quadtree by using three orthogonal splitting planes to subdivide a tile into eight children. Like quadtrees, 3D Tiles allows variations to octrees such as non-uniform subdivision, tight bounding volumes, and overlapping children.

Figure 20 — Traditional octree subdivision
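The traditional subdivision can be sketched for a box bounding volume, stored as twelve numbers: the center followed by the x, y, and z half-axis vectors. `subdivideBox` is a hypothetical helper; because it works on the half-axis vectors directly, it also applies to oriented boxes.

```javascript
// Hypothetical helper: split a box into eight children of half the size.
function subdivideBox(box) {
  const center = box.slice(0, 3);
  const halfAxes = [box.slice(3, 6), box.slice(6, 9), box.slice(9, 12)];
  // Every child has the parent's half-axes scaled by one half.
  const childHalfAxes = halfAxes.flatMap((axis) => axis.map((v) => v * 0.5));
  const children = [];
  for (const sx of [-1, 1]) {
    for (const sy of [-1, 1]) {
      for (const sz of [-1, 1]) {
        // Offset the center by half of each half-axis, in either direction.
        const childCenter = center.map(
          (c, i) =>
            c + 0.5 * (sx * halfAxes[0][i] + sy * halfAxes[1][i] + sz * halfAxes[2][i])
        );
        children.push([...childCenter, ...childHalfAxes]);
      }
    }
  }
  return children;
}
```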