Point Transformer - V3 Mode1
Original Author: Xiaoyang Wu. Please cite their work if you use the following code in your research paper.
Module Contents
- fastdev.nn.point_transformer_v3.drop_path(x, drop_prob: float = 0.0, training: bool = False, scale_by_keep: bool = True)
Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
This is the same as the DropConnect implementation I created for EfficientNet and similar networks. However, the original name is misleading, as ‘Drop Connect’ is a different form of dropout described in a separate paper… See discussion: … I’ve opted to change the layer and argument names to ‘drop path’ rather than mix DropConnect as a layer name with ‘survival rate’ as the argument.
- Parameters:
drop_prob (float)
training (bool)
scale_by_keep (bool)
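The per-sample behaviour described above can be illustrated with a framework-free sketch. It is not the module's implementation: `drop_path_sketch`, its list-of-lists input, and the `rng` argument are hypothetical stand-ins for the tensor version. Each sample's residual branch is either kept whole or zeroed, and kept samples are rescaled by `1 / keep_prob` when `scale_by_keep` is set, so the expected value stays unchanged.

```python
import random


def drop_path_sketch(batch, drop_prob=0.0, training=False, scale_by_keep=True, rng=None):
    """Per-sample stochastic depth on a list of samples (each a list of floats)."""
    if drop_prob == 0.0 or not training:
        # Identity when disabled or at inference time.
        return [list(sample) for sample in batch]
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    scale = 1.0 / keep_prob if (scale_by_keep and keep_prob > 0.0) else 1.0
    out = []
    for sample in batch:
        # One Bernoulli draw per sample: the whole branch is kept or dropped.
        keep = rng.random() < keep_prob
        out.append([v * scale if keep else 0.0 for v in sample])
    return out


batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
dropped = drop_path_sketch(batch, drop_prob=0.5, training=True, rng=random.Random(0))
```

With `drop_prob=0.5`, every row of `dropped` is either all zeros or the input row multiplied by 2.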
- class fastdev.nn.point_transformer_v3.DropPath(drop_prob: float = 0.0, scale_by_keep: bool = True)
Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
- Parameters:
drop_prob (float)
scale_by_keep (bool)
- fastdev.nn.point_transformer_v3.xyz2key(x: torch.Tensor, y: torch.Tensor, z: torch.Tensor, b: torch.Tensor | int | None = None, depth: int = 16)
Encodes x, y, z coordinates to the shuffled keys based on pre-computed lookup tables. This function is much faster than the equivalent for-loop method.
- Parameters:
x (torch.Tensor) – The x coordinate.
y (torch.Tensor) – The y coordinate.
z (torch.Tensor) – The z coordinate.
b (torch.Tensor or int) – The batch index of the coordinates; it should be smaller than 32768. If b is a torch.Tensor, its size must be the same as that of x, y, and z.
depth (int) – The depth of the shuffled key; it must be smaller than 17.
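For reference, the slower for-loop method that the lookup tables replace can be sketched for a single integer coordinate. The bit layout here is an assumption, not something this page states: bit i of x, y, z is placed at positions 3i+2, 3i+1, 3i, and the batch index sits above bit 48. The name `xyz2key_loop` is hypothetical. Under that layout the documented limits fit together: 16 levels use 48 bits, and a batch index below 32768 uses 15 more, for 63 bits in total, which fits a signed 64-bit integer.

```python
def xyz2key_loop(x, y, z, b=None, depth=16):
    """Interleave the low `depth` bits of x, y, z into one integer key."""
    assert 0 < depth < 17, "3 * depth coordinate bits must stay below bit 48"
    key = 0
    for i in range(depth):
        # Assumed layout: bit i of x, y, z goes to bit 3*i+2, 3*i+1, 3*i.
        key |= ((x >> i) & 1) << (3 * i + 2)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i)
    if b is not None:
        assert 0 <= b < 32768, "batch index must fit in 15 bits"
        key |= b << 48  # assumed: batch index stored above the coordinate bits
    return key


largest = xyz2key_loop(65535, 65535, 65535, b=32767)
```

Sorting points by such a key groups spatially close points together, with batches kept apart by the high bits.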
- fastdev.nn.point_transformer_v3.right_shift(binary, k=1, axis=-1)
Right shift an array of binary values.
binary: An ndarray of binary values.
k: The number of bits to shift. Default 1.
axis: The axis along which to shift. Default -1.
Returns an ndarray with zero prepended and the ends truncated, along whatever axis was specified.
- fastdev.nn.point_transformer_v3.gray2binary(gray, axis=-1)
Convert an array of Gray codes back into binary values.
gray: An ndarray of gray codes.
axis: The axis along which to perform Gray decoding. Default=-1.
Returns an ndarray of binary values.
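A minimal sketch of these two helpers on plain Python lists, assuming bits are stored most significant first along the shifted axis. The names `right_shift_bits`, `gray_to_binary`, and `to_bits` are hypothetical. Gray decoding is a prefix XOR, which can be computed by XOR-ing the array with copies of itself shifted by 1, 2, 4, and so on.

```python
def right_shift_bits(bits, k=1):
    """Shift a list of bits right by k: prepend k zeros and truncate the end."""
    n = len(bits)
    k = max(0, min(k, n))
    return [0] * k + list(bits[: n - k])


def gray_to_binary(gray):
    """Decode a Gray code given as a list of bits, most significant first."""
    bits = list(gray)
    shift = 1
    while shift < len(bits):
        # After this step each bit holds the XOR of up to 2 * shift Gray bits.
        shifted = right_shift_bits(bits, shift)
        bits = [a ^ b for a, b in zip(bits, shifted)]
        shift *= 2
    return bits


def to_bits(value, width):
    """Helper for the example: integer to a list of bits, MSB first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


# The Gray code of 5 is 5 ^ (5 >> 1) = 7, i.e. bits 1 1 1.
decoded = gray_to_binary([1, 1, 1])
```

Here `decoded` is `[1, 0, 1]`, the binary form of 5.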
- fastdev.nn.point_transformer_v3.hilbert_encode_(locs, num_dims, num_bits)
Encode an array of locations in a hypercube into Hilbert integers.
This is a vectorized-ish version of the Hilbert curve implementation by John Skilling as described in:
Skilling, J. (2004, April). Programming the Hilbert curve. In AIP Conference Proceedings (Vol. 707, No. 1, pp. 381-387). American Institute of Physics.
locs - An ndarray of locations in a hypercube of num_dims dimensions, in which each dimension runs from 0 to 2**num_bits-1. The shape can be arbitrary, as long as the last dimension has size num_dims.
num_dims - The dimensionality of the hypercube. Integer.
num_bits - The number of bits for each dimension. Integer.
The output is an ndarray of uint64 integers with the same shape as the input, excluding the last dimension, which needs to be num_dims.
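To make the procedure concrete, here is a scalar, non-vectorized sketch of Skilling's axes-to-transpose routine followed by bit interleaving. It handles one point at a time, the name `hilbert_index` is hypothetical, and its bit-ordering conventions are an assumption, so its indices may not match the module's output exactly. What it does show is the defining property: consecutive indices belong to neighbouring cells.

```python
from itertools import product


def hilbert_index(point, num_bits):
    """Hilbert index of one integer point, each axis in [0, 2**num_bits - 1]."""
    x = list(point)
    n = len(x)
    top = 1 << (num_bits - 1)

    # Step 1: undo excess rotations and reflections, from the top bit down.
    q = top
    while q > 1:
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p  # invert the low bits of x[0]
            else:
                t = (x[0] ^ x[i]) & p  # swap the low bits of x[0] and x[i]
                x[0] ^= t
                x[i] ^= t
        q >>= 1

    # Step 2: Gray-encode across the dimensions.
    for i in range(1, n):
        x[i] ^= x[i - 1]
    t = 0
    q = top
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    x = [v ^ t for v in x]

    # Step 3: interleave the transposed bits, most significant first.
    index = 0
    for bit in reversed(range(num_bits)):
        for v in x:
            index = (index << 1) | ((v >> bit) & 1)
    return index


# Visit a 4 x 4 grid in Hilbert order.
curve = sorted(product(range(4), repeat=2), key=lambda p: hilbert_index(p, 2))
```

The first four cells of `curve` are (0, 0), (1, 0), (1, 1), (0, 1), and every later cell is one step away from its predecessor.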
- fastdev.nn.point_transformer_v3.z_order_encode(grid_coord: torch.Tensor, depth: int = 16)
- Parameters:
grid_coord (torch.Tensor)
depth (int)
- fastdev.nn.point_transformer_v3.hilbert_encode(grid_coord: torch.Tensor, depth: int = 16)
- Parameters:
grid_coord (torch.Tensor)
depth (int)
- class fastdev.nn.point_transformer_v3.Point(*args, **kwargs)
Point Structure of Pointcept
A Point (point cloud) in Pointcept is a dictionary that contains various properties of a batched point cloud. Properties with the following names have specific definitions:
“coord”: original coordinate of point cloud;
“grid_coord”: grid coordinate for specific grid size (related to GridSampling);
Point also supports the following optional attributes:
“offset”: if absent, initialized assuming a batch size of 1;
“batch”: if absent, initialized assuming a batch size of 1;
“feat”: feature of the point cloud, the default input of the model;
“grid_size”: grid size of the point cloud (related to GridSampling);
(related to Serialization)
“serialized_depth”: depth of serialization; 2 ** depth * grid_size describes the maximum extent of the point cloud range;
“serialized_code”: a list of serialization codes;
“serialized_order”: a list of serialization orders determined by the codes;
“serialized_inverse”: a list of inverse mappings determined by the codes;
(related to Sparsify: SpConv)
“sparse_shape”: sparse shape for the Sparse Conv Tensor;
“sparse_conv_feat”: SparseConvTensor initialized with information provided by Point.
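How these attributes relate can be sketched with a plain dictionary standing in for a Point. Several details are assumptions rather than facts from this page: that “offset” stores cumulative point counts per sample, that grid coordinates are measured from the global minimum corner, and that the depth is the bit length of the largest grid coordinate. The helper `make_point_dict` is hypothetical.

```python
def make_point_dict(coord, grid_size, batch=None):
    """Build a plain-dict stand-in for a Point (illustrative, not the real class)."""
    batch = [0] * len(coord) if batch is None else list(batch)  # batch size 1

    # Assumption: "offset" holds cumulative point counts, one entry per sample.
    offset = [sum(1 for b in batch if b <= s) for s in range(max(batch) + 1)]

    # Assumption: grid coordinates are measured from the global minimum corner.
    dims = range(len(coord[0]))
    mins = [min(p[d] for p in coord) for d in dims]
    grid_coord = [
        tuple(int((p[d] - mins[d]) // grid_size) for d in dims) for p in coord
    ]

    # Smallest depth such that 2 ** depth grid cells cover every coordinate.
    depth = max(max(g) for g in grid_coord).bit_length()
    return {
        "coord": list(coord),
        "grid_coord": grid_coord,
        "grid_size": grid_size,
        "batch": batch,
        "offset": offset,
        "serialized_depth": depth,
    }


coords = [(0.0, 0.0, 0.0), (1.25, 0.5, 3.0), (2.0, 2.0, 2.0)]
point = make_point_dict(coords, grid_size=0.5, batch=[0, 0, 1])
```

In this example the largest grid coordinate is 6, so a depth of 3 suffices and the covered range is 2 ** 3 * 0.5 = 4.0 units per axis.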
- serialization(order='z', depth=None, shuffle_orders=False)
Point Cloud Serialization
Relies on [“grid_coord” or “coord” + “grid_size”, “batch”, “feat”].
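The relationship between a serialization code, its order, and its inverse mapping can be sketched as follows. This assumes that the order is an argsort of the codes and that the inverse maps each original point to its rank in the sorted sequence; the function `serialize_codes` is hypothetical and works on one list of integer codes.

```python
def serialize_codes(codes):
    """Return (order, inverse) for one list of integer serialization codes."""
    # order[r] is the index of the point that ends up at rank r (an argsort).
    order = sorted(range(len(codes)), key=lambda i: codes[i])
    # inverse[i] is the rank of original point i in the sorted sequence.
    inverse = [0] * len(codes)
    for rank, idx in enumerate(order):
        inverse[idx] = rank
    return order, inverse


codes = [5, 1, 7, 3]
order, inverse = serialize_codes(codes)
sorted_codes = [codes[i] for i in order]         # gather into serialized order
restored = [sorted_codes[r] for r in inverse]    # scatter back to input order
```

Indexing with `order` arranges per-point data along the curve, and indexing the result with `inverse` restores the original point order.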
- class fastdev.nn.point_transformer_v3.PointModule(*args, **kwargs)
PointModule placeholder. All modules that subclass it take a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.PointSequential(*args, **kwargs)
A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
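The two construction styles can be illustrated with a tiny framework-free container. `TinySequential` is a hypothetical stand-in that chains plain callables; it does not reproduce how PointSequential treats Point inputs or PointModule subclasses.

```python
from collections import OrderedDict


class TinySequential:
    """Chain callables in insertion order (illustrative stand-in only)."""

    def __init__(self, *args):
        self._modules = OrderedDict()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            # Named form: keep the names and the order of the ordered dict.
            for name, module in args[0].items():
                self._modules[name] = module
        else:
            # Positional form: name each module by its position.
            for idx, module in enumerate(args):
                self._modules[str(idx)] = module

    def names(self):
        return list(self._modules)

    def __call__(self, x):
        for module in self._modules.values():
            x = module(x)
        return x


positional = TinySequential(lambda v: v + 1, lambda v: v * 2)
named = TinySequential(OrderedDict([("inc", lambda v: v + 1), ("dbl", lambda v: v * 2)]))
```

Both containers apply the increment first and the doubling second, so calling either with 3 yields 8.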
- class fastdev.nn.point_transformer_v3.PDNorm(num_features, norm_layer, context_channels=256, conditions=('ScanNet', 'S3DIS', 'Structured3D'), decouple=True, adaptive=False)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.SerializedAttention(channels, num_heads, patch_size, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, order_index=0, enable_rpe=False, enable_flash=True, upcast_attention=True, upcast_softmax=True)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.MLP(in_channels, hidden_channels=None, out_channels=None, act_layer=nn.GELU, drop=0.0)
- class fastdev.nn.point_transformer_v3.Block(channels, num_heads, patch_size=48, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, drop_path=0.0, norm_layer=nn.LayerNorm, act_layer=nn.GELU, pre_norm=True, order_index=0, cpe_indice_key=None, enable_rpe=False, enable_flash=True, upcast_attention=True, upcast_softmax=True)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.SerializedPooling(in_channels, out_channels, stride=2, norm_layer=None, act_layer=None, reduce='max', shuffle_orders=True, traceable=True)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.SerializedUnpooling(in_channels, skip_channels, out_channels, norm_layer=None, act_layer=None, traceable=False)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.Embedding(in_channels, embed_channels, norm_layer=None, act_layer=None)
Subclass of PointModule; takes a Point as input when used in PointSequential.
- class fastdev.nn.point_transformer_v3.PointTransformerV3(in_channels=6, order=('z', 'z-trans'), stride=(2, 2, 2, 2), enc_depths=(2, 2, 2, 6, 2), enc_channels=(32, 64, 128, 256, 512), enc_num_head=(2, 4, 8, 16, 32), enc_patch_size=(48, 48, 48, 48, 48), dec_depths=(2, 2, 2, 2), dec_channels=(64, 64, 128, 256), dec_num_head=(4, 4, 8, 16), dec_patch_size=(48, 48, 48, 48), mlp_ratio=4, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, drop_path=0.3, pre_norm=True, shuffle_orders=True, enable_rpe=False, enable_flash=True, upcast_attention=False, upcast_softmax=False, cls_mode=False, pdnorm_bn=False, pdnorm_ln=False, pdnorm_decouple=True, pdnorm_adaptive=False, pdnorm_affine=True, pdnorm_conditions=('ScanNet', 'S3DIS', 'Structured3D'))
Subclass of PointModule; takes a Point as input when used in PointSequential.
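The per-stage tuples in this signature have to agree in length with one another. The sketch below copies the default values from the signature into a plain dictionary and checks the relationships they appear to follow: one encoder entry per stage, one fewer stride and decoder entry, and channel counts divisible by head counts. Whether the constructor enforces exactly these rules is an assumption, the helpers `config_problems` and `head_dims` are hypothetical, and the check ignores `cls_mode`.

```python
PTV3_DEFAULTS = {
    "stride": (2, 2, 2, 2),
    "enc_depths": (2, 2, 2, 6, 2),
    "enc_channels": (32, 64, 128, 256, 512),
    "enc_num_head": (2, 4, 8, 16, 32),
    "enc_patch_size": (48, 48, 48, 48, 48),
    "dec_depths": (2, 2, 2, 2),
    "dec_channels": (64, 64, 128, 256),
    "dec_num_head": (4, 4, 8, 16),
    "dec_patch_size": (48, 48, 48, 48),
}


def config_problems(cfg):
    """Return a list of human-readable inconsistencies (empty if none found)."""
    problems = []
    stages = len(cfg["enc_depths"])
    for key in ("enc_channels", "enc_num_head", "enc_patch_size"):
        if len(cfg[key]) != stages:
            problems.append(f"{key} should have {stages} entries")
    # Assumed: one downsampling step and one decoder stage between encoder stages.
    for key in ("stride", "dec_depths", "dec_channels", "dec_num_head", "dec_patch_size"):
        if len(cfg[key]) != stages - 1:
            problems.append(f"{key} should have {stages - 1} entries")
    for prefix in ("enc", "dec"):
        pairs = zip(cfg[prefix + "_channels"], cfg[prefix + "_num_head"])
        for channels, heads in pairs:
            if channels % heads != 0:
                problems.append(f"{prefix}: {channels} channels not divisible by {heads} heads")
    return problems


def head_dims(cfg, prefix):
    """Channels per attention head at each stage."""
    return [c // h for c, h in zip(cfg[prefix + "_channels"], cfg[prefix + "_num_head"])]


problems = config_problems(PTV3_DEFAULTS)
```

For the defaults, `problems` is empty and every stage works out to 16 channels per head, which is a convenient invariant to preserve when changing widths.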