fastdev.nn.point_transformer_v3
Point Transformer - V3 Mode1
Adapted from: https://github.com/Pointcept/Pointcept
This module requires the installation of the following packages:
addict: pip install addict
spconv: https://github.com/traveller59/spconv?tab=readme-ov-file#spconv-spatially-sparse-convolution-library
torch-scatter: https://github.com/rusty1s/pytorch_scatter?tab=readme-ov-file#installation
flash-attention: https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features
Original Author: Xiaoyang Wu (xiaoyang.wu.cs@gmail.com). Please cite their work if you use the following code in your research paper.
Module Contents
- fastdev.nn.point_transformer_v3.drop_path(x, drop_prob: float = 0.0, training: bool = False, scale_by_keep: bool = True)
Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
This is the same as the DropConnect implementation I created for EfficientNet and similar networks. However, the original name is misleading, as ‘Drop Connect’ is a different form of dropout from a separate paper. See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956. I’ve opted to change the layer and argument names to ‘drop path’ rather than mix DropConnect as a layer name with ‘survival rate’ as the argument.
- Parameters:
drop_prob (float)
training (bool)
scale_by_keep (bool)
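As a rough illustration of the per-sample behavior described above, the following pure-Python sketch mimics stochastic depth on a list of samples. It is not the library's code: `drop_path_sketch` and its `rng` argument are hypothetical names, and plain lists stand in for tensors.

```python
import random


def drop_path_sketch(batch, drop_prob=0.0, training=False, scale_by_keep=True, rng=None):
    """Zero out whole samples with probability drop_prob (illustrative only)."""
    # At inference time, or with nothing to drop, the input passes through unchanged.
    if drop_prob == 0.0 or not training:
        return batch
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    # Rescale survivors so the expected value of the output matches the input.
    scale = (1.0 / keep_prob) if (scale_by_keep and keep_prob > 0.0) else 1.0
    out = []
    for sample in batch:
        # One Bernoulli draw per sample, shared by every feature of that sample.
        keep = rng.random() < keep_prob
        out.append([v * scale if keep else 0.0 for v in sample])
    return out
```

With `scale_by_keep=True`, surviving samples are divided by the keep probability, so a residual branch contributes the same amount on average during training as it does at inference.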
- class fastdev.nn.point_transformer_v3.DropPath(drop_prob: float = 0.0, scale_by_keep: bool = True)
Bases: torch.nn.Module
Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
- Parameters:
drop_prob (float)
scale_by_keep (bool)
- fastdev.nn.point_transformer_v3.xyz2key(x: torch.Tensor, y: torch.Tensor, z: torch.Tensor, b: torch.Tensor | int | None = None, depth: int = 16)
Encodes x, y, z coordinates to the shuffled keys based on pre-computed look up tables. This function is much faster than a for-loop-based method.
- Parameters:
x (torch.Tensor) – The x coordinate.
y (torch.Tensor) – The y coordinate.
z (torch.Tensor) – The z coordinate.
b (torch.Tensor or int) – The batch index of the coordinates, which should be smaller than 32768. If b is a torch.Tensor, its size must be the same as that of x, y, and z.
depth (int) – The depth of the shuffled key, which must be smaller than 17 (< 17).
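For reference, the for-loop method that the look-up tables replace can be sketched in pure Python as plain bit interleaving (a Morton, or Z-order, key). The bit order within each triple (x highest, z lowest) and the placement of the batch index above bit 48 are assumptions made for this sketch. They are consistent with the documented limits (3 × 16 = 48 key bits, batch index below 2**15), but this page does not confirm them. `morton_key_loop` is a hypothetical name.

```python
def morton_key_loop(x, y, z, b=None, depth=16):
    """Interleave the low `depth` bits of x, y, z into one integer key."""
    key = 0
    for i in range(depth):
        mask = 1 << i
        # Bit i of each coordinate lands in the i-th group of three key bits.
        key |= (x & mask) << (2 * i + 2)  # assumed: x is the highest bit of the group
        key |= (y & mask) << (2 * i + 1)
        key |= (z & mask) << (2 * i)
    if b is not None:
        # Assumed layout: the batch index sits above the 48 coordinate bits.
        key |= b << 48
    return key
```

Sorting points by such a key groups points that are close in space, which is why a batch index in the top bits keeps samples from interleaving with each other.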
- fastdev.nn.point_transformer_v3.right_shift(binary, k=1, axis=-1)
Right shift an array of binary values.
- Parameters:
binary: An ndarray of binary values.
k: The number of bits to shift. Default 1.
axis: The axis along which to shift. Default -1.
- Returns:
An ndarray with zeros prepended and the ends truncated along the specified axis.
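The shift described above can be pictured on a single row of bits. This is a minimal pure-Python sketch using a list in place of an ndarray and ignoring the `axis` argument; `right_shift_bits` is a hypothetical name, not part of the module.

```python
def right_shift_bits(bits, k=1):
    """Shift a list of bits right by k places, padding with zeros on the left."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Prepend k zeros, then cut back to the original length so the
    # rightmost k bits fall off the end.
    return ([0] * k + list(bits))[: len(bits)]
```

The output always has the same length as the input, which mirrors the "zeros prepended, ends truncated" description.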
- fastdev.nn.point_transformer_v3.gray2binary(gray, axis=-1)
Convert an array of Gray codes back into binary values.
- Parameters:
gray: An ndarray of Gray codes.
axis: The axis along which to perform Gray decoding. Default -1.
- Returns:
An ndarray of binary values.
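Gray decoding is easiest to see on plain integers. The sketch below is an illustration only: the module's function works on arrays of bits along an axis, whereas these hypothetical helpers (`binary_to_gray`, `gray_to_binary`) operate on Python ints.

```python
def binary_to_gray(n):
    """Gray-encode a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(gray):
    """Gray-decode a non-negative integer."""
    # Each binary bit is the XOR of all Gray bits at or above its position,
    # so fold in every right-shifted copy of the code.
    binary = gray
    shift = 1
    while (gray >> shift) > 0:
        binary ^= gray >> shift
        shift += 1
    return binary
```

Decoding undoes encoding exactly, so `gray_to_binary(binary_to_gray(n))` returns `n` for any non-negative `n`.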
- fastdev.nn.point_transformer_v3.hilbert_encode_(locs, num_dims, num_bits)
Encode an array of locations in a hypercube into a Hilbert integer.
This is a vectorized-ish version of the Hilbert curve implementation by John Skilling as described in:
Skilling, J. (2004, April). Programming the Hilbert curve. In AIP Conference Proceedings (Vol. 707, No. 1, pp. 381-387). American Institute of Physics.
- Parameters:
locs: An ndarray of locations in a hypercube of num_dims dimensions, in which each dimension runs from 0 to 2**num_bits-1. The shape can be arbitrary, as long as the last dimension has size num_dims.
num_dims: The dimensionality of the hypercube. Integer.
num_bits: The number of bits for each dimension. Integer.
- Returns:
An ndarray of uint64 integers with the same shape as the input, excluding the last dimension, which must have size num_dims.
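To make the encoding concrete, here is a scalar, non-vectorized sketch of Skilling's axes-to-transpose transform for a single point, followed by bit interleaving. It is written from the published algorithm, not from this module's source, so its axis ordering may differ from what `hilbert_encode_` returns. `hilbert_index` is a hypothetical name.

```python
from itertools import product


def hilbert_index(point, num_bits):
    """Map one integer point to its Hilbert index (Skilling's transform)."""
    x = list(point)
    n = len(x)
    top = 1 << (num_bits - 1)

    # Inverse undo: walk from the top bit down, inverting or exchanging low bits.
    q = top
    while q > 1:
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p                # invert the low bits of x[0]
            else:
                t = (x[0] ^ x[i]) & p    # exchange the low bits of x[0] and x[i]
                x[0] ^= t
                x[i] ^= t
        q >>= 1

    # Gray encode across dimensions.
    for i in range(1, n):
        x[i] ^= x[i - 1]
    t = 0
    q = top
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    x = [v ^ t for v in x]

    # Interleave the transposed bits, most significant bit first.
    index = 0
    for bit in range(num_bits - 1, -1, -1):
        for i in range(n):
            index = (index << 1) | ((x[i] >> bit) & 1)
    return index


# Usage: visit a 4x4 grid in Hilbert order.
grid = list(product(range(4), repeat=2))
path = sorted(grid, key=lambda p: hilbert_index(p, 2))
```

Consecutive cells in `path` are always grid neighbors, which is the locality property that makes the Hilbert curve attractive for ordering points.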
- fastdev.nn.point_transformer_v3.z_order_encode(grid_coord: torch.Tensor, depth: int = 16)
- Parameters:
grid_coord (torch.Tensor)
depth (int)
- fastdev.nn.point_transformer_v3.hilbert_encode(grid_coord: torch.Tensor, depth: int = 16)
- Parameters:
grid_coord (torch.Tensor)
depth (int)
- class fastdev.nn.point_transformer_v3.Point(*args, **kwargs)
Bases: addict.Dict
Point Structure of Pointcept
A Point (point cloud) in Pointcept is a dictionary that contains various properties of a batched point cloud. Properties with the following names have specific definitions:
- “coord”: original coordinates of the point cloud;
- “grid_coord”: grid coordinates for a specific grid size (related to GridSampling).
Point also supports the following optional attributes:
- “offset”: if absent, initialized as if the batch size is 1;
- “batch”: if absent, initialized as if the batch size is 1;
- “feat”: features of the point cloud, the default input of the model;
- “grid_size”: grid size of the point cloud (related to GridSampling).
Related to Serialization:
- “serialized_depth”: depth of serialization; 2 ** depth * grid_size describes the maximum range of the point cloud;
- “serialized_code”: a list of serialization codes;
- “serialized_order”: a list of serialization orders determined by the codes;
- “serialized_inverse”: a list of inverse mappings determined by the codes.
Related to Sparsify (SpConv):
- “sparse_shape”: sparse shape for the Sparse Conv Tensor;
- “sparse_conv_feat”: SparseConvTensor initialized with information provided by Point.
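The relationship between “offset” and “batch” can be sketched as follows. This assumes the Pointcept convention that “offset” holds the cumulative number of points at the end of each sample and “batch” holds one sample index per point; this page does not spell that out, so treat it as an assumption. The helper names are hypothetical, and lists stand in for tensors.

```python
from itertools import accumulate


def offset_to_batch(offset):
    """Expand cumulative end positions into one sample index per point."""
    batch, start = [], 0
    for sample_index, end in enumerate(offset):
        batch.extend([sample_index] * (end - start))
        start = end
    return batch


def batch_to_offset(batch):
    """Collapse per-point sample indices into cumulative end positions."""
    if not batch:
        return []
    counts = [0] * (max(batch) + 1)
    for sample_index in batch:
        counts[sample_index] += 1
    return list(accumulate(counts))
```

Under this convention either attribute can be derived from the other, and a single-sample cloud of N points corresponds to an offset of `[N]` and a batch of N zeros.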
- serialization(order='z', depth=None, shuffle_orders=False)
Point Cloud Serialization
Relies on [“grid_coord” or “coord” + “grid_size”, “batch”, “feat”].
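How a serialization code yields an order and an inverse mapping can be sketched in a few lines. This is an illustration with plain lists and a hypothetical helper name (`order_and_inverse`); it assumes the order is simply the ascending sort of the codes.

```python
def order_and_inverse(codes):
    """Return (order, inverse) for a list of per-point serialization codes."""
    # order[i] is the index of the point that comes i-th along the curve.
    order = sorted(range(len(codes)), key=codes.__getitem__)
    # inverse[p] is the position of point p in the sorted sequence.
    inverse = [0] * len(codes)
    for position, point_index in enumerate(order):
        inverse[point_index] = position
    return order, inverse


# Usage: sort features along the curve, then restore the original layout.
codes = [19, 3, 42, 7]
feat = ["a", "b", "c", "d"]
order, inverse = order_and_inverse(codes)
sorted_feat = [feat[i] for i in order]
restored = [sorted_feat[j] for j in inverse]
```

Indexing with `order` arranges per-point data along the curve, and indexing the result with `inverse` puts it back in the original point order.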
- class fastdev.nn.point_transformer_v3.PointModule(*args, **kwargs)
Bases: torch.nn.Module
PointModule placeholder; all modules that subclass this will take a Point in PointSequential.
- class fastdev.nn.point_transformer_v3.PointSequential(*args, **kwargs)
Bases: PointModule
A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
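The dispatch idea behind such a container can be sketched without PyTorch. In this toy version, modules derived from a marker class receive the whole point dictionary, while any other callable receives only the “feat” entry. The second rule is an assumption made for illustration and is not stated on this page, and all class and function names below are hypothetical.

```python
class PointModuleSketch:
    """Marker base class: subclasses receive the whole point dictionary."""


class SequentialSketch:
    def __init__(self, *modules):
        self.modules = list(modules)  # applied in constructor order

    def __call__(self, point):
        for module in self.modules:
            if isinstance(module, PointModuleSketch):
                point = module(point)                  # sees every property
            else:
                point["feat"] = module(point["feat"])  # assumed: sees features only
        return point


class CenterCoords(PointModuleSketch):
    """Toy point-aware module: subtract the mean coordinate."""

    def __call__(self, point):
        mean = sum(point["coord"]) / len(point["coord"])
        point["coord"] = [c - mean for c in point["coord"]]
        return point


def double(feat):
    """Toy feature-only module."""
    return [2 * f for f in feat]


pipeline = SequentialSketch(CenterCoords(), double)
result = pipeline({"coord": [1.0, 2.0, 3.0], "feat": [1, 2, 3]})
```

The marker class lets point-aware and feature-only modules share one pipeline without each caller having to unpack the dictionary.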
- class fastdev.nn.point_transformer_v3.PDNorm(num_features, norm_layer, context_channels=256, conditions=('ScanNet', 'S3DIS', 'Structured3D'), decouple=True, adaptive=False)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.SerializedAttention(channels, num_heads, patch_size, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, order_index=0, enable_rpe=False, enable_flash=True, upcast_attention=True, upcast_softmax=True)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.MLP(in_channels, hidden_channels=None, out_channels=None, act_layer=nn.GELU, drop=0.0)
Bases: torch.nn.Module
- class fastdev.nn.point_transformer_v3.Block(channels, num_heads, patch_size=48, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, drop_path=0.0, norm_layer=nn.LayerNorm, act_layer=nn.GELU, pre_norm=True, order_index=0, cpe_indice_key=None, enable_rpe=False, enable_flash=True, upcast_attention=True, upcast_softmax=True)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.SerializedPooling(in_channels, out_channels, stride=2, norm_layer=None, act_layer=None, reduce='max', shuffle_orders=True, traceable=True)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.SerializedUnpooling(in_channels, skip_channels, out_channels, norm_layer=None, act_layer=None, traceable=False)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.Embedding(in_channels, embed_channels, norm_layer=None, act_layer=None)
Bases: PointModule
- class fastdev.nn.point_transformer_v3.PointTransformerV3(in_channels=6, order=('z', 'z-trans'), stride=(2, 2, 2, 2), enc_depths=(2, 2, 2, 6, 2), enc_channels=(32, 64, 128, 256, 512), enc_num_head=(2, 4, 8, 16, 32), enc_patch_size=(48, 48, 48, 48, 48), dec_depths=(2, 2, 2, 2), dec_channels=(64, 64, 128, 256), dec_num_head=(4, 4, 8, 16), dec_patch_size=(48, 48, 48, 48), mlp_ratio=4, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, drop_path=0.3, pre_norm=True, shuffle_orders=True, enable_rpe=False, enable_flash=True, upcast_attention=False, upcast_softmax=False, cls_mode=False, pdnorm_bn=False, pdnorm_ln=False, pdnorm_decouple=True, pdnorm_adaptive=False, pdnorm_affine=True, pdnorm_conditions=('ScanNet', 'S3DIS', 'Structured3D'))
Bases: PointModule
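The default tuples in the signature follow a visible pattern: five encoder stages, four strides between them, four decoder stages, and channel counts that divide evenly by the head counts. The sketch below checks a configuration against that pattern. The rules are inferred from the defaults shown above, not taken from the class's own validation code, and `check_stage_config` is a hypothetical helper.

```python
DEFAULTS = dict(
    stride=(2, 2, 2, 2),
    enc_depths=(2, 2, 2, 6, 2),
    enc_channels=(32, 64, 128, 256, 512),
    enc_num_head=(2, 4, 8, 16, 32),
    enc_patch_size=(48, 48, 48, 48, 48),
    dec_depths=(2, 2, 2, 2),
    dec_channels=(64, 64, 128, 256),
    dec_num_head=(4, 4, 8, 16),
    dec_patch_size=(48, 48, 48, 48),
)


def check_stage_config(cfg, cls_mode=False):
    """Return a list of problems found in a stage configuration."""
    problems = []
    num_stages = len(cfg["enc_depths"])
    # One downsampling stride between each pair of consecutive encoder stages.
    if len(cfg["stride"]) != num_stages - 1:
        problems.append("stride must have one entry fewer than enc_depths")
    for name in ("enc_channels", "enc_num_head", "enc_patch_size"):
        if len(cfg[name]) != num_stages:
            problems.append(f"{name} must have {num_stages} entries")
    prefixes = ["enc"]
    if not cls_mode:
        # Assumed: the decoder is only relevant outside classification mode.
        prefixes.append("dec")
        for name in ("dec_depths", "dec_channels", "dec_num_head", "dec_patch_size"):
            if len(cfg[name]) != num_stages - 1:
                problems.append(f"{name} must have {num_stages - 1} entries")
    for prefix in prefixes:
        pairs = zip(cfg[f"{prefix}_channels"], cfg[f"{prefix}_num_head"])
        for channels, heads in pairs:
            if channels % heads != 0:
                problems.append(f"{prefix}: {channels} channels not divisible by {heads} heads")
    return problems
```

Running a check like this before building the model can surface mismatched tuple lengths early, although the class itself remains the authority on what it accepts.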