Prosecution Insights
Last updated: April 19, 2026
Application No. 18/207,614

POINT CLOUD OPTIMIZATION USING INSTANCE SEGMENTATION

Status: Non-Final OA (§103, §112)
Filed: Jun 08, 2023
Examiner: STATZ, BENJAMIN TOM
Art Unit: 2611
Tech Center: 2600 (Communications)
Assignee: Tencent America LLC
OA Round: 3 (Non-Final)
Grant Probability: 0% (At Risk)
Projected OA Rounds: 3-4
Projected Time to Grant: 2y 9m
Grant Probability with Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 2 resolved; -62.0% vs Tech Center average)
Interview Lift: +0.0% (minimal lift among resolved cases with interview)
Typical Timeline: 2y 9m average prosecution; 33 applications currently pending
Career History: 35 total applications across all art units

Statute-Specific Performance

§101: 1.9% (-38.1% vs TC avg)
§103: 65.2% (+25.2% vs TC avg)
§102: 10.8% (-29.2% vs TC avg)
§112: 13.3% (-26.7% vs TC avg)

Tech Center averages are estimates; based on career data from 2 resolved cases.

Office Action

§103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant’s arguments with respect to independent claims 1, 14, and 22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 23 and 24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 23 and 24 recite the limitations “voxels stored in a Eulerian arrangement” and “voxels stored in a non-Eulerian arrangement”, respectively. Paragraph [0118] of the specification gives several examples of a “Eulerian voxel grid” and mentions the alternative of a “non-Eularian arrangement” [sic] of voxels, but does not explicitly define either term. These terms do not appear to be commonly understood in the art, and there are at least two distinct possibilities for the intended definition: a “Eulerian graph” as pertaining to traversing the edges of a graph, or a “Eulerian grid” as pertaining to tracking particle motion in fluid dynamics.
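To make the fluid-dynamics reading concrete, the sketch below bins points into a fixed ("Eulerian-style") grid of cells laid over space. This is purely illustrative of one possible interpretation; the function name and the uniform cell size are assumptions, not anything drawn from the specification.

```python
import numpy as np

def eulerian_voxelize(points, voxel_size):
    """One possible reading of a 'Eulerian arrangement': voxels are cells
    of a fixed grid laid over space, and each point is binned into
    whichever cell contains it (the grid exists independently of the
    points, as in Eulerian fluid-dynamics formulations)."""
    keys = np.floor(np.asarray(points) / voxel_size).astype(np.int64)
    grid = {}
    for key, point in zip(map(tuple, keys), points):
        grid.setdefault(key, []).append(point)
    return grid

# Toy cloud: four points binned into 1.0-unit cells.
pts = [[0.2, 0.1, 0.0],
       [0.8, 0.4, 0.3],   # falls in the same cell as the first point
       [1.5, 0.2, 0.1],
       [2.7, 2.9, 2.1]]
cells = eulerian_voxelize(pts, voxel_size=1.0)
print(len(cells))  # 3 occupied cells
```

By contrast, a "non-Eulerian" arrangement might store voxels attached to the points themselves rather than to a fixed grid, which is one way the boundary between the two claimed terms could remain unclear.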
Furthermore, it is unclear whether the provided examples can be used as definitions; i.e., “In an example, a Eulerian voxel grid is a cubic based voxel grid that is formed by cubic voxels with different sizes” cannot be assumed to mean that any cubic based voxel grid formed by cubic voxels with different sizes will teach the limitation of “voxels stored in a Eulerian arrangement”. Therefore, the limitations “voxels stored in a Eulerian arrangement” and “voxels stored in a non-Eulerian arrangement” are considered to be indefinite.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 14, 21, 22, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (“Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 8 (Aug 2022), pp. 5623-5637, https://doi.org/10.1109/TCSVT.2022.3145513; hereinafter “Zhao”) in view of Cheng et al. (CN 113870271 A, hereinafter "Cheng") and Bogoslavskyi et al. ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation", 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 163-169 (09-14 Oct 2016), https://doi.org/10.1109/IROS.2016.7759050; hereinafter "Bogoslavskyi").
Regarding claim 1, Zhao teaches: A method for point cloud processing, comprising: obtaining point cloud data corresponding to a point cloud in a three dimensional (3D) space (fig. 2, point cloud from LiDAR sensor is input into the encoder); projecting the point cloud in the 3D space to one or more two dimensional (2D) planes to generate one or more images (pg. 5626 col. 1, section III.A: “For the encoder as shown in Fig. 2 (top part), it consists of three stages: 1) projecting 3D point clouds to a set of 2D range images for conducting scene-aware object segmentation…”; fig. 2, first step in encoder: “3-D to 2-D projection”; fig. 4 shows an example of a 2D range image); inputting the one or more images into a logic model that is configured to generate pixel wise masks for respective object instances in the point cloud (pg. 5628 col. 1: “To label each pixel for the 2D range images obtained in Section III-B, a 2D semantic segmentation convolutional neural network (CNN) is exploited, which is a modified version of the RangeNet53 [43] by using its 2D processing part only to generate semantic labels for the 2D range images.”; pg. 5628 col. 1 also specifies that both semantic and instance segmentation are performed: “Furthermore, our algorithm is able to make further separation of the pixels within the same class according to the depth information; for example, a car at the foreground and the one on the background are considered as two different cars. Such scene-aware object segmentation is beneficial not only to the coding efficiency as a whole but also to the accuracy on the aspect of machine perception.”); generating the pixel wise masks for the respective object instances in the point cloud according to the one or more images and through the logic model (fig. 4(d) shows a pixel-wise segmentation performed on a range image: “a colored version of (c), on which each color corresponds to a semantic region (or label) generated by [43].”), a first pixel wise mask in the pixel wise masks comprising first pixels that are associated with a first object instance in the object instances in the point cloud, each pixel in the first pixel wise mask being associated with (i) semantic information and (ii) instance information of the first object instance (pg. 5627 col. 1: “stage 2 finally generates… (pixel-wise) labels (e.g., car A, car B, pedestrian A, pedestrian B, etc.)” – each label contains both semantic and instance information; pg. 5628 col. 1: “Furthermore, our algorithm is able to make further separation of the pixels within the same class according to the depth information; for example, a car at the foreground and the one on the background are considered as two different cars.”); and processing the point cloud based on the first pixel wise mask (pg. 5629, section III.D, “SPR Lossy Encoding”, describes how the labels generated via pixel-wise segmentation are encoded into a compressed point cloud; see fig. 5).

Zhao does not explicitly teach: processing the point cloud based on the first pixel wise mask, a portion of the point cloud corresponding to the first pixels in the first pixel wise mask being processed based on one or more processing parameters determined for the first object instance.

Cheng teaches: processing the point cloud based on the first pixel wise mask, a portion of the point cloud corresponding to the first pixels in the first pixel wise mask being processed based on one or more processing parameters determined for the first object instance ([n0043] “In this embodiment, different semantic regions correspond to different compression rates. For semantic regions with a lot of detail (such as face, hands, desktop and window, etc.), a lower compression rate can be used, while for semantic regions with a lot of detail (such as torso, limbs, walls and ground), a higher compression rate can be used.”; [n0050] “Step 130: Compress the point clouds in each 3D semantic region based on the compression ratio to obtain compressed RGBD data.” – the compression ratio corresponds to the claimed “processing parameter”, and the point cloud is compressed based on the compression ratio determined for each segmented object).

Zhao and Cheng are analogous to the claimed invention because they are in the same field of point cloud segmentation and compression. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao with the teachings of Cheng to adaptively compress a point cloud based on the semantic regions detected during segmentation. The motivation would have been to leave the higher detail regions less compressed, which would help to reduce the size (and consequently the memory/bandwidth requirements) of a point cloud as much as possible while still preserving necessary detail and “ensuring the quality of point cloud reconstruction”, as taught by Cheng ([n0003] to [n0004]).

The combination of Zhao in view of Cheng does not explicitly teach: generating pixel wise masks through a non neural network based logic model.

Bogoslavskyi teaches: generating pixel wise masks through a non neural network based logic model (fig. 4 shows the clustering method of segmentation taught by Bogoslavskyi, based on angles relative to different groups of points; fig. 2C shows the generated pixel wise mask: “We build up a range image not considering points lying on the ground plane and perform the segmentation in the range image directly.”).
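The angle-based criterion that Bogoslavskyi applies to range images can be illustrated with a simplified flood-fill sketch. The 4-connectivity, the fixed angular step, and the 10-degree threshold below are simplifying assumptions for illustration, not the paper's exact algorithm or parameters.

```python
import math
from collections import deque

def angle_between(d1, d2, alpha):
    """Angle of the line through two neighboring range measurements,
    relative to the laser ray: a large angle suggests a locally smooth
    surface (same object), a small one a depth discontinuity."""
    da, db = max(d1, d2), min(d1, d2)
    return math.atan2(db * math.sin(alpha), da - db * math.cos(alpha))

def segment_range_image(depth, alpha=math.radians(1.0), theta=math.radians(10.0)):
    """Flood-fill labeling of a 2D range image: 4-connected neighbors join
    the same cluster when the angle criterion exceeds threshold theta."""
    rows, cols = len(depth), len(depth[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] or depth[r0][c0] <= 0:
                continue  # already labeled, or no return at this pixel
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rn, cn = r + dr, c + dc
                    if 0 <= rn < rows and 0 <= cn < cols \
                            and not labels[rn][cn] and depth[rn][cn] > 0 \
                            and angle_between(depth[r][c], depth[rn][cn], alpha) > theta:
                        labels[rn][cn] = next_label
                        queue.append((rn, cn))
    return labels
```

A large depth jump between neighboring pixels yields a small angle and splits the clusters, while smoothly varying depth yields a large angle and merges them; no neural network is involved, which is what keeps the method fast enough to run online.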
Bogoslavskyi and the combination of Zhao in view of Cheng are analogous to the claimed invention because they are in the same field of point cloud segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng with the teachings of Bogoslavskyi to substitute the original method of range image segmentation with the clustering method of Bogoslavskyi. The motivation would have been to use a method that is fast, has low computational demands, and runs online (Bogoslavskyi, abstract).

Regarding claim 3, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, wherein the point cloud comprises points representing a person, and the generating the pixel wise masks further comprises: generating the first pixel wise mask that includes a plurality of sub masks respectively associated with facial elements and body elements of the person (Cheng [n0040] “In this embodiment, after obtaining the target object image and the background image, the target object parts are segmented using artificial intelligence (AI) technology to obtain multiple 3D target object parts. For example, if the target object is the human body, the human body is divided into parts such as the head, hands, torso, and limbs.”), the plurality of sub masks having metadata indicative of the facial elements and the body elements (Zhao pg. 5626 col. 2: “The 2D range images generated in stage 1 are subjected to conduct object segmentation in stage 2, i.e., assigning an object label to each pixel of the range image.”; the object labels can be considered metadata).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the additional teachings of Cheng to segment images of humans into higher detail regions (such as the head) and lower detail regions (such as the torso). The motivation would have been to apply the adaptive compression method taught by Cheng to leave the higher detail regions less compressed, which would help to reduce the size (and consequently the memory/bandwidth requirements) of a point cloud as much as possible while still preserving necessary detail and “ensuring the quality of point cloud reconstruction”, as taught by Cheng ([n0003] to [n0004]).

Regarding claim 5, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, wherein the processing the point cloud further comprises: determining first parameters for voxelating the first object instance (Cheng [n0044] “In this embodiment, the compression ratio corresponding to each 3D semantic region can be obtained by: establishing a correspondence between semantic information and compression ratio; and determining the compression ratio corresponding to each 3D semantic region based on the correspondence.”); and voxelating the portion of the point cloud corresponding to the first pixels in the first pixel wise mask according to the first parameters (Cheng [n0053] “Specifically, the process of compressing each 3D semantic region based on the compression ratio to obtain compressed RGBD data can be as follows: for each 3D semantic region, the 3D semantic region is divided into voxel grids according to the compression ratio to obtain multiple voxel grids; downsampling is performed on the multiple voxel grids to obtain the compressed semantic region.”; [n0054] “The compression ratio is proportional to the size of the voxel mesh. That is, the higher the compression ratio, the larger the voxel mesh size; the lower the compression ratio, the smaller the voxel mesh size.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the additional teachings of Cheng to apply a method of adaptively compressing the point cloud based on semantic regions. The motivation would have been to reduce the size (and consequently the memory/bandwidth requirements) of a point cloud as much as possible while still preserving necessary detail and “ensuring the quality of point cloud reconstruction”, as taught by Cheng ([n0003] to [n0004]).

Regarding claim 14, it is rejected using the same references, rationale, and motivations to combine as claim 1 because its limitations substantially correspond to the limitations of claim 1.

Regarding claim 21, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, further comprising: determining the first pixel wise mask from multiple virtual camera views (Zhao pg. 5627, section III.B, “3D to 2D Projection: Range Image Generation”, discusses the generation of 100 separate range images projected into 2D from the input point cloud, each of which is labeled per pixel as shown in fig. 4 and contributes to the full segmentation).

Regarding claim 22, most of its limitations substantially correspond to the limitations of claim 1 and are therefore rejected with the same references, rationale, and motivations to combine as claim 1, with the exception of the following limitations: a non-transitory computer-readable storage medium storing instructions which, when executed by a processor, cause the processor to perform an encoding method (Cheng [n0087]), and transmitting a bitstream indicating the processed portion of the point cloud (Zhao pg. 5629 col. 2: “Finally, three categories of encoded SPR data are encapsulated to generate the final bitstream for storage and/or transmission.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the additional teachings of Cheng to use standard methods of storing and executing programs on standard computational hardware, for purposes of convenience and compatibility.

Regarding claim 26, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, wherein metadata associated with the first pixel wise mask indicate one or more of a pose of a virtual camera associated with the one or more 2D planes, a field-of-view of the virtual camera, a size of a virtual sensor in the virtual camera, a resolution of the virtual camera, and an aspect ratio of the virtual camera (Zhao pg. 5627, section III.B, “3D to 2D Projection: Range Image Generation”, discusses several parameters used when projecting the point cloud to each of the 2D range images that are subjected to pixel-wise segmentation, including the pitch, width, and height; pg. 5630, section III.F, “SPR Decoding”, describes the use of these parameters to reconstruct the original point cloud by specifying the positioning of the perspective of each range image).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Zhao ("Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation") in view of Cheng (CN 113870271 A) and Bogoslavskyi ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation") as applied to claim 1 above, and further in view of Xie et al. (US 20200082207 A1, hereinafter "Xie").

Regarding claim 4, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, wherein the projecting the point cloud further comprises: determining respective parameters of one or more virtual cameras associated with the one or more 2D planes for a projection of the point cloud, the one or more virtual cameras including a first virtual camera, the parameters of the first virtual camera including a pose and an orientation of the first virtual camera (Zhao pg. 5627, section III.B, “3D to 2D Projection: Range Image Generation”, discusses several parameters used when projecting the point cloud to each of the 2D range images that are subjected to pixel-wise segmentation, including the pitch, width, and height).

The combination of Zhao in view of Cheng and Bogoslavskyi does not explicitly teach: determining respective parameters of one or more virtual cameras associated with the one or more 2D planes for a projection of the point cloud according to pretrained data.

Xie teaches: determining respective parameters of one or more virtual cameras associated with the one or more 2D planes for a projection of the point cloud ([0045] “At block 330, each point cloud target region is projected onto a scene image according to parameters of the laser radar and the imager, to determine a respective image target region associated with each point cloud target region.”; [0046] “The parameters may include a posture (such as a position and an orientation) and the like.”) according to pretrained data ([0029] “Both the point cloud feature extraction model and the image feature extraction model may be obtained by training a convolutional neural network in advance.”; [0033] explains that the target regions are identified using the aforementioned models).

Xie and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud processing.
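The role that a virtual camera's pose and orientation play in projecting a point cloud onto a 2D plane can be seen in a minimal pinhole-style sketch. The camera model, parameter names, and values below are illustrative assumptions; Zhao's spherical range-image projection and Xie's radar/imager calibration use their own, more involved geometry.

```python
import numpy as np

def project_to_plane(points, cam_pos, cam_rot, focal, width, height):
    """Minimal pinhole projection: move world points into the camera frame
    (pose = position cam_pos + orientation cam_rot), then project each
    point onto a width x height image plane, keeping its depth."""
    cam_pts = (np.asarray(points) - cam_pos) @ cam_rot.T  # world -> camera frame
    cam_pts = cam_pts[cam_pts[:, 2] > 0]                  # keep points in front of camera
    u = focal * cam_pts[:, 0] / cam_pts[:, 2] + width / 2
    v = focal * cam_pts[:, 1] / cam_pts[:, 2] + height / 2
    return np.stack([u, v, cam_pts[:, 2]], axis=1)        # (u, v, depth) per point

# Camera at the origin looking down +z (identity orientation).
pix = project_to_plane([[0.0, 0.0, 4.0], [1.0, 0.0, 2.0]],
                       cam_pos=np.zeros(3), cam_rot=np.eye(3),
                       focal=100.0, width=64, height=64)
# A point on the optical axis lands at the image center (32, 32).
```

Changing `cam_pos` or `cam_rot` yields a different 2D view of the same cloud, which is the sense in which multiple virtual camera views (claim 21) or learned camera parameters (claim 4) produce different projected images.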
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the teachings of Xie to use a pretrained machine learning algorithm to select angles that focus on objects in the field of view. The motivation would have been to improve the results of object recognition (as taught by Xie) or, analogously, segmentation.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Zhao ("Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation") in view of Cheng (CN 113870271 A) and Bogoslavskyi ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation") as applied to claim 1 above, and further in view of Wu et al. (“3D scene graph prediction from point clouds”, Virtual Reality & Intelligent Hardware, vol. 4, issue 1 (Feb 2022), pp. 76-88, https://doi.org/10.1016/j.vrih.2022.01.005; hereinafter “Wu”).

Regarding claim 6, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, but does not explicitly teach: wherein the processing the point cloud further comprises: generating a scene graph associated with the point cloud based on the first pixel wise mask, the scene graph including at least a first scene element identifying the first object instance.

Wu teaches: generating a scene graph associated with the point cloud based on the pixel wise mask, the scene graph including at least a first scene element identifying the first object instance (pg. 78 “Given the input point cloud, P, and the class-agnostic instance segmentation, M, of a scene s, indicating that the point cloud has instance segmentation labels without specific semantic categories, our aim is to generate the corresponding scene graph, which is a graph topology structure composed of the category label of each instance and the label of the relationship between the categories” – each object instance is identified, which must necessarily include the claimed ‘first object instance’).

Wu and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud processing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the teachings of Wu to generate a scene graph from the segmented point cloud. The motivation would have been to better understand the overarching organization of the environment being observed, rather than simply labeling unassociated classes of objects.

Claims 7-11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao ("Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation") in view of Cheng (CN 113870271 A) and Bogoslavskyi ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation") as applied to claim 1 above, and further in view of Zhang et al. (WO 2021112953 A1, hereinafter "Zhang").

Regarding claim 7, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, but does not explicitly teach: wherein the processing the point cloud further comprises: processing the point cloud with the first pixel wise mask by a video based point cloud compression (V-PCC) system.

Zhang teaches: wherein the processing the point cloud further comprises: processing the point cloud with the first pixel wise mask by a video based point cloud compression (V-PCC) system (para. 0030, para. 0043). Zhang and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud compression. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the teachings of Zhang to use an established standard for point cloud compression. The motivation would have been to improve compatibility with other devices and systems.

Regarding claim 8, the combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang teaches: The method of claim 7, further comprising: dividing the point cloud into a plurality of segments according to the first pixel wise mask having a plurality of sub masks corresponding to the plurality of segments (Zhao fig. 4 shows the point cloud divided into segments according to the categories of the segmentation map); packing the plurality of segments respectively into geometry maps (Zhang para. 0045 “In some examples, the V-PCC encoder (300) can convert 3D point cloud frames into geometry images, texture images and occupancy maps, and then use video coding techniques to encode the geometry images, texture images and occupancy maps into a bitstream. Generally, a geometry image is a 2D image with pixels filled with geometry values associated with points projected to the pixels, and a pixel filled with a geometry value can be referred to as a geometry sample”; Lee para. 0033-0038 teaches processing a segment of a point cloud separately from the rest); and encoding the geometry maps into respective sub streams (Zhang para. 0045 “In some examples, the V-PCC encoder (300) can convert 3D point cloud frames into geometry images, texture images and occupancy maps, and then use video coding techniques to encode the geometry images, texture images and occupancy maps into a bitstream”; Cheng [n0043] teaches processing a segment of a point cloud separately from the rest).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the additional teachings of Zhang to adapt the system of adaptively compressing certain semantic regions more or less than others to fit the standardized V-PCC system of Zhang. The motivation would have been to use an established standard for point cloud compression to improve compatibility with other devices and systems.

Regarding claim 9, the combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang teaches: The method of claim 8, further comprising: generating 2D patches respectively for the plurality of segments based on the first pixel wise mask (Zhang [0046] “The patch generation module (306) segments a point cloud into a set of patches”; Cheng [n0043] teaches processing a segment of a point cloud separately from the rest), a 2D patch for a segment including geometry information (Zhang [0046] “The patch generation module (306) segments a point cloud into a set of patches (e.g., a patch is defined as a contiguous subset of the surface described by the point cloud), which may be overlapping or not, such that each patch may be described by a depth field with respect to a plane in 2D space”) and semantic information (Zhao fig. 5 and pg. 5629 col. 1, semantic labels are included in encoding for point cloud compression, where label categories are represented as unsigned integers) of the 2D patch.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang with the additional teachings of Zhao to include semantic category labels in the patches during the process of compression. The motivation would have been to store the findings of the segmentation map (taught by Zhao) during the standardized point cloud compression process (taught by Zhang), so as not to lose that information.

Regarding claim 10, the combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang teaches: The method of claim 7, further comprising: determining a first quantization parameter for encoding the portion of the point cloud based on the first pixel wise mask (Cheng [n0043] “For semantic regions with a lot of detail (such as face, hands, desktop and window, etc.), a lower compression rate can be used”), determining a second quantization parameter for encoding a second portion of the point cloud based on a second pixel wise mask in the pixel wise masks, the second pixel wise mask including second pixels that are associated with a second object instance in the object instances (Cheng [n0043] “while for semantic regions with a lot of detail (such as torso, limbs, walls and ground), a higher compression rate can be used.” – this translation appears to contain a typo; one of ordinary skill would understand that the intent was “regions without a lot of detail” in comparison to the previous group), and when the first object instance includes a human face of a person and the second object instance includes a portion of the person that is not the human face, the first quantization parameter is less than the second quantization parameter (Cheng [n0043]: the first group includes “face” and “hands”, the second group includes “torso” and “limbs”; the compression rate for the first group is stated to be lower than the compression rate for the second group).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang with the additional teachings of Cheng to adaptively apply a lower compression to higher-detailed objects and vice versa. The motivation would have been to reduce the size (and consequently the memory/bandwidth requirements) of a point cloud as much as possible while still preserving necessary detail and “ensuring the quality of point cloud reconstruction”, as taught by Cheng ([n0003] to [n0004]).

Regarding claim 11, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, but does not explicitly teach: wherein the processing the point cloud further comprises: processing the point cloud with the first pixel wise mask by a geometry based point cloud compression (G-PCC) system. Zhang teaches: wherein the processing the point cloud further comprises: processing the point cloud with the first pixel wise mask by a geometry based point cloud compression (G-PCC) system (para. 0030, para. 0100). Zhang and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud compression. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the teachings of Zhang to use an established standard for point cloud compression. The motivation would have been to improve compatibility with other devices and systems.
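The adaptive parameter selection underlying the claim 10 mapping (a smaller quantization parameter for high-detail instances such as faces and hands, a larger one otherwise) reduces to a per-instance lookup. The label names and QP values in this sketch are illustrative assumptions only, not values taken from Cheng or Zhang.

```python
# Hypothetical high-detail label set; a real system would derive this
# from the semantic segmentation's own label vocabulary.
HIGH_DETAIL = {"face", "hands"}

def quantization_parameter(label, fine_qp=22, coarse_qp=38):
    """Return a smaller QP (finer quantization, less compression) for
    high-detail instances and a larger QP for everything else."""
    return fine_qp if label in HIGH_DETAIL else coarse_qp

def assign_qps(instance_labels):
    """Map each segmented object instance id to its encoder QP."""
    return {inst: quantization_parameter(label)
            for inst, label in instance_labels.items()}

qps = assign_qps({"inst_0": "face", "inst_1": "torso", "inst_2": "hands"})
# inst_0 and inst_2 receive the smaller QP; inst_1 receives the larger one.
```

In a V-PCC or G-PCC pipeline, the resulting per-instance QPs would then drive the encoder for each region, which is the sense in which the face portion is quantized more finely than the rest of the person.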
Regarding claim 13, the combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang teaches: The method of claim 11, further comprising: determining geometry quantization parameters for octree partitioning (Zhang [0141] discusses a parameter for the maximum octree partition depth; [0101] “The octree encoding module (730) is configured to receive filtered positions from the duplicated points removal module (712), and perform an octree-based encoding process to generate a sequence of occupancy codes dial describe a 3D grid of voxels.”) based on the first pixel wise mask (Cheng describes adjusting the size of a voxel grid based on information from a 2D segmentation of a point cloud: [n0044] “In this embodiment, the compression ratio corresponding to each 3D semantic region can be obtained by: establishing a correspondence between semantic information and compression ratio; and determining the compression ratio corresponding to each 3D semantic region based on the correspondence.”; [n0054] “The compression ratio is proportional to the size of the voxel mesh. That is, the higher the compression ratio, the larger the voxel mesh size; the lower the compression ratio, the smaller the voxel mesh size.”); and performing the octree partitioning based on the geometry quantization parameters (Zhang fig. 9-10, [0116] to [0118]). Zhang and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud compression. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang with the additional teachings of Zhang to adapt the system of adaptively compressing certain semantic regions more or less than others to fit the standardized G-PCC system of Zhang. 
The motivation would have been to use an established standard for point cloud compression to improve compatibility with other devices and systems.

Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zhao ("Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation") in view of Cheng (CN 113870271 A) and Bogoslavskyi ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation") and further in view of Zhang (WO 2021112953 A1) as applied to claim 11 above, and further in view of Jeppsson et al. (“Efficient Live and on-Demand Tiled HEVC 360 VR Video Streaming.” 2018 IEEE International Symposium on Multimedia (ISM) (Dec 2018), pp. 81-88. https://doi.org/10.1109/ISM.2018.00022; hereinafter “Jeppsson”).

Regarding claim 12, the combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang teaches: The method of claim 11, further comprising: dividing the point cloud into multiple slices based on the first pixel wise mask (Zhang para. 0068 and 0144 refer to subdivisions of a compressed point cloud bitstream as “slices”; Cheng describes adjusting the size of subdivisions of a point cloud based on information from a 2D segmentation: [n0044] “In this embodiment, the compression ratio corresponding to each 3D semantic region can be obtained by: establishing a correspondence between semantic information and compression ratio; and determining the compression ratio corresponding to each 3D semantic region based on the correspondence.”; [n0054] “The compression ratio is proportional to the size of the voxel mesh. That is, the higher the compression ratio, the larger the voxel mesh size; the lower the compression ratio, the smaller the voxel mesh size.”); determining encoder parameters respectively for the multiple slices based on respective characteristics (Cheng [n0043] describes selecting different compression ratios depending on the semantic contents of a point cloud region).
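The slice arrangement at issue in claim 12, dividing the cloud by semantic label and choosing encoder parameters per slice, could be sketched as follows. zlib is used purely as a stand-in for a real G-PCC encoder, and the per-label levels are hypothetical; none of this is the claimed or cited implementation.

```python
import zlib
import numpy as np

# Hypothetical per-slice encoder parameters: lower zlib level stands in for
# "less compression" on high-detail regions.
ENCODER_LEVEL = {"face": 1, "torso": 9}

def encode_slices(points, labels):
    """Split points into slices by semantic label; encode each slice into its
    own substream with slice-specific parameters."""
    substreams = {}
    for label in np.unique(labels):
        slice_pts = points[labels == label]
        level = ENCODER_LEVEL.get(label, 6)  # default level for other labels
        substreams[label] = zlib.compress(slice_pts.tobytes(), level)
    return substreams

def decode_slice(substream, dtype=np.float64):
    """Recover one slice's (N, 3) points from its substream."""
    return np.frombuffer(zlib.decompress(substream), dtype=dtype).reshape(-1, 3)
```

Each slice round-trips independently of the others, which is the one-substream-per-tile property the examiner later maps onto Jeppsson's tiled HEVC streams.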
The combination of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang does not explicitly teach: encoding respectively the multiple slices into respective sub streams based on the encoder parameters.

Jeppsson teaches: encoding respectively the multiple slices into respective sub streams based on the encoder parameters (fig. 2 shows high-quality and low-quality tiles being encoded separately, pg. 83 col. 2 “Each stream has one substream (also called a track) per tile…”). The invention of Jeppsson is considered to be pertinent to the problem faced by the claimed invention because it deals with the issue of encoding a feed of visual data. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi and further in view of Zhang with the teachings of Jeppsson to add the functionality of encoding each section of the point cloud with unique parameters into its own substream for the purpose of organizing the bitstream data.

Claim(s) 25 is/are rejected under 35 U.S.C. 103 as being unpatentable over Zhao ("Real-Time Scene-Aware LiDAR Point Cloud Compression Using Semantic Prior Representation") in view of Cheng (CN 113870271 A) and Bogoslavskyi ("Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation") as applied to claim 1 above, and further in view of Lim et al. (US 20190318488 A1, hereinafter "Lim").

Regarding claim 25, the combination of Zhao in view of Cheng and Bogoslavskyi teaches: The method of claim 1, but does not explicitly teach: wherein the pixel wise masks are used as inputs to a grid of non-cubic voxels.
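A grid of non-cubic voxels of the kind claim 25 recites can be sketched as a voxelization whose extent differs per axis (a cuboid cell rather than a cube). The dimensions, the grouping scheme, and the function name are illustrative assumptions, not the claimed method or Lim's disclosure.

```python
import numpy as np

def voxelize_noncubic(points, voxel_dims=(1.0, 1.0, 2.0)):
    """Group points by cuboid voxel index; returns {index: (M, 3) array}.

    voxel_dims gives a per-axis cell size, so cells need not be cubes;
    the (1, 1, 2) default is an arbitrary example of a cuboid voxel.
    """
    idx = np.floor(points / np.asarray(voxel_dims)).astype(int)
    groups = {}
    for key, pt in zip(map(tuple, idx), points):
        groups.setdefault(key, []).append(pt)
    return {k: np.array(v) for k, v in groups.items()}
```

In a full pipeline of the kind described above, the per-voxel groups would then be compressed with parameters chosen from the pixel-wise mask, but this sketch stops at the non-cubic partitioning itself.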
Lim teaches: wherein the pixel wise masks are used as inputs to a grid of non-cubic voxels ([0075] “For example, the segmenting engine 414 divides a 3D point cloud into multiple mutually exclusive subsets and correlates each point with a subset, such that each point of a 3D point cloud belongs of a particular subset.”; [0076] “In certain embodiments, each subset can correspond to a 3D spatial region. A 3D spatial region is an area in 3D space where points within a particular subset are located. In certain embodiments, the 3D spatial region can be any shape. For example, the shape can be amorphous and form around the points within the 3D spatial region. In another example, the shape can be a circle or sphere. In another example, the shape can be a cuboid. FIG. 7B below illustrates a 3D spatial region resembling a cuboid. In another example, the shape can be a conical or cylindrical. Other shapes can be appreciated by those skilled in the art.”).

Lim and the combination of Zhao in view of Cheng and Bogoslavskyi are analogous to the claimed invention because they are in the same field of point cloud compression. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Zhao in view of Cheng and Bogoslavskyi with the teachings of Lim to be able to vary the shape of voxels when compressing a point cloud. The motivation would have been to adapt to the location of points within the cloud and “form around the points within the 3D spatial region” ([0076]).

References Cited

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Sandri et al. ("Point Cloud Compression Incorporating Region of Interest Coding". 2019 IEEE International Conference on Image Processing (ICIP) (22-25 Sep 2019).
https://doi.org/10.1109/ICIP.2019.8803553) teaches a method of point cloud compression that incorporates the following: performing segmentation on a point cloud by projecting it to multiple 2D images; detecting facial and non-facial regions of a person; dividing the point cloud into voxels; compressing the voxels containing facial regions less than the voxels without facial regions.

Sirohi et al. ("EfficientLPS: Efficient LiDAR Panoptic Segmentation". IEEE Transactions on Robotics, vol. 38, no. 3 (17 Nov 2021), pp. 1894-1914. https://ieeexplore.ieee.org/abstract/document/9617736) teaches a method of panoptic segmentation of point clouds based on 2D projection that generates separate pixel-wise masks for both semantic and instance segmentation.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN STATZ whose telephone number is (571)272-6654. The examiner can normally be reached Mon-Fri 8am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard, can be reached at (571)272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BENJAMIN TOM STATZ/
Examiner, Art Unit 2611

/TAMMY PAIGE GODDARD/
Supervisory Patent Examiner, Art Unit 2611

Prosecution Timeline

Jun 08, 2023
Application Filed
Jun 05, 2025
Non-Final Rejection — §103, §112
Aug 29, 2025
Applicant Interview (Telephonic)
Sep 02, 2025
Examiner Interview Summary
Sep 10, 2025
Response Filed
Nov 01, 2025
Final Rejection — §103, §112
Dec 11, 2025
Interview Requested
Jan 05, 2026
Response after Non-Final Action
Jan 27, 2026
Request for Continued Examination
Feb 02, 2026
Response after Non-Final Action
Feb 20, 2026
Non-Final Rejection — §103, §112
Mar 24, 2026
Examiner Interview Summary
Mar 24, 2026
Applicant Interview (Telephonic)

Prosecution Projections

3-4
Expected OA Rounds
0%
Grant Probability
0%
With Interview (+0.0%)
2y 9m
Median Time to Grant
High
PTA Risk
Based on 2 resolved cases by this examiner. Grant probability derived from career allow rate.
