DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claims 1 and 19 recite feature map decoding: obtaining a bitstream of a to-be-decoded feature map comprising a plurality of feature elements; obtaining a first probability estimation result corresponding to each of the plurality of feature elements based on the bitstream, to produce a plurality of first probability estimation results, each comprising a first peak probability; determining a set of first feature elements and a set of second feature elements from the plurality of feature elements based on a first threshold and the first peak probabilities of the estimation results; and obtaining a decoded feature map based on the first and second feature elements.
The claims recite:
“obtaining a first probability estimation result…to produce a plurality of first probability estimation results … compris[ing] a first peak probability”: the specification of the current application describes that “the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution”; this pertains to the mathematical concepts grouping of abstract ideas.
“determining a set of first feature elements and a set of second feature elements … based on a first threshold and the first peak probabilities”: this also pertains to mathematical concepts, and obtaining a decoded feature map based on the first and second sets of feature elements likewise employs mathematical concepts.
This judicial exception is not integrated into a practical application because the remaining claim elements recite “obtaining a bitstream of a to-be-decoded feature map,” which amounts to data gathering and mere instructions to apply an exception, since the claim applies the mathematical concepts to obtain a desired result without specifying a particular application. See MPEP 2106.05(f).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, as disclosed, do not integrate the judicial exception into a practical application; they are mere instructions to apply an exception. The claim is not patent eligible.
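For illustration only, the threshold-based determining step characterized above as a mathematical concept can be sketched as follows. This is a hypothetical sketch; the function name, indices, and values are illustrative and are not part of the claims or the record.

```python
# Hypothetical sketch of the claimed determining step: partitioning feature
# elements by comparing each element's first peak probability to a first
# threshold. All names and values are illustrative only.

def partition_elements(peak_probabilities, first_threshold):
    """Split element indices into a 'first' set (peak <= threshold) and a
    'second' set (peak > threshold), mirroring the claimed partition."""
    first_set, second_set = [], []
    for idx, peak in enumerate(peak_probabilities):
        if peak <= first_threshold:
            first_set.append(idx)
        else:
            second_set.append(idx)
    return first_set, second_set

# Example: four feature elements with estimated first peak probabilities.
first, second = partition_elements([0.2, 0.9, 0.5, 0.95], first_threshold=0.8)
```

As the sketch shows, the step reduces to elementwise numerical comparison, which is why it is characterized as a mathematical concept.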
Claim 2 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 2 recites the same abstract idea as claim 1. The claim recites the additional limitation of “the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution; or the first probability estimation result is a mixed Gaussian distribution comprising a plurality of Gaussian distributions, and the first peak probability is a largest value in mean probabilities of the plurality of Gaussian distributions, or the first peak probability is calculated based on mean probabilities of the plurality of Gaussian distributions and weights of the plurality of Gaussian distributions in the mixed Gaussian distribution”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
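For illustration only, the two alternative calculations recited in claim 2 can be sketched as follows. This is a hypothetical sketch; the function names, mean probabilities, and weights are illustrative and are not drawn from the claims or the specification.

```python
# Hypothetical sketch of the claim 2 alternatives for a mixed Gaussian
# distribution: the first peak probability is either the largest mean
# probability among the component Gaussians, or a value calculated from the
# mean probabilities and the mixture weights. Numbers are illustrative only.

def peak_probability_max(mean_probs):
    """Largest value among the mean probabilities of the component Gaussians."""
    return max(mean_probs)

def peak_probability_weighted(mean_probs, weights):
    """Peak probability calculated from mean probabilities and mixture weights."""
    return sum(w * p for w, p in zip(weights, mean_probs))

means = [0.6, 0.3, 0.8]    # mean probabilities of three component Gaussians
weights = [0.5, 0.3, 0.2]  # mixture weights summing to 1
p_max = peak_probability_max(means)
p_weighted = peak_probability_weighted(means, weights)
```

Both alternatives are direct numerical calculations, consistent with the characterization above as an additional mathematical calculation.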
Claim 3 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 3 recites the same abstract idea as claim 1. The claim recites the additional limitation of “wherein a value of the decoded feature map comprises numerical values of all first feature elements in the set of first feature elements and numerical values of all second feature elements in the set of second feature elements”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
Claim 4 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 4 recites the same abstract idea as claim 1. The claim recites the additional limitation of “wherein the set of first feature elements is an empty set, or the set of second feature elements is an empty set”, which merely elaborates on the abstract idea, therefore, does not amount to significantly more than the abstract idea.
Claim 5 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 5 recites the same abstract idea as claim 1. The claim recites the additional limitation of “the first probability estimation result further comprises a feature value corresponding to the first peak probability; and the method further comprises: performing entropy decoding on the first feature elements based on first probability estimation results corresponding to the first feature elements, to obtain the numerical values of the first feature elements; and obtaining the numerical values of the second feature elements based on feature values corresponding to first peak probabilities of the second feature elements”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
Claim 6 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 6 recites the same abstract idea as claim 1. The claim recites the additional limitation of “further comprising: before determining the set of first feature elements and the set of second feature elements from the plurality of feature elements obtaining the first threshold based on the bitstream of the to-be-decoded feature map”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
Claim 7 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 7 recites the same abstract idea as claim 1. The claim recites the additional limitation of “wherein a first peak probability of a first feature element from the set of first feature elements is less than or equal to the first threshold, and a first peak probability of a second feature element from the set of second feature elements is greater than the first threshold”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
Claim 8 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 8 recites the same abstract idea as claim 1. The claim recites the additional limitation of “obtaining the first probability estimation result corresponding to each of the plurality of feature elements obtaining side information corresponding to the to-be-decoded feature map based on the bitstream of the to-be-decoded feature map; and obtaining the first probability estimation result corresponding to each feature element based on the side information”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
Claim 9 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 9 recites the same abstract idea as claim 1. The claim recites the additional limitation of “obtaining the first probability estimation result corresponding to each of the plurality of feature elements obtaining side information corresponding to the to-be-decoded feature map based on the bitstream of the to-be-decoded feature map; and estimating the first probability estimation result of each feature element for each feature element in the to-be-decoded feature map based on the side information and first context information, wherein the first context information is a feature element that corresponds to the feature element and that is in a preset region range in the to-be-decoded feature map”, which merely elaborates on the abstract idea by further specifying an additional mathematical calculation and, therefore, does not amount to significantly more than the abstract idea.
With respect to claims 10-18 and 20, the same § 101 rejection analysis applied above to the decoding of claims 1-9 and 19 applies to the encoding of claims 10-18 and 20.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 10, 12-17, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tarashima (US 2024/0046645 A1).
With respect to Claim 10, Tarashima’645 shows a method of feature map encoding (Figure 3 frame input to encoder-decoder backbone 1, paragraph [0042]), comprising:
obtaining a first to-be-encoded feature map (Figure 3 frame input to encoder-decoder backbone 1, paragraph [0042]) comprising a plurality of feature elements (Figure 3 encoder-decoder backbone 1 extracts a feature map from the input frame, including branches 2-5 for respective feature elements: point position, size, re-identification, and action, paragraph [0042]; of the plurality of feature maps, the feature map of branch 2 point position elements and branch 3 size elements is interpreted as the claimed feature map and plurality of feature elements; figure 4 depicts the video sequence image frame comprising three example targets, each comprising a plurality of point positions with a corresponding rectangular size, paragraph [0064]);
determining a first probability estimation (paragraph [0052] describes updating the parameters of the model to minimize an error function related to the position and size feature maps, describing the position and size feature map as a prediction/estimation/probability requiring minimization of said errors) result of each of the plurality of feature elements based on the first to-be-encoded feature map, to produce a plurality of first probability estimation results (Figure 3 encoder-decoder backbone 1 receiving the input frame and extracting the feature map to branch 2 comprising point position elements and branch 3 comprising size elements, illustrating a convolutional neural network CNN, paragraph [0021]; figure 6 depicts the inference device (post-learning) implementation of the CNN, wherein the position feature elements (heat map) and size elements (which both receive the input of the video sequence/frame) produce an output (as described above to be a probability estimation), paragraph [0075]), wherein each first probability estimation result comprises a first peak probability (Figure 6 the plurality of point positions (figure 4) output by the position feature map heat map (probability estimation) is input to peak detecting unit 18, which outputs a value greater than or equal to a preset threshold as point position data; the point position data is also input to the size feature elements to output a size feature, paragraph [0110]);
determining whether each feature element in the first to-be-encoded feature map is a first feature element based on the first peak probability of the feature element in the first to-be-encoded feature map (Figure 6 point position data (which includes the probability estimation of position branch 2) and size feature output (which utilizes the input of the peak 18 output and includes the probability estimation of the branch 3 size elements), paragraph [0076]); and
performing entropy encoding on the first feature element only when the feature element is the first feature element (Figure 5 learning device 100 includes model parameter updating unit 17, which may be added to the inference device 200 so that model parameter updating and learning can be performed by the inference device 200, paragraph [0060]; paragraph [0072], wherein unit 17 updates the model by updating the parameters (position, size, re-identification, action) to minimize error when reapplied back to the CNN 11 that encodes, figure 3).
With respect to Claim 12, Tarashima’645 shows the method according to claim 10, wherein determining whether the feature element is the first feature element comprises:
determining whether the feature element is the first feature element based on a first threshold (output from peak detecting unit 18 via thresholds, paragraph [0110]) and the first peak probability of the feature element (Figure 6 point position data (which includes the probability estimation of position branch 2) and size feature output (which utilizes the input of the peak 18 output and includes the probability estimation of the branch 3 size elements), paragraph [0076]).
With respect to Claim 13, Tarashima’645 shows the method according to claim 12, wherein the first threshold is a largest second peak probability in second peak probabilities (paragraph [0110] the preset threshold is applied after non-maximum suppression (NMS), which can be performed in advance to suppress redundant outputs) corresponding to feature elements in a set of third feature elements (Figure 6 point position data and/or outputted size feature).
With respect to Claim 14, Tarashima’645 shows the method according to claim 13, wherein a first peak probability of the first feature element is less than or equal to the first threshold (paragraph [0110] the NMS equating to the lower first threshold and the described preset threshold equating to a threshold higher than the NMS threshold).
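For illustration only, the peak detection described in the cited reference at paragraph [0110] (NMS performed in advance to suppress redundant outputs, followed by output of values greater than or equal to a preset threshold as point position data) can be loosely sketched as follows. The 1-D heat map, window size, and function names are illustrative assumptions, not drawn from the reference.

```python
# Hypothetical sketch of peak detection with NMS followed by a preset
# threshold: only local maxima of the heat map (NMS) whose value meets the
# preset threshold are output as point positions. Values are illustrative.

def detect_peaks(heat_map, preset_threshold, window=1):
    """Return indices that survive local-maximum NMS and meet the threshold."""
    peaks = []
    for i, value in enumerate(heat_map):
        lo, hi = max(0, i - window), min(len(heat_map), i + window + 1)
        is_local_max = value == max(heat_map[lo:hi])  # NMS: keep local maxima
        if is_local_max and value >= preset_threshold:
            peaks.append(i)
    return peaks

point_positions = detect_peaks([0.1, 0.9, 0.3, 0.2, 0.8, 0.1],
                               preset_threshold=0.5)
```

The two stages of the sketch correspond to the two thresholds discussed in the claim 14 mapping: NMS as the lower bound and the preset threshold as the higher bound.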
With respect to Claim 15, Tarashima’645 shows the method according to claim 13, wherein:
the method further comprises:
determining the set of third feature elements (figure 6 action, re-identification, or size features output by unit 12 utilizing point position data) from the plurality of feature elements (output of CNN 11) based on a second probability estimation result of each feature element (point position data input to unit 12);
the second probability estimation result further comprises a feature value (point position data) corresponding to a second peak probability from the second peak probabilities (peak values higher than the NMS and preset thresholds, paragraph [0110]); and
determining the set of third feature elements from the plurality of feature elements comprises:
determining the set of third feature elements from the plurality of feature elements based on a preset error (NMS), a numerical value of each feature element (peak), and the feature value corresponding to the second peak probability of each feature element (point position data, paragraph [0076]).
With respect to Claim 16, Tarashima’645 shows the method according to claim 15, wherein:
the first probability estimation result is the same as the second probability estimation result (the input to the peak detecting unit is the same as the output from the peak detecting unit in that both regard position features); and
determining the first probability estimation result of each of the plurality of feature elements comprises:
obtaining side information of the first to-be-encoded feature map based on the first to-be-encoded feature map (Figures 3 and 6 reference additional branches 4-5 of side information corresponding to the to-be-decoded feature map including re-identification and action feature elements); and
performing probability estimation on the side information to obtain the first probability estimation result of each feature element (Figure 6 reference output from feature selecting unit 12 including action feature and re-identification feature).
With respect to Claim 17, Tarashima’645 shows the method according to claim 13, wherein:
the method further comprises:
determining a second probability estimation result of each of the plurality of feature elements based on the first to-be-encoded feature map (Figures 3 and 6 wherein branches 4-5 each provide different probability estimations respectively in regards to re-identification, and action based on the feature map);
the first probability estimation result is different from the second probability estimation result (Figures 3 and 6 wherein branches 2-5 each provide different probability estimations output from unit 12 respectively in regards to position, size, re-identification, and action); and
determining the second probability estimation result of each of the plurality of feature elements comprises:
obtaining side information (figure 3 branch 3 size) of the first to-be-encoded feature map (figure 3 main feature map) and second context information of each feature element based on the first to-be-encoded feature map (Figure 6 point position data output from the peak detecting unit 18), wherein the second context information is a feature element that corresponds to the feature element (point position data corresponds to the position feature elements) and is in a preset region range in the first to-be-encoded feature map (preset pixel regions with values larger than the preset threshold, paragraph [0110]); and
obtaining the second probability estimation result of each feature element based on the side information and the second context information (unit 12 outputs second probability estimates of action and re-identification based on point position data/side information input to unit 12, paragraph [0112]).
With respect to Claim 20, Tarashima’645 shows a feature map encoding apparatus, comprising:
a processor (figure 11 CPU 1004); and
a memory (figure 11 memory 1003) coupled to the processor to store instructions, which when executed by the processor (paragraph [0129]), cause the feature map encoding apparatus to:
obtain a first to-be-encoded feature map (Figure 3 frame input to encoder-decoder backbone 1, paragraph [0042]) comprising a plurality of feature elements (Figure 3 encoder-decoder backbone 1 extracts a feature map from the input frame, including branches 2-5 for respective feature elements: point position, size, re-identification, and action, paragraph [0042]; of the plurality of feature maps, the feature map of branch 2 point position elements and branch 3 size elements is interpreted as the claimed feature map and plurality of feature elements; figure 4 depicts the video sequence image frame comprising three example targets, each comprising a plurality of point positions with a corresponding rectangular size, paragraph [0064]);
determine a first probability estimation (paragraph [0052] describes updating the parameters of the model to minimize an error function related to the position and size feature maps, describing the position and size feature map as a prediction/estimation/probability requiring minimization of said errors) result of each of the plurality of feature elements based on the first to-be-encoded feature map, to produce a plurality of first probability estimation results (Figure 3 encoder-decoder backbone 1 receiving the input frame and extracting the feature map to branch 2 comprising point position elements and branch 3 comprising size elements, illustrating a convolutional neural network CNN, paragraph [0021]; figure 6 depicts the inference device (post-learning) implementation of the CNN, wherein the position feature elements (heat map) and size elements (which both receive the input of the video sequence/frame) produce an output (as described above to be a probability estimation), paragraph [0075]), wherein each first probability estimation result comprises a first peak probability (Figure 6 the plurality of point positions (figure 4) output by the position feature map heat map (probability estimation) is input to peak detecting unit 18, which outputs a value greater than or equal to a preset threshold as point position data; the point position data is also input to the size feature elements to output a size feature, paragraph [0110]);
determine whether each feature element in the first to-be-encoded feature map is a first feature element based on the first peak probability of the feature element in the first to-be-encoded feature map (Figure 6 point position data (which includes the probability estimation of position branch 2) and size feature output (which utilizes the input of the peak 18 output and includes the probability estimation of the branch 3 size elements), paragraph [0076]); and
perform entropy encoding on the first feature element only when the feature element is the first feature element (Figure 5 learning device 100 includes model parameter updating unit 17, which may be added to the inference device 200 so that model parameter updating and learning can be performed by the inference device 200, paragraph [0060]; paragraph [0072], wherein unit 17 updates the model by updating the parameters (position, size, re-identification, action) to minimize error when reapplied back to the CNN 11 that encodes, figure 3).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 6-9 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Tarashima (US 2024/0046645 A1) in view of TEO et al. (US 2024/0037797 A1).
With respect to Claim 1, Tarashima’645 shows a method of feature map decoding (Figure 3 encoder-decoder backbone 1 for feature map decoding, paragraph [0042]), comprising:
obtaining a signal of a to-be-decoded feature map (Figure 3 frame input to encoder-decoder backbone 1, paragraph [0042]), wherein the to-be-decoded feature map comprises a plurality of feature elements (Figure 3 encoder-decoder backbone 1 extracts a feature map from the input frame, including branches 2-5 for respective feature elements: point position, size, re-identification, and action, paragraph [0042]; of the plurality of feature maps, the feature map of branch 2 point position elements and branch 3 size elements is interpreted as the claimed feature map and plurality of feature elements; figure 4 depicts the video sequence image frame comprising three example targets, each comprising a plurality of point positions with a corresponding rectangular size, paragraph [0064]);
obtaining a first probability estimation (paragraph [0052] describes updating the parameters of the model to minimize an error function related to the position and size feature maps, describing the position and size feature map as a prediction/estimation/probability requiring minimization of said errors) result corresponding to each of the plurality of feature elements based on the signal of the to-be-decoded feature map, to produce a plurality of first probability estimation results (Figure 3 encoder-decoder backbone 1 receiving the input frame and extracting the feature map to branch 2 comprising point position elements and branch 3 comprising size elements, illustrating a convolutional neural network CNN, paragraph [0021]; figure 6 depicts the inference device (post-learning) implementation of the CNN, wherein the position feature elements (heat map) and size elements (which both receive the input of the video sequence/frame) produce an output (as described above to be a probability estimation), paragraph [0075]),
wherein each first probability estimation result comprises a first peak probability (Figure 6 the plurality of point positions (figure 4) output by the position feature map heat map (probability estimation) is input to peak detecting unit 18 which outputs a value greater than or equal to a preset threshold as point position data; the point position data is also input to the size feature elements to output a size feature, paragraph [0110]);
determining a set of first feature elements (figure 6 point position data) and a set of second feature elements (figure 6 size feature output from the size feature map) from the plurality of feature elements based on a first threshold (output from peak detecting unit 18 via thresholds, paragraph [0110]) and the first peak probabilities of the plurality of first probability estimation results (Figure 6 point position data (which includes the probability estimation of position branch 2) and size feature output (which utilizes the input of the peak 18 output and includes the probability estimation of the branch 3 size elements), paragraph [0076]); and
obtaining a decoded feature map based on the set of first feature elements and the set of second feature elements (Figure 6 post-detection processing unit 19 outputs detection result, paragraph [0076]).
Tarashima’645 does not specifically disclose to obtain a bitstream of a to-be-decoded feature map.
Teo’797 discloses to obtain a bitstream of a to-be-decoded feature map (Figure 4 bitstream to decoding device 1202 to output feature map, paragraph [0089]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tarashima’645 to include obtaining a bitstream of a to-be-decoded feature map as taught by Teo’797. The suggestion/motivation for doing so would have been to improve the system’s ability to encode an input image with encoded data (paragraph [0089]).
With respect to Claim 6, the combination of Tarashima’645 and Teo’797 shows the method according to claim 1, further comprising: before determining the set of first feature elements and the set of second feature elements from the plurality of feature elements, obtaining the first threshold based on the bitstream of the to-be-decoded feature map (in Tarashima’645: Figure 6, before output of the detection result from unit 19 and/or before output of the size feature from unit 12 and point position data from unit 18, peak detecting unit 18 must first process the input from the position feature map (from the bitstream of the to-be-decoded feature map, figure 3) via a preset threshold, paragraph [0110]; the threshold is preset in the system, which is known to receive bitstreams).
With respect to claim 7, the combination of Tarashima’645 and Teo’797 shows the method according to claim 1, wherein a first peak probability of a first feature element from the set of first feature elements is less than or equal to the first threshold (in Tarashima’645: paragraph [0110], wherein a non-maximum suppression NMS can be performed in advance to suppress redundant outputs, describing a first threshold), and a first peak probability of a second feature element from the set of second feature elements is greater than the first threshold (in Tarashima’645: paragraph [0110] the preset threshold is applied after NMS).
With respect to claim 8, the combination of Tarashima’645 and Teo’797 shows the method according to claim 1, wherein obtaining the first probability estimation result corresponding to each of the plurality of feature elements comprises: obtaining side information corresponding to the to-be-decoded feature map based on the bitstream of the to-be-decoded feature map (in Tarashima’645: Figures 3 and 6 reference additional branches 4-5 of side information corresponding to the to-be-decoded feature map including re-identification and action feature elements); and
obtaining the first probability estimation result corresponding to each feature element based on the side information (in Tarashima’645: Figure 6 reference output from feature selecting unit 12 including action feature and re-identification feature).
With respect to claim 9, the combination of Tarashima’645 and Teo’797 discloses the method according to claim 1, wherein obtaining the first probability estimation result corresponding to each of the plurality of feature elements comprises: obtaining side information corresponding to the to-be-decoded feature map based on the bitstream of the to-be-decoded feature map (in Tarashima’645: Figures 3 and 6 reference additional branches 4-5 of side information corresponding to the to-be-decoded feature map including re-identification and action feature elements); and estimating the first probability estimation result of each feature element for each feature element in the to-be-decoded feature map based on the side information (in Tarashima’645: Figure 6 reference output from feature selecting unit 12 including action feature and re-identification feature) and first context information, wherein the first context information is a feature element that corresponds to the feature element and is in a preset region range in the to-be-decoded feature map (in Tarashima’645: figure 6 point position data input to feature selecting unit 12 being a position feature element that corresponds to a value above a threshold (preset range) of the to-be-decoded feature map).
With respect to claim 18, Tarashima’645 shows the method according to claim 10.
Tarashima’645 does not specifically disclose writing entropy encoding results of all the first feature elements into an encoded bitstream.
Teo’797 discloses writing entropy encoding results of all the first feature elements into an encoded bitstream (Figure 4 bitstream to decoding device 1202 to output feature map, paragraph [0089]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tarashima’645 to include writing entropy encoding results of all the first feature elements into an encoded bitstream as taught by Teo’797. The suggestion/motivation for doing so would have been to improve the system’s ability to write entropy encoding results of all the first feature elements into an encoded bitstream for encoding an input image with encoded data (paragraph [0089]).
With respect to Claim 19, Tarashima’645 shows a feature map decoding apparatus (Figure 3 encoder-decoder backbone 1 for feature map decoding, paragraph [0042]), comprising:
a processor (figure 11 CPU 1004); and
a memory (figure 11 memory 1003) coupled to the processor to store instructions, which when executed by the processor (paragraph [0129]), cause the feature map decoding apparatus to:
obtain a signal of a to-be-decoded feature map (Figure 3 frame input to encoder-decoder backbone 1, paragraph [0042]) comprising a plurality of feature elements (Figure 3 encoder-decoder backbone 1 extracts a feature map from the input frame, including branches 2-5 for respective feature elements: point position, size, re-identification, and action, paragraph [0042]; of the plurality of feature maps, the feature map of branch 2 point position elements and branch 3 size elements is interpreted as the claimed feature map and plurality of feature elements; figure 4 depicts the video sequence image frame comprising three example targets, each comprising a plurality of point positions with a corresponding rectangular size, paragraph [0064]);
obtain a first probability estimation (paragraph [0052] describes updating the parameters of the model to minimize an error function related to the position and size feature maps, describing the position and size feature map as a prediction/estimation/probability requiring minimization of said errors) result corresponding to each of the plurality of feature elements based on the signal of the to-be-decoded feature map, to produce a plurality of first probability estimation results (Figure 3 encoder-decoder backbone 1 receiving the input frame and extracting the feature map to branch 2 comprising point position elements and branch 3 comprising size elements, illustrating a convolutional neural network CNN, paragraph [0021]; figure 6 depicts the inference device (post-learning) implementation of the CNN, wherein the position feature elements (heat map) and size elements (which both receive the input of the video sequence/frame) produce an output (as described above to be a probability estimation), paragraph [0075]),
wherein each first probability estimation result comprises a first peak probability (Figure 6 the plurality of point positions (figure 4) output by the position feature map heat map (probability estimation) are input to peak detecting unit 18, which outputs values greater than or equal to a preset threshold as point position data; the point position data is also input to the size feature elements to output a size feature, paragraph [0110]);
determine a set of first feature elements (figure 6 point position data) and a set of second feature elements (figure 6 size feature output from size feature map) from the plurality of feature elements based on a first threshold (output from peak detecting unit 18 via thresholds, paragraph [0110]) and the plurality of first probability estimation results (Figure 6 point position data (which includes the probability estimation of position branch 2) and size feature output (which utilizes the output of peak detecting unit 18 and includes the probability estimation of the branch 3 size elements), paragraph [0076]); and
obtain a decoded feature map based on the set of first feature elements and the set of second feature elements (Figure 6 post-detection processing unit 19 outputs the detection result, paragraph [0076]).
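For clarity of the claim mapping above, the recited determination of the first and second feature element sets can be sketched as follows. This is an illustrative reading of the claim language only, not a disclosure of Tarashima’645; the function name, the list representation, and the direction of the threshold comparison (here, elements with a peak probability below the threshold are placed in the first set for entropy decoding) are all assumptions, as the claim does not fix these details:

```python
def split_by_peak_probability(feature_elements, peak_probs, threshold):
    """Partition feature elements into two sets per the claim language.

    Assumption for illustration: an element whose first peak probability
    is below the threshold goes into the first set (to be entropy-decoded
    from the bitstream); otherwise it goes into the second set (to be
    reconstructed from the feature value corresponding to its peak).
    """
    first_set, second_set = [], []
    for elem, p in zip(feature_elements, peak_probs):
        (first_set if p < threshold else second_set).append(elem)
    return first_set, second_set
```

For example, with peak probabilities [0.2, 0.9, 0.5] and a threshold of 0.6, the first and third elements would fall into the first set and the second element into the second set.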
Tarashima’645 does not specifically disclose to obtain a bitstream of a to-be-decoded feature map.
Teo’797 discloses to obtain a bitstream of a to-be-decoded feature map (Figure 4 bitstream to decoding device 1202 to output feature map, paragraph [0089]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tarashima’645 to include obtaining a bitstream of a to-be-decoded feature map, as taught by Teo’797. The suggestion/motivation for doing so would have been to improve the system’s ability to encode an input image as encoded data (paragraph [0089]).
Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Tarashima (US 2024/0046645 A1) in view of TEO et al. (US 2024/0037797 A1) further in view of Solh et al. (US 10380853 B1).
With respect to Claim 2, the combination of Tarashima’645 and TEO’797 shows the method according to claim 1.
Tarashima’645 and TEO’797 do not specifically disclose the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution; or the first probability estimation result is a mixed Gaussian distribution comprising a plurality of Gaussian distributions, and the first peak probability is a largest value in mean probabilities of the plurality of Gaussian distributions, or the first peak probability is calculated based on mean probabilities of the plurality of Gaussian distributions and weights of the plurality of Gaussian distributions in the mixed Gaussian distribution.
Solh’853 discloses the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution (column 12 describes using a clustering algorithm, such as a mean shift clustering algorithm, to identify pixels including a peak value, and column 14 lines 15-40 describes that a detector which outputs a heatmap may apply a Gaussian filter to emphasize relative edges of regions of interest).
It would have been obvious to one skilled in the art before the effective filing date of the current application to modify the Tarashima’645 and TEO’797 point position heat map and peak detecting unit with the known technique wherein the first probability estimation result is a Gaussian distribution and the first peak probability is a mean probability of the Gaussian distribution, yielding the predictable result of emphasizing relative edges of regions of interest as disclosed by Solh’853 (column 14 lines 15-40).
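For reference, the peak-probability formulations recited in the claim (a single Gaussian, and a mixed Gaussian via either the largest component mean probability or a calculation based on the component mean probabilities and mixture weights) can be sketched as follows. This is an illustrative sketch of the claim language only, not a disclosure of Solh’853, and all function and parameter names are hypothetical:

```python
import math

def gaussian_pdf(x, mu, sigma):
    # Probability density of a Gaussian with mean mu and std. dev. sigma at x.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def peak_probability_single(mu, sigma):
    # Single Gaussian: the peak probability is the density at the mean.
    return gaussian_pdf(mu, mu, sigma)

def peak_probability_mixture(components, use_weights=True):
    # Mixed Gaussian: components is a list of (weight, mu, sigma) tuples.
    # The first peak probability is either calculated from the component
    # mean probabilities and the mixture weights, or taken as the largest
    # of the component mean probabilities.
    mean_probs = [gaussian_pdf(mu, mu, sigma) for _, mu, sigma in components]
    if use_weights:
        return sum(w * p for (w, _, _), p in zip(components, mean_probs))
    return max(mean_probs)
```

In this sketch the "mean probability" of each component is its density evaluated at its own mean, consistent with the peak of a Gaussian occurring at the mean.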
With respect to Claim 3, the combination of Tarashima’645 and TEO’797 shows the method according to claim 1.
Tarashima’645 and TEO’797 do not specifically disclose wherein a value of the decoded feature map comprises numerical values of all first feature elements in the set of first feature elements and numerical values of all second feature elements in the set of second feature elements.
Solh’853 shows wherein a value of the decoded feature map comprises numerical values of all first feature elements in the set of first feature elements and numerical values of all second feature elements in the set of second feature elements (Figure 4 depicts the feature map to include numerical values to represent the features, column 8 lines 53-61).
It would have been obvious to one skilled in the art before the effective filing date of the current application to modify the Tarashima’645 and TEO’797 feature map with first and second feature elements with the known technique wherein a value of the decoded feature map comprises numerical values of all first feature elements in the set of first feature elements and numerical values of all second feature elements in the set of second feature elements, yielding the predictable result of generating regions of interest as disclosed by Solh’853 (column 14 lines 15-40).
Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Tarashima (US 2024/0046645 A1) in view of TEO et al. (US 2024/0037797 A1) further in view of Gao (US 2021/0364298 A1).
With respect to Claim 4, the combination of Tarashima’645 and TEO’797 shows the method according to claim 3.
Tarashima’645 and TEO’797 do not specifically disclose wherein the set of first feature elements is an empty set, or the set of second feature elements is an empty set.
Gao’298 shows wherein the set of first feature elements is an empty set, or the set of second feature elements is an empty set (Figure 7A the feature element is null/empty/unknown because the feature that reduces entropy the most is not yet known, paragraph [0071]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tarashima’645 and TEO’797 to include the teaching wherein the set of first feature elements is an empty set, or the set of second feature elements is an empty set, as taught by Gao’298. The suggestion/motivation for doing so would have been to improve the system’s ability to represent that the feature determination is unknown because the feature that reduces entropy the most is not yet known (paragraph [0071]).
With respect to claim 5, the combination of Tarashima’645, TEO’797 and Gao’298 shows the method according to claim 3, wherein:
the first probability estimation result (in Tarashima’645: Figure 6 output of position feature map to peak detecting unit 18) further comprises a feature value corresponding to the first peak probability (in Tarashima’645: Figure 6 point position data is output from peak detecting unit 18 and regards the feature value that does or does not pass the threshold, paragraph [0110]); and
the method further comprises:
performing entropy decoding on the first feature elements based on first probability estimation results corresponding to the first feature elements, to obtain the numerical values of the first feature elements (in Tarashima’645: paragraph [0104] describes using cross entropy loss as the error function; paragraph [0052] describes using the error function for the position feature map and size feature; in Gao’298: paragraph [0071] wherein the empty set regards the determination that the feature that reduces entropy the most is not yet known); and
obtaining the numerical values of the second feature elements based on feature values corresponding to first peak probabilities of the second feature elements (in Tarashima’645: Figure 6 post-detection processing unit 19 outputs the detection result, paragraph [0076]).
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Tarashima (US 2024/0046645 A1) in view of Solh et al. (US 10380853 B1).
With respect to Claim 11, Tarashima’645 shows the method according to claim 10.
Tarashima’645 does not specifically disclose wherein: the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution; or the first probability estimation result is a mixed Gaussian distribution comprising a plurality of Gaussian distributions, and the first peak probability is a largest value in mean probabilities of the plurality of Gaussian distributions, or the first peak probability is calculated based on mean probabilities of the plurality of Gaussian distributions and weights of the plurality of Gaussian distributions in the mixed Gaussian distribution.
Solh’853 discloses the first probability estimation result is a Gaussian distribution, and the first peak probability is a mean probability of the Gaussian distribution (column 12 describes using a clustering algorithm, such as a mean shift clustering algorithm, to identify pixels including a peak value, and column 14 lines 15-40 describes that a detector which outputs a heatmap may apply a Gaussian filter to emphasize relative edges of regions of interest).
It would have been obvious to one skilled in the art before the effective filing date of the current application to modify the Tarashima’645 point position heat map and peak detecting unit with the known technique wherein the first probability estimation result is a Gaussian distribution and the first peak probability is a mean probability of the Gaussian distribution, yielding the predictable result of emphasizing relative edges of regions of interest as disclosed by Solh’853 (column 14 lines 15-40).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Gao et al. (US 2022/0286696 A1): paragraph [0058] a deep learning-based image compression method may construct and implement mapping from the original image to a reconstruction image using a deep neural network. Local contextual information of each pixel in a high-resolution feature map may be learned using a convolution kernel, and accordingly, a network may estimate a corresponding pixel value before quantization according to a value of a neighboring pixel, which may lead to a decrease in a quantization error and an increase in a quality of a reconstruction image.
Mao et al. (US 2024/0105193 A1): paragraphs [0006]-[0010] a picture feature map is obtained for a to-be-encoded picture by using an encoder network, and entropy encoding is further performed on the picture feature map. However, an entropy encoding process is excessively complex. This application provides feature data encoding and decoding methods and apparatuses to reduce encoding and decoding complexity without affecting encoding and decoding performance. According to a first aspect, a feature data encoding method is provided, including: obtaining to-be-encoded feature data, where the to-be-encoded feature data includes a plurality of feature elements, and the plurality of feature elements include a first feature element; obtaining a probability estimation result of the first feature element; determining, based on the probability estimation result of the first feature element, whether to perform entropy encoding on the first feature element; and performing entropy encoding on the first feature element only when it is determined that entropy encoding needs to be performed on the first feature element. The feature data includes a picture feature map, an audio feature variable, or a picture feature map and an audio feature variable; and may be one-dimensional, two-dimensional, or multi-dimensional data output by an encoder network, where each piece of data is a feature element. It should be noted that meanings of a feature point and a feature element in this application are the same.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IRIANA CRUZ whose telephone number is (571)270-3246. The examiner can normally be reached 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi M. Sarpong can be reached at (571) 270-3438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IRIANA CRUZ/ Primary Examiner, Art Unit 2681