DETAILED ACTION
This action is in response to claims filed 18 September 2025 for application 17942857 filed on 12 September 2022. Currently claims 1-20 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-8, 10-17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fang et al. (FlexDNN: Input-Adaptive On-Device Deep Learning for Efficient Mobile Vision) in view of Kouris et al. (US 20230128637).
Regarding claims 1, 11 and 20, Fang discloses: A system comprising:
one or more processors; and non-transitory memory coupled to the one or more processors and storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to:
determine a neural network element of a first compute path within a neural network model, the first compute path comprising a plurality of neural network elements including the neural network element, wherein the neural network element comprises at least one of a neural network layer and a neural network node, and wherein the first compute path is trained to perform an operation based on input data (Fig 6 each box is a layer and each layer implicitly contains nodes, data flow hard is on a first compute path);
determine a second compute path comprising the neural network element and a different plurality of neural network elements, the second compute path having at least one of a smaller size than the first compute path, a lower latency than the first compute path, and a smaller compute cost than the first compute path (Fig 6 data flow easy is on a second compute path that has a smaller size, latency and compute cost than the hard path);
select, based on one or more conditions, whether to process data from the neural network element through the first compute path or the second compute path (Fig 6 early exit decision blocks); and
process the data from the neural network element through the selected one of the first compute path or the second compute path (Fig 6 easy or hard path depending on input).
Fang does not explicitly disclose:
wherein the second compute path is trained to perform the same operation as the first compute path based on the input data;
prior to processing data from the neural network element with the first compute path and the second compute path, select.
However, Kouris teaches:
Wherein the second compute path is trained to perform the same operation as the first compute path based on the input data (“As mentioned above, the present techniques provide a method for training a ML model which has the form of a multi-exit semantic segmentation network (or progressive segmentation network). The network comprises numerous early-exit points (i.e. segmentation heads) attached to different depths of a backbone convolutional neural network (CNN) architecture. This offers segmentation predictions with varying workload (and accuracy) characteristics, introducing a “train-once-deploy-everywhere” approach for efficient semantic segmentation. Advantageously, this means that the network can be parametrised without the need to retrain in order to be deployed across heterogeneous target devices of varying capabilities (low to high end).” [0074]);
Prior to processing data from the neural network element with the first compute path and the second compute path, select (“[0075] The ML model of the present techniques can accommodate a wide range of diverse deployment scenarios, including, for example: [0076] extracting workload-lighter sub-models for deployment on devices with varying computation capabilities (e.g. mobile phones) to satisfy latency constraints by completely skipping parts of the computation. [0077] selecting computation path at runtime, according to allocation of available resources on the target device, based on compute load, so as to preserve consistent prediction latency. [0078] obtaining a rapid approximation of the prediction in early-stages of the computation and progressively refining it over time. [0079] selecting computation path at run time, according to the difficulty of each input sample/prediction confidence obtained at different computation stages. [0080] partitioning the model for synergistic cloud-device execution (i.e. computation offloading), while still being able to obtain an approximation of the final prediction relying solely to the on-board computational resources (to address network availability/quality issues). [0081] incorporating specialist segmentation exits, focusing on a different set of classes (e.g. humans/pets) or fine-tuned on a user-centric data distribution (e.g. indoor/outdoor)”)
Fang and Kouris are analogous art in the same field of endeavor, neural network compute path selection. Fang discloses using different compute paths based on input difficulty. Kouris teaches several methods and reasons for selecting a modified compute path, including hardware latency and availability. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the compute path selection of Fang to utilize the other known reasons for selecting the second compute path, as taught by Kouris, to yield predictable results.
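For illustration only, the combination relied upon for claims 1, 11 and 20 (two compute paths that share a neural network element, with the path chosen before either path processes the data) can be sketched as follows. This is a minimal sketch and not code from Fang or Kouris: the toy layers, the names `select_path` and `process`, and the `load_limit` value are hypothetical, and the compute-load condition is assumed from the runtime selection described in Kouris [0077].

```python
from typing import Callable, List, Sequence, Tuple

# A "layer" here is just a function on a float; real layers are assumed away.
Layer = Callable[[float], float]


def run_path(layers: Sequence[Layer], x: float) -> float:
    """Apply each layer of a compute path in order."""
    for layer in layers:
        x = layer(x)
    return x


def select_path(compute_load: float, load_limit: float = 0.8) -> str:
    """Choose a path before any processing (hypothetical load condition)."""
    return "second" if compute_load > load_limit else "first"


def shared_element(x: float) -> float:
    """The neural network element common to both paths (toy stand-in)."""
    return x + 1.0


# The first path is longer; the second path is a smaller alternative.
first_path: List[Layer] = [shared_element, lambda x: x * 2.0, lambda x: x - 3.0]
second_path: List[Layer] = [shared_element, lambda x: x * 2.0]


def process(x: float, compute_load: float) -> Tuple[str, float]:
    """Select a path from the condition, then run only the selected path."""
    chosen = select_path(compute_load)
    layers = first_path if chosen == "first" else second_path
    return chosen, run_path(layers, x)
```

In this sketch only the selected path is executed, which is the point of selecting prior to processing: the unselected path incurs no compute cost.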
Regarding claims 2 and 12, Fang discloses: The system of claim 1, wherein the first compute path is larger than the second compute path, and wherein the one or more conditions comprise a difference in an accuracy of a first output from the first compute path and a second output from the second compute path (Fig 6 path for the hard input is larger, “High Early Exit Rate without Accuracy Loss. With the optimized early exit branch architecture inserted at the optimal locations throughout the base model as well as the design of entropy-based confidence score, FlexDNN is able to achieve high early exit rate while blocking hard inputs from exiting prematurely to preserve accuracy.” P7 §III.F ¶4, note: the hard input would have worse accuracy if it exited at the early stage).
Regarding claims 3 and 13, Fang discloses: The system of claim 2, wherein the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to also: determine the difference in the accuracy of the first output from the first compute path and the second output from the second compute path; and based on a determination that the difference in the accuracy is above a threshold, select to process the data from the neural network element via the first compute path (Fig 6 path for the hard input is larger, “High Early Exit Rate without Accuracy Loss. With the optimized early exit branch architecture inserted at the optimal locations throughout the base model as well as the design of entropy-based confidence score, FlexDNN is able to achieve high early exit rate while blocking hard inputs from exiting prematurely to preserve accuracy.” P7 §III.F, “In essence, Conf measures the confidence level of the early prediction result generated by the early exit branch. The higher the Conf is, the higher the confidence level is. The decision module decides to early exit the input if the value of Conf exceeds a pre-determined threshold.” P6 §III.D ¶2).
Regarding claims 4 and 14, Fang discloses: The system of claim 2, wherein the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to also: determine the difference in the accuracy of the first output from the first compute path and the second output from the second compute path; and based on a determination that the difference in the accuracy is below a threshold, select to process the data from the neural network element via the second compute path (Fig 6 path for the hard input is larger, “High Early Exit Rate without Accuracy Loss. With the optimized early exit branch architecture inserted at the optimal locations throughout the base model as well as the design of entropy-based confidence score, FlexDNN is able to achieve high early exit rate while blocking hard inputs from exiting prematurely to preserve accuracy.” P7 §III.F, “In essence, Conf measures the confidence level of the early prediction result generated by the early exit branch. The higher the Conf is, the higher the confidence level is. The decision module decides to early exit the input if the value of Conf exceeds a pre-determined threshold.” P6 §III.D ¶2).
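For illustration only, the threshold test in the passage of Fang cited for claims 3, 4, 13 and 14 (exit early when Conf exceeds a pre-determined threshold) can be sketched as follows. The cited passages describe Conf only as an entropy-based confidence score, so the formula below (one minus normalized Shannon entropy) is an assumption, as are the function names and the threshold value of 0.5.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_confidence(probs: Sequence[float]) -> float:
    """Assumed form of Conf: 1 minus Shannon entropy normalized to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(n)


def should_early_exit(logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Exit through the early branch only when confidence exceeds the threshold."""
    return entropy_confidence(softmax(logits)) > threshold
```

Under this assumed score, a sharply peaked prediction (an easy input) exits early, while a near-uniform prediction (a hard input) continues along the longer path.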
Regarding claims 5 and 15, Fang discloses: The system of claim 1, wherein determining the second compute path comprises: copying the first compute path to yield a duplicated compute path (Fig 6); and
downsizing the duplicated compute path to yield the second compute path (Fig 6 the easy input path is the same path except it is truncated).
Regarding claims 6 and 16, Fang discloses: The system of claim 5, wherein downsizing the duplicated compute path comprises at least one of quantizing the duplicated compute path, pruning one or more portions of the duplicated compute path, or reducing a size of one or more neural network layers of the duplicated compute path (Fig 6 the easy input path is the same path except it is truncated/pruned).
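For illustration only, the reading applied to claims 5, 6, 15 and 16 (the easy path is the same path, copied and then truncated) can be sketched as follows. This is a minimal sketch of that interpretation, not Fang's implementation: the layer dictionaries, the layer names, and the function `downsize_by_truncation` are hypothetical.

```python
import copy
from typing import Dict, List, Union

# Hypothetical layer description: a name plus a channel count.
LayerSpec = Dict[str, Union[str, int]]


def downsize_by_truncation(path: List[LayerSpec], keep: int) -> List[LayerSpec]:
    """Duplicate a compute path, then drop every layer after the first `keep`."""
    if not 0 < keep <= len(path):
        raise ValueError("keep must be between 1 and the path length")
    duplicated = copy.deepcopy(path)  # copying step: the original is untouched
    return duplicated[:keep]          # downsizing step: truncate the copy


first_path: List[LayerSpec] = [
    {"name": "conv1", "channels": 64},
    {"name": "conv2", "channels": 128},
    {"name": "conv3", "channels": 256},
    {"name": "classifier", "channels": 10},
]
second_path = downsize_by_truncation(first_path, keep=2)
```

The deep copy keeps the two paths independent, so a later change to the shorter path does not alter the original.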
Regarding claim 7, Fang discloses: The system of claim 1, wherein first compute path comprises a first portion of the neural network model and the second compute path comprises a second portion of the neural network model (Fig 6 the hard input path has a second portion of the neural network).
Regarding claims 8 and 17, Fang discloses: The system of claim 1, wherein first compute path comprises at least one of a branch of the neural network model and at least a portion of the neural network model, and wherein the second compute path comprises a different branch of the neural network model or an additional neural network model, and wherein the additional neural network model comprises at least one of a smaller version of the neural network model and a compressed version of the neural network model (Fig 6).
Regarding claims 10 and 19, Fang discloses: The system of claim 1, wherein the one or more conditions comprise a desired balance between at least two of a processing latency, a compute cost, a safety metric, or an output accuracy, and wherein the first compute path and the second compute path are associated with different processing latencies, different compute costs, different safety metrics, or different output accuracies (Fig 6, “High Early Exit Rate without Accuracy Loss. With the optimized early exit branch architecture inserted at the optimal locations throughout the base model as well as the design of entropy-based confidence score, FlexDNN is able to achieve high early exit rate while blocking hard inputs from exiting prematurely to preserve accuracy.” P7 §III.F, “In essence, Conf measures the confidence level of the early prediction result generated by the early exit branch. The higher the Conf is, the higher the confidence level is. The decision module decides to early exit the input if the value of Conf exceeds a pre-determined threshold.” P6 §III.D ¶2).
Claim(s) 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Fang in view of Kouris and further in view of Suprem et al. (ODIN: Automated Drift Detection and Recovery in Video Analytics).
Regarding claims 9 and 18, Fang does not explicitly disclose, however, Suprem teaches: The system of claim 1, wherein the computer-readable instructions, when executed by the one or more processors, cause the one or more processors to also:
generate clusters of scene features in an output of the neural network element (Fig 3);
train the first compute path using one or more clusters of the clusters of scene features (Fig 3); and
train the second compute path using one or more different clusters of the clusters of scene features (Fig 3, several compute paths are trained from different specialized cluster features).
Fang, Kouris and Suprem are in the same field of endeavor of neural networks with different compute paths. Fang discloses a neural network with a shorter and a longer compute path, selected depending on accuracy of an input at different early stopping decision blocks. Kouris teaches several methods and reasons for selecting a modified compute path, including hardware latency and availability. Suprem teaches a neural network with multiple paths trained on different clusters. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the known different compute paths of Fang and Kouris to utilize the known method of using different training data for each path, as taught by Suprem, to yield predictable results.
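For illustration only, the limitation of claims 9 and 18 (cluster the features output by the neural network element, then train each compute path on different clusters) can be sketched as follows. This is a minimal sketch and not Suprem's method: the centroids are fixed by assumption rather than learned, the feature values are invented toy data, and the names `cluster_features` and `training_sets` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def nearest_centroid(v: Vector, centroids: Sequence[Vector]) -> int:
    """Return the index of the centroid closest to v (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
    return dists.index(min(dists))


def cluster_features(
    features: Sequence[Vector], centroids: Sequence[Vector]
) -> Dict[int, List[Vector]]:
    """Group feature vectors by their nearest centroid."""
    clusters: Dict[int, List[Vector]] = {i: [] for i in range(len(centroids))}
    for v in features:
        clusters[nearest_centroid(v, centroids)].append(v)
    return clusters


# Toy features standing in for the output of the shared network element.
features: List[Vector] = [(0.1, 0.2), (0.0, 0.3), (5.0, 5.1), (4.8, 5.3)]
centroids: List[Vector] = [(0.0, 0.0), (5.0, 5.0)]  # assumed, not learned

clusters = cluster_features(features, centroids)

# Each compute path would then be trained on a different cluster.
training_sets = {"first_path": clusters[0], "second_path": clusters[1]}
```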
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see the addition of the Kouris reference.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
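For illustration only, the reply-period rule in the preceding paragraph can be sketched as date arithmetic. This is a minimal sketch under stated assumptions, not an official calculator: the function names are hypothetical, "within TWO MONTHS" is treated as inclusive of the two-month date, a period ending on a day that does not exist in the target month is assumed to end on that month's last day, and adjustments for weekends, federal holidays, and extension fees are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_end(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """End of the shortened statutory period, capped at six months from mailing."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    end = three_months
    replied_in_two = first_reply is not None and first_reply <= add_months(mailing, 2)
    if replied_in_two and advisory_mailed is not None and advisory_mailed > three_months:
        # Advisory action mailed after the three-month date: period ends that day.
        end = advisory_mailed
    return min(end, six_months)
```

For a hypothetical mailing date of 15 January 2025, a first reply on 10 March 2025 and an advisory action mailed 20 April 2025 would move the end of the shortened period to 20 April 2025, still well inside the six-month limit.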
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC NILSSON whose telephone number is (571)272-5246. The examiner can normally be reached M-F: 7-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo, can be reached at (571)-272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ERIC NILSSON/ Primary Examiner, Art Unit 2151