DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The claims at a high level recite a method for optimizing DNN models.
Step 1: Does the Claim Fall within a Statutory Category?
Yes. Claims 1-19 recite a method and are therefore directed to a statutory category (a process).
Under the USPTO Guidance, we first look to whether the claim recites:
(1) any judicial exceptions, including certain groupings of abstract ideas (i.e., mathematical concepts, certain methods of organizing human activity such as a fundamental economic practice, or mental processes) (Step 2A, Prong 1); and
(2) additional elements that integrate the judicial exception into a practical application (Step 2A, Prong 2). MPEP §§ 2106.04(a), (d).
Only if the claim (1) recites a judicial exception and (2) does not integrate that exception into a practical application, do we then look in Step 2B to whether the claim:
(3) adds a specific limitation beyond the judicial exception that is not “well-understood, routine, conventional” in the field; or
(4) simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. MPEP § 2106.05(d).
Step 2A, Prong One: Is a Judicial Exception Recited?
First, determine whether the claims recite any judicial exceptions, including certain groupings of abstract ideas (i.e., mathematical concepts, certain methods of organizing human activity, or mental processes). MPEP § 2106.04(a).
Claim 1 recites –
▪ obtaining, in a computing platform, hardware specification of the target device (Abstract Idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can obtain data pertaining to the specification of a device);
▪ generating, by the computing platform, a quantitative hardware performance model based on the obtained hardware specification (Abstract Idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can generate a logical model based on the specifications);
▪ obtaining, in the computing platform, a starting DNN model and DNN performance requirements for the optimized DNN model (Abstract Idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can obtain data pertaining to performance requirements for the optimization);
▪ generating, by the computing platform, a DNN performance model based on the obtained starting DNN model and the obtained DNN performance requirements (Abstract Idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can generate starting models); and
▪ generating, by the computing platform, the optimized DNN model and code through applying the quantitative hardware performance model and the DNN performance model to an optimization space of a plurality of DNN model instances and code optimizations (Amount to “Apply it”. Merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, see MPEP § 2106.05(f). Examiner’s note: high level application of using machine learning models (DNN) to optimize data, as it links the judicial exception to a field of use).
These limitations, based on their broadest reasonable interpretation, recite a mental process, i.e. a judicial exception. For these reasons, the independent claim 1 recites a judicial exception.
As with the claimed method, “a process that employs mathematical algorithms to manipulate existing information to generate additional information is not patent eligible.” See Digitech Image Techs, LLC v. Elecs. for Imaging, Inc., 758 F.3d 1344, 1351 (Fed. Cir. 2014). See Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350 (Fed. Cir. 2016), where collecting information, analyzing it, and displaying certain results of the collection and analysis was held to be an abstract idea. See In re Meyer, 688 F.2d 789, 795-96 (CCPA 1982), which held that “a mental process that a neurologist should follow” when testing a patient for nervous system malfunctions was not patentable.
Accordingly, the claims recite an abstract idea.
Step 2A, Prong Two: Is the Abstract Idea Integrated into a Practical Application?
Next determine whether the claims recite additional elements that integrate the judicial exception into a practical application (see MPEP §§ 2106.05(a)-(c), (e)-(h)). To integrate the exception into a practical application, the additional claim elements must, for example, improve the functioning of a computer or any other technology or technical field (see MPEP § 2106.05(a)), apply the judicial exception with a particular machine (see MPEP § 2106.05(b)), or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment (see MPEP § 2106.05(e)).
Additional elements:
▪ Deep Neural Network ("DNN") model, a starting DNN model, optimized DNN model, quantitative hardware performance model, a plurality of DNN model instances (Amount to “Apply it”. Merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, see MPEP § 2106.05(f). Examiner’s note: high level application of using machine learning model to perform optimization);
▪ a target device, a computing platform (Adding insignificant extra-solution activity to the judicial exception - see MPEP § 2106.05(g));
The term “additional elements” refers to claim features, limitations, or steps that the claim recites beyond the identified judicial exception. Claim 1 recites the additional elements of “DNN” models and using a computing platform. However, the claim does not recite any improvements to these additional elements, nor does the claim recite any particularly programmed or configured computer system, device, or machine learning. Rather, the additional elements in claim 1 serve merely to automate the abstract idea. See Int’l Bus. Machs. Corp. v. Zillow Group, Inc., 50 F.4th 1371, 1382 (Fed. Cir. 2022) (“[A] patent that ‘automate[s] “pen and paper methodologies” to conserve human resources and minimize errors’ is a ‘quintessential “do it on a computer” patent’ directed to an abstract idea.”) (quoting Univ. of Fla. Rsch. Found., Inc. v. Gen. Elec. Co., 916 F.3d 1363, 1367 (Fed. Cir. 2019)). Therefore, none of these recited additional elements, whether considered individually or in combination, integrates the judicial exception into a practical application.
The additional elements listed above that relate to computing components are recited at a high level of generality (i.e., as generic components performing generic computer functions such as communicating and processing known data) such that they amount to no more than mere instructions to apply the exception using generic computing components. Simply implementing the abstract idea on a generic computer is not a practical application of the abstract idea. Additionally, the claims do not purport to improve the functioning of the computer itself. There is no technological problem that the claimed invention solves. Rather, the computer system is invoked merely as a tool. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, these claims are directed to an abstract idea.
For these reasons, independent claim 1 is directed to an abstract idea.
Step 2B: Does the Claim Provide an Inventive Concept?
Next, determine whether the claims recite an “inventive concept” that “must be significantly more than the abstract idea itself, and cannot simply be an instruction to implement or apply the abstract idea on a computer.” BASCOM Glob. Internet Servs., Inc. v. AT&T Mobility LLC, 827 F.3d 1341, 1349 (Fed. Cir. 2016); see MPEP § 2106.05(d). There must be more than “computer functions [that] are ‘well-understood, routine, conventional activit[ies]’ previously known to the industry.” Alice Corp. v. CLS Bank Int'l, 573 U.S. 208, 225 (2014) (second alteration in original) (quoting Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66, 73 (2012)); see MPEP § 2106.05(d).
Step 2B: The additional elements are not sufficient to amount to significantly more than the judicial exception.
(see MPEP 2106.05(d)(II). Taking the claim elements separately, the function performed by the computer at each step of the process is purely conventional. Using a computer and associated computer network to obtain data, use data to identify other data, and compare data are some of the most basic functions of a computer. All of these computer functions are well-understood, routine, conventional activities previously known to the industry. The method claims do not, for example, purport to improve the functioning of the computer itself. Nor do they effect an improvement in any other technology or technical field. Instead, the claims at issue amount to nothing significantly more than an instruction to apply the abstract idea of displaying, processing and storing data using some unspecified, generic computer).
The claims recite no “inventive concept” sufficient to transform the abstract method of organizing human activity into a patent-eligible application. See MPEP § 2106.05. Rather, the additional elements identified above are merely well-understood, conventional computer components, as confirmed by the Specification. See MPEP § 2106.05(d)(I). For example, the Specification refers to the additional elements in generic terms.
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements relating to computing components amount to no more than applying the exception using generic computing components. Mere instructions to apply an exception using a generic computing component cannot provide an inventive concept. Furthermore, the broadest reasonable interpretation of the claimed computer components (i.e., additional elements) includes any generic computing components that are capable of being programmed to communicate and process known data.
Additionally, the computer components are used for performing insignificant extra-solution activity and well understood, routine, and conventional functions. For example, the claimed processor and machine learning merely communicate and process known data. Activities such as these are insignificant extra-solution activity and, therefore, well understood, routine, and conventional. See MPEP 2106.05(d); see also, e.g., OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d at 1363, 115 USPQ2d at 1092-93 (Presenting offers to potential customers and gathering statistics generated based on the testing about how potential customers responded to the offers; the statistics are then used to calculate an optimized price); CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (Obtaining information about transactions using the Internet to verify credit card transactions); Ultramercial, Inc. v. Hulu, LLC, 772 F.3d at 715, 112 USPQ2d at 1754 (Consulting and updating an activity log); Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016) (Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display); Apple, Inc. v. Ameranth, Inc., 842 F.3d 1229, 1244, 120 USPQ2d 1844, 1856 (Fed. Cir. 2016) (Recording a customer’s order); Return Mail, Inc. v. U.S. Postal Service, -- F.3d --, -- USPQ2d --, slip op. at 32 (Fed. Cir. August 28, 2017) (Identifying undeliverable mail items, decoding data on those mail items, and creating output data); Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1331, 115 USPQ2d 1681, 1699 (Fed. Cir. 2015) (Arranging a hierarchy of groups, sorting information, eliminating less restrictive pricing information and determining the price). Furthermore, limitations such as integrating account details are well-understood, routine, and conventional activity. See Alice Corp., 134 S. Ct. at 2359, 110 USPQ2d at 1984 (creating and maintaining "shadow accounts"); Ultramercial, 772 F.3d at 716, 112 USPQ2d at 1755 (updating an activity log).
Dependent claims 2-19 further describe the abstract idea. The additional elements of the dependent claims fail to integrate the abstract idea into a practical application and do not amount to significantly more than the abstract idea. Thus, as the dependent claims remain directed to a judicial exception, and as the additional elements of the claims do not amount to significantly more, the dependent claims are not patent eligible. Dependent claims 2-19 do not recite additional limitations that demonstrate integration of the abstract idea into a practical application or an inventive concept that amounts to significantly more than the abstract idea.
With respect to claims 2 and 4-8:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ (Abstract Idea of a mental process. Under the broadest reasonable interpretation, the obtaining/determining probability distribution and divergence, as drafted, is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can manually specify an architecture, a hardware specification language, and rules on preferred data storage and access patterns, and can generate one or more hardware performance models.)
Step 2A Prong 2: the additional elements are not sufficient to integrate the judicial exception into a practical application. The additional elements, considered individually and as an ordered combination with the additional elements from the claim upon which it depends, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea.
With respect to claim 3:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ a plurality of platforms comprising servers, workstations, personal computing devices, mobile phones, embedded devices, specialized accelerators, FPGAs and ASICs (generic computer functions of receiving and processing that are well-understood, routine, and conventional activities previously known to the industry. Extracting caption data and natural text processing are merely extra-solution activities and do not meaningfully limit the independent claims. Generic computer implementation does not provide significantly more than the abstract idea).
Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application.
With respect to claims 9 and 19:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ conducting active measuring to determine hardware metrics of the target device, wherein hardware metrics include at least one item selected from the group consisting of memory hierarchy, processor speed, and register file size (an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion); here, additional mathematical/logical reasoning).
▪ running a set of micro-kernels on the computing platform (generic computer functions of executing software that are well-understood, routine, and conventional activities previously known to the industry. Extracting caption data and natural text processing are merely extra-solution activities and do not meaningfully limit the independent claims. Generic computer implementation does not provide significantly more than the abstract idea).
Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application.
With respect to claims 10-14:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ (Abstract Idea of a mental process. Under the broadest reasonable interpretation, the obtaining/determining probability distribution and divergence, as drafted, is an abstract idea of “a mental process” because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion): a user can perform a mathematical evaluation, which performs the determination, thereby further defining the abstract idea. A human being may use this mathematical calculation to facilitate the mental evaluation in order to arrive at the necessary determination. This claim limitation appears to recite both a mathematical formula and a mental process, i.e., additional generic mathematical calculations that do not represent significantly more than the abstract idea. See at least MPEP § 2106.05(a) ("Improvements to the Functioning of a Computer or to Any Other Technology or Technical Field")).
Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application.
With respect to claims 15-17:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ (Amount to “Apply it”. Merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, see MPEP § 2106.05(f). Examiner’s note: high level application of using machine learning model to perform optimization).
Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application.
With respect to claim 18:
Step 2A Prong 1: the claims recite a judicial exception (an abstract idea)
▪ generating the corresponding optimized binary code library of the optimized DNN model (generic computer functions of receiving and processing that are well-understood, routine, and conventional activities previously known to the industry. Extracting caption data and natural text processing are merely extra-solution activities and do not meaningfully limit the independent claims. Generic computer implementation does not provide significantly more than the abstract idea).
Additional elements: the additional element listed above in step 2A Prong 2 is merely instructions to be implemented on a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).
Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application.
Dependent claims 2-19 are thus also patent ineligible for the reasons discussed above.
Claim Construction
Claim 1 recites the limitation – “an optimization space.” Given that the application does not specifically define the optimization space, based on the broadest reasonable interpretation, the limitation has been construed as follows –
The optimization space (or search space) of a Deep Neural Network (DNN) model is the multidimensional space of all possible parameters (weights and biases) and hyperparameters (i.e. layers, neurons, activation functions, etc.) that the training algorithm can explore to find the best-performing model.
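For illustration only, the hyperparameter portion of such a space can be sketched in Python as the Cartesian product of the available choices. The dictionary `SEARCH_SPACE`, its keys and values, and the helper `enumerate_space` are hypothetical and are not taken from the application or the cited references; the continuous weights and biases are omitted because they cannot be enumerated.

```python
from itertools import product

# Hypothetical hyperparameter choices (assumed values, for illustration only).
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "neurons_per_layer": [64, 128],
    "activation": ["relu", "tanh"],
}


def enumerate_space(space):
    """Return every candidate configuration in the space as a dict."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


candidates = enumerate_space(SEARCH_SPACE)
print(len(candidates))  # 3 * 2 * 2 = 12 candidate configurations
```

Each entry of `candidates` corresponds to one point in the space, i.e., one DNN model instance that a search procedure could evaluate.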
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5, 8, 10-17, 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US 20230177307) in view of Das et al. (US 2021/0350203).
Regarding claim 1, Wang teaches a computer-implemented method for obtaining an optimized Deep Neural Network ("DNN") model ([0087]) to run on a target device that maximizes the DNN performance on the target device, the method comprising:
obtaining, in a computing platform, hardware specification of the target device ([0055] “determining … a type of target hardware and its performance requirement”);
generating, by the computing platform, a quantitative hardware performance model based on the obtained hardware specification ([0053], [0059]-[0061], [0066], [0087], F4:411, 440, 450)(see NOTE);
obtaining, in the computing platform, a starting DNN model and DNN performance requirements for the optimized DNN model ([0055] “determining an initial DNN”, [0058], [0060]);
generating, by the computing platform, a DNN performance model based on the obtained starting DNN model and the obtained DNN performance requirements ([0058] “algorithm may be used to find a more lightweight DNN network structure of the initial DNN which still keeps needed performance level”, [0062], [0066], [0083]); and
generating, by the computing platform, the optimized DNN model and code through applying the quantitative hardware performance model ([0059], [0079], [0087] “use of runtime performance metrics to adjust compression algorithms”, “optimize DNN network compression strategy”) and the DNN performance model to an optimization space ([0057], [0063], [0066], [0083], [0087]) of a plurality of DNN model instances ([0060], [0062]-[0063], [0078]-[0079], [0085]).
Wang does not explicitly teach code optimizations; however, Das discloses code optimizations ([0066] “optimize the standard DNN model by modifying unsupported operations used for the execution of the task with supported operations to generate the optimized DNN model”, [0113], [0119]).
Das further discloses the complete limitation: generating, by the computing platform, the optimized DNN model and code through applying the quantitative hardware performance model and the DNN performance model to an optimization space ([0069], [0117]) of a plurality of DNN model instances and code optimizations ([0066] “optimize the standard DNN model by modifying unsupported operations used for the execution of the task with supported operations to generate the optimized DNN model”, [0113], [0119]).
NOTE: Wang teaches the runtime performance estimator, which uses quantization module 411 and obtains target hardware constraints. “The runtime performance estimator … corresponds to differentiable algorithms … denoted as Estimator_deferentiable,” which takes vectorized hardware parameters as input and produces a runtime performance estimation to optimize the DNN model ([0068]-[0070]; this makes it a model, i.e., a set of parameters and weights in a multidimensional space) and is stored in a library ([0084]). This is analogous to the limitation “generating, by the computing platform, a quantitative hardware performance model.”
However, merely to render such teachings obvious, Das discloses a quantitative hardware performance model based on the obtained hardware specification ([0072]-[0073], see predictor metamodel; [0114]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Wang to include code optimizations and a quantitative hardware performance model as disclosed by Das. Doing so would improve the overall DNN performance and efficiency (Das [0052]).
Regarding claim 2, Wang as modified teaches the method of claim 1, wherein the hardware specification of the target device specifies architecture, execution models and performance recipes of the target device (Wang [0050], [0053], Das [0072]-[0073], [0111]-[0112], [0144]).
Regarding claim 3, Wang as modified teaches the method of claim 1, wherein the target device is one of a plurality of platforms comprising servers, workstations, personal computing devices, mobile phones, embedded devices, specialized accelerators, FPGAs and ASICs (Wang [0050], Das [0063], [0125]).
Regarding claim 5, Wang as modified teaches the method of claim 2, wherein the architecture of hardware specification specifies both processing blocks and memory blocks (Das [0051], [0176]) of the target device and interconnections of the processing blocks and the memory blocks (Das [0059], [0065], [0072]-[0073], [0113], [0120], F8, F13A:1301).
Regarding claim 8, Wang as modified teaches the method of claim 1, wherein the generating hardware performance model comprises one or more of the following:
generating architectural specification of the target device (Das [0058], [0120], [0153]);
generating one or more hardware performance models on one or more common DNN operations through active profiling (Wang [0052]-[0053], [0080] see performance profiler, Das [0058], [0120]) and/or linear curve fitting (Das [0139], F6); and
generating one or more performance recipes based on the architectural specification and the one or more hardware performance models (Das [0058], [0153] “distinct and separate development pipelines [-recipes-] for learning the DNN architecture for different device hardware configurations and different quality requirements”, [0126], [0158], [0160]).
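For illustration only, the “linear curve fitting” recited in claim 8 can be sketched as an ordinary least-squares fit of measured latency against operation count. This is a generic sketch, not the method of Das or of the application; the function names and the sample measurements below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical profiling samples (assumed values):
# millions of multiply-accumulate operations vs. measured latency in ms.
mmacs = [10.0, 20.0, 40.0, 80.0]
latency_ms = [1.5, 2.5, 4.5, 8.5]

slope, intercept = fit_line(mmacs, latency_ms)


def predict_latency_ms(workload_mmacs):
    """Predict latency for an unmeasured workload from the fitted line."""
    return slope * workload_mmacs + intercept
```

Under these assumptions, the fitted line serves as a simple hardware performance model for one operation type: `predict_latency_ms(60.0)` estimates the latency of a workload that was never profiled.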
Regarding claim 10, Wang as modified teaches the method of claim 8, wherein the one or more common DNN operations comprise tensor multiplications of a plurality of shapes and sizes, tensor normalization, linear and non-linear tensor transformations (Wang [0057], Das [0061], [0153], [0222]).
Regarding claim 11, Wang as modified teaches the method of claim 8, wherein the one or more performance recipes comprises a decision tree, rules, external executable functions (Wang [0092], Das [0073], [0105], [0139]).
Regarding claim 12, Wang as modified teaches the method of claim 1, wherein the generating a DNN performance model comprises one or more of the following:
determining, for each layer in the obtained starting DNN model (Das [0077], [0116], [0128], [0140]), a DNN performance model through active profiling (Wang [0052]-[0053], [0080], Das [0058], [0120]); and generating a statistical description to capture one or more dynamic features (Das [0128]-[0128], [0150]) of a DNN model (Das [0106], [0132], [0148]).
Regarding claim 13, Wang as modified teaches the method of claim 12, wherein the one or more dynamic features of the DNN model the DNN model instances comprise conditional branching characteristics and parameters of the DNN performance models (Das [0125], [0127]-[0128], [0141]-[0142]).
Regarding claim 14, Wang as modified teaches the method of claim 12, wherein the statistical description comprises:
distributions of probabilities of taking each branch (Das [0066], [0093], [0127], [0142]); and/or
one or more machine learning models, wherein the one or more machine learning models are capable of predicting frequencies of branching taken by the DNN model running for certain input data, and running speed or amount of calculations of the DNN model (Das [0127], [0141], [0148]).
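For illustration only, a “distribution of probabilities of taking each branch” can be sketched as empirical frequencies over an observed trace of branch decisions, which can then weight a per-branch cost. The branch names, trace, and cost values are hypothetical and do not come from Das or the application.

```python
from collections import Counter


def branch_distribution(observed_branches):
    """Estimate the probability of each branch from observed branch decisions."""
    counts = Counter(observed_branches)
    total = sum(counts.values())
    return {branch: count / total for branch, count in counts.items()}


def expected_cost(distribution, cost_per_branch):
    """Probability-weighted cost over all branches in the distribution."""
    return sum(p * cost_per_branch[branch] for branch, p in distribution.items())


# Hypothetical trace of which branch a model took on eight inputs (assumed data).
trace = ["early_exit", "full", "full", "early_exit", "full", "full", "full", "full"]
dist = branch_distribution(trace)

# Hypothetical per-branch costs in milliseconds (assumed values).
cost = expected_cost(dist, {"early_exit": 2.0, "full": 10.0})
print(dist["early_exit"], dist["full"], cost)  # 0.25 0.75 8.0
```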
Regarding claim 15, Wang as modified teaches the method of claim 1, wherein the generating the optimized DNN model further comprises iteratively performing impressionistic refinement to the optimization space (Wang [0052], [0063], Das [0052], [0117], [0200], [0226]) and applying hybrid model-driven assessment on the plurality of DNN model instances (Das [0064], [0112]).
Regarding claim 16, Wang as modified teaches the method of claim 15, wherein the impressionistic refinement comprises:
determining a refined optimization space of DNN model instances (Wang [0052], [0063], Das [0052], [0117], [0200]) based on optimization recipes and the DNN performance model and the hardware performance model (Wang [0025], [0028], [0035], Das [0114]-[0117], [0072]); and
reducing the refined optimization space to a pre-determined size (Wang [0058], [0085]) by iteratively selecting a subset of the DNN model instances in the refined optimization space and measuring effects on various metrics of DNN model instances under various code optimizations (Wang [0002] “Model compression may reduce the model size and computation of DNN models to generate compressed/ pruned DNNs to be implemented on target hardware”, [0058], [0060], [0078]-[0079], [0083], Das [0117], [0140], [0163]),
wherein the various metrics comprises at least one item selected from the group consisting of speed, accuracy, size, power consumption, energy and memory (Wang [0053], [0063], [0073], Das [0079], [0148]).
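For illustration only, reducing a set of candidates to a pre-determined size by iteratively selecting a subset can be sketched with a greedy halving heuristic. This heuristic is a stand-in chosen for brevity; it is not asserted to be the claimed refinement or the technique of Wang or Das. The candidate records, the `score` weighting, and `reduce_space` are hypothetical.

```python
def reduce_space(candidates, score, target_size):
    """Repeatedly keep the better-scoring half until at most target_size remain."""
    if target_size < 1:
        raise ValueError("target_size must be at least 1")
    pool = list(candidates)
    while len(pool) > target_size:
        # In practice each round would re-measure the surviving candidates;
        # here the score is a fixed function of the stored metrics.
        pool.sort(key=score)
        pool = pool[:max(target_size, len(pool) // 2)]
    return pool


def score(candidate):
    """Lower is better: latency plus a hypothetical penalty on error rate."""
    return candidate["latency_ms"] + 10.0 * candidate["error"]


# Hypothetical model instances with assumed metrics.
instances = [
    {"name": "a", "latency_ms": 9.0, "error": 0.10},
    {"name": "b", "latency_ms": 4.0, "error": 0.20},
    {"name": "c", "latency_ms": 6.0, "error": 0.05},
    {"name": "d", "latency_ms": 3.0, "error": 0.40},
    {"name": "e", "latency_ms": 8.0, "error": 0.02},
]

survivors = reduce_space(instances, score, target_size=2)
print([s["name"] for s in survivors])  # ['b', 'c']
```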
Regarding claim 17, Wang as modified teaches the method of claim 16, wherein the hybrid model-driven assessment comprises:
analytically inferring the speed of the DNN model instances through applying the hardware performance model and the DNN performance model to the DNN model instances (Das [0194]-[0195]); and inferring the accuracy of the DNN model instances through sampling (Wang [0071], Das [0151], [0176]) and interpolation (Das [0106]-[0107], [0117], [0180]-[0183], [0207]).
Regarding claim 19, Wang as modified teaches the method of claim 9, wherein the active measuring comprises running a set of micro-kernels on the computing platform (Wang [0073], [0085]).
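For illustration only, running a micro-kernel to actively measure a hardware metric can be sketched as timing a small, fixed computation several times and keeping the best result. The kernel `dot_kernel` and the helper `time_kernel` are hypothetical and are not drawn from Wang or the application; measured times depend entirely on the machine running the sketch.

```python
import time


def time_kernel(kernel, repeats=5):
    """Return the best wall-clock time in seconds over several runs of kernel()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel()
        best = min(best, time.perf_counter() - start)
    return best


def dot_kernel(n=10_000):
    """A tiny dot-product kernel used only as a measurable unit of work."""
    a = [1.0] * n
    b = [2.0] * n
    return sum(x * y for x, y in zip(a, b))


elapsed = time_kernel(dot_kernel)
print(f"best of 5 runs: {elapsed:.6f} s")
```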
Claim 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of KOU et al. (US 20240185587) or SABOORI et al. (US 20210350233).
Regarding claim 4, Wang as modified does not explicitly teach, but KOU or SABOORI discloses, the method of claim 2, wherein the hardware specification is described in heterogeneous hardware specification language (KOU [0039]-[0040], SABOORI [0025], [0031]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Wang as modified to include heterogeneous hardware specification language as disclosed by KOU or SABOORI. Doing so would enhance the overall performance, e.g., inference speed, for all the models during operation (KOU [0042]) and maximize energy-efficiency and execution time on the target hardware (SABOORI [0031]).
Claim 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of Khailany et al. (US 20220067530).
Regarding claim 6, Wang as modified does not explicitly teach, however Khailany discloses the method of claim 2, wherein the execution models comprising thread models and synchronization schemes and constraints (Khailany [0118]-[0120]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include thread models and synchronization schemes and constraints as disclosed by Khailany. Doing so would enable greater performance, design flexibility, and software reuse in the form of collective group-wide function interfaces (Khailany [0119]).
Claim 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of Wang et al. (US 20210256384), hereafter Wang II.
Regarding claim 7, Wang as modified teaches the method of claim 2, wherein the performance recipes comprise one or more hardware constraints (Wang [0052], [0085], [0087], Das [0112], [0156], [0159]), one or more rules on preferred computation patterns (Das [0131], [0136], [0140])
Wang as modified does not explicitly teach, however Wang II discloses one or more rules on preferred data storage ([0045], [0093], [0103]) and access patterns ([0107], [0110]). It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include access patterns as disclosed by Wang II. Doing so would lead to fewer computations and fewer memory accesses thus reducing the memory bandwidth pressure (Wang II [0132]).
Claim 9 and alternatively claims 2-7, 10, 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of YAO et al. (US 20220129759).
Regarding claim 9, Wang as modified teaches the method of claim 8, wherein the generating architectural specification comprises conducting active measuring to determine hardware metrics of the target device, wherein hardware metrics include at least one item selected from the group consisting of memory hierarchy, processor speed
Wang as modified does not explicitly teach, however YAO discloses register file size ([0116], [0132]). It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include register file size as disclosed by YAO. Doing so would increase processing efficiency (YAO [0003]).
Regarding claim 2, if Wang as modified does not explicitly teach, YAO discloses the method of claim 1, wherein the hardware specification of the target device specifies architecture, execution models and performance recipes of the target device ([0112]-[0114], [0144]).
Wang as modified does not explicitly teach, however YAO discloses register file size ([0132]). It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include execution models as disclosed by YAO. Doing so would improve the speed of the overall system (YAO [0074]).
Regarding claim 5, if Wang as modified does not explicitly teach, YAO discloses the method of claim 2, wherein the architecture of hardware specification specifies both processing blocks and memory blocks of the target device and interconnections of the processing blocks and the memory blocks ([0045], [0052]-[0053], [0066], [0068], [0113]).
Regarding claim 6, Wang as modified teaches the method of claim 2, wherein the execution models comprising thread models and synchronization schemes and constraints (YAO [0063], [0107], [0124], [0158], [0160]).
Regarding claim 10, if Wang as modified does not explicitly teach, YAO discloses the method of claim 8, wherein the one or more common DNN operations comprise tensor multiplications of a plurality of shapes and sizes, tensor normalization, linear and non-linear tensor transformations ([0070]-[0071]).
Regarding claim 17, if Wang as modified does not explicitly teach, YAO discloses the method of claim 16, wherein the hybrid model-driven assessment comprises:
analytically inferring the speed of the DNN model instances through applying the hardware performance model and the DNN performance model to the DNN model instances ([0128], [0200]); and inferring the accuracy of the DNN model instances through sampling and interpolation ([0120], [0201], [0223]).
Wang as modified does not explicitly teach, however YAO discloses register file size ([0132]). It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include sampling and interpolation as disclosed by YAO. Doing so would improve the speed of the overall system (YAO [0074]).
Claims 8 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of Chandra et al. (US 20190266015).
Regarding claim 8, if Wang as modified does not explicitly teach, Chandra discloses generating one or more hardware performance models on one or more common DNN operations through active profiling ([0041], [0042]) and/or linear curve fitting ([0048]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include active profiling as disclosed by Chandra. Doing so would prevent machine downtime (Chandra [0001]).
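For illustration only, the general notion of building a hardware performance model through active profiling and linear curve fitting can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Chandra, Wang, or the application: the names `fit_line`, `profile`, and `dot_product` are hypothetical, and a pure-Python loop stands in for a DNN operation.

```python
import time
from typing import Callable, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def profile(op: Callable[[int], object], sizes: Sequence[int],
            repeats: int = 3) -> List[float]:
    """Actively measure op(size); keep the best of `repeats` runs, in seconds."""
    timings = []
    for size in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            op(size)
            best = min(best, time.perf_counter() - start)
        timings.append(best)
    return timings


def dot_product(n: int) -> int:
    """Hypothetical stand-in for a common DNN operation of size n."""
    return sum(i * i for i in range(n))


# Profile a few sizes, fit a line, and extrapolate to an unmeasured size.
sizes = [10_000, 20_000, 40_000]
slope, intercept = fit_line(sizes, profile(dot_product, sizes))
predicted_seconds = slope * 80_000 + intercept
```

The measured timings depend on the machine, so no particular output is asserted; only the fitting step is deterministic.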
Regarding claim 12, Wang as modified teaches the method of claim 1, wherein the generating a DNN performance model comprises one or more of the following:
determining, for each layer in the obtained starting DNN model a DNN performance model through active profiling (Chandra [0041], [0042]); and generating a statistical description to capture one or more dynamic features of a DNN model (Chandra [0040], [0038]-[0039]).
Claim 17 is/are additionally rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of KOU et al. (US 20240185587) or Pandya et al. (US 11675878).
Regarding claim 17, if Wang as modified does not explicitly teach, KOU and Pandya disclose the method of claim 16, wherein the hybrid model-driven assessment comprises:
analytically inferring the speed of the DNN model instances through applying the hardware performance model and the DNN performance model to the DNN model instances (KOU [0038], [0042], [0057], Pandya C27L1-10, C28L20-22); and inferring the accuracy of the DNN model instances through sampling and interpolation (Pandya C24L61-63).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include analytically inferring the speed of the DNN model and interpolation as disclosed by KOU and Pandya. Doing so would enhance the overall performance, e.g., inference speed, for all the models during operation (KOU [0042]) and allow for real-time inference while preserving model performance (Pandya C27L9-10).
Claim 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of Chai et al. (US 20210241108) or Herr et al. (US 20190317740).
Regarding claim 18, Wang as modified does not explicitly teach, however Chai and Herr disclose the method of claim 1, further comprises generating the corresponding optimized binary code library of the optimized DNN model (Chai [0116], [0118]-[0120], Herr [0038], [0055], [0070], [0100]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include optimized binary code library as disclosed by Chai or Herr. Doing so would improve the overall DNN performance and efficiency (Chai [0064]) and maximize energy-efficiency and execution time on the target hardware (SABOORI [0031]) and help developing a more accurate cost model for the particular processing element (Herr [0112]).
Claim 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang as modified and in further view of YOUN et al. (US 20220245083) or Li et al. (US 20240320047).
Regarding claim 19, if Wang as modified does not explicitly teach, YOUN or Li disclose the method of claim 9, wherein the active measuring comprises running a set of micro-kernels on the computing platform (YOUN [0037], [0085], Li Abstract, [0074]-[0075]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Wang as modified to include micro-kernels as disclosed by YOUN or Li. Doing so would provide highly optimized code (Li [0034]).
Response to Arguments
Applicant's arguments filed 10/26/2025 have been fully considered but they are not persuasive.
◊ With respect to the rejection under 35 USC 101, the applicant argues –
“The human mind cannot internally create a predictive mathematical model of heterogeneous hardware components like CPUs, GPUs, and memory”; “This core step involves an iterative, multi-dimensional search and synthesis process that is not only impractical but impossible for a human to perform without computational aid, as it requires simultaneously evaluating and balancing hardware constraints against network architecture possibilities to produce a technically superior, device-specific software output”; “Therefore, the claims are directed to a specific technological process that yields the concrete technical improvement of a DNN model.”
The arguments are not persuasive. While the claims may represent an improvement to code optimization, they do not, as either claimed or disclosed, represent a practical application. The claims recite obtaining data (hardware specifications), generating a performance model (mathematical calculations), obtaining a starting DNN model and performance requirements (initial parameters and calculations), generating a performance model based on the obtained model (updated mathematical / logical calculations), and generating an optimized model by combining and applying different model outputs (final mathematical / logical calculations). A human analyst can perform such calculations.
Under the 2019 Revised Guidance, the claims are evaluated to determine whether additional elements integrate the judicial exception into a practical application (see Manual of Patent Examining Procedure ("MPEP") §§ 2106.05(a)-(c), (e)-(h)). See 2019 Revised Guidance, 84 Fed. Reg. at 51-52, 55. A claim that integrates a judicial exception into a practical application applies, relies on, or uses the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the judicial exception. See 2019 Revised Guidance, 84 Fed. Reg. at 54.
For example, limitations that are indicative of "integration into a practical application" include:
- Improvements to the functioning of a computer, or to any other technology or technical field - see MPEP § 2106.05(a);
- Applying the judicial exception with, or by use of, a particular machine - see MPEP § 2106.05(b);
- Effecting a transformation or reduction of a particular article to a different state or thing - see MPEP § 2106.05(c); and
- Applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception - see MPEP § 2106.05(e).
In contrast, limitations that are not indicative of "integration into a practical application" include:
- Adding the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP § 2106.05(f);
- Adding insignificant extra-solution activity to the judicial exception - see MPEP § 2106.05(g); and
- Generally linking the use of the judicial exception to a particular technological environment or field of use - see MPEP § 2106.05(h).
See 2019 Revised Guidance, 84 Fed. Reg. at 54-55 ("Prong Two").
In view of the 2019 Revised Guidance, one must consider whether there are additional elements set forth in the claims that integrate the judicial exception into a practical application. The identified additional non-abstract elements recited in the only independent claim are a DNN model (logical code) and a target device (assumed to be hardware). However, these generic computer elements merely perform generic computer functions of receiving, processing and transmitting data, represent a purely conventional implementation of applicant's optimizing of DNN models, and do not represent significantly more than the abstract idea. See at least MPEP § 2106.05(a) ("Improvements to the Functioning of a Computer or to Any Other Technology or Technical Field").
This recited additional element is merely a generic computer component. The claims do not present any other issues as set forth in the 2019 Revised Guidance regarding a determination of whether the additional generic elements integrate the judicial exception into a practical application. See Revised Guidance, 84 Fed. Reg. at 55. Rather, the claims merely use instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. The claims do not recite improvements to the functioning of a computer or any other technology field (MPEP 2106.05(a)), the claims do not apply or use the abstract idea to effect a particular treatment or prophylaxis for a disease or medical condition, the claims do not apply the abstract idea with a particular machine (MPEP 2106.05(b)), the claims do not effect a transformation or reduction of a particular article to a different state or thing (e.g. data remains data even after processing; MPEP 2106.05(c)), and the claims do not apply or use the abstract idea in some other meaningful way beyond generally linking the use of the abstract idea to a particular technological environment (i.e. a generic computer) such that the claim as a whole is more than a drafting effort designed to monopolize the abstract idea (MPEP 2106.05(e)). The recited generic computing elements are no more than mere instructions to apply the exception using a generic computer component.
Considering the elements of the claim both individually and as “an ordered combination,” the functions performed by the computer system at each step of the process are purely conventional. Each step of the claimed method does no more than require a generic computer to perform a generic computer function. Thus, the claimed elements have not been shown to integrate the judicial exception into a practical application as set forth in the Revised Guidance, which references MPEP §§ 2106.04(d) and 2106.05(a)-(c) and (e)-(h). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, under Step 2A, Prong Two (MPEP §§ 2106.05(a)-(c) and (e)-(h)), the claims do not integrate the judicial exception into a practical application.
◊ With respect to the rejection under 35 USC 103, the applicant argues –
“Ref 1 fails to disclose a data-driven hardware performance model Feature A or the automated code generation of Feature B.”
◊◊ The arguments are not persuasive. With respect to Feature A - “generating a quantitative hardware performance model based on the obtained hardware specification of the target device, which computationally predicts the performance characteristics of specific hardware components,” Wang (Ref 1) clearly teaches optimizing DNN in three ways by –
“pruning, quantization and neural network search”; “conduct pruning, quantization and neural network search to generate compressed models/networks that will run on various target hardware with different architectures, such as GPU, CPU, FPGA” [0050]. This is performed by the PRUNING & QUANTIZATION module 411, as shown in Figure 4.
“a runtime performance estimator available for the target hardware type” [0059]; “Runtime performance simulator … obtain corresponding runtime performance data … data may comprise latency, throughput, power consumption” [0078], “obtaining a runtime performance estimator of a target hardware”, “runtime performance estimator … applied to the compression algorithm associated with the initial network (e.g., VGG16) by compression engine 411.” [0083]
The runtime performance estimator uses quantization module 411 and obtains the target hardware constraints. According to Wang, the runtime performance estimator corresponds to differentiable algorithms, denoted as Estimator_deferentiable, which takes vectorized hardware parameters as input and produces a runtime performance estimation to optimize the DNN model ([0068]-[0069]), and which is stored in a library [0084]. This estimator is analogous to the limitation “generating, by the computing platform, a quantitative hardware performance model” and to Feature A - “generating a quantitative hardware performance model based on the obtained hardware specification of the target device, which computationally predicts the performance characteristics of specific hardware components.”
◊◊ With respect to Feature B - “generating the optimized DNN model and executable code through the co-application of the quantitative hardware performance model and the DNN performance model to an optimization space of a plurality of DNN model instances and code optimizations,” once again it is noted that the claims do not require “executable code”; thus, it is not clear what functionality the applicant is specifically arguing.
Further, the optimization space of a Deep Neural Network (DNN) model is also known as a search space, i.e., the set of all possible parameters (weights and biases) and hyperparameters (e.g., activation functions) that is searched to find the best-performing model or best solution. Wang (Ref 1) teaches –
“(1) compression/search algorithm(s) used by compression engine 410 and corresponding initial DNN network structure or search space; (2) target hardware type, such as GPU, CPU, FPGA and Power9; and (3) runtime performance requirement” [0052], “algorithm may be used to find a more lightweight DNN network structure of the initial DNN which still keeps needed performance level” [0058] (which is analogous to the limitation - “a DNN performance model based on the obtained starting DNN model”).
“runtime performance estimator is used to characterize the relationship between a DNN model and its runtime performance in target hardware 430. Specifically, the runtime performance estimator defines a function of the structure configuration data” [0066].
As argued above, the runtime performance estimator is the “quantitative hardware performance model.” The quantitative hardware performance model (aka runtime performance estimator) is applied to the search space based on various hardware requirements to produce various instances of the DNN model -
“output compressed DNN model, i.e., a smaller version of VGG16, can meet the runtime requirement on target hardware” [aka “generating the optimized DNN model”] by applying, by engine 411, the performance estimator Estimator_deferentiable [aka a quantitative hardware performance model] to the compression algorithm associated with the initial network -
“performance estimator Estimator_deferentiable may be applied to the compression algorithm associated with the initial network (e.g., VGG16) by compression engine 411”; “engine 411 will start with the initial DNN (e.g., VGG16) and iteratively update the output compressed DNN model configuration using the compression algorithm” [0083];
“emulating a plurality of different compressed models of the initial DNN on the target hardware to obtain corresponding runtime performance data, wherein the different compressed models are defined with different configuration data” [0060];
“runtime performance estimator … stored along with the initial DNN structure or search space, compression/search algorithm, target hardware type and performance requirement” [0063].
In summary, Wang (Ref 1) teaches applying the Estimator_deferentiable (aka quantitative hardware performance model) and the lightweight DNN network structure of the initial DNN to the search space (optimization space) of a plurality of different compressed models of the initial DNN on the target hardware to obtain corresponding runtime performance data and select the best model (aka generating the optimized DNN model).
It is also noted that Wang (Ref 1) discloses – “use of runtime performance metrics to adjust compression algorithms according to embodiments of the invention may optimize DNN network compression strategy … fulfilling hardware requirements” [0087]. Adjusting compression algorithms and optimizing strategy obviously generates and optimizes code of the DNN network.
Thus, Wang (Ref 1) fully discloses Feature B - “generating the optimized DNN model and executable code through the co-application of the quantitative hardware performance model and the DNN performance model to an optimization space of a plurality of DNN model instances and code optimizations.”
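For illustration only, the general idea of applying a runtime performance estimator to a search space of candidate compressed models and selecting the best candidate that meets a runtime requirement can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the method of Wang or of the application: `Candidate`, `select_model`, and `toy_estimator`, as well as all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One hypothetical compressed-model configuration in the search space."""
    name: str
    params_millions: float  # model size after compression (assumed known)
    accuracy: float         # validation accuracy (assumed already measured)


def select_model(
    search_space: Iterable[Candidate],
    estimate_latency_ms: Callable[[Candidate], float],
    max_latency_ms: float,
) -> Optional[Candidate]:
    """Return the most accurate candidate whose estimated latency fits the budget."""
    feasible = [c for c in search_space
                if estimate_latency_ms(c) <= max_latency_ms]
    # None signals that no candidate satisfies the runtime requirement.
    return max(feasible, key=lambda c: c.accuracy, default=None)


def toy_estimator(c: Candidate) -> float:
    """Assumed estimator: latency grows linearly with parameter count."""
    return 0.5 + 0.1 * c.params_millions


space = [
    Candidate("full", 138.0, 0.92),
    Candidate("pruned_50", 69.0, 0.91),
    Candidate("pruned_75", 34.5, 0.88),
]
best = select_model(space, toy_estimator, max_latency_ms=8.0)
```

With these made-up values, the uncompressed candidate exceeds the latency budget, so the selection falls to the most accurate of the remaining candidates.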
◊ With respect to the combination of references of Wang and Das, the applicant argues –
“b. There is No Motivation to Combine the References.” Specifically, “combining a general teaching from Wang or Ref 1 with an unrelated teaching from another reference does not bridge this gap”; “The references were not designed to be combined for the purpose of solving the problem addressed by the present invention. There is no teaching, suggestion, or motivation in the cited art for a person of ordinary skill to arrive at the specific, integrated process defined by Features A and B”; “As such, not only there is no teaching, suggestion or motivation to combine Ref 1 and Ref 2, but also Ref 1 and Ref 2 are incompatible to combine, not to mention the claim-specific deficiency as discussed above, at least as to Feature A and Feature B.”
The arguments are not persuasive. It is first noted that the secondary reference of Das closely and analogously discloses the present invention – “generating an optimized DNN model for executing a task in an electronic device” (see Das Abstract). Therefore, the references solve the same problem, are in the same field of invention and, moreover, are fully interchangeable.
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, Wang (Ref 1) already discloses optimizing a compression algorithm and strategy, which obviously relate to at least mathematical code. Das further obviates such teachings and explicitly shows code generation and optimization (see [0066]: “optimize the standard DNN model by modifying unsupported operations used for the execution of the task with supported operations to generate the optimized DNN model”). The motivation used to combine the references is from the references themselves and is not improper.
Applicant's remaining arguments are addressed in the updated rejections to the claims above.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to POLINA G PEACH whose telephone number is (571)270-7646. The examiner can normally be reached Monday-Friday, 9:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached at 571-270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/POLINA G PEACH/
Primary Examiner, Art Unit 2165
November 2, 2025