Last updated: May 29, 2026
Application No. 18/325,553
Application Prototyping Systems And Methods

Non-Final OA §103
Filed: May 30, 2023
Examiner: ANYA, CHARLES E
Art Unit: 2194
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Kinara Inc.
OA Round: 1 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (82%) and a +33.3% interview lift. A written response may suffice.
Based on 898 resolved cases, 2023–2026
Examiner Intelligence

ANYA, CHARLES E
Grants 82% — above average
Career Allowance Rate
734 granted / 898 resolved
+26.7% vs TC avg
Strong +33% interview lift
Interview Lift: +33.3% (resolved cases with an interview vs. without)
Typical timeline: 3y 1m average prosecution; 25 currently pending
Career history: 934 total applications across all art units
Statute-Specific Performance

§101: 0.9% (-39.1% vs TC avg)
§103: 94.1% (+54.1% vs TC avg)
§102: 1.8% (-38.2% vs TC avg)
§112: 0.6% (-39.4% vs TC avg)
Based on career data from 898 resolved cases
Office Action

§103
DETAILED ACTION
Claims 1-20 are pending in this application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1 and 11 are objected to because of the following informalities:
Lines 3 and 7 of claims 1 and 11 include a typographical error. Specifically, the term “the” is missing before “identified resource constraints” and before “multiple computing devices”.
Appropriate correction is required, for example: “…using the identified resource constraints, creating multiple presentation models at least in part based on identified processing metrics…” and “…with the multiple presentation models including multiple processing pipelines configurable for execution on the multiple computing devices…”.



Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1-19 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-19 of copending Application No. 18/325,527 to Ghamore et al. in view of U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. and further in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al.
This is a provisional nonstatutory double patenting rejection.

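Claim comparison: each claim below is listed twice, first as recited in U.S. Application No. 18/325,553 (the present application) and then as recited in copending U.S. Application No. 18/325,527.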
Claim 1:
A processing method for multiple computing devices, comprising:
    identifying resource constraints for the multiple computing devices;
     using identified resource constraints, creating multiple presentation models at least in part based on identified processing metrics, with the multiple presentation models including multiple processing pipelines configurable for execution on multiple computing devices; and
       using an inference engine to provide an execution model for the multiple processing pipelines based at least in part on the multiple presentation models, with the execution model having improved processing metrics as compared to at least one of the multiple presentation models.  

Claim 1:
A processing method for multiple computing devices, comprising:
   identifying resource constraints for the multiple computing devices;
    using identified resource constraints, creating a presentation model having a plurality of modifiable parameters based at least in part based on the resource constraints; and 



   using at least one inference engine supporting neural network processing, with the inference engine executing a particular neural network model based at least in part on the presentation model.
Claim 2:
      The method of claim 1, wherein the processing metrics include at least one of a latency, an execution time, a memory consumed, an input/output data transfer time, and an inference time.  

Claim 2:
   The method of claim 1, wherein the creating is based on one or more processing metrics associated with the computing devices, the processing metrics including at least one of a latency, an execution time, a memory consumed, an input/output data transfer time, a numerical accuracy and an inference time.
Claim 3:
      The method of claim 1, further comprising generating user cases via a drag-and-drop visual editor.  
 Claim 3:
    The method of claim 1, further comprising generating one or more user cases via a drag-and-drop visual editor.
Claim 4:
     The method of claim 1, further comprising generating one or more user cases via text input adherent to a domain-specific language.  
Claim 4:
     The method of claim 1, further comprising generating one or more user cases via text input adherent to a domain-specific language.
Claim 5:
     The method of claim 1, wherein a processing pipeline further includes any combination of one or more computational stages such as an input stage, a preprocessing stage, a postprocessing stage, and an output. 
Claim 5:
   The method of claim 1, wherein the inference engine is associated with a processing pipeline that further includes any combination of one or more computational stages such as an input stage, a preprocessing stage, a postprocessing stage, and an output, wherein the processing pipeline is implemented on the computing devices.
Claim 6:
     The method of claim 1, wherein the inference engine can load and unload one or more neural network models generated from the presentation model by the neural network processing.  
Claim 6:
   The method of claim 1, wherein the inference engine can load and unload one or more neural network models generated from the presentation model by the neural network processing.
Claim 7:
    The method of claim 1, wherein one or more computing graphs associated with the presentation model are combined in any combination of sequential, parallel, and merged combinations.  
Claim 7:
   The method of claim 1, wherein one or more computing graphs associated with the presentation model are combined in any combination of sequential, parallel, and merged combinations.


Claim 8:
     The method of claim 7, wherein the combination is used to create one or more additional presentation models.  

Claim 8:
   The method of claim 7, wherein the combination is used to create one or more additional presentation models.
Claim 9:
     The method of claim 7, wherein the computing graphs are split so as to be executed on multiple compute devices attached to a network.  

Claim 9:
   The method of claim 7, wherein the computing graphs are split so as to be executed on multiple compute devices attached to a network.
Claim 11:
An apparatus comprising:
  a plurality of computing devices; and 
   a pipeline processing architecture generator configured to: 
   identify resource constraints for the computing devices; 
    using identified resource constraints, create multiple presentation models at least in part based on identified processing metrics, with the multiple presentation models including multiple processing pipelines configurable for execution on multiple computing devices; and
     use an inference engine to provide an execution model for the multiple processing pipelines based at least in part on the multiple presentation models, with the execution model having improved processing metrics as compared to at least one of the multiple presentation models.  
Claim 11:
An apparatus comprising:
  a plurality of computing devices; and 
   a pipeline processing architecture generator configured to: 
   identify resource constraints for the computing devices; 
   using identified resource constraints, create a presentation model having a plurality of modifiable parameters based at least in part based on the resource constraints; and 


  use at least one inference engine supporting neural network processing, with the inference engine executing a particular neural network model on the computing devices, wherein the neural network model is based at least in part on the presentation model
Claim 12:
    The apparatus of claim 11, wherein the processing metrics include at least one of a latency, an execution time, a memory consumed, an input/output data transfer time, and an inference time.  
Claim 12:
    The apparatus of claim 11, wherein the presentation model is created is based on one or more processing metrics associated with the computing devices, the processing metrics including at least one of a latency, an execution time, a memory consumed, an input/output data transfer time, a numerical accuracy and an inference time.
Claim 13:
     The apparatus of claim 11, further comprising generating user cases via a drag-and-drop visual editor.  

Claim 13:
   The apparatus of claim 11, wherein the pipeline processing architecture generator generates one or more user cases via a drag-and-drop visual editor.
Claim 14:
    The apparatus of claim 11, further comprising generating one or more user cases via text input adherent to a domain-specific language.  
Claim 14:
   The apparatus of claim 11, wherein the pipeline processing architecture generator generates one or more user cases via text input adherent to a domain-specific language.
Claim 15:
   The apparatus of claim 11, wherein a processing pipeline further includes any combination of one or more computational stages such as an input stage, a preprocessing stage, a postprocessing stage, and an output.  

Claim 15:
   The apparatus of claim 11, wherein the inference engine is associated with a processing pipeline that further includes any combination of one or more computational stages such as an input stage, a preprocessing stage, a postprocessing stage, and an output, wherein the processing pipeline is implemented on the computing devices.
Claim 16:
     The apparatus of claim 11, wherein the inference engine can load and unload one or more neural network models generated from the presentation model by the neural network processing. 
Claim 16:
   The apparatus of claim 11, wherein the inference engine can load and unload one or more neural network models generated from the presentation model by the neural network processing.


Claim 17:
   The apparatus of claim 11, wherein one or more computing graphs associated with a presentation model are combined in any combination of sequential, parallel, and merged combinations.  
Claim 17:
  The apparatus of claim 11, wherein one or more computing graphs associated with the presentation model are combined in any combination of sequential, parallel, and merged combinations.
Claim 18:
   The apparatus of claim 17, wherein the combination is used to create one or more additional presentation models.  

Claim 18:
  The apparatus of claim 17, wherein the combination is used to create one or more additional presentation models.
Claim 19:
  The apparatus of claim 17, wherein the computing graphs are split so as to be executed on multiple compute devices attached to a network.  

Claim 19:
  The apparatus of claim 17, wherein the computing graphs are split so as to be executed on multiple compute devices attached to a network.

Ghamore is silent with reference to the multiple presentation models including multiple processing pipelines configurable for execution on multiple computing devices, and to the execution model having improved processing metrics as compared to at least one of the multiple presentation models.
         Bourges-Sevenier teaches the multiple presentation models including multiple processing pipelines configurable for execution on multiple computing devices (devices) (Block 418) (“…The example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420)…” paragraph 0050).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ghamore with the teaching of Bourges-Sevenier because the teaching of Bourges-Sevenier would improve the system of Ghamore by providing techniques for optimally executing programming models.
Guim teaches the execution model having improved processing metrics as compared to at least one of the multiple presentation models (“…A work request received at a platform or originated at the platform may identify an AI model type associated to be performed for the request and an SLA. The SLA may specify one or more of: AI model type, time to complete a pre-processing model, time to complete processing and make data available after a particular processing layer or stage, minimum compute resources, minimum network interface bandwidth to be allocated, minimum compute resources, minimum memory read or write rates. Examples of types of target operations include, but are not limited, to: person detection, object detection, music detection, and so forth. In response to receipt of a request to perform AI or ML inference, complexity estimator 1302 can identify a pre-processing model to be executed for the operation targeted by the request. For example, if the request indicates an AI model type includes identification of a person, complexity estimator 1302 could identify an associated pre-processing model of person detection and provides inputs to the person detection model that include a pointer to a payload to process and an SLA parameter for the acceleration function…By performing the pre-processing model, complexity estimator 1302 can identify hardware resources to perform a subsequent layer or stage to satisfy the provided SLA from various of hardware resources (e.g., FPGA accelerators, memory, network, CPU, XPU, GPU, IPU, DPU, and so forth) available to access from the platform. Complexity estimator 1302 can output meta-data that indicates a complexity level for one or more layers performed for the request. A complexity level can include target priority and estimated runtime 1304 and estimated resources 1306. For example, target priority and estimated runtime 1304 can indicate one or multiple devices and a time to completion for a subsequent phase. 
For example, target priority and estimated runtime 1304 can indicate accelerator device Accel0 can complete processing a subsequent phase in 10 milliseconds (ms), accelerator device Accel1 can complete processing a subsequent phase in 15 ms, and a CPU can complete the subsequent phase in 100 ms. For example, estimated resources 1306 can indicate hardware resources to allocate to complete the subsequent phase in the specified time to complete processing. For example, Accel0 can utilize allocation of N number of accelerators and M memory bandwidth to complete processing a subsequent phase in 10 ms; Accel1 can utilize allocation of P number of accelerators and Q memory bandwidth complete processing a subsequent phase in 15 ms; and use of a CPU can utilize R cores running at S frequency and T memory bandwidth complete processing a subsequent phase in 100 ms…” paragraphs 0075/0076).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ghamore and Bourges-Sevenier with the teaching of Guim because the teaching of Guim would improve the system of Ghamore and Bourges-Sevenier by providing an inference process of arriving at some conclusion that possesses some degree of probability relative to a premise.
	
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 5, 10-12, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of WO 2020/185234 A1 to Murphy et al.
As to claim 1, Bourges-Sevenier teaches a processing method for multiple computing devices, comprising:
identifying resource constraints for the multiple computing devices (devices) (Block 418) (“…The example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420)…” paragraph 0050); 
using the identified resource constraints (Block 430), creating multiple presentation models at least in part based on identified processing metrics (Block 420) (“:…The example quantizer 230 of the illustrated example of FIG. 2 is implemented by a logic circuit such as, for example, a hardware processor. However, any other type of circuitry may additionally or alternatively be used such as, for example, one or more analog or digital circuit(s), logic circuits, programmable processor(s), ASIC(s), PLD(s), FPLD(s), programmable controller(s), GPU(s), DSP(s), CGRA(s), ISP(s), etc. The example quantizer 235 identifies constraints associated with execution of the model. In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware constraints (e.g., bit-width capabilities, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing, and performs quantization of the layer based on the constraints…The example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420)…The example quantizer 235 performs quantization of the layer based on the constraints. (Block 430). 
Quantization maps a dense matrix M into a sparse matrix M′ where only non-zero elements contribute to y′.sub.i. Hardware accelerators contain vectorization and tiling instructions favoring 1D or 2D spatial memory arrangements respectively. In examples disclosed herein, a symmetric linear quantization function is used. However, any other quantization scheme may additionally or alternatively be used. In some examples, before applying the quantization function, the values distribution of each operand (e.g., within a DNN model layer) are shifted to an appropriate order of magnitude…” paragraph 0028/0050/0051).
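For orientation, the symmetric linear quantization named in the quoted Bourges-Sevenier passage can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference's implementation: the function names, the 8-bit default, and the max-absolute-value scaling rule are all hypothetical choices, and the sketch does not model the dense-to-sparse mapping or the per-layer constraints the passage describes.

```python
def symmetric_linear_quantize(values, num_bits=8):
    """Map floats to signed integers with one scale factor and no zero-point.

    Assumption: the scale is chosen so the largest magnitude maps to the
    largest representable integer (127 for 8 bits).
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2")
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]  # illustrative values, not from the reference
codes, scale = symmetric_linear_quantize(weights, num_bits=8)
restored = dequantize(codes, scale)
```

The range is symmetric around zero, so a weight of 0.0 always maps to code 0, and each restored value differs from the original by at most half a quantization step.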
Bourges-Sevenier is silent with reference to using an inference engine to provide an execution model for the multiple processing pipelines based at least in part on the multiple presentation models, with the execution model having improved processing metrics as compared to at least one of the multiple presentation models, and to the multiple presentation models including multiple processing pipelines configurable for execution on the multiple computing devices.
Guim teaches using an inference engine (Complexity Estimator 1302) to provide an execution model for the multiple processing pipelines (pre-processing model/subsequent phase) based at least in part on the multiple presentation models (Examples of types of target operations include, but are not limited, to: person detection, object detection, music detection, and so forth), with the execution model having improved processing metrics (The SLA may specify one or more of: AI model type, time to complete a pre-processing model, time to complete processing and make data available after a particular processing layer or stage, minimum compute resources, minimum network interface bandwidth to be allocated, minimum compute resources, minimum memory read or write rates) as compared to at least one of the multiple presentation models (“…A work request received at a platform or originated at the platform may identify an AI model type associated to be performed for the request and an SLA. The SLA may specify one or more of: AI model type, time to complete a pre-processing model, time to complete processing and make data available after a particular processing layer or stage, minimum compute resources, minimum network interface bandwidth to be allocated, minimum compute resources, minimum memory read or write rates. Examples of types of target operations include, but are not limited, to: person detection, object detection, music detection, and so forth. In response to receipt of a request to perform AI or ML inference, complexity estimator 1302 can identify a pre-processing model to be executed for the operation targeted by the request. 
For example, if the request indicates an AI model type includes identification of a person, complexity estimator 1302 could identify an associated pre-processing model of person detection and provides inputs to the person detection model that include a pointer to a payload to process and an SLA parameter for the acceleration function…By performing the pre-processing model, complexity estimator 1302 can identify hardware resources to perform a subsequent layer or stage to satisfy the provided SLA from various of hardware resources (e.g., FPGA accelerators, memory, network, CPU, XPU, GPU, IPU, DPU, and so forth) available to access from the platform. Complexity estimator 1302 can output meta-data that indicates a complexity level for one or more layers performed for the request. A complexity level can include target priority and estimated runtime 1304 and estimated resources 1306. For example, target priority and estimated runtime 1304 can indicate one or multiple devices and a time to completion for a subsequent phase. For example, target priority and estimated runtime 1304 can indicate accelerator device Accel0 can complete processing a subsequent phase in 10 milliseconds (ms), accelerator device Accel1 can complete processing a subsequent phase in 15 ms, and a CPU can complete the subsequent phase in 100 ms. For example, estimated resources 1306 can indicate hardware resources to allocate to complete the subsequent phase in the specified time to complete processing. For example, Accel0 can utilize allocation of N number of accelerators and M memory bandwidth to complete processing a subsequent phase in 10 ms; Accel1 can utilize allocation of P number of accelerators and Q memory bandwidth complete processing a subsequent phase in 15 ms; and use of a CPU can utilize R cores running at S frequency and T memory bandwidth complete processing a subsequent phase in 100 ms…” paragraphs 0075/0076).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier with the teaching of Guim because the teaching of Guim would improve the system of Bourges-Sevenier by providing an inference process of arriving at some conclusion that possesses some degree of probability relative to a premise.
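The per-device runtime estimates in the Guim passage quoted above (Accel0 at 10 ms, Accel1 at 15 ms, CPU at 100 ms) lend themselves to a short sketch of how such estimates might be checked against an SLA. The class name, the function name, the 20 ms SLA value, and the "fastest eligible device" policy are assumptions made for illustration; the quoted paragraphs do not state which selection rule Guim applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceEstimate:
    """Hypothetical record pairing a device with its estimated runtime."""
    device: str
    estimated_runtime_ms: float


def select_device(estimates, sla_ms):
    """Return the fastest device whose estimate meets the SLA, or None.

    Assumption: "fastest eligible" is only one possible policy; it is not
    taken from the reference.
    """
    eligible = [e for e in estimates if e.estimated_runtime_ms <= sla_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda e: e.estimated_runtime_ms)


# Runtimes taken from the quoted example; the SLA value is illustrative.
estimates = [
    DeviceEstimate("Accel0", 10.0),
    DeviceEstimate("Accel1", 15.0),
    DeviceEstimate("CPU", 100.0),
]
choice = select_device(estimates, sla_ms=20.0)
```

With a 20 ms SLA, both accelerators qualify and the CPU does not. An SLA tighter than 10 ms leaves no eligible device, which the sketch reports as None.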
Murphy teaches the multiple presentation models including multiple processing pipelines (video preprocessing pipeline/Preprocessing Pipeline 500) configurable for execution on the multiple computing devices (clients 102) (“…The deep learning server can connect to several machine learning models running distributed or on the same server, and provides a robust and flexible video preprocessing pipeline, optimizing the resources for several different clients. The clients may involve many different types of devices, including robots, printers, mobile phones, assistants/kiosks, etc…Deep learning server 104 may dynamically reuse preprocessing pipelines for multiple clients as long as the clients ask for the same machine learning model over the same sensor data. For example, if three clients ask for an inference over the same video stream, deep learning server 104 may automatically make use of the same pipeline and attach new clients to the end of the pipeline to receive the results from the machine learning model. The preprocessing pipelines make preprocessing more efficient as processing components may be reused and, by decreasing resource utilization via resource sharing, more machine learning inferences may be provided for clients with the same available hardware at the edge. The preprocessing pipelines are also flexible enough to accept different types of data sources, such as audio or video…In addition, when there are multiple machine learning models working on the same video stream, but with different preprocessing functions, the deep learning server 104 can handle this situation using the preprocessing pipelines. Figure 5 is a diagram illustrating a preprocessing pipeline 500 to receive a single video stream and preprocess the video stream for two different machine learning models 512 and 518 according to one example. The preprocessing pipeline 500 includes an input feeder 506, an image resize processing unit 508, a first tensor processing unit 510, and a second tensor processing unit 516….” paragraphs 0013/0079/0080).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier and Guim with the teaching of Murphy because the teaching of Murphy would improve the system of Bourges-Sevenier and Guim by providing a deep learning server that gives users the ability to specify a preprocessing pipeline for any given machine learning model (Murphy paragraph 0032).
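The pipeline reuse described in the quoted Murphy passage (clients that request the same machine learning model over the same sensor data share one preprocessing pipeline) can be sketched as a registry keyed by model and stream. The class name, the method names, and the client, model, and stream identifiers below are hypothetical; this is an illustration of the sharing idea only, not Murphy's deep learning server.

```python
class PipelineRegistry:
    """Hypothetical registry that shares one pipeline per (model, stream) pair."""

    def __init__(self):
        # Maps (model_name, stream_id) to the clients attached to that pipeline.
        self._pipelines = {}

    def attach(self, client_id, model_name, stream_id):
        """Attach a client; return True only if a new pipeline was created."""
        key = (model_name, stream_id)
        created = key not in self._pipelines
        self._pipelines.setdefault(key, []).append(client_id)
        return created

    def pipeline_count(self):
        """Number of distinct pipelines currently in use."""
        return len(self._pipelines)

    def clients_for(self, model_name, stream_id):
        """Clients receiving results from the given pipeline."""
        return list(self._pipelines.get((model_name, stream_id), []))


registry = PipelineRegistry()
# Three clients ask for the same model over the same video stream.
created_flags = [
    registry.attach(client, "person_detection", "camera-1")
    for client in ("robot", "printer", "kiosk")
]
# A different model over the same stream gets its own pipeline in this sketch.
registry.attach("phone", "object_detection", "camera-1")
```

Only the first of the three matching requests creates a pipeline; the other two are attached to it, which mirrors the three-client example in the quoted text.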
As to claim 2, Bourges-Sevenier teaches the method of claim 1, wherein the processing metrics include at least one of a latency, an execution time, a memory consumed, an input/output data transfer time, and an inference time (Block 418) (“…The example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420)…” paragraph 0050).  

 As to claim 5, Bourges-Sevenier teaches the method of claim 1, wherein a processing pipeline further includes any combination of one or more computational stages such as an input stage, a preprocessing stage, a postprocessing stage, and an output (Block 420). 

As to claim 10, Guim teaches the method of claim 1, wherein the inference engine segregates processing pipelines in stages (pre-processing model/subsequent phase) (“…A work request received at a platform or originated at the platform may identify an AI model type associated to be performed for the request and an SLA. The SLA may specify one or more of: AI model type, time to complete a pre-processing model, time to complete processing and make data available after a particular processing layer or stage, minimum compute resources, minimum network interface bandwidth to be allocated, minimum compute resources, minimum memory read or write rates. Examples of types of target operations include, but are not limited, to: person detection, object detection, music detection, and so forth. In response to receipt of a request to perform AI or ML inference, complexity estimator 1302 can identify a pre-processing model to be executed for the operation targeted by the request. For example, if the request indicates an AI model type includes identification of a person, complexity estimator 1302 could identify an associated pre-processing model of person detection and provides inputs to the person detection model that include a pointer to a payload to process and an SLA parameter for the acceleration function…By performing the pre-processing model, complexity estimator 1302 can identify hardware resources to perform a subsequent layer or stage to satisfy the provided SLA from various of hardware resources (e.g., FPGA accelerators, memory, network, CPU, XPU, GPU, IPU, DPU, and so forth) available to access from the platform. Complexity estimator 1302 can output meta-data that indicates a complexity level for one or more layers performed for the request. A complexity level can include target priority and estimated runtime 1304 and estimated resources 1306. 
For example, target priority and estimated runtime 1304 can indicate one or multiple devices and a time to completion for a subsequent phase. For example, target priority and estimated runtime 1304 can indicate accelerator device Accel0 can complete processing a subsequent phase in 10 milliseconds (ms), accelerator device Accel1 can complete processing a subsequent phase in 15 ms, and a CPU can complete the subsequent phase in 100 ms. For example, estimated resources 1306 can indicate hardware resources to allocate to complete the subsequent phase in the specified time to complete processing. For example, Accel0 can utilize allocation of N number of accelerators and M memory bandwidth to complete processing a subsequent phase in 10 ms; Accel1 can utilize allocation of P number of accelerators and Q memory bandwidth complete processing a subsequent phase in 15 ms; and use of a CPU can utilize R cores running at S frequency and T memory bandwidth complete processing a subsequent phase in 100 ms…” paragraphs 0075/0076).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier and Murphy with the teaching of Guim because the teaching of Guim would improve the system of Bourges-Sevenier and Murphy by providing an inference process of arriving at a conclusion that possesses some degree of probability relative to a set of premises.

As to claim 11, see the rejection of claim 1 above, and further in view of a pipeline processing architecture generator.
Bourges-Sevenier teaches a pipeline processing architecture generator (Block 418) (“…The example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420)…” paragraph 0050).

As to claim 12, see the rejection of claim 2 above.

As to claim 15, see the rejection of claim 5 above.

As to claim 20, see the rejection of claim 10 above. 

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of W.O. No. 2020/185234 A1 to Murphy et al., as applied to claims 1 and 11 above, and further in view of U.S. Pub. No. 2017/0099193 A1 to Mahajan et al.

As to claim 3, Bourges-Sevenier as modified by Guim and Murphy teaches the method of claim 1, however it is silent with reference to generating user cases via a drag-and-drop visual editor.  
Mahajan teaches generating user cases via a drag-and-drop visual editor (Pipeline Designing Module 210/Graphical Pipeline) (“…At block 302, the pipeline designing module 210 is configured to enable a first user for designing a distributed computing pipeline (hereinafter also referred as “pipeline” or “graphical pipeline”) over a Graphical User Interface (GUI) of a first data processing environment of a stream analytics platform. The distributed computing pipeline may comprise a subset of component selected from a universal set of components and a set of links corresponding to the subset of component. In one aspect, the pipeline designing module 210 may enable the first user of the first data processing environment to select the subset of component from the universal set of components of the stream analytics platform. The subset of components may comprise at least one of a channel component, a processor component, an analytical component and an emitter component. In one aspect, the GUI may comprise at least a canvas and a palette, wherein the palette is configured to display the universal set of components of the stream analytics platform, and wherein the pipeline designing module 210 may enable the first user to drag and drop one or more components over the canvas from the universal set of components of the stream analytics platform for generating the distributed computing pipeline. In one embodiment, the pipeline designing module 210 may enable the first user to update one or more components of the distributed computing pipeline in order to generate one or more versions corresponding to the distributed computing pipeline in the stream analytics platform…” paragraph 0037).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim and Murphy with the teaching of Mahajan because the teaching of Mahajan would improve the system of Bourges-Sevenier, Guim and Murphy by providing a Graphical User Interface (GUI) for designing processing pipelines.

As to claim 13, see the rejection of claim 3 above.

Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of W.O. No. 2020/185234 A1 to Murphy et al., as applied to claims 1 and 11 above, and further in view of U.S. Pub. No. 2017/0099193 A1 to Mahajan et al.

As to claim 4, Bourges-Sevenier as modified by Guim and Murphy teaches the method of claim 1, however it is silent with reference to generating one or more user cases via text input adherent to a domain-specific language.
Mahajan teaches generating one or more user cases via text input adherent to a domain-specific language (Domain-specific application 310).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim and Murphy with the teaching of Mahajan because the teaching of Mahajan would improve the system of Bourges-Sevenier, Guim and Murphy by providing a graphical user input box for data input.

As to claim 14, see the rejection of claim 4 above.

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of W.O. No. 2020/185234 A1 to Murphy et al., as applied to claims 1 and 11 above, and further in view of U.S. Pat. No. 11,467,835 B1 issued to Sengupta et al.

As to claim 6, Bourges-Sevenier as modified by Guim and Murphy teaches the method of claim 1, however it is silent with reference to wherein the inference engine can load and unload one or more neural network models generated from the presentation model by the neural network processing.  
	Sengupta teaches wherein the inference engine can load (load(s)/loading) (“…The inference application 213 also loads one or more models that it wants to use to the accelerator slot. In some embodiments, this load is to an inference engine 325 which then calls the model validator 327 to validate any uploaded model(s). The success or failure of that validation is provided to the inference engine. In other embodiments, the load from the inference engine 212 is to the model validator 327 which validates the model, chooses an inference engine 325 to utilize, and provides the validated model to the chosen inference engine 325. An indication of successful model loading is provided to the inference application 213…” Col. 11 Ln. 48-52) and unload one or more neural network models generated from the presentation model by the neural network processing (unloaded/unloading) (“…When the model is not longer to be used, it is unloaded via a command from the inference application 213 to the accelerator slot. In particular, the inference engine 325 is no longer provisioned to handle requests from the inference application 213…The EI interface is sized by specifying arithmetic precision (such as FP32, FP16, INT8, etc.) and computational capacity (TOPS). An EI interface API allows for loading models, making inference calls against them (such as tensor in/tensor out), and unloading models. Multiple models can be loaded via an EI interface at any given time. A model consists of (i) a description of the whole computation graph for inference, and (ii) weights obtained from training. An example of an API command for loading is as follows: S eia load-model--model-location “s3 location”--role “eiaRole”--model_name “my_model_1”--max_batch_size 16”…” Col. 11 Ln. 56-60).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim and Murphy with the teaching of Sengupta because the teaching of Sengupta would improve the system of Bourges-Sevenier, Guim and Murphy by providing a technique for reclaiming computer resources for later use.

As to claim 16, see the rejection of claim 6 above.

Claims 7, 8, 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of W.O. No. 2020/185234 A1 to Murphy et al., as applied to claims 1 and 11 above, and further in view of U.S. Pub. No. 2018/0322365 A1 to Yehezkel et al.

As to claim 7, Bourges-Sevenier as modified by Guim and Murphy teaches the method of claim 1, however it is silent with reference to wherein one or more computing graphs associated with the presentation model are combined in any combination of sequential, parallel, and merged combinations.
Yehezkel teaches wherein one or more computing graphs associated with the presentation model are combined (merging of layers, such as converting consecutive layers CNN into DBN into then a single network at block 883) in any combination of sequential, parallel, and merged combinations (Step 885)(“…As illustrated in this embodiment, method 880 starts with the joint learning and/or updating stage at block 881, which then leads to stacking of models (e.g., user-dependent model, user-independent model, etc.) and merging of layers, such as converting consecutive layers CNN into DBN into then a single network at block 883 using user-independent classifier (e.g., CNN) at 885 and user-dependent classifier (e.g., DBN) at block 887… In one embodiment, method 880 continues with updating of parameters of the joint model at block 889 using user-independent training data 891 and outputting final classifier 893 and subsequently, leading to ending of method 880 with now having available the highest accuracy…” paragraphs 0212/0213).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim and Murphy with the teaching of Yehezkel because the teaching of Yehezkel would improve the system of Bourges-Sevenier, Guim and Murphy by providing learned weights of the deep belief network that are used to pre-train neural networks by determining an optimal initial set of weights for the neural network.

As to claim 8, Bourges-Sevenier as modified by Guim and Murphy teaches the method of claim 7, however it is silent with reference to wherein the combination is used to create one or more additional presentation models.
Yehezkel teaches wherein the combination is used to create one or more additional presentation models (Blocks 809/811/813) (“…At block 809, as illustrated, a decision is made as to whether the user-dependent learning has completed. If not, method 800 continue at block 803 with extraction features from live data. If yes, method 800 continues at block 811 with combining of the user-independent and user-dependent models into a joint model. At block 813, the joint model is updated using user-independent data. At block 815, method 800 ends or restarts with full operation…In one embodiment, upon training the user-dependent model at block 851, a determination is made as to whether the learning of the user-dependent model is completed at block 855. If not, method 840 loops back to bock 845 for classification of user-independent model and continues thereon. If not, joint learning/updating stage is triggered at block 857 as further described and illustrated with respect to FIG. 8D…” paragraphs 0198/0208).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim and Murphy with the teaching of Yehezkel because the teaching of Yehezkel would improve the system of Bourges-Sevenier, Guim and Murphy by providing learned weights of the deep belief network that are used to pre-train neural networks by determining an optimal initial set of weights for the neural network (Yehezkel paragraph 0250).

As to claim 17, see the rejection of claim 7 above.

As to claim 18, see the rejection of claim 8 above.

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 2019/0325314 A1 to Bourges-Sevenier et al. in view of U.S. Pub. No. 2021/0271517 A1 to Guim et al. and further in view of W.O. No. 2020/185234 A1 to Murphy et al. and further in view of U.S. Pub. No. 2018/0322365 A1 to Yehezkel et al., as applied to claims 1 and 11 above, and further in view of W.O. No. 2021/177976 A1 to Anil et al.

As to claim 9, Bourges-Sevenier as modified by Guim, Murphy and Yehezkel teaches the method of claim 7, however it is silent with reference to wherein the computing graphs are split so as to be executed on multiple compute devices attached to a network.
Anil teaches wherein the computing graphs (computational graph) are split (subgraphs) so as to be executed on multiple compute devices attached to a network (the distributed computing system can assign a computational graph across multiple groups of hardware accelerators) (“…Depending on the number of subgraphs and the number of hardware accelerators available, the distributed computing system can assign a computational graph across multiple groups of hardware accelerators, with each hardware accelerator in a group being assigned a unique subgraph of the computational graph…In implementations in which the distributed computing system partitions the computational graph into a plurality of subgraphs and assigns each subgraph to a respective hardware accelerator, the distributed computing system is configured to identify an input subgraph of the computational graph. In this specification, an input subgraph of a computational graph is a subgraph that includes a node that represents the input operation of the computational graph, i.e., the operation that receives, as input, a data element preprocessed by a computing device according as a result of executing the preprocessing operations…”).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Bourges-Sevenier, Guim, Murphy and Yehezkel with the teaching of Anil because the teaching of Anil would improve the system of Bourges-Sevenier, Guim, Murphy and Yehezkel by providing a distributed computing system for executing operations of a processing pipeline among a plurality of hardware accelerators, each having one or more specialized processing units of special-purpose logic circuitry and configured to perform a specialized processing task, e.g., matrix multiplication (Anil).

As to claim 19, see the rejection of claim 9 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. Pub. No. 2021/0034372 A1 to Yates et al., directed to methods, systems, and devices for data processing in which data pipelines may be implemented to handle data processing jobs.
U.S. Pub. No. 2022/0269548 A1 to Dwivedi et al., directed to systems and techniques to collect performance data for one or more computation tasks executed by a plurality of nodes of a computational pipeline and to enable optimization of the distribution of task execution among the plurality of nodes.
U.S. Pub. No. 2021/0312674 A1 to Abroi et al., directed to domain adaptation techniques that involve training a post-processing model to optimize/correct the image-based inference output of the source domain model when applied to images of a different domain.
U.S. Pub. No. 2023/0105476 A1 to Ani et al., directed to methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing computational graphs on distributed computing devices.
U.S. Pub. No. 2021/0327018 A1 to Carranza et al., directed to a compute device that receives a video stream to be processed through the CV pipeline and performs a first portion of the CV pipeline.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES E ANYA whose telephone number is (571)272-3757. The examiner can normally be reached Mon-Fri, 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KEVIN YOUNG, can be reached at 571-270-3180. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES E ANYA/Primary Examiner, Art Unit 2194
Prosecution Timeline

May 30, 2023
Application Filed
Mar 06, 2026
Response after Non-Final Action
Apr 22, 2026
Non-Final Rejection mailed — §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/232,559
Patent 12626272
APPLICATION PROGRAM INTERFACE SCRIPT CACHING AND BATCHING
2y 9m to grant Granted May 12, 2026
18/102,502
Patent 12619466
SYSTEM AND METHOD FOR GENERATING CONSOLIDATED RESOURCE ACCESS CONTROL DATA IN AN ELECTRONIC NETWORK
3y 3m to grant Granted May 05, 2026
18/310,851
Patent 12619481
DATA-STREAMING SYSTEM OVERLAY INFRASTRUCTURE FOR DEPLOYMENT PIPELINES
3y 0m to grant Granted May 05, 2026
18/183,635
Patent 12608731
Promoting APIs Based on Usage
3y 1m to grant Granted Apr 21, 2026
18/116,751
Patent 12591471
KNOWLEDGE GRAPH REPRESENTATION OF CHANGES BETWEEN DIFFERENT VERSIONS OF APPLICATION PROGRAMMING INTERFACES
3y 1m to grant Granted Mar 31, 2026
Prosecution Projections

1-2
Expected OA Rounds
82%
Grant Probability
99%
With Interview (+33.3%)
3y 1m (~1m remaining)
Median Time to Grant
Low
PTA Risk
Based on 898 resolved cases by this examiner. Grant probability derived from career allowance rate.