Prosecution Insights
Last updated: April 19, 2026
Application No. 18/060,112

SYSTEM AND METHOD FOR MANAGING INFERENCE MODEL PERFORMANCE THROUGH INFERENCE GENERATION PATH RESTRUCTURING

Non-Final OA: §103, §Other
Filed: Nov 30, 2022
Examiner: KEATON, SHERROD L
Art Unit: 2148
Tech Center: 2100 — Computer Architecture & Software
Assignee: DELL PRODUCTS, L.P.
OA Round: 1 (Non-Final)
Grant Probability: 52% (Moderate)
OA Rounds: 1-2
To Grant: 4y 6m
With Interview: 88%

Examiner Intelligence

Career Allow Rate: 52% (grants 52% of resolved cases; 295 granted / 563 resolved; -2.6% vs TC avg)
Interview Lift: +36.1% for resolved cases with an interview (strong lift)
Typical Timeline: 4y 6m average prosecution; 32 applications currently pending
Career History: 595 total applications across all art units

Statute-Specific Performance

§101: 14.9% (-25.1% vs TC avg)
§103: 62.0% (+22.0% vs TC avg)
§102: 11.1% (-28.9% vs TC avg)
§112: 8.0% (-32.0% vs TC avg)
Baseline is the Tech Center average estimate; based on career data from 563 resolved cases.
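
For readers checking the arithmetic, each statute figure is a simple rate compared against the Tech Center average. The sketch below only illustrates that calculation; the §103 counts and the TC baseline in it are invented placeholders chosen to reproduce the published 62.0% and +22.0% figures, and the tool's actual denominators are not stated here.

```python
# Illustrative only: one way a line like "§103: 62.0% (+22.0% vs TC avg)" can
# be reproduced from raw counts. The counts and the TC baseline below are
# hypothetical placeholders, not data from the analytics tool.

def rate(count: int, total: int) -> float:
    """Share of resolved cases attributed to one statute outcome."""
    return count / total if total else 0.0

examiner_rate = rate(count=349, total=563)   # ~0.620, shown as 62.0%
tc_average = 0.400                           # assumed Tech Center estimate
delta = examiner_rate - tc_average           # ~+0.220, shown as +22.0%

print(f"§103: {examiner_rate:.1%} ({delta:+.1%} vs TC avg)")
```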

Office Action

§103 §Other
DETAILED ACTION

This action is in response to the original filing of 11-30-2022. Claims 1-20 are pending and have been considered below.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. ("Liu," US 20200193218 A1) in view of Liang et al. ("Liang"), "Model-driven Cluster resource management for AI workload in edge clouds," pages 1-25, 7-1-2022, and Filho et al. ("Filho"), "A systematic literature review on distributed machine learning in edge computing," pages 1-36, 3-30-2022.

Claim 1: Liu discloses a method of managing execution of an inference model hosted by data processing systems, the method comprising: identifying that a first data processing system of the data processing systems has a level of risk of failing to execute a portion of the inference model that is above a threshold (Paragraph 28; latency threshold analysis); based on the identification: performing an inference generation path analysis for the first data processing system to identify whether the first data processing system is an inference model bottleneck (Paragraph 44; performance monitoring for bottleneck); automatically initiating re-deployment of the inference model in response to a failure of the first data processing system based on the execution plan to obtain a re-deployed inference model; and generating, using the re-deployed inference model, an inference (Paragraph 45; model retrained and re-exposed/deployed).

Liu discloses some features regarding an instance of the inference generation path analysis where the first data processing system is the inference model bottleneck: obtaining a deployment plan that distributes multiple redundant instances of the inference model so that only one of the instances of the inference model is hosted by the first data processing system (Paragraphs 29-30; the scheduler can determine which model is included in a batch; Paragraph 44; scheduling efficiencies determined to address model utilization).

Additionally, Liang is relied upon because it provides workload management functionality (abstract) that further determines latency (bottleneck) issues and provides deployments so that the load is only provided to a node (Sections 4.2 and 4.4.6). Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to apply a known technique to a known device ready for improvement and incorporate the migration functionality with the scheduler of Liu.
One would have been motivated to provide the functionality because this method can ensure the most optimal response time to improve system operability.

Liu discloses some features regarding obtaining an execution plan for responding to a failure of the first data processing system, the execution plan ensuring that one or more other inference model bottlenecks are not formed when the only one of the instances of the inference model is re-deployed across the data processing systems based on the execution plan (Paragraphs 28-30; latency threshold determined, and distribution provided to different batches if needed); and deploying the inference model across the data processing systems based on the deployment plan (Paragraphs 28-30; different batches used according to batch determination).

Also, Filho is relied upon because it provides workload management functionality (abstract) that further provides obtaining an execution plan for offloading due to high latency (Page 2, Bullet 1; Section 4.2.5; and Page 22, Table 8: model partitioning right sizing). Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to apply a known technique to a known device ready for improvement and evaluate execution plans with the scheduler of Liu. One would have been motivated to provide the functionality because this method can ensure better deployment by reducing computation latency to improve system operability.

Claim 2: Liu, Liang and Filho disclose the method of claim 1, wherein performing the inference generation path analysis comprises: identifying one or more portions of the inference model hosted by the first data processing system; making an identification of an inference generation path associated with each of the one or more portions of the inference model hosted by the first data processing system; and in an instance of the identification where there is more than one inference generation path associated with the first data processing system: identifying the first data processing system as the inference model bottleneck (Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 3: Liu, Liang and Filho disclose the method of claim 2, wherein a failure of the inference model bottleneck prevents timely execution of one or more redundant instances of the inference model (Liu: Paragraph 44; bottleneck of model; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 4: Liu, Liang and Filho disclose the method of claim 3, wherein the failure of the inference model bottleneck prevents timely execution of all of the redundant instances of the inference model deployed across the data processing systems (Liu: Paragraph 44; all bottlenecks can be evaluated; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 5: Liu, Liang and Filho disclose the method of claim 2, wherein the inference generation path comprises: a listing of instances of each of the portions of the inference model usable to generate an inference model result; and an ordering of the listing of the instances (Liu: Paragraph 44; ranker would provide a list; Filho: Section 4.2.2, model partitioning causes models to be segmented into successive parts (Fig. 3, listing of instances)).
Claim 6: Liu, Liang and Filho disclose the method of claim 5, wherein obtaining the deployment plan comprises: identifying the one or more portions of the inference model hosted by the first data processing system; identifying a second data processing system, the second data processing system currently not hosting any portions of the inference model; obtaining an updated inference generation path for one of the portions of the inference model hosted by the first data processing system based on the second data processing system; and obtaining inference generation instructions for the data processing systems that are members of the updated inference generation path (Liu: Paragraphs 28-30; determine which batches are not processing; Paragraph 44; all bottlenecks can be evaluated; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 7: Liu, Liang and Filho disclose the method of claim 6, wherein the inference generation instructions indicate a processing result transmission destination for each of the data processing systems that are members of the updated inference generation path (Liu: Paragraphs 29-30; the scheduler can determine which model is included in a batch; Paragraphs 44-45; scheduling efficiencies determined to address model utilization; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 8: Liu, Liang and Filho disclose the method of claim 7, wherein the deployment plan ensures that each data processing system of the data processing systems is part of only one inference generation path for the inference model (Liu: Paragraphs 29-30; the scheduler can determine which model is included in a batch; Paragraphs 44-45; scheduling efficiencies determined to address model utilization).

Claim 9: Liu, Liang and Filho disclose the method of claim 8, wherein deploying the inference model across the data processing systems based on the deployment plan comprises: configuring the data processing systems that are members of the updated inference generation path to forward processing results based on the inference generation instructions (Liu: Paragraphs 28-30 (scheduler determines batches for deployment); Paragraph 44; all bottlenecks can be evaluated; Liang: Section 4.1.1 (group placement for models determined), Sections 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 10: Liu, Liang and Filho disclose the method of claim 9, wherein the execution plan indicates a failover inference generation path for an instance of the inference model hosted by the second data processing system (Liu: Paragraph 28 (latency which leads to failure of batches can be determined); Paragraph 44; performance monitoring would evaluate failure; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).
Claim 11: Liu, Liang and Filho disclose the method of claim 10, wherein the failover inference generation path comprises: an updated listing of the instances of each of the portions of the inference model usable to generate the inference model result, the updated listing indicating replacement of the first data processing system with a third data processing system responsive to failure of the first data processing system, and the third data processing system not hosting any portion of the inference model prior to the failure of the first data processing system (Liu: Paragraph 44; all bottlenecks can be evaluated; Liang: Sections 4.1-4.1.1 (groupings are evaluated), 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 12: Liu, Liang and Filho disclose the method of claim 11, wherein automatically initiating re-deployment of the inference model comprises: identifying the failure of the first data processing system; identifying the failover inference generation path based on the execution plan; and re-deploying the inference model based on the failover inference generation path (Liu: Paragraphs 44-45; all bottlenecks can be evaluated and re-exposed; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claim 13: Liu, Liang and Filho disclose the method of claim 12, wherein re-deploying the inference model comprises: deploying the portion of the inference model hosted by the first data processing system to the third data processing system; and transmitting updated inference generation instructions to the data processing systems, the updated inference generation instructions being based, at least in part, on the failover inference generation path (Liu: Paragraphs 44-45; performance monitoring would evaluate failure of any batch; Liang: Sections 4.1, 4.2 and 4.4.6; system evaluates workload hotspot to determine latency effects).

Claims 14 and 18 are similar in scope to claim 1 and are therefore rejected under the same rationale. Additionally, Liu discloses the non-transitory machine-readable medium of claim 14 (Liu: Paragraph 59; computer readable storage media) and the processor and memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform the operations of claim 18 (Liu: Paragraphs 58-59; processor and memory).

Claims 15 and 19 are similar in scope to claim 2 and are therefore rejected under the same rationale. Claims 16 and 20 are similar in scope to claim 3 and are therefore rejected under the same rationale. Claim 17 is similar in scope to claim 4 and is therefore rejected under the same rationale.

Conclusion

The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure: US 20200249936 A1, Paragraph 0027. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action. It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
In the interests of compact prosecution, Applicant is invited to contact the examiner via electronic media pursuant to USPTO policy outlined in MPEP § 502.03. All electronic communication must be authorized in writing. Applicant may wish to file an Internet Communications Authorization Form PTO/SB/439. Applicant may wish to request an interview using the Interview Practice website: http://www.uspto.gov/patent/laws-and-regulations/interview-practice. Applicant is reminded that Internet e-mail may not be used for communication on matters under 35 U.S.C. § 132 or which otherwise require a signature. A reply to an Office action may NOT be communicated by Applicant to the USPTO via Internet e-mail. If such a reply is submitted by Applicant via Internet e-mail, a paper copy will be placed in the appropriate patent application file with an indication that the reply is NOT ENTERED. See MPEP § 502.03(II).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHERROD KEATON, whose telephone number is 571-270-1697. The examiner can normally be reached 9:30am to 5:00pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, MICHELLE BECHTOLD, can be reached at 571-431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHERROD L KEATON/
Primary Examiner, Art Unit 2148
1-30-2026
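
For orientation, independent claim 1 and dependent claims 2-13 recite a concrete control loop: flag a data processing system as a bottleneck when it sits on more than one inference generation path, redistribute redundant instances so that system hosts only one, and keep a precomputed failover path for when it actually fails. The sketch below illustrates that claimed logic in simplified form; it is not code from the application, the examiner, or any cited reference, and every name and data structure in it is invented for illustration.

```python
# Hypothetical sketch of the path-analysis logic recited in claims 1-2 and the
# rebalancing described in claims 6-13. All names and structures are invented.

from collections import defaultdict

# An inference generation path: an ordered list of (system, model_portion)
# hops that together produce one inference result (cf. claim 5).
Path = list[tuple[str, str]]

def find_bottlenecks(paths: list[Path]) -> set[str]:
    """Flag a system as an inference model bottleneck when it appears in more
    than one inference generation path (cf. claim 2), since its failure would
    stall several redundant instances at once (cf. claims 3-4)."""
    path_count: dict[str, int] = defaultdict(int)
    for path in paths:
        for system in {s for s, _ in path}:   # count each system once per path
            path_count[system] += 1
    return {system for system, n in path_count.items() if n > 1}

def rebalance(paths: list[Path], bottleneck: str, spares: list[str]) -> list[Path]:
    """Sketch of a deployment plan: keep the bottleneck on a single path and
    move its other redundant instances onto spare systems (cf. claims 1, 6)."""
    spare_iter = iter(spares)
    rebalanced: list[Path] = []
    kept_one = False
    for path in paths:
        if any(system == bottleneck for system, _ in path):
            if kept_one:
                replacement = next(spare_iter)   # e.g. a second data processing system
                path = [(replacement if s == bottleneck else s, p) for s, p in path]
            kept_one = True
        rebalanced.append(path)
    return rebalanced

# Tiny worked example: "dps-1" sits on both redundant paths, so it is the
# bottleneck; after rebalancing it hosts only one instance.
paths = [
    [("dps-1", "layer-0"), ("dps-2", "layer-1")],
    [("dps-1", "layer-0"), ("dps-3", "layer-1")],
]
print(find_bottlenecks(paths))                      # {'dps-1'}
print(rebalance(paths, "dps-1", spares=["dps-4"]))  # second path now uses dps-4
```

On this reading, the execution plan of claim 1 would be the precomputed output of a rebalance-style step held in reserve, applied as the failover inference generation path when the first system actually fails (claims 10-13).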

Prosecution Timeline

Nov 30, 2022
Application Filed
Feb 06, 2026
Non-Final Rejection — §103, §Other (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12566823
SYSTEMS AND METHODS FOR INTERPOLATIVE CENTROID CONTRASTIVE LEARNING
2y 5m to grant Granted Mar 03, 2026
Patent 12547820
Automated Generation Of Commentator-Specific Scripts
2y 5m to grant Granted Feb 10, 2026
Patent 12530587
SYSTEMS AND METHODS FOR CONTRASTIVE LEARNING WITH SELF-LABELING REFINEMENT
2y 5m to grant Granted Jan 20, 2026
Patent 12524147
Modality Learning on Mobile Devices
2y 5m to grant Granted Jan 13, 2026
Patent 12524603
METHODS FOR RECOGNIZING AND INTERPRETING GRAPHIC ELEMENTS
2y 5m to grant Granted Jan 13, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 52%
With Interview: 88% (+36.1%)
Median Time to Grant: 4y 6m
PTA Risk: Low
Based on 563 resolved cases by this examiner. Grant probability derived from career allow rate.
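
As a sanity check on how these projections line up with the examiner statistics above, the snippet below re-derives the headline numbers from the career counts (295 granted of 563 resolved) and the published +36.1% interview lift. It assumes the baseline grant probability is simply the career allow rate and that the interview lift is added in percentage points; this is a back-of-the-envelope reconstruction, not the tool's actual model.

```python
# Back-of-the-envelope reconstruction of the projection figures above.
# Assumptions: baseline grant probability equals the career allow rate, and
# the "with interview" figure adds the interview lift in percentage points.

granted, resolved = 295, 563        # career counts reported for this examiner
interview_lift_pp = 36.1            # published interview lift (percentage points)

baseline = granted / resolved                                    # ~0.524 -> shown as 52%
with_interview = min(baseline * 100 + interview_lift_pp, 100.0)  # ~88.5 -> shown as 88%

print(f"Baseline grant probability: {baseline:.0%}")
print(f"With interview: {with_interview:.0f}%")
```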
