Last updated: May 29, 2026

Application No. 18/090,749

OPTIMIZING CONCURRENT EXECUTION USING NETWORKED PROCESSING UNITS

Final Rejection §103

Filed

Dec 29, 2022

Priority

Nov 16, 2022 — provisional 63/425,857

Examiner

DOMAN, SHAWN

Art Unit

2183

Tech Center

2100 — Computer Architecture & Software

Assignee

Intel Corporation

OA Round

2 (Final)

Interview Optional

Examiner has a relatively high allowance rate (66%) and a +24.8% interview lift. A written response may suffice.

Based on 278 resolved cases, 2023–2026

Examiner Intelligence

DOMAN, SHAWN

Grants 66% — above average

Career Allowance Rate

183 granted / 278 resolved

+10.8% vs TC avg

Interview Lift: +24.8% (resolved cases with an interview vs. without)

Typical timeline

3y 0m

Avg Prosecution

22 currently pending

Career history

325

Total Applications

across all art units

Statute-Specific Performance

§101: 2.0% (-38.0% vs TC avg)
§103: 75.3% (+35.3% vs TC avg)
§102: 9.2% (-30.8% vs TC avg)
§112: 12.2% (-27.8% vs TC avg)

Based on career data from 278 resolved cases

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-6, 8, and 10-25 have been amended.
Claims 1-25 have been examined.
The claim objections in the previous Office Action have been addressed and are withdrawn.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 9-18, and 20-25 are rejected under 35 U.S.C. 103 as being unpatentable over Non-patent literature “Reining in the Outliers in Map-Reduce Clusters using Mantri,” by Ananthanarayanan et al. (as cited by Applicant and hereinafter referred to as “Mantri”) in view of US Publication No. 2021/0011765 by Doshi et al. (previously cited and hereinafter referred to as “Doshi”). 
Regarding claims 1 and 12, taking claim 1 as representative, Mantri discloses:
a method for task management of a workload in a distributed computing environment, performed by a …processing unit, the method comprising (Mantri discloses, at § 1, a system that performs parallel execution of job tasks in clusters, which discloses a method for task management of a workload in a distributed computing environment performed by a processing unit.): 
identifying multiple tasks of a computing workload, wherein the computing workload is defined by a task graph with nodes of the task graph representing tasks and edges of the task graph representing control or data dependencies, wherein the computing workload includes processing dependencies among the tasks, and wherein two or more of the tasks are executed concurrently (Mantri discloses, at § 1, parallel execution of tasks with tasks in one phase depending on the output of tasks in a previous phase. Mantri also discloses, at § 2, jobs are represented as workflows, or directed acyclic graphs, “where each node is a phase and each edge joins a phase that produces data to another that uses it.”); 
monitoring the task graph to determine an execution time for each of the tasks (Mantri discloses, at § 1, determining that one task takes longer than others and at § 4, calculating outliers relative to the median execution time, which discloses monitoring an execution time for each of the tasks.); 
based on the monitoring, determining a remediation for a particular task (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines. See also § 3.1.); and 
applying the remediation to increase speed of execution of the computing workload (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines. See also § 3.1.).
Mantri does not explicitly disclose the aforementioned processing unit is a network processing unit, wherein the network processing unit comprises (i) a host interface to connect with a host processor, (ii) an operating system, (iii) a container runtime environment, and (iv) a networking interface.
However, in the same field of endeavor (e.g., execution) Doshi discloses:
 a network processing unit, wherein the network processing unit comprises (i) a host interface to connect with a host processor, (ii) an operating system, (iii) a container runtime environment, and (iv) a networking interface (Doshi discloses, at Figure 7A and related description, a processor, which can be a network processing unit (NPU). The NPU includes a host fabric interface and an operating system, uses containers, and includes a network interface. See also Figure 5 for discussion of containers.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Mantri to be implemented using the NPU of Doshi in order to provide the same benefits of outlier mitigation to a broader range of device types.
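
As an illustration of the mapping relied on for claim 1 (monitoring execution times, flagging outliers relative to the median, and restarting them on a less congested machine), the following is a minimal sketch only. It is not Mantri's actual algorithm: the `OUTLIER_FACTOR` multiplier, the function names, and the sample data are all hypothetical and chosen for illustration.

```python
from statistics import median

# Hypothetical multiplier: a task counts as an outlier when it runs longer
# than OUTLIER_FACTOR times the median of its peers. Assumed, not from Mantri.
OUTLIER_FACTOR = 1.5


def find_outliers(exec_times, factor=OUTLIER_FACTOR):
    """Return task ids whose execution time exceeds factor * median."""
    med = median(exec_times.values())
    return sorted(t for t, secs in exec_times.items() if secs > factor * med)


def pick_restart_target(machine_load, exclude):
    """Choose the least-loaded machine other than the one being vacated."""
    candidates = {m: load for m, load in machine_load.items() if m != exclude}
    return min(candidates, key=candidates.get)


# Illustrative observations: execution time in seconds per task,
# current placement, and a load fraction per machine.
exec_times = {"t1": 10.0, "t2": 11.0, "t3": 9.0, "t4": 40.0}
placement = {"t1": "m1", "t2": "m2", "t3": "m3", "t4": "m1"}
machine_load = {"m1": 0.9, "m2": 0.4, "m3": 0.2}

outliers = find_outliers(exec_times)
remediation = {t: pick_restart_target(machine_load, placement[t]) for t in outliers}
print(outliers, remediation)
```

With this sample data, the median is 10.5 seconds, so only `t4` exceeds the assumed threshold, and it is reassigned to the least-loaded other machine.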

Regarding claims 2 and 13, taking claim 2 as representative, Mantri discloses the elements of claim 1, as discussed above. Mantri also discloses:
the particular task provides an input to a dependent task, and wherein the dependent task is a join point of the computing workload that receives a control input or data input from the particular task and at least one previous task of the computing workload (Mantri discloses, at § 3.2, “At barriers in the workflow, where none of the tasks in successive phase(s) can begin until all of the tasks in the preceding phase(s) finish, at barriers none of the tasks.” See also § 2, which discloses a directed acyclic graph of dependent nodes joined to producer nodes.).

Regarding claims 3 and 14, taking claim 3 as representative, Mantri discloses the elements of claim 2, as discussed above. Mantri also discloses:
the remediation is applied in response to determining that the dependent task is a join point of the computing workload (Mantri discloses, at § 3.2, outliers at barriers, i.e., join points, can prevent progress, and are therefore to be culled.).
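
The join-point condition of claims 2 and 3 can be sketched on a small task graph. This is a hedged illustration that treats a join point as any node receiving inputs from two or more predecessor tasks, following the claim language; the helper names and the example edges are hypothetical and do not come from Mantri or the application.

```python
from collections import defaultdict


def join_points(edges):
    """Return nodes that have two or more distinct predecessors."""
    preds = defaultdict(set)
    for producer, consumer in edges:
        preds[consumer].add(producer)
    return sorted(n for n, p in preds.items() if len(p) >= 2)


def should_remediate(task, edges, outliers):
    """Remediate an outlier task only if it feeds at least one join point."""
    joins = set(join_points(edges))
    feeds_join = any(src == task and dst in joins for src, dst in edges)
    return task in outliers and feeds_join


# Each edge is (producer task, consumer task): tasks "a" and "b" both feed "c".
edges = [("a", "c"), ("b", "c"), ("c", "d")]

print(join_points(edges))
print(should_remediate("a", edges, {"a"}))
print(should_remediate("c", edges, {"c"}))
```

Here `c` is the only join point, so a slow `a` qualifies for remediation, while a slow `c`, which feeds only the single-input task `d`, does not under this assumed rule.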

Regarding claims 4 and 15, taking claim 4 as representative, Mantri discloses the elements of claim 2, as discussed above. Mantri also discloses:
calculating an execution time threshold for the particular task, wherein the execution time threshold is weighted by an amount of waiting time elapsed for at least one completed task to reach the join point and wait for the particular task (Mantri discloses, at § 4, calculating the median execution time for a group of tasks and then how much longer, i.e., wait time weighting, an outlier requires.).

Regarding claims 5 and 16, taking claim 5 as representative, Mantri discloses the elements of claim 1, as discussed above. Mantri also discloses:
identifying the multiple tasks of the computing workload comprises splitting the computing workload into the multiple tasks, and wherein the method further comprises distributing the multiple tasks among multiple compute locations of the distributed computing environment (Mantri discloses, at § 1, parallel execution of job tasks in clusters, which discloses identifying the multiple tasks of the workload comprises splitting the workload into the multiple tasks, and wherein the method further comprises distributing the multiple tasks among multiple compute locations of the distributed computing environment.).

Regarding claims 6 and 17, taking claim 6 as representative, Mantri discloses the elements of claim 1, as discussed above. Mantri also discloses:
the remediation includes using fallback compute infrastructure to perform at least a portion of the computing workload for at least a defined period of time (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines. See also § 3.1 and § 4.3, which discloses idle slots.).

Regarding claims 7 and 18, taking claim 7 as representative, Mantri discloses the elements of claim 6, as discussed above. Mantri also discloses:
the use of the fallback compute infrastructure includes use of hardware-assisted resumption, to migrate the particular task from a first compute location to a second compute location in the distributed computing environment (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines, which discloses use of hardware-assisted resumption, to migrate the particular task from a first compute location to a second compute location in the distributed computing environment. See also § 3.1.).

Regarding claims 9 and 20, taking claim 9 as representative, Mantri discloses the elements of claim 6, as discussed above. Mantri also discloses:
the use of the fallback compute infrastructure is based on a classification of the remediation, the classification provided from among a plurality of …categories according to the particular task (Mantri discloses, at § 5, selecting between different types, i.e., categories, of remedial actions.).
Mantri does not explicitly disclose the aforementioned categories are priority categories.
However, in the same field of endeavor (e.g., execution) Doshi discloses:
priority categories (Doshi discloses, at ¶ [0033] et seq., priority categories.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Mantri to include priority categories, as disclosed by Doshi, in order to improve performance by increasing control over execution.

Regarding claims 10 and 21, taking claim 10 as representative, Mantri discloses the elements of claim 1, as discussed above. Mantri also discloses:
the method is performed by a first network processing unit operating as an orchestrator or scheduler of the computing workload, and wherein the remediation for the particular task is implemented with use of a second network processing unit (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines, which discloses a first networked processing unit acting as an orchestrator and using a second networked processing unit, e.g., the nodes in the cluster.).

Regarding claims 11 and 22, taking claim 11 as representative, Mantri discloses the elements of claim 10, as discussed above. Mantri also discloses:
the particular task is executed by a first set of compute resources, and wherein the remediation includes use of a second set of compute resources associated with the second network processing unit (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines, which discloses the particular task is executed by a first set of compute resources, and wherein the remediation includes use of a second set of compute resources associated with the second networked processing unit.).

Regarding claim 23, Mantri discloses the elements of claim 1, as discussed above. Mantri does not explicitly disclose a non-transitory machine-readable storage medium comprising information representative of instructions, wherein the instructions, when executed by processing circuitry, cause the processing circuitry to perform the method.
However, in the same field of endeavor (e.g., execution) Doshi discloses:
instructions storing executable instructions (Doshi discloses, at ¶ [0096], implementing the invention in computer readable media.). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Mantri to include computer readable media, as disclosed by Doshi, in order to improve performance by providing flexible implementation mechanisms.

Regarding claim 24, Mantri, as modified, discloses the elements of claim 23, as discussed above. Mantri also discloses the elements of claim 3, which correspond to those of claim 24, as disclosed above.

Regarding claim 25, Mantri, as modified, discloses the elements of claim 23, as discussed above. Mantri also discloses the elements of claim 11, which correspond to those of claim 25, as disclosed above.

Claims 8 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Mantri in view of US Publication No. 2020/0065098 by Parandeh et al. (hereinafter referred to as “Parandeh”). 
Regarding claims 8 and 19, taking claim 8 as representative, Mantri discloses the elements of claim 6, as discussed above. Mantri also discloses:
the use of the fallback compute infrastructure includes …underutilization of the fallback compute infrastructure (Mantri discloses, at § 1, restarting outlier tasks on different, e.g., less congested, machines. See also § 3.1 and § 4.3, which discloses idle slots, i.e., underutilization.).
Mantri does not explicitly disclose use of a deferred execution arrangement for at least one task in the workload that does not have dependencies, and wherein the use of the deferred execution arrangement is coordinated.
However, in the same field of endeavor (e.g., execution) Parandeh discloses:
deferring execution of divergent iterations that exceed a time threshold (Parandeh discloses, at Figures 4A and 4B and related description, deferring execution when a particular divergent iteration exceeds a time threshold.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify Mantri to include deferred execution for divergent instructions, as disclosed by Parandeh, in order to improve performance by providing an additional mechanism to prevent bottlenecks in parallel execution. See Parandeh, ¶ [0005].

Response to Arguments
On pages 10-11 of the response filed April 7, 2026 (“response”), the Applicant argues, “The Examiner maps Mantri's cluster nodes to the claimed "network processing unit." Applicant respectfully submits that this mapping is inapplicable. As discussed in Mantri, the Mantri software runs live on production clusters that include thousands of servers. These server clusters are commodity server-class multi-core machines running a centralized software scheduler - in this case, Cosmos. The servers in these clusters are general-purpose compute nodes - each machine runs a single operating system instance, executes tasks assigned by the centralized job scheduler, and reports progress back to that job scheduler via software-based progress reports. By contrast, as described in the present specification and illustrated in FIG. 4, a network processing unit such as an IPU (410) is a structurally distinct device from the host compute platform (420) to which it is attached. The IPU connects to the host compute platform via a host interface (e.g., PCIe or CXL interconnect), and operates as a separate compute device with its own processing cores (417), its own memory (418), and other elements, for example, such as its own operating system and cloud-native platform (414), its own container runtime (416), dedicated network circuitry (413), and acceleration functions provided by an ASIC (411) or FPGA (412). The IPU sits between the network and the compute platform, sees all traffic directed to the compute platform, and can intercept, process, or transform that traffic. The IPU further participates in a distributed mesh network in which multiple IPUs coordinate with each other through peer discovery and attestation, aggregate monitoring data from peer IPUs at other nodes, and implement remediation by invoking compute resources at remote locations through IPU-to-IPU communication.
None of these architectural features - a compute device separate from the compute platform with attributes like its own OS and container runtime, dedicated acceleration circuitry, network traffic interception, peer discovery and attestation, or inter-IPU aggregation of monitoring data - are present in Mantri's commodity cluster. Mantri's general-purpose servers, managed by a centralized software scheduler, do not constitute IPUs or network processing units as recited in the claims and described in the specification.  Thus, because Mantri does not disclose all elements of claim 12 in the claimed arrangement, Applicant respectfully submits that a prima facie case of anticipation cannot be established.” 
These remarks have been fully considered and, in light of the claim amendments presented in the response, are deemed persuasive, in part. Please see above for new grounds of rejection of the amended claims. That is, Doshi discloses an NPU having the characteristics claimed. See, e.g., Figures 5 and 7A and related description.
The Examiner notes that a number of the elements argued are not claimed, e.g., structurally distinct, distributed mesh. However, those elements are likely disclosed by Doshi as well. Finally, the Examiner notes that the OS and container runtime environment added to claims 1 and 23 and argued by the Applicant as potential points of patentable distinction are omitted from claim 12. The Examiner surmises that this may have been inadvertent. 

On page 11 of the response the Applicant argues the remaining claims are not taught by the cited art for similar reasons as above. 
Though fully considered, the Examiner respectfully disagrees. The reasons set forth in the remarks and rejections presented above, including those regarding the independent claims, are applicable to these claims.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
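
The reply windows described above can be worked out from the mailing date. The sketch below assumes the May 4, 2026 mailing date shown in the prosecution timeline and uses a hypothetical `add_months` helper. It ignores weekends, federal holidays, and the advisory-action adjustment, so it is an aid for orientation rather than a docketing calculation.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumed from the timeline entry "Final Rejection mailed".
mailing_date = date(2026, 5, 4)

two_month_mark = add_months(mailing_date, 2)        # first-reply window
shortened_period_end = add_months(mailing_date, 3)  # reply due without extension fees
statutory_limit = add_months(mailing_date, 6)       # outer limit with extensions

print(two_month_mark, shortened_period_end, statutory_limit)
```

Under these assumptions, the two-month mark falls on July 4, 2026, the shortened statutory period ends on August 4, 2026, and the six-month limit is November 4, 2026.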
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAWN DOMAN whose telephone number is (571)270-5677.  The examiner can normally be reached on Monday through Friday 8:30am-6pm Eastern Time.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/SHAWN DOMAN/
Primary Examiner, Art Unit 2183

Prosecution Timeline

Dec 29, 2022

Application Filed

Feb 08, 2023

Response after Non-Final Action

Jan 08, 2026

Non-Final Rejection mailed — §103

Apr 07, 2026

Response Filed

May 04, 2026

Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/474,728

Patent 12639076

PROCESSOR ARCHITECTURE FOR OPTIMIZED PARALLELIZED SEARCH

2y 8m to grant Granted May 26, 2026

18/740,430

Patent 12639070

Processing of Synchronization Barrier Instructions

1y 11m to grant Granted May 26, 2026

17/359,039

Patent 12619434

Programmable Fabric-Based Instruction Set Architecture for a Processor

4y 10m to grant Granted May 05, 2026

18/123,604

Patent 12619553

METHOD AND APPARATUS TO SORT A VECTOR FOR A BITONIC SORTING ALGORITHM

3y 1m to grant Granted May 05, 2026

18/050,673

Patent 12613732

Initialisation of Worker Threads and Associated Operand Registers

3y 6m to grant Granted Apr 28, 2026

Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

66%

Grant Probability

91%

With Interview (+24.8%)

3y 0m (~0m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 278 resolved cases by this examiner. Grant probability derived from career allowance rate.
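
The projection figures are consistent with simple arithmetic on the examiner's career counts. The sketch below assumes that the "With Interview" figure is the rounded allowance rate plus the interview lift, capped at 100%. That additive model is an assumption inferred from the numbers shown, not a documented formula.

```python
# Career counts and interview lift as reported on this page.
granted, resolved = 183, 278
interview_lift = 24.8  # percentage points

# Career allowance rate as a percentage.
allowance_rate = 100 * granted / resolved

# Assumed additive model: rounded base rate plus lift, capped at 100%.
with_interview = min(100.0, round(allowance_rate) + interview_lift)

print(f"{allowance_rate:.1f}% -> {round(allowance_rate)}%")
print(f"with interview: {round(with_interview)}%")
```

The unrounded allowance rate is about 65.8%, which displays as 66%, and adding the 24.8-point lift gives roughly 91%.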