Last updated: May 29, 2026

Application No. 18/257,284

SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING

Non-Final OA §102 §103

Filed

Jun 13, 2023

Priority

Feb 25, 2021 — IN 202141007909 +2 more

Examiner

WAI, ERIC CHARLES

Art Unit

2195

Tech Center

2100 — Computer Architecture & Software

Assignee

Qualcomm Incorporated

OA Round

1 (Non-Final)

Interview Optional

Examiner has a relatively high allowance rate (82%) and a +27.1% interview lift. A written response may suffice.

Based on 645 resolved cases, 2023–2026

Examiner Intelligence

WAI, ERIC CHARLES

Grants 82% — above average

Career Allowance Rate

530 granted / 645 resolved

+27.2% vs TC avg

+27.1%

Interview Lift (strong; resolved cases without vs. with an interview)

Typical timeline

3y 8m

Avg Prosecution

18 currently pending

Career history

671

Total Applications

across all art units

Statute-Specific Performance

§101: 3.1% (-36.9% vs TC avg)

§103: 85.1% (+45.1% vs TC avg)

§102: 4.9% (-35.1% vs TC avg)

§112: 3.1% (-36.9% vs TC avg)

Based on career data from 645 resolved cases

Office Action

§102 §103

DETAILED ACTION
Claims 1-20 are presented for examination.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-2 and 8-9 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Verrilli et al. (US PG Pub No. 2020/0250545 A1).
Verrilli was disclosed in IDS dated 06/13/2023.

Regarding claim 1, Verrilli teaches a processor implemented method for accelerating machine learning on a computing device (Fig 3; [0041], "neural network acceleration architecture 300"; Fig 7; [0060], "method for accelerating machine learning on a computing device"), comprising:
accessing a neural network ([0041], "operate a neural network");
splitting the neural network into N sub-neural networks ([0042], wherein "the neural network of the host application 320 is split across the first AIIA 330 and the second AIIA 340 … the intermediate inference request results 308 are routed between the first AIIA 330 and the second AIIA 340");
hosting the N sub-neural networks in M inference accelerators ([0042], wherein "the neural network of the host application 320 is hosted in the first AIIA 330 and the second AIIA 340.");
scheduling the N sub-neural networks in the M inference accelerators ([0043], wherein "the inference request result 306 is provided to the host processor 310 through a designated inference accelerator of the first AIIA 330 and the second AIIA 340." This designation amounts to a scheduling of the two sub-neural networks in the two accelerators.); and
executing the N sub-neural networks in the M inference accelerators ([0043], wherein "the inference request result 306 may be generated by the first AIIA 330, in which the inference request result 306 is based on the intermediate inference request results 308 received from the second AIIA 340.").
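For illustration only, the four steps mapped above (splitting, hosting, scheduling, executing) can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation of Verrilli or of the present application: `split_network`, `Accelerator`, and `run_pipeline` are hypothetical names, layers are modeled as plain functions, and scheduling is reduced to a fixed pipeline order in which the last accelerator is the designated one that returns the result.

```python
from typing import Callable, List

Layer = Callable[[float], float]  # assumption: a layer is modeled as a plain function


def split_network(layers: List[Layer], n: int) -> List[List[Layer]]:
    """Split an ordered list of layers into n contiguous sub-networks."""
    if not 1 <= n <= len(layers):
        raise ValueError("n must be between 1 and the number of layers")
    base, extra = divmod(len(layers), n)
    parts, start = [], 0
    for i in range(n):
        end = start + base + (1 if i < extra else 0)
        parts.append(layers[start:end])
        start = end
    return parts


class Accelerator:
    """Hypothetical stand-in for an inference accelerator."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hosted: List[Layer] = []

    def host(self, sub_network: List[Layer]) -> None:
        self.hosted = sub_network

    def execute(self, x: float) -> float:
        for layer in self.hosted:
            x = layer(x)
        return x


def run_pipeline(layers: List[Layer], accelerators: List[Accelerator], x: float) -> float:
    """Split, host, schedule (fixed order), and execute across the accelerators."""
    parts = split_network(layers, len(accelerators))
    for acc, part in zip(accelerators, parts):
        acc.host(part)
    # Each intermediate result is routed to the next accelerator in order;
    # the last accelerator is the designated one that returns the final result.
    for acc in accelerators:
        x = acc.execute(x)
    return x
```

With two accelerators and four layers, each accelerator hosts two layers, and the output of the first accelerator plays the role of the intermediate inference request result passed to the second.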

Regarding claim 2, Verrilli teaches setting a data path and a control path of a first sub-neural network of the N sub-neural networks in a first inference accelerator of the M inference accelerators; and setting a data path and a control path of a second sub-neural network of the N sub-neural networks in a second inference accelerator of the M inference accelerators (Fig 4; [0047]).

Regarding claims 8-9, they are the medium claims of claims 1-2 above. Therefore, they are rejected for the same reasons as claims 1-2 above. 


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 5 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1).

Regarding claim 5, Verrilli does not teach assigning a unique port identification to each input port and output port of the first inference accelerator and the second inference accelerator.
However, it is old and well known to assign identifiers or labels to various components when designing/implementing accelerators as a network of accelerators. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to assign a unique port identification to each input port and output port of the first inference accelerator and the second inference accelerator. One would be motivated by the desire to ensure that data is routed correctly.
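As a purely illustrative sketch of the limitation discussed for claim 5, unique identifiers can be handed out to every input and output port of each accelerator from a single counter. The function name `assign_port_ids`, the accelerator labels, and the `"in"`/`"out"` port names are assumptions made for this example and do not come from Verrilli or the application.

```python
from itertools import count
from typing import Dict, Iterable, Tuple


def assign_port_ids(
    accelerators: Iterable[str], ports: Tuple[str, ...] = ("in", "out")
) -> Dict[Tuple[str, str], int]:
    """Give every (accelerator, port) pair its own integer identifier."""
    next_id = count(0)  # one shared counter guarantees uniqueness
    return {(acc, port): next(next_id) for acc in accelerators for port in ports}
```

A routing table keyed on these identifiers can then name a source output port and a destination input port without ambiguity, which is the motivation the rejection gives (ensuring data is routed correctly).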

Regarding claim 12, it is the medium claim of claim 5 above. Therefore, it is rejected for the same reasons as claim 5 above.


Claim(s) 6-7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1) further in view of Zejda et al. (US Pat No. 12,093,806).

Regarding claims 6-7, Verrilli does not teach that N=M or N<M, where N and M are integers greater than one.
Zejda teaches splitting a neural network into different subgraphs and assigning the subgraphs to processing units (Fig 4; col 6 lines 56-62). Zejda further teaches that there may be a surplus of available processing units (col 8 lines 27-39) or that more subgraphs than processing units can be created (col 13 lines 9-29).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention that N=M or N<M where N and M are integers greater than one. Where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation, In re Aller 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955), see MPEP 2144.05, II.
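The two cases recited in claims 6-7 (N=M and N<M) can be illustrated with a small sketch that places each sub-network on its own accelerator and reports any surplus accelerators as idle. The function `assign_subnetworks` and its one-to-one placement policy are assumptions for illustration only; neither Verrilli nor Zejda is asserted to use this policy.

```python
from typing import Dict, List, Tuple


def assign_subnetworks(n: int, m: int) -> Tuple[Dict[int, int], List[int]]:
    """Map sub-network index -> accelerator index for the cases N = M and N < M."""
    if n < 2 or m < 2:
        raise ValueError("N and M must be integers greater than one")
    if n > m:
        raise ValueError("this sketch only covers N = M and N < M")
    assignment = {sub: sub for sub in range(n)}  # one sub-network per accelerator
    idle = list(range(n, m))  # surplus accelerators, empty when N = M
    return assignment, idle
```

When N=M every accelerator hosts exactly one sub-network; when N<M the remaining M-N accelerators are left unassigned.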


Claim(s) 3-4, 10-11, and 13-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1) in view of Yang (US PG Pub No. 2020/0026997 A1). 
Yang was disclosed in IDS dated 06/13/2023.

Regarding claim 3, Verrilli does not teach monitoring the first inference accelerator and the second inference accelerator; and dynamically adjusting the data path and/or the control path of the first sub-neural network and/or the second sub-neural network according to a predetermined performance criteria.
Yang teaches a method for accelerated machine learning wherein, when a first accelerator becomes available and has a higher priority than a second accelerator for performing a task currently being performed on the second AI accelerator (Figs. 10-12; [0101]-[0115]), for example a higher priority for low-power consumption, the task is assigned to the first accelerator and the information regarding the resources and communication schemes ([0030]-[0033]) required for performing the task is updated.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the dynamic scheduling described in Yang, by using the switch device of Verrilli to update the communication scheme related to the task. For example, when AIIA 340 is computing intermediate results to be passed to AIIA 330 for computing final results based on the intermediate results and a third AIIA allowing lower power consumption than the AIIA 330 becomes available, the skilled artisan would configure the switch device to appropriately update the output port of AIIA 340 so that the intermediate results are routed to the third AIIA. One would be motivated by the desire to provide better control over power consumption and temperature of the inference accelerators.
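The rerouting example in the preceding paragraph can be sketched as follows. This is a hedged illustration of the examiner's hypothetical combination, not code from Verrilli or Yang: the `Switch` class, the `AcceleratorStatus` record, the label `AIIA_third`, and the power figures are all assumptions, and lowest power draw stands in for the predetermined performance criteria.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AcceleratorStatus:
    name: str
    power_watts: float  # assumed monitoring metric
    available: bool = True


class Switch:
    """Hypothetical switch holding a source -> destination routing table."""

    def __init__(self) -> None:
        self.routes: Dict[str, str] = {}

    def set_route(self, source: str, destination: str) -> None:
        self.routes[source] = destination


def reroute_if_better(
    switch: Switch, source: str, candidates: List[AcceleratorStatus]
) -> Optional[str]:
    """Point the source's output at the lowest-power available candidate."""
    available = [c for c in candidates if c.available]
    if not available:
        return switch.routes.get(source)  # keep the current route unchanged
    best = min(available, key=lambda c: c.power_watts)
    if switch.routes.get(source) != best.name:
        switch.set_route(source, best.name)
    return best.name
```

Calling `reroute_if_better` each time monitoring data changes keeps the route from AIIA 340 on AIIA 330 until a lower-power accelerator becomes available, at which point the route is updated to the new destination.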

Regarding claim 4, Verrilli and Yang do not teach identifying a peer output port in response to dynamically adjusting the data path and/or the control path; and waking the peer output port to dynamically adjust the data path and/or the control path.
However, it is old and well known to switch off unused peripherals in a computing device for reducing power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to switch on the output port to be used when updating a communication scheme. 

Regarding claims 10-11, they are the medium claims of claims 3-4 above. Therefore, they are rejected for the same reasons as claims 3-4 above. 

Regarding claim 13, Verrilli teaches a method for 
splitting a neural network into a first sub-neural network and a second sub-neural network (Fig. 4; [0060], wherein "a neural network is split between the first AIIA 430 and the second AIIA 440 … intermediate inference request results are routed between the first inference accelerator and the second inference accelerator."); 
hosting the first sub-neural network in a first inference accelerator and the second sub-neural network in a second inference accelerator ([0060], wherein "a neural network is hosted in a first inference accelerator and a second inference accelerator"); 
monitoring the first inference accelerator and the second inference accelerator (Fig. 3; [0045], wherein "the switch device 302 supports peer-to-peer communication between the first AIIA 330 and the second AIIA 340 to enable direct memory access (DMA) transfer without intervention from the host processor 310." - this peer-to-peer communication through the switch device implies a monitoring by the switch device of the two accelerators.); and 

Verrilli does not teach that the inference routing is dynamic and the method includes a step of performing an output redirection decision of the first sub-neural network and/or the second sub-neural network according to a predetermined performance criteria. 
Yang, at Figs 10-12; [0101]-[0115], teaches a method for accelerated machine learning wherein, when a first accelerator becomes available and has a higher priority than a second accelerator for performing a task currently being performed on the second AI accelerator, for example a higher priority for low-power consumption, the task is assigned to the first accelerator and the information regarding the resources and communication schemes ([0030]-[0033]) required for performing the task is updated.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the dynamic scheduling described in Yang, by using the switch device of Verrilli to update the communication scheme related to the task. For example, when AIIA 340 is computing intermediate results to be passed to AIIA 330 for computing final results based on the intermediate results and a third AIIA allowing lower power consumption than the AIIA 330 becomes available, the skilled artisan would configure the switch device to appropriately update the output port of AIIA 340 so that the intermediate results are routed to the third AIIA. One would be motivated by the desire to provide better control over power consumption and temperature of the inference accelerators.

Regarding claim 14, Verrilli and Yang do not teach determining whether a platform software redirection is detected prior to choosing an output port; and waking a peer output port in response to detecting the platform software redirection.
However, it is old and well known to switch off unused peripherals in a computing device for reducing power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to switch on the output port to be used when updating a communication scheme. 

Regarding claim 15, Yang teaches dynamically scheduling the first sub-neural network in the first inference accelerator and the second sub-neural network in the second inference accelerator (Figs 10-12; [0101]-[0115]).

Regarding claim 16, Yang teaches adjusting a data path and a control path of the first sub-neural network in the first inference accelerator at runtime; and adjusting a data path and a control path of the second sub-neural network in the second inference accelerator at runtime (Figs 10-12; [0101]-[0115]).

Regarding claim 17, Yang teaches in which the predetermined performance criteria comprises a power consumption ([0105]; [0112]), a temperature, and/or a load of the first inference accelerator and/or the second inference accelerator.

Regarding claims 18-20, they are the medium claims of claims 13-16 above. Therefore, they are rejected for the same reasons as claims 13-16 above. 


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC C WAI whose telephone number is (571)270-1012. The examiner can normally be reached Monday - Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Eric C Wai/
Primary Examiner, Art Unit 2195

Prosecution Timeline

Jun 13, 2023

Application Filed

Apr 03, 2026

Non-Final Rejection mailed — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/904,824

Patent 12639093

VIRTUALIZATION METHOD, DEVICE, BOARD CARD AND COMPUTER-READABLE STORAGE MEDIUM

Granted May 26, 2026 (3y 9m to grant)

17/922,277

Patent 12632319

SYSTEMS, APPARATUS, AND METHODS TO CONFIGURE HARDWARE BASED ON APPLICATION RATIOS

Granted May 19, 2026 (3y 6m to grant)

18/607,953

Patent 12632301

DATA PROCESSING ACROSS APPLICATIONS IN CLOUD ENVIRONMENTS

Granted May 19, 2026 (2y 2m to grant)

18/153,460

Patent 12608229

CONTROL SYSTEM AND REQUEST PROCESSING METHOD IN CONTROL SYSTEM

Granted Apr 21, 2026 (3y 3m to grant)

17/821,543

Patent 12602261

CONTAINER SCHEDULING ACCORDING TO PREEMPTING A SET OF PREEMPTABLE CONTAINERS DEPLOYED IN A CLUSTER

Granted Apr 14, 2026 (3y 7m to grant)

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

82%

Grant Probability

99%

With Interview (+27.1%)

3y 8m (~9m remaining)

Median Time to Grant

Low

PTA Risk

Based on 645 resolved cases by this examiner. Grant probability derived from career allowance rate.