DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Application filed on March 16, 2023. Claims 1-20 are pending in the case. Claims 1, 9, and 16 are the independent claims.
This action is non-final.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea (mental steps) without significantly more. This judicial exception is not integrated into a practical application because any additional elements amount to implementing the abstract idea on a generic computer. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding independent claims 1, 9, and 16, and relying on the evaluation flowchart in MPEP 2106:
Step 1 (Is the claim to a process, machine, manufacture, or composition of matter?): Yes. Claims 1, 9, and 16 each recite a collaborative inference apparatus (machine).
Step 2a Prong One (Does the claim recite an abstract idea?): Yes.
Claim 1 recites determine a first inference result and determined based on the first inference result (a mental determination).
Claim 9 recites determined based on the first inference information and determining a target inference result (a mental determination).
Claim 16 recites determined based on all information about a first inference result and determined based on the third inference information (a mental determination).
Under the broadest reasonable interpretation, these steps may be performed in the human mind through observation and determination, or by a human using a physical aid such as pen and paper to perform the observations and mathematical calculations, and therefore correspond to the Mental Processes grouping of abstract ideas.
Step 2a Prong Two (Does the claim recite additional elements that integrate the judicial exception into a practical application?): No.
Claim 1 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
based on a first machine learning (ML) submodel, wherein the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
receive a target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Claim 9 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
receive first inference information from a terminal device, wherein the first inference information comprises all information or partial information of a first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
the first inference result is an inference result of a first machine learning (ML) submodel, and the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send second inference information to a second network device, wherein the second inference information is for determining a target inference result of the ML model, or the second inference information is the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Claim 16 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
obtain third inference information (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the first inference result is an inference result obtained after an operation is performed based on a first machine learning (ML) submodel, and the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send a target inference result to a terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Therefore, in view of the considerations set forth in MPEP 2106.04(d) and 2106.05(a)-(c) and (e)-(h), the additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity combined with implementation of the abstract idea using generic computer components.
Step 2b (Does the claim recite additional elements that amount to significantly more than the judicial exception?): No. Relying on the same analysis as in Step 2a Prong Two (see MPEP 2106.05.I.A, which lists limitations the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception, including: adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, e.g., a limitation indicating that a particular function such as creating and maintaining electronic records is performed by a computer, as discussed in Alice Corp., 573 U.S. at 225-26, 110 USPQ2d at 1984 (see MPEP 2106.05(f)); simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception; and adding insignificant extra-solution activity to the judicial exception, as discussed in MPEP 2106.05(g)), claims 1, 9, and 16 do not recite any additional elements that amount to significantly more than the abstract idea.
Claim 1 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
based on a first machine learning (ML) submodel, wherein the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
receive a target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Claim 9 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
receive first inference information from a terminal device, wherein the first inference information comprises all information or partial information of a first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
the first inference result is an inference result of a first machine learning (ML) submodel, and the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send second inference information to a second network device, wherein the second inference information is for determining a target inference result of the ML model, or the second inference information is the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Claim 16 additionally recites:
a transceiver, at least one processor, and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor cause the apparatus to perform recited steps (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
obtain third inference information (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the first inference result is an inference result obtained after an operation is performed based on a first machine learning (ML) submodel, and the first ML submodel is part of an ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f));
send a target inference result to a terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
The additional elements discussed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity recited in combination with generic computer functions and components used to implement the abstract idea.
Regarding dependent claim 2:
Step 2a Prong One: incorporates the rejection of claim 1. The claim further recites determined based on all the information about the first inference result (a mental determination).
Step 2a Prong Two: the claims additionally recite
wherein when the apparatus accesses a first network device before determining the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send all information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
receive the target inference result from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein when the apparatus accesses a first network device before determining the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send all information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
receive the target inference result from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 3:
Step 2a Prong One: incorporates the rejection of claim 2.
Step 2a Prong Two: the claim further recites
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive information about the first ML submodel from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Step 2b: the claim further recites
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive information about the first ML submodel from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Regarding dependent claim 4:
Step 2a Prong One: incorporates the rejection of claim 3. The claim further recites determine the first ML submodel based on the first target indication information and the correspondence between the first candidate indication information and the first segmentation location (a mental process of determination).
Step 2a Prong Two: the claims additionally recite
wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
receive first model information from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
receive first model information from the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 5:
Step 2a Prong One: incorporates the rejection of claim 4. The claim further recites determining the information about the first ML submodel (mental determination).
Step 2a Prong Two: the claims additionally recite
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send inference requirement information to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the inference requirement information comprises information about a time at which the apparatus obtains the target inference result; and the inference requirement information is for determining the information about the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send inference requirement information to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the inference requirement information comprises information about a time at which the apparatus obtains the target inference result; and the inference requirement information is for determining the information about the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 6:
Step 2a Prong One: incorporates the rejection of claim 1; the claims further recite determined based on the first partial information and the second partial information (a mental process of evaluation/determination).
Step 2a Prong Two: the claims additionally recite
wherein when the apparatus accesses a first network device before sending the first inference result, and accesses a second network device in a process of sending the first inference result by the apparatus (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send first partial information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
send second partial information about the first inference result to the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein when the apparatus accesses a first network device before sending the first inference result, and accesses a second network device in a process of sending the first inference result by the apparatus (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send first partial information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g));
send second partial information about the first inference result to the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 7:
Step 2a Prong One: incorporates the rejection of claim 1; the claim further recites determined based on all the information about the first inference result (a mental process of evaluation/determination).
Step 2a Prong Two: the claims additionally recite
wherein when the apparatus accesses a first network device before sending the first inference result, and the apparatus accesses a second network device after sending the first inference result and before receiving the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send all information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein when the apparatus accesses a first network device before sending the first inference result, and the apparatus accesses a second network device after sending the first inference result and before receiving the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send all information about the first inference result to the first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 8:
Step 2a Prong One: incorporates the rejection of claim 1; the claim additionally recites determined based on all the information about the first inference result (a mental process of evaluation/determination).
Step 2a Prong Two: the claims additionally recite
wherein when the apparatus accesses a second network device before sending the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send all information about the first inference result to the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein when the apparatus accesses a second network device before sending the first inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send all information about the first inference result to the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)); and
receive the target inference result from the second network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the target inference result is an inference result that is of the ML model (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 10:
Step 2a Prong One: incorporates the rejection of claim 9; the claim additionally recites determine information about the first ML submodel (a mental process of evaluation/determination).
Step 2a Prong Two: the claims additionally recite
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send the information about the first ML submodel to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Step 2b: the claims additionally recite
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send the information about the first ML submodel to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Regarding dependent claim 11:
Step 2a Prong One: incorporates the rejection of claim 10; the claims further recite determine the information about the first ML submodel based on the inference requirement information (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
receive inference requirement information from the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f), and a limitation generally linking the judicial exception to a particular field of use or technological environment as discussed in MPEP 2106.05(h)).
Step 2b: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
receive inference requirement information from the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f), and a limitation generally linking the judicial exception to a particular field of use or technological environment as discussed in MPEP 2106.05(h)).
Regarding dependent claim 12:
Step 2a Prong One: incorporates the rejection of claim 10; the claims further recite determine the information about the first ML submodel based on the inference requirement information (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite
wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send first model information to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite
wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)):
send first model information to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 13:
Step 2a Prong One: incorporates the rejection of claim 9; the claims further recite determine the target inference result based on all information about the first inference result and a target ML submodel (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein the first inference information comprises all information about the first inference result; and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
wherein the second inference information is the target inference result, and input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite:
wherein the first inference information comprises all information about the first inference result; and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
wherein the second inference information is the target inference result, and input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 14:
Step 2a Prong One: incorporates the rejection of claim 9; the claims further recite determine a second inference result based on all information about the first inference result and a second ML submodel (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein the first inference information comprises all information about the first inference result, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
wherein the second inference information is the second inference result, and input data of the second ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite:
wherein the first inference information comprises all information about the first inference result, and the programming instructions, when executed by the at least one processor, further cause the apparatus to (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
wherein the second inference information is the second inference result, and input data of the second ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 15:
Step 2a Prong One: incorporates the rejection of claim 14; the claims further recite determine the target inference result (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send information about a target ML submodel to the second network device, wherein input data of the target ML submodel corresponds to output data of the second ML submodel; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
the target ML submodel is used by the second network device (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send information about a target ML submodel to the second network device, wherein input data of the target ML submodel corresponds to output data of the second ML submodel; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
the target ML submodel is used by the second network device (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 17:
Step 2a Prong One: incorporates the rejection of claim 16; the claims further recite determine the target inference result based on all information about the first inference result (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein when the terminal device accesses the apparatus before the apparatus obtains the third inference information (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the third inference information is all information about the first inference result; and the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive all information about the first inference result from the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite:
wherein when the terminal device accesses the apparatus before the apparatus obtains the third inference information (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)),
the third inference information is all information about the first inference result; and the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive all information about the first inference result from the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Regarding dependent claim 18:
Step 2a Prong One: incorporates the rejection of claim 17.
Step 2a Prong Two: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send information about the first ML submodel to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Step 2b: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
send information about the first ML submodel to the terminal device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Regarding dependent claim 19:
Step 2a Prong One: incorporates the rejection of claim 18; the claims further recite determine the information about the first ML submodel based on the inference requirement information (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Step 2b: the claims additionally recite:
wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g)).
Regarding dependent claim 20:
Step 2a Prong One: incorporates the rejection of claim 16; the claims further recite determine the target inference result based on the first partial information and the second partial information (a mental evaluation and determination process).
Step 2a Prong Two: the claims additionally recite:
wherein when the terminal device accesses the apparatus in a process of obtaining the third inference information by the apparatus, the third inference information is all information about the first inference result; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive first partial information about the first inference result from the terminal device; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
receive second partial information about the first inference result from a first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Step 2b: the claims additionally recite:
wherein when the terminal device accesses the apparatus in a process of obtaining the third inference information by the apparatus, the third inference information is all information about the first inference result; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
the programming instructions, when executed by the at least one processor, further cause the apparatus to: (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f))
receive first partial information about the first inference result from the terminal device; and (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
receive second partial information about the first inference result from a first network device (insignificant extra-solution activity of transmitting data over a network as discussed in MPEP 2106.05(g))
a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (mere instructions to apply the exception using generic computer components as discussed in MPEP 2106.05(f)).
Therefore, in view of the considerations set forth in MPEP 2106.04(d) and 2106.05(a)-(c) and (e)-(h), the additional elements recited in the dependent claims discussed above, alone or in combination, do not integrate the judicial exception into a practical application, as they amount to insignificant extra-solution activity, implementation of the abstract idea using generic computer components, and limitations describing a field of use or technological environment. The additional elements discussed above, in combination with the abstract idea, are also not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity recited in combination with generic computer functions and components used to implement the abstract idea, and limitations describing a field of use or technological environment.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 7-10, and 13-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bloom (US 20180336463 A1).
With respect to claim 1, Bloom teaches a collaborative inference apparatus (e.g. paragraph 0086, Fig. 14, machine 1400 able to perform discussed methodologies), comprising:
a transceiver (e.g. paragraph 0090 Fig. 14, machine 1400 including communication components 1440 operable to couple machine 1400 to communications network 1432 and devices 1424, and may include wired communication components, wireless communication components, cellular communication components, NFC components, Bluetooth components, Wi-Fi components, etc.; i.e. a device capable of both transmitting and receiving data, including wirelessly, such as via radio wave/signal);
at least one processor; and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor (e.g. paragraph 0087, Fig. 14, machine 1400 including processors 1404, memory/storage 1406, etc.; storage unit 1416 and memory 1414 store instructions 1410 embodying described methodologies/functions, and processors execute the instructions), cause the apparatus to:
determine a first inference result based on a first machine learning (ML) submodel, wherein the first ML submodel is a part of an ML model (e.g. paragraph 0022, performing remote inference using ML model which is split into components and distributed to multiple computing devices; paragraph 0023, trained ML model split into phase I and phase II components; sending computing device possessing the phase I component; sending computing device encoding data using phase I component; paragraph 0065, Fig. 8, step 804, split ML model into at least first and second ML model components; ML model comprising neural network, where first model component comprises first portion and second ML model component comprises second portion of the neural network; step 806, providing first ML model component to remote computing device; paragraph 0066, step 808 of Fig. 8, processing input data using second ML model component to generate intermediate neural network output data);
send the first inference result (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided); and
receive a target inference result, wherein the target inference result is an inference result that is of the ML model and that is determined based on the first inference result (e.g. paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component).
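For illustration only, the split-inference flow that Bloom describes, and onto which claim 1 is mapped above, can be sketched as follows. This is a minimal sketch under assumed details (a toy feed-forward model, a hypothetical Submodel class, and an arbitrary split point); it is not Bloom's implementation.

```python
# Minimal sketch of split inference: a first submodel runs on the terminal
# device, its intermediate output is sent to the network side, and the
# remaining submodel produces the target inference result of the full model.
# All names, shapes, and the split point are hypothetical illustrations.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class Submodel:
    """A contiguous slice of a larger feed-forward ML model."""
    def __init__(self, layers):
        self.layers = layers  # list of (W, b) layer parameters

    def infer(self, x):
        for W, b in self.layers:
            x = relu(x @ W + b)
        return x

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 8)), np.zeros(8)),
          (rng.standard_normal((8, 8)), np.zeros(8)),
          (rng.standard_normal((8, 2)), np.zeros(2))]
first_submodel = Submodel(layers[:1])   # "phase I" part, on the terminal device
target_submodel = Submodel(layers[1:])  # "phase II" part, on the network side

x = rng.standard_normal(4)
first_inference_result = first_submodel.infer(x)   # intermediate output, sent over the network
target_inference_result = target_submodel.infer(first_inference_result)

# The two-stage result equals the output of the undivided ML model.
assert np.allclose(target_inference_result, Submodel(layers).infer(x))
```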
With respect to claim 9, Bloom teaches a collaborative inference apparatus (e.g. paragraph 0086, Fig. 14, machine 1400 able to perform discussed methodologies), comprising:
a transceiver (e.g. paragraph 0090 Fig. 14, machine 1400 including communication components 1440 operable to couple machine 1400 to communications network 1432 and devices 1424, and may include wired communication components, wireless communication components, cellular communication components, NFC components, Bluetooth components, Wi-Fi components, etc.; i.e. a device capable of both transmitting and receiving data, including wirelessly, such as via radio wave/signal);
at least one processor; and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor (e.g. paragraph 0087, Fig. 14, machine 1400 including processors 1404, memory/storage 1406, etc.; storage unit 1416 and memory 1414 store instructions 1410 embodying described methodologies/functions, and processors execute the instructions), cause the apparatus to:
receive first inference information from a terminal device, wherein the first inference information comprises all information or partial information of a first inference result, the first inference result is an inference result of a first machine learning (ML) submodel, and the first ML submodel is a part of an ML model (e.g. paragraph 0022, performing remote inference using ML model which is split into components and distributed to multiple computing devices; paragraph 0023, trained ML model split into phase I and phase II components; sending computing device possessing the phase I component; sending computing device encoding data using phase I component; paragraph 0070, Fig. 9, step 908 receiving intermediate neural network output data from remote computing device which is generated by processing input data at the remote computing device using first ML model component; paragraph 0076, Fig. 12, describing system in which model has been split into at least three components; intermediate neural network output data generated 1208 and provided to remote computing device); and
send second inference information to a second network device, wherein the second inference information is determined based on the first inference information, and the second inference information is for determining a target inference result of the ML model, or the second inference information is the target inference result (e.g. paragraph 0022, indicating that the trained ML model can be split into a plurality of components and individual components can be distributed to individual computing devices including at least a sending device, a remote computing device, and intervening computing devices; paragraph 0027, indicating that multiple remote devices may be present in the system; paragraph 0038, Fig. 2, indicating that multiple remote devices (each with their own ML model component) may be utilized in the system; paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data; i.e. first intermediate data, generated by a first device using a first ML model component, is received at a second device having a second ML model component, which processes it to generate second intermediate data (second inference information). That second intermediate data may then be sent to yet another device having a third ML model component: either to an intervening device (in the case that the first device also holds the third ML model component, so that the data returns to the first device after passing through the intervening device), or directly to a third device, in the instance that individual devices each hold an individual ML model component (as cited with respect to paragraph 0022). In either case, the device holding the third ML model component processes the second intermediate data to produce the resulting final inference of the overall ML model).
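The multi-hop variant cited above (Bloom, paragraphs 0022, 0076-0077, Fig. 12) can likewise be sketched. Reusing the hypothetical Submodel class from the sketch above, the intermediate-node role mapped onto claim 9 reduces to: receive first inference information, determine second inference information with the node's own submodel, and forward it. The next_hop helper is an assumed stand-in for the transceiver.

```python
# Sketch of the intermediate-node role mapped onto claim 9. The chain
# topology and the next_hop helper are illustrative assumptions.
def intermediate_node(first_inference_information, second_submodel, next_hop):
    # Determine second inference information based on the first inference
    # information (the second intermediate output of Bloom's Fig. 12).
    second_inference_information = second_submodel.infer(first_inference_information)
    # Send it onward to the device holding the next ML model component,
    # which ultimately determines the target inference result.
    return next_hop(second_inference_information)
```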
With respect to claim 16, Bloom teaches a collaborative inference apparatus (e.g. paragraph 0086, Fig. 14, machine 1400 able to perform discussed methodologies), comprising:
a transceiver (e.g. paragraph 0090 Fig. 14, machine 1400 including communication components 1440 operable to couple machine 1400 to communications network 1432 and devices 1424, and may include wired communication components, wireless communication components, cellular communication components, NFC components, Bluetooth components, Wi-Fi components, etc.; i.e. a device capable of both transmitting and receiving data, including wirelessly, such as via radio wave/signal);
at least one processor; and one or more memories coupled to the at least one processor and storing programming instructions, when executed by the at least one processor (e.g. paragraph 0087, Fig. 14, machine 1400 including processors 1404, memory/storage 1406, etc.; storage unit 1416 and memory 1414 store instructions 1410 embodying described methodologies/functions, and processors execute the instructions), cause the apparatus to:
obtain third inference information, wherein the third inference information is determined based on all information about a first inference result, the first inference result is an inference result obtained after an operation is performed based on a first machine learning (ML) submodel, and the first ML submodel is a part of an ML model (e.g. paragraph 0019, split ML model components placed at different computing devices; sending computing device with encoding component encoding original domain specific data using component and sending the encoded data to remote computing device, which ultimately uses the received data to generate prediction data; paragraph 0022, performing remote inference using ML model which is split into components and distributed to multiple computing devices; paragraph 0023, trained ML model split into phase I and phase II components; sending computing device possessing the phase I component; sending computing device encoding data using phase I component; paragraph 0045, Fig. 3, remote machine sending encoded data to inference machine, which uses decoding ML model component on the encoded data to make prediction/inference; paragraph 0067, Fig. 8, providing intermediate neural network output data to remote computing device; result data generated at remote computing device using ML model component to process the intermediate neural network output data; paragraph 0070, Fig. 9, step 908 receiving intermediate neural network output data from remote computing device which is generated by processing input data at the remote computing device using first ML model component; step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component); and
send a target inference result to a terminal device, wherein the target inference result is an inference result that is of the ML model and that is determined based on the third inference information (e.g. paragraph 0019, computing device returning the generated prediction data to the sending computing device; paragraph 0045, Fig. 3, the inference machine provides/sends the prediction/inference back to the remote machine; paragraph 0068, prediction data is received (i.e. at a device, different from the remote computing device) from the remote computing device, where the prediction data is based on result data generated by first ML model component processing intermediate NN output data; result data may comprise inference data produced by the first ML model component).
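Correspondingly, the final network-side role mapped onto claim 16 (Bloom, paragraphs 0045 and 0070, Figs. 3 and 9) can be sketched as follows, again reusing the hypothetical Submodel class from the first sketch; the reply callback is an assumed stand-in for the transceiver.

```python
# Sketch of the claim 16 role: obtain third inference information (the
# intermediate output), determine the target inference result of the full
# ML model, and send it back to the terminal device.
def network_side_inference(third_inference_information, target_submodel, reply):
    target_inference_result = target_submodel.infer(third_inference_information)
    reply(target_inference_result)  # send the prediction to the terminal device
    return target_inference_result
```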
With respect to claim 13, Bloom teaches all of the limitations of claim 9 as previously discussed, and further teaches
wherein the first inference information comprises all information about the first inference result (e.g. paragraph 0022, performing remote inference using ML model which is split into components and distributed to multiple computing devices; paragraph 0023, trained ML model split into phase I and phase II components; sending computing device possessing the phase I component; sending computing device encoding data using phase I component; paragraph 0029, encoded/intermediate data also provided with related metadata such as information regarding the ML model component that generated the data including versioning information, architecture of the model component, etc.; paragraph 0052, metadata including instructions to transform unprocessed raw data, links to code/containers that transform unprocessed data, encoding component of the model, governance/provenance data such as details regarding appropriate domain data, lifetime of validity, origin of the model, relevant URLs/links to send data to, ID of the model, creators of the model, etc.; paragraphs 0055 and 0067, intermediate neural network output data sent with metadata; paragraph 0070, Fig. 9, step 908 receiving intermediate neural network output data from remote computing device which is generated by processing input data at the remote computing device using first ML model component; paragraph 0076, Fig. 12, describing system in which model has been split into at least three components; intermediate neural network output data generated 1208 and provided to remote computing device); and
the programming instructions, when executed by the at least one processor, further cause the apparatus to: determine the target inference result based on all information about the first inference result and a target ML submodel, wherein the second inference information is the target inference result, and input data of the target ML submodel corresponds to output data of the first ML submodel (e.g. paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data).
With respect to claim 14, Bloom teaches all of the limitations of claim 9 as previously discussed, and further teaches
wherein the first inference information comprises all information about the first inference result (e.g. paragraph 0022, performing remote inference using ML model which is split into components and distributed to multiple computing devices; paragraph 0023, trained ML model split into phase I and phase II components; sending computing device possessing the phase I component; sending computing device encoding data using phase I component; paragraph 0029, encoded/intermediate data also provided with related metadata such as information regarding the ML model component that generated the data including versioning information, architecture of the model component, etc.; paragraph 0052, metadata including instructions to transform unprocessed raw data, links to code/containers that transform unprocessed data, encoding component of the model, governance/provenance data such as details regarding appropriate domain data, lifetime of validity, origin of the model, relevant URLs/links to send data to, ID of the model, creators of the model, etc.; paragraphs 0055 and 0067, intermediate neural network output data sent with metadata; paragraph 0070, Fig. 9, step 908 receiving intermediate neural network output data from remote computing device which is generated by processing input data at the remote computing device using first ML model component; paragraph 0076, Fig. 12, describing system in which model has been split into at least three components; intermediate neural network output data generated 1208 and provided to remote computing device), and
the programming instructions, when executed by the at least one processor, further cause the apparatus to: determine a second inference result based on all information about the first inference result and a second ML submodel, wherein the second inference information is the second inference result, and input data of the second ML submodel corresponds to output data of the first ML submodel (e.g. paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data (from first ML model component) using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data).
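For further illustration of the chained arrangement cited from Bloom's Figs. 9 and 12 (each submodel's input corresponding to the previous submodel's output), the following hypothetical Python sketch may be helpful; the stage functions are invented stand-ins rather than the reference's implementation:

    # Hypothetical chain of ML submodels; each stage consumes the prior stage's output.
    def submodel_1(data):
        return [v + 1 for v in data]     # first intermediate output
    def submodel_2(data):
        return [v * 2 for v in data]     # second intermediate output
    def submodel_3(data):
        return max(data)                 # result data comprising the inference

    stages = [submodel_1, submodel_2, submodel_3]   # one stage per device
    data = [0.1, 0.4, 0.2]
    for stage in stages:
        data = stage(data)               # intermediate data passed to the next device
    print(data)                          # target inference result of the overall model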
With respect to claim 2, Bloom teaches all of the limitations of claim 1 as previously discussed, and further teaches wherein when the apparatus accesses a first network device before determining the first inference result (e.g. as shown in Fig. 8, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804 and the first ML model component is provided to the remote computing device at step 806; therefore the device (executing the method of Fig. 8) accesses the remote computing device (in order to provide the first ML model component) before determining the first inference result), the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send all information about the first inference result to the first network device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided); and
receive the target inference result from the first network device, wherein the target inference result is an inference result that is of the ML model and that is determined based on all the information about the first inference result (e.g. paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component).
With respect to claim 17, Bloom teaches all of the limitations of claim 16 as previously discussed, and further teaches wherein when the terminal device accesses the apparatus before the apparatus obtains the third inference information, the third inference information is all information about the first inference result (e.g. as shown in Figs. 8/9, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804/904 and the first ML model component is provided to the remote computing device at step 806/906; therefore the devices (executing the method of Fig. 8/9) access one another (in order to provide the first ML model component) before determining the inference information); and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
receive all information about the first inference result from the terminal device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided); and
determine the target inference result based on all information about the first inference result and a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (e.g. paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component; paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data).
With respect to claim 3, Bloom teaches all of the limitations of claim 2 as previously discussed, and further teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: receive information about the first ML submodel from the first network device (e.g. paragraphs 0023-0024, remote computing device can further encode received encoded data and send result/intervening encoded data to the sending computing device; paragraph 0029, encoded data/intermediate neural network output data provided with related metadata, such as information regarding the remote ML model component that generated the data, including versioning information regarding the architecture of the ML model component; paragraph 0055, indicating that metadata includes additional information regarding the model, including origin of the model, ID of the model, creators of the model, etc.).
With respect to claim 10, Bloom teaches all of the limitations of claim 9 as previously discussed, and further teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: determine information about the first ML submodel; and send the information about the first ML submodel to the terminal device (e.g. paragraphs 0023-0024, remote computing device can further encode received encoded data and send result/intervening encoded data to the sending computing device; paragraph 0029, encoded data/intermediate neural network output data provided with related metadata, such as information regarding the remote ML model component that generated the data, including versioning information regarding the architecture of the ML model component).
With respect to claim 18, Bloom teaches all of the limitations of claim 17 as previously discussed, and further teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: send information about the first ML submodel to the terminal device (e.g. paragraphs 0023-0024, remote computing device can further encode received encoded data and send result/intervening encoded data to the sending computing device; paragraph 0029, encoded data/intermediate neural network output data provided with related metadata, such as information regarding the remote ML model component that generated the data, including versioning information regarding the architecture of the ML model component; paragraph 0055, indicating that metadata includes additional information regarding the model, including origin of the model, ID of the model, creators of the model, etc.).
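For illustration of the kind of payload relied upon in the mappings above (intermediate output data accompanied by metadata about the generating model component), the hypothetical Python structure below may be helpful; every field name and value is the examiner's invention, loosely patterned on the metadata categories cited from paragraphs 0029, 0052, and 0055:

    # Hypothetical payload pairing intermediate output with model metadata
    # (field names invented; categories loosely follow the cited paragraphs).
    intermediate_payload = {
        "intermediate_output": [0.12, 0.88, 0.05],   # data between NN layers
        "metadata": {
            "model_id": "example-model-001",         # ID of the model
            "component_version": "1.3.0",            # versioning information
            "architecture": "encoder-layers-1-4",    # architecture of the component
            "origin": "example.org/models",          # origin of the model
            "send_results_to": "https://inference.example.org/predict",
        },
    }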
With respect to claim 7, Bloom teaches all of the limitations of claim 1 as previously discussed, and further teaches wherein when the apparatus accesses a first network device before sending the first inference result (e.g. as shown in Fig. 8, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804 and the first ML model component is provided to the remote computing device at step 806; therefore the device (executing the method of Fig. 8) accesses the remote computing device (in order to provide the first ML model component) before determining the first inference result), and the apparatus accesses a second network device after sending the first inference result and before receiving the target inference result (e.g. paragraph 0019, computing device returning the generated prediction data to the sending computing device; paragraph 0022, indicating that model components are distributed to computing devices including sending devices, remote devices, and intervening devices; i.e. where an intervening device exists between the sending computing device and the computing device returning the generated prediction/inference, the sending device will access this intervening device after sending the intermediate data/first inference result and before receiving the result, such as in order to receive the final result from the other computing device, since the sending device would not attempt to access the intervening device for the purpose of receiving the final result until after the first/intermediate result has been sent, and must first access the intervening device in order to be able to receive the result (generated by the other computing device) via the intervening device; Examiner notes that Fig. 1, for example, shows a remote device 130 with a respective model component and an application server 140 with a respective model component, and additionally shows at least one intervening device, such as API server 120 or Web Server 122, on the communication pathway between the remote device and the application server, such that inference results generated at the application server would be received directly from one of these intervening devices), the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send all information about the first inference result to the first network device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided); and
receive the target inference result from the second network device, wherein the target inference result is an inference result that is of the ML model and that is determined based on all the information about the first inference result (e.g. paragraph 0019, computing device returning the generated prediction data to the sending computing device; paragraph 0022, indicating that model components are distributed to computing devices including sending devices, remote devices, and intervening devices; paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0045, Fig. 3, the inference machine provides/sends the prediction/inference back to the remote machine; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component; i.e. where an intervening device is present between the sending device and the other computing device, the inference result would be sent by the other computing device, via the intervening device, and ultimately received by the sending device (where this receiving by the sending device would be both receiving directly from the intervening device and receiving ultimately from the other computing device (via the intervening device)); Examiner notes that Fig. 1, for example, shows a remote device 130 with a respective model component and an application server 140 with a respective model component, and additionally shows at least one intervening device, such as API server 120 or Web Server 122, on the communication pathway between the remote device and the application server, such that inference results generated at the application server would be received directly from one of these intervening devices).
With respect to claim 8, Bloom teaches all of the limitations of claim 1 as previously discussed, and further teaches wherein when the apparatus accesses a second network device before sending the first inference result (e.g. paragraph 0022, indicating that model components are distributed to computing devices including sending devices, remote devices, and intervening devices; Fig. 1, for example, shows a remote device 130 with a respective model component and an application server 140 with a respective model component, and additionally shows at least one intervening device, such as API server 120 or Web Server 122, on the communication pathway between the remote device and the application server, such that data sent between these two devices would also include accessing the intervening device on the communication pathway; as shown in Fig. 8, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804 and the first ML model component is provided to the remote computing device at step 806; therefore the device (executing the method of Fig. 8) accesses the remote computing device (in order to provide the first ML model component) before determining the first inference result and, in a case where an intervening/second device is present in the communication pathway, also accesses this intervening/second computing device (such as during the process of providing the ML model component) before determining/sending the first inference result), the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send all information about the first inference result to the second network device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided; as noted above with respect to Fig. 1 and paragraph 0022, where an intervening/second device is present in the communication pathway between sending and remote devices, the sending of the information about the first inference result would include sending the information to the intervening/second device as well as to its ultimate destination); and
receive the target inference result from the second network device, wherein the target inference result is an inference result that is of the ML model and that is determined based on all the information about the first inference result (e.g. paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component; as noted above with respect to Fig. 1 and paragraph 0022, where an intervening/second device is present in the communication pathway between sending and remote devices, the receiving of the target inference result would include receiving this information from/via the intervening/second device).
With respect to claim 15, Bloom teaches all of the limitations of claim 14 as previously discussed, and further teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: send information about a target ML submodel to the second network device, wherein input data of the target ML submodel corresponds to output data of the second ML submodel; and the target ML submodel is used by the second network device to determine the target inference result (e.g. paragraph 0022, indicating that the trained ML model can be split into a plurality of components and individual components can be distributed to individual computing devices including at least a sending device, a remote computing device, and intervening computing devices; paragraph 0027, indicating that multiple remote devices may be present in the system; paragraph 0038, Fig. 2, indicating that multiple remote devices (each with their own ML model component) may be utilized in the system; paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data; i.e. where, as described in paragraph 0022, each individual model component is provided to a corresponding individual device and, as described in paragraph 0077, there are at least three model components, this third/target component/submodel is sent to the corresponding device (including information about the third/target component/submodel), where the second intermediate data, generated by a second device using a second ML model component and received as input at a third/different device having a third ML model component, may be processed by the device using the third ML model component to generate resulting inference information such as a final inference of the overall ML model).
With respect to claim 6, Bloom teaches all of the limitations of claim 1 as previously discussed, and further teaches wherein when the apparatus accesses a first network device before sending the first inference result (e.g. as shown in Fig. 8, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804 and the first ML model component is provided to the remote computing device at step 806; therefore the device (executing the method of Fig. 8) accesses the remote computing device (in order to provide the first ML model component) before determining the first inference result), and accesses a second network device in a process of sending the first inference result by the apparatus (e.g. paragraph 0019, computing device returning the generated prediction data to the sending computing device; paragraph 0022, indicating that model components are distributed to computing devices including sending devices, remote devices, and intervening devices; i.e. where an intervening device exists between the sending computing device and the computing device returning the generated prediction/inference, the sending device will access this intervening device after sending the intermediate data/first inference result and before receiving the result, such as in order to receive the final result from the other computing device, since the sending device would not attempt to access the intervening device for the purpose of receiving the final result until after the first/intermediate result has been sent, and must first access the intervening device in order to be able to receive the result (generated by the other computing device) via the intervening device; Examiner notes that Fig. 1, for example, shows a remote device 130 with a respective model component and an application server 140 with a respective model component, and additionally shows at least one intervening device, such as API server 120 or Web Server 122, on the communication pathway between the remote device and the application server, such that inference results generated at the application server would be received directly from one of these intervening devices), and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send first partial information about the first inference result to the first network device (e.g. paragraph 0022, indicating that the trained ML model can be split into a plurality of components and individual components can be distributed to individual computing devices including at least a sending device, a remote computing device, and intervening computing devices; paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided; i.e. where the system includes sending device/apparatus, a receiving/first device, and an intervening/second device, and inference information made up of at least two different components (actual result, and metadata, where these may be considered to be first and second partial information about an inference result) is sent to the receiving/first device via/through the intervening/second device, both the first partial information and the second partial information may be sent to the receiving/first device (i.e. after being sent from the sending device to the intervening device, and then from the intervening device to the receiving device));
send second partial information about the first inference result to the second network device (e.g. paragraph 0022, indicating that the trained ML model can be split into a plurality of components and individual components can be distributed to individual computing devices including at least a sending device, a remote computing device, and intervening computing devices; paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided; i.e. where the system includes sending device/apparatus, a receiving/first device, and an intervening/second device, and inference information made up of at least two different components (actual result, and metadata, where these may be considered to be first and second partial information about an inference result) is sent to the receiving/first device via/through the intervening/second device, both the first partial information and the second partial information may be sent to the intervening/second device (i.e. after being sent from the sending device to the intervening device, where this information is subsequently sent from the intervening device to the receiving device)); and
receive the target inference result from the second network device, wherein the target inference result is an inference result that is of the ML model and that is determined based on the first partial information and the second partial information (e.g. paragraph 0022, indicating that the trained ML model can be split into a plurality of components and individual components can be distributed to individual computing devices including at least a sending device, a remote computing device, and intervening computing devices; paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component; i.e. where the system includes sending device/apparatus, a receiving/first device, and an intervening/second device, and inference information made up of at least two different components is received at the receiving/first device, and the receiving/first device generates the target inference result based on both of these information components, this receiving/first device will then send the result back to the sending device/apparatus via/through the intervening device, such that the result is also received from the intervening device).
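The relay pathway relied upon above (sending apparatus to intervening/second device to receiving/first device, with the target result returned along the same path) may be sketched, purely hypothetically, as follows; the device functions are invented stand-ins:

    # Hypothetical relay: sending apparatus -> intervening device -> first device,
    # with the target inference result returned along the same path.
    def first_device(info_parts):
        # Combine the partial information and apply the remaining submodel.
        combined = info_parts["result"] + info_parts["extra"]
        return sum(combined)                          # target inference result

    def intervening_device(info_parts):
        # Forwards the partial information onward and relays the result back.
        return first_device(info_parts)

    partial = {"result": [0.2, 0.3], "extra": [0.5]}  # first/second partial information
    print(intervening_device(partial))                # result received via the relay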
With respect to claim 20, Bloom teaches all of the limitations of claim 16 as previously discussed, and further teaches wherein when the terminal device accesses the apparatus in a process of obtaining the third inference information by the apparatus, the third inference information is all information about the first inference result (e.g. as shown in Figs. 8/9, prior to generating the intermediate NN output data at step 808, the ML model is split into components at step 804/904 and the first ML model component is provided to the remote computing device at step 806/906; therefore the devices (executing the method of Fig. 8/9) access one another (in order to provide the first ML model component) before determining the inference information); and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
receive first partial information about the first inference result from the terminal device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided; i.e. where the received information about the inference result includes at least two parts, including the intermediate neural network output data and the metadata relating to it, this is analogous to receiving at least first partial information about the inference result (such as the output data itself));
receive second partial information about the first inference result from a first network device (e.g. paragraph 0023, sending device sending the encoded data to the remote computing device; paragraph 0067, Fig. 8 step 810, intermediate neural network output data (representing data transferred between layers of neural network) is provided to the remote computing device; metadata relating to the intermediate neural network output data, including versioning information regarding architecture, etc., also provided; i.e. where the received information about the inference result includes at least two parts, including the intermediate neural network output data and the metadata relating to it, this is analogous to receiving at least second partial information about the inference result (such as the corresponding metadata relating to the output data)); and
determine the target inference result based on the first partial information, the second partial information, and a target ML submodel, wherein input data of the target ML submodel corresponds to output data of the first ML submodel (e.g. paragraph 0023, remote computing device further encoding received encoded data using phase II component to produce result data which can represent an inference made by the ML model; paragraph 0068, Fig. 8 step 812, receiving prediction data from the remote computing device, where the prediction data is based on result data generated at the remote computing device by the first ML model component processing the intermediate neural network output data; the result data comprises inference data produced by the first ML model component; paragraph 0070, Fig. 9, step 910 processing intermediate neural network output data using second ML model component to generate result data; result data used to provide prediction based on input data received by first ML model component; paragraph 0077, Fig. 12, second intermediate neural network output data generated at remote computing device using second ML model component received from remote computing device; second intermediate neural network output data is processed using third ML component to generate result data comprising inference data; i.e. generating the target/prediction/inference by the corresponding submodel, the received intermediate output data, and the received metadata (such as when the metadata includes information required to transform raw/unprocessed data, details for what domain data is appropriate for the process, links to send encoded data to, etc., as discussed in paragraph 0052)).
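To illustrate the combination relied upon for claim 20 (first partial information, second partial information, and a target ML submodel yielding the target inference result), consider the hypothetical sketch below; the scaling arithmetic merely stands in for real submodel computation:

    # Hypothetical combination of partial information at the serving apparatus.
    def target_submodel(values, scale):
        return sum(v * scale for v in values)   # stand-in for the final layers

    first_partial = [0.4, 0.6]                  # intermediate output data (from terminal)
    second_partial = {"scale": 2.0}             # related metadata (from network device)
    target_result = target_submodel(first_partial, second_partial["scale"])
    print(target_result)                        # prints 2.0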
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 103(c) and potential pre-AIA 35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA 35 U.S.C. 103(a).
Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bloom in view of Pogorelik et al. (US 20210319098 A1).
With respect to claim 4, Bloom teaches all of the limitations of claim 3 as previously discussed. Bloom does not explicitly disclose wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
receive first model information from the first network device, wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information; and
determine the first ML submodel based on the first target indication information and the correspondence between the first candidate indication information and the first segmentation location.
However, Pogorelik teaches wherein the information about the first ML submodel comprises first target indication information, and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
receive first model information from the first network device, wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; and one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (e.g. paragraph 0362, interrogating distributed computing platforms to determine security capabilities; paragraphs 0365-0366, client devices and servers communicating/exchanging data over network; client device memory storing model security requirements and server capabilities; paragraph 0369, model security requirements include indications of security requirements for respective layers, portions, or parts of the inference model; inference model split into slices based on the model security requirements and server capabilities; i.e. receiving model security requirements and network device capabilities, where this information includes information indicating different locations (i.e. layers, parts, or portions) of the model along with indicating corresponding security requirements associated with the model, where this information collectively provides indications of locations at which the model may be split/segmented such that the resulting model portions may be distributed to different devices for distributed execution); and
determine the first ML submodel based on the first target indication information and the correspondence between the first candidate indication information and the first segmentation location (e.g. paragraph 0366, splitting inference model into model slices based on model security requirements and server capabilities so that model can be executed in distributed manner by servers while providing for security protection of portions of the model based on server capabilities and security requirements of the model; paragraph 0374, splitting model into model slices based on model security requirements and received capabilities).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Bloom and Pogorelik in front of him to have modified the teachings of Bloom (directed to ML model splitting and distributed inferencing), to incorporate the teachings of Pogorelik (directed to securing systems employing AI, including via model splitting and distributed inferencing) to include the capability to receive, from a network device, information indicating different parts, portions, layers, locations, etc. of the model which may be split, including model security requirements corresponding to each of these candidate parts, portions, layers, or locations, along with corresponding security capability information of distributed network computing devices which will potentially receive and execute the corresponding parts, portions, layers, or locations of the model after the splitting is performed, and to determine various different split model portions (i.e. submodels) based on this information (as taught by Pogorelik). One of ordinary skill would have been motivated to perform such a modification in order to mitigate risk associated with attack vectors/vulnerabilities of AI systems as described in Pogorelik (paragraphs 0087-0088).
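The segmentation-location selection attributed to Pogorelik above (candidate indication information corresponding to split locations, chosen against device security capabilities) can be illustrated with the hypothetical sketch below; the candidate names, layer indices, and security levels are invented for illustration only:

    # Hypothetical correspondence between candidate indication information and
    # segmentation locations, filtered by a server's security capability.
    candidates = {
        "split_A": {"layer": 2, "required_security": "low"},
        "split_B": {"layer": 4, "required_security": "high"},
    }
    server_capability = "low"

    feasible = [name for name, c in candidates.items()
                if c["required_security"] == server_capability]
    target_indication = feasible[0]                        # first target indication information
    split_layer = candidates[target_indication]["layer"]   # defines the first ML submodel
    print(target_indication, split_layer)                  # prints: split_A 2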
With respect to claim 12, Bloom teaches all of the limitations of claim 10 as previously discussed. Bloom does not explicitly disclose wherein the information about the first ML submodel comprises first target indication information; and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send first model information to the terminal device, wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information; and
the first model information and the first target indication information are used by the terminal device to determine the first ML submodel.
However, Pogorelik teaches wherein the information about the first ML submodel comprises first target indication information; and the programming instructions, when executed by the at least one processor, further cause the apparatus to:
send first model information to the terminal device, wherein the first model information comprises a correspondence between first candidate indication information and a first segmentation location; at least one piece of first candidate indication information and at least one first segmentation location are provided; one piece of first candidate indication information indicates to segment the ML model, and a location at which the ML model is segmented is a first segmentation location that has a correspondence with the one piece of first candidate indication information (e.g. paragraph 0362, interrogating distributed computing platforms to determine security capabilities; paragraphs 0365-0366, client devices and servers communicating/exchanging data over network; client device memory storing model security requirements and server capabilities; paragraph 0369, model security requirements include indications of security requirements for respective layers, portions, or parts of the inference model; inference model split into slices based on the model security requirements and server capabilities; i.e. exchanging model security requirements and network device capabilities, where this information includes information indicating different locations (i.e. layers, parts, or portions) of the model along with indicating corresponding security requirements associated with the model, where this information collectively provides indications of locations at which the model may be split/segmented such that the resulting model portions may be distributed to different devices for distributed execution); and
the first model information and the first target indication information are used by the terminal device to determine the first ML submodel (e.g. paragraph 0366, splitting inference model into model slices based on model security requirements and server capabilities so that model can be executed in distributed manner by servers while providing for security protection of portions of the model based on server capabilities and security requirements of the model; paragraph 0374, splitting model into model slices based on model security requirements and received capabilities).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Bloom and Pogorelik in front of him to have modified the teachings of Bloom (directed to ML model splitting and distributed inferencing), to incorporate the teachings of Pogorelik (directed to securing systems employing AI, including via model splitting and distributed inferencing) to include the capability to receive, from a network device, information indicating different parts, portions, layers, locations, etc. of the model which may be split, including model security requirements corresponding to each of these candidate parts, portions, layers, or locations, along with corresponding security capability information of distributed network computing devices which will potentially receive and execute the corresponding parts, portions, layers, or locations of the model after the splitting is performed, and to determine various different split model portions (i.e. submodels) based on this information (as taught by Pogorelik). One of ordinary skill would have been motivated to perform such a modification in order to mitigate risk associated with attack vectors/vulnerabilities of AI systems as described in Pogorelik (paragraphs 0087-0088).
Claims 11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Bloom in view of Kim et al. (US 20210287085 A1).
With respect to claim 11, Bloom teaches all of the limitations of claim 10 as previously discussed. Bloom does not explicitly disclose wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result; and determine the information about the first ML submodel based on the inference requirement information.
However, Kim teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result; and determine the information about the first ML submodel based on the inference requirement information (e.g. paragraph 0045, parallelization schemes/policies used for parallelization strategy, including intra-layer parallelism, inter-layer parallelism, partition dimensions indicating direction in which model, layer, etc. is divided, a division number indicating number of models or number of layers to be divided, etc.; paragraph 0046, generating parallelization strategy for target model; paragraph 0053, executing target model based on parallelization strategy of each target layer of the target model; outputting execution time of the target model or each target layer; the execution time used to evaluate performance of the parallelization strategy; paragraph 0054, reference layer information associated with reference layers, including metadata and reference parallelization strategy corresponding to layers, along with performance (execution time) of the parallelization strategy; paragraph 0058, comparing metadata of each target layer of target model and reference metadata of each reference layer in reference DM, measuring similarity; paragraph 0060, selecting layer corresponding to target layer based on similarity and generating parallelization strategy for the target layer based on matching; i.e. the system obtains information regarding time (execution time) in which the various results (such as first results, second results, target results, etc., corresponding to different model layers) are obtained by the device executing the corresponding model portions/layers/submodels (analogous to inference requirement information as defined in the claim), and determines corresponding information, such as metadata information, similarity measures, parallelization strategy, etc., for the model portions/layers/submodels based on this information).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Bloom and Kim in front of him to have modified the teachings of Bloom (directed to ML model splitting and distributed inferencing), to incorporate the teachings of Kim (directed to parallel processing methods for neural network models) to include the capability to receive, from a corresponding device, information about a time (i.e. inference requirement information), such as an execution time, at which the device obtains corresponding inference results using a corresponding model layer/portion/submodel, and determine various information about one or more model layers/portions/submodels based on this information, including metadata information, similarity measures (between a target model portion/layer and a reference portion/layer), corresponding parallelization strategies, etc. (as taught by Kim). One of ordinary skill would have been motivated to perform such a modification in order to allow for more quickly converging to results in neural network model training and inference as described in Kim (paragraph 0003).
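The execution-time-driven determination attributed to Kim above (a time requirement used to select among candidate submodel/parallelization strategies) may likewise be illustrated hypothetically; the strategy names and timings are invented:

    # Hypothetical selection of a strategy satisfying an inference-time requirement.
    measured_ms = {"strategy_1": 120.0, "strategy_2": 45.0}   # measured execution times
    required_ms = 60.0                                        # inference requirement info
    viable = {s: t for s, t in measured_ms.items() if t <= required_ms}
    chosen = min(viable, key=viable.get) if viable else None
    print(chosen)                                             # prints: strategy_2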
With respect to claim 19, Bloom teaches all of the limitations of claim 18 as previously discussed. Bloom does not explicitly disclose wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result; and determine the information about the first ML submodel based on the inference requirement information.
However, Kim teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: receive inference requirement information from the terminal device, wherein the inference requirement information comprises information about a time at which the terminal device obtains the target inference result; and determine the information about the first ML submodel based on the inference requirement information (e.g. paragraph 0045, parallelization schemes/policies used for parallelization strategy, including intra-layer parallelism, inter-layer parallelism, partition dimensions indicating direction in which model, layer, etc. is divided, a division number indicating number of models or number of layers to be divided, etc.; paragraph 0046, generating parallelization strategy for target model; paragraph 0053, executing target model based on parallelization strategy of each target layer of the target model; outputting execution time of the target model or each target layer; the execution time used to evaluate performance of the parallelization strategy; paragraph 0054, reference layer information associated with reference layers, including metadata and reference parallelization strategy corresponding to layers, along with performance (execution time) of the parallelization strategy; paragraph 0058, comparing metadata of each target layer of target model and reference metadata of each reference layer in reference DM, measuring similarity; paragraph 0060, selecting layer corresponding to target layer based on similarity and generating parallelization strategy for the target layer based on matching; i.e. the system obtains information regarding time (execution time) in which the various results (such as first results, second results, target results, etc., corresponding to different model layers) are obtained by the device executing the corresponding model portions/layers/submodels (analogous to inference requirement information as defined in the claim), and determines corresponding information, such as metadata information, similarity measures, parallelization strategy, etc., for the model portions/layers/submodels based on this information).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Bloom and Kim in front of him to have modified the teachings of Bloom (directed to ML model splitting and distributed inferencing), to incorporate the teachings of Kim (directed to parallel processing methods for neural network models) to include the capability to receive, from a corresponding device, information about a time (i.e. inference requirement information), such as an execution time, at which the device obtains corresponding inference results using a corresponding model layer/portion/submodel, and determine various information about one or more model layers/portions/submodels based on this information, including metadata information, similarity measures (between a target model portion/layer and a reference portion/layer), corresponding parallelization strategies, etc. (as taught by Kim). One of ordinary skill would have been motivated to perform such a modification in order to allow for more quickly converging to results in neural network model training and inference as described in Kim (paragraph 0003).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Bloom in view of Pogorelik, further in view of Kim.
With respect to claim 5, Bloom in view of Pogorelik teaches all of the limitations of claim 4, as previously discussed. Bloom and Pogorelik do not explicitly disclose wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: send inference requirement information to the first network device, wherein the inference requirement information comprises information about a time at which the apparatus obtains the target inference result; and the inference requirement information is for determining the information about the first ML submodel.
However, Kim teaches wherein the programming instructions, when executed by the at least one processor, further cause the apparatus to: send inference requirement information to the first network device, wherein the inference requirement information comprises information about a time at which the apparatus obtains the target inference result; and the inference requirement information is for determining the information about the first ML submodel (e.g. paragraph 0045, parallelization schemes/policies used for parallelization strategy, including intra-layer parallelism, inter-layer parallelism, partition dimensions indicating direction in which model, layer, etc. is divided, a division number indicating number of models or number of layers to be divided, etc.; paragraph 0046, generating parallelization strategy for target model; paragraph 0053, executing target model based on parallelization strategy of each target layer of the target model; outputting execution time of the target model or each target layer; the execution time used to evaluate performance of the parallelization strategy; paragraph 0054, reference layer information associated with reference layers, including metadata and reference parallelization strategy corresponding to layers, along with performance (execution time) of the parallelization strategy; paragraph 0058, comparing metadata of each target layer of target model and reference metadata of each reference layer in reference DM, measuring similarity; paragraph 0060, selecting layer corresponding to target layer based on similarity and generating parallelization strategy for the target layer based on matching; i.e. the system obtains information regarding time (execution time) in which the various results (such as first results, second results, target results, etc., corresponding to different model layers) are obtained by the device executing the corresponding model portions/layers/submodels (analogous to inference requirement information as defined in the claim), and determines corresponding information, such as metadata information, similarity measures, parallelization strategy, etc., for the model portions/layers/submodels based on this information).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Bloom and Kim before them, to have modified the teachings of Bloom (directed to ML model splitting and distributed inferencing) to incorporate the teachings of Kim (directed to parallel processing methods for neural network models), so as to include the capability to receive, from a corresponding device, information about a time (i.e., inference requirement information), such as an execution time, at which the device obtains corresponding inference results using a corresponding model layer/portion/submodel, and to determine various information about one or more model layers/portions/submodels based on this information, including metadata, similarity measures (between a target model portion/layer and a reference portion/layer), and corresponding parallelization strategies (as taught by Kim). One of ordinary skill would have been motivated to perform such a modification in order to allow results to converge more quickly in neural network model training and inference, as described by Kim (paragraph 0003).
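For illustration only, a minimal sketch (plain Python with hypothetical names; it does not appear in any reference of record and is not limiting in any way) of the capability attributed to the proposed combination: an apparatus reports the time by which it must obtain the target inference result, and the receiving device uses that time when determining information about the first ML submodel, here a split point chosen so that the apparatus-side layers fit within the reported time budget.

    from dataclasses import dataclass

    @dataclass
    class InferenceRequirement:
        # Hypothetical "inference requirement information": the time at which
        # the apparatus must obtain the target inference result.
        deadline_ms: float

    def determine_submodel_info(req, layer_costs_ms):
        # Hypothetical device-side logic: assign the apparatus the longest
        # prefix of layers whose summed execution time fits the deadline.
        elapsed, split = 0.0, 0
        for cost in layer_costs_ms:
            if elapsed + cost > req.deadline_ms:
                break
            elapsed += cost
            split += 1
        return split  # "information about the first ML submodel": layers [0, split)

    # The apparatus sends the requirement; the device determines the split point.
    print(determine_submodel_info(InferenceRequirement(deadline_ms=12.0),
                                  [4.0, 5.0, 6.0, 2.0]))  # prints 2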
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEREMY L STANLEY whose telephone number is (469)295-9105. The examiner can normally be reached on Monday-Friday from 9:00 AM to 5:00 PM CST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at telephone number (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center and Private PAIR for authorized users only. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form.
/JEREMY L STANLEY/
Primary Examiner, Art Unit 2127