Prosecution Insights
Last updated: April 19, 2026
Application No. 17/684,680

RUNTIME-SPECIFIC PARTITIONING OF MACHINE LEARNING MODELS

Final Rejection — §103, §112

Filed: Mar 02, 2022
Examiner: LIN, HSING CHUN
Art Unit: 2195
Tech Center: 2100 — Computer Architecture & Software
Assignee: Adobe Inc.
OA Round: 2 (Final)

Grant Probability: 59% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 3y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 59% (64 granted / 108 resolved; +4.3% vs TC avg)
Interview Lift: +79.8% (strong), based on resolved cases with interview
Typical Timeline: 3y 4m average prosecution; 37 applications currently pending
Career History: 145 total applications across all art units

Statute-Specific Performance

§101: 17.1% (-22.9% vs TC avg)
§103: 35.8% (-4.2% vs TC avg)
§102: 6.5% (-33.5% vs TC avg)
§112: 34.0% (-6.0% vs TC avg)
Deltas are relative to the Tech Center average estimate. Based on career data from 108 resolved cases.

Office Action

§103 §112
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending in this application.

Response to Arguments

Applicant’s arguments regarding the rejections of claims 1-20 under 35 U.S.C. 112(b) have been fully considered and are persuasive. The rejections have been withdrawn. However, new 35 U.S.C. 112(b) rejections are applied to claims 1-20 based on the amendments. Applicant's arguments regarding the 35 U.S.C. 102/103 rejections of claims 1-20 have been fully considered, but they are either unpersuasive or moot in light of the references being applied in the current rejection.

Regarding the 35 U.S.C. 102/103 rejections, the applicant argues the following in the remarks: All of these claims additionally recite the execution of "each of the plurality of partitions of the machine learning model ... using a runtime environment corresponding to a selected accelerator." Applicant submits that the combination of references referred to by the Office does not render the amended claims obvious.

The examiner has thoroughly considered Applicant’s arguments but respectfully finds them unpersuasive for at least the following reasons. As to point (a), the examiner respectfully disagrees. Ki recites in [0028]: “In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network, and/or the like, may be partitioned between multiple accelerator devices and/or virtual accelerators in accordance with example embodiments of the disclosure. For example, a host may partition a model between virtual accelerators based on the memory requirements and/or compute times of the portions of the model, as well as the memory resources and/or cores of the virtual accelerators.” Ki discloses that a machine learning model is partitioned and the partitions are placed on accelerators according to resource requirements of the partitions.
Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

As per claims 1, 8, and 15 (line numbers refer to claim 1): Line 9 recites “a selected accelerator,” and it is unclear whether the selected accelerator is one of the plurality of operating system accelerators recited in line 5.

As per claim 8: Line 12 recites “the partition,” but it is unclear what this refers to.

As per claim 15: Lines 7-9 recite “each partition of the plurality of partitions is configured for execution using a runtime environment corresponding to runtime requirements for the runtime environment,” but it is unclear what this means (claim 1, which is similar in scope to claim 15, recites executing each of the plurality of partitions of the machine learning model using a runtime environment corresponding to a selected accelerator and the runtime requirements of the respective partition).

Claims 2-7, 9-14, and 16-20 are dependent claims of claims 1, 8, and 15, and fail to resolve the deficiencies of claims 1, 8, and 15, so they are rejected for the same reasons.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 6-9, 13-16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ki et al. (US 20220113915 A1, hereinafter Ki) in view of Ng et al. (US 20200211154 A1, hereinafter Ng). Ki was cited in a prior office action.

As per claim 1, Ki teaches the invention substantially as claimed, including a method comprising: accessing a machine learning model configured for processing a data object ([0003] Some data processing workloads, such as machine learning workloads, may involve the use of models; [0028] In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network; [0034] Workload partitioning may involve splitting a workload (e.g., data and model) across multiple machines (e.g., accelerator devices).); partitioning the machine learning model into a plurality of partitions of the machine learning model ([0028] In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network, and/or the like, may be partitioned between multiple accelerator devices; [0052] a model (e.g., a graph, an ML model, and/or the like) may be partitioned into portions that may each be assigned to a virtual accelerator to implement model parallelism); characterizing each of the plurality of partitions of the machine learning model with respect to runtime requirements; executing each of the
plurality of partitions of the machine learning model using a runtime environment corresponding to a selected accelerator and the runtime requirements of the respective partition, to process the data object ([0028] In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network, and/or the like, may be partitioned between multiple accelerator devices and/or virtual accelerators in accordance with example embodiments of the disclosure. For example, a host may partition a model between virtual accelerators based on the memory requirements and/or compute times of the portions of the model, as well as the memory resources and/or cores of the virtual accelerators. In some embodiments, based on the partitioning, the host may generate a clustered graph with data groups to be executed by the virtual accelerators and scheduled by a memory manager; [0003] Some data processing workloads, such as machine learning workloads, may involve the use of models; [0127] At operation 887 (Part 3.), the device (e.g., the device implementing the virtual NPUs) may extract one or more operational parameters from the graph partitions provided by the host such as memory usage, task dependencies, timing information for one or more graph partitions, and/or the like for use at runtime). Ki fails to teach the plurality of partitions including a shared backbone partition optimized for 8-bit processing using any of a plurality of operating system accelerators; rendering output using the machine learning model. However, Ng teaches the plurality of partitions including a shared backbone partition optimized for 8-bit processing using any of a plurality of operating system accelerators (Fig. 
1; [0073] In some embodiments, to allow bottom-up pose-estimation models to run in real-time with optimized performance on embedded systems/devices such as embedded fall-detection system 100, the proposed pose-estimation module 106 implements a bottom-up pose-estimation framework with a number of improvements to the existing framework. Some of these modifications/improvements include: [0074] Replacing the commonly used complex VGG16 network (described in “Very Deep Convolutional Networks for Large-Scale Image Recognition,” Simonyan et al., arXiv:1409.1556) with a faster VGG16×4 network (described in “Channel Pruning for Accelerating Very Deep Neural Networks,” He et al., ICCV 2017 and “AMC: AutoML for Model Compression and Acceleration on Mobile Devices,” He et al., ECCV 2018) as the backbone/feature extractor, which has an inference speed 4× faster than the VGG16 network. Note that the term “backbone” herein refers to the neural network which receives an input image and extracts image features for use in subsequent deep-learning tasks such as classification, regression, and segmentation. This speed-up is largely due to performing channel pruning, i.e., reducing the width of the feature map, which in turn shrinks the network into a thinner one; [0077] Quantizing the network parameters and run the network inference in 8-bit integer precision instead of the typical 32-bit floating-point precision. This modification not only reduces the memory usage and the frequency of memory access; [0091] In some embodiments, scene-segmentation module 112 can be implemented by various fast CNN-based semantic segmentation models. 
In one embodiment, scene-segmentation module 112 can be implemented based on a DeepLabV3+ model; [0093] Modifying the original DeepLabV3+ model by using a fast MobileNetV2 network (described in “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” Sandler et al., arXiv:1801.04381) as the backbone/feature extractor the modified DeepLabV3+ model to speed up and simplify the original DeepLabV3+ model; [0094] Quantizing the network parameters and running the network inference in 8-bit integer precision instead of the existing 32-bit floating-point precision to reduce the memory usage and the frequency of memory access; [0131] First, the proposed face-recognition model can train a lightweight ResNet-18 network (described in “Deep Residual Learning for Image Recognition,” He et al., CVPR 2016) on the MS1M-refine-v2 dataset. Second, the proposed face-recognition model can be configured to quantize the neural network model and run the inference using 8-bit integer precision; [0160] In some embodiments, neural network accelerators 1012 can include any type of microprocessor designed as hardware acceleration for executing AI-based and deep-learning-based programs and models, and in particular various deep learning neural networks such as various CNN and RNN frameworks mentioned in this disclosure. 
Neural network accelerators 1012 can perform the intended functions of each of the described deep-learning-based modules within the disclosed embedded fall-detection system 100, i.e., pose-estimation module 106, action-recognition module 108, fall-detection module 110, scene-segmentation module 112, face-detection module 116, face-recognition module 118, and the ADL statistics module.); rendering output using the machine learning model ([0054] Embedded fall-detection system 100 can generate fall-detection output 140 including fall alarms/notifications 140-1 and sanitized video clips 140-2 when human falls are detected; [0052] Note that various embodiments of the disclosed embedded fall-detection system are based on implementing various deep-learning-based fast neural networks; [0062] In some embodiments, after receiving the fall-detection output from server 204, mobile app 212 can play the received sanitized video clip on one or more mobile devices 206-210).

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Ki with the teachings of Ng to reduce memory usage (see Ng [0094] Quantizing the network parameters and running the network inference in 8-bit integer precision instead of the existing 32-bit floating-point precision to reduce the memory usage and the frequency of memory access).

As per claim 2, Ki and Ng teach the method of claim 1. Ki teaches wherein the runtime requirements comprise at least one of a low-memory complex operator requirement, a high-memory optimized operator requirement, a specialized post-processing requirement, GPU-efficient runtime requirements, or CPU-efficient runtime requirements ([0086] However, in some embodiments, the system illustrated in FIG.
4 may be used to implement a large DL model (e.g., a large DNN model); [0038] In some embodiments, the second type memory in the second memory tier 210 may be implemented with one or more types of memory that may provide relatively high capacity; [0028] a host may partition a model between virtual accelerators based on the memory requirements and/or compute times of the portions of the model, as well as the memory resources and/or cores of the virtual accelerators.). As per claim 6, Ki and Ng teach the method of claim 1. Ng teaches wherein the data object comprises a document and the processing of the data object comprises detecting portions of the document to apply a plurality of rendering resources to the document ([0159] Bus 1002 is also coupled to camera system 1010. Camera system 1010 is configured to capture a sequence of video images at predetermined resolutions and couple the captured video images to various components within hardware environment 1000 via bus 1002, such as to memory 1006 for buffering and to processors 1004 and neural network accelerators 1012 for various deep-learning and neural network-based operations; [0054] Embedded fall-detection system 100 can generate fall-detection output 140 including fall alarms/notifications 140-1 and sanitized video clips 140-2 when human falls are detected; [0052] Note that various embodiments of the disclosed embedded fall-detection system are based on implementing various deep-learning-based fast neural networks; [0062] In some embodiments, after receiving the fall-detection output from server 204, mobile app 212 can play the received sanitized video clip on one or more mobile devices 206-210; [0021] In some embodiments, if a fall is detected for the person, the process further includes generating a sanitized video clip by: identifying a common background image for the sequence of video images; and superimposing the set of skeleton diagrams of the detected person corresponding to the sequence of video images 
onto the common background image to obtain a sequence of sanitized video images.). As per claim 7, Ki and Ng teach the method of claim 1. Ng teaches wherein the data object comprises presentation media and the processing of the data object comprises at least one of translation, captioning, or detecting portions of the data object to apply a plurality of rendering resources ([0159] Bus 1002 is also coupled to camera system 1010. Camera system 1010 is configured to capture a sequence of video images at predetermined resolutions and couple the captured video images to various components within hardware environment 1000 via bus 1002, such as to memory 1006 for buffering and to processors 1004 and neural network accelerators 1012 for various deep-learning and neural network-based operations; [0054] Embedded fall-detection system 100 can generate fall-detection output 140 including fall alarms/notifications 140-1 and sanitized video clips 140-2 when human falls are detected; [0052] Note that various embodiments of the disclosed embedded fall-detection system are based on implementing various deep-learning-based fast neural networks; [0062] In some embodiments, after receiving the fall-detection output from server 204, mobile app 212 can play the received sanitized video clip on one or more mobile devices 206-210; [0021] In some embodiments, if a fall is detected for the person, the process further includes generating a sanitized video clip by: identifying a common background image for the sequence of video images; and superimposing the set of skeleton diagrams of the detected person corresponding to the sequence of video images onto the common background image to obtain a sequence of sanitized video images.). As per claim 8, it is a system claim of claim 1, so it is rejected for similar reasons. 
Additionally, Ki teaches a system comprising: a memory component; and a processing device coupled to the memory component to perform operations comprising ([0069] The host 424 may also include local storage 468 which may be implemented, for example, with any type of storage device(s) based on any type of memory and/or storage media including solid state media, magnetic media, optical media, and/or the like; [0071] The CPU 458, local storage 468, and/or network interface 432 may communicate, for example, through a system bus 470.): distributing the plurality of partitions to a computing device ([0028] In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network, and/or the like, may be partitioned between multiple accelerator devices and/or virtual accelerators in accordance with example embodiments of the disclosure. For example, a host may partition a model between virtual accelerators based on the memory requirements and/or compute times of the portions of the model, as well as the memory resources and/or cores of the virtual accelerators. In some embodiments, based on the partitioning, the host may generate a clustered graph with data groups to be executed by the virtual accelerators and scheduled by a memory manager; [0030] analyze graph processing and/or deep learning (DL) applications (e.g., deep neural networks (DNNs)) in which computations and/or portions of a DL model may be distributed across multiple machines such as accelerator devices).

As per claim 9, Ki and Ng teach the system of claim 8.
Ki teaches wherein the operations further comprise: executing each of the plurality of partitions of the machine learning model using the runtime environment corresponding to the respective runtime requirements to process the data object ([0028] In some embodiments, a workflow and/or a model such as a graph, a machine learning model, a neural network, and/or the like, may be partitioned between multiple accelerator devices and/or virtual accelerators in accordance with example embodiments of the disclosure. For example, a host may partition a model between virtual accelerators based on the memory requirements and/or compute times of the portions of the model, as well as the memory resources and/or cores of the virtual accelerators. In some embodiments, based on the partitioning, the host may generate a clustered graph with data groups to be executed by the virtual accelerators and scheduled by a memory manager; [0003] Some data processing workloads, such as machine learning workloads, may involve the use of models; [0127] At operation 887 (Part 3.), the device (e.g., the device implementing the virtual NPUs) may extract one or more operational parameters from the graph partitions provided by the host such as memory usage, task dependencies, timing information for one or more graph partitions, and/or the like for use at runtime). 
Additionally, Ng teaches rendering output based on the processing of the data object ([0054] Embedded fall-detection system 100 can generate fall-detection output 140 including fall alarms/notifications 140-1 and sanitized video clips 140-2 when human falls are detected; [0052] Note that various embodiments of the disclosed embedded fall-detection system are based on implementing various deep-learning-based fast neural networks; [0054] Fall-detection engine 101 receives video images 104 as input; [0062] In some embodiments, after receiving the fall-detection output from server 204, mobile app 212 can play the received sanitized video clip on one or more mobile devices 206-210). As per claim 13, it is a system claim of claim 6, so it is rejected for similar reasons. As per claim 14, it is a system claim of claim 7, so it is rejected for similar reasons. As per claim 15, it is a non-transitory computer-readable medium claim of claim 1, so it is rejected for similar reasons. Additionally, Ki teaches a non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations ([0069] The host 424 may also include local storage 468 which may be implemented, for example, with any type of storage device(s) based on any type of memory and/or storage media including solid state media, magnetic media, optical media, and/or the like; [0071] The CPU 458, local storage 468, and/or network interface 432 may communicate, for example, through a system bus 470). As per claim 16, it is a non-transitory computer-readable medium claim of claim 2, so it is rejected for similar reasons. As per claim 19, it is a non-transitory computer-readable medium claim of claim 6, so it is rejected for similar reasons. As per claim 20, it is a non-transitory computer-readable medium claim of claim 7, so it is rejected for similar reasons. Claims 3, 10 and 17 are rejected under 35 U.S.C. 
103 as being unpatentable over Ki and Ng, as applied to claims 1, 8 and 15 above, in view of Fu et al. (US 20190325276 A1, hereinafter Fu). Fu was cited in a prior office action.

As per claim 3, Ki and Ng teach the method of claim 1. Ki and Ng fail to teach wherein at least one of the plurality of partitions is reused among a plurality of machine learning models. However, Fu teaches wherein at least one of the plurality of partitions is reused among a plurality of machine learning models ([0030] respective labeled neural network layers can be reused to train part of a neural network for new visual data; [0015] Aspects of the present disclosure are directed toward training neural networks). It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Ki and Ng with the teachings of Fu to implement machine learning quickly (see Fu [0023] applying deep learning to the IoT devices quickly and easily. For example, aspects of the present disclosure can benefit picture regeneration by back propagation and image compression.).

As per claims 10 and 17, they are system and non-transitory computer-readable medium claims of claim 3, so they are rejected for similar reasons.

Claims 4, 5, 11, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Ki and Ng, as applied to claims 1 and 15 above, in view of Mody et al. (US 20190005375 A1, hereinafter Mody). Mody was cited in a prior office action.

As per claim 4, Ki and Ng teach the method of claim 1. Ki and Ng fail to teach further comprising applying a partition-specific security profile to at least one of the plurality of partitions. However, Mody teaches further comprising applying a partition-specific security profile to at least one of the plurality of partitions ([0044] As described herein, the key management 408 may receive encrypted keys from the external memories such as the external memory 210.
At the secure IP block 202, and during the signal processing, different keys may be supplied for each layer of the multi-layer CNN data.). It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have combined Ki and Ng with the teachings of Mody to provide security (see Mody [0027] Referencing the image frame 102 of FIG. 1 above, the image classification may be performed through secure decryptions and/or encryptions by the secure IP block 202 with the use of corresponding keys as further discussed below.).

As per claim 5, Ki, Ng, and Mody teach the method of claim 4. Mody teaches wherein the partition-specific security profile comprises applying encryption to at least some layers of the machine learning model ([0044] As described herein, the key management 408 may receive encrypted keys from the external memories such as the external memory 210. At the secure IP block 202, and during the signal processing, different keys may be supplied for each layer of the multi-layer CNN data).

As per claims 11 and 18, they are system and non-transitory computer-readable medium claims of claim 4, so they are rejected for similar reasons. As per claim 12, it is a system claim of claim 5, so it is rejected for similar reasons.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to HSING CHUN LIN whose telephone number is (571)272-8522. The examiner can normally be reached Mon - Fri 9AM-5PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /H.L./Examiner, Art Unit 2195 /Aimee Li/Supervisory Patent Examiner, Art Unit 2195
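For readers mapping the claim language to practice, the dispute centers on partitioning a machine learning model, characterizing each partition's runtime requirements, and executing each partition on a matching accelerator runtime. The sketch below is a minimal illustration of that general idea only; every name in it is hypothetical, and it represents neither the applicant's claimed implementation nor Ki's or Ng's disclosure.

```python
from dataclasses import dataclass

# Illustrative sketch (hypothetical names throughout): characterize each model
# partition by its runtime requirements, then select an accelerator whose
# resources satisfy those requirements, in the spirit of the claim language.

@dataclass
class Partition:
    name: str
    memory_mb: int       # runtime requirement: peak memory needed
    prefers_int8: bool   # runtime requirement: 8-bit-optimized operators

@dataclass
class Accelerator:
    name: str
    memory_mb: int
    supports_int8: bool

def select_accelerator(part: Partition, accels: list[Accelerator]) -> Accelerator:
    """Pick the first accelerator whose resources satisfy the partition's needs."""
    for acc in accels:
        if acc.memory_mb >= part.memory_mb and (acc.supports_int8 or not part.prefers_int8):
            return acc
    raise RuntimeError(f"no accelerator fits partition {part.name}")

partitions = [
    Partition("backbone", memory_mb=256, prefers_int8=True),  # shared, 8-bit-optimized
    Partition("head", memory_mb=64, prefers_int8=False),
]
accelerators = [
    Accelerator("npu0", memory_mb=512, supports_int8=True),
    Accelerator("gpu0", memory_mb=1024, supports_int8=False),
]

# Map each partition to an accelerator runtime.
plan = {p.name: select_accelerator(p, accelerators).name for p in partitions}
print(plan)  # {'backbone': 'npu0', 'head': 'npu0'}
```

In this toy scheme, the 8-bit backbone lands on the int8-capable accelerator, echoing the combination the examiner draws from Ki (resource-based partitioning) plus Ng (8-bit backbone).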

Prosecution Timeline

Mar 02, 2022
Application Filed
Sep 30, 2025
Non-Final Rejection — §103, §112
Nov 03, 2025
Interview Requested
Nov 13, 2025
Applicant Interview (Telephonic)
Nov 14, 2025
Examiner Interview Summary
Dec 16, 2025
Response Filed
Jan 10, 2026
Final Rejection — §103, §112
Feb 17, 2026
Interview Requested
Mar 12, 2026
Applicant Interview (Telephonic)
Mar 15, 2026
Examiner Interview Summary
Apr 14, 2026
Request for Continued Examination
Apr 15, 2026
Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12554523
REDUCING DEPLOYMENT TIME FOR CONTAINER CLONES IN COMPUTING ENVIRONMENTS
2y 5m to grant Granted Feb 17, 2026
Patent 12547458
PLATFORM FRAMEWORK ORCHESTRATION AND DISCOVERY
2y 5m to grant Granted Feb 10, 2026
Patent 12468573
ADAPTIVE RESOURCE PROVISIONING FOR A MULTI-TENANT DISTRIBUTED EVENT DATA STORE
2y 5m to grant Granted Nov 11, 2025
Patent 12461785
GRAPHIC-BLOCKCHAIN-ORIENTATED SHARDING STORAGE APPARATUS AND METHOD THEREOF
2y 5m to grant Granted Nov 04, 2025
Patent 12443425
ISOLATED ACCELERATOR MANAGEMENT INTERMEDIARIES FOR VIRTUALIZATION HOSTS
2y 5m to grant Granted Oct 14, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 59%
With Interview: 99% (+79.8%)
Median Time to Grant: 3y 4m
PTA Risk: Moderate
Based on 108 resolved cases by this examiner. Grant probability derived from career allow rate.
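The 99% with-interview figure is consistent with applying the +79.8% relative interview lift to the 59% base grant probability and capping the result at 99%. The sketch below is a plausible reconstruction of that arithmetic, not the tool's actual formula; the function name and the 99% cap are assumptions.

```python
# Hypothetical reconstruction of the with-interview figure shown above:
# scale the base grant probability by (1 + relative lift), then cap.
def with_interview_probability(base_rate: float, lift: float, cap: float = 0.99) -> float:
    """Base career allow rate scaled by the relative interview lift, capped."""
    return min(base_rate * (1.0 + lift), cap)

p = with_interview_probability(0.59, 0.798)
print(f"{p:.0%}")  # 0.59 * 1.798 ≈ 1.061, capped to 99%
```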
