Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
CLAIM INTERPRETATION
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Such claim limitations are: “means for tracing, means for building, means for communicating, means for requesting, means for updating, means for dispatching” in claim 25.
The specification discloses corresponding structure, including that “The command dispatcher circuitry 222 sends model compute commands to assigned device executors once the XPU selection client 221 receives the XPU device assignment.” (paragraph [0028]), “The XPU selection service circuitry 234 receives the XPU device assignment request from the graph scheduler circuitry 220” (paragraph [0029]), “the means for tracing a graph may be implemented by graph tracer circuitry 210” (paragraph [0031]), “the example graph tracer circuitry 210 then builds the compute graph” (paragraph [0050]), and “the processor circuitry is further to detect a change in the compute graph, request a second processing unit device assignment, update the compute graph based on a second processing unit device assignment” (paragraph [0085]).
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless -
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-4, 8, 9, 11-15, 19, 20, 22, 24, and 25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Poornachandran (US 2022/0326991, hereinafter Poornachandran).
Regarding claim 1, Poornachandran discloses
An apparatus to process a cloud client application pipeline across devices, the apparatus comprising:
at least one memory; machine readable instructions (paragraph [0018]: the storage circuitry (for storing information, such as machine-readable instructions) 16); and
processor circuitry to at least one of instantiate or execute the machine readable instructions to (paragraph [0018]: the functionality of the processing circuitry 14 or means for processing 14 may be implemented by the processing circuitry 14 or means for processing 14 executing machine-readable instructions):
trace an execution of an input model (Fig. 1c; paragraph [0021]: The method comprises obtaining 130 the computer program; paragraph [0027]: the computer program may be divisible into different tasks; paragraph [0044]: Machine-learning models are trained using training input data; paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry);
build a compute graph (Fig. 1c; paragraph [0027]: the method may comprises determining 140 a task graph of the computer program … the one or more compute kernels may be part of the task graph. In general, this task graph may be generated based on a static analysis of the computer program) based on the trace of the input model (Fig. 1c; paragraph [0021]: The method comprises obtaining 130 the computer program; paragraph [0027]: the computer program may be divisible into different tasks; paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry);
communicate an operational parameter of the input model (paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry) from a graph scheduler to a processing unit selection service (paragraph [0054]: applications may specify the power/thermal SLA/requirement to the PTCCK to influence the task graph scheduling, compute kernel generation and scheduling on the target XPU HW);
request a first processing unit device assignment from a system wide processing unit selection policy provider to assign a processing unit device based on at least one provisioned policy (paragraph [0054]: The Tensorflow framework may also initialize 411 with oneDNN and specify or negotiate an SLA (Service Level Agreement) from the application to specify 422 the application SLA (e.g., to not use a certain ISA, such as AVX-512). In other words, applications may specify the power/thermal SLA/requirement to the PTCCK to influence the task graph scheduling, compute kernel generation and scheduling on the target XPU HW … The processing circuitry may be configured to negotiate the service-level agreement based on the bi-directional specification of the service-level agreement and based on the capabilities of the two or more different XPUs);
update the compute graph based on the first processing unit device assignment (paragraph [0027]: the method may comprise generating 140 or re-generating 162 the task graph based on a dynamic analysis of the computer program based on the real-world current data flow and/or the real-world past data flow. This dynamic analysis may be performed by executing the computer program (or portions thereof) in a sandboxed environment or using the two or more XPUs with appropriate telemetry); and
dispatch the first processing unit device assignment to the devices by sending a dispatch command (paragraph [0035]: the task graph may be generated or re-generated such, that it includes (e.g., reflects) the assignment of the execution of the one or more compute kernels; paragraph [0038]: the processing circuitry may be configured to transfer the compute kernel(s) and necessary data to and from the respective XPU(s) during execution of the computer program (e.g., via the interface circuitry 12); paragraph [0059]: The processing circuitry is configured to assign the execution of the one or more compute kernels to the two or more different XPUs based on the respective energy-related metric).
Regarding claim 13 referring to claim 1, Poornachandran discloses A non-transitory machine readable storage medium comprising instructions that, when executed, cause processor circuitry to at least: … (paragraph [0018]: the functionality of the processing circuitry 14 or means for processing 14 may be implemented by the processing circuitry 14 or means for processing 14 executing machine-readable instructions; See the rejection for claim 1).
Regarding claim 25 referring to claim 1, Poornachandran discloses An apparatus for processing a cloud client application pipeline across devices, the apparatus comprising: means for tracing, means for building, means for communicating, means for requesting, means for updating, means for dispatching … (paragraph [0027]; processing circuitry; See the rejection of claim 1).
Regarding claims 2 and 14, Poornachandran discloses
wherein the processor circuitry is further to:
detect a change in the compute graph (paragraph [0055]: The evaluator 520 may perform real-time evaluation of the kernel power/thermal QoS for future improvement … task graph re-generation may be performed along with associated compute kernels statically (i.e., by graph parsing) or dynamically (e.g., based on a real-world data flow) with or without a past histogram; paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry. It may provide graph and kernel partitioning based on (CXL) memory, input/output and discrete memory control hub capabilities. It may provide hardware and application awareness, e.g., to support virtual machine migration and to dynamically adapt a kernel to available newer hardware);
request a second processing unit device assignment (paragraph [0054]: The Tensorflow framework may also initialize 411 with oneDNN and specify or negotiate an SLA (Service Level Agreement) from the application to specify 422 the application SLA (e.g., to not use a certain ISA, such as AVX-512). In other words, applications may specify the power/thermal SLA/requirement to the PTCCK to influence the task graph scheduling, compute kernel generation and scheduling on the target XPU HW … The processing circuitry may be configured to negotiate the service-level agreement based on the bi-directional specification of the service-level agreement and based on the capabilities of the two or more different XPUs);
update the compute graph based on a second processing unit device assignment (paragraph [0027]: the method may comprise generating 140 or re-generating 162 the task graph based on a dynamic analysis of the computer program based on the real-world current data flow and/or the real-world past data flow. This dynamic analysis may be performed by executing the computer program (or portions thereof) in a sandboxed environment or using the two or more XPUs with appropriate telemetry); and
dispatch the second processing unit device assignment by sending a second dispatch command (paragraph [0035]: the task graph may be generated or re-generated such, that it includes (e.g., reflects) the assignment of the execution of the one or more compute kernels; paragraph [0038]: the processing circuitry may be configured to transfer the compute kernel(s) and necessary data to and from the respective XPU(s) during execution of the computer program (e.g., via the interface circuitry 12); paragraph [0059]: The processing circuitry is configured to assign the execution of the one or more compute kernels to the two or more different XPUs based on the respective energy-related metric).
Regarding claim 3, Poornachandran discloses
wherein a security model of a multi-process web browser is preserved (paragraph [0182]: The computer-executable instructions can be part of, for example, an operating system of the computing system, an application stored locally to the computing system, or a remote application accessible to the computing system (e.g., via a web browser)).
Regarding claims 4 and 15, Poornachandran discloses
wherein the input model is a machine learning model (Fig. 1c; paragraph [0021]: The method comprises obtaining 130 the computer program; paragraph [0027]: the computer program may be divisible into different tasks; paragraph [0044]: Machine-learning models are trained using training input data; paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry).
Regarding claims 8 and 19, Poornachandran discloses
wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to communicate to the devices via discovery and telemetry (paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry).
Regarding claims 9 and 20, Poornachandran discloses
wherein the devices are implemented in at least one of a Central Processing Unit, a Graphics Processing Unit, and a Vision Processing Unit (paragraph [0025]: The two or more XPUs may comprise two or more of the group of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU)).
Regarding claims 11 and 22, Poornachandran discloses
wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to accelerate the first processing unit device assignment using a processing unit prediction machine learning model (paragraph [0055]: The evaluator module 520 may police the run time kernel with active telemetry from the XPU to provide feedback and insights to the kernel generator in the controller module for the future. This can be implemented e.g., using the RL (Reinforcement Learning)-based AutoML framework for scale out and real-time adaptation. In other words, the processing circuitry may be configured to process data related to on a monitoring of the execution of the computer program (i.e., active telemetry by the XPU) by the two or more XPUs using a machine-learning model being trained to output a monitored energy-related metric based on the data related to the monitoring, and to assign the execution of the one or more compute kernels and/or to generate or re-generate the one or more compute kernels based on the output of the machine-learning model. For example, as outlined above, the RL-based AutoML framework may be used to create and/or train the machine-learning model. In effect, the evaluator 520 & controller 510 modules may police the run time kernel with active telemetry from the XPUs to provide feedback and insights into the kernel generator module for future improvements with or without Machine Learning (ML) support).
Regarding claims 12 and 24, Poornachandran discloses
further including the processor circuitry to: train the processing unit prediction machine learning model based on the first processing unit device assignment; and predict, using the processing unit prediction machine learning model (paragraph [0055]: The evaluator module 520 may police the run time kernel with active telemetry from the XPU to provide feedback and insights to the kernel generator in the controller module for the future. This can be implemented e.g., using the RL (Reinforcement Learning)-based AutoML framework for scale out and real-time adaptation. In other words, the processing circuitry may be configured to process data related to on a monitoring of the execution of the computer program (i.e., active telemetry by the XPU) by the two or more XPUs using a machine-learning model being trained to output a monitored energy-related metric based on the data related to the monitoring, and to assign the execution of the one or more compute kernels and/or to generate or re-generate the one or more compute kernels based on the output of the machine-learning model. For example, as outlined above, the RL-based AutoML framework may be used to create and/or train the machine-learning model. In effect, the evaluator 520 & controller 510 modules may police the run time kernel with active telemetry from the XPUs to provide feedback and insights into the kernel generator module for future improvements with or without Machine Learning (ML) support), a second processing unit device assignment (paragraph [0056]: The proposed concept may introduce a power, thermal & energy-aware cost function in DPC++ oneAPI that supports task graph re-generation (dynamically without compromising functional accuracy—e.g., AI (Artificial Intelligence) or quality in terms of Media/Graphics) and associated compute Kernels based on static telemetry (graph parsing), run-time telemetry (data dependent) and histogram of past usage telemetry. It may provide graph and kernel partitioning based on (CXL) memory, input/output and discrete memory control hub capabilities. It may provide hardware and application awareness, e.g., to support virtual machine migration and to dynamically adapt a kernel to available newer hardware).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 5-7, 16-18, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Poornachandran (US 2022/0326991, hereinafter Poornachandran) in view of Nurvitadhi et al. (US 2022/0114495, hereinafter Nurvitadhi).
Regarding claims 5 and 16, Poornachandran does not disclose wherein the input model is a web-based model. Nurvitadhi discloses wherein the input model is a web-based model (paragraph [0025]: an XPU may be implemented by a heterogeneous computing system including multiple types of processor circuitry (e.g., one or more FPGAs, one or more CPUs, one or more GPUs, one or more DSPs, etc.; paragraph [0039]: The AutoML architecture 100 of the illustrated example includes example optimized applications 104, example optimized middleware and frameworks 106, and example application programming interfaces (APIs) 108. In some examples, the optimized applications 104 can be implemented by applications (e.g., software applications, web- or browser-based applications, etc.) that are customized, tailored, and/or otherwise optimized to effectuate the identification and/or generation of a composable ML compute node; paragraph [0040]: The APIs 108 of the illustrated example can be invoked to program, develop, and/or otherwise generate an AI/ML application by at least one of direct programming or API-based programming; paragraph [0043]: the analysis tools 116 can instantiate emulator(s) to emulate all of the hardware and/or software features of the composable ML compute node to generate and/or otherwise output one or more evaluation parameters).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Poornachandran’s XPU-aware runtime/tracing/graph generation techniques to workloads coming from a web/browser context because Nurvitadhi expressly teaches web/browser-based applications as a source of ML workloads and exposes APIs and middleware to connect those applications to AutoML/runtime tooling, thereby resulting in a system in which the “input model” (i.e., the traced model for graph generation and placement) originates from a web-based application. The motivation would have been to provide techniques to improve access and availability of Machine Learning (ML) to various applications and use cases (Nurvitadhi paragraph [0026]).
Regarding claims 6 and 17, Poornachandran does not disclose wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to construct the web-based model from a plurality of browser application programming interfaces. Nurvitadhi discloses wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to construct the web-based model from a plurality of browser application programming interfaces (paragraph [0025]: an XPU may be implemented by a heterogeneous computing system including multiple types of processor circuitry (e.g., one or more FPGAs, one or more CPUs, one or more GPUs, one or more DSPs, etc.; paragraph [0039]: The AutoML architecture 100 of the illustrated example includes example optimized applications 104, example optimized middleware and frameworks 106, and example application programming interfaces (APIs) 108. In some examples, the optimized applications 104 can be implemented by applications (e.g., software applications, web- or browser-based applications, etc.) that are customized, tailored, and/or otherwise optimized to effectuate the identification and/or generation of a composable ML compute node; paragraph [0040]: The APIs 108 of the illustrated example can be invoked to program, develop, and/or otherwise generate an AI/ML application by at least one of direct programming or API-based programming; paragraph [0043]: the analysis tools 116 can instantiate emulator(s) to emulate all of the hardware and/or software features of the composable ML compute node to generate and/or otherwise output one or more evaluation parameters).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Poornachandran’s XPU-aware runtime/tracing/graph generation techniques to workloads coming from a web/browser context because Nurvitadhi expressly teaches web/browser-based applications as a source of ML workloads and exposes APIs and middleware to connect those applications to AutoML/runtime tooling, thereby resulting in a system in which the “input model” (i.e., the traced model for graph generation and placement) originates from a web-based application. The motivation would have been to provide techniques to improve access and availability of Machine Learning (ML) to various applications and use cases (Nurvitadhi paragraph [0026]).
Regarding claims 7 and 18, Poornachandran does not disclose wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to trace the execution of the input model inside a browser renderer process. Nurvitadhi discloses wherein processor circuitry is further to at least one of instantiate or execute the machine readable instructions to trace the execution of the input model inside a browser renderer process (paragraph [0025]: an XPU may be implemented by a heterogeneous computing system including multiple types of processor circuitry (e.g., one or more FPGAs, one or more CPUs, one or more GPUs, one or more DSPs, etc.; paragraph [0039]: The AutoML architecture 100 of the illustrated example includes example optimized applications 104, example optimized middleware and frameworks 106, and example application programming interfaces (APIs) 108. In some examples, the optimized applications 104 can be implemented by applications (e.g., software applications, web- or browser-based applications, etc.) that are customized, tailored, and/or otherwise optimized to effectuate the identification and/or generation of a composable ML compute node; paragraph [0040]: The APIs 108 of the illustrated example can be invoked to program, develop, and/or otherwise generate an AI/ML application by at least one of direct programming or API-based programming; paragraph [0043]: the analysis tools 116 can instantiate emulator(s) to emulate all of the hardware and/or software features of the composable ML compute node to generate and/or otherwise output one or more evaluation parameters). While Nurvitadhi does not literally phrase “browser renderer process,” it expressly treats browser/web applications as sources of ML workloads and includes interfaces and middleware to integrate them into AutoML configuration/analysis flows.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Poornachandran’s XPU-aware runtime/tracing/graph generation techniques to workloads originating from a web/browser context as taught by Nurvitadhi, thereby placing the tracing instrumentation at the runtime where the browser model executes (i.e., the browser renderer process) so the runtime can collect accurate dynamic traces and telemetry for task-graph generation or re-generation and XPU placement. The motivation would have been to provide techniques to improve access and availability of Machine Learning (ML) to various applications and use cases (Nurvitadhi paragraph [0026]).
Regarding claim 23, Poornachandran does not disclose wherein the processing unit selection service is a proxy inside a browser process to communicate between the graph scheduler and the system-wide processing unit selection policy provider. Nurvitadhi discloses wherein the processing unit selection service is a proxy inside a browser process to communicate between the graph scheduler and the system-wide processing unit selection policy provider (paragraph [0025]: an XPU may be implemented by a heterogeneous computing system including multiple types of processor circuitry (e.g., one or more FPGAs, one or more CPUs, one or more GPUs, one or more DSPs, etc.; paragraph [0039]: The AutoML architecture 100 of the illustrated example includes example optimized applications 104, example optimized middleware and frameworks 106, and example application programming interfaces (APIs) 108. In some examples, the optimized applications 104 can be implemented by applications (e.g., software applications, web- or browser-based applications, etc.) that are customized, tailored, and/or otherwise optimized to effectuate the identification and/or generation of a composable ML compute node; paragraph [0040]: The APIs 108 of the illustrated example can be invoked to program, develop, and/or otherwise generate an AI/ML application by at least one of direct programming or API-based programming; paragraph [0043]: the analysis tools 116 can instantiate emulator(s) to emulate all of the hardware and/or software features of the composable ML compute node to generate and/or otherwise output one or more evaluation parameters). 
While Nurvitadhi does not literally use the phrase “browser process,” it expressly treats browser/web applications as sources of ML workloads and provides the API/middleware patterns and tooling that would naturally be implemented as in-browser proxies or browser-side components to present model/operational parameters and to communicate with backend AutoML/selection service.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Poornachandran’s XPU-aware runtime/tracing/graph-generation and assignment architecture with Nurvitadhi’s browser/API integration to implement the claimed proxy-in-browser processing-unit selection service. The motivation would have been to provide techniques to improve access and availability of Machine Learning (ML) to various applications and use cases (Nurvitadhi paragraph [0026]).
Claims 10 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Poornachandran (US 2022/0326991, hereinafter Poornachandran) in view of Intel “Unlocking the Power of Intel® Deep Link Part One: Client Artificial Intelligence (AI) Using Intel® GPUs,” 2021, hereinafter Intel.
Regarding claims 10 and 21, Poornachandran does not disclose wherein the first processing unit device assignment is based on utilization of a deep-link technology connection. Intel discloses wherein the first processing unit device assignment is based on utilization of a deep-link technology connection (page 4: Using OpenVINO™, it is possible to select which of the available compute devices is best used to run inference with each CNN. One device - CPU, iGPU or dGPU - can be used to run both CNN models, or the more efficient path may be to run the Style Transfer model on one device and channel the Upscale model through another. A test implementation of the pipeline (with the pre- and post-processing and CODEC stages being performed by the CPU, upscaling performed by the iGPU, and style transfer performed by the dGPU) resulted in the following performance and device utilization figures:
[media_image1.png: performance and device utilization figures]
). Intel therefore expressly teaches basing device assignment on Deep Link-enabled multi-GPU utilization and partitioning (i.e., choosing device targets per CNN based on Deep Link coordination).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Poornachandran’s XPU-aware runtime/tracing/graph-generation system so that the “first processing unit device assignment” is determined using Deep Link-based device utilization/partitioning information as taught by Intel, thereby considering Deep Link device targets and utilization (e.g., assign Style Transfer to dGPU and Upscale to iGPU) when producing the first device assignment. The motivation would have been to provide techniques to offer significant gains in both performance and efficiency, boosting the functional capabilities of a given application by offering expanded computing capabilities and processing options (Intel page 2).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SISLEY N. KIM whose telephone number is (571)270-7832. The examiner can normally be reached M-F 11:30AM-7:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, April Y. Blair can be reached on (571)270-1014. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SISLEY N KIM/Primary Examiner, Art Unit 2196 01/17/2026