DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 09/03/2025 ("Arguments/Remarks") have been fully considered but they are not persuasive.
Argument 1 (page 9): Applicant contends: “Applicant submits that Dirac does not describe or suggest deleting result data, let alone deleting result data stored in association with a record with respect to all iterations subsequent a last successful iteration. Rather, Dirac describes a system that modifies datasets by replacing features (PV1, PV2, PV3) with, for example, a mean value, a random value, and a median value, which are then run through a model to produce prediction results. Applicant submits that the system of Dirac is performing controlled variable substitution for re-evaluation, not resuming failed training jobs or deleting data as provided in amended Claim 1.”
Regarding the above argument, the amended claim limitation is taught by the newly introduced reference Beyer et al.; see the 35 U.S.C. § 103 rejection section below.
Argument 2 (page 9): Applicant contends: “As such, Dirac, in combination with the cited art, fails to describe or suggest at least: “determine the accessed model training operation has previously been executed by a different processing worker pod”, “determine a last successful iteration of the model training operation performed by the different processing worker pod”, or “delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation.””
Regarding the above argument, the Examiner notes that Mundra et al., ¶[0018], describes how the client device manages multiple workflow tasks across different worker threads, including monitoring and renewing leases to ensure continuity of execution. This indicates awareness of which worker threads (or pods) have previously executed a particular workflow task. In addition, Mundra ¶[0060] explains that when execution of a workflow task is suspended, the system retrieves the corresponding state information and resumes execution from the point where it was last suspended, effectively identifying the last successful iteration of the task previously performed by another worker pod. Applicant’s argument regarding the limitation “delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation” is addressed by the newly introduced reference Beyer et al.; see the 35 U.S.C. § 103 rejection section below.
As to the remaining dependent claims, Applicant argues that they are allowable due to their respective direct and indirect dependencies upon one of the aforementioned independent claims. The Examiner respectfully disagrees: as explained above in this “Response to Arguments” section, the independent claims are not allowable.
Claim Rejections - 35 USC § 112: New Matter
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claim(s) 7 – 9 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Specifically, claim 7 recites “wherein the set of hyperparameter configurations for each model training operation comprises .”, which is considered new matter because the original disclosure does not appear to support the full set of hyperparameter configurations recited for each model training operation; see Applicant’s specification ¶[0013].
Dependent claims 8 – 9 do not resolve the noted deficiencies and are therefore rejected on the same basis.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
“the processing master pod being configured to …,” “processing worker pods … being configured to …,” and “each worker pod … being configured to …” in claim 1.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1 – 4, 7, 9 – 10, 13 – 16 and 19 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Narayan et al., "Ultron-AutoML: an open-source, distributed, scalable framework for efficient hyper-parameter optimization.", (hereafter Narayan) in view of Dirac et al., Pub. No.: US10963810B2, (hereafter Dirac), Mundra et al., Pub. No.: US20220147388A1, (hereafter Mundra) and Beyer et al., Pub. No.: US11182209B2, (hereafter Beyer).
Regarding claim 1, Narayan teaches: A computing system for optimising a machine learning process, the computing system being implemented using a cluster computing infrastructure comprising a plurality of computing nodes, the computing system comprising at an application level:
(Narayan, page: 6, “For scalability, we rely on the Kubernetes cluster auto-scaler. The cluster [implemented using a cluster computing infrastructure] auto-scaler minimizes node underutilization and therefore compute cost by dynamically commissioning and decommissioning nodes [comprising a plurality of computing nodes,]. Kubernetes can also shift in-progress jobs out of unavailable nodes to available ones. This allows us to effectively leverage intrinsically unreliable computes like pre-emptible GPU nodes which are up to 70% more cost effective than regular nodes for training models”)
a processing master pod arranged to manage the optimisation,
(Narayan, page: 6, “E. Master and Workers The master and worker pods [a processing master pod] execute the HPO [arranged to manage the optimisation] (i.e.: hyper-parameter optimization (HPO)) Job as follows: 1) The master pod samples HP configurations and fetches their scores by calling a stub of the model’s score function as in step 5, Fig. 1. When the stub is called, the framework pushes the HP configurations into the Work queue and the master pod waits for their scores on the Results queue.”)
the processing master pod being configured to maintain a shared work queue comprising a plurality of machine learning model training operations,
[the processing master pod being configured to maintain] [a shared work queue] [comprising a plurality of machine learning model training operations] (Narayan, Fig. 8)
each model training operation comprising an associated set of hyperparameter configurations to be evaluated during the training operation,
(Narayan, page: 1, “Additionally, the models need to be extensively fine-tuned for hyper-parameters such as the learning rate, regularization terms such as drop-out and architecture related choices related to the depth and width. This entails running many Deep Learning training jobs [each model training operation], each with a different hyper-parameter configuration, in parallel, or, if used in conjunction with a hyper-parameter optimization algorithm [comprising an associated set of hyperparameter configurations to be evaluated during the training operation,], sequentially with a lower degree of parallelism.”)
wherein each training operation is configured to be executed for a pre-defined number of iterations;
(Narayan, page: 3, “A. Specifying and Running an HPO Job The user submits a HPO Job via HTTP POST method. The HTTP POST payload, shown in Fig. 2, is a JSON object that fully defines the HPO Job. The JSON object values include the packaged code-base for executing a model training job, the hyper-parameter search space, the choice of hyper-parameter tuning strategy from supported strategies – Random Search, TPE [20], and REINFORCE based methods [32]. The overall computational budget, characterized in terms of number of training jobs [wherein each training operation], is set via the num iterations key [is configured to be executed for a pre-defined number of iterations].”)
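By way of a non-limiting illustration only, the num iterations budget Narayan describes can be pictured as one key in the HTTP POST payload that defines an HPO Job. The following Python sketch is hypothetical; the key names and the bucket URI are assumptions, not Narayan's actual Fig. 2 schema:

    # Hypothetical HPO job payload of the kind Narayan describes (submitted
    # via HTTP POST); key names and values are illustrative assumptions.
    hpo_job = {
        "code_base": "s3://example-bucket/training-code.tar.gz",  # packaged code-base (hypothetical URI)
        "search_space": {"learning_rate": [1e-4, 1e-1],
                         "dropout": [0.1, 0.5]},
        "strategy": "random_search",   # or TPE / REINFORCE-based, per Narayan
        "num_iterations": 50,          # overall budget: number of training jobs
    }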
a shared repository configured to store a plurality of records, each record corresponding to one of the model training operations in the shared work queue; and
(Narayan, page: 7, “1) During training, inside the Data Manager, the data from the object store [a shared repository configured to store a plurality of records] is incrementally streamed, passed through the shuffling-augmenting-batching operations and the output batches are pushed into the tf.data Queue. When the Data Manager API Controller receives a request for a set of training batches [each record corresponding to one of the model training operations] (i.e.: The requests from the Data Manager API Controller for training batches represent operations where these batches are dequeued and fed into the model), it dequeues them from the tf.data Queue [in the shared work queue] (i.e.: tf.data Queue (or the underlying pipeline mechanism ensuring that batches are ready for consumption) acts as the shared queue storing the records before they are retrieved for training) and serves them. The pipeline works to ensure that the tf.data Queue is always filled to capacity. The Data Manager is multi-threaded, with a pipeline running in each, all injecting data into the tf.data Queue.”)
a plurality of processing worker pods, each worker pod being in operative communication with the shared work queue and the shared repository, and being configured to:
(Narayan, page: 4, “4) The Job Agent acquires resources from the Resource Limiter, launches the master and creates two queues: Work queue, Results queue. 5) The master, on initialization, starts a pool of parallel worker pods [a plurality of processing worker pods]. The master and workers co-ordinate via the Work queue [each worker pod being in operative communication with the shared work queue and the shared repository] (i.e.: the Work queue is where the master enqueues hyperparameter configurations, and the worker pods dequeue these tasks, meaning they are in operative communication with the Work queue), Results queue to run the HPO algorithm to completion 6) Finally, the Completion Manager deletes the master, associated workers, Results queue and Work queue, marking the end of the HPO Job.”)
access, from the shared work queue, a model training operation;
(Narayan, page: 4, “4) The Job Agent acquires resources from the Resource Limiter, launches the master and creates two queues: Work queue, Results queue. 5) The master, on initialization, starts a pool of parallel worker pods. The master and workers co-ordinate via the Work queue [access, from the shared work queue], Results queue to run the HPO algorithm to completion [a model training operation] (i.e.: each worker pod dequeues a task (i.e., a model training operation) from the Work queue, executes the assigned training job) 6) Finally, the Completion Manager deletes the master, associated workers, Results queue and Work queue, marking the end of the HPO Job.”)
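To illustrate the master/worker coordination Narayan describes (the master enqueues hyperparameter configurations on the Work queue; workers dequeue them, execute, and report scores on the Results queue), the following is a minimal Python sketch. The helper train_and_score and all names are hypothetical stand-ins, not Narayan's implementation:

    # Illustrative sketch of queue-based master/worker coordination.
    import multiprocessing as mp
    import random

    def train_and_score(config):
        # Stand-in for the containerized training job; returns a validation score.
        return random.random()

    def worker(work_q, results_q):
        while True:
            config = work_q.get()              # dequeue a model training operation
            if config is None:                 # sentinel: the master signals completion
                break
            results_q.put((config, train_and_score(config)))

    if __name__ == "__main__":
        work_q, results_q = mp.Queue(), mp.Queue()
        pool = [mp.Process(target=worker, args=(work_q, results_q)) for _ in range(4)]
        for p in pool:
            p.start()
        for lr in (1e-4, 1e-3, 1e-2):          # the master samples HP configurations
            work_q.put({"learning_rate": lr})
        for _ in range(3):
            print(results_q.get())             # the master waits on the Results queue
        for _ in pool:
            work_q.put(None)                   # shut the workers down
        for p in pool:
            p.join()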
retrieve, from the shared repository, the corresponding record for the accessed model training operation;
(Narayan, page: 7, “This container can also be extended. Models can save checkpoint files within separate folders named as per the respective epochs. As soon as all checkpoint files for an epoch have been written to its corresponding folder, the model can create an indicator file inside it which commands the checkpoint syncer to upload the epoch’s checkpoint folder to the object store. During training [the corresponding record for the accessed model training operation], the checkpoint recovery container retrieves all folders from the object store [retrieve, from the shared repository], sorts them in descending order to get the checkpoint files for the latest epoch.”)
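Narayan's checkpoint-recovery step (retrieve all epoch folders from the object store, sort them, take the latest complete epoch) can be sketched as follows; the folder layout and the "indicator" file name are assumptions for illustration only:

    # Illustrative sketch: recover the latest fully written checkpoint.
    from pathlib import Path

    def latest_checkpoint(root):
        # Checkpoint folders are assumed to be named by epoch number; an
        # "indicator" file marks a fully written checkpoint, per Narayan.
        done = [d for d in Path(root).iterdir()
                if d.is_dir() and (d / "indicator").exists()]
        return max(done, key=lambda d: int(d.name), default=None)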
after the pre-defined number of iterations for the accessed model training operation have been executed; and
[after the pre-defined number of iterations for the accessed model training operation have been executed;] (Narayan, Fig. 2)
Narayan does not teach:
for each executed iteration, output evaluation result data associated with a corresponding iteration to the shared repository for storage in the corresponding record
determine the accessed model training operation has previously been executed by a different processing worker pod: determine a last successful iteration of the model training operation performed by the different processing worker pod
outputting steps with respect to each of the iterations subsequent to the last successful iteration
delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation implement each executing
Dirac teaches:
for each executed iteration, output evaluation result data associated with a corresponding iteration to the shared repository for storage in the corresponding record
(Dirac, (col. 48, 53:ff – col. 49. 1:ff), “If the accuracy/quality measures 2630 are satisfactory, the candidate model 2620 may be designated as an approved model 2640 in the depicted embodiment. Otherwise, any of several techniques may be employed in an attempt to improve the quality or accuracy of the model's predictions. Model tuning 2672 may comprise modifying the set of independent or input variables being used for the predictions, changing model execution parameters (such as a minimum bucket size or a maximum tree depth for tree-based classification models), and so on, and executing additional training runs 2618. Model tuning may be performed iteratively using the same training and test sets, varying some combination of input variables and parameters in each iteration [for each executed iteration] in an attempt to enhance the accuracy or quality of the results. In another approach to model improvement, changes 2674 may be made to the training and test data sets for successive training-and-evaluation iterations [output evaluation result data associated with a corresponding iteration]. For example, the input data set may be shuffled (e.g., at the chunk level and/or at the observation record level), and a new pair of training/test sets may be obtained for the next round of training. In another approach, the quality of the data may be improved by, for example, identifying observation records whose variable values appear to be invalid or outliers, and deleting such observation records from the data set.”)
to the shared repository for storage in the corresponding record
(Dirac, (col. 25 26:ff), “In accordance with the selected distribution strategy and processing plan, a set of resources may be identified for Jk (element 957). The resources (which may include compute servers or clusters, storage devices, and the like) may be selected from the MLS-managed shared pools [to the shared repository for storage in the corresponding record], for example, and/or from customer-assigned or customer-owned pools. JK's operations may then be performed on the identified resources (element 960), and the client on whose behalf Jk was created may optionally be notified when the operations complete (or in the event of a failure that prevents completion of the operations).”)
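For illustration, per-iteration persistence of evaluation results into a per-operation record might look like the following sketch; the sqlite3 schema and helper are hypothetical assumptions, not Dirac's MLS API:

    # Illustrative sketch of a shared repository holding per-iteration results.
    import sqlite3

    # Hypothetical schema: one row per (operation, iteration) evaluation result.
    conn = sqlite3.connect("shared_repository.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS results (
                        operation_id TEXT,
                        iteration    INTEGER,
                        score        REAL,
                        PRIMARY KEY (operation_id, iteration))""")

    def record_iteration(operation_id, iteration, score):
        # Store this iteration's evaluation result in the operation's record.
        conn.execute("INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
                     (operation_id, iteration, score))
        conn.commit()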
Dirac and Narayan are related to the same field of endeavor (i.e., distributed computing architecture). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Dirac with the teachings of Narayan to efficiently manage and respond to potential data redundancy across datasets, supporting accurate and timely actions such as notifications to clients (Dirac, Abstract).
Narayan in view of Dirac does not teach:
determine the accessed model training operation has previously been executed by a different processing worker pod: determine a last successful iteration of the model training operation performed by the different processing worker pod
outputting steps with respect to each of the iterations subsequent to the last successful iteration
delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation implement each executing
Mundra teaches:
determine the accessed model training operation has previously been executed by a different processing worker pod:
(Mundra, “[0018] In other instances, the client device may separate the transmission of requests for additional workflow tasks and the lease time interval. This enables the client device to request additional workflow tasks while other workflow tasks are still being processed by the client device to maintain a high processing load. For example, the client device may execute one or more poller threads that can request additional workflow tasks at a time just before a set of workflow tasks are expected to be completed by a corresponding set of worker threads (e.g., when the lease time interval has not yet expired). The client device may receive the new workflow tasks as the previous workflow tasks are returned to the server and assign the new workflow tasks to the now idle worker threads [determine the accessed model training operation has previously been executed by a different processing worker pod].”)
determine a last successful iteration of the model training operation performed by the different processing worker pod
(Mundra, “[0060] When the delay terminates, work-item executor may retrieve the suspended workflow tasks from temporary workflow task queue 308 and the corresponding state information. Work-item executor 140 may then resume execution of each workflow task using the worker threads (e.g., workers 212-1-212-n) [performed by the different processing worker pod] at the point of execution when execution was suspended [determine a last successful iteration of the model training operation]. Work-item executor 140 may output updated state information to heartbeaters 116 (through box 120) to enable heartbeaters 116 to continue to monitor the execution of the worker threads.”)
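The resume-from-suspension behavior Mundra describes, applied to the claimed context, can be sketched as follows; this reuses the hypothetical results table sketched above and is not Mundra's implementation:

    # Illustrative sketch: resume an operation from its last successful iteration.
    def resume_training(operation_id, total_iterations, conn):
        # The highest iteration with a stored result marks the last successful
        # iteration of the operation as previously executed by another worker.
        row = conn.execute(
            "SELECT MAX(iteration) FROM results WHERE operation_id = ?",
            (operation_id,)).fetchone()
        last_ok = row[0] if row[0] is not None else -1
        for i in range(last_ok + 1, total_iterations):
            pass  # execute iteration i, then record its result as sketched above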
Mundra, Narayan and Dirac are related to the same field of endeavor (i.e., distributed computing architecture). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Mundra with the teachings of Narayan and Dirac to add dynamic resource allocation and improve fault tolerance by enabling worker threads to be repurposed during processing delays (Mundra, Abstract).
Narayan in view of Dirac and Mundra does not teach:
delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation implement each executing
Beyer teaches:
delete, from the corresponding record in the shared repository, any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation implement each executing and
(Beyer, “[0040] The leader job scheduler 201 may receive a request to terminate a job 205 that is to be performed in the future from the client computing device 401. The leader job scheduler 201 may determine which of the one or more job schedulers 201 in the distributed job scheduling system 200 is managing the job 205 to be terminated by sending a query to the data store 202 regarding a job scheduler 201 that is currently managing the job 205 to be terminated. In particular embodiments, the leader job scheduler 201 may determine which of the one or more job schedulers is managing the job 205 to be terminated based on local records. The leader job scheduler 201 may forward the request to terminate the job 205 to the determined job scheduler. The job scheduler 201 managing the job 205 to be terminated may send a request to delete the job 205 to the data store 202 [delete, from the corresponding record in the shared repository]. On receiving the request to delete the job 205, the data store 202 may delete the job description for the job and stored status information. The job scheduler 201 may eliminate all the local data associated with the job 205 [any evaluation result data stored in association with the corresponding record in respect of all iterations subsequent to a last successful iteration of the model training operation implement each executing] (i.e.: the job scheduler 201 delete all data related to the terminated job/process (job 205) in the data store 202). The job scheduler 201 may terminate the handler for the job 205. Although this disclosure describes terminating a job that is to be performed in the future in a particular manner, this disclosure contemplates terminating a job that is to be performed in the future in any suitable manner.”)
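The recited deletion step, analogous to Beyer's deletion of a job's stored status information from the data store, can be sketched against the same hypothetical results table; this is illustrative only, not Beyer's data store:

    # Illustrative sketch: purge results recorded after the last successful iteration.
    def purge_after_last_success(operation_id, last_ok, conn):
        # Remove evaluation results for every iteration after the last
        # successful one, so re-execution resumes against a clean record.
        conn.execute(
            "DELETE FROM results WHERE operation_id = ? AND iteration > ?",
            (operation_id, last_ok))
        conn.commit()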
Beyer, Narayan, Dirac and Mundra are related to the same field of endeavor (i.e., distributed computing architecture). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Beyer with the teachings of Narayan, Dirac and Mundra to coordinate job execution and tracking by enabling centralized storage, dispatch and status updating of multi-step jobs performed across multiple worker systems (Beyer, Abstract).
Regarding claim 2, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 1.
Dirac further teaches: wherein each model training operation has an associated completion time period within which execution of each of the iterations is to be completed
(Dirac, (col. 18, 52:ff – col. 19. 1:ff), “At t5, the portion of J3 on which J4 depends may be complete, and the client may be notified accordingly. However, J4 also depends on the completion of J2, so J4 cannot be started until J2 completes at t6. J3 continues execution until t8. J4 completes at t7, earlier than t8. The client is notified regarding the completion of each of the jobs corresponding to the respective API invocations API1-API4 in the depicted example scenario. In some embodiments, partial dependencies between jobs may not be supported—instead, as mentioned earlier, in some cases such dependencies may be converted into full dependencies by splitting multi-phase jobs into smaller jobs. In at least one implementation, instead of or in addition to being notified when the jobs corresponding to the API invocations are complete (or when phases of the jobs are complete), clients may be able to submit queries to the MLS to determine the status (or the extent of completion) of the operations corresponding to various API calls. For example, an MLS job [wherein each model training operation] monitoring web page may be implemented, enabling clients to view the progress of their requests (e.g., via a “percent complete” indicator for each job), expected completion times [has an associated completion time period within which execution of each of the iterations is to be completed], and so on. In some embodiments, a polling mechanism may be used by clients to determine the progress or completion of the jobs.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Dirac with the teachings of Narayan, Mundra and Beyer for the same reasons disclosed for claim 1.
Claim 10 recites analogous limitations to claim 1, so it is rejected under a similar rationale.
Regarding claim 3, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 2.
Mundra further teaches: wherein upon expiration of the completion time period, if the execution of the corresponding iteration is incomplete: the iteration is deemed to not have been successful; and
(Mundra, “[0066] For example, if a worker thread fails to execute a workflow task [the iteration is deemed to not have been successful] within the time interval of the TTL value [wherein upon expiration of the completion time period, if the execution of the corresponding iteration is incomplete:], the heartbeater may first determine if the worker thread is executing. If the worker thread is executing, then heartbeater may request to renew the token (e.g., reset the time interval of the TTL value). This gives the worker thread more time to execute the workflow task and prevents the client device from having to request the workflow task, receive the workflow task, and re-execute the workflow task from the beginning. If the worker thread has been executing beyond the time interval of the TTL value, the heartbeater thread may terminate the worker thread. The workflow task may be immediately reassigned to another worker thread so that the workflow task may execute within the time interval of the TTL.”)
the model training operation is configured to be returned to the shared work queue for access
(Mundra, “[0041]Work-item executor 140 is responsible for executing workers 112. Work-item executor 140 may also manage interrupts and delay requests received from the local execution environment of the client device (e.g., such as processor interrupt), from server host 128, user input, from other devices, and/or the like. For example, when the client device executes a client-side delay, work-item executor 140 pauses execution of workers 112 and causes the workflow tasks [the model training operation] to be transferred to the temporary workflow task queue [is configured to be returned to the shared work queue for access]. Work-item executor 140 may then execute worker threads of workers 112 that are configured for other tasks until the client-side delay terminates. Work-item executor 140 may also receive external delays from server host 128 and/or other devices. Work-item executor may follow the same process for facilitating external delays and resuming execution upon the termination of external delays. For external delays, work-item executor 140 may transmit the state information to the device requesting the delay to provide an indication of the state of the workflow execution.”)
and execution by a different one of the plurality of processing worker pods.
(Mundra, “[0066] For example, if a worker thread fails to execute a workflow task within the time interval of the TTL value, the heartbeater may first determine if the worker thread is executing. If the worker thread is executing, then heartbeater may request to renew the token (e.g., reset the time interval of the TTL value). This gives the worker thread more time to execute the workflow task and prevents the client device from having to request the workflow task, receive the workflow task, and re-execute the workflow task from the beginning. If the worker thread has been executing beyond the time interval of the TTL value, the heartbeater thread may terminate the worker thread. The workflow task may be immediately reassigned to another worker [and execution by a different one of the plurality of processing worker pods] thread so that the workflow task may execute within the time interval of the TTL.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Mundra with the teachings of Narayan, Dirac and Beyer for the same reasons disclosed for claim 1.
Claim(s) 13 and 19 recite analogous limitations to claim 3, so they are rejected under a similar rationale.
Regarding claim 4, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 2.
Mundra further teaches: wherein each worker pod is configured to, after executing each iteration of the model training operation, reset the completion time period in relation to a subsequent iteration of the model training operation
(Mundra, “[0066] For example, if a worker thread fails to execute a workflow task [wherein each worker pod is configured to, after executing each iteration of the model training operation] within the time interval of the TTL value, the heartbeater may first determine if the worker thread is executing. If the worker thread is executing, then heartbeater may request to renew the token (e.g., reset the time interval of the TTL value) [reset the completion time period in relation to a subsequent iteration of the model training operation]. This gives the worker thread more time to execute the workflow task and prevents the client device from having to request the workflow task, receive the workflow task, and re-execute the workflow task from the beginning. If the worker thread has been executing beyond the time interval of the TTL value, the heartbeater thread may terminate the worker thread. The workflow task may be immediately reassigned to another worker thread so that the workflow task may execute within the time interval of the TTL.”)
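Mundra's token (TTL) renewal, mapped to resetting the completion time period after each iteration, can be sketched as follows; the Lease class and all names are hypothetical, not Mundra's implementation:

    # Illustrative sketch of a per-iteration lease (TTL) reset.
    import time

    class Lease:
        # Hypothetical per-operation lease, analogous to Mundra's TTL token.
        def __init__(self, ttl_seconds):
            self.ttl = ttl_seconds
            self.renew()

        def renew(self):
            # Reset the completion time period.
            self.expires_at = time.monotonic() + self.ttl

        def expired(self):
            return time.monotonic() > self.expires_at

    def run_operation(iterations, ttl_seconds=60.0):
        lease = Lease(ttl_seconds)
        for i in range(iterations):
            if lease.expired():
                return i        # incomplete: deemed unsuccessful; requeue the operation
            pass                # execute iteration i
            lease.renew()       # reset the period for the subsequent iteration
        return iterations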
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Mundra with the teachings of Narayan, Dirac and Beyer for the same reasons disclosed for claim 1.
Claim(s) 14 and 20 recite analogous limitations to claim 4, so they are rejected under a similar rationale.
Regarding claim 7, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 1.
Narayan further teaches: wherein the set of hyperparameter configurations for each model training operation comprises .
(Narayan, page: 3 – 4, “The user has complete control and flexibility over composing the model training [wherein the set of hyperparameter configurations for each model training operation comprises one or more of the following] code-base, including choice of dependencies. The user can specify these dependencies, including any version of popular ML frameworks such as Pytorch [13], Tensorflow or scikit-learn along with other libraries, in the setup.py file. The Ultron-AutoML framework manages the installation of these dependencies and execution of the training job as a containerized application (refer section V-B). Within the user supplied code-base, the user only needs to write a class implementation for the interface in Fig. 4. The abstract method score takes as argument, hyperparameters [(a) a combination of hyperparameter input values;] which is an HP configuration object and returns a score. B. Bring Your Own Hyperparameter Optimization Algorithm [(d) a search algorithm to be used] The user can specify a custom HPO algorithm by writing a class for the interface based on the abstraction in Fig. 5. The class needs to implement the following methods: • sample hyperparameter candidates takes as argument an integer N and returns a set of HP configurations of size N. • update state takes as argument a set of tuples, whose components comprise a HP configuration and associated ML model validation score, and updates some internal state.”)
(b) a hyperparameter search space; (c) an objective metric to be achieved as a result of the model training operation;
(Narayan, page: 2, “Hyperparameter Optimization (HPO), also referred to as AutoML in the literature, can be cast as the optimization of an unknown, possibly stochastic, objective function mapping the hyper-parameter search space [(b) a hyperparameter search space;] to a real valued scalar, the ML model’s accuracy or any other performance metric on the validation dataset. The search-space can extend beyond algorithm or architecture [(c) an objective metric to be achieved as a result of the model training operation;] specific elements to encompass the space of data pre-processing and data-augmentation techniques, feature selections, as well as choice of algorithms.”)
Regarding claim 9, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 7.
Narayan further teaches: wherein where the set of hyperparameter configurations comprises a search algorithm to be used, this search algorithm corresponds to a random search function or a grid search function.
(Narayan, page: 2, “These techniques are flexible in that they can search over variable size architectures and have shown very promising results for NAS. Gradient based methods specify the objective function as a parametric model and proceed to optimize it with respect to the hyper-parameters via gradient-descent [35], [38], [39]. Abstractly, a generic HPO algorithm [wherein where the set of hyperparameter configurations comprises a search algorithm to be used] distinct from Random Search or Grid Search [this search algorithm corresponds to a random search function or a grid search function] maintains state and repeats steps based on a state-update procedure and a sampling procedure shown in Fig. 1. This abstraction can inform the design of a distributed framework that can support any HPO algorithm.”)
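For context, the two search functions named in claim 9 can be contrasted in a short sketch (purely expository; the search space values are assumptions, not drawn from Narayan):

    # Illustrative contrast of grid search and random search over a
    # two-dimensional hyperparameter space.
    import itertools
    import random

    space = {"learning_rate": [1e-4, 1e-3, 1e-2], "dropout": [0.1, 0.3, 0.5]}

    # Grid search: evaluate every combination in the discretized space.
    grid = [dict(zip(space, combo)) for combo in itertools.product(*space.values())]

    # Random search: a fixed budget of independently sampled configurations.
    sampled = [{k: random.choice(v) for k, v in space.items()} for _ in range(5)]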
Regarding claim 15, Narayan in view of Dirac, Mundra and Beyer teach the method of claim 10.
Narayan further teaches: wherein the machine learning model is a neural network.
(Narayan, page: 2, “Neural Architecture Search (NAS) is a special type of HPO where the focus is on algorithm driven design of neural network architecture components or cells [26]. Models [wherein the machine learning model] trained with architectures composed of these algorithmically designed neural network cells [is a neural network] have been shown to outperform their hand-crafted counterparts in image recognition, object detection [57], and semantic segmentation [21], underscoring the practical importance of this field.”)
Regarding claim 16, Dirac teaches: A computer storage medium having computer-executable instructions that, upon execution by a processor, cause the processor to at least:
(Dirac, (col. 118 20:ff), “In some embodiments, system memory 9020 may be one embodiment of a computer-accessible medium configured to store program instructions [A computer storage medium having computer-executable instructions] and data as described above for FIG. 1 through FIG. 75 for implementing embodiments of the corresponding methods and apparatus. However, in other embodiments, program instructions and/or data may be received, sent or stored upon different types of computer-accessible media.”)
The remaining limitations are analogous to those of claim 1, so they are rejected under a similar rationale.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Narayan in view of Dirac, Mundra and Beyer, and in further view of SREENIVASAN et al., Pub. No.: US20210358638A1, (hereafter SREENIVASAN).
Regarding claim 8, Narayan in view of Dirac, Mundra and Beyer teach the computing system of claim 1.
Narayan in view of Dirac, Mundra and Beyer do not teach:
wherein where the set of hyperparameter configurations comprises a combination of hyperparameter input values, these hyperparameter values are randomly generated.
SREENIVASAN teaches:
wherein where the set of hyperparameter configurations comprises a combination of hyperparameter input values, these hyperparameter values are randomly generated.
(SREENIVASAN, “[0007] Various embodiments are described, wherein training the linear regression model uses a genetic method wherein the set of hyperparameter pairs [comprises a combination of hyperparameter input values] are randomly generated [these hyperparameter values are randomly generated].”)
SREENIVASAN, Narayan, Dirac, Mundra and Beyer are related to the same field of endeavor (i.e., distributed computing architecture). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teaching of SREENIVASAN with the teachings of Narayan, Dirac, Mundra and Beyer to add an optimized model selection process to the system by systematically evaluating multiple hyperparameter pairs to identify the best-performing adherence model (SREENIVASAN, Abstract).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Allen et al., Pub. No.: US8375389B2.
Discloses multiple resumption events linked to several suspended processes; each event represents a suspended process and includes an execution time and a resumption time window.
Miller et al., US7653833B1.
Discloses check-pointing a non-clustered workload to make room for a clustered workload that was running on a computer system that has suffered a hardware failure.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner
should be directed to MATIYAS T MARU whose telephone number is (571)270-0902. The examiner
can normally be reached Monday - Friday (8:00am - 4:00pm) EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a
USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to
use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached at (571)431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from
Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit
https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and
https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional
questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like
assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA)
or 571-272-1000.
/M.T.M./ Examiner, Art Unit 2148
/Ryan Barrett/Primary Examiner, Art Unit 2148