DETAILED ACTION
This action is responsive to Applicant’s reply filed 27 February 2026. This action is made non-final.
Status of the Claims
Claims 19-26 and 30-37 are currently amended.
Claims 19-40 are currently pending and under examination, of which claims 19 and 30 are independent.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on September 29, 2025 has been entered.
Response to Amendment
Applicant’s amendments to the claims have overcome each and every claim objection and 35 U.S.C. § 112(b) rejection previously set forth in the Final Office Action mailed December 29, 2025.
Applicant’s arguments regarding the art rejections are moot in view of the new grounds of rejection necessitated by Applicant’s amendment.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 19-20, 22-25, 30-31 and 33-36 are rejected under 35 U.S.C. 103 as being unpatentable over Nguyen et al. (“Spatial-Temporal Multi-Task Learning for Within-Field Cotton Yield Prediction”), hereinafter Nguyen, in view of Ahn et al. (“Deep Elastic Networks with Model Selection for Multi-Task Learning”), hereinafter Ahn, further in view of Liu et al. (“Video-based Prediction for Header-height Control of a Combine Harvester”), hereinafter Liu, and Zoph et al. (US 20190370648 A1), hereinafter Zoph, and further in view of Lin et al. (“MCUNet: Tiny Deep Learning on IoT Devices”), hereinafter Lin.
With respect to claim 19, Nguyen teaches:
A method implemented using one or more processors and comprising (“Figure 4 shows the performance of our proposed model compared with other baselines in term of RMSE metric. The Multi-Task learning and our proposed Spatial-Temporal Multi-Task learning model have the least error in all three years … our Spatial-Temporal Multi-Task learning methods show significant superiority than all the other approaches” (P. 351, Sec. 4.3, First Paragraph). Obtaining performance results of a machine learning model implies the use of a computer which further implies a processor.):
obtaining a plurality of images of crops growing in an agricultural plot (Nguyen discloses “with the introduction of the global positioning system (GPS), geographic information systems (GIS), and yield monitors along with other new technologies, we can quantify spatial variability in soil properties and crop yield in small areas of a field. As satellite and drone technologies develop, we are able to collect remote sensing images at fine resolutions to support within-field yield forecast. Within-field scale crop yield prediction provides valuable information for producers to site-specifically manages their crop, which can optimize crop production for maximum profitability. In the within-field prediction procedure, we use a 30-m grid to represent a continuous surface” (P. 344, Sec. 1).
Nguyen further discloses “our dataset includes weather data, soil properties, spectral data, and NDVI. Spectral data and NDVI are extracted from Lantsat 5 and Landsat 7 remote sensing images. The multi-spectral images were collected from 2001 to 2003 of a cotton field in west Texas. The total area is approximately 48 ha. The sensed images spatial resolution is 30 m. Hence, there are 475 grid cells under investigation” (P. 348, Sec. 4.1). See Figure 2 on P. 347 depicting a plurality of remote sensed images used to predict cotton yield.);
training a … candidate multi-task dense prediction (MT-DP) machine learning [model] … (Nguyen discloses “Figure 2 presents the framework of our prediction model. The cotton field is split into 475 grids for fine-grain prediction. We utilized the Dense and Dropout layers in the network. A shared Dense layer is used to extract latent features from all data dimensions, which are aggregated and fed into multiple sub-networks. Each sub-network represents the architecture of forecasting task in one year for all the grids. In other words, cotton yield prediction for all grids of each year are achieved in parallel via the separated sub-networks. This aggregation is shared among all task-specific sub-networks. Therefore, it helps the task-specific sub-network to learn features from other tasks and to enhance its own prediction performance” (P. 346, Sec. 3.1, ¶1-2). See (P. 348, Sec. 3.3, ¶1) describing minimizing a loss function for the prediction model.),
processing the plurality of images using [a] … candidate MT-DP machine learning [model] to perform a plurality of agricultural prediction tasks (Nguyen discloses “we propose a Multi-Task learning model to predict within-field cotton yield. As shown in Fig. 1, this model ingests many sources of data which contain features for different learning tasks, including soil topographic attributes (elevation, slope, curvature, etc.), spectral data (Blue, Green, Red, and NIR bands denoted as BAND1, BAND2, BAND3, and BAND4, respectively), normalized difference vegetation index (NDVI) during the crop seasons; and weather (temperature, rainfall, etc.) data. These multiple data sources are aggregated in the shared layer before transferring to task-specific layers. This type of design in a Multi-Task learning model makes it capable of enhancing specific learning task by utilizing all sources of information of other related tasks” (P. 345, Sec. 1, First Paragraph).
Nguyen further discloses “our dataset includes weather data, soil properties, spectral data, and NDVI. Spectral data and NDVI are extracted from Lantsat 5 and Landsat 7 remote sensing images. The multi-spectral images were collected from 2001 to 2003 of a cotton field in west Texas” (P. 348, Sec. 4.1). See Figure 2 on P. 347 depicting a plurality of remote sensed images used to predict cotton yield.)
including one or more agricultural prediction tasks that generate pixel-level predictions for the plurality of images (Nguyen discloses Figure 6 on P. 351 (reproduced below) depicting cotton yield prediction (‘pixel-level prediction’) for a region of a cotton field generated by a Spatial-Temporal Multi-Task Learning model. The predictions were generated by processing remote sensing images (‘plurality of images’) of a cotton field and other features disclosed above.
[Figure: media_image1.png, Figure 6 of Nguyen (P. 351), greyscale]
).
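The shared-layer arrangement quoted from Nguyen above (one shared Dense layer feeding a separate sub-network per year) can be sketched as follows. This is a minimal illustration only, assuming hypothetical layer sizes, random untrained weights, and four made-up input features; the names dense, init_layer, and predict_all_tasks are not taken from Nguyen.

```python
import random

def dense(inputs, weights, biases):
    """Apply one fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Create random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def predict_all_tasks(features, shared, heads):
    """Run the shared layer once, then feed its output to every task head."""
    latent = dense(features, *shared)
    return {task: dense(latent, *head)[0] for task, head in heads.items()}

rng = random.Random(0)
shared = init_layer(4, 8, rng)                  # shared layer (sizes assumed)
heads = {year: init_layer(8, 1, rng)            # one sub-network per year
         for year in (2001, 2002, 2003)}
grid_cell_features = [0.3, 0.5, 0.2, 0.7]       # hypothetical inputs for one grid cell
yields = predict_all_tasks(grid_cell_features, shared, heads)
```

Because every head reads the same latent vector, features learned for one year's task are available to the others, which is the sharing behavior the quoted passage describes.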
However, Nguyen does not teach training a multi-task dense prediction candidate machine learning model using network architecture search, which is taught by Ahn:
training a … candidate multi-task dense prediction (MT-DP) machine learning [model] using neural network layers sampled from a search space of neural network layers having different parameters using a network architecture search (NAS) (Ahn discloses NAS as a combination of an estimator and selector model, “In this work, we aim to develop an instance-aware dynamic model selection approach for a single network to learn multiple tasks … the estimator is based on a backbone (baseline) network, such as VGG or ResNet. It is structured hierarchically based on modularized blocks which consist of several convolution layers in the backbone network. It can produce multiple network models of different configurations and scales in a hierarchy. The selector is a relatively small network compared to the estimator and outputs a probability distribution over candidate network models for a given instance. The model with the highest probability is chosen by the selector from a pool of candidate models to perform the task. Note that the approach is learned to choose a model corresponding to each instance throughout all tasks. This makes it possible to share the common models or features across all tasks” (P. 6530, Sec. 1, First Paragraph).
Ahn further discloses “note that there are a vast number of candidate models produced by the estimator, and this makes it difficult for the selector to explore the extensive search space. As a simplification strategy of the daunting task, we use a block notation to shrink the search space over the candidate models. A block is defined as a disjoint collection of multiple convolution (or fully connected) layers. The block is constructed as a hierarchical structure such that a lower level of hierarchy only refers fewer channels of hidden layers in the block and a higher level refers more channels, maintaining input and output dimensions of the block” (P. 6531, Sec. 3.1, First Paragraph).
Ahn discloses “the estimator can produce different network models by selecting convolution groups from zero to all groups in every block. The selector outputs a probability distribution over the convolution groups in every block, and a network model is determined from the distribution. The overall loss function consists of a prediction loss term (e.g., cross-entropy) from the determined network model and a sparse regularization term” (P. 6531, Sec. 3.1, Figure 2).
To calculate the loss, a chosen candidate model must be trained, which implies “training a candidate MT-DP machine learning model”.
Ahn discloses using networks with different parameters as a backbone network, “we used ResNet-l and WRN-l-r as backbone networks in the MTL scenarios, where l is the number of layers and r is the scale factor on the number of convolutional channels … We also used SimpleConvNet introduced in [27, 29] as a backbone network for Mini-ImageNet. SimpleConvNet consists of four 3x3 convolutional layers (32 filters) and three fully connected layers (128 dimensions for hidden units). In the network compression scenario, we used ResNeXt-l (c × sd) [37] and VGG-l [30] to apply our methods in various backbones, where c and sd are the number of individual convolution blocks and unit depth of the convolution blocks in each layer, respectively [37]. The backbone networks are used as baseline methods performing an individual task in each scenario” (P. 6533, Sec. 4.1).),
Ahn teaches that training a candidate machine learning model using convolution blocks from a search space is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of Nguyen with the technique disclosed by Ahn to design an effective and efficient neural network model. By choosing architectures from a search space, suitable layers can be configured to build an optimized neural network architecture. Fewer computational resources are therefore used, since an optimized neural network model performs more effectively and efficiently.
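The selector behavior quoted from Ahn (a probability distribution over the convolution groups in every block, with the highest-probability option chosen) can be sketched as follows. This is a minimal illustration assuming hypothetical selector scores for three blocks with three hierarchy levels each; it omits the estimator, the loss terms, and the joint training described by Ahn.

```python
import math

def softmax(scores):
    """Convert raw selector scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_model(selector_scores):
    """For each block, pick the hierarchy level with the highest probability."""
    chosen = []
    for block_scores in selector_scores:
        probs = softmax(block_scores)
        chosen.append(max(range(len(probs)), key=probs.__getitem__))
    return chosen

# Hypothetical selector outputs: one row per block, one score per level.
scores = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.4, 3.0]]
config = select_model(scores)  # [1, 0, 2]
```

Each entry of config names how many convolution groups the corresponding block would use, so a single list identifies one candidate model within the block-structured search space.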
Furthermore, the combination of Nguyen in view of Ahn does not teach operating an agricultural vehicle based on pixel-level predictions, which is taught by Liu:
and operating one or more agricultural vehicles in the agricultural plot based on the pixel-level predictions for the plurality of images (Liu discloses “the goal of this framework is to analyze the field region in front of the combine harvester and predict when the header should be raised. As mentioned in Section I, we assume that the combine harvester is in its normal harvesting state, which means the front reel is rotating and harvesting crops. The output of the system is the predicted future time that indicates when the reel should be lifted. Furthermore, we assume the operator is making correct adjustments while operating the combine; therefore, the video contains the ground truth for the correct time to raise the header. Then the predicted time can be validated” (P. 311, Sec. III.A, First Paragraph).
Liu discloses “figure 5 shows some probability maps at frame number 90, 120, 150 and 180 in one testing clip. The colored regions (both orange and blue) represent the coarsely-segmented field region, and the color shows the probability: blue indicates more likely to be crop area; orange indicates no crop. Notice the combine harvester is driving to the right, and the crop region (blue) is gradually shrinking from right to left over time. In Figure 5b and 5c, we can clearly observe the borderline between the crops and the empty field. But in Figure 5d, there are some uncertain regions in the lower half of the field. One possible reason could be the pretrained classifier has a bias that makes it more effective at classifying crops than classifying empty field” (P. 314, Sec. IV.C, Third Paragraph). See Figures 5A-D depicting an analyzed field that can be used to determine the presence of crops.).
Liu teaches that operating a combine harvester based on image predictions used to analyze a field region is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the technique disclosed by Liu to automate harvesting activities. Image predictions can be used to identify the presence of crops as disclosed by Liu. By identifying the presence of crops, an agricultural vehicle can use the information to autonomously navigate, locate crops, collect crop data, and guide harvesting equipment like a harvester’s reel. Therefore, image predictions can be used to automate harvesting activities, which can lead to faster harvests, thereby increasing efficiency and productivity and reducing labor costs.
Furthermore, the combination of Nguyen in view of Ahn and in further view of Liu does not teach training multiple candidate machine learning models based on a satisfaction of a performance metric for the plurality of candidate machine learning models, which is taught by Zoph:
training a plurality of candidate … dense prediction … machine learning models using neural network layers sampled from a search space of neural network layers having different parameters using a network architecture search (NAS) (Zoph discloses “the system 100 determines the architecture for the neural network by searching a space of candidate architectures to identify one or more best performing architectures. Each candidate architecture in the space of candidate architectures includes (i) the same first neural network backbone that is configured to receive an input image and to process the input image to generate a plurality of feature maps and (ii) a different dense prediction cell configured to process the plurality of feature maps and to generate an output for the dense image prediction task. Thus, each candidate architecture includes the same neural network backbone as each other candidate architecture but has a different dense prediction cell from each other candidate architecture” [0030-0032].
Zoph discloses convolutions and pooling operations (‘neural network layers’) can be selected from an operator space (‘search space of neural network layers’), “The operator space, i.e., the space of possible operations from which the operation performed by each of the B blocks is selected, can include one or more of the following: (1) a convolution with a 1×1 kernel, (2) one or more atrous separable convolutions, each having a different sampling rate, and (3) one or more spatial pyramid pooling operations, each having a respective grid size” [0041]. Zoph further discloses “When the operator space includes multiple spatial pyramid pooling operations, each of the spatial pyramid pooling operations will have a different grid size” [0043].
Zoph discloses “the system can use a random search strategy. In a random search strategy, at each iteration of the process 400, the system selects one or more architectures from the space of candidate architectures uniformly at random while also selecting one or more architectures that are close to, i.e., are similar to, the currently best observed architectures, i.e., the architectures already evaluated as part of the search that have been found to perform best on the dense prediction task” [0073].
Zoph discloses “the system trains the selected one or more candidate architectures on at least a portion of the training data (step 408). That is, for each selected candidate, the system trains a neural network having the architecture until criteria for stopping the training are satisfied” [0075].),
the training of the plurality of candidate … machine learning models performed based on a satisfaction of a performance metric for the plurality of candidate MT-DP machine learning models (Zoph discloses “The system then repeatedly performs steps 406-410 until termination criteria for the search are satisfied, e.g., until a threshold number of candidate architectures have been evaluated, until the highest performing candidate architecture reaches a threshold accuracy, or until a threshold amount of time has elapsed” [0071].);
processing the plurality of images using at least one of the plurality of candidate … machine learning models to perform a plurality of … prediction tasks (Zoph discloses “the system determines the architecture for the neural network based on the one or more best performing candidate architectures (step 306). For example, the system can generate, from each of the identified best performing candidates, a final architecture, and then train the final architectures to convergence on the dense prediction task. The system can then select the best performing trained architecture, e.g., as determined based on a quality measure on the validation set, as the architecture of the neural network” [0064].
Zoph discloses “the resulting architectures can achieve state-of-the-art performance on several dense prediction tasks, including achieving 82.7% mIOU accuracy on the Cityscapes data set (street scene parsing), 71.3% mIOU accuracy on the PASCAL-Person-Part data set (person-part segmentation), and 87.9% mIOU accuracy on the PASCAL VOC 2012 data set (semantic image segmentation)” [0011].)
including one or more … prediction tasks that generate pixel-level predictions for the plurality of images (Zoph discloses “the architecture 200 receives a training input 220, i.e., one of the three images shown in FIG. 2, and generates a neural network output 250 that assigns labels to the training input. As shown in FIG. 2, the neural network output 250 is represented as an overlay over the corresponding input image, with pixels assigned the same label being in the same shade in the overlay” [0033].
Zoph discloses “In a semantic image segmentation task, the input is an image and the output is a respective label for every pixel in the image that identifies which object class the pixel belongs to, e.g., from a set of multiple foreground object classes and one or more background object classes” [0023].).
Zoph teaches training multiple candidate dense-prediction machine learning models until an accuracy threshold is reached is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the technique disclosed by Zoph to train candidate dense-prediction machine learning models until an accuracy threshold is reached. By training candidate dense-prediction machine learning models until an accuracy threshold is reached, it can be ensured that the best performing candidate model reaches a desired accuracy, thereby resulting in a reliable model that can be used for dense prediction tasks.
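The termination criteria quoted from Zoph at [0071] (a threshold number of candidates, a threshold accuracy, or a threshold amount of time) can be sketched as a search loop. This is a minimal illustration: the operator names, the toy scoring function, and all threshold values are assumptions, and sampling is purely uniform, which simplifies the random search strategy of [0073].

```python
import random
import time

OPERATORS = ["conv1x1", "atrous_sep_conv", "spatial_pyramid_pool"]  # assumed labels

def search(sample, evaluate, max_candidates=50, target_accuracy=0.75,
           time_budget_s=5.0):
    """Evaluate sampled candidates until any one termination criterion is met."""
    start = time.monotonic()
    best_arch, best_acc, evaluated = None, float("-inf"), 0
    while (evaluated < max_candidates
           and best_acc < target_accuracy
           and time.monotonic() - start < time_budget_s):
        arch = sample()
        acc = evaluate(arch)   # stands in for training and validating a candidate
        evaluated += 1
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch, best_acc, evaluated

rng = random.Random(0)

def sample():
    """Draw a hypothetical three-block cell uniformly from the operator space."""
    return [rng.choice(OPERATORS) for _ in range(3)]

def evaluate(arch):
    """Toy accuracy proxy: more distinct operators scores higher."""
    return 0.5 + 0.1 * len(set(arch))

best_arch, best_acc, evaluated = search(sample, evaluate)
```

In a real search the evaluate step would train each candidate on training data and measure it on a validation set; the toy proxy only keeps the sketch self-contained.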
Furthermore, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph does not teach neural architecture search corresponding to hardware-based constraints of a target edge computing system to be operated in association with an agricultural plot, which is taught by Lin:
the NAS to correspond to hardware-based constraints of a target edge computing system to be operated in association with the agricultural plot (Lin discloses “The number of IoT devices based on always-on microcontrollers is increasing rapidly at a historical rate, reaching 250B [2], enabling numerous applications including … precision agriculture, automated retail, etc. These low-cost, low-energy microcontrollers give rise to a brand new opportunity of tiny machine learning (TinyML). By running deep learning models on these tiny devices, we can directly perform data analytics near the sensor, thus dramatically expand the scope of AI applications” (P. 1, Sec. 1, ¶1).
Lin discloses “TinyNAS is a two-stage neural architecture search method that first optimizes the search space to fit the tiny and diverse resource constraints, and then performs neural architecture search within the optimized space. With an optimized space, it significantly improves the accuracy of the final model. … To fit the tiny and diverse resource constraints of different microcontrollers, we scale the input resolution and the width multiplier of the mobile search space [44]. … This leads to 12×9 = 108 possible search space configurations S =W×R. … Our goal is to find the best search space configuration S∗ that contains the model with the highest accuracy while satisfying the resource constraints” (P. 3, Sec. 3.1, ¶1-2).
Neural architecture search is modified to only include architectures that meet resource constraints of a microcontroller that is used in precision agriculture (therefore a microcontroller is a target edge computing system to be operated in association with an agricultural plot).).
Lin teaches that modifying neural architecture search to include only architectures that satisfy the resource constraints of a microcontroller used in precision agriculture is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the neural architecture search disclosed by Lin to modify neural architecture search to include only architectures that meet the resource constraints of target hardware. By modifying neural architecture search in this way, machine learning models that satisfy the resource constraints can be configured and embedded in microcontrollers with limited resources, thereby allowing machine learning models to perform accurate sensor-based data analytics for precision agriculture.
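The 12×9 = 108 search space configurations quoted from Lin can be sketched by enumerating width multiplier and input resolution pairs and discarding those that exceed a hardware budget. This is a minimal illustration: the specific width and resolution values, the memory cost formula, and the 320 KB budget are all assumptions chosen for the example, and the sketch only filters configurations rather than performing the two-stage search Lin describes.

```python
from itertools import product

# Assumed example values giving 9 width multipliers and 12 input resolutions.
WIDTH_MULTIPLIERS = [round(0.2 + 0.1 * i, 1) for i in range(9)]
RESOLUTIONS = list(range(48, 225, 16))

def estimated_memory_kb(width, resolution):
    """Hypothetical cost proxy: grows with width and with resolution squared."""
    return width * resolution * resolution / 64.0

def feasible_configs(memory_budget_kb):
    """Keep only the configurations that fit within the memory budget."""
    return [(w, r) for w, r in product(WIDTH_MULTIPLIERS, RESOLUTIONS)
            if estimated_memory_kb(w, r) <= memory_budget_kb]

all_configs = list(product(WIDTH_MULTIPLIERS, RESOLUTIONS))  # 108 pairs
fits = feasible_configs(memory_budget_kb=320.0)
```

Restricting the search to the entries of fits means every architecture subsequently sampled already respects the target device's limit under the assumed cost model.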
With respect to claim 20, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 19, further comprising jointly training the NAS and one or more of the plurality of candidate MT-DP machine learning models (Ahn discloses “both estimator and selector are jointly trained in a unified learning framework in conjunction with a sampling-based learning strategy, without additional computation steps” (P. 6529, Abstract).
Ahn discloses “the estimator can produce different network models by selecting convolution groups from zero to all groups in every block. The selector outputs a probability distribution over the convolution groups in every block, and a network model is determined from the distribution. The overall loss function consists of a prediction loss term (e.g., cross-entropy) from the determined network model and a sparse regularization term” (P. 6531, Sec. 3.1, Figure 2). To calculate loss, a candidate model must be trained which implies “training a candidate MT-DP machine learning model”.).
Ahn teaches jointly training a candidate model and the estimator and selector models (‘NAS’) is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of Nguyen with the technique disclosed by Ahn to design an accurate and efficient neural network model. By training the estimator and selector models that make up the NAS, the NAS learns to produce and configure candidate models that are likely to be efficient and accurate. Rather than make random configurations, the trained NAS can learn to make appropriate architecture selections that would result in only producing and training candidate models that are likely to be optimal, thereby saving computational resources and time.
With respect to claim 22, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 19, wherein the satisfaction of the performance metric is determined based on an accuracy of a corresponding candidate MT-DP machine learning model (Zoph discloses “The system then repeatedly performs steps 406-410 until termination criteria for the search are satisfied, e.g., until a threshold number of candidate architectures have been evaluated, until the highest performing candidate architecture reaches a threshold accuracy, or until a threshold amount of time has elapsed” [0071].
Zoph discloses generating multiple candidate dense-prediction machine learning models by generating candidate architectures, see [0030-0032].
Ahn discloses generating a candidate machine learning model to perform multiple tasks (‘candidate MT-DP machine learning model’), see (P. 6530, Sec. 1, First Paragraph).).
Zoph teaches training multiple candidate dense-prediction machine learning models until an accuracy threshold is reached is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the technique disclosed by Zoph to train candidate dense-prediction machine learning models until an accuracy threshold is reached. By training candidate dense-prediction machine learning models until an accuracy threshold is reached, it can be ensured that the best performing candidate model reaches a desired accuracy, thereby resulting in a reliable model that can be used for dense prediction tasks.
Ahn teaches generating a candidate machine learning model to perform multiple tasks is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of Nguyen and the training technique of Zoph with the candidate model disclosed by Ahn to generate candidate models that perform multiple tasks. By generating candidate models that perform multiple tasks, a single model can be developed to perform multiple tasks, thus saving computational resources by not having to run and store multiple models to perform individual tasks.
With respect to claim 23, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 19, wherein the at least one of the plurality of candidate MT-DP machine learning models is selected based on at least one of an accuracy metric or latency metric (Zoph discloses “the system determines the architecture for the neural network based on the one or more best performing candidate architectures (step 306). For example, the system can generate, from each of the identified best performing candidates, a final architecture, and then train the final architectures to convergence on the dense prediction task. The system can then select the best performing trained architecture, e.g., as determined based on a quality measure on the validation set, as the architecture of the neural network” [0064].
Zoph discloses “The system then repeatedly performs steps 406-410 until termination criteria for the search are satisfied, e.g., until a threshold number of candidate architectures have been evaluated, until the highest performing candidate architecture reaches a threshold accuracy, or until a threshold amount of time has elapsed” [0071]. Zoph further discloses “The neural architecture search system 100 is a system that obtains training data 102 for training a neural network to perform a dense image prediction task and a validation set 104 for evaluating the performance of the neural network on the dense image prediction task” [0027].
Ahn discloses generating a candidate machine learning model to perform multiple tasks (‘candidate MT-DP machine learning model’), see (P. 6530, Sec. 1, First Paragraph).).
Zoph teaches that training multiple candidate machine learning models and selecting a candidate model based on accuracy is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the selection technique disclosed by Zoph to select a candidate model based on accuracy. By selecting a candidate model based on accuracy, it can be ensured that the selected model performs a designated task with the highest accuracy possible, leading to increased model reliability.
Ahn teaches generating a candidate machine learning model to perform multiple tasks is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of Nguyen and the training technique of Zoph with the candidate model of Ahn to generate candidate models that perform multiple tasks. By generating candidate models that perform multiple tasks, a single model can be developed to perform multiple tasks, thus saving computational resources by not having to run and store multiple models to perform individual tasks.
With respect to claim 24, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 23, further including training the selected candidate MT-DP machine learning model to convergence (Zoph discloses “the system determines the architecture for the neural network based on the one or more best performing candidate architectures (step 306). For example, the system can generate, from each of the identified best performing candidates, a final architecture, and then train the final architectures to convergence on the dense prediction task. The system can then select the best performing trained architecture, e.g., as determined based on a quality measure on the validation set, as the architecture of the neural network” [0064].
Ahn discloses generating a candidate machine learning model to perform multiple tasks (P. 6530, Sec. 1, First Paragraph).).
Zoph teaches training a best performing candidate model to convergence is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the training technique of Zoph to only train the best performing candidate model to convergence. By only training the best performing candidate model to convergence, computational resources can be saved by not having to fully train worse-performing candidate models to convergence.
With respect to claim 25, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 24, further including deploying the selected candidate MT-DP machine learning model (Zoph discloses “the system 100 uses the trained neural network having the final architecture to process requests received by users, e.g., through the API provided by the system. That is, the system 100 can receive inputs to be processed, use the trained neural network to process the inputs, and provide the outputs generated by the trained neural network or data derived from the generated outputs in response to the received inputs” [0060].
Zoph discloses “the system can generate, from each of the identified best performing candidates, a final architecture, and then train the final architectures to convergence on the dense prediction task” [0064]. Zoph further discloses “Machine learning models can be implemented and deployed using a machine learning framework, e.g., a TensorFlow framework, a Microsoft Cognitive Toolkit framework, an Apache Singa framework, or an Apache MXNet framework” [0091].
Ahn discloses generating a candidate machine learning model to perform multiple tasks (P. 6530, Sec. 1, First Paragraph).).
Zoph teaches deploying a best performing model that has been trained to convergence is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the technique disclosed by Zoph to deploy a best performing candidate model. By deploying a best performing candidate model, users can access an accurate model, allowing users to make accurate and reliable predictions.
With respect to claim 30, the rejection of claim 19 is incorporated. The difference in scope being:
At least one non-transitory computer readable medium comprising instructions that cause at least one processor to at least (“Figure 4 shows the performance of our proposed model compared with other baselines in term of RMSE metric. The Multi-Task learning and our proposed Spatial-Temporal Multi-Task learning model have the least error in all three years … our Spatial-Temporal Multi-Task learning methods show significant superiority than all the other approaches” (P. 351, Sec. 4.3, First Paragraph). Obtaining performance results of a machine learning model implies the use of a computer which further implies “at least one non-transitory computer readable medium comprising instructions” and a processor that executes stored instructions.).
With respect to claim 31, the claim recites similar limitations corresponding to claim 20; therefore, the same rationale of rejection is applicable.
With respect to claim 33, the claim recites similar limitations corresponding to claim 22; therefore, the same rationale of rejection is applicable.
With respect to claim 34, the claim recites similar limitations corresponding to claim 23; therefore, the same rationale of rejection is applicable.
With respect to claim 35, the claim recites similar limitations corresponding to claim 24; therefore, the same rationale of rejection is applicable.
With respect to claim 36, the claim recites similar limitations corresponding to claim 25; therefore, the same rationale of rejection is applicable.
Claims 21 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin and Li et al. (US 20220230048 A1), hereinafter Li.
With respect to claim 21, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches:
the method of claim 19, wherein the satisfaction of the performance metric is determined … of a corresponding candidate MT-DP machine learning model (Zoph discloses “The system then repeatedly performs steps 406-410 until termination criteria for the search are satisfied, e.g., until a threshold number of candidate architectures have been evaluated, until the highest performing candidate architecture reaches a threshold accuracy, or until a threshold amount of time has elapsed” [0071].
Zoph discloses generating multiple candidate dense-prediction machine learning models by generating candidate architectures, see [0030-0032].
Ahn discloses generating a candidate machine learning model to perform multiple tasks (‘candidate MT-DP machine learning model’), see (P. 6530, Sec. 1, First Paragraph).).
Zoph teaches training multiple candidate dense-prediction machine learning models until an accuracy threshold is reached is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the technique disclosed by Zoph to train candidate dense-prediction machine learning models until an accuracy threshold is reached. By training candidate dense-prediction machine learning models until an accuracy threshold is reached, it can be ensured that the best performing candidate model reaches a desired accuracy, thereby resulting in a reliable model that can be used for dense prediction tasks.
Ahn teaches generating a candidate machine learning model to perform multiple tasks is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of Nguyen and the training technique of Zoph with the candidate model of Ahn to generate candidate models that perform multiple tasks. By generating candidate models that perform multiple tasks, a single model can be developed to perform multiple tasks, thus saving computational resources by not having to run and store multiple models to perform individual tasks.
However, the combination does not teach determining satisfaction of a performance metric based on a latency of a corresponding candidate machine learning model, which is taught by Li:
wherein the satisfaction of the performance metric is determined based on a latency of a corresponding candidate … machine learning model (Li discloses “the system can be configured to train the candidate neural network until stopping criteria are met, such as a number of iterations for training, a maximum period of time, convergence, or when a minimum accuracy threshold is met. The system can generate performance metrics for the accuracy and latency of the candidate neural network architecture on the target computing resources, in addition to other performance metrics” [0060-0061].
Li discloses “the stopping criteria can specify threshold ranges predetermined to be “optimal.” For example, a threshold range for optimal latency can be a threshold range from a theoretical or measured minimum latency achieved by the target computing resources. The theoretical or measured minimum latency can be based on physical characteristics of the computing resources, such as the minimum amount of time necessary for components of the computing resources to be able to physically read and process incoming data” [0057].
Li further discloses “The stopping criteria can be a minimum predetermined threshold of performance met by a current candidate network. The stopping criteria in addition or alternatively can be a maximum number of search iterations, or a maximum amount of time allocated for performing the search” [0056].).
Li teaches training candidate neural networks until an optimal latency is reached is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the stopping criteria disclosed by Li to stop candidate model training based on latency. By stopping candidate model training based on latency, computer resources and time can be saved by not letting models train beyond reaching a desired latency.
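For illustration only, the stopping criteria quoted from Li can be sketched as a single check evaluated after each search iteration. This is a minimal sketch under stated assumptions and not Li's implementation. The class and field names are hypothetical, and requiring both the accuracy threshold and the latency range to be met together is an assumption, since the quoted passages do not state how the criteria combine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoppingCriteria:
    max_iterations: int          # maximum number of search iterations
    max_seconds: float           # maximum time allocated to the search
    min_accuracy: float          # minimum accuracy threshold
    min_latency_ms: float        # theoretical or measured latency floor of the target hardware
    latency_tolerance_ms: float  # width of the "optimal" range above that floor (assumed)


def stop_reason(
    c: StoppingCriteria,
    iteration: int,
    elapsed_seconds: float,
    accuracy: float,
    latency_ms: float,
) -> Optional[str]:
    """Return the name of the first satisfied criterion, or None to keep searching."""
    if iteration >= c.max_iterations:
        return "max_iterations"
    if elapsed_seconds >= c.max_seconds:
        return "max_time"
    in_latency_range = latency_ms <= c.min_latency_ms + c.latency_tolerance_ms
    if accuracy >= c.min_accuracy and in_latency_range:
        return "performance_threshold"
    return None


# Hypothetical values: a 10 ms hardware floor with a 2 ms tolerance.
criteria = StoppingCriteria(
    max_iterations=100,
    max_seconds=3600.0,
    min_accuracy=0.90,
    min_latency_ms=10.0,
    latency_tolerance_ms=2.0,
)
```

Under these assumed values, a candidate with accuracy 0.95 and latency 11 ms ends the search, while the same candidate at 15 ms does not.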
With respect to claim 32, the claim recites similar limitations corresponding to claim 21; therefore, the same rationale of rejection is applicable.
Claims 26-27 and 37-38 are rejected under 35 U.S.C. 103 as being unpatentable over Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin and Gupta et al. (“Accelerator-aware neural network design using automl”), hereinafter, Gupta.
With respect to claim 26, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches the method of claim 19; however, the combination does not teach a layer type selected from inverted bottleneck and fused inverted bottleneck, which is taught by Gupta:
wherein the different parameters include a layer type selected from inverted bottleneck (IBN) and fused-IBN (Gupta discloses “As a result, crafting the search space to include building blocks that are known to improve hardware utilization as well as excluding incompatible operations becomes a critical component in arriving at accelerator-optimized models. … our search space includes the inverted bottleneck convolution block with a depthwise convolution layer that is used in MobileNetV2 (Sandler et al., 2018). In addition to this baseline block, we introduce a fused inverted bottleneck convolution block that fuses the initial expansion convolution with the depthwise convolution into a single full convolution (Figure 3). Originally this block expands the depth of the input tensor and performs a “cheaper” depthwise convolution with a larger depth dimension. Although, the fused alternative performs a more “expensive” full convolution at a larger depth dimension, it can utilize the hardware resources better and provide more trainable parameters which can be a good latency-accuracy trade-off. In Figure 3, on the top, we observe that the fused inverted bottleneck block has a better runtime as well as more trainable parameters compared to the baseline inverted bottleneck. However, on the bottom, fused version has more than 2x worse runtime compared to the baseline version” (P. 3, Sec. 2.2, ¶1-2).).
Gupta teaches a neural architecture search space comprised of inverted bottleneck (IBN) convolution blocks and fused inverted bottleneck (fused IBN) convolution blocks is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the search space disclosed by Gupta to create a search space comprised of IBN and fused IBN convolution blocks. By creating a search space comprised of IBN and fused IBN convolution blocks, machine learning engineers can design a model based on hardware constraints, thereby allowing engineers to balance the trade-off between hardware resource utilization and runtime.
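For illustration only, the difference between the two block types quoted from Gupta can be shown by counting convolution weights. This is a minimal sketch assuming the MobileNetV2-style layout (1x1 expansion, k x k depthwise convolution, 1x1 projection) for the IBN block and a single full k x k convolution followed by a 1x1 projection for the fused block. Biases and normalization parameters are ignored, and the function names and channel sizes are hypothetical.

```python
def ibn_params(c_in: int, c_out: int, expansion: int, k: int) -> int:
    """Weights in an inverted bottleneck block: expand, depthwise, project."""
    hidden = c_in * expansion
    expand = c_in * hidden      # 1x1 expansion convolution
    depthwise = k * k * hidden  # k x k depthwise convolution, one filter per channel
    project = hidden * c_out    # 1x1 projection convolution
    return expand + depthwise + project


def fused_ibn_params(c_in: int, c_out: int, expansion: int, k: int) -> int:
    """Weights in a fused block: expansion and depthwise merged into one full convolution."""
    hidden = c_in * expansion
    fused = k * k * c_in * hidden  # single full k x k convolution
    project = hidden * c_out       # 1x1 projection convolution
    return fused + project


# Hypothetical block: 32 input and output channels, expansion 4, 3x3 kernel.
ibn = ibn_params(32, 32, 4, 3)          # 4096 + 1152 + 4096 = 9344
fused = fused_ibn_params(32, 32, 4, 3)  # 36864 + 4096 = 40960
```

The larger count for the fused block is consistent with the quoted statement that it provides more trainable parameters. Whether it also runs faster depends on the hardware, as the quoted passage notes.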
With respect to claim 27, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches the method of claim 19; however, the combination does not teach wherein the different parameters include a kernel size, which is taught by Gupta:
wherein the different parameters include a kernel size (Gupta discloses “As a result, crafting the search space to include building blocks that are known to improve hardware utilization as well as excluding incompatible operations becomes a critical component in arriving at accelerator-optimized models. Although our search space includes several potentially useful blocks with varying kernel and tensor sizes, it is not trivial to determine when an option becomes favorable. For example, our search space includes the inverted bottleneck convolution block with a depthwise convolution layer that is used in MobileNetV2” (P. 3, Sec. 2.2, ¶1-2).
Gupta further discloses “Figure 4 demonstrates another case where the same choice from the search space is not always favorable. In Figure 4, on the top, 5x5 kernel size choice leads to 2.78x increase in the number of MACs and parameters compared to 3x3 kernel size which leads to 2.71x increase in the runtime (1122us vs. 414us). However, on the bottom we observe that the same increase in the kernel size, number of MACs and parameters lead to only a 35% increase in the runtime (27us vs 20us). For this case, it turns out that the combination of a shallow input tensor depth with a larger output tensor depth has a lower utilization where the increase in the kernel size has minor impact on the runtime due to improved utilization. This can be good trade-off to gain more trainable parameters to improve model quality at a marginal latency cost” (P. 3, Sec. 2.2, ¶3).).
Gupta teaches a neural architecture search space comprised of blocks with varying kernel sizes is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the search space disclosed by Gupta to create a search space comprised of convolution blocks with varying kernel sizes. By creating a search space comprised of convolution blocks with varying kernel sizes, machine learning engineers can design a model based on hardware constraints, thereby allowing engineers to balance the trade-off between hardware resource utilization and runtime.
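For illustration only, the figures quoted from Gupta can be checked with simple arithmetic. With all other dimensions fixed, the multiply-accumulate count of a convolution scales with the square of the kernel size, so moving from 3x3 to 5x5 multiplies it by 25/9. The runtimes below are the values stated in the quoted passage; the variable names are hypothetical.

```python
# MACs scale with kernel_size ** 2 when all other dimensions are held fixed.
mac_ratio = (5 * 5) / (3 * 3)

# Runtimes in microseconds as stated in the quoted passage (top and bottom cases of Figure 4).
top_runtime_ratio = 1122 / 414
bottom_runtime_ratio = 27 / 20
bottom_increase_pct = (bottom_runtime_ratio - 1) * 100

print(round(mac_ratio, 2))          # 2.78
print(round(top_runtime_ratio, 2))  # 2.71
print(round(bottom_increase_pct))   # 35
```

The same 2.78x increase in MACs therefore costs about 2.71x in runtime in one configuration but only about 35% in the other, which is the trade-off the quoted passage describes.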
With respect to claim 37, the claim recites similar limitations corresponding to claim 26; therefore, the same rationale of rejection is applicable.
With respect to claim 38, the claim recites similar limitations corresponding to claim 27; therefore, the same rationale of rejection is applicable.
Claims 28-29 and 39-40 are rejected under 35 U.S.C. 103 as being unpatentable over Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin and Das et al. (US 20210350203 A1), hereinafter Das.
With respect to claim 28, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches the method of claim 19; however, the combination does not teach wherein the different parameters include an output channel multiplier or stride, which is taught by Das:
wherein the different parameters include an output channel multiplier or stride (Das discloses “FIGS. 1A-1B illustrate a conceptual idea of searching for neural components at every layer (11A, 111B, 11C) of the DNN model (12). The neural components vary with “hyperparameters”, such as a number of filter (12A, 12F), a filter size (12B, 12C, 12G), a stride (12D, 12E), an expansion ratio and so on. Determining a correct balance of the hyperparameters is equal to determining a correct choice of a neural block in any layer (11A, 111B, 11C) of the DNN model (12). Manually determining for a correct choice is not feasible. … The search space 10A includes of all possible choices of the neural blocks” [0052]. See Figure 1B depicting stride height and stride width are possible neural block choices.).
Das teaches a search space comprised of neural blocks with varying stride hyperparameters is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the search space disclosed by Das to create a search space comprised of neural blocks with varying stride heights and widths. By creating a search space comprised of neural blocks with varying stride heights and widths, an optimal configuration of stride height and width can be chosen, thereby increasing overall model performance.
With respect to claim 29, the combination of Nguyen in view of Ahn, further in view of Liu and Zoph, and further in view of Lin teaches the method of claim 19; however, the combination does not teach wherein the different parameters include an expansion ratio, which is taught by Das:
wherein the different parameters include an expansion ratio (Das discloses “FIGS. 1A-1B illustrate a conceptual idea of searching for neural components at every layer (11A, 111B, 11C) of the DNN model (12). The neural components vary with “hyperparameters”, such as a number of filter (12A, 12F), a filter size (12B, 12C, 12G), a stride (12D, 12E), an expansion ratio and so on. Determining a correct balance of the hyperparameters is equal to determining a correct choice of a neural block in any layer (11A, 111B, 11C) of the DNN model (12). Manually determining for a correct choice is not feasible. … The search space 10A includes of all possible choices of the neural blocks” [0052].).
Das teaches a search space comprised of neural blocks with varying expansion ratios is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to combine the method of Nguyen with the search space disclosed by Das to create a search space comprised of neural blocks with varying expansion ratios. By creating a search space comprised of neural blocks with varying expansion ratios, the optimal expansion ratio can be chosen based on hardware or performance constraints, thus allowing an optimal model to be developed.
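For illustration only, a per-layer search space over the hyperparameters named in the quoted passage of Das (number of filters, filter size, stride, expansion ratio) can be sketched as the Cartesian product of the allowed values. This is a minimal sketch and not Das's implementation. The dictionary keys, the candidate values, and the function name are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence

# Hypothetical per-layer choices; the quoted passage names the hyperparameters
# but the specific values here are assumptions.
SEARCH_SPACE: Dict[str, Sequence] = {
    "num_filters": (16, 32),
    "filter_size": (3, 5),
    "stride": (1, 2),
    "expansion_ratio": (3, 6),
}


def enumerate_blocks(space: Dict[str, Sequence]) -> List[dict]:
    """List every neural block choice as one combination of hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


blocks = enumerate_blocks(SEARCH_SPACE)
```

With two values for each of four hyperparameters there are 2^4 = 16 block choices per layer, and the count grows exponentially with the number of layers, which is consistent with the quoted statement that determining a correct choice manually is not feasible.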
With respect to claim 39, the claim recites similar limitations corresponding to claim 28; therefore, the same rationale of rejection is applicable.
With respect to claim 40, the claim recites similar limitations corresponding to claim 29; therefore, the same rationale of rejection is applicable.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEDRO J MORALES whose telephone number is (571)272-6106. The examiner can normally be reached 8:30 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA M HUANG can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PEDRO J MORALES/Examiner, Art Unit 2124
/VINCENT GONZALES/Primary Examiner, Art Unit 2124