Prosecution Insights
Last updated: April 19, 2026
Application No. 18/170,632

METHOD AND A SYSTEM FOR GENERATING SECONDARY TASKS FOR NEURAL NETWORKS

Non-Final OA: §101, §102, §112

Filed: Feb 17, 2023
Examiner: BALAKRISHNAN, VIJAY MURALI
Art Unit: 2143
Tech Center: 2100 — Computer Architecture & Software
Assignee: Hitachi, Ltd.
OA Round: 1 (Non-Final)

Grant Probability: 43% (Moderate)
OA Rounds: 1-2
To Grant: 4y
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 43% (6 granted / 14 resolved; -12.1% vs TC avg)
Interview Lift: +85.7% (strong; allowance with vs. without an interview, among resolved cases with interview)
Avg Prosecution (typical timeline): 4y, with 26 currently pending
Total Applications (career history): 40, across all art units
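As a rough illustration of how the figures above relate, here is a minimal Python sketch. Only the 6 granted / 14 resolved career figure comes from this page; the `allow_rate` helper and the with/without-interview split used below are hypothetical placeholders showing how an interview lift is computed, not this examiner's actual case split.

```python
# Minimal sketch of the examiner metrics above. Only the
# 6 granted / 14 resolved career figure comes from this page;
# the with/without-interview split below is a hypothetical
# placeholder used to show how an interview lift is computed.

def allow_rate(granted: int, resolved: int) -> float:
    """Allowance rate over resolved (granted + abandoned) cases."""
    return granted / resolved

career = allow_rate(6, 14)
print(f"career allow rate: {career:.0%}")   # → 43%

# Interview lift: relative increase in allowance rate for cases
# that had an examiner interview vs. those that did not.
with_interview = allow_rate(4, 6)       # hypothetical split
without_interview = allow_rate(2, 8)    # hypothetical split
lift = with_interview / without_interview - 1
print(f"interview lift: {lift:+.1%}")
```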

Statute-Specific Performance

§101: 26.4% (-13.6% vs TC avg)
§103: 31.5% (-8.5% vs TC avg)
§102: 13.2% (-26.8% vs TC avg)
§112: 24.3% (-15.7% vs TC avg)
Tech Center average is an estimate • Based on career data from 14 resolved cases
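Each per-statute delta above is a simple difference against the estimated Tech Center average. A minimal sketch, with the TC average back-derived from the deltas shown on this page (each statute's delta implies an estimated average of 40.0%):

```python
# Statute-specific allowance rates from this page, and the Tech
# Center average back-derived from the displayed deltas (each
# statute's delta implies an estimated TC average of 40.0%).

examiner_rate = {"101": 26.4, "103": 31.5, "102": 13.2, "112": 24.3}
tc_average = {"101": 40.0, "103": 40.0, "102": 40.0, "112": 40.0}

for statute, rate in examiner_rate.items():
    delta = rate - tc_average[statute]
    print(f"§{statute}: {rate:.1f}% ({delta:+.1f}% vs TC avg)")
```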

Office Action

§101 §102 §112
DETAILED ACTION

This nonfinal action is in response to application 18/170,632 filed 02/17/2023, with priority to foreign application IN202241013379 filed 03/11/2022. Claims 1-14 remain pending in the application. Claims 1 and 8 are independent claims.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements (IDS) filed 02/17/2023 and 05/09/2025 have been fully considered by the examiner.

Claim Objections

Claims 1, 4, and 11 are objected to because of the following informalities:

In claim 1, a reference character [“(104)”] is recited for the claim element “one or more secondary tasks”. However, no other claim elements appear to utilize reference characters. Either removal of the reference character at issue, or incorporation of reference characters for other essential claim elements, is thereby requested in order to maintain internal consistency within the claims.

In claims 4 and 11, “a second set of task groups comprising corresponding plurality of ideal features” should read “a second set of task groups comprising a corresponding plurality of ideal features” to improve grammatical clarity.

Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-14 are rejected under 35 U.S.C.
112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 1, it recites the limitation “identifying, by the task generation system, one or more secondary features by mapping the first set of features with the feature set”. It is unclear what function is being recited by the term “mapping” in this context, as the concept of “mapping” one set of features “with” another set of features does not appear to be recognized in the art, or even understandable to a level that would allow one of ordinary skill in the art to recognize the scope of what is being claimed. While the specification recites a concept of performing “mapping” via application of Reverse Singular Value Decomposition (R-SVD) onto the first set of features [¶ 0026, 0033], these limitations have not been incorporated into the claims. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For purposes of examination and as best understood in light of the specification, the limitation “identifying, by the task generation system, one or more secondary features by mapping the first set of features with the feature set” is interpreted as identifying secondary features based on the first set of features.

Regarding claim 4, it recites the limitation “selecting a set of ideal features from the second set of features, based on a predefined selection technique”. In the context of the claimed elements, “ideal” is a relative term that renders the claim indefinite; it is unclear what characteristic would designate a feature as “ideal”, and the specification does not provide a standard for ascertaining the requisite degree.
For purposes of examination and as best understood in light of the specification, “a set of ideal features” is interpreted as encompassing any set of features from the second set of features.

Claim 4 further recites “selecting a second set of task groups comprising corresponding plurality of ideal features, based on a similarity between the plurality of ideal features of each of the first set of task groups”. There is insufficient antecedent basis for the terms “a second set of task groups” and “the plurality of ideal features of each of the first set of task groups”. The claims do not previously recite either a “first set of task groups”, or recite “ideal features” of such a set of task groups. Additionally, it is unclear what the scope is of a “corresponding plurality of ideal features” to a second set of task groups, because the claims do not previously recite a correspondence of ideal features to the newly recited second set of task groups, or provide a basis for understanding what would be representative of a correspondence between these elements. It is unclear if the “corresponding plurality” of ideal features is equivalent to, or separate from, the previously recited “plurality” of ideal features. Because the composition of the recited “second set of task groups” is unclear, it additionally becomes unclear how the selection of such a set of task groups is “based on a similarity between the plurality of ideal features of each of the first set of task groups”, as it is unclear whether or not this similarity is being determined in relation to, or separately from, the “corresponding plurality”. Ultimately, it is entirely unclear how the recited elements at issue are interrelated with the elements previously recited in the claims, and one of ordinary skill in the art consequently would not be reasonably apprised of the scope of the invention.
For purposes of examination and as best understood in light of the specification, the limitation is interpreted as reciting a selection of task groups corresponding to the set of features at issue, wherein the selection is based on a similarity measure.

Regarding claims 8 and 11, they have similar deficiencies to those found in claims 1 and 4 above. Consequently, they are rejected for the same reasons and are likewise interpreted as detailed above.

Claim 11 further recites “identifying a first set of task groups of the set of ideal features, based on received inputs”. There is insufficient antecedent basis for “received inputs” in the claim, as the claims do not previously recite any receiving of “input” elements. For purposes of examination and as best understood in light of the specification, “received inputs” are interpreted as referring to received features of data items.

Regarding claims 2-3, 5-7, 9-10, and 12-14, they inherit the deficiencies of their parent claims. Consequently, they are also rejected under 35 U.S.C. 112(b) as being indefinite for depending on an indefinite parent claim.

Applicant is advised to consider other parts of the disclosure, including the specification and drawings, for clarity and indefiniteness issues similar to those detailed above. The examiner notes that any amendments made to the specification, claims, or drawings must only contain subject matter that is supported by the originally filed disclosure in order to avoid being rejected under 35 U.S.C. 112(a) as new matter (see MPEP § 608.04). Additionally, any interpretations of indefinite claim language detailed above are made solely for the purpose of examining the instant application on the merits, and are not attested as being adequately supported by the disclosure or adequately resolving all indefiniteness issues raised.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

Independent Claims (Claim 1, Claim 8):

Step 1: Claim 1 is drawn to a method and claim 8 is drawn to an apparatus. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).

Step 2A Prong 1: Claims 1 and 8 each recite a judicially recognized exception of an abstract idea. Claim 1 recites, inter alia:

determining an association score between each of the one or more features of a data item from the plurality of data items with each of the one or more features of other data items from the plurality of data items; – This limitation amounts to a mere procedure of observing and comparing data elements to determine “scores”, and therefore recites a process of evaluation that a human could reasonably perform in the mind or using pen and paper.

identifying a first set of features from the feature set, based on a comparison of the association score related to the one or more features of the plurality of data items with a threshold value, wherein the association score is greater than the threshold value for the first set of features; – This limitation further recites a procedure of data observation and analysis based on mere comparison of numerical values, and therefore recites a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
identifying one or more secondary features by mapping the first set of features with the feature set – This limitation further recites a procedure of data observation and analysis (e.g. generically “mapping” features) that a human could reasonably perform in the mind or using pen and paper.

generating one or more secondary tasks (104) based on the one or more secondary features – This limitation further recites a procedure of data observation and analysis (e.g., generically determining related “tasks”) that a human could reasonably perform in the mind or using pen and paper.

Claim 8 recites abstract idea limitations substantially similar to those recited in claim 1, and therefore recites the same judicial exception.

Step 2A Prong 2: The following additional elements recited in claims 1 and 8 do not integrate the recited judicial exceptions into a practical application. Claim 1 additionally recites:

A method of generating secondary tasks for neural networks – The elements recited in the preamble do no more than generically link the recited abstract procedure to the technological environment of neural networks, without providing any significant details of technical implementation that would reflect an integration into practical application.

receiving a feature set comprising one or more features of each of a plurality of data items, wherein the one or more features are generated for a primary task – This limitation amounts to no more than preliminary steps of data gathering, and therefore recites insignificant extra-solution activity.

[receiving/determining/identifying/identifying/generating] by [a/the] task generation system – The recitation of a generic “task generation system” in the claims amounts to nothing more than mere instructions to implement an abstract idea on a computer or computer components.
[generating one or more secondary tasks] for a neural network – The recitation of tasks being generated “for a neural network” does no more than generically link the recited abstract procedure to the technological environment of neural networks, without providing any significant details of technical implementation that would reflect an integration into practical application.

Claim 8 recites substantially similar additional elements to those recited in claim 1, and therefore also does not integrate the recited judicial exception into a practical application.

Step 2B: The additional elements recited in claims 1 and 8, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves. Claim 1 additionally recites:

A method of generating secondary tasks for neural networks – Generically linking the recited abstract procedure to the technological environment of neural networks without providing any significant details of technical implementation does not provide an inventive concept or significantly more to the recited abstract idea.

receiving a feature set comprising one or more features of each of a plurality of data items, wherein the one or more features are generated for a primary task – Receiving data is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

[receiving/determining/identifying/identifying/generating] by [a/the] task generation system – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.
[generating one or more secondary tasks] for a neural network – Generically linking the recited abstract procedure to the technological environment of neural networks without providing any significant details of technical implementation does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 8 recites substantially similar additional elements to those recited in claim 1, and therefore also does not provide an inventive concept or significantly more to the recited abstract idea. As such, claims 1 and 8 are not patent eligible.

Dependent Claims (Claims 2-7, Claims 9-14):

Dependent claims 2-7 and 9-14 narrow the scope of independent claims 1 and 8, and likewise narrow the recited judicial exceptions. They recite abstract idea limitations that are similar to those recited within the independent claims (i.e., mental processes and/or mathematical concepts), and thereby merely expand on the already recited exceptions. The dependent claims also do not recite any further additional elements that successfully integrate the recited judicial exceptions into a practical application or provide significantly more than the recited abstract ideas themselves. Consequently, claims 2-7 and 9-14 are also rejected under 35 U.S.C. 101.

Step 1: Claims 2-7 are drawn to a method and claims 9-14 are drawn to an apparatus. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).

Step 2A Prong 1: Claims 2-7 and 9-14 each recite a judicially recognized exception of an abstract idea.
Claim 2 recites, inter alia: generating one or more secondary task groups of the one or more secondary features; labelling the one or more secondary task groups based on the feature set; and generating the one or more secondary tasks corresponding to the one or more secondary task groups – Similarly to limitations previously recited in the parent claim, these limitations amount to further reciting a procedure of data observation and analysis based on generically identifying and organizing groups of “tasks” related to observed features, and therefore recite a process of evaluation that a human could reasonably perform in the mind or using pen and paper.

Claim 3 recites, inter alia: identifying a second set of features having the association score below the threshold value – Similarly to limitations previously recited in the parent claim, this limitation further recites a procedure of data observation and analysis based on mere comparison of numerical values, and therefore recites a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 4 recites, inter alia: selecting a set of ideal features from the second set of features, based on a predefined selection technique; selecting a second set of task groups comprising corresponding plurality of ideal features, based on a similarity between the plurality of ideal features of each of the first set of task groups; and generating the one or more secondary tasks corresponding to the second set of task groups with the corresponding plurality of ideal features – Similarly to limitations previously recited in the parent claim, these limitations amount to further reciting a procedure of data observation and analysis based on organizing sets of features, generically identifying similarities between data points, and organizing groups of “tasks” related to observed features, and therefore recite a process of evaluation that a human could reasonably perform in the mind or using pen and paper.

Claims 5-7 recite the same judicial exception as claim 1.

Claims 9-10 recite substantially similar abstract idea limitations to those recited in claim 1, and therefore recite the same judicial exception.
Claim 11 recites, inter alia: selecting a set of ideal features from the second set of features, based on a predefined selection technique; identifying a first set of task groups of the set of ideal features, based on received inputs, wherein each task group from the first set of task groups comprises a plurality of ideal features from the set of ideal features; selecting a second set of task groups comprising corresponding plurality of ideal features, based on a similarity between the plurality of ideal features of each of the first set of task groups; and generating the one or more secondary tasks corresponding to the second set of task groups with the corresponding plurality of ideal features – Similarly to limitations previously recited in the parent claim, these limitations amount to further reciting a procedure of data observation and analysis based on organizing sets of features, generically identifying similarities between data points, and organizing groups of “tasks” related to observed features, and therefore recite a process of evaluation that a human could reasonably perform in the mind or using pen and paper.

Claims 12-14 recite the same judicial exception as claim 8.

Step 2A Prong 2: Claims 2-4 and 9-11 do not recite any further additional elements besides those already recited in the independent claims, and the following additional elements recited in claims 5-7 and 12-14 also do not integrate the recited judicial exceptions into a practical application.

Claim 5 additionally recites: wherein the plurality of data items comprises one of, images, videos, audio inputs, text inputs, and speech inputs – This limitation does no more than specify a type of data to be manipulated, and therefore recites insignificant extra-solution activity.
Claim 6 additionally recites: wherein the feature set is received from one of, a trained neural network and an untrained neural network – This limitation does no more than recite an insignificant detail of computer implementation with respect to a neural network, wherein the neural network itself is claimed at a high level of generality and is merely being invoked as a tool to perform an existing abstract procedure of data observation and analysis.

Claim 7 additionally recites: wherein the neural network is one of, a trained neural network and an untrained neural network – This limitation does no more than recite an insignificant detail of computer implementation with respect to a neural network, wherein the neural network itself is claimed at a high level of generality and is merely being invoked as a tool to perform an existing abstract procedure of data observation and analysis.

Claims 12-14 recite the same additional elements as those recited in claims 5-7, and therefore also do not integrate the recited judicial exceptions into a practical application.

Step 2B: The additional elements recited in claims 5-7 and 12-14, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.

Claim 5 additionally recites: wherein the plurality of data items comprises one of, images, videos, audio inputs, text inputs, and speech inputs – Using neural networks to process image/video/audio/text/speech data (e.g., in video analytics systems, vehicle identification, crowd monitoring, etc. – see [¶ 002] of specification) is well-understood, routine, and conventional activity, and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 6 additionally recites: wherein the feature set is received from one of, a trained neural network and an untrained neural network – It is well-known in the art that neural networks, by nature, are “untrained” upon initialization of parameters and are then iteratively “trained” to learn a given task – as such, merely invoking a generic “trained” or “untrained” network as a tool for performing an existing abstract procedure of data observation and analysis does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 7 additionally recites: wherein the neural network is one of, a trained neural network and an untrained neural network – It is well-known in the art that neural networks, by nature, are “untrained” upon initialization of parameters and are then iteratively “trained” to learn a given task – as such, merely invoking a generic “trained” or “untrained” network as a tool for performing an existing abstract procedure of data observation and analysis does not provide an inventive concept or significantly more to the recited abstract idea.

Claims 12-14 recite the same additional elements as those recited in claims 5-7, and therefore also do not provide an inventive concept or significantly more to the recited abstract idea.

As such, claims 2-7 and 9-14 also are not patent eligible.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C.
102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Li et al. (“Multi-task learning with Attention: Constructing auxiliary tasks for learning to learn”, conference publication available 03 Nov 2021), hereinafter Li.

Regarding claim 1, Li discloses A method for generating secondary tasks for neural networks (“With the development of deep learning in various fields, a deep learning method that can optimize multiple target tasks at the same time, that is, deep multi-task learning (MTL), has attracted more and more attention. We aim to find a multitask learning method to optimize a single goal. For image classification tasks, it is difficult to find multiple tasks with task relevance. We use an unsupervised clustering algorithm to construct multiple related auxiliary tasks in the dataset to solve this problem in order to achieve a kind of data enhancement. The purpose is to improve the accuracy of the main task. While these newly constructed auxiliary tasks may exhibit semantic features that are not relevant to the main task, in order to reduce the impact of such non-ideal auxiliary tasks on the main task, we use a multi-task learning based on learning to learn (MTL-LTL) approach with a spatially dependent attention function embedded in an underlying joint model with hard parameter sharing that allows our model to have pixelated modeling capabilities. In addition, we also use the method of learning to learn to randomly sample multiple auxiliary tasks.
It can train these tasks on the shared hidden layer, and at the same time minimize the loss of the main task, and ensure that the optimization direction leads to the improvement of the main task.” [Li Abstract]), the method comprising: receiving, by a task generation system, a feature set comprising one or more features of each of a plurality of data items, (see Fig. 1 including Input in (b) being images (i.e., data items) – “Fig. 1 Illustration of the training process of our proposed method,…(b) In the learning stage using MTL-LTL, our model samples a batch of images in each episode to update the task-specific decoder, and uses the shared model containing attention to learn the potential information of the task-specific. Our ultimate goal is to use the shared layer to minimize the loss of the main task” [Li page 148];) wherein the one or more features are generated for a primary task; (“1) MTL framework based on hard parameter sharing: Our work is aimed at the supervised classification task, which is a traditional STL-based task. We use the k-means unsupervised clustering method to construct multiple auxiliary tasks {Tt}T t=1 related to the main task on the unlabeled dataset Daux obtained from the original data, and use these auxiliary tasks to improve the generalization performance of MTL” [Li page 147 MTL with Attention]; see Fig. 1 including Intput [sic] in (a) being data (i.e., data items) {Xi} each having features NxN; “In order to learn the semantic meaning of the original features gathered in the space, we use an unsupervised learning method to generate a useful embedding space. Specifically, first run an unsupervised embedding learning algorithm E on Daux. 
E is a process that takes an unlabeled dataset Daux = {xi} as input, and then maps {Xi} to a low-latitude embedding space Z to generate {Zi}” [Li page 148 Construction of auxiliary tasks]; The MTL-LTL framework takes a set of images (i.e., data items) as input data, wherein the images are sampled for the purpose of supervised classification (i.e., generated for the primary task), and then obtains unlabeled dataset Daux = {Xi} (i.e. feature set) from the original data) determining, by the task generation system, an association score between each of the one or more features of a data item from the plurality of data items with each of the one or more features of other data items from the plurality of data items; (“Similarly, for the tasks we need to construct, we can first construct different partitions Pn on Daux using the similar method described above…. we use the k-means clustering algorithm to treat the learned partition P = {Cl}L l=1 as a simplified Gaussian mixture model p(x | c)p(c)…E is a process that takes an unlabeled dataset Daux = {xi} as input, and then maps {Xi} to a low-latitude embedding space Z to generate {Zi}. In order to generate a different taskset, we generate T partitions {Pt}T t=1 by running the standard clustering algorithm kmeans, and apply random scaling to the dimensionality of the embedding space to induce different metrics. The clustering operation is summarized as: min over U and {yi} of (1/N) Σi ||Zi − U yi||^2, subject to yiᵀ1k = 1 [Equ. 6, reproduced in the action as an image]. Equ. 6 learns a d × k central diagonal matrix U and the cluster assignment y∗i of each vector Zi together, where y∗iᵀ1k = 1 and U represent the centroid of the learned cluster.” [Li page 148 Construction of auxiliary tasks]; The MTL-LTL framework utilizes a k-means clustering algorithm to partition feature set Daux into clusters – by definition, the k-means algorithm utilizes distance measure ||Zi − c||^2 to compare features of data item Zi to each cluster centroid c, wherein features of each centroid are representative of the features of data items belonging to that cluster (i.e., other data items). The inverse of the distance measure can thereby be implicitly understood as an association score) identifying, by the task generation system, a first set of features from the feature set, based on a comparison of the association score related to the one or more features of the plurality of data items with a threshold value, wherein the association score is greater than the threshold value for the first set of features; (“Equ. 6 learns a d × k central diagonal matrix U and the cluster assignment y∗i of each vector Zi together, where y∗iᵀ1k = 1 and U represent the centroid of the learned cluster. We iteratively cluster the depth features to obtain a set of best distributions {y∗i}. Then, we use these y∗i as pseudo-labels to allocate T partitions to construct T auxiliary tasks” [Li pages 148-149 Construction of auxiliary tasks]; As explained above, the inverse of distance measure ||Zi − c||^2 for the closest cluster can be implicitly understood as an association score, with the “threshold” value being implicitly understood as the inverse of distance to the second-nearest cluster.
By definition, the k-means algorithm assigns a set of data items (and their associated features) from dataset {Zi} to a particular cluster (i.e., first set of features) based on identifying the centroid to which they have minimum distance (argmin over c of ||Zi − c||^2 [reproduced in the action as an image])) identifying, by the task generation system, one or more secondary features by mapping the first set of features with the feature set; (“In our method, a similar classic MTL method is added. Specifically, the network is shared by using the bottom part of the joint model as the backbone, and then task-specific decoders are used for learning according to different tasks…The obtained useful features h = F(x) can improve the performance of its main task, and then assign a task-specific decoder to each task according to its uniqueness, and obtain the output ˆyt = Dt( h) of the task-specific.” [Li page 147 MTL with Attention]; “Based on the above unsupervised embedding algorithm E, we have obtained a taskset including the main task and multiple auxiliary tasks. For this taskset, we use the MTL-LTL framework with hard parameter sharing embedded with attention to learn the multi-task neural classification function ˆyt i = Dt (F (xi; θF) , θDt ), and use the learned prior knowledge and shared information of multiple tasks to learn the main task... For related auxiliary tasks, we further use an MTL method based on learning to learn, which can randomly sample a batch of tasks T from multiple auxiliary tasks and main tasks {Tt}T t=0 in the taskset. Then combine the data of multiple auxiliary tasks to train Dt {F (xi; θF) , θDt } to ensure that the main task is optimized in the correct direction. MTL-LTL with attention makes our model highly generalized and robust. Specifically, we use a hard parameter sharing MTL framework that combines spatially-dependent Attention.
It will use the bottom layer of the combined model to let all tasks share the same hidden space, while retaining a specific decoder Dt for a task-specific. At the same time, the shared layer F on the auxiliary task is trained to generalize to the main task…After these two stages of learning, we can use the prior knowledge and initialization representation that are beneficial to the main task learned from each task-specific decoder Dt, thereby helping to improve the robustness and accuracy of the main task.” [Li page 149 MTL based on learning to learn]; see Fig. 1. including Shared Layers –> Auxiliary Task Decoder 1..3 in (b) [Li page 148]; Upon identifying T potential tasks through partitioned sets of features (including, e.g., a first set of features), the MTL-LTL framework can further learn, through the auxiliary task-specific decoders that are based on the constructed tasks (and partitioned sets of features therein), secondary features that contribute towards optimization of the main (i.e., primary) task)

and generating, by the task generation system, one or more secondary tasks (104) based on the one or more secondary features, for a neural network ([Li page 147 MTL with Attention] and [Li page 149 MTL based on learning to learn] and Fig. 1. including Shared Layers –> Auxiliary Task Decoder 1..3 in (b) [Li page 148]; MTL-LTL utilizes the constructed auxiliary task-specific decoders to thereby generate secondary training tasks for the neural framework based on them each learning secondary features that contribute towards optimization of the primary task)

Regarding claim 2, Li discloses the limitations of parent claim 1, and further discloses wherein generating the one or more secondary tasks comprises: generating one or more secondary task groups of the one or more secondary features; (“Based on the above unsupervised embedding algorithm E, we have obtained a taskset including the main task and multiple auxiliary tasks.
For this taskset, we use the MTL-LTL framework with hard parameter sharing embedded with attention to learn the multi-task neural classification function ŷti = Dt(F(xi; θF), θDt), and use the learned prior knowledge and shared information of multiple tasks to learn the main task... For related auxiliary tasks, we further use an MTL method based on learning to learn, which can randomly sample a batch of tasks T from multiple auxiliary tasks and main tasks {Tt}t=0..T in the taskset” [Li page 149 MTL based on learning to learn]; As explained above, a number of potential auxiliary tasks T are first identified through partitioned sets of features, from which a batch of tasks (i.e., secondary task group) is sampled for learning task-specific decoders (and associated secondary features))

labelling the one or more secondary task groups based on the feature set; (“We iteratively cluster the depth features to obtain a set of best distributions {y∗i}. Then, we use these y∗i as pseudo-labels to allocate T partitions to construct T auxiliary tasks” [Li page 149 Construction of auxiliary tasks])

and generating the one or more secondary tasks corresponding to the one or more secondary task groups (“The multi-layer feedforward neural network proposed by [8] uses the hidden layer as the underlying model F shared among all tasks, and selects a specific task decoder Dt according to different tasks, and uses each decoder as a different output layer. The output of the output layer is regarded as the prediction result of the corresponding data sample. The model of task T is defined as follows: [equation image media_image4] Where θF and θDt are the parameters of the underlying shared layer F and the fixed task decoder Dt, respectively.
Our model can find the optimal parameter [equation image media_image5] for the learning-based MTL based on these two parameters, such that [equation image media_image6]” [Li page 146 Multi-task learning]; Each task-specific decoder uses its respective pseudo-label y∗i (of the overall set of pseudo-labels of the sampled task group) for optimizing model parameters θDt (i.e., obtaining secondary features and generating secondary training tasks))

Regarding claim 3, Li discloses the limitations of parent claim 1, and further discloses identifying a second set of features having the association score below the threshold value ([Li pages 148-149 Construction of auxiliary tasks] as detailed in claim 1 above; As explained above, the inverse of distance measure [equation image media_image2] for the closest cluster can be implicitly understood as an association score, and the “threshold” value being implicitly understood as the inverse of distance to the second-nearest cluster.
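The claim 3 and claim 4 mappings here, reading items' non-nearest clusters as a "second set of features" and centroid features as "ideal features", can likewise be sketched in numpy (a hypothetical illustration of the examiner's k-means reading, not code from Li or the application):

```python
import numpy as np

# Toy feature set: four 2-D vectors Zi and two fixed cluster centroids.
Z = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [4.8, 5.2]])
centroids = np.array([[0.1, 0.05], [4.9, 5.1]])

# k-means distance of every item to every centroid, and the assignment.
dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
nearest = np.argmin(dists, axis=1)

# Reading cluster 0 as the "first set of features": items assigned elsewhere
# have an association score (inverse distance to cluster 0) below the implicit
# threshold, so they form the "second set of features".
first_set = Z[nearest == 0]
second_set = Z[nearest != 0]

# The "ideal features" of the second set, read as its cluster centroid.
ideal_features = centroids[1]
```

Here the first two items fall in cluster 0 and the last two in cluster 1, so under this reading the cluster-1 items are the second set and their centroid supplies the representative "ideal" features.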
By definition, the k-means algorithm assigns a set of data items (and their associated features) from dataset {Zi} to a particular cluster (i.e., first set of features) based on identifying the centroid to which they have minimum distance ([equation image media_image3]) – any other clusters thereby can be implicitly understood as separate from the first set of features (i.e., second set of features))

Regarding claim 4, Li discloses the limitations of parent claim 3, and further discloses selecting a set of ideal features from the second set of features, based on a predefined selection technique; ([Li pages 148-149 Construction of auxiliary tasks] as detailed in claim 1 above; Features of each centroid are representative of the features of data items belonging to that cluster (e.g., a second set of features) – features of centroids can thereby be implicitly understood as “ideal” features representative of that cluster)

selecting a second set of task groups comprising corresponding plurality of ideal features, based on a similarity between the plurality of ideal features of each of the first set of task groups; ([Li pages 148-149 Construction of auxiliary tasks] as detailed in claim 1 above; Through the k-means algorithm, determining partitions of features from which tasks are constructed (i.e. task groups) involves comparing features of each data item Zi to features of each cluster centroid (i.e., comparing based on distance to ideal features for each task group (incl. e.g., first set of task groups)) to identify minimum distance and thereby the task groups (incl. e.g., second set of task groups) with closest similarity),

and generating the one or more secondary tasks corresponding to the second set of task groups with the corresponding plurality of ideal features ([Li page 147 MTL with Attention] and [Li page 149 MTL based on learning to learn] and Fig. 1.
including Shared Layers –> Auxiliary Task Decoder 1..3 in (b) [Li page 148] as detailed in claim 1 above; Upon identifying T potential tasks through partitioned sets of features (including, e.g., a second set of task groups), the MTL-LTL framework can further learn, through the auxiliary task-specific decoders that are based on the constructed tasks (and partitioned sets of features therein), secondary features that contribute towards optimization of the main (i.e., primary) task)

Regarding claim 5, Li discloses the limitations of parent claim 1, and further discloses wherein the plurality of data items comprises one of, images, videos, audio inputs, text inputs, and speech inputs (see Fig. 1 including Input in (b) being images (i.e., data items) [Li page 148])

Regarding claim 6, Li discloses the limitations of parent claim 1, and further discloses wherein the feature set is received from one of, a trained neural network and an untrained neural network (see CIFAR-10 Dataset in Experiments – “CIFAR-10 [31] is a lightweight natural image dataset. The dataset includes 60,000 32×32 RGB color pictures, divided into 10 categories, each containing 5000 training images and 1000 test images. For this dataset, we use the two layers of the CNN architecture as the underlying shared layer of the joint model. The first convolutional layer has 64 filters with a size of 3×3, followed by a 2×2 maximum pooling layer. The second convolutional layer has 128 filters of size 3×3 and a maximum pooling layer of 2×2, as shown in [29]. At the same time, an activation function layer with a spatially dependent attention mechanism is embedded between each convolutional layer and the maximum pooling layer.
Each task-specific decoder has two fully connected layers.” [Li pages 149-150 Experiments]; The images (from which the embedding algorithm obtains features) are received prior to training of the CNN architecture (i.e., untrained neural network))

Regarding claim 7, Li discloses the limitations of parent claim 1, and further discloses wherein the neural network is one of, a trained neural network and an untrained neural network ([Li pages 149-150 Experiments] as detailed in claim 6 above; As is typical of neural networks, the neural architecture is initially untrained, and then is trained over iterations).

Regarding claims 8-10 and 12-14, they are apparatus claims that largely correspond to the method of claims 1-3 and 5-7, which are already disclosed by Li as detailed above. Li further discloses A task generation system comprising: one or more processors; and a memory storing processor-executable instructions, which, on execution, cause the one or more processors to: perform the claimed functions ([Li pages 149-150 Experiments] as detailed in claim 6 above; Execution of the MTL-LTL framework on the disclosed datasets implicitly requires a computer with adequate processing capabilities). Consequently, claims 8-10 and 12-14 are rejected for the same reasons as claims 1-3 and 5-7.

Regarding claim 11, it is an apparatus claim that largely corresponds to the method of claim 4, which is already disclosed by Li as detailed above. Li further discloses identifying a first set of task groups of the set of ideal features, based on received inputs, wherein each task group from the first set of task groups comprises a plurality of ideal features from the set of ideal features ([Li pages 148-149 Construction of auxiliary tasks] as detailed in claim 1 above; Through the k-means algorithm, determining partitions of features from which tasks are constructed (i.e.
task groups) involves comparing features of each data item Zi to features of each cluster centroid (i.e., comparing based on distance to ideal features for each task group (incl. e.g., first set of task groups and their associated ideal features)). Consequently, claim 11 is rejected for the same reasons as claim 4.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Liu et al. (“Self-Supervised Generalisation with Meta Auxiliary Learning”, available conference 2019) discloses a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a label-generation network to predict the auxiliary labels, and a multi-task network to train the primary task alongside the auxiliary task. The loss for the label-generation network incorporates the loss of the multi-task network, and so this interaction between the two networks can be seen as a form of meta learning with a double gradient.

Kung et al. (“Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity”, available conference Nov 2021) discloses a time-efficient sampling method to select the data that is most relevant to the primary task by training on the most beneficial sub-datasets from the auxiliary tasks, achieving efficient multi-task auxiliary learning.

Caron et al. (“Deep Clustering for Unsupervised Learning of Visual Features”, available arXiv 18 Mar 2019) discloses further implementation details of the clustering method (DeepCluster) described in Li.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY M BALAKRISHNAN whose telephone number is (571) 272-0455. The examiner can normally be reached 10am-5pm EST Mon-Thurs.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH, can be reached on (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/V.M.B./ Examiner, Art Unit 2143
/JENNIFER N WELCH/ Supervisory Patent Examiner, Art Unit 2143
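For orientation, the hard-parameter-sharing arrangement quoted from Li throughout the rejection, a shared bottom layer F feeding one task-specific decoder Dt per main or auxiliary task, can be sketched in numpy as below; the layer shapes, task count, and names are illustrative assumptions, not Li's implementation or the claimed system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared bottom layer F: a single transform reused by every task
# (hard parameter sharing). Sizes here are illustrative only.
W_shared = rng.normal(size=(32, 16))

# One task-specific decoder Dt per task: a main task plus auxiliary tasks
# constructed from the k-means pseudo-labels.
num_tasks = 4  # e.g., 1 main task + 3 auxiliary tasks
decoders = [rng.normal(size=(16, 10)) for _ in range(num_tasks)]

def forward(x, task_id):
    """yhat_t = Dt(F(x)): shared features, task-specific output head."""
    h = np.maximum(x @ W_shared, 0.0)  # F(x) with ReLU, shared by all tasks
    return h @ decoders[task_id]       # task-specific decoder Dt

x = rng.normal(size=(5, 32))           # a batch of 5 feature vectors
outputs = [forward(x, t) for t in range(num_tasks)]
```

In the quoted framework, every task's loss would update the shared layer while only the matching decoder Dt is task-specific; this sketch shows only the forward pass ŷt = Dt(F(x)).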

Prosecution Timeline

Feb 17, 2023
Application Filed
Mar 07, 2026
Non-Final Rejection — §101, §102, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585912
GATED LINEAR CONTEXTUAL BANDITS
2y 5m to grant Granted Mar 24, 2026
Patent 12468967
METHOD AND SYSTEM FOR GENERATING A SOCIO-TECHNICAL DECISION IN RESPONSE TO AN EVENT
2y 5m to grant Granted Nov 11, 2025
Based on 2 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
43%
Grant Probability
99%
With Interview (+85.7%)
3y 12m
Median Time to Grant
Low
PTA Risk
Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.
