DETAILED ACTION
This office action is responsive to the response filed 8/28/2025. The application contains claims 1-10, 12-13, 15-22, all examined and rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 1 is rejected under 35 USC 101 because the claimed inventions are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
While independent claims 1 and 22 are each directed to a statutory category, they recite a series of steps pertaining to extracting features and computing weights to produce an inference output, which appear to be directed to an abstract idea (mental process, mathematical concept).
Claims 1-22 are rejected under 35 U.S.C. § 101 because the instant application is directed to non-patentable subject matter. Specifically, the claims are directed toward at least one judicial exception without reciting additional elements that amount to significantly more than the judicial exception. The rationale for this determination is in accordance with the guidelines of USPTO, applies to all statutory categories, and is explained in detail below.
When considering subject matter eligibility under 35 U.S.C. 101, (1) it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter. If the claim does fall within one of the statutory categories, (2a) it must then be determined whether the claim is directed to a judicial exception (i.e., law of nature, natural phenomenon, or abstract idea), and if so, (2b) it must additionally be determined whether the claim is a patent-eligible application of the exception. If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim amounts to significantly more than the abstract idea itself. Examples of abstract ideas include certain methods of organizing human activity, mental processes, and mathematical concepts (2019 PEG).
STEP 1.
Per Step 1, the claims are determined to include a machine and a process, as in independent claims 1 and 22 and in the claims depending therefrom. Therefore, the claims are directed to a statutory eligibility category.
At Step 2A, Prong 1, the invention is directed to calculating data related to estimating data-related classes, and whether the classes are shared between different domains or private (mental process: observation, evaluation, and judgment; mathematical concept), which is akin to a mental process and a mathematical concept (see Alice). As such, the claims include an abstract idea. When considering the limitations individually and as a whole, the limitations directed to the abstract idea are:
generate, for a series of target data items xt input to the first computational model, a target weight ϐT indicative of a confidence value that said target data item belongs to a class which is shared with known classes of the first computational model (mathematical concept; mental process: evaluation and judgment), generate, for a series of source data items xs input to the first computational model, a source weight ϐS indicative of a confidence value that said source data item belongs to a known class of the first computational model, shared with the target domain (mathematical concept, mental process).
The claim recites the following additional elements:
at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform (“Using a computer as a tool to perform a mental process”, MPEP 2106.04(a)(2)(III)(C));
provide a source dataset comprising source data items associated with a source domain, provide a target dataset comprising target data items associated with a target domain, provide a first computational model associated with the source domain dataset, the first computational model being associated with source domain classes (insignificant extra-solution activity, MPEP 2106.05(g));
train a discriminator to seek to decrease a discriminator loss function by the source and target data items xs xt, respectively weighted by the source and target weights ϐS, ϐT (training a system, which is a highly generic computer software process of training on data. This limitation does not amount to significantly more than the judicial exception, see MPEP 2106.05(f), and merely confines the use of the abstract idea to a particular technological environment (Generative Adversarial Network (GAN) framework) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h));
adapt at least part of the first computational model to generate a second computational model by the discriminator loss function (training a system, which is a highly generic computer software process of training on data. This limitation does not amount to significantly more than the judicial exception, see MPEP 2106.05(f), and merely confines the use of the abstract idea to a particular technological environment (Generative Adversarial Network (GAN) framework) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h));
deploy the second computational model for use to receive one or more input data items associated with the target domain and to produce an inference output (insignificant extra-solution activity that recites only the idea of a solution or outcome, MPEP 2106.05(g));
“wherein the target weight is generated by a first classifier trained using a filtered subset of the target data items based on a probability distribution produced by the first computational model” (this limitation is directed to the description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h));
“wherein the source weight is generated by a second classifier trained using a filtered subset of the source data items” (this limitation is directed to the description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h)).
This judicial exception is not integrated into a practical application. The elements are recited at a high level of generality, i.e., a generic computing system performing generic functions including generic processing of data. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, the claims are directed to an abstract idea (2019 Revised Patent Subject Matter Eligibility Guidance ("2019 PEG")). Thus, under Step 2A of the Mayo framework, the Examiner holds that the claims are directed to concepts identified as abstract.
STEP 2B.
Because the claims include one or more abstract ideas, the examiner now proceeds to Step 2B of the analysis, in which the examiner considers if the claims include individually or as an ordered combination limitations that are "significantly more" than the abstract idea itself. This includes analysis as to whether there is an improvement to either the "computer itself," "another technology," the "technical field," or significantly more than what is "well-understood, routine, or conventional" (WURC) in the related arts.
The instant application includes in Claim 1 additional steps to those deemed to be abstract idea(s).
Taken individually, these steps are:
provide a source dataset comprising source data items associated with a source domain, provide a target dataset comprising target data items associated with a target domain, provide a first computational model associated with the source domain dataset, the first computational model being associated with source domain classes (sending, receiving, displaying and processing data are common and basic functions in computer technology, MPEP 2106.05(d)(I)(2));
train a discriminator to seek to decrease a discriminator loss function by the source and target data items xs xt, respectively weighted by the source and target weights ϐS, ϐT (training a system, which is a highly generic computer software process of training on data. This limitation does not amount to significantly more than the judicial exception, see MPEP 2106.05(f), and is at best mere instructions to “apply” the abstract ideas, which cannot provide an inventive concept. See MPEP 2106.05(f));
adapt at least part of the first computational model to generate a second computational model by the discriminator loss function (mere instructions to “apply” the abstract ideas, which cannot provide an inventive concept. See MPEP 2106.05(f));
deploy the second computational model for use to receive one or more input data items associated with the target domain and to produce an inference output (sending, receiving, displaying and processing data are common and basic functions in computer technology, MPEP 2106.05(d)(I)(2));
“wherein the target weight is generated by a first classifier trained using a filtered subset of the target data items based on a probability distribution produced by the first computational model” (this limitation is directed to the description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h));
“wherein the source weight is generated by a second classifier trained using a filtered subset of the source data items” (this limitation is directed to the description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h)).
In the instant case, Claim 1 is directed to the above-mentioned abstract idea. Technical functions such as receiving and extracting are common and basic functions in computer technology. The individual limitations are recited at a high level and do not provide any specific technology or techniques to perform the functions claimed.
In addition, when the claims are taken as a whole, as an ordered combination, the combination of steps does not add "significantly more" than the steps considered individually. The instant application, therefore, still appears only to implement the abstract idea in the particular technological environments using what is well-understood, routine, and conventional in the related arts. The steps are still a combination made to the abstract idea. The additional steps only add to those abstract ideas using well-understood and conventional functions, and the claims do not show improved ways of, for example, performing unconventional, non-routine functions for analyzing model operations or updating the model that could then be pointed to as being "significantly more" than the abstract ideas themselves.
Moreover, Examiner was not able to identify any "unconventional" steps, which, when considered in the ordered combination with the other steps, could have transformed the nature of the abstract idea previously identified. The instant application, therefore, still appears to only implement the abstract ideas to the particular technological environments using what is well-understood, routine, and conventional (WURC) in the related arts.
Further, note that the limitations in the instant claims are performed by the generically recited computing devices. The limitations are merely instructions to implement the abstract idea on a computing device that is recited at an abstract level and require no more than generic computing devices to perform generic functions.
Independent claim 22 is analogous to claim 1 and is rejected using a similar analysis.
CONCLUSION
It is therefore determined that the instant application not only represents an abstract idea identified as such based on criteria defined by the Courts and on USPTO examination guidelines, but also lacks the capability to bring about "Improvements to another technology or technical field" (Alice), bring about "Improvements to the functioning of the computer itself" (Alice), "Apply the judicial exception with, or by use of, a particular machine" (Bilski), "Effect a transformation or reduction of a particular article to a different state or thing" (Diehr), "Add a specific limitation other than what is well-understood, routine and conventional in the field" (Mayo), "Add unconventional steps that confine the claim to a particular useful application" (Mayo), contain "Other meaningful limitations beyond generally linking the use of the judicial exception to a particular technological environment" (Alice), transform a traditionally subjective process performed by humans into a mathematically automated process executed on computers (McRO), or recite limitations directed to improvements in computer-related technology, including claims directed to software (Enfish).
The dependent claims, when considered individually and as a whole, likewise do not provide "significantly more" than the abstract idea for similar reasons as the independent claim.
Claim 2 recites “the source and target datasets comprise respective first and second sets of audio data items, and wherein the second computational model is an adapted audio classifier comprising at least one class shared with known classes of the first computational model”, a description of data, which generally links the use of a judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 3 recites “the first set of audio data items represent audio data received under one or more first conditions and wherein the second set of audio data items represent audio data received under one or more second conditions, wherein the first and second conditions comprise differences in terms of their respective ambient noise and/or microphone characteristics”, a description of data, which generally links the use of a judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 4 recites “the first and second sets of audio data items represent speech, e.g. one or more keywords”, directed to the description of data, which generally links the use of a judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 5 recites “the second computational model is configured for use with a digital assistant apparatus for performing one or more processing actions based on received speech associated with the target domain”, directed to usage of data, which generally links the use of a judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 10 recites “wherein the probability distribution is produced by input of one or more of the target data items to the first computational model”, directed to usage of data, which generally links the use of a judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 12 recites “provide the filtered subset of target data items by: generate, using the first computational model, a probability distribution over the known source domain classes for a particular target data item (mental process, mathematical concept); determine a confidence level for the particular target data item belonging to a source domain class using the generated probability distribution (mental process); and select the particular target data item for the subset if the confidence level is above an upper confidence level threshold or below a lower confidence level threshold” (mental process, mathematical concept). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 13 recites “first classifier is further configured as a binary classifier to compute a target weight of '1' for indicating that a particular target data item belongs to a shared target domain class and '0' for indicating that a target data item belongs to a private target domain class” (mental process, mathematical concept). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 15 recites “input a batch of target data items to the first trained model to generate respective probability distributions (sending, receiving, displaying and processing data are common and basic functions in computer technology, MPEP 2106.05(d)(I)(2)); aggregate the probability distributions (mental process, mathematical concept); identify a subset of the source domain classes based on the aggregated probability distributions (mental process), including a predetermined number of largest value and lowest value classes (description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h)); and select source data items associated with the identified subset of source domain classes (mental process)”. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 16 recites “second classifier is configured as a binary classifier for computing a source weight of '1' for indicating that a particular source data item belongs to a known class of the first computational model shared with the target domain and '0' for indicating that a particular source data item belongs to a private source domain class”, directed to a description of data, which generally links the use of a judicial exception to a particular technological environment or field of use. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 17 recites “first computational model comprises a feature extractor associated with the source domain dataset (description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h)), and wherein the adapting of the first computational model further comprises: update weights of the feature extractor based on the computed discriminator loss function (mental process, mathematical concept)”. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 18 recites “first computational model further comprises a classifier for receiving feature representations from the feature extractor, and wherein the adapting of the first computational model (description of data, which generally links the use of a judicial exception to a particular technological environment or field of use, MPEP 2106.05(h)), further comprises: determine a classification loss resulting from updating weights of the feature extractor and updating the weights of the feature extractor based on the classification loss (mental process, mathematical concept)”. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 19 recites “adapt the first computational model responsive to an identification that one or more conditions under which the set of target data items were produced are different from one or more conditions under which the set of source data items were produced”, which is directed to analyzing data and falls under mental process and mathematical concept. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 20 recites “identify different characteristics of one or more sensors used for generating the respective sets of target data items and source data items”, which is directed to analyzing data and falls under mental process. It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
Claim 21 recites “access metadata respectively associated with the source and target data items indicative of the one or more conditions under which the sets of source and target data items were produced” (sending, receiving, displaying and processing data are common and basic functions in computer technology, MPEP 2106.05(d)(I)(2)). It does not integrate the abstract idea into a practical application and does not add significantly more to the abstract idea.
The dependent claims, which impose additional limitations, also fail to claim patent-eligible subject matter because the limitations cannot be considered statutory. The dependent claims have been examined individually and in combination with the preceding claims; however, they do not cure the deficiencies of claim 1. Where all claims are directed to the same abstract idea, "addressing each claim of the asserted patents [is] unnecessary." Content Extraction & Transmission LLC v. Wells Fargo Bank, Nat'l Ass'n, 776 F.3d 1343, 1348 (Fed. Cir. 2014). If applicant believes the dependent claims are directed towards patent-eligible subject matter, they are invited to point out the specific limitations in the claim that are directed towards patent-eligible subject matter. Claims for the other statutory classes are similarly analyzed.
For at least these reasons, the claimed inventions of each of dependent claims 2-21 are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more and are rejected under 35 USC 101.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 10, 12-13, 16, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over “Universal Domain Adaptation” [You et al., hereinafter D1], disclosed in the IDS filed 12/23/2021 and published in 2019, in view of “Separate to Adapt: Open Set Domain Adaptation via Progressive Separation” [Liu et al., hereinafter D3], published in 2019 (https://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Separate_to_Adapt_Open_Set_Domain_Adaptation_via_Progressive_Separation_CVPR_2019_paper.pdf), further in view of Xu et al. [US 2021/0295208 A1, hereinafter Xu].
With regard to Claim 1,
D1 discloses an apparatus, comprising: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor (P. 2724, 4. Experiments, “Code and data will be available at github.com/thuml.”, 4.1.2, ¶1, P. 2726, “Implementation Details. Implementation is in PyTorch and ResNet-50 [13] is used as the backbone network”), cause the apparatus at least to perform:
provide a source dataset comprising source data items associated with a source domain (Fig. 1, P.2721, ¶3, “given a labeled source domain, for any related target domain, regardless of how its label set differs from that of the source domain”, P.2722, 3.1, ¶1, “a source domain Ds = {(x_i^s, y_i^s)} consisting of n_s labeled samples”);
provide a target dataset comprising target data items associated with a target domain (Fig. 1, P.2722, 3.1, ¶1, “a target domain Dt = {(x_i^t)} of n_t unlabeled samples are provided at training”);
provide a first computational model associated with the source domain dataset, the first computational model being associated with source domain classes (P.2723, 3.3. “UAN consists of a feature extractor F, an adversarial domain discriminator D, a non-adversarial domain discriminator D′ and a label classifier G. Input x from either domain is fed into the feature extractor F. The extracted feature z = F(x) is forwarded into the label classifier G to obtain the probability ŷ = G(z) of x over the source classes Cs”);
generate, for a series of target data items xt input to the first computational model, a target weight ϐT indicative of a confidence value that said target data item belongs to a class which is shared with known classes of the first computational model (P.2723, 3.3, “wt(x) indicates the probability of a target sample x belonging to the common label set C”, P.2724, 3.4, “ws = ws(x) and wt = wt(x) by sample-level transferability criterion. With a proper sample-level transferability criterion, each point in both source and target domains can be weighted such that the distributions of source and target data in the common label set C can be maximally aligned”, [image: media_image1.png]);
wherein the target weight is generated (P.2723, 3.3. Universal Adaptation Network, ¶1, “extracted feature z = F(x) is forwarded into the label classifier G to obtain the probability ŷ = G(z) of x over the source classes Cs”, entropy of ŷ = G(F(x)) for target samples used in calculating wt(x) [image: media_image2.png], G(z) being a label classifier over the source label set Cs trained using source data only).
generate, for a series of source data items xs input to the first computational model, a source weight ϐS indicative of a confidence value that said source data item belongs to a known class of the first computational model, shared with the target domain (P.2723, 3.3, “ws(x) indicates the probability of a source sample x belonging to the common label set C”);
wherein the source weight is generated by a second classifier (D1, equation (7), transferability function ws(x) = H(ŷ)/log|Cs| − d̂′(x), which uses entropy and domain similarity to predict whether the source sample is part of a shared class).
train a discriminator to seek to decrease a discriminator loss function by the source and target data items xs xt, respectively weighted by the source and target weights ϐS, ϐT (Fig. 2, P. 2723, 3.3, “non-adversarial domain discriminator D′ and adversarial domain discriminator D, which are formally defined as [image: media_image3.png] where L is the standard cross-entropy loss, ws(x) indicates the probability of a source sample x belonging to the common label set C, and similarly, wt(x) indicates the probability of a target sample x belonging to the common label set C”);
adapt at least part of the first computational model to generate a second computational model by the discriminator loss function (P. 2720, Introduction, ¶2, “Domain adaptation [33] aims to minimize the domain gap and successfully transfer the model trained on the source domain to the target domain”, P.2726, Implementation Details, “Models are fine-tuned from ResNet-50 pre-trained on ImageNet”, Table 1, Table 2, P. 2723, 3.3, “adversarial domain discriminator D is confined to distinguish the source and target data in the common label set C. Adversarially, the feature extractor F strives to confuse D, yielding domain-invariant features in the common label set C”); and
deploy the second computational model for use to receive one or more input data items associated with the target domain and to produce an inference output (P. 2720, Introduction, ¶2, “Domain adaptation [33] aims to minimize the domain gap and successfully transfer the model trained on the source domain to the target domain”, Table 1, Table 2, Fig. 2, P. 2723, 3.3, “The label classifier G trained on such features can be applied safely to the target domain. The training of UAN can be written as a minimax game: [image: media_image4.png]”, P. 2727, 5, “if one wants to generalize a model to a new scenario, the proposed UAN can be a good candidate model. If UAN classifies most examples as “unknown”, then domain adaptation in such a new scenario may well fail, and collecting labels will be indispensable. On the other hand, if UAN can generate labels for most examples, collecting labels for such a scenario are not necessary and domain adaptation will perform the work”).
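For illustration only, the weighting scheme cited from D1 in the mapping above can be sketched as follows. This is a minimal sketch and not a reproduction of D1's code: the source weight follows equation (7) as quoted above, while the mirrored form of the target weight and the exact form of the weighted cross-entropy are assumptions, since the corresponding equations appear only as images in this action. All function names are hypothetical.

```python
import math

def normalized_entropy(probs):
    """Entropy H(y) of a class-probability vector, divided by log|Cs|."""
    if len(probs) < 2:
        raise ValueError("need at least two source classes")
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def source_weight(probs, d_prime):
    # Equation (7) as quoted above: ws(x) = H(y)/log|Cs| - d'(x)
    return normalized_entropy(probs) - d_prime

def target_weight(probs, d_prime):
    # ASSUMPTION: mirrored form for target samples, wt(x) = d'(x) - H(y)/log|Cs|
    return d_prime - normalized_entropy(probs)

def weighted_discriminator_loss(source, target):
    """ASSUMED weighted binary cross-entropy for the adversarial discriminator.

    source, target: lists of (weight, d) pairs, where d = D(F(x)) lies in (0, 1).
    Source samples are labeled 1 and target samples are labeled 0.
    """
    loss_s = -sum(w * math.log(d) for w, d in source) / len(source)
    loss_t = -sum(w * math.log(1.0 - d) for w, d in target) / len(target)
    return loss_s + loss_t
```

Under these assumptions, a sample with a near-zero weight contributes almost nothing to the discriminator loss, which is how the weighting confines the alignment to samples believed to belong to the common label set.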
D1 does not explicitly teach a first classifier trained using a filtered subset of the target data items.
D3 teaches a classifier trained using a filtered subset of the target data items (P. 2930, Fig. 3, Gc-> Gb, Col. 2, ¶2, 3.3 Progressive Separation, “we rank the similarity for all the target samples, and choose samples with highest/lowest similarity to train the binary classifier Gb. This filtering is relatively coarse but has high confidence since we only use samples with extreme similarity”, 3.5. Training Procedure, P.2931, “Step 1. We first train the feature extractor Gf and the classifier Gy to classify source samples. Meanwhile, the multi-binary classifier Gc, c = 1, 2, ..., |Cs| is trained in a one-vs-rest way for each source class. We further select target samples with high/low similarities to the source domain to train the fine-grained binary classifier Gb”; Gb is a classifier trained using only the filtered target data subset).
D1 and D3 are analogous art to the claimed invention because they are from a similar field of endeavor of domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1 as disclosed by D3 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1 as described above to prevent negative transfer due to mismatch between unknown and known classes and to allow openness-robust open set domain adaptation, which can adapt to a variety of openness in the target domain (D3, 3.3, Col. 2, ¶2, “This filtering is relatively coarse but has high confidence since we only use samples with extreme similarity. It is also robust to different levels of openness since we no longer need to choose hyperparameters manually or using optimization tools”).
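For illustration only, the filtering step quoted from D3 (ranking target samples by similarity and keeping only the highest and lowest) can be sketched as below. This is a minimal sketch under stated assumptions: the similarity scores are taken as given, the cutoff k and both function names are hypothetical, and D3's actual selection rule may differ.

```python
def select_extreme_samples(similarities, k):
    """Return (high_idx, low_idx): indices of the k most and k least similar samples."""
    if k <= 0 or 2 * k > len(similarities):
        raise ValueError("k must be positive and at most half the number of samples")
    # Rank sample indices by ascending similarity to the source domain.
    order = sorted(range(len(similarities)), key=lambda i: similarities[i])
    return order[-k:], order[:k]

def build_binary_training_set(similarities, k):
    """Label high-similarity samples 1 (shared) and low-similarity samples 0 (private)."""
    high, low = select_extreme_samples(similarities, k)
    return [(i, 1) for i in high] + [(i, 0) for i in low]
```

Samples with intermediate similarity are left out of the training set entirely, which matches the quoted statement that only samples with extreme similarity are used.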
D1-D3 does not teach second classifier trained using a filtered subset of the source domain data items.
Xu teaches that the source weight is generated by a second classifier trained using a filtered subset of the source data items (Abstract, Fig. 5, ¶21, “MDDA device 110 selects the source training samples that are close to the target to finetune the source classifiers to obtain refined source classifiers 230-236 (denoted as C′i) in respective source domains 210-216”, ¶41, “In step S512, source classifier refining unit 346 finetunes source classifiers Ci (412) using the selected source training data”).
D1-D3 and Xu are analogous art to the claimed invention because they are from a similar field of endeavor of domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3 in the manner disclosed by Xu with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D3 as described above for handling new scenarios with few labeled examples (Xu, ¶2).
With regard to Claim 10,
D1 discloses the apparatus of claim 1, wherein the probability distribution produced by input of one or more target data items to the first computational model (D1, P.2724, Col. 1, 3.4. Transferability Criterion, ¶¶1-3, “Domain Similarity. In Eq. (2), the objective of D′ is to predict samples from source domain as 1 and samples from target domain as 0. Thus, d̂′ can be seen as the quantification for the domain similarity of each sample. For a source sample, smaller d̂′ means that it is more similar to the target domain; for a target sample, larger d̂′ means that it is more similar to the source domain”, Col. 2, “
[Image: media_image5.png]
”).
With regard to Claim 12,
D1-D3-Xu disclose the apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to provide the filtered subset of target data items by:
generate, using the first computational model, a probability distribution over the known source domain classes for a particular target data item (D3, P.2930, 3.3. Progressive Separation, “We utilize a multi-binary classifier, which is composed of |Cs| binary classifiers denoted by G_c|_{c=1}^{|Cs|}, to measure the similarity between each target sample and each source class”, “Each binary classifier Gc outputs a probability pc for each target sample to measure how possible the sample belongs to known class c. Thus, the probability pc can be explained as the similarity between the target sample and known class c. Data of known classes in the target domain tend to have higher probability in one of the shared classes than data of unknown classes”, P.2931, 3.5. Training Procedure, “Step 1. We first train the feature extractor Gf and the classifier Gy to classify source samples. Meanwhile, the multi-binary classifier Gc, c = 1, 2, ..., |Cs| is trained in a one-vs-rest way for each source class”);
determine a confidence level for the particular target data item belonging to a source domain class using the generated probability distribution (D3, P.2930, 3.3. Progressive Separation, “We utilize a multi-binary classifier, which is composed of |Cs| binary classifiers denoted by G_c|_{c=1}^{|Cs|}, to measure the similarity between each target sample and each source class”, “Each binary classifier Gc outputs a probability pc for each target sample to measure how possible the sample belongs to known class c. Thus, the probability pc can be explained as the similarity between the target sample and known class c. Data of known classes in the target domain tend to have higher probability in one of the shared classes than data of unknown classes”, “With such similarity definition, target data of known classes will have high similarity to their source domain counterparts. Correspondingly, target data of unknown classes will have low similarity to all classes in the source domain. Thus, we rank the similarity for all the target samples, and choose samples with highest/lowest similarity to train the binary classifier Gb”); and
select the particular target data item for the subset if the confidence level is above an upper confidence level threshold or below a lower confidence level threshold (D3, P.2931, 3.5. Training Procedure, ¶2, “We further select target samples with high/low similarities to the source domain to train the fine-grained binary classifier Gb”).
The same motivation to combine for claim 1 equally applies for current claim.
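For illustration only, the threshold-based selection recited in claim 12 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or D3's implementation: the function names, the use of the maximum class probability as the confidence level, and the threshold values are all hypothetical.

```python
from typing import List, Sequence


def confidence(probs: Sequence[float]) -> float:
    """Assumed confidence measure: the largest probability over known source classes."""
    return max(probs)


def filter_target_items(
    batch: Sequence[Sequence[float]], upper: float, lower: float
) -> List[int]:
    """Return indices of target items whose confidence is above `upper` or below `lower`."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    selected = []
    for index, probs in enumerate(batch):
        level = confidence(probs)
        # Keep only the extremes; ambiguous items in between are left out.
        if level > upper or level < lower:
            selected.append(index)
    return selected


# Hypothetical probability distributions for three target data items.
batch = [
    [0.95, 0.03, 0.02],  # confidently a known class, kept
    [0.40, 0.35, 0.25],  # ambiguous, dropped
    [0.34, 0.33, 0.33],  # low confidence for every known class, kept
]
print(filter_target_items(batch, upper=0.9, lower=0.35))  # [0, 2]
```

Items at either extreme form the filtered subset; the thresholds here are fixed values chosen for the example, whereas D3 is quoted above as ranking samples by similarity.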
With regard to Claim 13,
D1-D3-Xu disclose the apparatus of claim 1, wherein the first classifier is further configured as a binary classifier to compute a target weight of ‘1’ for indicating that a particular target data item belongs to a shared target domain class and ‘0’ for indicating that a target data item belongs to a private target domain class (D1, P.2724, ¶2, “we compute wt(x) using Eq. (8) (details in the next subsection). With a validated threshold w0, the class y(x) can be predicted by thresholding ˆy(x) w.r.t. w0:
[Equation image: media_image6.png]
which either rejects the target sample x as “unknown” class or classifies it to one of the source classes”; D3, P.2930, Col. 2, ¶2, “we rank the similarity for all the target samples, and choose samples with highest/lowest similarity to train the binary classifier Gb. This filtering is relatively coarse but has high confidence”, P.2930, 3.3. Progressive Separation, Col. 2, “With samples selected into known and unknown classes by the multi-binary classifier, we further train a binary classifier Gb to finely separate known and unknown classes. Using X′ to denote the set of filtered samples by the multi-binary classifier, and dj to indicate whether a target sample xj ∈ X′ is labeled as known (dj = 0) or unknown (dj = 1)”, P.2931, 3.5. Training Procedure, ¶2, “We further select target samples with high/low similarities to the source domain to train the fine-grained binary classifier Gb”).
The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 16,
D1-D3-Xu teach the apparatus of claim 14, wherein the second classifier is configured as a binary classifier for computing a source weight of ‘1’ for indicating that a particular source data item belongs to a known class of the first computational model shared with the target domain and ‘0’ for indicating that a particular source data item belongs to a private source domain class (D1, equation (7), ws(x) = H(ŷ)/log|Cs| − d̂′(x); 3.4. Transferability Criterion, “Note that the entropy is normalized by its maximum value (log |Cs|) so that it is restricted into [0, 1] and comparable to the domain similarity measure d̂′. Also, the weights are normalized into interval [0, 1] during training”; i.e., the transferability function uses entropy and domain similarity to predict whether the source sample is part of a shared class, where ws(x) = 0 (low entropy or low similarity to the target domain, i.e., large d̂′) means private class and ws(x) = 1 (high entropy or high similarity to the target domain, i.e., small d̂′) means shared class).
The same motivation to combine for claim 1 equally applies for current claim.
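As an illustrative aid, the source weight of D1 equation (7) as quoted above, ws(x) = H(ŷ)/log|Cs| − d̂′(x), can be computed as in the following sketch. The function names and sample values are hypothetical, and the later normalization of weights into [0, 1] mentioned in the quotation is not reproduced here.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of the class prediction divided by its maximum value log|Cs|."""
    num_classes = len(probs)
    if num_classes < 2:
        raise ValueError("need at least two source classes")
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(num_classes)


def source_weight(probs: Sequence[float], domain_similarity: float) -> float:
    """ws(x) = H(y_hat) / log|Cs| - d_hat'(x), per the equation quoted from D1."""
    return normalized_entropy(probs) - domain_similarity


# Hypothetical values: a uniform prediction with a small d_hat' gives a large weight;
# a one-hot prediction with a large d_hat' gives a small (here negative) weight.
print(source_weight([0.25, 0.25, 0.25, 0.25], domain_similarity=0.2))
print(source_weight([1.0, 0.0, 0.0, 0.0], domain_similarity=0.9))
```

The raw values may fall outside [0, 1] before the normalization step; any particular rescaling (for example, min-max over a batch) would be an additional assumption.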
Regarding claim 22,
Claim 22 is similar in scope to claim 1; therefore, it is rejected under similar rationale.
Claims 2-5 are rejected under 35 U.S.C. 103 as being unpatentable over “Universal Domain Adaptation” [You et al., hereinafter D1], disclosed in the IDS filed 12/23/2021 and published in 2019, in view of “Separate to Adapt: Open Set Domain Adaptation via Progressive Separation” [Liu et al., hereinafter D3], published in 2019 (https://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Separate_to_Adapt_Open_Set_Domain_Adaptation_via_Progressive_Separation_CVPR_2019_paper.pdf), further in view of Xu et al. [US 2021/0295208 A1, hereinafter Xu], and further in view of “Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems” [Mathur et al., hereinafter D2], disclosed in the IDS filed 12/23/2021 and published in 2019.
With regard to Claim 2,
D1-D3-Xu disclose the apparatus of claim 1, wherein the source and target datasets comprise respective first and second sets of audio data items (D1, P.2722, 3.1, ¶1, “a source domain Ds = {(x_i^s, y_i^s)} consisting of n_s labeled samples”, P.2721, ¶3, “given a labeled source domain, for any related target domain, regardless of how its label set differs from that of the source domain”, Fig. 1, P.2722, 3.1, ¶1, “a target domain Dt = {(x_i^t)} of n_t unlabeled samples are provided at training”), and
wherein the second computational model is an adapted audio classifier comprising at least one class shared with known classes of the first computational model (D1, 2721, ¶3, “we need to classify its samples correctly if it belongs to any class in the source label set, or mark it as “unknown” otherwise. The word “universal” indicates that UDA imposes no prior knowledge on the label sets”, P.2722, 3.1, ¶2, “The task for UDA is to design a model that does not know ξ but works well across a wide spectrum of ξ. It must be able to distinguish between target data coming from C and target data coming from Ct, as well as to learn a classification model f to minimize the target risk in the common label set”, P. 2720, Introduction, ¶2, “Domain adaptation [33] aims to minimize the domain gap and successfully transfer the model trained on the source domain to the target domain”, P.2726, Implementation Details, “Models are fine-tuned from ResNet-50 pre-trained on ImageNet”, Table 1, Table 2).
D1-D3-Xu do not disclose that the source and target datasets comprise first and second sets of audio data items, or that the second computational model is an adapted audio classifier.
D2 discloses that the source and target datasets comprise respective first and second sets of audio data items (Fig. 4, data from target microphone and data from source microphone), and wherein the second computational model is an adapted audio classifier comprising at least one class shared with known classes of the first computational model (Fig. 4, P.175, 6.1 Audio Tasks and Datasets, “Keyword Spotting. In this task, the goal is to identify the presence of a certain keyword class (e.g., Hey Alexa) in a given speech segment. To train a model for this task”, “model outputs a probability of a given audio recording belonging to a certain keyword class (e.g., Yes, No) or to an Unknown class”).
D1-D3-Xu and D2 are analogous art to the claimed invention because they are from a similar field of endeavor of applying the principles of generative adversarial networks (GANs) to learn using unlabeled data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3-Xu in the manner disclosed by D2 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D3-Xu as described above to allow the use of D1-D3-Xu in audio detection, thereby expanding D1's fields of application, to decouple the problem of microphone variability from the audio task, and to put minimal burden on end-users to provide training data (D2, Abstract).
With regard to Claim 3,
D1-D3-Xu-D2 disclose the apparatus of claim 2, wherein the first set of audio data items represent audio data received under one or more first conditions and wherein the second set of audio data items represent audio data received under one or more second conditions, wherein the first and second conditions comprise differences in terms of their respective ambient noise and/or microphone characteristics (D2, P.169-170, “In this paper, we present a practical solution to recover the accuracy of audio models otherwise lost due to microphone variability. We propose to frame the problem of microphone variability as an audio translation problem, that is, given a set of audio data from a source microphone, can we translate it such that it will resemble data collected from a target microphone? More formally, if and 0 are the recordings of the same audio signal from two microphones A and B respectively, we would like to learn a microphone translation function f , such that ý = f (y)”, P.175, 6.1, Data Collection, “Microphone Hardware. We collect audio data from six different microphones representing three class of devices”, P.176, ¶1, “we use a JBL LSR305 reference monitor speaker to replay the audio datasets in a quiet room. This speaker has a relatively flat frequency response in the human speech range which allows for a faithful replay of the datasets. The replayed audios are recorded simultaneously on 6 Raspberry Pi devices, each connected with a microphone and placed equidistant (12cm) from the audio source. Effectively, we created six versions of the audio datasets, one for each microphone used in our experiment”).
The same motivation to combine for claim 2 equally applies for current claim.
With regard to Claim 4,
D1-D3-Xu-D2 disclose the apparatus of claim 2, wherein the first and second sets of audio data items represent speech, e.g. one or more keywords (D2, P.175, 6.1 Audio Tasks and Datasets, “Keyword Spotting. In this task, the goal is to identify the presence of a certain keyword class (e.g., Hey Alexa) in a given speech segment. To train a model for this task”, “model outputs a probability of a given audio recording belonging to a certain keyword class (e.g., Yes, No) or to an Unknown class”).
The same motivation to combine for claim 2 equally applies for current claim.
With regard to Claim 5,
D1-D3-Xu-D2 disclose the apparatus of claim 4, wherein the second computational model is configured for use with a digital assistant apparatus for performing one or more processing actions based on received speech associated with the target domain (D2, P.175, 6.1 Audio Tasks and Datasets, “Keyword Spotting. In this task, the goal is to identify the presence of a certain keyword class (e.g., Hey Alexa) in a given speech segment. To train a model for this task, we use the Speech Commands dataset containing 65,000 one-second long utterances of 30 short keywords [3]. Instead of using all 30 classes, we used a subset of 12 classes for our experiments (yes, no, up, down, left, right, on, off stop, go, zero, one). ”).
The same motivation to combine for claim 4 equally applies for current claim.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over “Universal Domain Adaptation” [You et al., hereinafter D1], disclosed in the IDS filed 12/23/2021 and published in 2019, in view of “Separate to Adapt: Open Set Domain Adaptation via Progressive Separation” [Liu et al., hereinafter D3], published in 2019 (https://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Separate_to_Adapt_Open_Set_Domain_Adaptation_via_Progressive_Separation_CVPR_2019_paper.pdf), further in view of Xu et al. [US 2021/0295208 A1, hereinafter Xu], and further in view of “Associative Partial Domain Adaptation”, published on August 7, 2020 (https://arxiv.org/pdf/2008.03111) [hereinafter D4].
With regard to Claim 15,
D1-D3-Xu teach the apparatus of claim 14, wherein the at least one memory and the computer program code are further configured to filter the source data items by:
input a batch of target data items to the first trained model to generate respective probability distributions (Xu, Fig. 5, ¶43, “In step S514, prediction unit 348 may apply the trained target encoders and the refined classifiers to unlabeled data of the target domain, e.g., unlabeled data 151 stored in database/repository 150. For example, the refined source classifiers C′i (430) and target encoders Fi T (420) are provided for the learning task (e.g., to classify a target data xT), as illustrated by the fourth block of FIG. 4. For each source domain, prediction unit 348 extracts the features of the unlabeled target domain data based on the learned target encoder Fi T (xT), and obtains source-specific prediction using the distilled source classifier C′i (Fi T (xT))”);
aggregate the probability distributions (Xu, ¶44, “In step S516, aggregation unit 348 may determine weights for the different source domains in order to aggregate the source-specific prediction results”, ¶45, “In step S518, aggregation unit 348 may then aggregate the source-specific predication results with the calculated weights”, Claim 10, “aggregating the predictions weighted by respective domain weights each corresponding to a source domain”);
identify a subset of the source domain classes including a predetermined number of [source data] (Xu, ¶39, “training data selection unit 344 may select a subset of training data from the received labeled data of each source domain. This step is also referred to as source distilling, which selects more relevant training data to improve the performance of the source classifiers. In some embodiments, in each source domain, training data selection unit 344 selects the source training samples that are closer to the target”, ¶40, “calculated distance reflects its similarity to the target domain. The smaller the calculated distance is, the closer the source sample is to the target domain. In some embodiments, in each source domain Si, a predetermined percentage of source training samples may be selected”); and
select source data items associated with the identified subset of source domain classes (Xu, Abstract, “selecting a subset of the labeled data received from each source domain based on a similarity between the selected labeled data and the unlabeled data of the target domain”, Claim 1, ¶41, “source classifier refining unit 346 finetunes source classifiers Ci (412) using the selected source training data”, ¶42).
The same motivation to combine for claim 1 equally applies for current claim.
D1-D3-Xu do not explicitly teach identifying the subset based on the aggregated probability distributions, including largest value and lowest value classes.
D4 teaches identifying, based on the aggregated probability distributions, a predetermined number of largest value and lowest value classes (P.5, Col. 1, ¶2, Confidence-guided Loss: “To reduce errors caused by corner cases, we apply confidence guided loss only to top-K classes and bottom-K classes according to the value of class-level commonness:
[Equation image: media_image7.png]
Here, TK and BK are top-K class set and bottom-K class set respectively”).
D1-D3-Xu and D4 are analogous art to the claimed invention because they are from a similar field of endeavor of domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3-Xu in the manner disclosed by D4 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to substitute the D1-D3-Xu method for identifying the similarity between the source and target (e.g., an empirical Wasserstein distance) with one based on top-K and bottom-K classes, to encourage positive transfer by mapping between nearby target samples and source samples with high label-commonness and to reduce errors caused by corner cases (D4, Abstract, P.5, ¶2, “To reduce errors caused by corner cases, we apply loss only to top-K classes and bottom-K classes according to the value of class-level commonness”).
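For illustration only, the steps recited in claim 15 (aggregate the batch of probability distributions, identify a predetermined number of largest-value and lowest-value classes, and select the associated source data items) can be sketched as below. This is a hedged sketch: averaging as the aggregation rule, the function names, and the sample data are assumptions and are not taken from the application, Xu, or D4.

```python
from typing import List, Sequence, Set


def aggregate(batch: Sequence[Sequence[float]]) -> List[float]:
    """Assumed aggregation rule: average each class probability over the batch."""
    count = len(batch)
    num_classes = len(batch[0])
    return [sum(row[c] for row in batch) / count for c in range(num_classes)]


def top_and_bottom_classes(scores: Sequence[float], k: int) -> Set[int]:
    """Return the k largest-value and k lowest-value class indices."""
    if k <= 0 or 2 * k > len(scores):
        raise ValueError("k must be positive and at most half the number of classes")
    ranked = sorted(range(len(scores)), key=lambda c: scores[c])
    return set(ranked[:k]) | set(ranked[-k:])


def select_source_items(labels: Sequence[int], classes: Set[int]) -> List[int]:
    """Return indices of source items whose label is in the identified class subset."""
    return [i for i, label in enumerate(labels) if label in classes]


# Hypothetical distributions over four source classes for three target items.
target_batch = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.5, 0.1, 0.3, 0.1],
]
source_labels = [0, 1, 2, 3, 0]  # hypothetical class label of each source item

chosen = top_and_bottom_classes(aggregate(target_batch), k=1)
print(sorted(chosen))                              # [0, 3]
print(select_source_items(source_labels, chosen))  # [0, 3, 4]
```

Here class 0 has the largest aggregated value and class 3 the lowest, so only source items labeled 0 or 3 are retained.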
Claims 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over “Universal Domain Adaptation” [You et al., hereinafter D1], disclosed in the IDS filed 12/23/2021 and published in 2019, in view of “Separate to Adapt: Open Set Domain Adaptation via Progressive Separation” [Liu et al., hereinafter D3], published in 2019 (https://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Separate_to_Adapt_Open_Set_Domain_Adaptation_via_Progressive_Separation_CVPR_2019_paper.pdf), further in view of Xu et al. [US 2021/0295208 A1, hereinafter Xu], and further in view of “Adversarial Training Based Multi-Source Unsupervised Domain Adaptation for Sentiment Analysis” [Dai et al., hereinafter D5], published 4/2020.
With regard to Claim 17,
D1-D3-Xu teach the apparatus of claim 1.
D1-D3-Xu do not explicitly teach that the first computational model comprises a feature extractor associated with the source domain dataset, and wherein the adapting of the first computational model further comprises: update weights of the feature extractor based on the computed discriminator loss function.
D5 teaches that the first computational model comprises a feature extractor associated with the source domain dataset, and wherein the adapting of the first computational model further comprises: update weights of the feature extractor based on the computed discriminator loss function (P.7621, ¶1 The Discriminator D, “update weights of the feature extractor based on the computed discriminator loss function”, P. 7621, Col. 2, ¶3, “shared feature extractor Es … The second term is the adversarial loss, which will encourage Es to strengthen its feature extraction ability to confuse D”, Equation (2), Equation (5)).
D1-D3-Xu and D5 are analogous art to the claimed invention because they are from a similar field of endeavor of domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3-Xu in the manner disclosed by D5 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D3-Xu as described above to encourage the feature extractor to strengthen its feature extraction ability to confuse the discriminator (D5, P. 7621, Col. 2, ¶3).
With regard to Claim 18,
D1-D3-Xu-D5 teach the apparatus of claim 17, wherein the first computational model further comprises a classifier for receiving feature representations from the feature extractor, and wherein the adapting of the first computational model further comprises: determine a classification loss resulting from updating weights of the feature extractor and updating the weights of the feature extractor based on the classification loss (D5, P. 7621, ¶4, “The classifier C C is an usual classifier and used to classify sentiment polarities. … This objective means the concatenated features will be sent to C to calculate the final sentiment label”, P. 7621, Col. 2, ¶3, “shared feature extractor Es … The first term denotes the contribution to the sentiment classification when concatenated with the private features”, Equation (3), Equation (5)).
The same motivation to combine for claim 17 equally applies for current claim.
Claims 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over “Universal Domain Adaptation” [You et al., hereinafter D1], disclosed in the IDS filed 12/23/2021 and published in 2019, in view of “Separate to Adapt: Open Set Domain Adaptation via Progressive Separation” [Liu et al., hereinafter D3], published in 2019 (https://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Separate_to_Adapt_Open_Set_Domain_Adaptation_via_Progressive_Separation_CVPR_2019_paper.pdf), further in view of Xu et al. [US 2021/0295208 A1, hereinafter Xu], and further in view of “A review of domain adaptation without target labels” [Kouw et al., hereinafter D6], published 7/2019 (https://arxiv.org/pdf/1901.05335).
With regard to Claim 19,
D1-D3-Xu teach the apparatus of claim 1, including the at least one memory and the computer program code.
D1-D3-Xu do not explicitly teach adapting the first computational model responsive to an identification that one or more conditions under which the set of target data items were produced are different from one or more conditions under which the set of source data items were produced.
D6 teaches adapting the first computational model responsive to an identification that one or more conditions under which the set of target data items were produced are different from one or more conditions under which the set of source data items were produced (P.1, Col. 2, ¶1, “Nevertheless, unlabeled data gives an indication in what way a source domain and a target domain differ from each other. This information can be exploited to make a classifier adapt, i.e. change its decisions such that it generalizes better towards the target domain”, P.5, Col. 1, ¶2, “Concept shift, also known as concept drift or conditional shift, will require observations of labeled data in both domains and is therefore out of our scope [86]. Covariate shift corresponds to decomposing the joint distributions into p(y | x)p(x) and assuming that the posteriors remain equal in both domains, pS (y|x) = pT (y|x) [30]. Conversely, prior shift, also referred to as label or target shift, corresponds to decomposing the joints into p(x | y)p(y) and assuming the conditional distributions remain equal pS (x | y) = pT (x | y) [30].”).
D1-D3-Xu and D6 are analogous art to the claimed invention because they are from a similar field of endeavor of domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3-Xu in the manner disclosed by D6 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D3-Xu as described above to make a classifier adapt, i.e., change its decisions such that it generalizes better towards the target domain (D6, P.1, Col. 2, ¶1, “to make a classifier adapt, i.e. change its decisions such that it generalizes better towards the target domain”).
With regard to Claim 20,
D1-D3-Xu-D6 teach the apparatus of claim 19, wherein the at least one memory and the computer program code are further configured to identify different characteristics of one or more sensors used for generating the respective sets of target data items and source data items (D6, P.2, 1.3 Motivating examples, Col. 2, ¶2, “In medical imaging, radiologists manually annotate tissues, abnormalities, and pathologies to obtain training data for computer-aided-diagnosis systems. But due to the mechanical configuration, calibration, vendor or acquisition protocol of MRI, CT or PET scanners, there are large variations between data sets from different medical centers [38], [39]. Consequently, diagnosis systems often fail to perform well across centers”, i.e., the domain differences are attributable to sensor-related variables such as mechanical configuration, calibration, vendor, or acquisition protocol; D6, 6.4 Multi-site studies, P.14-15, “another group uses different experimental protocols or measuring devices, or is located in a different environment, and that their data is therefore not compatible [208]. For example, in biostatistics, gene expression micro-array data can exhibit batch effects [209]. These can be caused by the amplification reagent, time of day, or even atmospheric ozone level [210]. In some data sets, batch effects are actually the most dominant source of variation and are easily identified by clustering algorithms [209]. Domain adaptation methods are useful tools for this type of problem. Additional information such as local weather, laboratory conditions, or experimental protocol, can be exploited to correct for the data shift”).
The same motivation to combine for claim 19 equally applies for current claim.
With regard to Claim 21,
D1-D3-Xu-D6 teach the apparatus of claim 19 , wherein the at least one memory and the computer program code are further configured to access metadata respectively associated with the source and target data items indicative of the one or more conditions under which the sets of source and target data items were produced (D6, 6.4 Multi-site studies, P.14-15, “another group uses different experimental protocols or measuring devices, or is located in a different environment, and that their data is therefore not compatible [208]. For example, in biostatistics, gene expression micro-array data can exhibit batch effects [209]. These can be caused by the amplification reagent, time of day, or even atmospheric ozone level [210]. In some data sets, batch effects are actually the most dominant source of variation and are easily identified by clustering algorithms [209]. Domain adaptation methods are useful tools for this type of problem. Additional information such as local weather, laboratory conditions, or experimental protocol, can be exploited to correct for the data shift”).
The same motivation to combine for claim 19 equally applies for current claim.
Response to Arguments
Applicant argues that a human cannot perform large-scale data processing, model execution, and statistical filtering to train a classifier.
Examiner respectfully disagrees. The claims do not require large-scale data, and a human can readily process and filter a small number of data items. In addition, training and model execution are not treated as a mental process; training a system is a generic computer software process recited at a high level of generality. The training limitation does not amount to significantly more than the judicial exception (see MPEP 2106.05(f)); it merely confines the use of the abstract idea to a particular technological environment and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h).
Applicant argues that training a second classifier, and the process involving the first and second classifiers, is a technological process with no practical mental equivalent.
Examiner respectfully disagrees. The training process, as addressed in the rejection, is not treated as a mental process; however, it does not amount to significantly more than the judicial exception (see MPEP 2106.05(f)). It merely confines the use of the abstract idea to a particular technological environment and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h).
Applicant argues that adversarial training with generated weights that involves iterative calculation is a computationally intensive process that cannot be considered a mental process or mathematical concept.
Examiner respectfully disagrees. The training process, as addressed in the rejection, is not treated as a mental process; however, it does not amount to significantly more than the judicial exception (see MPEP 2106.05(f)). It merely confines the use of the abstract idea to a particular technological environment and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). In addition, the human mind is able to perform calculations using mathematical formulas, and calculations fall under mathematical concepts; the MPEP does not include or exclude a mathematical calculation from the mathematical concepts grouping based on its degree of complexity.
Applicant argues that the claims recite a practical application amounting to “significantly more” as they are directed to a specific improvement in the functioning of a computer.
Examiner respectfully disagrees. The cited paragraphs disclose problems in the field and how the invention could solve them. However, the specification does not identify the unconventional technical solution expressed in the claims. Applicant argues that the use of two classifiers enables the improvement described in ¶62 and ¶74 of the specification. The examiner notes that these paragraphs describe an improvement, but they do not explain why or how the use of two classifiers would solve these problems (MPEP 2106.05(a)).
Applicant’s arguments, see P.7, filed 8/28/2025, with respect to the rejection(s) of claim 1 under D1 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of D3 and Xu.
In response to applicant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning. But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).
Applicant argue that there is no teaching or suggestion in D1, which uses its own integrated weighting scheme based on entropy and domain similarity, to discard its method and instead adopt the specific, multi-step filtering and training process from D3 for the sole purpose of generating target weights. A person of ordinary skill in the art would not have been motivated to combine these disparate methodologies to arrive at the claimed invention.
Examiner respectfully disagrees.
First, the motivation for the modification should be derived from the teaching that provides the improvement (D3), not from the teaching that requires improvement (D1).
Second, D1 and D3 are analogous art to the claimed invention because they are from the same field of endeavor, namely domain adaptation.
Third, D1 relies on entropy-based target weighting, which may be unstable across different levels of domain openness and may require sensitive tuning. In contrast, D3 introduces a mechanism for explicitly selecting high-confidence target samples using extreme similarity scores. D3 explicitly discloses that this filtering provides high confidence because it relies only on samples with extreme similarity scores. See at least D3, 3.3. Co.2, ¶2, “This filtering is relatively coarse but has high confidence since we only use samples with extreme similarity. It is also robust to different levels of openness since we no longer need to choose hyperparameters manually or using optimization tools”.
Applicant argues that Xu does not teach or suggest training a separate, second classifier for the distinct purpose of generating source weights to be used in an adversarial loss function as claimed. Xu’s filtering is for refining the main task classifier, not for training an auxiliary weight-generation classifier. Combining D1, D3, and Xu would require a person of ordinary skill to selectively pick and choose isolated elements from three different complex systems without any suggestion or motivation to do so. Therefore, the specific and integrated process of training two separate classifiers on filtered subsets of data to generate source and target weights for an adversarial domain adaptation framework is not taught or suggested by the cited art.
Examiner respectfully disagrees.
First, D1 teaches generating a source weight as a function of similarity between domains, and Xu teaches a different way to generate a source weight, using filtered source data. A person of ordinary skill would understand that this is a substitutable way to generate the weight, which is an obvious variant of the same design principle.
Second, D1 teaches weighting target samples, D3 teaches filtering and selecting target samples, and Xu teaches filtering and selecting source samples and weighting them. Therefore, it would have been obvious for a person of ordinary skill in the art to combine them to improve the weighting accuracy.
Third, the D3 classifier and the Xu classifier serve different roles and require different filtered datasets. Therefore, it would have been obvious for a person of ordinary skill in the art to treat them as distinct classifiers that map precisely to the first and second claimed classifiers. In other words, the combination of cited references teaches the claimed first and second classifiers. D3 explicitly teaches training a classifier on a filtered subset of target samples to differentiate between known and unknown target data (classifier Gb), and Xu teaches training or fine-tuning a classifier on a filtered subset of source samples that are most similar to the target domain (classifier Ci). These are clearly two different classifiers, trained on different filtered datasets for different functional purposes, as required by the claim. Thus, when D3 and Xu are combined with D1, which generates weights from classifier outputs, they modify D1 to provide both the claimed target weight and the claimed source weight. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Fourth, D1-D3 and Xu are analogous art to the claimed invention because they are from the same field of endeavor, namely domain adaptation. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D3, resulting in resolutions as disclosed by Xu, with a reasonable expectation of success. One of ordinary skill in the art would have been motivated to modify D1-D3 as described above for handling new scenarios with few labeled examples (Xu, ¶2).
As to the remaining dependent claims, Applicant argues that they are allowable due to their respective direct and indirect dependencies upon one of the aforementioned independent claims. The examiner respectfully disagrees. The independent claims are not allowable, as stated above in this “Response to Arguments” section of this office action.
Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure.
US Patent Application Publication No. 20200134442 to Sim et al., which discloses the use of domain adaptation to transfer learning between a source domain and a target domain for task detection models.
Examiner has pointed out particular references contained in the prior art of record in the body of this action for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that the applicant, in preparing the response, fully consider the entire references as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. It is noted that any citation to specific pages, columns, figures, or lines in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331-33, 216 USPQ 1038-39 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMED ABOU EL SEOUD whose telephone number is (303)297-4285. The examiner can normally be reached Monday-Thursday 9:00am-6:00pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMED ABOU EL SEOUD/Primary Examiner, Art Unit 2148