Last updated: May 29, 2026
Application No. 18/159,759
DATA PROCESSING DEVICE, DATA PROCESSING SYSTEM, AND DATA PROCESSING METHOD

Non-Final OA: §103, §112
Filed: Jan 26, 2023
Priority: Apr 19, 2022 (JP 2022-068632)
Examiner: CHEN, ALAN S
Art Unit: 2125
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Toshiba Electronic Devices & Storage Corporation
OA Round: 1 (Non-Final)
Interview Optional

Interview lift (+6.3%) is below the 15.0% threshold. A written response is recommended.
Based on 1134 resolved cases, 2023–2026
Examiner Intelligence

CHEN, ALAN S
Career Allowance Rate: 91% (1033 granted / 1134 resolved), above average, +36.1% vs TC avg
Interview Lift: +6.3% (moderate), resolved cases with vs. without interview
Avg Prosecution (typical timeline): 2y 9m
Currently pending: 22
Total Applications (career history, across all art units): 1152
Statute-Specific Performance

§101: 8.1% (-31.9% vs TC avg)
§103: 30.8% (-9.2% vs TC avg)
§102: 42.6% (+2.6% vs TC avg)
§112: 12.5% (-27.5% vs TC avg)
Based on career data from 1134 resolved cases
Office Action

§103 §112
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
The following title is suggested: "DATA PROCESSING DEVICE, SYSTEM, AND METHOD FOR GENERATING MACHINE LEARNING MODEL USING AUGMENTED MATRIX DATA"

The disclosure is objected to because of the following informalities:
On pg. 3, line 35, "processor71" is missing a space and should read "processor 71".
On pg. 8, line 26, "processor71" is missing a space and should read "processor 71".
On pg. 4, line 12, "ego" should read "e.g."
On pg. 7, line 28, "xp_2,D" should read "xp_2,D1".
On pg. 12, line 3, "yq_Np" should read "yq_Nq".
On pg. 12, line 30, "xp_2,1; xp_2,2; . . . xp_2,D2" should read "xq_2,1; xq_2,2; ... xq_2,D2".
On pg. 12, lines 32-33, "xq_Np,1; xq_Np,2" should read "xq_Nq,1; xq_Nq,2".
On pg. 23, lines 35-36, "machine leaning" should read "machine learning".
Appropriate correction is required.
Claim Objections
Claims 4, 6, and 17 are objected to because of the following informalities: 
Claim 4 recites "components of the first regression matrix data includes" which should read "components of the first regression matrix data include"
Claim 6 recites "second machine leaning model" which should read "second machine learning model" 
Claim 6 recites "components of the fourth matrix data includes," "components of the fifth matrix data includes," "components of the sixth matrix data includes," "components of the second generated label includes," "components of the fourth regression matrix data includes," and "components of the fifth regression matrix data includes," each of which should use "include" rather than "includes".
Claim 17 recites "a first acquired label with N1 columns" which should read "a first acquired label with N1 rows" 
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word "means," but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
"an acquisitor ... the acquisitor being configured to acquire first acquired data in a first operation" in Claim 1.
"the acquisitor is configured to acquire a second acquired data in the second operation" in Claim 6.
"one or a plurality of acquisitors ... the one or plurality of acquisitors being configured to acquire a first acquired data in a first operation" in Claim 17.
The term "acquisitor" appears to be a non-standard term that is not recognized by a person having ordinary skill in the art as denoting a particular definite structure. It serves as a generic placeholder coupled with functional language ("configured to acquire") without reciting sufficient structure to perform the data acquisition function.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof. The specification on pg. 4, first paragraph, identifies the acquisitor as an interface for input and output.  Accordingly, the term "acquisitor" is construed to cover an interface for input and output as described in the specification, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites "the processor being configured to generate a first machine learning model based on a first generated data based on the first acquired data and a first other acquired data in the first operation" and then subsequently recites "the first other data including a first other feature value matrix with Np rows and D1 columns and a first other label with Np rows".  The term "the first other data" lacks antecedent basis because the claim previously introduces "a first other acquired data," not "a first other data".  The terms "first other acquired data" and "first other data" are not identical, and it is unclear whether these refer to the same or different data elements.  For purposes of examination, "the first other data" is interpreted under BRI to refer to the same data element as the "first other acquired data" previously introduced in claim 1.
Claim 19 recites "the processor being configured to generate a first machine learning model based on a first generated data based on the first acquired data and a first other acquired data in the first operation" and then subsequently recites "the first other data including a first other feature value matrix with Np rows and D1 columns and a first other label with Np rows." The term "the first other data" lacks antecedent basis because the claim previously introduces "a first other acquired data," not "a first other data," for the same reasons as discussed with respect to claim 1 above.  For purposes of examination, "the first other data" in claim 19 is interpreted under BRI as stated in the analysis of claim 1 above.
Claims 2-16 depend directly or indirectly from claim 1 and therefore inherit the lack of antecedent basis for "the first other data" identified in claim 1. Claims 2-16 do not cure this deficiency and are therefore rejected for the same reasons.
Claim 6 recites "the second other data includes a second feature value matrix with Nq rows and D2 columns and a second other label with Np rows." The specification's Detailed Description consistently describes the second other label 52b as having Nq rows (not Np rows), stating (see pg. 10, lines 18+): "The second other data 52 includes a second other feature value matrix 52a with Nq rows and D2 columns and a second other label 52b with Nq rows," where "Nq" corresponds to the number of samples in the second other data 52. The use of "Np" (which corresponds to the number of samples in the first other data from the first operation) for the row count of the second other label creates an irreconcilable inconsistency with the Detailed Description. A person of ordinary skill in the art cannot determine with reasonable certainty whether the second other label is intended to have Np rows (as literally recited) or Nq rows (as described in the Detailed Description).  For purposes of examination, "a second other label with Np rows" is interpreted under BRI as referring to a second other label with Nq rows, consistent with the specification's Detailed Description, which describes the second other label 52b as having Nq rows corresponding to the number of samples in the second other data 52.
Claim 6 further recites "the second acquired data includes a second feature value matrix with N2 rows and D2 columns and a second acquired label with Np rows." The specification's Detailed Description consistently describes the second acquired label 12b as having N2 rows (not Np rows), stating: "The second acquired data 12 includes a second feature value matrix 12a with N2 rows and D2 columns and a second acquired label 12b with N2 rows," where "N2" corresponds to the number of samples in the second acquired data 12. The use of "Np" for the row count of the second acquired label creates an irreconcilable inconsistency with the Detailed Description. A person having ordinary skill in the art cannot determine with reasonable certainty whether the second acquired label is intended to have Np rows (as literally recited) or N2 rows (as described in the Detailed Description).  For purposes of examination, "a second acquired label with Np rows" is interpreted under BRI as referring to a second acquired label with N2 rows, consistent with the specification's Detailed Description, which describes the second acquired label 12b as having N2 rows corresponding to the number of samples in the second acquired data 12.
Claims 7-9 depend directly or indirectly from claim 6 and therefore inherit the inconsistencies between claim 6 and the specification identified above regarding "a second other label with Np rows" and "a second acquired label with Np rows." Claims 7-9 do not cure these deficiencies and are therefore rejected for the same reasons.
Claim 14 recites "the first acquired data, the first other data, and the second acquired data include characteristics of a magnetic recording/reproducing device." There is insufficient antecedent basis for the limitation "the second acquired data" in the claim. Claim 14 depends on claim 1, and neither claim 1 nor claim 14 previously introduces "a second acquired data." The term "a second acquired data" is introduced only in claim 6, from which claim 14 does not depend.  For purposes of examination, "the second acquired data" in claim 14 is interpreted under BRI to refer to any second set of acquired data that is distinct from the first acquired data.
Claim 17 recites "the first acquired data including a first feature value matrix with N1 rows and D1 columns and a first acquired label with N1 columns." The specification consistently describes the first acquired label as having N1 rows, not N1 columns. The specification states (see paragraph 4+): "The first acquired data 11 includes a first feature value matrix 11a with N1 rows and D1 columns and a first acquired label 11b with N1 rows" (Detailed Description). Furthermore, independent claim 1, which recites the same data processing structure, recites "a first acquired label with N1 rows." The use of "N1 columns" in claim 17, where both the specification and the parallel claim 1 recite "N1 rows," creates ambiguity as to whether the first acquired label is intended to be a column-oriented data structure with N1 columns or a row-oriented data element with N1 rows. A person having ordinary skill in the art cannot determine with reasonable certainty the intended dimensionality of the first acquired label as recited in claim 17.  For purposes of examination, "a first acquired label with N1 columns" in claim 17 is interpreted under BRI as referring to a first acquired label with N1 rows.
Claim 18 depends from claim 17 and inherits the inconsistency between claim 17 and the specification regarding "a first acquired label with N1 columns" identified above. Claim 18 does not cure this deficiency and is therefore rejected for the same reasons.
Claim 19 is directed to "A data processing method" but does not recite any active method steps. The claim recites "a processor being caused to perform a first operation" and "the processor being configured to generate a first machine learning model," using participial phrases and apparatus-style "being configured to" language throughout. All remaining limitations describe data structures using "including" clauses (e.g., "the first other data including a first other feature value matrix," "the first generated matrix including first matrix data"). A method claim should recite active method steps defining the process performed. As drafted, claim 19 reads as a description of an apparatus configuration rather than a method comprising active steps, and a person having ordinary skill in the art cannot determine whether the claim is directed to the method of performing data processing or to the apparatus configured to perform it. The claim fails to set forth the subject matter which the inventor regards as the method invention.  For purposes of examination, claim 19 is interpreted under BRI as a method comprising the steps of: causing a processor to perform a first operation, wherein the processor generates a first machine learning model based on a first generated data that is based on the first acquired data and a first other acquired data, with the data structures as recited in the claim. 
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-19 are rejected under 35 U.S.C. 103 as being unpatentable over "Frustratingly Easy Domain Adaptation" to Daumé in view of U.S. Patent Application Publication No. US 2012/0303562 A1 to Paguio.

Per claim 1, Daumé discloses a data processing method (Abstract…"We describe an approach to domain adaptation that is appropriate exactly in the case when one has enough 'target' data to do slightly better than just using only 'source' data"), comprising:
…
the processor being configured to generate a first machine learning model based on a first generated data based on the first acquired data and a first other acquired data in the first operation (Section 2…the approach generates augmented training data from both source domain data and target domain data, and a standard learning algorithm is trained on the augmented data; Abstract and Section 1…"Our approach is incredibly simple, easy to implement as a preprocessing step", where the source domain data corresponds to the first other data and the target domain data corresponds to the first acquired data), 
the first other data including a first other feature value matrix with Np rows and D1 columns and a first other label with Np rows, the Np being an integer of 2 or more, the D1 being an integer of 1 or more (Section 2…the source domain dataset Ds consists of Np labeled samples, each having a feature vector x ∈ ℝ^F with F features and corresponding labels),
the first acquired data including a first feature value matrix with N1 rows and D1 columns and a first acquired label with N1 rows, the N1 being an integer of 2 or more, the N1 being smaller than the Np (Section 1…the approach addresses the scenario where a limited amount of target data is available relative to source data, i.e., the target domain dataset Dt has N1 samples where N1 < Np),
the first generated data including a first generated matrix with (Np+N1) rows and (3×D1) columns and a first generated label with (Np+N1) rows (Section 3…the combined augmented dataset has (Np+N1) samples, each with augmented features in ℝ^(3F), effectively creating a matrix with (Np+N1) rows and 3F=3×D1 columns; the labels from both domains are combined into a single label vector of (Np+N1) rows),
the first generated matrix including first matrix data, second matrix data, and third matrix data (Section 3…the augmented feature vector consists of three F-dimensional blocks: "Φs(x) = ⟨x, x, 0⟩" for source and "Φt(x) = ⟨x, 0, x⟩" for target, creating three distinct column blocks in the combined matrix),
components of the first matrix data including combinations in a row direction of the first other feature value matrix and the first feature value matrix (Section 3…the first block of the augmented feature vector contains the original features x for both source and target samples...for source rows this block contains xp and for target rows it contains x1, which when stacked vertically is the row-direction combination of the source feature matrix and the target feature matrix),
components of the second matrix data including combinations in the row direction of a matrix of 0 components with Np rows and D1 columns and the first feature value matrix (Section 3…"Φs(x) = ⟨x, x, 0⟩", the third block for source data is the zero vector 0 ∈ ℝ^F; "Φt(x) = ⟨x, 0, x⟩", the third block for target data is x; when the third block column, i.e., the target-specific block, is stacked vertically, the source rows contain zeros (Np×D1 zero matrix) and the target rows contain x1, which is the combination in the row direction of a zero matrix and the first feature value matrix),
components of the third matrix data including combinations in the row direction of the first other feature value matrix and a matrix of 0 components with N1 rows and D1 columns (Section 3…"Φs(x) = ⟨x, x, 0⟩", the second block for source data is x; "Φt(x) = ⟨x, 0, x⟩", the second block for target data is the zero vector 0; when the second block column, i.e., the source-specific block, is stacked vertically, the source rows contain xp and the target rows contain zeros (N1×D1 zero matrix), which is the combination in the row direction of the source feature matrix and a zero matrix),
components of the first generated label including combinations in the row direction of the first other label and the first acquired label (Section 3…the combined training set uses labels from both domains stacked together, source labels followed by target labels, forming a combined label vector).
Daumé inherently discloses the three-block column structure of the first generated matrix (M1, M2, M3) as described in claim 1, because the augmented representations Φs(x) = ⟨x, x, 0⟩ for source data and Φt(x) = ⟨x, 0, x⟩ for target data, when assembled into a combined training matrix, necessarily produce three column blocks: (1) a shared block containing original features for all samples, (2) a block that is zero for source rows and contains target features for target rows, and (3) a block that contains source features for source rows and is zero for target rows. The column ordering in the claims (M1=shared, M2=target-specific, M3=source-specific) is merely a permutation of Daumé's column ordering (shared, source-specific, target-specific), which is mathematically equivalent for any machine learning algorithm since column ordering does not affect model training or prediction.
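For illustration only, the block structure discussed above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the applicant's or Daumé's implementation: the function names (`zeros`, `build_generated_data`, `to_daume_order`) and the toy values are hypothetical, and matrices are represented as lists of rows. It assembles the first generated matrix in the claimed order (shared, target-specific, source-specific) and then swaps the last two column blocks to recover Daumé's order.

```python
def zeros(rows, cols):
    """Return a rows x cols matrix of 0 components as a list of rows."""
    return [[0.0] * cols for _ in range(rows)]


def build_generated_data(x_other, y_other, x_acq, y_acq):
    """Assemble the (Np+N1) x (3*D1) matrix and (Np+N1)-row label of claim 1.

    x_other: Np x D1 source ("first other") feature matrix
    x_acq:   N1 x D1 target ("first acquired") feature matrix
    """
    n_p, n_1, d1 = len(x_other), len(x_acq), len(x_other[0])
    m1 = x_other + x_acq            # first matrix data: Xp stacked over X1
    m2 = zeros(n_p, d1) + x_acq     # second matrix data: zeros stacked over X1
    m3 = x_other + zeros(n_1, d1)   # third matrix data: Xp stacked over zeros
    matrix = [r1 + r2 + r3 for r1, r2, r3 in zip(m1, m2, m3)]
    label = y_other + y_acq         # source labels followed by target labels
    return matrix, label


def to_daume_order(matrix, d1):
    """Swap the 2nd and 3rd column blocks: (shared, source-specific, target-specific)."""
    return [row[:d1] + row[2 * d1:] + row[d1:2 * d1] for row in matrix]


# Hypothetical toy data: Np = 3 source samples, N1 = 2 target samples, D1 = 2.
x_other = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y_other = [10.0, 20.0, 30.0]
x_acq = [[7.0, 8.0], [9.0, 10.0]]
y_acq = [40.0, 50.0]

generated_matrix, generated_label = build_generated_data(x_other, y_other, x_acq, y_acq)
daume_matrix = to_daume_order(generated_matrix, 2)

print(generated_matrix[0])  # source row, claimed order:  [x, 0, x]
print(generated_matrix[3])  # target row, claimed order:  [x, x, 0]
print(daume_matrix[0])      # source row, Daumé order:    <x, x, 0>
print(daume_matrix[3])      # target row, Daumé order:    <x, 0, x>
```

In this toy case the two layouts differ only by a fixed permutation of columns, which is the relationship the paragraph above relies on.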
Daumé does not expressly disclose, but Paguio teaches: a data processing device, comprising: an acquisitor; and a processor (Paguio: [0043]…the disk drive includes a control unit 129 containing "logic control circuits, storage means and a microprocessor"; the microprocessor is a processor, and the data acquisition from manufacturing equipment constitutes an acquisitor/interface for receiving data; [0056]-[0057]…data is acquired from wafer fabrication, slider processing, and magnetic testing equipment), the acquisitor being configured to acquire first acquired data in a first operation (Paguio: [0125]…the system acquires manufacturing parameter data including Overlay-1, Overlay-2, Reader/Writer Offset, and Final Stripe Height from manufacturing and testing equipment during the manufacturing process).
Daumé does not expressly disclose the limitation (Np+N1)/D1 being 250 or more.  However, Daumé teaches the general condition of combining source domain data (Np samples) with target domain data (N1 samples), where each sample has D1 features, to create a combined dataset. The ratio (Np+N1)/D1, representing the total number of samples relative to the feature dimensionality, is a result-effective variable recognized by a PHOSITA in the machine learning art as directly affecting model accuracy and generalization. It is well-established in the machine learning literature that the ratio of training samples to feature dimensions is a critical factor governing model performance: insufficient samples relative to features leads to overfitting and poor generalization (the "curse of dimensionality"), while increasing the sample-to-feature ratio monotonically improves model stability and accuracy (see, e.g., the bias-variance tradeoff). Moreover, Daumé demonstrates experiments with datasets having sample-to-feature ratios well exceeding 250 (Table 1: e.g., PubMed-POS source domain has 950,028 training examples with 571k features, and the ACE-NER domains have tens of thousands of examples with 54k-113k features). Accordingly, it would have been a matter of routine experimentation for one of ordinary skill in the art to select a ratio of 250 or more to achieve stable and accurate model performance.  See MPEP § 2144.05(II)(A) ("where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation").
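For readers preparing a response, the ratio limitation is simple to check numerically. The sketch below is illustrative only: the helper names and the first set of sizes are hypothetical, and the second calculation uses just the PubMed-POS figures as quoted in the paragraph above, reading "571k" as 571,000 features (an assumption) and omitting the target-domain sample count, which is not given here.

```python
def sample_to_feature_ratio(n_other, n_acquired, d):
    """Return (Np + N1) / D1 for the given sample counts and feature dimension."""
    return (n_other + n_acquired) / d


def meets_threshold(n_other, n_acquired, d, threshold=250):
    """True when the sample-to-feature ratio is at least the claimed threshold."""
    return sample_to_feature_ratio(n_other, n_acquired, d) >= threshold


# Hypothetical sizes: Np = 4000, N1 = 1000, D1 = 20, so the ratio is exactly 250.
hypothetical_ok = meets_threshold(4000, 1000, 20)
print(hypothetical_ok)  # True

# Figures as quoted above for the PubMed-POS source domain (source samples only).
quoted_ratio = sample_to_feature_ratio(950_028, 0, 571_000)
print(round(quoted_ratio, 2))  # 1.66
```

With the figures as quoted, the source-only ratio comes out near 1.66, so it may be worth confirming the sample and feature counts against Table 1 of the reference before relying on them in either direction.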
Daumé and Paguio are analogous art because they are both within the same field of endeavor, specifically machine learning applied to data processing and prediction. They address the same problem-solving area of improving prediction accuracy using training data from related sources.  Daumé's feature augmentation technique provides a general-purpose preprocessing method for combining data from different domains to train a more accurate model (Daumé: Abstract…"easy to implement as a preprocessing step…and outperforms state-of-the-art approaches"), while Paguio's system acquires data from manufacturing processes and uses machine learning to generate predictive models (Paguio: ¶[0177]…"ANN gave better accuracy in predicting the final MCW than MLR by 30%").
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to implement Daumé's feature augmentation technique for domain adaptation on a computing device such as that taught by Paguio, because Daumé's method is a preprocessing step applicable to any standard learning algorithm and Paguio's system already acquires data and trains machine learning models on a processor with memory. A person having ordinary skill would have recognized that applying Daumé's feature augmentation to Paguio's data processing system would yield the predictable result of improved prediction accuracy when combining data from different manufacturing conditions or different device populations, which is the precise benefit Daumé's technique provides. 
The suggestion/motivation for doing so would have been that Daumé explicitly teaches that the feature augmentation technique improves accuracy when target data is limited relative to source data (Daumé: Abstract…"appropriate exactly in the case when one has enough 'target' data to do slightly better than just using only 'source' data"), and Paguio's manufacturing environment inherently involves scenarios where data from a specific production run (target data) is limited while historical production data (source data) is more abundant (Paguio: [0159]…"Wafer Data has few data points for Overlay-1 & Overlay-2").  A person having ordinary skill in the art would have been motivated to apply Daumé's technique to augment the limited target manufacturing data with more abundant historical data to improve prediction accuracy.

Per claim 2, Daumé combined with Paguio discloses claim 1. Daumé combined with Paguio further teaches the device further comprising: a memory, the memory including a first memory area, the first acquired data and the first other data being stored in the first memory area, and the processor being configured to acquire the first acquired data and the first other data from the first memory area and perform the first operation (Paguio: [0043]…the control unit includes "storage means"; [0057]…the acquired manufacturing data is stored for processing by the neural network, and the processor retrieves the stored data to perform the ML training operation). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 3, Daumé combined with Paguio discloses claim 2. Daumé combined with Paguio further teaches the memory further includes a second memory area, and the processor is configured to store the first generated data in the second memory area (Paguio: [0029]…the control unit includes storage means; it is well known and conventional in the computing arts to partition memory into separate regions or areas for storing different categories of data, such as input data and processed/generated output data, to prevent data corruption and facilitate data management during processing). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 4, Daumé combined with Paguio discloses claim 1. Daumé further teaches the processor is configured to further derive a first regression label by inputting a first regression matrix to the first machine learning model in the first operation, the first regression matrix has N1 rows and (3×D1) columns, the first regression matrix includes first regression matrix data, second regression matrix data, and third regression matrix data, components of the first regression matrix data includes the first feature value matrix, components of the second regression matrix data include the first feature value matrix, and components of the third regression matrix data include a matrix of 0 components with N1 rows and D1 columns (Daumé: Section 3…to make predictions for target domain data, the target features are augmented using Φt(x) = ⟨x, 0, x⟩ and input to the trained model; under the claim's column ordering (M1=shared, M2=target-specific, M3=source-specific), the target augmented features become [x1, x1, 0], meaning the first regression matrix data contains x1, the second regression matrix data contains x1, and the third regression matrix data is the zero matrix; the model outputs predicted labels (regression labels) for the N1 target samples).

Per claim 5, Daumé combined with Paguio discloses claim 4. Daumé further teaches the first regression label has the N1 rows (Daumé: Section 3…when N1 augmented target samples are input to the trained model, the model produces N1 corresponding predictions, one per target sample).

Per claim 6, Daumé combined with Paguio discloses claim 4. Daumé combined with Paguio further teaches the acquisitor and the processor are configured to further perform a second operation (the second operation being analogous to the first operation but with second acquired data and second other data having Nq rows with D2 features, and generating a second machine learning model based on second generated data with the same three-block matrix structure; Daumé: Abstract…the technique "trivially extends to a multi-domain adaptation problem" where one can have data from a variety of different domains; performing the domain adaptation technique on a second dataset with different characteristics is a straightforward repeated application of the same method; Paguio: [0162]…"Network training and validation was performed for each wafer", i.e., the system performs repeated ML operations on different datasets). All limitations of claim 6 regarding the second generated matrix structure (fourth, fifth, and sixth matrix data with the same zero-padding pattern as the first operation, and (Nq+N2)/D2 being 250 or more) are taught by Daumé's feature augmentation technique applied to the second dataset, with the ratio constraint being a result-effective variable as discussed for claim 1. The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 7, Daumé combined with Paguio discloses claim 6. Daumé combined with Paguio further teaches the processor is configured to further perform a third operation, and the processor is configured to output information relating to comparison between the first regression label and the second regression label in the third operation (Paguio: [0055]-[0056]…the system evaluates and compares prediction accuracy across different models and configurations; comparing the outputs of two separately trained machine learning models is a fundamental practice in the ML art for model selection, validation, and decision-making. A person having ordinary skill in the art, having trained two domain-adapted models on different datasets as taught by Daumé, would naturally compare their respective prediction outputs to evaluate relative performance and support data-driven decisions, as this is a routine step in any ML pipeline that generates multiple models). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 8, Daumé combined with Paguio discloses claim 7. Daumé combined with Paguio further teaches the information includes at least one of a first result or a second result, the first result includes a comparison result between a maximum value of components of the first regression label and a maximum value of components of the second regression label, and the second result includes a comparison result between a minimum value of the components of the first regression label and a minimum value of the components of the second regression label (comparing the extrema (maximum and minimum values) of regression outputs from two different models is a standard, well-known, and conventional statistical analysis technique in the ML art for evaluating model behavior across prediction ranges, assessing model agreement, detecting outliers, and evaluating prediction bounds. A person having ordinary skill, having obtained regression labels from two domain-adapted models, would have found it obvious to compare the maximum and minimum predicted values as a basic diagnostic measure). The rationale to combine Paguio with Daumé is the same as the parent claim.
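The comparison recited in claim 8 can be illustrated with a short sketch. The Python below takes two regression labels as plain lists and reports the difference between their maxima (a first result) and between their minima (a second result). The function name `compare_extrema`, the dictionary keys, and the sample values are assumptions made for illustration; neither reference prescribes this form of output.

```python
from typing import Dict, Sequence


def compare_extrema(
    first_label: Sequence[float], second_label: Sequence[float]
) -> Dict[str, float]:
    """Compare the maxima and minima of two regression labels.

    Hypothetical helper: the comparison result is expressed here as a
    simple difference (first minus second).
    """
    if not first_label or not second_label:
        raise ValueError("both regression labels must be non-empty")
    return {
        "max_difference": max(first_label) - max(second_label),
        "min_difference": min(first_label) - min(second_label),
    }


# Assumed example predictions from two separately trained models.
first_regression_label = [0.5, 1.5, 1.0]
second_regression_label = [0.25, 2.0, 0.75]
result = compare_extrema(first_regression_label, second_regression_label)
print(result)
```

A difference is only one possible comparison result; a ratio or a boolean "greater than" flag would satisfy the same description.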

Per claim 9, Daumé combined with Paguio discloses claim 6. Daumé further teaches the second regression label has N2 rows (Daumé: Section 3…when N2 augmented target samples from the second dataset are input to the second trained model, the model produces N2 corresponding predictions).

Per claim 10, Daumé combined with Paguio discloses claim 1. Daumé further teaches (Np+N1)/D1 is 500 or more (as discussed for claim 1, the ratio (Np+N1)/D1 is a result-effective variable directly affecting model accuracy; selecting a ratio of 500 or more is a matter of routine optimization. Daumé's own experimental data demonstrates the use of training datasets with sample-to-feature ratios exceeding 500 (Table 1: e.g., CNN-Recap source domain has 2,000,000 examples with 368k features; multiple ACE-NER domains have 35,000-53,000 examples with 54k-113k features). A person having ordinary skill would have recognized that higher ratios generally yield more stable and accurate models and would have selected a ratio of 500 or more through routine experimentation). See MPEP § 2144.05(II)(A).

Per claim 11, Daumé combined with Paguio discloses claim 1. Daumé further teaches Np/N1 is 1.5 or more (Daumé: Section 1…the technique is designed for scenarios where source data (Np samples) is more abundant than target data (N1 samples). Daumé's experimental data confirms this, with source-to-target ratios well exceeding 1.5 in the reported experiments (e.g., Table 1: PubMed-POS has 950,028 source examples vs. 11,264 target examples, a ratio of approximately 84:1; CNN-Recap has 2,000,000 source examples vs. 39,684 target examples, approximately 50:1). Selecting a source-to-target ratio of 1.5 or more is well within the range contemplated by Daumé and represents an obvious parameter choice for a person having ordinary skill implementing domain adaptation).
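The source-to-target ratios cited for claim 11 can be checked with simple arithmetic. The sketch below uses the example counts exactly as quoted in this Office Action and tests each ratio Np/N1 against the claimed threshold of 1.5. The function and variable names are illustrative assumptions.

```python
def source_to_target_ratio(n_source: int, n_target: int) -> float:
    """Return Np / N1 for a source/target pair (hypothetical helper)."""
    if n_target <= 0:
        raise ValueError("n_target must be positive")
    return n_source / n_target


# Example counts as quoted above: (source examples, target examples).
examples = {
    "PubMed-POS": (950_028, 11_264),
    "CNN-Recap": (2_000_000, 39_684),
}

THRESHOLD = 1.5  # claimed lower bound on Np / N1

ratios = {
    name: source_to_target_ratio(n_src, n_tgt)
    for name, (n_src, n_tgt) in examples.items()
}
meets_threshold = {name: ratio >= THRESHOLD for name, ratio in ratios.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1, meets threshold: {meets_threshold[name]}")
```

Both quoted pairs round to the stated ratios of approximately 84:1 and 50:1.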

Per claim 12, Daumé combined with Paguio discloses claim 1. Daumé combined with Paguio further teaches the processor is configured to generate the first machine learning model by at least one selected from the group consisting of kernel regression, linear regression, Ridge regression, Lasso regression, Elastic Net, gradient boosting regression, random forest regression, k-nearest neighbor regression, and logistic regression (Daumé: Section 1…"use the result as input to a standard learning algorithm", i.e., the technique works with any standard algorithm; Section 3.1…the kernelized interpretation shows the technique is compatible with kernel methods including kernel regression; Paguio: [0055]…multiple linear regression (MLR) is discussed as a baseline; all listed regression algorithms are well-known standard machine learning algorithms that a person having ordinary skill would consider applying with the augmented features). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 13, Daumé combined with Paguio discloses claim 12. Daumé further teaches the kernel regression includes at least one of Gaussian process regression or SVR (Support Vector Regression) (Daumé: Section 3.1…the kernel interpretation of the feature augmentation is discussed with kernel K: X × X → ℝ; Daumé explicitly identifies SVMs as a compatible learning algorithm (Section 1: "standard supervised learning problem to which any standard algorithm may be applied (eg., maxent, SVMs, etc.)"), where SVR is the regression counterpart of SVMs, and Gaussian process regression is another well-known kernel-based regression method. Both GPR and SVR operate using kernels in the manner described in Daumé's Section 3.1, and a person having ordinary skill would have found it obvious to apply these standard kernel regression methods with Daumé's augmented kernel K̆).

Per claim 14, Daumé combined with Paguio discloses claim 1. Daumé combined with Paguio further teaches the first acquired data, the first other data, and the second acquired data include characteristics of a magnetic recording/reproducing device (Paguio: Abstract…"a method for predicting and optimizing magnetic core width of a write head", the data includes characteristics of magnetic recording heads used in magnetic disk drives; [0049]…the magnetic core width determines "the spacing of tracks on the disk and, therefore, determines the amount of data (areal density) of data that can be recorded to the media"). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 15, Daumé combined with Paguio discloses claim 14. Daumé combined with Paguio further teaches the characteristics include at least one selected from the group consisting of SNR, BER, Fringe BER, EWAC, MWW, OW, SOVA-BER, VMM, PRO, and NRRO (Paguio: [0053] teaches characteristics of a magnetic recording device including magnetic core width (MCW) which corresponds to MWW (Magnetic Write track Width) under broadest reasonable interpretation, where MCW is defined as "the actual dimension of a magnetic bit recorded by the magnetic write head on magnetic media", which is the physical measurement of the write track width produced by the magnetic head. Under BRI, MCW and MWW both measure the effective write width of the magnetic recording head, MCW from the perspective of the head's core dimension and MWW from the perspective of the resulting track on the medium. Since the magnetic core width directly determines and is functionally equivalent to the magnetic write track width, Paguio's MCW teaches the claimed MWW characteristic). The rationale to combine Paguio with Daumé is the same as the parent claim.

Per claim 16, Daumé combined with Paguio discloses claim 1. Daumé combined with Paguio further teaches the D1 is 1 (Daumé: Section 3…the feature augmentation works for any feature dimensionality F = D1 including D1 = 1, which is the simplest case with a single feature; Paguio's system predicts MCW (a single output parameter) based on input features, and a person having ordinary skill implementing Daumé's technique for a single-feature prediction scenario (D1 = 1) would have found this to be an obvious selection from the general disclosure of F ≥ 1). The rationale to combine Paguio with Daumé is the same as the parent claim.

Claims 17 and 19 are substantially similar in scope and spirit to claim 1. Claim 17 recites a data processing system with "one or a plurality of acquisitors" and "one or a plurality of processors" performing the same operations as claim 1, and claim 19 recites a data processing method with a processor performing the same first operation as claim 1. Therefore the rejection of claim 1 is applied accordingly.

Per claim 18, Daumé combined with Paguio discloses claim 17. Daumé combined with Paguio further teaches a part of the first operation is performed by a part of the one or plurality of processors, and another part of the first operation is performed by another part of the one or plurality of processors (distributing machine learning computations across multiple processors or processing units is well-known and conventional in the computing arts, such that parallelizing ML training and prediction operations across multiple processors (e.g., multi-core CPUs, GPU clusters, distributed computing nodes) is a well-established practice for improving computational efficiency. Paguio: [0043]…the system includes a microprocessor and logic control circuits providing processing capability. A person of ordinary skill in the art would have found it obvious to distribute Daumé's feature augmentation and model training operations across multiple processing units for improved computational performance, as the matrix operations involved in feature augmentation and ML training are inherently parallelizable). The rationale to combine Paguio with Daumé is the same as the parent claim.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Patents and/or related publications are cited in the Notice of References Cited (Form PTO-892) attached to this action to further show the state of the art with respect to transfer learning using small-scale target data plus large-scale source data.
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALAN CHEN whose telephone number is (571) 272-4143. The examiner can normally be reached M-F 10-7.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ALAN CHEN/
Primary Examiner, Art Unit 2125
Prosecution Timeline

Jan 26, 2023: Application Filed
Apr 27, 2026: Non-Final Rejection mailed, §103 and §112 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/559,163, Patent 12632725: NEURAL NETWORK PROCESSING (4y 4m to grant; granted May 19, 2026)
18/075,521, Patent 12632796: TRAINING MACHINE LEARNING MODELS TO PREDICT CHARACTERISTICS OF ADVERSE EVENTS USING INTERMITTENT DATA (3y 5m to grant; granted May 19, 2026)
18/110,830, Patent 12632757: FIRST-QUANTIZATION BLOCK ENCODING FOR QUANTUM EMULATION (3y 3m to grant; granted May 19, 2026)
17/798,038, Patent 12626090: HIERARCHICAL NEUROMORPHIC SENSOR ARRAY WITH INTEGRATED LEARNING FOR PHYSICOCHEMICAL PROPERTY PREDICATION (3y 9m to grant; granted May 12, 2026)
17/843,801, Patent 12626175: SPECTRAL CLUSTERING OF GRAPHS ON FAULT TOLERANT AND NOISY QUANTUM DEVICES (3y 11m to grant; granted May 12, 2026)
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 91%
With Interview (+6.3%): 97%
Median Time to Grant: 2y 9m (~0m remaining)
PTA Risk: Low
Based on 1134 resolved cases by this examiner. Grant probability derived from career allowance rate.