DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the amendment filed October 20, 2025. Claims 1-20 are pending in the current application.
Specification
The disclosure is objected to because of the following informalities:
Raw Sensor Data (308), FIG. 3, is not described in the specification.
Normalized Self-Attention Metrics (432), FIG. 4, is not described in the specification.
Predicted Cognitive Load (452), FIG. 4, is not described in the specification.
Paragraph 0024 labels 104 as a computing device, but FIG. 1, as well as the preceding paragraphs, labels 104 as a user. (The reference should be 108.)
Paragraph 0025 labels 108 as internal sensors, but FIG. 1, as well as the preceding paragraphs, labels 108 as a computing device. (The reference should be 112.)
Paragraph 0026 labels 104 as a computing device, but FIG. 1, as well as the preceding paragraphs, labels 104 as a user. (The reference should be 108.)
Paragraph 0068 labels 424 as a “single self-attention vector,” but FIG. 4 labels 424 as “Self-Attention Metrics.”
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 1 is directed to a process, which falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, claim 1 recites:
deriving, from the sensor measurements of the sensor, a set of features predictive of a cognitive load of the user;
generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor;
and defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor
These limitations, as drafted, are processes that, under the broadest reasonable interpretation, fall within the “mathematical concepts” grouping of abstract ideas. The claim must therefore be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task
Each sensor of the two or more sensors
generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector;
generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task
and outputting, by the computing device, the indication of the cognitive load of the user.
The limitations “each sensor of the two or more sensors” and “generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector” merely point out the technological environment and field of use, and “generally link” each sensor of the two or more sensors and the feature vectors of the sensors to the abstract idea. (See MPEP 2106.05(h)) The limitation “generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task” amounts to mere instructions to apply an exception, as it merely uses the sensors, the feature vectors, and a trained machine-learning model as tools to perform an existing process. (See MPEP 2106.05(f)) Lastly, “receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task” and “outputting, by the computing device, the indication of the cognitive load of the user” are additional elements that do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g))
Under a Step 2B analysis, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Under a Berkheimer analysis, “receiving, by a computing device, sensor measurements from each of two or more sensors…” and “outputting, by the computing device, the indication of the cognitive load of the user” are well-understood, routine, and conventional. (“Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362”) (See MPEP 2106.05(d)(II)) Therefore, the claim is not eligible.
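For context only, the operations identified above as falling within the “mathematical concepts” grouping (deriving a set of features, generating a self-attention vector, and combining the two into a per-sensor feature vector) describe computations of the following general kind. The sketch below is a hypothetical NumPy illustration; the variable names and the specific attention formulation are assumptions and do not represent the applicant’s disclosed implementation.

```python
import numpy as np

def self_attention_vector(features):
    """Score each feature relative to the other features of the same sensor
    via a dot-product self-attention over a 1-D feature set (hypothetical)."""
    f = np.asarray(features, dtype=float).reshape(-1, 1)  # one feature per row
    scores = f @ f.T / np.sqrt(f.shape[1])                # pairwise similarity
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # row-wise softmax
    return weights @ f.ravel()                            # attention-weighted summary

def sensor_feature_vector(features):
    """Combine the raw feature set with its self-attention vector
    (here by simple concatenation) into one per-sensor feature vector."""
    features = np.asarray(features, dtype=float)
    return np.concatenate([features, self_attention_vector(features)])

vec = sensor_feature_vector([0.2, 1.5, 0.7])
print(vec.shape)  # 3 raw features + 3 attention components -> (6,)
```

In this sketch, as in the claim language, every step operates only on numeric vectors, which is why the limitations are characterized as mathematical concepts.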
Regarding claim 8, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 8 is directed to a system, which falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, claim 8 recites:
deriving, from the sensor measurements of the sensor, a set of features predictive of a cognitive load of the user;
generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor;
and defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor
These limitations, as drafted, are processes that, under the broadest reasonable interpretation, fall within the “mathematical concepts” grouping of abstract ideas. The claim must therefore be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
one or more processors
a non-transitory computer-readable medium
receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task
each sensor of the two or more sensors
generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector;
generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task
and outputting, by the computing device, the indication of the cognitive load of the user.
The limitations “each sensor of the two or more sensors” and “generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector” merely point out the technological environment and field of use, and “generally link” each sensor of the two or more sensors and the feature vectors of the sensors to the abstract idea. (See MPEP 2106.05(h)) The limitations “one or more processors”, “a non-transitory computer-readable medium”, and “generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task” amount to mere instructions to apply an exception, as they merely use the sensors, the feature vectors, a trained machine-learning model, one or more processors, and a non-transitory computer-readable medium as tools to perform an existing process. (See MPEP 2106.05(f)) Lastly, “receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task” and “outputting, by the computing device, the indication of the cognitive load of the user” are additional elements that do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g))
Under a Step 2B analysis, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Under a Berkheimer analysis, “receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task” and “outputting, by the computing device, the indication of the cognitive load of the user” are well-understood, routine, and conventional. (“Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362”) (See MPEP 2106.05(d)(II)) Therefore, the claim is not eligible.
Regarding claim 15, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 15 is directed to a manufacture, which falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, claim 15 recites:
deriving, from the sensor measurements of the sensor, a set of features predictive of a cognitive load of the user;
generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor;
and defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor
These limitations, as drafted, are processes that, under the broadest reasonable interpretation, fall within the “mathematical concepts” grouping of abstract ideas. The claim must therefore be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
a non-transitory computer-readable medium
receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task
each sensor of the two or more sensors
generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector;
generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task
and outputting, by the computing device, the indication of the cognitive load of the user.
The limitations “each sensor of the two or more sensors” and “generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector” merely point out the technological environment and field of use, and “generally link” each sensor of the two or more sensors and the feature vectors of the sensors to the abstract idea. (See MPEP 2106.05(h)) The limitations “a non-transitory computer-readable medium” and “generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task” are considered to be mere instructions to apply an exception, as they merely use the sensors, the feature vectors, a trained machine-learning model, and a non-transitory computer-readable medium as tools to perform an existing process. (See MPEP 2106.05(f)) Lastly, “receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task” and “outputting, by the computing device, the indication of the cognitive load of the user” are additional elements that do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Under a Step 2B analysis, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Under a Berkheimer analysis, “receiving, by a computing device, sensor measurements from each of two or more sensors, wherein the sensor measurements correspond to characteristics of a user during performance of a task” and “outputting, by the computing device, the indication of the cognitive load of the user” are well-understood, routine, and conventional.
(“Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362”) (See MPEP 2106.05(d)(II)) Therefore, the claim is not eligible.
Regarding claims 2, 9, and 16, they recite “The method/system/non-transitory computer-readable medium of claim 1/8/15, wherein generating the input feature vector includes deriving a tensor product of the feature vector of the self-attention vector and the set of features.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1, 8, and 15 are rejected under 35 U.S.C. 101, claims 2, 9, and 16 are not eligible.
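For context only, the recited tensor product is, in the abstract, an outer product of two vectors, i.e., a further mathematical operation. A minimal hypothetical NumPy sketch (the values and names are assumptions, not the applicant’s implementation):

```python
import numpy as np

# Hypothetical per-sensor data: a set of features and its self-attention vector.
features = np.array([0.2, 1.5, 0.7])
attention = np.array([0.5, 0.3, 0.2])

# The tensor (outer) product pairs every attention weight with every feature.
tensor_product = np.outer(attention, features)  # shape (3, 3)

# Flattening would yield one input feature vector for a downstream model.
input_vector = tensor_product.ravel()
print(input_vector.shape)  # (9,)
```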
Regarding claims 3, 10, and 17, they recite “The method/system/non-transitory computer-readable medium of claim 1/8/15, wherein generating the input feature vector includes: aggregating the feature vector of each of the one or more sensors.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1, 8, and 15 are rejected under 35 U.S.C. 101, claims 3, 10, and 17 are not eligible.
Regarding claims 4, 11, and 18, they recite “The method/system/non-transitory computer-readable medium of claim 1/8/15, further comprising: executing a feature projection on the set of features, wherein the feature projection is executed before the self-attention vector is generated.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1, 8, and 15 are rejected under 35 U.S.C. 101, claims 4, 11, and 18 are not eligible.
Regarding claims 5, 12, and 19, they recite “The method/system/non-transitory computer-readable medium of claim 1/8/15, wherein deriving the set of features of a first sensor of the one or more sensors includes: filtering the sensor measurements based on a predetermined frequency relative to a type of the first sensor; executing an artifact removal process to remove artifacts in the sensor measurements; and extracting, from the sensor measurements, a plurality of features using a spectral density analysis.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1, 8, and 15 are rejected under 35 U.S.C. 101, claims 5, 12, and 19 are not eligible.
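For context only, the recited preprocessing chain (frequency filtering, artifact removal, and spectral-density feature extraction) describes conventional signal-processing steps of the following general shape. The sketch below uses a Butterworth band-pass filter, a simple amplitude-clipping artifact step, and a Welch power spectral density with hypothetical cutoffs; it is illustrative only and is not the applicant’s implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

fs = 256.0  # hypothetical sampling rate, Hz
rng = np.random.default_rng(0)
raw = rng.standard_normal(int(fs * 10))  # 10 s of simulated sensor data

# 1. Band-pass filter at a frequency range chosen for the sensor type.
b, a = butter(4, [1.0, 40.0], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, raw)

# 2. Crude artifact removal: clip samples beyond 4 standard deviations.
limit = 4 * filtered.std()
cleaned = np.clip(filtered, -limit, limit)

# 3. Spectral-density analysis: mean band power from a Welch PSD.
freqs, psd = welch(cleaned, fs=fs, nperseg=512)
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
features = [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands.values()]
print(len(features))  # one power feature per band
```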
Regarding claims 6, 13, and 20, they recite “The method/system/non-transitory computer-readable medium of claim 1/8/15, wherein the computing device is a mobile device and a first sensor of the one or more sensors is positioned within a wearable device.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1, 8, and 15 are rejected under 35 U.S.C. 101, claims 6, 13, and 20 are not eligible.
Regarding claims 7 and 14, they recite “The method/system of claim 1/8, further comprising: normalizing the self-attention vector according to a softmax function before defining the feature vector.” These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea and merely impose insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, for the same reasons that claims 1 and 8 are rejected under 35 U.S.C. 101, claims 7 and 14 are not eligible.
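For context only, the recited softmax normalization is a standard mathematical operation that rescales a vector into a probability distribution. A minimal hypothetical sketch (values are assumptions):

```python
import numpy as np

def softmax(v):
    """Normalize a vector so its components are positive and sum to 1;
    subtracting the max is a standard numerical-stability step."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

normalized = softmax(np.array([2.0, 1.0, 0.1]))
print(normalized.sum())  # components sum to 1
```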
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 7-11, and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Teplitzky et al. (herein referred to as Teplitzky) (U.S. Patent Application Publication No. US 20220013240 A1) in view of Xiong et al. (herein referred to as Xiong) (Pattern Recognition of Cognitive Load Using EEG and ECG Signals) and in further view of Yamamoto et al. (herein referred to as Yamamoto) (U.S. Patent Application Publication No. US 20220415506 A1).
Regarding claim 1, Teplitzky teaches receiving, by a computing device, sensor measurements from each of two or more sensors, (“multiple channels of ECG data. For example, each sensor pair on the device may detect a different channel of ECG data. That is, a device might include 1 sensor pair for detecting 1 channel of data, 3 sensor pairs for detecting 3 channels of data, 12 sensor pairs for detecting 12 channels of data, or even 64 sensor pairs for detecting 64 channels of data.”, Paragraph 21) (Each channel represents a different sensor, which teaches the limitation.) from the sensor measurements of the sensor, a set of features (“The deep learning architecture 800, illustrated in FIG. 8, leverages an existing single channel architecture (e.g., the architecture 700 illustrated in FIG. 7) to classify beats and rhythms based on multiple channels of ECG data. In an embodiment, this is done by removing the final classification layers from the existing, fully-trained architecture (e.g., the fully connected layer 732 and the softmax 734). This produces a feature extraction network, which generates a feature vector.”, Paragraph 66) (A set of features is extracted from the ECG data.) and generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector; (“Each channel of data is analyzed to produce its own feature vector, and the feature vectors are passed to a new set of classification layers”, Paragraph 66) (Each sensor has a channel of data that produces a feature vector, corresponding to an input feature vector.)
However, Teplitzky does not explicitly teach sensor measurements corresponding to characteristics of a user during performance of a task, nor a set of features predictive of a cognitive load of the user; nor generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; nor defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor; nor generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task; nor outputting, by the computing device, the indication of the cognitive load of the user.
Xiong teaches sensor measurements corresponding to characteristics of a user during performance of a task, (“There are two ways to improve the accuracy of CL vs. BL and CLMM vs. CLM classification. The first one is to enlarge the amount of data samples, and the second one is to use other math tasks which elicit CLMM and CLM states of the college students (i.e., the subjects).”, pgs. 10-11) (The sensors measure cognitive load during a task, which is then classified as cognitive load mismatching or matching.) a set of features predictive of a cognitive load of the user; (“For the CLMM [cognitive load mismatching] vs. CLM [cognitive load matching] problem, we use two HRV features and two EEG features to get better classification results than those in literature [19]. By using the real e-learning ECG data, we got a validation accuracy of 65.5% F1 score, much higher than that of a random guess.”, pg. 10, under “4. Discussion”) (The EEG features help with cognitive load classification.) generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task; (“Each sample was described as 3-dimension vector of the critical ECG features of Model A, i.e., Area, LF and ApEn. Finally, we got 65.5% F1 score of the validation accuracy of Model A in the real e-learning status, and the confusion matrix is shown in Table 8.”, pg. 10, under “3.3. Validation with E-Learning Data”) and outputting, by the computing device, the indication of the cognitive load of the user. (“By using the combination of one EEG feature (BP_F4) and three HRV features (Mean, LF and ApEn), the DT classifier has classified the CL and the BL states with the accuracy of 96.3%, showing that the CL and BL states are distinguishable in the level of active state of mind.”, pg. 11, under “5. Conclusions”) (The model outputs a classification of cognitive load, teaching the limitation.)
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this application, to combine the sensors and computing device of Teplitzky with the method and model for predicting cognitive load, as disclosed in Xiong. One would be motivated to combine the two teachings, as Xiong’s method allows sensor measurements to be used to measure cognitive load. (“The matching of cognitive load and working memory is the key for effective learning, and cognitive effort in the learning process has nervous responses which can be quantified in various physiological parameters. Therefore, it is meaningful to explore automatic cognitive load pattern recognition by using physiological measures.”, pg. 1, Abstract)
However, the combination does not teach generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; nor defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor;
Yamamoto teaches generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; (“The self-attention mechanism ATT1 calculates a weighted average to obtain a feature vector”, Paragraph 81; see also Figure 14.) (The vector obtained by the self-attention mechanism corresponds to a self-attention vector. Combined with the set of features from the sensors of the combination of Teplitzky and Xiong, this teaches the limitation.) and defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor; (“The LSTM further abstracts an abstracted feature vector as series data. Specifically, for example, feature vectors are sequentially received as series data, and non-linear transformation is performed repeatedly in consideration of past abstracted information. The self-attention mechanism ATT2 obtains a feature vector in consideration of degree of importance of each date and time for a series feature vector {h.sub.t}.sup.T.sub.t=1 (T is reference date and time) abstracted by the LSTM.”, Paragraphs 84 and 85; see Figure 14.) (The inputs represent behavior data as feature vector data, which corresponds to a set of features. They are input into a self-attention mechanism to generate a self-attention vector. That vector is then sent to an LSTM network, which generates series data whose feature vectors combine an original set of features and a self-attention vector, teaching the limitation.)
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this application, to use the sensors and indication of cognitive load of Teplitzky and Xiong, and combine them with the prediction system of Yamamoto. One of ordinary skill in the art could combine the two teachings and arrive at the claimed invention with predictable results, or be motivated to combine the teachings, as using Yamamoto’s method and behavior data “enables estimation of a [psychological] state felt by the user with high accuracy.” (Paragraph 9 of Yamamoto)
Regarding claim 8, Teplitzky teaches a system comprising one or more processors, (“The system includes a processor and a memory storing a program, which, when executed on the processor, performs an operation.”, Paragraph 17) a non-transitory computer-readable medium storing instructions (“Any combination of one or more computer readable medium(s) may be utilized.”, Paragraph 80) receiving, by a computing device, sensor measurements from each of two or more sensors, (“multiple channels of ECG data. For example, each sensor pair on the device may detect a different channel of ECG data. That is, a device might include 1 sensor pair for detecting 1 channel of data, 3 sensor pairs for detecting 3 channels of data, 12 sensor pairs for detecting 12 channels of data, or even 64 sensor pairs for detecting 64 channels of data.”, Paragraph 21) (Each channel represents a different sensor, which teaches the limitation.) from the sensor measurements of the sensor, a set of features (“The deep learning architecture 800, illustrated in FIG. 8, leverages an existing single channel architecture (e.g., the architecture 700 illustrated in FIG. 7) to classify beats and rhythms based on multiple channels of ECG data. In an embodiment, this is done by removing the final classification layers from the existing, fully-trained architecture (e.g., the fully connected layer 732 and the softmax 734). This produces a feature extraction network, which generates a feature vector.”, Paragraph 66) (A set of features is extracted from the ECG data.) and generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector; (“Each channel of data is analyzed to produce its own feature vector, and the feature vectors are passed to a new set of classification layers”, Paragraph 66) (Each sensor has a channel of data that produces a feature vector, corresponding to an input feature vector.)
However, Teplitzky does not explicitly teach sensor measurements corresponding to characteristics of a user during performance of a task, nor a set of features predictive of a cognitive load of the user; nor generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; nor defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor; nor generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task; nor outputting, by the computing device, the indication of the cognitive load of the user.
Xiong teaches sensor measurements corresponding to characteristics of a user during performance of a task, (“There are two ways to improve the accuracy of CL vs. BL and CLMM vs. CLM classification. The first one is to enlarge the amount of data samples, and the second one is to use other math tasks which elicit CLMM and CLM states of the college students (i.e., the subjects).”, pgs. 10-11) (The sensors measure cognitive load during a task, which is then classified as cognitive load mismatching or matching.) a set of features predictive of a cognitive load of the user; (“For the CLMM [cognitive load mismatching] vs. CLM [cognitive load matching] problem, we use two HRV features and two EEG features to get better classification results than those in literature [19]. By using the real e-learning ECG data, we got a validation accuracy of 65.5% F1 score, much higher than that of a random guess.”, pg. 10, under “4. Discussion”) (The EEG features help with cognitive load classification.) generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task; (“Each sample was described as 3-dimension vector of the critical ECG features of Model A, i.e., Area, LF and ApEn. Finally, we got 65.5% F1 score of the validation accuracy of Model A in the real e-learning status, and the confusion matrix is shown in Table 8.”, pg. 10, under “3.3. Validation with E-Learning Data”) and outputting, by the computing device, the indication of the cognitive load of the user. (“By using the combination of one EEG feature (BP_F4) and three HRV features (Mean, LF and ApEn), the DT classifier has classified the CL and the BL states with the accuracy of 96.3%, showing that the CL and BL states are distinguishable in the level of active state of mind.”, pg. 11, under “5. Conclusions”) (The model outputs a classification of cognitive load, teaching the limitation.)
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this application, to combine the sensors and computing device of Teplitzky with the method and model for predicting cognitive load, as disclosed in Xiong. One would be motivated to combine the two teachings, as Xiong’s method allows sensor measurements to be used to measure cognitive load. (“The matching of cognitive load and working memory is the key for effective learning, and cognitive effort in the learning process has nervous responses which can be quantified in various physiological parameters. Therefore, it is meaningful to explore automatic cognitive load pattern recognition by using physiological measures.”, pg. 1, Abstract)
However, the combination does not teach generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; nor defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor;
Yamamoto teaches generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; (“The self-attention mechanism ATT1 calculates a weighted average to obtain a feature vector”, Paragraph 81; See also Figure 14.) (The vector obtained by the self-attention mechanism corresponds to a self-attention vector. Combined with the set of features from the sensors of Teplitzky and Xiong, this teaches the limitation.) and defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor; (“The LSTM further abstracts an abstracted feature vector as series data. Specifically, for example, feature vectors are sequentially received as series data, and non-linear transformation is performed repeatedly in consideration of past abstracted information. The self-attention mechanism ATT2 obtains a feature vector in consideration of degree of importance of each date and time for a series feature vector {h.sub.t}.sup.T.sub.t=1 (T is reference date and time) abstracted by the LSTM.”, Paragraphs 84 and 85; See Figure 14.) (The inputs represent behavior data as feature vector data, which corresponds to a set of features. The feature vectors are input into a self-attention mechanism to generate a self-attention vector. That vector is then sent to an LSTM network, which generates series data whose sequential feature vectors derive from a combination of the original set of features and the self-attention vector, teaching the limitation.)
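For illustration only, the kind of self-attention weighted average Yamamoto’s paragraph 81 describes can be sketched as follows; the feature values, scores, and the choice of concatenation as the combining step are hypothetical and are not drawn from the cited references:

```python
import math

def softmax(scores):
    # Normalize raw importance scores into weights that sum to one.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(feature_vectors, scores):
    # Weighted average of feature vectors: each vector's contribution is
    # characterized relative to the others by its normalized weight.
    weights = softmax(scores)
    dim = len(feature_vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, feature_vectors))
            for i in range(dim)]

# Hypothetical set of features for one sensor (three timesteps, two
# features each) and hypothetical importance scores.
features = [[0.8, 0.1], [0.5, 0.4], [0.2, 0.9]]
scores = [2.0, 0.5, 1.0]
attn_vector = attention_pool(features, scores)

# One possible way to "define a feature vector for the sensor by combining
# the set of features and the self-attention vector": concatenate a summary
# of the features (here, the last timestep) with the self-attention vector.
feature_vector = features[-1] + attn_vector
```

Because the weights are normalized, the pooled self-attention vector is a convex combination of the input feature vectors, so each of its components stays within the range spanned by the corresponding input features.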
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this application, to use the sensors and indication of cognitive load of Teplitzky and Xiong, and combine them with the prediction system of Yamamoto. One of ordinary skill in the art could have combined the two teachings and arrived at the claimed invention with predictable results, and would have been motivated to combine the teachings, as using Yamamoto’s method and behavior data “enables estimation of a [psychological] state felt by the user with high accuracy.” (Paragraph 9 of Yamamoto)
Regarding claim 15, Teplitzky teaches a non-transitory computer-readable medium storing instructions (“Any combination of one or more computer readable medium(s) may be utilized.”, Paragraph 80) receiving, by a computing device, sensor measurements from each of two or more sensors, (“multiple channels of ECG data. For example, each sensor pair on the device may detect a different channel of ECG data. That is, a device might include 1 sensor pair for detecting 1 channel of data, 3 sensor pairs for detecting 3 channels of data, 12 sensor pairs for detecting 12 channels of data, or even 64 sensor pairs for detecting 64 channels of data.”, Paragraph 21) (Each channel represents a different sensor, which teaches the limitation.) from the sensor measurements of the sensor, a set of features (“The deep learning architecture 800, illustrated in FIG. 8, leverages an existing single channel architecture (e.g., the architecture 700 illustrated in FIG. 7) to classify beats and rhythms based on multiple channels of ECG data. In an embodiment, this is done by removing the final classification layers from the existing, fully-trained architecture (e.g., the fully connected layer 732 and the softmax 734). This produces a feature extraction network, which generates a feature vector.”, Paragraph 66) (A set of features is extracted from the ECG data.) and generating, from feature vectors of at least two sensors of the two or more sensors, an input feature vector; (“Each channel of data is analyzed to produce its own feature vector, and the feature vectors are passed to a new set of classification layers”, Paragraph 66) (Each sensor has a channel of data that produces a feature vector; the combined feature vectors correspond to an input feature vector.)
However, Teplitzky does not explicitly teach sensor measurements corresponding to characteristics of a user during performance of a task, nor a set of features predictive of a cognitive load of the user; nor generating, from the set of features, a self-attention vector that characterizes each feature of the set of features for the sensor relative to another feature of the set of features for the sensor; nor defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor; nor generating, by a trained machine-learning model using the input feature vector, an indication of the cognitive load of the user during performance of the task; nor outputting, by the computing device, the indication of the cognitive load of the user.
Claim 15 recites limitations substantially similar to those of claim 1. Accordingly, Xiong teaches sensor measurements corresponding to characteristics of a user during performance of a task, a set of features predictive of a cognitive load of the user, the generation by a trained machine-learning model of an indication of the cognitive load of the user during performance of the task, and the outputting of that indication; and Yamamoto teaches generating, from the set of features, a self-attention vector and defining a feature vector for the sensor by combining the set of features and the self-attention vector, all for the same reasons set forth above with respect to claim 1. The same motivations to combine Teplitzky, Xiong, and Yamamoto apply.
Regarding claims 2, 9, and 16, Teplitzky, as modified by Xiong and Yamamoto, teaches the generation of an input feature vector including deriving a tensor product of the feature vector of the self-attention vector and the set of features. ("The self-attention mechanism… obtains a feature vector in consideration of degree of importance of each date and time for a series feature vector… abstracted by the LSTM. [Long Short-Term Memory] … A weight… corresponding to importance of each feature vector is obtained by two total binding layers in the self-attention mechanism ATT2, like the self-attention mechanisms ATT1. A first total binding layer of two total binding layers outputs a context vector of any size for h.sub.t as input, and a second total binding layer outputs a scalar value corresponding to importance α.sub.t for a context vector as input. The context vector may undergo non-linear transformation. ", pg. 5, right column, Paragraph 85; Fig. 14, ATT1 and ATT2 (Yamamoto)) (The context vector corresponds to a tensor product.)
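For illustration only, one common reading of a tensor product of two vectors is their outer product; the sketch below uses hypothetical values and is not code from the cited record:

```python
def outer_product(a, b):
    # Tensor (outer) product of two vectors: an |a| x |b| matrix pairing
    # every element of a with every element of b.
    return [[x * y for y in b] for x in a]

# Hypothetical self-attention vector and set of features for one sensor.
attn_vector = [0.6, 0.4]
features = [1.0, 2.0, 3.0]
product = outer_product(attn_vector, features)
# product is a 2x3 matrix; each entry weights one feature by one
# attention component.
```

Under this reading, the resulting matrix (or its flattening) could serve as the input feature vector derived from the self-attention vector and the set of features.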
Regarding claims 3, 10, and 17, Teplitzky, as modified by Xiong and Yamamoto, teaches the generation of an input feature vector including: aggregating the feature vector of each of the two or more sensors. ("Each channel of data is analyzed to produce its own feature vector, and the feature vectors are passed to a new set of classification layers", Paragraph 66 (Teplitzky)) (The classifier aggregates the feature vectors.)
Regarding claims 4, 11, and 18, Teplitzky, as modified by Xiong and Yamamoto, teaches the execution of a feature projection on the set of features, wherein the feature projection is executed before the self-attention vector is generated. ("the feature extraction unit... scans for each column of behavior data, and converts data of a character string type and a time type into numerical data. For example, data of a character string type is converted into a vector of one-hot representation relating to a corresponding dimension", Paragraph 69 (Yamamoto))
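For illustration only, the feature projection Yamamoto describes in paragraph 69 (converting character-string data into one-hot numerical vectors) could be sketched as follows, using hypothetical category data:

```python
def one_hot_encode(values):
    # Project string-typed data into numerical vectors: one dimension per
    # distinct category, with 1.0 in the value's dimension and 0.0 elsewhere.
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [[1.0 if index[v] == i else 0.0 for i in range(len(categories))]
            for v in values]

encoded = one_hot_encode(["walk", "sleep", "walk"])
# → [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  (categories sorted: sleep, walk)
```

Such a projection yields purely numerical feature vectors before any self-attention vector is generated, matching the ordering recited in the claims.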
Regarding claims 7 and 14, Teplitzky, as modified by Xiong and Yamamoto, teaches normalization of the self-attention vector according to a softmax function before defining the feature vector. ("The total binding layer FC3 converts a feature vector weight averaged by the self-attention mechanism ATT2 into a vector of dimensions as many as types of psychological states of the target user and calculates a probability vector for each psychological state. Here, a softmax function or the like is used to perform non-linear transformation such that the sum of all elements (for example, low, medium, high) of a characteristic as output becomes one." Paragraph 86; Fig. 14, ATT2 and FC3 (Yamamoto))
Claim(s) 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Teplitzky in view of Xiong in further view of Yamamoto and in further view of Gajbhiye et al. (hereinafter referred to as Gajbhiye) (Wavelet Domain Optimized Savitzky–Golay Filter for the Removal of Motion Artifacts From EEG Recordings).
Regarding claims 5, 12, and 19, Teplitzky, as modified by Xiong and Yamamoto, teaches the derivation of a set of features of a first sensor of the one or more sensors, but does not explicitly teach the filtering of sensor measurements based on a predetermined frequency relative to a type of the first sensor, nor the execution of an artifact removal process to remove artifacts in the sensor measurements, nor the extraction, from the sensor measurements, of a plurality of features using a spectral density analysis.
Gajbhiye teaches the filtering of sensor measurements based on a predetermined frequency relative to a type of the first sensor (“The first database consists of 23 EEG recordings, and each of the EEG recorded data has a sampling frequency (Fs) of 2048 Hz…”, pg. 2, under ‘EEG Databases’) and the execution of an artifact removal process to remove artifacts in the sensor measurements, (“Therefore, this article is the first to use the optimized SG filter in the wavelet domain for the removal of motion artifacts from the single-channel EEG signal...”, pg. 2, left column, near the bottom of the page, paragraph 2) and the extraction, from the sensor measurements, of a plurality of features using a spectral density analysis. ("…mean absolute error in power spectral density (MAE-PSD) …In this study, we have evaluated the optimal value of the SG filter based on the minimization of MAE in the PSD value between the δ-bands of contaminated and cleaned EEG signals", pg. 1, under ‘Index Terms’; pg. 4, right column, paragraph 3) Gajbhiye also discloses the benefits of using spectral density and filtering: filtering sensor measurements leads to better denoising performance and computational feasibility ("better denoising performance and computational feasibility", pg. 2, bottom of paragraph 1), and spectral density aids in obtaining metrics reflecting performance, fidelity, and optimization. (“Motivated from this, we have formulated the optimization problem based on the minimization of MAE [Mean Absolute Error] in PSD [Power Spectral Density] between δ-bands (MAEδ PSD) of contaminated and cleaned EEG [Electroencephalogram] signals… In this work, we have considered three fidelity measures to evaluate the objective quality of the cleaned EEG signal obtained using the proposed WOSG [wavelet domain optimized Savitzky–Golay] filtering approach. These measures are the change in SNR [signal-to-noise ratio] (ΔSNR), the percentage reduction in coefficient of correlation (η), and the MAE in PSD of δ-bands between motion artifact intermixed and cleaned EEG signals, respectively… It has been observed that the PSD plots for cleaned and reference EEG signals are overlapped with a MAEδ PSD value of 0.0220. This shows the effectiveness of the proposed WOSG filtering approach for the removal of motion artifacts from the EEG recording.”, pg. 4, right column, bottom of the page; pg. 5, left column, under "Fidelity Measures"; pg. 6, left column, under "A. Performance Evaluation Using Database 1 EEG Signals")
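For illustration only, a plain (non-wavelet-domain) Savitzky–Golay smoothing pass of the kind Gajbhiye optimizes can be sketched with the standard closed-form convolution coefficients for window length 5 and polynomial order 2; the input signal is hypothetical, and this is not the WOSG filter of the reference:

```python
def savgol_smooth_w5_p2(signal):
    # Savitzky-Golay smoothing: a least-squares quadratic fit over each
    # 5-sample window, reduced to a fixed convolution with coefficients
    # (-3, 12, 17, 12, -3)/35. Edge samples are left unfiltered here.
    coeffs = [-3.0, 12.0, 17.0, 12.0, -3.0]
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(c * signal[i + k - 2] for k, c in enumerate(coeffs)) / 35.0
    return out

# Trends of quadratic or lower order pass through unchanged, while
# higher-frequency wiggles (e.g., motion artifacts) are attenuated.
smoothed = savgol_smooth_w5_p2([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

The polynomial fit is what distinguishes this from a simple moving average: a linear ramp like the one above is reproduced exactly rather than flattened.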
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this invention, to combine the teachings of Teplitzky, as modified by Xiong and Yamamoto, with the filtering of sensor measurements disclosed by Gajbhiye. One of ordinary skill in the art would be motivated to do so, as filtering sensor measurements leads to better denoising performance and computational feasibility, and spectral density analysis aids in obtaining metrics reflective of performance, fidelity, and optimization.
Claim(s) 6, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Teplitzky in view of Xiong in further view of Yamamoto and in further view of Heneghan et al. (hereinafter referred to as Heneghan) (U.S. Patent No. US 11191466 B1).
Regarding claims 6, 13, and 20, Teplitzky, as modified by Xiong and Yamamoto, teaches the method as claimed in claim 1 and a computing device, but does not teach that the computing device is a mobile device, nor a first sensor of the one or more sensors positioned within a wearable device. Heneghan teaches a mobile device and a first sensor of the one or more sensors positioned within a wearable device. ("Various illustrative embodiments capture and consider objective physiological data non-invasively obtained through wearable monitoring device sensors and logged, such as activity, sleep, heart rate (“HR”), and the like…In various embodiments, approaches discussed herein may be performed by one or more of: firmware operating on a monitoring device or a secondary device, such as a mobile device paired to the monitoring device, a server, host computer, and the like.", Column 3, lines 15-18; Column 27, lines 5-8; FIGS. 1A and 1B)
Therefore, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date of this invention, to combine the teachings of Teplitzky, as modified by Xiong and Yamamoto, with Heneghan’s device. Placing the prediction method disclosed by the combination into a wearable device such as the one disclosed by Heneghan would have yielded predictable results, and would therefore have been obvious to one of ordinary skill in the art.
Response to Arguments
Applicant's arguments filed October 20th, 2025 have been fully considered but they are not fully persuasive. The applicant has not responded to the objections set forth in the current action, and those objections are therefore maintained.
The applicant argues in substance,
Argument 1: Claims 1, 8, and 15 are not directed to a judicial exception. Specifically, the generation of a self-attention vector is something the human mind is not equipped to do.
The examiner respectfully disagrees. The previous action identified the particular generation step as a “mathematical concept”, which is a grouping of abstract ideas. (See MPEP 2106.04(a)(2)(I)) Regardless of whether the limitation is a mental process that can be performed with pen and paper, or a mathematical concept, it is still interpreted, under the broadest reasonable interpretation, as an abstract idea.
Argument 2: The alleged abstract idea is clearly integrated into a practical application, as the claimed techniques have advantages over existing solutions.
The examiner respectfully disagrees. While the examiner does not concede that the advantages described in the specification are accurate, the specification at most demonstrates the improvement of an abstract idea, which remains an abstract idea. The applicant has not clearly integrated an improvement to anything other than the abstract idea itself into the claims.
Argument 3: The claim’s limitations recite “significantly more” than the judicial exception.
The examiner respectfully disagrees. The applicant has not shown evidence or rationale that the claims amount to “significantly more”. The limitations, as explained above, do not amount to significantly more, with the “receiving” and “outputting” steps considered well-understood, routine, and conventional activity, as they amount to the mere sending or receiving of data. (See MPEP 2106.05(d)(II)) Therefore, the rejections are maintained.
Argument 4: For claims 1, 8, and 15, Yamamoto does not teach “defining a feature vector for the sensor by combining the set of features for the sensor and the self-attention vector of the sensor”
The examiner respectfully disagrees. In Yamamoto, the set of features is input into a self-attention mechanism to obtain a self-attention vector. That vector is input into an LSTM network, which generates series data interpreted as feature vectors derived, in part, from the combination of the original set of features and the self-attention vector. In combination with the sensors of the combined Teplitzky and Xiong, this teaches the limitation.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tyler E Iles whose telephone number is (571)272-5442. The examiner can normally be reached 9:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/T.E.I./Patent Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122