Last updated: May 29, 2026
Application No. 17/209,210
METHOD AND APPARATUS FOR DETERMINING KEY LEARNING CONTENT, DEVICE AND STORAGE MEDIUM

Non-Final OA §101 §103
Filed: Mar 22, 2021
Priority: Jun 16, 2020 (CN 202010549031.8)
Examiner: ALVESTEFFER, STEPHEN D
Art Unit: 3715
Tech Center: 3700 (Mechanical Engineering & Manufacturing)
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
OA Round: 4 (Non-Final)
This examiner grants 57% of cases after interview (+24.3% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 436 resolved cases, 2023–2026
Examiner Intelligence

Examiner: ALVESTEFFER, STEPHEN D
Career Allowance Rate: grants 57% of resolved cases (248 granted / 436 resolved), -13.1% vs TC avg
Interview Lift: +24.3% (strong; resolved cases with interview vs. without)
Avg Prosecution (typical timeline): 4y 1m; 23 currently pending
Total Applications (career history): 479 across all art units
Statute-Specific Performance

§101: 8.5% (-31.5% vs TC avg)
§103: 76.6% (+36.6% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 1.9% (-38.1% vs TC avg)
TC avg is the Tech Center average estimate. Based on career data from 436 resolved cases.
Office Action

§101 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This office action is in response to arguments and amendments entered on October 23, 2025 for the patent application 17/209,210 originally filed on March 22, 2021.
A supplemental amendment was filed November 6, 2025 correcting a typographical error in the numbering of claim 22.
Claims 1, 4, 8, 11, 15, and 18 are amended. Claims 2, 3, 5, 6, 9, 10, 12, 13, 16, 17, 19, and 20 are canceled. Claims 21 and 22 are new. Claims 1, 4, 7, 8, 11, 14, 15, 18, 21, and 22 remain pending. The first office action of April 26, 2024, the second office action of December 6, 2024, and the third office action of March 6, 2025 are fully incorporated by reference into this office action.

Response to Amendment
Applicant’s amendments to the claims have been noted by the Examiner.
The Applicant’s amendments are sufficient to overcome the outstanding rejections under 35 USC 112(b). Therefore, the 35 USC 112(b) rejections are withdrawn.
Applicant’s amendments are not sufficient to overcome the outstanding 35 USC 101 rejections, for reasons set forth below.
Applicant’s amendments are sufficient to overcome the outstanding 35 USC 103 rejections. However, new rejections under 35 USC 103 are applied to the amended claims in this office action, as set forth below.

Claim Rejections - 35 USC § 101
35 U.S.C. § 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 4, 7, 8, 11, 14, 15, 18, 21, and 22 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. 
Claim 1 is directed to “a method” (i.e. a process), claim 8 is directed to “an electronic device” (i.e. a machine), and claim 15 is directed to “a non-transitory computer-readable storage medium” (i.e. a machine), hence the claims are directed to one of the four statutory categories (i.e. process, machine, manufacture, or composition of matter). In other words, Step 1 of the subject-matter eligibility analysis is “Yes.”
However, the claims are drawn to an abstract idea of “determining key learning content,” either in the form of “certain methods of organizing human activity,” in terms of managing personal behavior or relationships or interactions between people (including social activities, teaching and following rules or instructions), or reasonably in the form of “mental processes,” in terms of processes that can be performed in the human mind (including an observation, evaluation, judgement or opinion) which are “performed on a computer” (per MPEP 2106(III)(C) “A Claim That Requires a Computer May Still Recite a Mental Process”).
Regardless, the claims are reasonably understood as either “certain methods of organizing human activity” or “mental processes,” which require the following limitations: 
“acquiring in real time… a face image of a user, and recognizing… from the face image an actual face orientation and an actual viewpoint position of eyes of the user during a watching stage of the online learning; in response to the actual face orientation being perpendicular to a playback interface of an online learning application or the actual viewpoint position falling within a playback interface region of the online learning application, determining… that a first actual learning concentration at the watching stage is concentration; and
in response to the actual face orientation being not perpendicular to the playback interface or the actual viewpoint position not falling within the playback interface region, determining… that the first actual learning concentration at the watching stage is non-concentration, and determining a first target period during which the actual face orientation is not perpendicular to the playback interface or the actual viewpoint position is not falling within the playback interface region;
sensing… clicks… and acquiring in real time, based on the clicks… an actual delay of answering and an actual time length consumed by the answering of the user at an online answering stage of the online learning, wherein the actual delay of answering is a period from a time when an online answering application gives a question to be answered to a time when a first human-computer interaction with the online learning application is sensed…
in response to the actual delay of the answering being greater than a preset answering delay or the actual time length consumed by the answering being greater than a preset answering time length, determining… that a second actual learning concentration at the online answering stage is non-concentration, and determining a second target period during which the actual delay is greater than the preset answering delay; and
in response to the actual delay of the answering being not greater than the preset answering delay and the actual time length consumed by the answering being not greater than the preset answering time length, determining… that the second actual learning concentration at the online answering stage is concentration,
wherein the method further comprises:
determining learning content corresponding to the first target period and learning content corresponding to the second target period as key learning content during the online learning; and 
pushing the key learning content to the online learning application, and presenting… the key learning content again.” 
These limitations simply describe a process of data gathering and manipulation, which is partially analogous to “collecting information, analyzing it, and displaying certain results of the collection and analysis” (see Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 119 U.S.P.Q.2d 1739 (Fed. Cir. 2016)). Hence, these limitations are akin to concepts that have been identified, among non-limiting examples, as abstract ideas. In other words, Step 2A, Prong 1 of the subject-matter eligibility analysis is “Yes.”
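For illustration only, the decision logic recited in the quoted limitations can be sketched in Python. This is a minimal sketch under stated assumptions, not the Applicant's implementation: every name (GazeSample, AnswerRecord, Segment, and the functions) is hypothetical, the per-sample boolean flags are assumed to come from an upstream recognizer that is not shown, and the answering stage is assumed to share a timeline with the lesson segments. The quoted concentration and non-concentration conditions overlap when exactly one of the two tests holds; the sketch resolves that by treating non-concentration as the case where neither test holds.

```python
from dataclasses import dataclass
from typing import List, Tuple

Period = Tuple[float, float]  # (start, end) in seconds on the lesson timeline


@dataclass
class GazeSample:
    t: float                  # sample time in seconds
    face_perpendicular: bool  # face orientation perpendicular to the playback interface
    gaze_in_region: bool      # viewpoint falls within the playback interface region


@dataclass
class AnswerRecord:
    question_shown_at: float  # time the question was given
    delay: float              # question given -> first sensed interaction
    time_consumed: float      # time length consumed by the answering


@dataclass
class Segment:
    segment_id: str           # hypothetical identifier of a piece of learning content
    start: float
    end: float


def watching_concentrated(s: GazeSample) -> bool:
    # Assumption: either condition holding counts as concentration.
    return s.face_perpendicular or s.gaze_in_region


def first_target_periods(samples: List[GazeSample]) -> List[Period]:
    """Contiguous runs of non-concentrated samples during the watching stage."""
    periods: List[Period] = []
    start = None
    for s in samples:
        if not watching_concentrated(s):
            if start is None:
                start = s.t
        elif start is not None:
            periods.append((start, s.t))
            start = None
    if start is not None and samples:
        periods.append((start, samples[-1].t))
    return periods


def answering_concentrated(r: AnswerRecord, max_delay: float, max_time: float) -> bool:
    # Concentration requires both measurements to stay within their presets.
    return r.delay <= max_delay and r.time_consumed <= max_time


def second_target_periods(records: List[AnswerRecord], max_delay: float) -> List[Period]:
    """Intervals in which the answering delay exceeded the preset delay."""
    return [(r.question_shown_at, r.question_shown_at + r.delay)
            for r in records if r.delay > max_delay]


def key_learning_content(segments: List[Segment], periods: List[Period]) -> List[str]:
    """IDs of the segments that overlap any target period."""
    return [seg.segment_id for seg in segments
            if any(seg.start < end and start < seg.end for start, end in periods)]
```

In this reading, the key learning content is simply the set of segments overlapping the first and second target periods, which would then be pushed back to the learning application for presentation again.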
Furthermore, the claims do not include additional elements that, either alone or in combination, are sufficient to integrate the judicial exception into a practical application. To the extent that, e.g., “a system,” “a server,” “sensors,” “a gesture sensor,” “a camera,” “a touch screen,” “a touch screen click sensor,” “an online learning terminal,” “an electronic device,” “at least one processor,” “a memory,” “a non-transitory computer-readable storage medium” and “a sound pick-up” are claimed, these are merely claimed to add insignificant extra-solution activity to the judicial exception (e.g., data gathering) and/or do no more than generally link the use of a judicial exception to a particular technological environment or field of use. In other words, the claimed “determining key learning content” does not provide a practical application; thus Step 2A, Prong 2 of the subject-matter eligibility analysis is “No.”
Likewise, the claims do not include additional elements that, either alone or in combination, are sufficient to amount to significantly more than the judicial exception. To the extent that, e.g., “a system,” “a server,” “sensors,” “a gesture sensor,” “a camera,” “a touch screen,” “a touch screen click sensor,” “an online learning terminal,” “an electronic device,” “at least one processor,” “a memory,” “a non-transitory computer-readable storage medium” and “a sound pick-up” are claimed, these are all generic, well-known, and conventional computing elements. As evidence that these are generic, well-known, and conventional computing elements, Applicant’s specification discloses them in a manner that indicates that the additional elements are sufficiently well-known that the specification does not need to describe the particulars of such additional elements to satisfy 35 U.S.C. § 112(a), per MPEP § 2106.07(a) III (a), which satisfies the Examiner’s evidentiary burden requirement per the Berkheimer memo.
Specifically, the Applicant’s claimed “system” and “server” are shown in Figure 1 and described in instant specification paragraph [0028] as follows: “When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server is software, it may be implemented as a plurality of software programs or software modules (for example, used to provide key learning content determination services), or as a single software program or software module.”
As for Applicant’s claimed “sensors,” “a gesture sensor,” “a camera,” “a touch screen,” and “a touch screen click sensor,” the Examiner notes that the claimed sensors are only described once in the instant specification, in paragraph [0031]: “The online learning terminal 101, 102, or 103 may obtain the learning feature data of the user at each interactive learning stage of the online learning through its built-in sensors or an external detection device that can establish a data connection with it, such as a face recognition camera, an ambient light intensity sensor, a touch screen click sensor, and a gesture sensor.”
The Applicant’s claimed “online learning terminal” is described in instant specification paragraph [0027]: “The online learning terminal 101, 102, or 103 may be hardware or software. When the online learning terminal 101, 102, or 103 is hardware, it may be various electronic devices with a display screen, including but not limited to a smart phone, a tablet computer, a laptop computer, a desktop computer, and the like. When the online learning terminal 101, 102, or 103 is software, it may be installed in the electronic devices listed above, and may be implemented as a plurality of software programs or software modules, or as a single software program or software module.”
The Applicant’s claimed “electronic device” is described in paragraph [0112] as follows: “The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workbenches, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing apparatuses.”
The Applicant does not specifically define a “non-transitory computer-readable storage medium” as anything other than the plain meaning of the term.
The Applicant’s claimed “a sound pick-up” is only referenced (paragraph [0095]) but not described in the specification. According to broadest reasonable interpretation, the sound pick-up will be interpreted as any device capable of inputting sound, such as a microphone.
Therefore, these elements are reasonably interpreted as generic computers, described without any detail beyond ubiquitous standard equipment. As such, the claimed limitations are reasonably understood as not providing anything significantly more. Therefore, Step 2B of the subject-matter eligibility analysis is “No.”
In addition, dependent claims 4, 7, 11, 14, 18, 21, and 22 do not provide a practical application and are insufficient to amount to significantly more than the judicial exception. As such, dependent claims 4, 7, 11, 14, 18, 21, and 22 are also rejected under 35 U.S.C. § 101, based on their respective dependencies on independent claims 1, 8, and 15.
Therefore, claims 1, 4, 7, 8, 11, 14, 15, 18, 21, and 22 are rejected under 35 U.S.C. § 101 as being directed to non-statutory subject matter.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 7, 8, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Yang (hereinafter “Yang1,” US 2019/0295430) in view of Iwase et al. (hereinafter “Iwase,” US 2021/0290129) in view of Yang et al. (hereinafter “Yang2,” US 2020/0098284), and in further view of Velozo et al. (hereinafter “Velozo,” US 2010/0255455).
Regarding claim 1, and substantially similar limitations in claims 8 and 15, Yang1 discloses a method for determining key learning content during an online learning, performed by a system comprising sensors and a server, the sensors comprising a camera (see Yang1 Fig. 4, generating assessment result; see also Yang1 Fig. 1, showing server 4; also Yang1 [0019-0020], “The teacher-end device 1 may be implemented by a personal computer or a notebook computer which includes an image capturing module 11 (e.g., a camera) and an input/output (I/O) interface 12 that may include one or more of a keyboard, a mouse, a display and a speaker. However, implementation of the teacher-end device 1 is not limited to the disclosure herein and may vary in other embodiments… Similar to the teacher-end device 1, the student-end device 2 may be implemented by a personal computer or a notebook computer which includes an image capturing module 21 (e.g., a camera) and an I/O interface 22 that may include one or more of a keyboard, a mouse, a display and a speaker.”) … , the method comprising: 
acquiring in real time, by the camera, a face image of a user, and recognizing by the server from the face image an actual face orientation … during a watching stage of the online learning in response to the actual face orientation being perpendicular to a playback interface of an online learning application or the actual viewpoint position falling within a playback interface region of the online learning application, determining by the server that a first actual learning concentration at the watching stage is concentration (see Yang1 Figs. 7 and 8, showing determining position and orientation of face; also Yang1 [0035], “Referring to FIG. 7, the second sub-condition is that the face position of the teacher has been outside a predetermined range for a predetermined teacher face-deviation duration. In this embodiment, the predetermined range is an area enclosed by the reference positioning frame (F) as shown in FIG. 2, and the predetermined teacher face-deviation duration is three seconds. However, implementations of the predetermined range and the predetermined teacher face-deviation duration are not limited to the disclosure herein and may vary in other embodiments. Specifically speaking, when it is determined that the face position of the teacher has been outside the predetermined range for the predetermined teacher face-deviation duration based on the first image, such as an image (I.sub.4), wherein the face position of the teacher is outside the predetermined range, displayed in the teacher-image window (W) on the display of the teacher-end device 1 as shown in FIG. 7, the server 4 transmits the first notification message that corresponds to the second sub-condition to the teacher-end device 1 for displaying the first notification message”); and 
in response to the actual face orientation being not perpendicular to the playback interface or the actual viewpoint position not falling within the playback interface region, determining by the server that the first actual learning concentration at the watching stage is non-concentration, and determining a first target period during which the actual face orientation is not perpendicular to the playback interface or the actual viewpoint position is not falling within the playback interface region (see Yang1 Figs. 7 and 8, showing determining face position and orientation; also Yang1 [0037], “the fourth sub-condition is that the head of the teacher has turned aside for a predetermined teacher head-turning duration. In this embodiment, the predetermined teacher head-turning duration is three seconds, but implementation of the predetermined teacher head-turning duration is not limited to the disclosure herein and may vary in other embodiments. Specifically speaking, when it is determined that the head of the teacher has turned aside for the predetermined teacher head-turning duration based on the first image, such as an image (I.sub.5), wherein the head of the teacher is turned aside, displayed in the teacher-image window (W) on the display of the teacher-end device 1 as shown in FIG. 7, the server 4 transmits the first notification message that corresponds to the fourth sub-condition to the teacher-end device 1 for displaying the first notification message, such as “Remember to make eye contact with your students!”, so as to notify the teacher to return to the normal head position. 
In this embodiment, the server 4 determines whether the head of the teacher has turned aside for the predetermined teacher head-turning duration based on whether a rolling angle of the face is greater than a predetermined rolling angle (e.g., twenty-six degrees), whether a yaw angle of the face is greater than a predetermined yaw angle (e.g., thirty-three degrees), or whether a pitch angle of the face is greater than a predetermined pitch angle (e.g., ten degrees), where the rolling angle of the face, the yaw angle of the face and the pitch angle of the face are calculated based on characteristic points located on the face of the teacher in the first image.”)
…
determining learning content corresponding to the first target period and learning content corresponding to the second target period as key learning content during the online learning (Yang1 [0049], “the server 4 generates an assessment result that relates to the performance of the student during the predetermined course period and that indicates what (i.e., which behavior of the student) satisfies the second predetermined condition, the cumulative number of times, and the assessment score in the predetermined course period. Thereafter, the server 4 transmits the assessment result to the user-end device 3 for display of the assessment result by the user-end device 3. Following the example previously described, the assessment result indicates that the cumulative number of times the eyes of the student have been closed is two, that the cumulative number of times the mouth of the student has opened to yawn is one, and that the assessment score is 8.0. The user-end device 3 may further make a determination based on the assessment result. For example, in a scenario that the online education system 100 is utilized by a commercial education company, the user-end device 3 may determine whether the assessed student is likely to ask for a refund due to an unsuitable or unsatisfactory course. When the determination is affirmative, managers of the commercial education company may respond by approaching the assessed student and showing concern, in order to improve the quality of the course.”).
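For illustration only, the head-turning test quoted from Yang1 [0037] in the mapping above can be sketched as a threshold check held over a duration. The three angle limits and the three-second duration are the example values in the quoted passage; the function names, the frame tuple layout, and the use of absolute values (so that turns in either direction count) are assumptions of this sketch and are not stated in the quoted text.

```python
from typing import Iterable, Tuple

# Example values from the quoted Yang1 passage.
ROLL_LIMIT_DEG = 26.0
YAW_LIMIT_DEG = 33.0
PITCH_LIMIT_DEG = 10.0
HEAD_TURN_DURATION_S = 3.0

Frame = Tuple[float, float, float, float]  # (t_seconds, roll, yaw, pitch), hypothetical layout


def head_turned(roll: float, yaw: float, pitch: float) -> bool:
    # Assumption: magnitude is compared, so left/right and up/down turns both count.
    return (abs(roll) > ROLL_LIMIT_DEG
            or abs(yaw) > YAW_LIMIT_DEG
            or abs(pitch) > PITCH_LIMIT_DEG)


def head_turn_detected(frames: Iterable[Frame]) -> bool:
    """True once the head has stayed turned aside for the predetermined duration.

    Frames are expected in increasing time order.
    """
    turned_since = None
    for t, roll, yaw, pitch in frames:
        if head_turned(roll, yaw, pitch):
            if turned_since is None:
                turned_since = t
            if t - turned_since >= HEAD_TURN_DURATION_S:
                return True
        else:
            turned_since = None  # the run is broken; start over
    return False
```

A brief glance away resets the timer in this sketch, so only a sustained turn would trigger the notification described in the quoted passage.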
Yang1 does not explicitly teach a touch screen click sensor for sensing clicks on a touch screen… recognizing by the server from the face image… an actual viewpoint position of eyes of the user… sensing, by the touch screen click sensor, clicks on the touch screen… based on the clicks sensed by the touch screen click sensor… wherein the actual delay of answering is a period from a time when an online answering application gives a question to be answered to a time when a first human-computer interaction with the online learning application is sensed by the system.
However, Iwase discloses a touch screen click sensor for sensing clicks on a touch screen… recognizing by the server from the face image… an actual viewpoint position of eyes of the user… sensing, by the touch screen click sensor, clicks on the touch screen… based on the clicks sensed by the touch screen click sensor… wherein the actual delay of answering is a period from a time when an online answering application gives a question to be answered to a time when a first human-computer interaction with the online learning application is sensed by the system (Iwase [0067], “the operation section 23 includes a touch operation section that constitutes, with the display section 11, a touchscreen.”; also Iwase [0112], “The cognitive response time is a time required for the person to take action after acknowledgement. An example of the cognitive response time is a time that is required from time at which the person visually recognizes a particular point in the image displayed on the display section 11 until the person gives an operation (for example, a touch operation) of selecting the particular point to the operation section 23. Examples of the particular point are the icon and a character included in a dial.”; also Iwase [0141], “The sightline detection section 41 detects the user's sightline. In this example, the sightline detection section 41 executes sightline detection processing on the image (the image including the user's face) that is acquired by the front camera 21, and thereby detects the user's sightline. This sightline detection processing may be processing that is executed by using a learning model generated by deep learning (a learning model for detecting the sightline) or may be processing that is executed by using a well-known sightline detection technique. For example, the sightline detection section 41 detects the user's pupils from the image that is acquired by the front camera 21 and detects the user's sightline based on the detected pupils. 
The user's sightline may be a sightline of the user's right eye, may be a sightline of the user's left eye, or may be a sightline that is derived based on the sightline of the user's right eye and the sightline of the user's left eye,” detecting user sightline from the eyes).
Iwase is analogous to Yang1, as both are drawn to the art of user state determination. It would have been obvious to try, for one of ordinary skill in the art at the time of filing, to modify the method as taught by Yang1 to include a touch screen click sensor for sensing clicks on a touch screen… recognizing by the server from the face image… an actual viewpoint position of eyes of the user… sensing, by the touch screen click sensor, clicks on the touch screen… based on the clicks sensed by the touch screen click sensor… wherein the actual delay of answering is a period from a time when an online answering application gives a question to be answered to a time when a first human-computer interaction with the online learning application is sensed by the system, as taught by Iwase, in order to appropriately estimate user state (Iwase [0025-0026]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.
Yang1 in view of Iwase does not explicitly teach every limitation of acquiring in real time… an actual delay of answering and an actual time length consumed by the answering of the user at an online answering stage of the online learning… in response to the actual delay of the answering being greater than a preset answering delay or the actual time length consumed by the answering being greater than a preset answering time length, determining by the server that a second actual learning concentration at the online answering stage is non-concentration, and determining a second target period during which the actual delay is greater than the preset answering delay; and in response to the actual delay of the answering being not greater than the preset answering delay or the actual time length consumed by the answering being not greater than the preset answering time length, determining that the actual learning concentration at the online answering stage is concentration.
While Yang1 does disclose determining concentration levels by measuring duration of “eye-closure,” “face-deviation,” “yawning,” and “head-turning” (Yang1 [0034-0037]), which can be equated to “actual delay of the answering,” Yang1 does not determine concentration levels by measuring actual time length consumed by the answering.
However, Yang2 discloses determining concentration levels by measuring actual time length consumed by the answering (Yang2 Abstract, “A task completion feature collecting module records an answer response time and a correct answer rate of a student when completing a task. A cognitive load self-assessment collecting module quantifies and analyzes a mental effort and a task subjective difficulty by a rating scale.”).
Yang2 is analogous to Yang1 in view of Iwase, as both are drawn to the art of teaching. It would have been obvious to try, for one of ordinary skill in the art at the time of filing, to modify the method as taught by Yang1 in view of Iwase to include determining concentration levels by measuring actual time length consumed by the answering, as taught by Yang2, in order to objectively, quickly and accurately rate the cognitive load of a student in a classroom so the teaching effect can be improved (Yang2 [0034]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.
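For illustration only, the two answering-stage measurements discussed above can be derived from interaction timestamps as sketched below. The delay follows the claim wording (from the time the question is given to the first sensed interaction). How the “time length consumed by the answering” is measured is not defined in the quoted text; this sketch assumes it runs from the time the question is given to submission. The class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnswerTimer:
    question_shown_at: float                      # time the question was given (seconds)
    first_interaction_at: Optional[float] = None  # first sensed click, if any
    submitted_at: Optional[float] = None          # time the answer was submitted

    def on_click(self, t: float) -> None:
        # Only the first sensed interaction fixes the end of the delay.
        if self.first_interaction_at is None:
            self.first_interaction_at = t

    def on_submit(self, t: float) -> None:
        # A submission is itself an interaction, so it also counts as a click.
        self.on_click(t)
        self.submitted_at = t

    @property
    def delay(self) -> float:
        if self.first_interaction_at is None:
            raise ValueError("no interaction sensed yet")
        return self.first_interaction_at - self.question_shown_at

    @property
    def time_consumed(self) -> float:
        # Assumption: measured from question shown to submission.
        if self.submitted_at is None:
            raise ValueError("answer not submitted yet")
        return self.submitted_at - self.question_shown_at
```

Both values would then be compared against their preset thresholds to decide between concentration and non-concentration at the answering stage.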
Yang1 in view of Iwase and Yang2 does not explicitly teach pushing the key learning content to an online learning application, and presenting, on an online learning terminal, the key learning content again.
However, Velozo discloses pushing the key learning content to an online learning application, and presenting, on an online learning terminal, the key learning content again (Velozo [0032], “During an assessment, session manager 112 may determine that a student (or class, or other group) would benefit by receiving immediate remedial support. For example and without limitation, an algorithm associated with session manager 112 may indicate that a student has significant weaknesses in connection with standard "4.1.2." At that point, session manager 112 may cause one or more of learning objects 116 to be presented to the student via student UI 110.”).
Velozo is analogous to Yang1 in view of Iwase and Yang2, as both are drawn to the art of education systems. It would have been obvious to try, for one of ordinary skill in the art at the time of filing, to modify the method as taught by Yang1 in view of Iwase and Yang2 to include pushing the key learning content to an online learning application, and presenting, on an online learning terminal, the key learning content again, as taught by Velozo, in order to provide opportunities for immediate remediation of a student’s academic deficiencies (Velozo [0003]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.
Regarding claim 7, and substantially similar limitations in claim 14, Yang1 in view of Iwase and Yang2 does not explicitly teach wherein: the online learning terminal presents the key learning content to the user as review content.
However, Velozo discloses wherein: the online learning terminal presents the key learning content to the user as review content (Velozo [0032], “During an assessment, session manager 112 may determine that a student (or class, or other group) would benefit by receiving immediate remedial support. For example and without limitation, an algorithm associated with session manager 112 may indicate that a student has significant weaknesses in connection with standard "4.1.2." At that point, session manager 112 may cause one or more of learning objects 116 to be presented to the student via student UI 110.”).
Velozo is analogous to Yang1 in view of Iwase and Yang2, as both are drawn to the art of teaching. It would have been obvious to try, for one of ordinary skill in the art at the time of filing, to modify the method as taught by Yang1 in view of Iwase and Yang2 to include wherein: the online learning terminal presents the key learning content to the user as review content, as taught by Velozo, in order to provide opportunities for immediate remediation of a student’s academic deficiencies (Velozo [0003]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.

Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Yang1 in view of Iwase, Yang2, and Velozo and in further view of Udaka et al. (hereinafter “Udaka,” US 2021/0287561).
Regarding claim 4, and substantially similar limitations in claims 11 and 18, Yang1 in view of Iwase, Yang2, and Velozo does not explicitly teach wherein the sensors further comprise a sound pick-up, and the method further comprises: picking up voice of the user by the sound pick-up, and acquiring actual voice data of the user at the voice interaction stage of the online learning; and in response to content of the actual voice data being consistent with the content of a preset standard voice data, determining that a third actual learning concentration at the voice interaction stage is concentration; and in response to the content of the actual voice data being inconsistent with the content of the preset standard voice data, determining that the third actual learning concentration at the voice interaction stage is non-concentration, and determining a third target period during which the actual voice data is inconsistent with the content of the preset standard voice data, wherein the method further comprises: determining learning content corresponding to the third target period; and pushing the learning content corresponding to the third target period to the online learning application, and presenting, on an online learning terminal, the learning content corresponding to the third target period again.
While Yang1 does disclose determining concentration levels by measuring duration of “eye-closure,” “face-deviation,” “yawning,” and “head-turning” (Yang1 [0034-0037]), Yang1 does not determine concentration levels by measuring voice data.
However, Udaka discloses determining concentration levels by measuring voice data (Udaka [0059], “The judgment apparatus 40 obtains trigger information representing a specified word(s) or behavior(s) of a teacher during a lecture, obtains information representing a visual direction of a student, and judges degree of concentration of the student based on a preferred visual direction according to the target trigger information and the visual direction of the student. Concretely, the judgment apparatus 40 judges the degree of concentration of the student based on the preferred visual direction and the visual direction of the student according to the trigger information extracted by the voice monitor apparatus 10 (target trigger information).”).
Udaka is analogous to Yang1 in view of Iwase, Yang2, and Velozo, as both are drawn to the art of teaching. It would be obvious to try by one of ordinary skill in the art at the time of filing to have modified the method as taught by Yang1 in view of Iwase, Yang2, and Velozo, to include determining concentration levels by measuring voice data, as taught by Udaka, in order to easily grasp the degree of concentration of a student, even when the lecture situation changes (Udaka [0018]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Yang1 in view of Iwase, Yang2, and Velozo and in further view of Kei et al. (hereinafter “Kei,” US 2021/0240968).
Regarding claim 21, Yang1 in view of Iwase, Yang2, and Velozo does not explicitly teach every limitation of the sensors further comprise a gesture sensor, and the method further comprises: acquiring in real time, through the gesture sensor, an actual body movement of the user at a movement imitation stage of the online learning; and in response to the actual body movement being consistent with the preset standard body movement, determining that a fourth actual learning concentration at the movement imitation stage is concentration; in response to the actual body movement being inconsistent with the preset standard body movement, determining that the fourth actual learning concentration at the movement imitation stage is non-concentration, and determining a fourth target period during which the actual body movement is inconsistent with the preset standard body movement; wherein the method further comprises: determining learning content corresponding to the fourth target period; and pushing the learning content corresponding to the fourth target period to the online learning application, and presenting, on an online learning terminal, the learning content corresponding to the fourth target period again.
However, Kei discloses the sensors further comprise a gesture sensor, and the method further comprises: acquiring in real time, through the gesture sensor, an actual body movement of the user at a movement imitation stage of the online learning; and in response to the actual body movement being consistent with the preset standard body movement, determining that a fourth actual learning concentration at the movement imitation stage is concentration; in response to the actual body movement being inconsistent with the preset standard body movement, determining that the fourth actual learning concentration at the movement imitation stage is non-concentration, and determining a fourth target period during which the actual body movement is inconsistent with the preset standard body movement; wherein the method further comprises: determining learning content corresponding to the fourth target period; and pushing the learning content corresponding to the fourth target period to the online learning application, and presenting, on an online learning terminal, the learning content corresponding to the fourth target period again (Kei [0025-0026], “the stream of images/video of a user is collected from an integrated camera on a user's computing device. For example, a laptop computer or desktop may include a camera that can capture the images from which user position characteristics are extracted. In another example, a separate camera, such as one in the classroom may be used to determine student attention… such an attention analysis system (100) provides a simply way to determine whether a single user, or multiple users, are focusing on a computing device screen. This may include analyzing more information than just eye gaze. That is other user position information such as head position, body position, and eye position may be relied on.”; also Kei [0032-0033], “the information may include user body position information. For example, shoulder angle information may be indicative of whether a user is slouching in their chair, which may be indicia that a user is not paying attention. While specific reference is made to particular body position characteristics of a user, other body position/movement characteristics may be extracted (block 202) from a stream of images for a user by which it may be determined whether that user is paying attention or not… The stream of images, and more particularly the extracted characteristics from the stream of images, are compared (block 203) against a database (FIG. 1, 102) of images. The database (FIG. 1, 102) of images against which the stream of images are compared have been similarly analyzed with characteristics extracted therefrom. As described, the database (FIG. 1, 102) of images includes a training set which includes images and a classification as to whether it indicates a user is paying attention or not. The features of these images are analyzed such that user position characteristics may be mapped to either an attentive or non-attentive user.”).
Kei is analogous to Yang1 in view of Iwase, Yang2, and Velozo, as both are drawn to the art of education systems. It would be obvious to try by one of ordinary skill in the art at the time of filing to have modified the method as taught by Yang1 in view of Iwase, Yang2, and Velozo, to include the sensors further comprise a gesture sensor, and the method further comprises: acquiring in real time, through the gesture sensor, an actual body movement of the user at a movement imitation stage of the online learning; and in response to the actual body movement being consistent with the preset standard body movement, determining that a fourth actual learning concentration at the movement imitation stage is concentration; in response to the actual body movement being inconsistent with the preset standard body movement, determining that the fourth actual learning concentration at the movement imitation stage is non-concentration, and determining a fourth target period during which the actual body movement is inconsistent with the preset standard body movement; wherein the method further comprises: determining learning content corresponding to the fourth target period; and pushing the learning content corresponding to the fourth target period to the online learning application, and presenting, on an online learning terminal, the learning content corresponding to the fourth target period again, as taught by Kei, since it combines prior art elements of acquiring and analyzing images obtained from a camera according to known methods to yield predictable results. Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Yang1 in view of Iwase, Yang2, and Velozo and in further view of Sahashi (US 2006/0057550).
Regarding claim 22, Yang1 in view of Iwase, Yang2, and Velozo does not teach wherein the sensors further comprise a sound pick-up, and the method further comprises: picking up voice of the user by the sound pick-up, and acquiring actual voice data of the user at a voice interaction stage of the online learning; and determining a fifth actual learning concentration based on a judgment result on whether an actual mouth movement of the user is consistent with the content of the actual voice data.
However, Sahashi discloses wherein the sensors further comprise a sound pick-up, and the method further comprises: picking up voice of the user by the sound pick-up, and acquiring actual voice data of the user at a voice interaction stage of the online learning; and determining a fifth actual learning concentration based on a judgment result on whether an actual mouth movement of the user is consistent with the content of the actual voice data (Sahashi [0019], “Because the action request means asks the student a question and requests an audio response, and the action detection means acquires audio of the student, recognizes the audio response, and determines the validity of the audio response from the student, while also detecting the movement of the mouth of the student accompanying the audio response from image changes in the acquired video, confirmation can be obtained that the student determined to be the legitimate student by the facial image matching means has provided the audio response to the question, and thus, the attendance of the legitimate student can be confirmed.”).
Sahashi is analogous to Yang1 in view of Iwase, Yang2, and Velozo, as both are drawn to the art of online education systems. It would be obvious to try by one of ordinary skill in the art at the time of filing to have modified the method as taught by Yang1 in view of Iwase, Yang2, and Velozo, to include wherein the sensors further comprise a sound pick-up, and the method further comprises: picking up voice of the user by the sound pick-up, and acquiring actual voice data of the user at a voice interaction stage of the online learning; and determining a fifth actual learning concentration based on a judgment result on whether an actual mouth movement of the user is consistent with the content of the actual voice data, as taught by Sahashi, in order to verify that the student is actually present and speaking (Sahashi [0019]). Doing so is a predictable solution that one of ordinary skill in the art could have pursued with a reasonable expectation of success.

Response to Arguments
The Applicant’s arguments filed on October 23, 2025 have been fully considered.

Claim Rejections - 35 USC § 101
The Applicant respectfully argues, “The amended claims recite the above features (i)-(iii), which obviously relates to the technical fields of image recognition and face recognition (see also specification as originally filed, paragraph [0002]) and thus cannot be performed in a human mind.”
The Examiner respectfully disagrees. Applicant respectfully points out that the instant claimed invention requires a computer. However, a claim that requires a computer does not inherently make the claim eligible under 35 USC 101. MPEP 2106.04(a)(2)(III)(C) states that a claim that requires a computer may still recite a mental process. The MPEP guides examiners to “review the specification to determine if the claimed invention is described as a concept that is performed in the human mind and applicant is merely claiming that concept performed 1) on a generic computer, or 2) in a computer environment, or 3) is merely using a computer as a tool to perform the concept. In these situations, the claim is considered to recite a mental process.” In the present case, the claimed invention is described as a concept of “determining key learning content” performed on a generic computer or computer environment, and merely uses a computer as a tool to perform the concept. Therefore, the claims recite mental processes.

The Applicant further respectfully argues, “The above features (i)-(iii) require particular machine or manufacture (e.g. the claimed system which comprises sensors and a server, the sensors comprising a camera and a touch screen click sensor for sensing clicks on a touch screen) to execute the claimed method and the particular machine or manufacture is integral to the claims.”
The Examiner respectfully disagrees. The computing device as claimed can be read as including common implementations of smartphones, notebook computers, and tablets that were considered generic computing devices at the time of filing. For example, under broadest reasonable interpretation in light of the specification, the claimed computing device requires a server, a camera, and a touch screen. At the time of the instant invention’s filing, all of these features were commonly found in generic smartphones, generic notebook computers, and generic tablets. Therefore, the instant claimed invention recites a generic computing device and does not recite any features that can be interpreted as being a particular machine or manufacture.

The Applicant also respectfully argues, “based on the hardware (e.g., the camera and touch screen click sensor) and the image recognizing technology etc., the present invention improves the function of the traditional online learning system which has no real-time detection ability. Accordingly, the function of the computer is improved by have improved real-time detection ability.”
The Examiner respectfully disagrees. The claimed improvements to real-time detection abilities are not the result of improvements to the functioning of the computer itself. Instead, these features are the result of the use of the conventional capabilities of generic computing devices (e.g. detecting touch screen inputs, determining delay time between inputs, or acquiring a face image of a user by a camera in real time). These are not improvements to computer capabilities, but rather usage of conventional features of generic computers.
As for improvements to any other technology or technical field, MPEP 2106.05(a)(II) cautions examiners that “it is important to keep in mind that an improvement in the abstract idea itself… is not an improvement in technology.” That is, improvements to the claimed abstract concept of “determining key learning content,” such as “improved real-time detection ability,” are not improvements in technology.

As such, the arguments are not persuasive. For these reasons, the 35 USC 101 rejections are maintained.

Claim Rejections - 35 USC § 103
Applicant’s arguments with respect to the claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Stephen Alvesteffer whose telephone number is (571)272-8680. The examiner can normally be reached M-F 8:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Peter Vasat, can be reached at 571-270-7625. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SA/Examiner, Art Unit 3715
/PETER S VASAT/Supervisory Patent Examiner, Art Unit 3715
Prosecution Timeline

Dec 06, 2024: Final Rejection mailed (§101, §103)
Feb 05, 2025: Response after Non-Final Action
Mar 06, 2025: Request for Continued Examination
Mar 07, 2025: Response after Non-Final Action
Jul 23, 2025: Non-Final Rejection mailed (§101, §103)
Oct 23, 2025: Response Filed
Feb 20, 2026: Final Rejection mailed (§101, §103)
Apr 13, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

18/679,065, Patent 12609054: FLOORING PANEL SAMPLE MODULE AND METHOD OF MANUFACTURE (1y 10m to grant; granted Apr 21, 2026)
17/860,313, Patent 12601566: Howitzer Training Gun (3y 9m to grant; granted Apr 14, 2026)
17/539,309, Patent 12595987: SYSTEMS AND METHODS FOR SHOOTING SIMULATION AND TRAINING (4y 4m to grant; granted Apr 07, 2026)
18/098,159, Patent 12573317: ANGLE-ADJUSTABLE THREE-DIMENSIONAL PHYSICAL SIMULATION DEVICE FOR EQUIVALENT COAL SEAM MINING (3y 1m to grant; granted Mar 10, 2026)
18/122,340, Patent 12573313: APPARATUS AND METHOD FOR GENERATING AN EDUCATIONAL ACTION DATUM USING MACHINE-LEARNING (2y 12m to grant; granted Mar 10, 2026)
Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 4-5
Grant Probability: 57%
With Interview (+24.3%): 81%
Median Time to Grant: 4y 1m (~0m remaining)
PTA Risk: High
Based on 436 resolved cases by this examiner. Grant probability derived from career allowance rate.