Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s response to the last Office action, filed September 29, 2025, has been entered and made of record. Claim 1 has been amended, and claims 2-16 have been newly added. Claims 1-16 are now pending in this application.
In view of applicant’s amendment, the double patenting rejection of claim 1 is hereby withdrawn.
Claim Objections
Claim 16 is objected to under 37 CFR 1.75 as being a substantial duplicate of claim 15. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).
Allowable Subject Matter
The following is a statement of reasons for the indication of allowable subject matter:
-- Claims 1, 15, and 16 are allowable over the prior art of record.
-- Claims 2-14 are allowable in view of their dependency from claim 1.
With respect to claim 1, the prior art of record, alone or in reasonable combination, does not teach or suggest the following underlined limitation(s), in consideration of the claim as a whole:
“adapts the trained machine learning model of said AI system using said baseline body language and said context to form an adapted trained machine learning model; and applies the adapted trained machine learning model of said AI system to at least one of said at least one image for categorizing said body language resulting in a category, and applying said category for determining said body language message”
The closest prior art of record, Kaihao et al. (“Facial Expression Recognition Based on Deep Evolutional Spatial-Temporal Networks”, IEEE Transactions on Image Processing, Vol. 26, No. 9, September 2017), discloses a body language system for determining a body language message of a living being in a context, said system comprising an artificial intelligence (AI) system, said AI system running a computer program (see at least: the system of Fig. 1, which corresponds to the artificial intelligence system), that:
retrieves at least one image of said living being showing body language, (see at least: Fig. 1, inputting at least one image of a human face, “living being”, showing a facial expression. Note that a facial expression is a type of nonverbal communication, and is therefore equivalent to the “body language”);
labels said living being in said at least one image, resulting in a labeled living being, (see at least: Page 4198, left-hand column, section IV(A), the database includes 123 subjects with 593 sequences. Among these sequences, 327 are labeled with seven emotion labels (anger, contempt, disgust, fear, happiness, sadness, and surprise), [i.e., implicitly labeling the human faces, “living being”, in said at least one image, resulting in a labeled living being]);
determines said context from said at least one image using a first trained machine learning model, (see at least: Page 4198, section IV(B), the temporal network (PHRNN) detects the facial landmarks using the SDM algorithm, including two eyebrows, two eyes, a nose, and a mouth, and determines the position and region of local parts, (context), [i.e., where the position and/or the region of local parts corresponds to the “context”, which is determined using the temporal network (PHRNN), “first trained machine learning model”]. See also Page 4194, left-hand column, using the neural network (MSCNN) to extract spatial features from still frames, using two signals corresponding to different loss functions to increase the variations of different expressions and reduce the differences among identical expressions, such that the MSCNN can capture the whole appearance information of still frames, “context information”); and
determines a baseline body language of said labeled living being from said at least one image using the first or a second trained machine learning model, (see at least: Page 4196, left-hand column, local features are concatenated along the feature extraction cascade, while the global high-level features are formed in the upper layers based on the facial morphological variations and dynamically evolutional properties of expression, [i.e., where the learned global high-level features correspond to the baseline value, and are implicitly determined from at least one landmark face image using the PHRNN, “first trained machine learning model”]).
Kaihao et al. further discloses adapting the model-fusion designed Spatial-Temporal Networks (PHRNN-MSCNN) by using a fusion function as shown in equation (14), implicitly based on the determined position and region of local parts, (context), and the determined global high-level features, “baseline value(s)”, from at least one landmark face image, [i.e., adapting the model-fusion designed Spatial-Temporal Networks (PHRNN-MSCNN) using said baseline body language and said context], (see at least: Page 4197, section C, “model fusion”, left- and right-hand columns); applying the adapted trained machine learning model of said AI system to at least one of said at least one image for categorizing said body language resulting in a category, (see at least: Page 4197, section C, using the adapted fusion function to obtain the predicted sorting of expression classes, ((Ai) of eq. (14)), of at least one face expression image); and applying said category for determining said body language message, (see at least: Page 4200, section D, Tables II, III, and IV, applying the predicted expression classes ((Ai) of eq. (14)) of the fusion function of the designed Spatial-Temporal Networks (PHRNN-MSCNN) to perform the facial expression recognition, as shown in the last line of Tables II, III, and IV, “determine said face expression message”).
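For clarity of the mapping above, the following is a minimal, hypothetical sketch of score-level fusion of a temporal stream and a spatial stream expression classifier. It is offered only as an illustration of the general model-fusion technique discussed with respect to Kaihao et al.; the weighting scheme, function names, and example values are assumptions for illustration and do not reproduce equation (14) of the reference.

```python
# Illustrative sketch only -- NOT the actual equation (14) of Kaihao et al.
# Assumes (hypothetically) that each network outputs per-class scores and that
# fusion is a weighted sum of those scores, a common form of score-level fusion.
import numpy as np

def fuse_scores(temporal_scores, spatial_scores, alpha=0.5):
    """Combine per-class scores from a temporal network (e.g., a PHRNN-style
    stream) and a spatial network (e.g., an MSCNN-style stream)."""
    return alpha * temporal_scores + (1.0 - alpha) * spatial_scores

def predict_expression(fused_scores, class_names):
    """Pick the expression category with the highest fused score."""
    return class_names[int(np.argmax(fused_scores))]

# Hypothetical example using the seven emotion labels cited above.
classes = ["anger", "contempt", "disgust", "fear", "happiness", "sadness", "surprise"]
temporal_out = np.array([0.05, 0.02, 0.03, 0.10, 0.60, 0.05, 0.15])  # temporal stream
spatial_out = np.array([0.10, 0.05, 0.05, 0.05, 0.55, 0.05, 0.15])   # spatial stream
print(predict_expression(fuse_scores(temporal_out, spatial_out), classes))  # -> "happiness"
```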
However, while disclosing adapting the model-fusion designed Spatial-Temporal Networks (PHRNN-MSCNN) using said baseline body language and said context, Kaihao fails to teach or suggest, either alone or in combination with the other cited references, adapting the trained machine learning model of said AI system using said baseline body language and said context to form an adapted trained machine learning model; and applying the adapted trained machine learning model of said AI system to at least one of said at least one image for categorizing said body language resulting in a category, and applying said category for determining said body language message.
A further prior art of record, Musham et al. (US-PGPUB 20190354592), discloses an AI system further running a sign language computer program for retrieving at least one image of said living being showing sign language and applying said AI system to transform said sign language into said sign language message, said computer program of said body language system determining a body language message from said at least one image for validating said sign language message, (see at least: Figs. 14-15 and claim 1, tracking both hands of persons and recording gestures from behind said hands, before said device uses its own in-system database of behind-the-hands perspective correlative gestural images, [i.e., implicitly retrieving at least one image of said living being showing sign language from the database], to map the captured images to standard front-of-hand sign language dictionaries via a machine-learned model, “AI system”, that is trained using multiple similar images per word, i.e., converting the visual analogue data to digital data and to text-based language, and outputting machine speech to a second person’s smartphone, [i.e., applying said AI system, “machine-learned model”, to transform said sign language into said sign language message, “text-based language”, where said computer program of said body language system determines a body language message from said at least one image for validating said sign language message, “outputs in machine speech to a second person’s smartphone for implicitly validating said sign language message”]); but fails to teach or suggest, either alone or in combination with the other cited references, the above limitations of claim 1, (as combined with the other claimed limitations).
Another prior art of record, Sidney et al. (“Automatic detection of learners’ affect from gross body language”, 2009, Pages 123-150), discloses collecting training and validation data on affective states in a learning session with AutoTutor, after which the learners’ affective states (i.e., emotions) were rated by the learner, a peer, and two trained judges, and performing machine-learning experiments for detecting boredom, confusion, delight, flow, and frustration from neutral; but fails to teach or suggest, either alone or in combination with the other cited references, the above limitations of claim 1, (as combined with the other claimed limitations).
Regarding claim 15, claim 15 recites substantially similar limitations as set forth in claim 1. As such, claim 15 is in condition for allowance for at least similar reasons, as stated above.
Regarding claim 16, claim 16 recites substantially similar limitations as set forth in claim 1. As such, claim 16 is in condition for allowance for at least similar reasons, as stated above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMARA ABDI whose telephone number is (571)272-0273. The examiner can normally be reached 9:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMARA ABDI/Primary Examiner, Art Unit 2668 11/01/2025