Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “inspector extraction unit,” “subject extraction unit,” “individual feature extraction unit,” “interaction feature extraction unit,” “context feature extraction unit,” and “diagnostic aid result output unit” in claims 10-18.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim(s) 1-18 is/are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea, specifically a mental process, without significantly more.
Claims 1 and 10 recite supporting Autism Spectrum Disorder (ASD) diagnosis, comprising: extracting a detection area and voice corresponding to an inspector from an input video (a person can mentally extract this information by looking and listening to a scene or from watching a video); extracting a detection area and voice corresponding to an assessment subject from the input video (a person can mentally extract this information by looking and listening to a scene or from watching a video); extracting a feature of the inspector and a feature of the assessment subject (a person can mentally extract this information by looking and listening to a scene or from watching a video); and extracting an interaction feature using the feature of the inspector and the feature of the assessment subject (a person can mentally extract this information by looking and listening to a scene or from watching a video). This judicial exception is not integrated into a practical application, nor do the claims amount to significantly more than the judicial exception, because the additionally claimed apparatus, units, and Artificial Intelligence (AI) are mere generic computer components. See 2024 Guidance Update on Patent Subject Matter Eligibility, Including on Artificial Intelligence (Effective: July 17, 2024).
Claims 2-8 and 11-17 recite limitations that can likewise be performed mentally.
Claims 9 and 18 can be done mentally. The claimed networks are generic computer components that do not integrate the abstract idea into a practical application, nor amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1, 9, 10, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Prakash, Varun Ganjigunte, et al. "Computer vision-based assessment of autistic children: Analyzing interactions, emotions, human pose, and life skills." IEEE Access 11 (2023): 47907-47929 (hereinafter referred to as “Prakash”) in view of Sadiq, Saad, et al. "Deep learning based multimedia data mining for autism spectrum disorder (ASD) diagnosis." 2019 International Conference on Data Mining Workshops (ICDMW). IEEE, 2019 (hereinafter referred to as “Sadiq”).
Regarding claim 10, Prakash discloses an apparatus for supporting Autism Spectrum Disorder (ASD) diagnosis based on Artificial Intelligence (AI) (see Prakash Abstract, and pgs. 47918-47920, and 47925, where neural networks are used to assess Autism Spectrum Disorder (ASD), and GPUs are used), comprising: an inspector extraction unit for extracting a detection area corresponding to an inspector from an input video (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where the head and hands of the therapist are detected); a subject extraction unit for extracting a detection area corresponding to an assessment subject from the input video (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where the head and hands of the child are detected); an individual feature extraction unit for extracting a feature of the inspector and a feature of the assessment subject (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where head pose, gaze, and hand pointing direction for the therapist and child are determined); and an interaction feature extraction unit for extracting an interaction feature using the feature of the inspector and the feature of the assessment subject (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where joint attention is measured based on the amount of alignment of the head pose, gaze and/or hand pointing direction of the therapist and child).
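(For illustration only: the following minimal Python sketch shows one way the alignment-based joint-attention measure that Prakash describes could be scored, using cosine similarity between per-frame gaze or pointing direction vectors. The function names, threshold, and data are hypothetical and are not asserted to be Prakash's actual implementation.)

```python
import numpy as np

def alignment_score(dir_a: np.ndarray, dir_b: np.ndarray) -> float:
    """Cosine similarity between two direction vectors (1.0 = fully aligned)."""
    a = dir_a / np.linalg.norm(dir_a)
    b = dir_b / np.linalg.norm(dir_b)
    return float(np.dot(a, b))

def joint_attention_ratio(therapist_dirs, child_dirs, threshold=0.9):
    """Fraction of frames in which the therapist's and child's gaze/pointing
    directions are aligned beyond a threshold -- a crude joint-attention proxy."""
    scores = [alignment_score(t, c) for t, c in zip(therapist_dirs, child_dirs)]
    return sum(s >= threshold for s in scores) / len(scores)

# Hypothetical per-frame unit direction vectors from a gaze-estimation model.
therapist = [np.array([0.0, 0.1, 1.0]), np.array([0.2, 0.0, 1.0])]
child     = [np.array([0.0, 0.0, 1.0]), np.array([-1.0, 0.0, 0.2])]
print(joint_attention_ratio(therapist, child))  # 0.5: aligned in 1 of 2 frames
```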
Prakash does not explicitly disclose voice.
However, Sadiq discloses an apparatus for supporting Autism Spectrum Disorder (ASD) diagnosis based on Artificial Intelligence (AI) (see Sadiq Abstract and pg. 850, where deep learning is used to diagnose Autism Spectrum Disorder (ASD), and a GPU is disclosed), comprising: an inspector extraction unit for extracting a voice corresponding to an inspector from an input video (see Sadiq Figs. 1 and 4, and pgs. 849-851, where adult vocals are extracted); a subject extraction unit for extracting voice corresponding to an assessment subject from the input video (see Sadiq Figs. 1 and 4, and pgs. 849-851, where child vocals are extracted); an individual feature extraction unit for extracting a feature of the inspector and a feature of the assessment subject (see Sadiq Figs. 1 and 4, and pgs. 849-851, where Mel-Frequency Cepstral Coefficients spectrograms are generated for adult and child); and an interaction feature extraction unit for extracting an interaction feature using the feature of the inspector and the feature of the assessment subject (see Sadiq Figs. 1 and 4, and pgs. 849-851, where speaker activity detection (SAD), speaker change detection (SCD), and Diarization features are determined).
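(For illustration only: the sketch below computes Mel-Frequency Cepstral Coefficient features of the kind Sadiq relies on, using the librosa library; the sine-wave signals are stand-ins for the diarized adult and child segments, and all parameters are illustrative.)

```python
import numpy as np
import librosa

def mfcc_features(audio: np.ndarray, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Return the MFCC matrix (n_mfcc x frames) for a mono audio signal."""
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)

# Stand-in signals; in practice these would be the diarized adult and child
# speech segments extracted from the session recording.
sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
adult_seg = 0.5 * np.sin(2 * np.pi * 150 * t)   # lower-pitched stand-in
child_seg = 0.5 * np.sin(2 * np.pi * 300 * t)   # higher-pitched stand-in
print(mfcc_features(adult_seg, sr).shape, mfcc_features(child_seg, sr).shape)
```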
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the voice analysis of Sadiq with the computer vision analysis of Prakash, because it is predictable that doing so would improve the accuracy of the Autism Spectrum Disorder (ASD) diagnosis and/or assessment by considering both visual and audio data simultaneously instead of just one or the other.
Claim 1 is rejected under the same analysis as claim 10 above.
Regarding claim 18, Prakash discloses wherein: the feature of the inspector is generated through an inspector feature extraction network by inputting the detection area corresponding to the inspector, and the feature of the assessment subject is generated through a subject feature extraction network by inputting the detection area corresponding to the assessment subject (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where head pose, gaze, and hand pointing direction for both the therapist and child are determined using multiple machine learning network models).
Prakash does not explicitly disclose voice.
However, Sadiq discloses wherein: the feature of the inspector is generated through an inspector feature extraction network by inputting the voice corresponding to the inspector, and the feature of the assessment subject is generated through a subject feature extraction network by inputting the voice corresponding to the assessment subject (see Sadiq Figs. 1 and 4, and pgs. 849-851, where Mel-Frequency Cepstral Coefficients spectrograms are generated for both adult and child and then processed using multiple machine learning networks).
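(For illustration only: the toy PyTorch network below sketches what a "feature extraction network" mapping an MFCC spectrogram to a fixed-length feature vector might look like; the architecture and dimensions are hypothetical and do not reproduce the networks of either reference.)

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Toy network: MFCC spectrogram (1 x n_mfcc x frames) -> feature vector."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),   # pool to a fixed spatial size
        )
        self.fc = nn.Linear(32 * 4 * 4, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(x).flatten(1))

# Separate (non-weight-sharing) networks for the inspector and the subject.
inspector_net, subject_net = FeatureExtractor(), FeatureExtractor()
spectrogram = torch.randn(1, 1, 13, 200)  # batch, channel, n_mfcc, frames
print(inspector_net(spectrogram).shape)   # torch.Size([1, 128])
```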
Claim 9 is rejected under the same analysis as claim 18 above.
Claim(s) 6 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Prakash in view of Sadiq as applied to claims 1 and 10 above, and in further view of DeQuinzio, Jaime Ann, et al. "Generalized imitation of facial models by children with autism." Journal of Applied Behavior Analysis 40.4 (2007): 755-759 (hereinafter referred to as “DeQuinzio”).
Regarding claim 15, Prakash discloses wherein a feature corresponding to each of the inspector and the assessment subject includes a gaze feature and an action feature (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where head pose, gaze, and hand pointing direction for the therapist and child are determined); and wherein a feature corresponding to the assessment subject includes a facial expression feature (see Prakash Figs. 1, 9, and 10, and pgs. 47916-47918, where facial expressions are recognized).
Prakash does not explicitly disclose wherein a feature corresponding to each of the inspector and the assessment subject includes a voice feature; and wherein a feature corresponding to the inspector includes a facial expression feature.
However, Sadiq discloses wherein a feature corresponding to each of the inspector and the assessment subject includes a voice feature (see Sadiq Figs. 1 and 4, and pgs. 849-851, where Mel-Frequency Cepstral Coefficients spectrograms are generated for adult and child).
Furthermore, DeQuinzio discloses wherein a feature corresponding to each of the inspector and the assessment subject includes a facial expression feature (see DeQuinzio Abstract, and pgs. 755 and 756, where low and inconsistent rates of imitation of facial models were observed in children with autism).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of DeQuinzio to capture the facial expression features for both the child and therapist of Prakash, as previously modified by Sadiq, because it is predictable that doing so would improve the accuracy of the assessment and/or diagnosis of Autism Spectrum Disorder (ASD) by including the additional data of the low facial expression imitation rates that are a known sign of autism.
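(For illustration only: the sketch below makes the imitation-rate notion concrete by comparing hypothetical per-frame expression labels for therapist and child; it is an editorial illustration, not DeQuinzio's measurement protocol.)

```python
def imitation_rate(therapist_labels, child_labels, max_lag_frames=30):
    """Fraction of the therapist's non-neutral expression frames that the
    child matches within max_lag_frames subsequent frames."""
    matched, total = 0, 0
    for i, expr in enumerate(therapist_labels):
        if expr == "neutral":
            continue
        total += 1
        if expr in child_labels[i : i + max_lag_frames]:
            matched += 1
    return matched / total if total else 0.0

# Hypothetical per-frame labels from an expression classifier.
therapist = ["neutral", "smile", "smile", "neutral", "surprise"]
child     = ["neutral", "neutral", "smile", "neutral", "neutral"]
print(imitation_rate(therapist, child, max_lag_frames=2))  # 2 of 3 matched
```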
Claim 6 is rejected under the same analysis as claim 15 above.
Claim(s) 7 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Prakash in view of Sadiq and DeQuinzio as applied to claims 6 and 15 above, and in further view of Rudovic, Ognjen, et al. "Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach." arXiv preprint arXiv:1906.03098 (2019) (hereinafter referred to as “Rudovic”).
Regarding claim 16, Prakash discloses wherein the individual feature extraction unit generates the gaze feature, the facial expression feature (see Prakash Figs. 1, 9, and 10, and pgs. 47916-47918, where facial expressions are recognized), and the action feature (see Prakash Figs. 1, 4, and 11, and pgs. 47918-47920, where head pose, gaze, and hand pointing direction for the therapist and child are determined).
Prakash does not explicitly disclose generating a multimodal feature by fusing the features, nor a voice feature.
However, Sadiq discloses the voice feature (see Sadiq Figs. 1 and 4, and pgs. 849-851, where Mel-Frequency Cepstral Coefficients spectrograms are generated for adult and child).
Furthermore, Rudovic discloses wherein the individual feature extraction unit generates a multimodal feature by fusing the gaze feature, the facial expression feature, the action feature, and the voice feature (see Rudovic Abstract, Fig. 1, and pgs. 1 and 2, where gaze, facial expression, action, and voice features are fused and useful for assessing engagement level of an autistic child).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the feature fusion technique of Rudovic to the features of Prakash, Sadiq, and DeQuinzio, because it is predictable that fusing the gaze, facial expression, action, and voice features into a single multimodal representation would improve a comprehensive assessment and/or diagnosis of Autism Spectrum Disorder (ASD), and Rudovic further states “[w]e investigate different strategies for multi-modal data fusion, and show that the proposed model-level fusion coupled with RL outperforms the feature-level and modality-specific models . . .” (see Rudovic Abstract).
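(For illustration only: the sketch below shows the simple feature-level fusion baseline, concatenating per-modality feature vectors into one multimodal feature; Rudovic reports that model-level fusion performs better, and all dimensions here are hypothetical.)

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Feature-level fusion: concatenate per-modality feature vectors and
    project them into a single multimodal feature vector."""
    def __init__(self, dims, out_dim=128):
        super().__init__()
        self.proj = nn.Linear(sum(dims), out_dim)

    def forward(self, gaze, face, action, voice):
        return self.proj(torch.cat([gaze, face, action, voice], dim=-1))

fuse = FusionHead(dims=[32, 32, 32, 32])
feats = [torch.randn(1, 32) for _ in range(4)]  # gaze, face, action, voice
print(fuse(*feats).shape)  # torch.Size([1, 128])
```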
Claim 7 is rejected under the same analysis as claim 16 above.
Claim(s) 8 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Prakash in view of Sadiq as applied to claims 1 and 10 above, and in further view of Wilson, US 2014/0289822 A1 (hereinafter referred to as “Wilson”).
Regarding claim 17, Prakash does not explicitly disclose wherein the subject extraction unit extracts the detection area and the voice using previously input image and voice information of the assessment subject.
However, Wilson discloses wherein the subject extraction unit extracts the detection area and the voice using previously input image and voice information of the assessment subject (see Wilson para. 0005, where previously stored reference image and voice data are used to match the person in newly captured image and voice data).
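(For illustration only: the sketch below shows one common way to match newly captured data against previously stored reference data, comparing hypothetical face and voice embeddings by cosine similarity; Wilson's actual matching technique may differ.)

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def matches_reference(new_face, new_voice, ref_face, ref_voice,
                      threshold: float = 0.8) -> bool:
    """Confirm identity when both the face and voice embeddings of the newly
    captured data are close to the stored reference embeddings."""
    return (cosine(new_face, ref_face) >= threshold and
            cosine(new_voice, ref_voice) >= threshold)

# Hypothetical embeddings from face-recognition / speaker-verification models.
rng = np.random.default_rng(0)
ref_face, ref_voice = rng.normal(size=128), rng.normal(size=128)
print(matches_reference(ref_face + 0.05, ref_voice + 0.05, ref_face, ref_voice))
```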
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the previously stored reference data of Wilson to assist with detecting and/or confirming Prakash's subject in newly captured data, because it is predictable that doing so would increase both the speed and accuracy of detection and confirm the identity of the subject for Autism Spectrum Disorder (ASD) diagnosis and/or assessment.
Claim 8 is rejected under the same analysis as claim 17 above.
No prior art has been applied to claims 2-5 and 11-14.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Abbas et al., US 2022/0369976 A1, discloses using machine learning to diagnose autism based on the video and audio of an interaction (see Abbas paras. 0002, 0063, 0064, 0067-0070, 0085, and 0127-0132, where a computer is disclosed, Autism is a behavior disorder to be identified, and “machine learning” software is used).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW M MOYER whose telephone number is (571)272-9523. The examiner can normally be reached Monday-Friday 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW M MOYER/ Supervisory Patent Examiner, Art Unit 2675