DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR
1.17(e), was filed in this application after final rejection. Since this application is eligible for continued
examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the
finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's
submission filed on 9 September 2025 has been entered.
Response to Amendment
The amendment filed on 9 September 2025 has been entered.
Claims 1-12, 14-16, 18-26 are pending.
Claims 1, 18, 25-26 are amended.
Claims 27-30 are new.
Claims 1-12, 14-16, and 18-30 will be pending.
Response to Arguments
Applicant's arguments filed on 9 September 2025 have been fully considered, but they are not persuasive.
Applicant’s remarks, regarding the rejections of claims under 35 USC 103, have been fully considered.
Applicant notes Claim 1 is currently amended to include limitations of: "obtaining facial images for a neural network configuration and a neural network training dataset...; augmenting the neural network training dataset using additional images, wherein the additional images comprise synthetic images that represent a specific demographic, wherein the synthetic images expand a number of images for an age band; partitioning the facial images into multiple subgroups...; calculating a multifactor key performance indicator (KPI) per image...; and promoting the neural network configuration and the neural network training dataset..." (emphasis added).
Applicant submits none of the cited references, alone, or in any combination, show or suggest all the elements of currently amended Claim 1. Applicant submits Claim 1, and its corresponding dependent claims, should be deemed allowable. Applicant notes independent Claims 25 and 26 are currently amended in a manner similar to Claim 1 and therefore, Claims 25 and 26 should also be deemed allowable.
Applicant notes Claim 18 is currently amended to recite "wherein the synthetic images are generated and promoted to the production neural network based on a bias and a non-bias in the neural network training dataset" (emphasis added).
Applicant submits none of the cited references show or suggest "wherein the synthetic images are generated and promoted to the production neural network based on a bias and a non-bias in the neural network training dataset." Applicant submits for this reason, as well as the reasons previously stated in the Section 103 remarks regarding parent Claim 1, currently amended Claim 18 should be deemed allowable.
Applicant submits none of the cited references show or suggest the features of Claims 27-30. For this reason, and the reasons previously stated in the Section 103 remarks regarding currently amended parent Claim 1, new Claims 27-30 should be deemed allowable.
Applicant’s arguments have been considered, but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 10-12, 25-26 are rejected under 35 U.S.C. 103 as being unpatentable over Modarresi (U.S. Pre-Grant Publication No. 2015/0213389), in view of Ghosh et al. (U.S. Pre-Grant Publication No. 2020/0293933, hereinafter 'Ghosh') and Srinivas et al. (NPL: "Age, Gender, and Fine-Grained Ethnicity Prediction using Convolutional Neural Networks for the East Asian Face Dataset", hereinafter 'Srinivas').
Regarding claim 1 and analogous claims 25, 26, Modarresi teaches A computer-implemented method for machine learning comprising ([0101] Although exemplary embodiments have been described in terms of systems and methods, it is contemplated that certain functionality described herein may be implemented in software on microprocessors, such as a processors 126 a-n and 128 included in the user devices 134 a-n and analytics server 102, respectively, shown in FIG. 1, and computing devices such as the computer system 1300 illustrated in FIG. 13.; [0102] Aspects of the present invention shown in FIGS. 1-12, or any part(s) or function(s) thereof, may be implemented using hardware, software modules, firmware, tangible computer readable media having logic or instructions stored thereon, or a combination thereof and may be implemented in one or more computer systems or other processing systems.; [0103] FIG. 13 illustrates an example computer system 1300 in which embodiments of the present invention, or portions thereof, may be implemented as computer-readable instructions or code.):
obtaining facial images for a neural network configuration and a neural network training dataset, wherein the neural network training dataset is associated with the neural network configuration ([0026] At this point, test data can be used to calculate the misfit error for each input each variable in a set of input variables. An embodiment divides data into two parts, training data and test data. According to this embodiment, the [neural network training dataset is associated with the neural network configuration] training data is used to formulate a model (i.e., an algorithm) and the accuracy of the algorithm is checked using the test data.; [0048] As used herein, the term “electronic content” is used to refer to any type of media that can be rendered for display, played on, or used at a computing device, television, or other electronic device. Computing devices include client and server devices such as, but not limited to, servers, desktop computers, laptop computers, smart phones, video game consoles, smart televisions, tablet computers, portable gaming devices, personal digital assistants, etc. [obtaining facial images for a neural network configuration and a neural network training dataset] Electronic content can include text or multimedia files, such as images, video, audio, or any combination thereof.);
calculating a multifactor key performance indicator (KPI) per image, wherein the calculating is based on analyzing performance of two or more image classifier models ([0005] One embodiment involves analyzing one or more [calculating a multifactor key performance indicator (KPI) per image] Key Performance Indicators (KPIs) associated with electronic content.; [0024] An embodiment uses [of two or more image classifier models] Classification and Regression Trees (CART) decision trees as base learners. As an individual base learner can be inaccurate (i.e., produce an inaccurate decision), embodiments combine many base learners (i.e., many decision trees) and use a form of a majority vote, which results in very accurate overall decisions. In certain embodiments, this step uses Classification and Regression Trees (CART) decision trees.; [0048] As used herein, the term “electronic content” is used to refer to any type of media that can be rendered for display, played on, or used at a computing device, television, or other electronic device. Computing devices include client and server devices such as, but not limited to, servers, desktop computers, laptop computers, smart phones, video game consoles, smart televisions, tablet computers, portable gaming devices, personal digital assistants, etc. Electronic content can include text or multimedia files, such as images, video, audio, or any combination thereof.; [0032] [calculating is based on analyzing performance] Embodiments disclosed herein determine KPIs by analyzing data sets with missing entries, high-dimensional data, quantitative (i.e., numerical) data, qualitative (i.e., categorical) data, and data sets including statistical outliers.); and
Modarresi fails to teach augmenting the neural network training dataset using additional images, wherein the additional images comprise synthetic images that represent a specific demographic, wherein the synthetic images expand a number of images for an age band; partitioning the facial images into multiple subgroups, wherein the multiple subgroups represent demographics with potential for biased training; promoting the neural network configuration and the neural network training dataset to a production neural network, wherein the promoting is based on the multifactor key performance indicator.
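The majority-vote combination of base learners quoted from Modarresi [0024] above can be illustrated with a minimal sketch. It is offered only as an aid to reading the cited passage and is not Modarresi's implementation; the names `majority_vote` and `ensemble_predict`, and the toy learners in the usage lines, are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often (hypothetical helper).

    Ties resolve to the label that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(base_learners, sample):
    """Ask every base learner for a decision and combine by majority vote.

    base_learners is assumed to be a list of callables, each mapping a
    sample to a class label; real base learners would be trained decision
    trees rather than the toy functions used below.
    """
    return majority_vote([learner(sample) for learner in base_learners])

# Toy usage: three stand-in "learners", one of which is wrong for x = 5.
learners = [
    lambda x: "high" if x > 3 else "low",
    lambda x: "high" if x > 4 else "low",
    lambda x: "high" if x > 9 else "low",
]
decision = ensemble_predict(learners, 5)
```

An individual learner may be inaccurate, as the third one is here, while the combined decision follows the majority.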
Ghosh teaches partitioning the facial images into multiple subgroups, wherein the multiple subgroups represent demographics with potential for biased training ([0195] In certain embodiments, multi-structured big data sources 304 may be dynamically ingested during the data ingestion and processing 1204 phase. In certain embodiments, based upon a particular context, [partitioning the facial images into multiple subgroups] extraction, parsing, and tagging operations are performed on language, text and images they contain to generate associated datasets 1214.; [0260] In certain embodiments, the AIS impartiality assessment 1342 engine may be configured to [multiple subgroups represent demographics with potential for biased training] obtain bias ranges of features; compare the identified significant contrasts to obtained bias ranges, and determine which of the identified significant contrasts fall outside the obtained bias ranges.);
promoting the neural network configuration and the neural network training dataset to a production neural network, wherein the promoting is based on the multifactor key performance indicator ([0155] In certain embodiments, the cognitive agent governance and assurance operations may include KPI-driven AI model optimization. In these embodiments, the definition of such KPIs, and the method by which they are used to optimize a particular AI model, is a matter of design choice.; [0164] In turn, the results of the KPI evaluations are then used as feedback to improve the performance of the cognitive process. In certain embodiments, the [promoting is based on the multifactor key performance indicator] results of the KPI evaluations may be provided as input in step 904 to [promoting the neural network configuration and the neural network training dataset to a production neural network] determine additional operational and performance parameters related to the cognitive process. In certain embodiments, these additional operational and performance parameters may be used to repeat one or more steps associated with the lifecycle of the cognitive process to revise its functionality, improve its performance, or both.).
Modarresi and Ghosh are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Ghosh to Modarresi before the effective filing date of the claimed invention in order to ensure ML models are designed, implemented, and maintained responsibly and provide auditability of their fairness, robustness, transparency, and interpretability (cf. Ghosh, [0031] Accordingly, certain aspects of the invention likewise reflect there are ethical, moral, and social obligations for researchers, developers, and organizations alike to ensure their ML models are designed, implemented, and maintained responsibly. Various aspects of the invention likewise reflect an appreciation that one approach to achieving responsible design, implementation, and maintenance of ML models is to provide auditability of their fairness, robustness, transparency, and interpretability.).
Srinivas teaches augmenting the neural network training dataset using additional images, wherein the additional images comprise synthetic images that represent a specific demographic ([1. Introduction, pg. 954] Convolutional neural networks are used here to predict demographics and provide a baseline performance for the WEAFD. CNNs have recently found great success in many applications relying on facial imaging, including facial alignment via landmark detection, parsing, identity recognition, attribute recognition, expression detection, and [additional images comprise synthetic images that represent a specific demographic] demographic estimation including age, gender, and ethnicity. The success of CNNs stem from the ability of deep neural networks to automatically discover features which discriminate and describe the underlying differences in data that are difficult to quantify. Facial data often exhibits such differences through unconstrained pose, lighting, expressions, and occlusions.; [A. Training, Validation and Testing Dataset, pg. 958] For training the age models, we limited the number of images in each class to a maximum of 500. For training the gender models, we limited the number of images in each class to a maximum of 4000. For training the finegrained ethnicity model, we limited the number of images in each class to a maximum of 500. We use [augmenting the neural network training dataset using additional images] data augmentation during training to increase the number of training images. Data augmentation is achieved by mirroring the image. This doubles the number of images used during training.),
wherein the synthetic images expand a number of images for an age band ([A. Training, Validation and Testing Dataset, pg. 958] We use [wherein the synthetic images expand a number of images] data augmentation during training to increase the number of training images. Data augmentation is achieved by mirroring the image. This doubles the number of images used during training.; [B. Metadata Collection Procedure, pg. 955] Age, gender and fine-grained ethnicity information were labeled by the Turkers. Age Estimation: Estimate the [for an age band] age range of the subjects in the image. The choices for age ranges are: 0-2 years, 3-5 years, 6-9 years, 10-12 years, 13-16 years, 17-23 years, 24-39 years, 40-54 years, 55-64 years, and 65+ years.);
Modarresi, Ghosh, and Srinivas are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi and Ghosh, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Srinivas to Modarresi before the effective filing date of the claimed invention in order to use the Wild East Asian Face Dataset (WEAFD) to predict age, gender and fine-grained ethnicity of an individual by providing baseline results using a convolutional neural network (CNN) (cf. Srinivas, [Abstract, pg. 953] This paper explores the difficulty of performing automatic demographic prediction on the East Asian population. We introduce the Wild East Asian Face Dataset (WEAFD), a new and unique dataset, to the research community. This dataset consists primarily of labeled face images of individuals from East Asian countries, including Vietnam, Burma, Thailand, China, Korea, Japan, Indonesia, and Malaysia. East Asian Amazon Mechanical Turk annotators were used to label the age, gender and fine grain ethnicity attributes to reduce the impact of the “other-race effect” and improve quality of annotations. We focus on predicting age, gender and fine-grained ethnicity of an individual by providing baseline results using a convolutional neural network (CNN). Fine-grained ethnicity prediction refers to predicting refined categorization of the human population (Chinese, Japanese, Korean, etc.). Performance of two CNN architectures is presented, highlighting the difficulty of these tasks and showcasing potential design considerations that improve network optimization by promoting region based feature extraction.).
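The mirroring augmentation quoted from Srinivas above (each training image is mirrored, which doubles the number of training images) can be sketched as follows. This is an illustrative sketch under stated assumptions, not code from Srinivas: images are assumed to be plain nested lists of pixel rows, each sample is assumed to carry an age-range label, and the helper names `mirror`, `augment_by_mirroring`, and `count_per_band` are hypothetical.

```python
from collections import Counter

def mirror(image):
    """Flip an image left-to-right; image is a list of pixel rows."""
    return [row[::-1] for row in image]

def augment_by_mirroring(samples):
    """Return the original samples plus one mirrored copy of each.

    samples is assumed to be a list of (image, age_band) pairs; the
    mirrored copy keeps the age-band label of its source image.
    """
    return samples + [(mirror(image), band) for image, band in samples]

def count_per_band(samples):
    """Count how many images fall in each age band."""
    return Counter(band for _, band in samples)

# Toy usage with 2x2 "images" and two of the age ranges listed above.
samples = [
    ([[1, 2], [3, 4]], "0-2 years"),
    ([[5, 6], [7, 8]], "65+ years"),
]
augmented = augment_by_mirroring(samples)
```

After augmentation each age band in the toy set holds twice as many images as before, which is the doubling effect the quoted passage describes.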
Regarding claim 2, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 1.
Ghosh teaches wherein the multifactor key performance indicator (KPI) identifies bias in the training dataset ([0155] In certain embodiments, the cognitive agent governance and assurance operations may include KPI-driven AI model optimization. In these embodiments, the definition of such [multifactor key performance indicator (KPI)] KPIs, and the method by which they are used to optimize a particular AI model, is a matter of design choice.; [0216] In certain embodiments, the AIS impartiality assessment operation may be performed to [identifies bias in the training dataset] detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0164] In turn, the results of the KPI evaluations are then used as feedback to improve the performance of the cognitive process. In certain embodiments, the results of the KPI evaluations may be provided as input in step 904 to determine additional operational and performance parameters related to the cognitive process.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 3, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 2.
Ghosh teaches wherein identified bias precludes promotion to the production neural network ([0216] In certain embodiments, the AIS impartiality assessment operation may be performed to [identified bias] detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can [precludes promotion to the production neural network] modify the opaque model 1332, the training corpus 1302, the model trainer 1304, or a combination thereof, to achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 4, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 2.
Ghosh teaches wherein an absence of identified bias allows promotion to the production neural network ([0216] In certain embodiments, the AIS impartiality assessment operation may be performed to [absence of identified bias] detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can modify the opaque model 1332, the training corpus 1302, the model trainer 1304, or a combination thereof, to achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 [allows promotion to the production neural network] a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 10, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 1.
Modarresi teaches wherein the training dataset includes facial images ([0026] At this point, test data can be used to calculate the misfit error for each input each variable in a set of input variables. An embodiment divides data into two parts, [training dataset] training data and test data. According to this embodiment, the training data is used to formulate a model (i.e., an algorithm) and the accuracy of the algorithm is checked using the test data.; [0048] As used herein, the term “electronic content” is used to refer to any type of media that can be rendered for display, played on, or used at a computing device, television, or other electronic device. Computing devices include client and server devices such as, but not limited to, servers, desktop computers, laptop computers, smart phones, video game consoles, smart televisions, tablet computers, portable gaming devices, personal digital assistants, etc. Electronic content can include text or multimedia files, such as [includes facial images] images, video, audio, or any combination thereof.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 11, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 1.
Ghosh teaches further comprising training the production neural network, using the neural network training dataset that is promoted ([0216] In certain embodiments, the AIS impartiality assessment operation may be performed to detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can modify the opaque model 1332, the [training the production neural network using the neural network training dataset that is promoted] training corpus 1302, the model trainer 1304, or a combination thereof, to achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 12, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 11.
Ghosh teaches wherein the neural network training dataset that is promoted enables bias mitigation ([0216] In certain embodiments, the AIS impartiality assessment operation may be performed to detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can modify the opaque model 1332, the [neural network training dataset that is promoted] training corpus 1302, the model trainer 1304, or a combination thereof, to [enables bias mitigation] achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, and Srinivas are combinable for the same rationale as set forth above with respect to claim 1.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Bekkar et al. (NPL: "Evaluation Measures for Models Assessment over Imbalanced Data Sets", hereinafter 'Bekkar') and Tewari et al. (NPL: "FML: Face Model Learning from Videos", hereinafter 'Tewari').
Regarding claim 5, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the multifactor KPI comprises an F-measure, an ROC-AUC measure, a precision measure, a recall/true positive rate, a false positive rate, a total number of videos measure, a number of positive videos measure, a number of positive frames measure, or a number of negative frames measure.
Bekkar teaches wherein the multifactor KPI comprises an F-measure, an ROC-AUC measure, a precision measure, a recall/true positive rate, a false positive rate, a total number of videos measure, a number of positive videos measure, a number of positive frames measure, or a number of negative frames measure ([Abstract, pg. 27] Imbalanced data learning is one of the challenging problems in data mining; among this matter, founding the right model assessment measures is almost a primary research issue. Skewed class distribution causes a misreading of common evaluation measures as well it lead a biased classification. This article presents a set of alternative for [multifactor KPI] imbalanced data learning assessment, using a combined measures (G-means, likelihood ratios, Discriminant power, [F-measure] F-Measure Balanced Accuracy, Youden index, Matthews correlation coefficient), and graphical performance assessment ([an ROC-AUC measure] ROC curve, Area Under Curve, Partial AUC, Weighted AUC, Cumulative Gains Curve and lift chart, Area Under Lift AUL), that aim to provide a more credible evaluation. We analyze the applications of these measures in churn prediction models evaluation, a well known application of imbalanced data.; [2. Fundamental evaluation measures, pg. 27] In machine learning, the classifier is basically evaluated by a confusion matrix. For a binary class problem a matrix is a square of 2×2 as shown in Table 1; column represents the classifier prediction; while the row is the real value of class label. In imbalanced data context, by convention, the observations of minority class are labelled as positive, whilst and the class label of the majority class observations are labelled negative. Accuracy, the most common metric for classifier evaluation, it assesses the overall effectiveness of the algorithm by estimating the probability of the true value of the class label.
On the other hand the error rate=1-accuracy is an estimation of misclassification probability according to model prediction. Intuitively, [a precision measure] precision is a measure of correctness (i.e., [a false positive rate] out of positive labeled examples, how many are really a positive examples), while [a recall/true positive rate] Sensitivity (or Recall) is a measure of completeness or accuracy of positive examples (i.e., how many examples of the positive class were labeled correctly). These two metrics, share an inverse relationship between each other. However, unlike accuracy and error, precision and recall are not sensitive to changes in data distributions. A perfect model will capture all positive examples (Recall = 1), and score as only the examples that are in fact (Precision = 1), from an analytical point of view it is desirable to increase recall without sacrificing accuracy.; [5. Application cases and discussion:, pg. 35] Following this assessment we can affirm that model 1B is the best in class in this case in spite of the lowest Accuracy that it has on the 2nd data set, the Model 2D Basic CHAID reveal the highest KPI in accuracy, Precision and Specificity; AUC are close over different models; while the lift brings a low preference to Model 2B. The Discriminant power and MCC values presents a correlation with Accuracy value, where we observe the lowest value of DP in model 2B, and highest one associated to model 2D which is the same trend of accuracy value.).
Modarresi, Ghosh, Srinivas, and Bekkar are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Bekkar to Modarresi before the effective filing date of the claimed invention in order to assess imbalanced data and evaluations of model predictions (cf. Bekkar, [Abstract, pg. 27] Imbalanced data learning is one of the challenging problems in data mining; among this matter, founding the right model assessment measures is almost a primary research issue. Skewed class distribution causes a misreading of common evaluation measures as well it lead a biased classification. This article presents a set of alternative for imbalanced data learning assessment, using a combined measures (G-means, likelihood ratios, Discriminant power, F-Measure Balanced Accuracy, Youden index, Matthews correlation coefficient), and graphical performance assessment (ROC curve, Area Under Curve, Partial AUC, Weighted AUC, Cumulative Gains Curve and lift chart, Area Under Lift AUL), that aim to provide a more credible evaluation. We analyze the applications of these measures in churn prediction models evaluation, a well known application of imbalanced data.).
Tewari teaches wherein the multifactor KPI comprises an F-measure, an ROC-AUC measure, a precision measure, a recall/true positive rate, a false positive rate, a total number of videos measure, a number of positive videos measure, a number of positive frames measure, or a number of negative frames measure ([Per-frame Parameter Estimation Network, pg. 10816] We employ a convolutional network to extract low-level features. We then apply a series of convolutions, ReLU, and fully connected layers to regress the [a number of positive frames measure, or a number of negative frames measure] per-frame parameters p[f]. We refer to the supplemental document for further details.).
Modarresi, Ghosh, Srinivas, Bekkar, and Tewari are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, Srinivas, and Bekkar, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Tewari to Modarresi before the effective filing date of the claimed invention in order to obtain a more accurate and robust model of facial geometry and albedo (cf. Tewari, [1. Introduction, pg. 10813] From a technical point of view, one of our main contributions is a novel multi-frame consistency loss, which ensures that the face identity and albedo reconstruction is consistent across frames of the same subject. This way we can avoid depth ambiguities present in many monocular approaches and obtain a more accurate and robust model of facial geometry and albedo. Moreover, by imposing orthogonality between our learned face identity model and an existing blendshape expression model, our approach automatically disentangles facial expressions from identity based geometry variations, without resorting to a large set of handcrafted priors.).
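The confusion-matrix measures quoted from Bekkar above (precision, recall/true positive rate, false positive rate, F-measure) can be illustrated with a short sketch using their standard textbook definitions. The function names and the toy labels are hypothetical, labels are assumed to be binary with 1 as the positive (minority) class, and nothing here is taken from the application or from the cited references' code.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def basic_measures(y_true, y_pred):
    """Return precision, recall (TPR), false positive rate, and F1.

    A ratio with a zero denominator is reported as 0.0; that convention
    is an assumption of this sketch.
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "fpr": fpr, "f1": f1}

# Toy imbalanced example: 3 positives and 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
measures = basic_measures(y_true, y_pred)
```

In the toy data two of the three positives are found and one negative is flagged in error, so precision and recall are both 2/3 and the false positive rate is 1/5.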
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Zhang et al. (NPL: "Mitigating Unwanted Biases with Adversarial Learning", hereinafter 'Zhang').
Regarding claim 6, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the multifactor KPI comprises an equal odds or equal opportunity measure.
Zhang teaches wherein the multifactor KPI comprises an equal odds or equal opportunity measure ([1 INTRODUCTION, pg. 335] Machine learning leverages data to build models capable of assessing the labels and properties of novel data. Unfortunately, the available training data frequently contains biases with respect to things that we would rather not use for decision making. Machine learning builds models faithful to training data and can lead to perpetuating these undesirable biases. For example, systems designed to predict creditworthiness and systems designed to perform analogy completion have been demonstrated to be biased against racial minorities and women respectively. Ideally we would be able to build a model which captures exactly those generalizations from the data which are useful for performing some task which are not discriminatory in a way which the people building those models consider unfair. Work on training machine learning systems that output fair decisions has defined several [multifactor KPI] useful measurements for fairness: Demographic Parity, [comprises an equal odds or equal opportunity measure] Equality of Odds, and Equality of Opportunity. These can be imposed as constraints or incorporated into a loss function in order to mitigate disproportional outcomes in the system’s output predictions regarding a protected demographic, such as sex.).
Modarresi, Ghosh, Srinivas, and Zhang are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Zhang to Modarresi before the effective filing date of the claimed invention in order to mitigate disproportional outcomes in the system’s output predictions regarding a protected demographic, such as sex (cf. Zhang, [1. Introduction, pg. 335] Work on training machine learning systems that output fair decisions has defined several useful measurements for fairness: Demographic Parity, Equality of Odds, and Equality of Opportunity. These can be imposed as constraints or incorporated into a loss function in order to mitigate disproportional outcomes in the system’s output predictions regarding a protected demographic, such as sex.).
Regarding claim 7, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the multifactor KPI identifies models that generalize across one or more of the demographics.
Zhang teaches wherein the multifactor KPI identifies models that generalize across one or more of the demographics ([4 PROPERTIES, pg. 337] We note several properties of the above method that we believe distinguish it from past work. (1) [multifactor KPI identifies models that generalize across one or more of the demographics] Generality: The above method can be used to enforce demographic parity, equality of odds, or equality of opportunity as described in Hardt et al. [5]. Further, it applies without modification to the cases when the output variable and/or protected variable are continuous instead of discrete.).
Modarresi, Ghosh, Srinivas, and Zhang are combinable for the same rationale as set forth above with respect to claim 6.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Bist et al. (U.S. Pre-Grant Publication No. 2020/0288206, hereinafter 'Bist').
Regarding claim 8, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the two or more image classifier models operate on the multiple subgroups of facial images.
Bist teaches wherein the two or more image classifier models operate on the multiple subgroups of facial images ([0089] In an embodiment, the present invention provides a method for evaluating media content based on combining [two or more image classifier models] multi-modal inputs from the participants that include [operate on the multiple subgroups of facial images] reactions and emotions (captured in form of facial expression) that are recorded in real-time on a frame-by-frame basis. The real time reactions and emotions may be recorded in two different steps or campaigns (with two different sets of people), and which include different participants for each.).
Modarresi, Ghosh, Srinivas, and Bist are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Bist to Modarresi before the effective filing date of the claimed invention in order to enable a way to add meaningful contextual and personalized information to content, that could then be used for searching, classifying, or analyzing the particular content in a variety of ways (cf. Bist, [0005] In light of above, a method and a system for a scalable platform is provided that enables granular tagging of any multimedia or other web content over connected networks. The method of the invention provides an ability to go in much more granular within a content and enable a way to add meaningful contextual and personalized information to it, that could then be used for searching, classifying, or analyzing the particular content in a variety of ways, and in a variety of applications.).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Rohekar et al. (U.S. Pre-Grant Publication No. 2023/0117143, hereinafter 'Rohekar').
Regarding claim 9, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the neural network configuration includes a neural network topology.
Rohekar teaches wherein the neural network configuration includes a neural network topology ([0218] A second exemplary type of neural network is the Convolutional Neural Network (CNN). A CNN is a specialized [neural network configuration includes a neural network topology] feedforward neural network for processing data having a known, grid-like topology, such as image data. Accordingly, CNNs are commonly used for compute vision and image recognition applications, but they also may be used for other types of pattern recognition such as speech and language processing.).
Modarresi, Ghosh, Srinivas, and Rohekar are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Rohekar to Modarresi before the effective filing date of the claimed invention in order to support computer vision, image recognition, and other types of pattern recognition such as speech and language processing (cf. Rohekar, [0218] A second exemplary type of neural network is the Convolutional Neural Network (CNN). A CNN is a specialized feedforward neural network for processing data having a known, grid-like topology, such as image data. Accordingly, CNNs are commonly used for compute vision and image recognition applications, but they also may be used for other types of pattern recognition such as speech and language processing.).
Claims 14-16, 19 are rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Kristensen et al. (U.S. Pre-Grant Publication No. 2021/0286923, hereinafter 'Kristensen').
Regarding claim 14, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the additional images are processed to produce a further multifactor KPI.
Kristensen teaches wherein the additional images are processed to produce a further multifactor KPI ([0027] A sensor model may include a deep neural network (DNN) with any suitable architecture, and may support generative learning. For example, a sensor model may include a generative adversarial network (GANs), a variational autoencoder (VAE), and/or another type of DNN or machine learning model. At a high level, a sensor model may accept some encoded representation of a scene configuration as an input using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.).; [0127] Now referring to FIG. 8, FIG. 8 includes a data flow diagram for key performance indicator (KPI) analysis and observation, in accordance with some embodiments of the present disclosure. A [the additional images are processed to produce a further multifactor KPI] KPI evaluation component may evaluate the performance of the virtual object(s) (e.g., vehicles, robots, etc.). Logs 806 may be generated and passed to re-simulator/simulator 804. The re-simulator/simulator 804 may provide sensor data to the software stack(s) 116 which may be executed using HIL, SIL, or a combination thereof. The KPI evaluation component 802 may use different metrics for each simulation or re-simulation instance. For examples, for re-simulation, KPI evaluation component may provide access to the original re-played CAN data and/or the newly generated CAN data from the software stack(s) 116 (e.g., from HIL or SIL).).
Modarresi, Ghosh, Srinivas, and Kristensen are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Kristensen to Modarresi before the effective filing date of the claimed invention in order to collect real world data to derive training data used to train a sensor model (cf. Kristensen, [0005] Real-world data and/or virtual data may be collected and used to derive training data (e.g., input scene configurations and/or ground truth sensor data), which may be used to train the sensor model to predict virtual sensor data for a given scene configuration.).
Regarding claim 15, Modarresi, as modified by Ghosh, Srinivas, and Kristensen, teaches the method of claim 14.
Ghosh teaches wherein the additional images are promoted based on the further multifactor KPI ([0155] In certain embodiments, the cognitive agent governance and assurance operations may include [based on the further multifactor KPI] KPI-driven AI model optimization. In these embodiments, the definition of such KPIs, and the method by which they are used to optimize a particular AI model, is a matter of design choice.; [0216] In certain embodiments, the AIS impartiality assessment operation may be performed to detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can modify the opaque model 1332, the [additional images are promoted] training corpus 1302, the model trainer 1304, or a combination thereof, to achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, Srinivas, and Kristensen are combinable for the same rationale as set forth above with respect to claim 14.
Regarding claim 16, Modarresi, as modified by Ghosh, Srinivas, and Kristensen, teaches the method of claim 14.
Ghosh teaches wherein the additional images provide neural network training dataset bias mitigation ([0155] In certain embodiments, the cognitive agent governance and assurance operations may include KPI-driven AI model optimization. In these embodiments, the definition of such KPIs, and the method by which they are used to optimize a particular AI model, is a matter of design choice.; [0216] In certain embodiments, the AIS impartiality assessment operation may be performed to detect the presence of bias in a particular ML model, such as the opaque model 1332, and if detected, assess its effect on the outcome of an associated cognitive computing operation.; [0271] In certain embodiments, the identified significant contrasts may be presented as a cognitive insight 1352 to the developer of the model, such that the developer can modify the opaque model 1332, the [additional images] training corpus 1302, the model trainer 1304, or a combination thereof, to [provide neural network training dataset bias mitigation] achieve less biased results. Skilled practitioners of the art will recognize that the described presentation of identified significant contrasts as a cognitive insight 1352 provides the developer of the opaque model 1332 a basis for modifying their model, the training corpus 1302, the model trainer 1304, or a combination thereof.).
Modarresi, Ghosh, Srinivas, and Kristensen are combinable for the same rationale as set forth above with respect to claim 14.
Regarding claim 19, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the additional images are generated using a generative adversarial network (GAN).
Kristensen teaches wherein the additional images are generated using a generative adversarial network (GAN) ([0035] Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image. Moreover, some neural network architectures are designed to accept and operate on an input vector that encodes some type of input information. Further, some neural network architectures—such as [using a generative adversarial network (GAN)] GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional [the additional images are generated] image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data.).
Modarresi, Ghosh, Srinivas, and Kristensen are combinable for the same rationale as set forth above with respect to claim 14.
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Jaipuria et al. (NPL: "Deflating Dataset Bias Using Synthetic Data Augmentation", hereinafter 'Jaipuria').
Regarding claim 18, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the synthetic images are generated and promoted to the production neural network based on a bias and a non-bias in the neural network training dataset.
Jaipuria teaches wherein the synthetic images are generated and promoted to the production neural network based on a bias and a non-bias in the neural network training dataset ([3.1. Revisiting “Name That Dataset”, pg. 3] We chose to apply [synthetic images are generated and promoted to the production neural network] data augmentation to only these two datasets as they are also used for the lane detection experiments in Section 5 with readily available sim2real data on hand. Fig. 1 compares the confusion matrices of the two classifiers, with and without synthetic data augmentation. Here, the labels 1, 2, 3, 4 and 5 denote the ApolloScape, BDD100K, CULane, Mapillary and TuSimple datasets respectively. Consistent with the motivating hypothesis H, synthetic data augmentation diffuses the strength of the diagonal indicating deflated dataset bias.; [6.2. Experiment Details, pg. 7] As with the other tasks, we train the task network with different percentages of simulated (A + S) and sim2real (A + G) data, starting from 0% to 100% and test on KITTI sequences that were not seen during training (B). We use the [based on a bias and a non-bias in the neural network training dataset] Root Mean Squared Error (RMSE) metric to determine the performance of the network trained on a particular sim/real or sim2real/real mix, after limiting maximum depth to 100m. We provide detailed RMSE results in Figure 9. We also tested this task based on accuracy of depth estimation, measured as the ratio of correctly estimated depth pixels to the total number of depth pixels. These results are summarized, along with RMSE in Table 3 and more detailed results for accuracy are provided in the Supplementary Material. RMSE and accuracy are common metrics used in prior work on single image depth [6]. A lower value of RMSE indicates better performance while the same is true for a higher value for accuracy.).
Modarresi, Ghosh, Srinivas, and Jaipuria are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Jaipuria to Modarresi before the effective filing date of the claimed invention in order to combine the benefits of gaming engine simulations and sim2real style transfer techniques for filling gaps in real datasets for vision tasks (cf. Jaipuria, [Abstract, pg. 1] The goal of this paper is to investigate the use of targeted synthetic data augmentation - combining the benefits of gaming engine simulations and sim2real style transfer techniques - for filling gaps in real datasets for vision tasks. Empirical studies on three different computer vision tasks of practical use to AVs - parking slot detection, lane detection and monocular depth estimation - consistently show that having synthetic data in the training mix provides a significant boost in cross-dataset generalization performance as compared to training on real data only, for the same size of the training set.).
Claims 20-24, 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Wilf et al. (U.S. Pre-Grant Publication No. 2019/0005359, hereinafter 'Wilf').
Regarding claim 20, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the additional images comprise real images from a specific demographic.
Wilf teaches wherein the additional images comprise real images from a specific demographic ([0030] According to an embodiment of the invention, the process of the face-based personality analysis module may further comprise one or more additional tasks such as: searching for additional images of the subject person by using a name search engine that can be augmented by face recognition by face recognition and analyzing said additional images to enhance the accuracy of predicted personality traits or capabilities.; [0170] Face-based demographic classification (gender, age, ethnicity) is known in prior art. The idea is that improved personality trait/behavior/capability classification may benefit from demographic segmentation. For example, the system [additional images comprise real images from a specific demographic] collects images of male researchers and female researchers, doing the same with a control group, which in this context may comprise of people that are known not to be researchers. Now we train a classifier for male researcher and one for female researcher (and of course verify during development that we benefit from such segmentation).).
Modarresi, Ghosh, Srinivas, and Wilf are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Wilf to Modarresi before the effective filing date of the claimed invention in order to predict personality traits based on automated computerized or computer assisted analysis of that person's body images and in particular face images (cf. Wilf, [0006] It is an object of the present invention to provide a system which is capable of predicting personality traits based on automated computerized or computer assisted analysis of that person's body images and in particular face images.; [0007] It is another an object of the present invention to provide an automated method of selecting personality traits and capabilities that can be predicted from face images and predicting such traits and capabilities from one or more face images.).
Regarding claim 21, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the additional images comprise real images containing a specific facial characteristic.
Wilf teaches wherein the additional images comprise real images containing a specific facial characteristic ([0030] According to an embodiment of the invention, the process of the face-based personality analysis module may further comprise one or more additional tasks such as: searching for additional images of the subject person by using a name search engine that can be augmented by face recognition by face recognition and analyzing said additional images to enhance the accuracy of predicted personality traits or capabilities.; [0110] Face Landmark Detection module 120 performs geometric rectification and/or frontalization of face images, e.g., by searching for [additional images comprise real images containing a specific facial characteristic] specific facial key points that are useful in verifying the face pose and later converting the face image into a standard, normalized representation that may represent a frontal pose, and may have an essentially neutral expression. Such key points customarily include the eyes, eye corners, eyebrows, the nose top, the mouth, the chin, etc. as indicated by the black dots on the face image in FIG. 5. Higher level of detail may include dozens of such points which may be used as an image descriptor for the learning and prediction steps.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Regarding claim 22, Modarresi, as modified by Ghosh, Srinivas, and Wilf, teaches the method of claim 21.
Wilf teaches wherein the specific facial characteristic includes facial expressions ([0030] According to an embodiment of the invention, the process of the face-based personality analysis module may further comprise one or more additional tasks such as: searching for additional images of the subject person by using a name search engine that can be augmented by face recognition by face recognition and analyzing said additional images to enhance the accuracy of predicted personality traits or capabilities.; [0132] [specific facial characteristic includes facial expressions] Face expression analysis modules 240 further selects face images of neutral expression, in order to avoid biased results of face personality analysis due to extreme expressions. Such expression analysis can be implemented by using techniques such as disclosed by B. Fasel and J. Luettin, Automatic Facial Expression Analysis: A Survey (1999), Pattern Recognition, 36, pp. 259-275, 1999.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Regarding claim 23, Modarresi, as modified by Ghosh and Srinivas, teaches the method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the additional images comprise real images containing a specific image characteristic.
Wilf teaches wherein the additional images comprise real images containing a specific image characteristic ([0030] According to an embodiment of the invention, the process of the face-based personality analysis module may further comprise one or more additional tasks such as: searching for additional images of the subject person by using a name search engine that can be augmented by face recognition by face recognition and analyzing said additional images to enhance the accuracy of predicted personality traits or capabilities.; [0111] Face images are captured in multiple poses and expressions. To facilitate representation, learning and classification we assume that all [additional images comprise real images containing a specific image characteristic] images are either full frontal or side profile images. In most situations, a large number of available images will allow selecting such images for training and prediction, where such selection can be manual or automatic, using prior art techniques of pose classification.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Regarding claim 24, Modarresi, as modified by Ghosh, Srinivas, and Wilf, teaches the method of claim 23.
Wilf teaches wherein the image characteristic includes lighting, focus, facial orientation, or resolution ([0111] Face images are captured in multiple poses and expressions. To facilitate representation, learning and classification we assume that all images are either full frontal or side profile images. In most situations, a large number of available images will allow selecting such images for training and prediction, where such selection can be manual or automatic, using prior art techniques of pose classification.; [0130] These similarity-filtered search results are further analyzed to select images of high quality, of neutral expression and of [facial orientation] appropriate pose (for example full frontal images and side profile images). Face quality module 230 uses quality metrics such as face size, face [focus] image sharpness to select face select images of [resolution] good quality.; [0143] SIFT=Scale Invariant Feature Transform [image characteristic includes] extracts from an image a collection of feature vectors, each of which is invariant to image translation, scaling, and rotation, partially invariant to [lighting] illumination changes and robust to local geometric distortion.; [0214] A face-based personality analysis module generates description of personality trait as described in the present invention. Face images are captured from video during the call itself and optionally stored for later interactions. Additionally, a name search engine, optionally augmented by face recognition, as described in FIG. 2 hereinabove, brings in additional pictures of the same person, with better quality, controlled pose, illumination etc.; [0125] FIG. 2 describes the face gathering module 110 in further details, according to an embodiment of the invention. Image search engines 210 such as Google Images generate multiple search results upon a name search. Certain filters in the search engine allow returning only images with faces, only images larger than a certain size, etc.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Regarding claim 29, Modarresi, as modified by Ghosh, Srinivas, and Wilf, teaches the method of claim 21.
Srinivas teaches wherein the specific facial characteristic includes facial markings ([A. Training, Validation and Testing Dataset, pg. 958] We use data augmentation during training to increase the number of training images. Data augmentation is achieved by mirroring the image. This doubles the number of images used during training.; [1. Introduction, pg. 954] Facial data often exhibits such differences through unconstrained pose, lighting, expressions, and occlusions. When combined with differences in age, hair, skin color, and [includes facial markings] facial marks, it is clear that models trained on this data need to be invariant to such changes and generalize well across facial characteristics.; [B. Metadata Collection Procedure, pg. 955] Age, gender and fine-grained ethnicity information were labeled by the Turkers. Phase 3: In Phase 3, only images labeled as East and West Asians were considered for further labeling. All registered turkers irrespective of their nationality could participate. Turkers labeled information corresponding to [wherein the specific facial characteristic] soft biometrics such as facial expressions, the presence of facial hair (e.g. beard, stubble, mustache, etc.), head hair color, eyebrow appearance, and image lighting. The dataset exhibits a wide variation in unconstrained pose, lighting, expression, and occlusions as shown in figures 1 and 2. The distribution of subjects in WEAFD for age, gender and fine-grained ethnicity is given in Table I.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Regarding claim 30, Modarresi, as modified by Ghosh, Srinivas, and Wilf, teaches the method of claim 21.
Srinivas teaches wherein the specific facial characteristic includes one of lighting, focus, facial orientation, or resolution ([A. Training, Validation and Testing Dataset, pg. 958] We use data augmentation during training to increase the number of training images. Data augmentation is achieved by mirroring the image. This doubles the number of images used during training.; [1. Introduction, pg. 954] Facial data often exhibits such differences through unconstrained pose, lighting, expressions, and occlusions. When combined with differences in age, hair, skin color, and facial marks, it is clear that models trained on this data need to be invariant to such changes and generalize well across facial characteristics.; [B. Metadata Collection Procedure, pg. 955] Age, gender and fine-grained ethnicity information were labeled by the Turkers. Phase 3: In Phase 3, only images labeled as East and West Asians were considered for further labeling. All registered turkers irrespective of their nationality could participate. Turkers labeled information corresponding to [specific facial characteristic includes one of] soft biometrics such as facial expressions, the presence of facial hair (e.g. beard, stubble, mustache, etc.), head hair color, eyebrow appearance, and image lighting. The dataset exhibits a wide variation in [facial orientation] unconstrained pose, [lighting] lighting, expression, and occlusions as shown in figures 1 and 2. The distribution of subjects in WEAFD for age, gender and fine-grained ethnicity is given in Table I.).
Modarresi, Ghosh, Srinivas, and Wilf are combinable for the same rationale as set forth above with respect to claim 20.
Claims 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over Modarresi, in view of Ghosh, Srinivas, and further in view of Chun et al. (NPL: "NADS-Net: A Nimble Architecture for Driver and Seat Belt Detection via Convolutional Neural Networks", hereinafter 'Chun').
Regarding claim 27, Modarresi, as modified by Ghosh and Srinivas, teaches The method of claim 1.
Modarresi, as modified by Ghosh and Srinivas, fails to teach wherein the demographic includes a seat location.
Chun teaches wherein the demographic includes a seat location ([Data collection] We collected videos of drivers and passengers in a Volvo XC90 research vehicle through on-road driving studies. Over 7 months ranging from Spring to Winter, the total of 100 subjects consented to participate in the study in compliance to the internal review board (IRB) requirements.; [Statistics] It should be noted that all driving sessions were accompanied by a research staff as a safety protocol and, thus, the videos contain some repeated appearances of a few research staffs. To minimize the potential bias in the data, the researchers rotated the duty across the driving sessions. By the safety requirement, the researchers had to sit on the front passenger seat when the vehicle was in motion, but while the vehicle was at park, they moved around to different seat positions as much as possible to minimize the data bias. Moreover, researchers were asked to wear different clothing and accessories each time.).
Modarresi, Ghosh, Srinivas, and Chun are considered to be analogous to the claimed invention because they are in the same field of machine learning. In view of the teachings of Modarresi, Ghosh, and Srinivas, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Chun to Modarresi before the effective filing date of the claimed invention in order to provide meaningful insights for the autonomous driving research community and automotive industry for future algorithm development and data collection (cf. Chun, [Abstract] A new convolutional neural network (CNN) architecture for 2D driver/passenger pose estimation and seat belt detection is proposed in this paper. The new architecture is more nimble and thus more suitable for in-vehicle monitoring tasks compared to other generic pose estimation algorithms. The new architecture, named NADS-Net, utilizes the feature pyramid network (FPN) backbone with multiple detection heads to achieve the optimal performance for driver/passenger state detection tasks. The new architecture is validated on a new data set containing video clips of 100 drivers in 50 driving sessions that are collected for this study. The detection performance is analyzed under different demographic, appearance, and illumination conditions. The results presented in this paper may provide meaningful insights for the autonomous driving research community and automotive industry for future algorithm development and data collection.).
Regarding claim 28, Modarresi, as modified by Ghosh, Srinivas, and Chun, teaches The method of claim 27.
Chun teaches wherein the seat location includes a seat location within a vehicle ([Data collection] We collected videos of drivers and passengers in a Volvo XC90 research vehicle through on-road driving studies. Over 7 months ranging from Spring to Winter, the total of 100 subjects consented to participate in the study in compliance to the internal review board (IRB) requirements.; [Statistics] It should be noted that all driving sessions were accompanied by a research staff as a safety protocol and, thus, the videos contain some repeated appearances of a few research staffs. To minimize the potential bias in the data, the researchers rotated the duty across the driving sessions. By the safety requirement, the researchers had to sit on the front passenger seat when the vehicle was in motion, but while the vehicle was at park, they moved around to different seat positions as much as possible to minimize the data bias. Moreover, researchers were asked to wear different clothing and accessories each time.).
Modarresi, Ghosh, Srinivas, and Chun are combinable for the same rationale as set forth above with respect to claim 27.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Cruz et al. (NPL: “SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark”) teaches a synthetic dataset for sceneries in the passenger compartment of ten different vehicles, in order to analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations (e.g. identical backgrounds and textures, few instances per class).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAGGIE MAIDO whose telephone number is (703) 756-1953. The examiner can normally be reached M-Th: 6am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley, can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MM/Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129