DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
2. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/15/2026 has been entered.
Information Disclosure Statement
3. The information disclosure statements (IDS) submitted on the following dates are in compliance with the provisions of 37 CFR 1.97 and are being considered by the Examiner: 01/15/2026.
Response to Amendment
4. Applicant’s amendments filed on 01/15/2026 have been entered. Claims 1, 3, 10, 17, 19, and 26 have been amended. Claims 1-19 and 26 are pending in this application, with claims 1, 17 and 26 being independent.
Response to Arguments
5. Applicant’s arguments, see page 7, filed January 15, 2026, with respect to the claim objections have been fully considered and are persuasive. The amendments to the claims are sufficient to overcome the informalities of the previous claims; thus the objections to these claims have been withdrawn.
6. Applicant's arguments filed January 15, 2026, with respect to the rejections under 35 U.S.C. 103 have been fully considered but are moot in view of the new grounds of rejection.
Examiner notes that independent claims 1, 17, and 26 have been amended to include new limitations. The Examiner finds these limitations to be unpatentable, as detailed in the rejections below.
In light of the current Office Action, the Examiner respectfully submits that independent claims 1, 17, and 26 are rejected in view of the newly discovered reference to Benson et al. (US-8,498,453-B1).
7. On page 12 of the Remarks filed January 15, 2026, Applicant argues that Benson does not disclose per-portion, live display control of a rendered user face based on per-portion confidence values satisfying region-specific thresholds, nor any enrollment/live-partial fusion for rendering. Applicant thus contends that the cited portions of Benson do not teach or suggest, while the user is using an electronic device, a process of: "obtaining a second set of data corresponding to one or more partial views of the face of the user from one or more image sensors," "generating a representation of the face of the user based on the first set of data and the second set of data, wherein confidence values are determined for each of multiple portions of the representation of the face of the user," and "displaying the representation of the face of the user, wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face satisfying a threshold," as recited by amended claim 1. The Examiner respectfully disagrees with this argument.
The claim language is broad and does not provide details as to how the confidence values for each of the multiple portions of the representation of the face are determined. Under the broadest reasonable interpretation, Benson's evaluation of multiple confidence values to generate an overall confidence value for each head point, including the center of eyes, top of head, sides of head, and chin points, discloses the limitations "confidence values are determined for each of multiple portions of the representation of the face of the user" and "displaying the representation of the face of the user, wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face satisfying a threshold." Furthermore, presenting a representation of a face of a user using live partial images of the user's face and previously-obtained face data is not recited in the claims.
Thus, the Examiner respectfully submits that claim 1 is disclosed by the cited prior art.
Claim Rejections - 35 USC § 103
8. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
9. Claims 1-3, 8, 10, 13, 17-19 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Hefny et al. (“Hefny”) [US-2021/0056747-A1] in view of Nechyba et al. (“Nechyba”) [US-8,254,647-B1], further in view of Benson et al. (“Benson”) [US-8,498,453-B1].
Regarding claim 1, Hefny discloses a method (Hefny- Fig. 3 and ¶0038, at least disclose a method 300 of puppeteering a remote avatar 160) comprising:
at a processor (Hefny- Fig. 4 and ¶0041, at least disclose The computing device 400 includes a processor 410 […] The processor 410 can process instructions for execution within the computing device 400):
obtaining a first set of data corresponding to features of a face of a user in a plurality of configurations (Hefny- ¶0003-0004, at least disclose receiving, at data processing hardware, a first facial framework and a first captured image of a face of a user with a neutral facial expression. The first facial framework corresponds to the face of the user at a first frame and includes a first facial mesh of facial information […] projecting, by the data processing hardware, the first captured image of the face onto the first facial framework and determining, by the data processing hardware, a facial texture corresponding to the face of the user based on the projected captured image […] a second captured image of the face of the user, the second captured image capturing a smile as a facial expression of the user; […] a third captured image of the face of the user, the third captured image capturing, as the facial expression of the user, both eyebrows raised; […] the fourth captured image capturing, as the facial expression of the user, a smile and both eyebrows raised → smile, both eyebrows raised, and smile with both eyebrows raised as facial expressions suggest “a plurality of configurations”; ¶0007, at least discloses projecting the first captured image of the face onto the first facial framework and determining a facial texture corresponding to the face of the user based on the projected captured image; Fig. 1 and ¶0024, at least disclose when a camera or a sensor with depth capability captures the facial image 130 of the user 10, the captured image 130 includes depth data identifying relationships between facial features and/or facial textures (e.g., shadows, lighting, skin texture, etc.) [a first set of data corresponding to features of a face of a user]; Fig. 2A and ¶0026, at least disclose The puppeteer 200 includes a texturer 210 and an updater 220. The texturer 210 is configured to determine a facial texture 212, while the updater 220 is configured to update the facial texture 212 based on subsequently received facial framework(s) 144 and/or captured image(s) 130 […] the texturer 210 projects the first captured image 130 of the face 20 onto the first facial framework 144 a to determine a facial texture 212, 212 a corresponding to the neutral facial expression 22, 22 a of the face 20; Fig. 2B and ¶0028, at least disclose the puppeteer 200 receives a plurality of captured images 130, 130 a-d of the face 20 of the user 20 and determines, for each captured image 130, a corresponding facial texture 212, 212 a-d [features of a face of a user] by projecting the captured image 130 of the face 20 onto the first facial framework 140 a.); and
obtaining a second set of data corresponding to one or more partial views of the face of the user from one or more image sensors (Hefny- ¶0004, at least discloses receiving, at the data processing hardware, a second captured image of the face of the user, the second captured image capturing a smile as a facial expression of the user; receiving, at the data processing hardware, a third captured image of the face of the user, the third captured image capturing, as the facial expression of the user, both eyebrows raised; receiving, at the data processing hardware, a fourth captured image of the face of the user, the fourth captured image capturing, as the facial expression of the user, a smile and both eyebrows raised [a second set of data corresponding to one or more partial views of the face]; Fig. 2B and ¶0029, at least discloses the second captured image 130 b corresponds to a smiling facial expression 22 b, the third captured image 130 c corresponds to a both eyebrows raised facial expression 22 c, and the fourth captured image 130 d corresponds to a smiling with both eyebrows raised facial expression 22 d [a second set of data]; Fig. 2E and ¶0036, at least disclose facial information 140 and/or facial framework(s) 144 correspond to a partial capture (e.g, an obstructed image 214) of the face 20 of the user 10 [one or more partial views of the face of the user]. For example, the user 10 moves within a field of view or moves the image capturing device 116 [one or more image sensors]);
generating a representation of the face of the user based on the first set of data and the second set of data (Hefny- at least discloses a representation of the face 20 of the first user 20 a; the puppeteer 200 generates the 3D avatar 160 as a real-time 3D avatar 160 based on the output 160 from the first user device 110 a; Figs. 2B-2C and ¶0034, at least disclose a puppeteer 200 with a finite number of captured images 130 (e.g., four captured images 130 a-d) may increase accuracy while still minimizing bandwidth by updating the 3D avatar 160 based on facial information 140 (e.g., a second facial framework 144 b) at a current frame (e.g., the second frame F2) rather than updating the facial texture 212 from a current captured image 130, 130 current (as shown in FIG. 2D); Fig. 2D and ¶0035, at least disclose when utilizing the current captured image 130 current of the user 10, the puppeteer 200 receives and/or reduces an amount of facial texture 212 associated with the current captured image 130 current. For example, the updater 220 generates the updated facial texture 212U based on the current captured image 130 current having one third of the facial texture 212 (e.g., when compared to the first facial texture 212 a)); and
displaying the representation of the face of the user (Hefny- ¶0007, at least discloses updating the facial texture based on the received second facial framework and displaying the updated facial texture as a three-dimensional avatar. The three-dimensional avatar corresponds to a virtual representation of the face of the user; Figs. 2B-2C and ¶0034, at least disclose a puppeteer 200 with a finite number of captured images 130 (e.g., four captured images 130 a-d) may increase accuracy [confidence values] while still minimizing bandwidth by updating the 3D avatar 160 based on facial information 140 (e.g., a second facial framework 144 b) at a current frame (e.g., the second frame F2) rather than updating the facial texture 212 from a current captured image 130, 130 current (as shown in FIG. 2D); Fig. 3 and ¶0038, at least disclose At operation 312, the method 300 displays the updated facial texture 212 as a 3D avatar 160. The 3D avatar 160 corresponds to a virtual representation of the face 20 of the user 10).
Hefny does not explicitly disclose while the user is using an electronic device: obtaining a second set of data; wherein confidence values are determined for each of multiple portions of the representation of the face of the user; and wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face satisfying a threshold.
However, Nechyba discloses
while the user is using an electronic device: obtaining a second set of data corresponding to one or more partial views of the face of the user from one or more image sensors (Nechyba- Figs. 4, 5, 6 show different views of the face of the user; col 3, line 67 to col 4, line 8, at least discloses During the enrollment phase, user 20 uses camera 26 [one or more image sensors] of mobile computing device 10 to capture captured image 28. Mobile computing device may generate one or more captured templates 32 based on the captured image 28. A template may comprise a processed version of an image, and/or features of an image. A template may further comprise model parameters that may be used in conjunction with facial feature recognition algorithms, as well as intrinsic characteristics of what a person looks like; col 6, line 15-25, at least discloses user 20 may hold mobile computing device 10 [one or more image sensors] below the level of his or her face. Thus, when capturing an image of a face of a user, camera 26 may acquire a captured image of user 20 with his or her head tilted upward and away from camera 26 […] Based on the captured image 28, computing device 10 may produce captured templates 32; col 6, line 52-55, at least discloses As part of determining the quality of the captured image 28, mobile computing device 10 may further determine whether captured templates 32 exhibits pitch, yaw, and/or roll; Figs. 3A to 3I and col 12, line 46-61, at least discloses Facial images 3B and 3C illustrate pitch. In facial image 3B, a user's face is tilted up and away from camera 26. In image 3C, the user's face is tilted down and toward camera 26. Facial images 3D and 3E illustrate roll. In image 3D, the face of user 20 is tilted to the left, and in image 3E, the image of user 20 is tilted to the right of camera 26 […] In image 3H, the face of user 20 is rotated to the right, and in facial image 3I, the face of user 20 is rotated even further to the right than in image 106C → rotation of a user's head about a particular axis from pitch, roll, and/or yaw suggests “one or more partial views of the face of the user”);
wherein confidence values are determined for at least a portion of a face of the representation of the face of the user (Nechyba- Figs. 4, 5, 6 show different portions of the representation of the face of the user; col 1, line 27-31, at least discloses generating, by the mobile computing device, a facial detection confidence score based at least in part on a likelihood that a representation of at least a portion of a face is included in the image, and generating, by the mobile computing device, a facial landmark detection confidence score based at least in part on a likelihood that representations of facial landmarks are accurately identified in the image; col 6, lines 28-33, at least discloses the values of the confidence scores may be based on a confidence that mobile computing device 10 has successfully recognized various features of a human face in the captured templates 32. If mobile computing device 10 recognizes features of a face, a confidence score may be higher. If mobile computing device 10 does not recognize facial features of a person from captured templates 32, a confidence score may be lower [different confidence values]; col 7, lines 6-28, at least discloses mobile computing device 10 may calculate confidence scores such that the scores may indicate a likelihood that various features have been found or that various parameters are within an acceptable range. In some examples, mobile computing device 10 may calculate a facial detection confidence score, a facial landmark detection confidence score, and a geometric consistency score. The face detection confidence score may indicate a value that a face of a user has been detected. The facial landmark confidence score may indicate a confidence that various facial landmarks of a user have been detected. In some examples, the facial features may comprise a base of the nose and each eye of a user. The geometric consistency score may calculate a confidence value based on measurements of the distances between facial landmarks, such as the centers of a user's eyes and the base of the user's nose. The geometric consistency score may be based on a likelihood that the distances between the landmarks are consistent with those of a human face. Each of the face detection confidence score, facial landmark confidence score, and the geometric consistency scores may indicate whether or not captured image 28 is of high or low quality; col 16, lines 14-20, at least discloses the facial landmark detection confidence score may indicate a lower amount of confidence as the absolute value of the yaw angle of the face of user 20 included in the image increases. In some examples, the facial landmarks may comprise at least the nose base and each eye of a user, e.g. user 20; col 12, line 62 to col 13, line 2, at least discloses determining whether a captured image or template used for facial recognition purposes exhibits a large yaw or pitch, or roll (i.e. the captured image is off-axis). A captured image that exhibits a large pitch, yaw, and/or roll may be unsuitable for use with facial recognition. In some examples, a captured image that exhibits large yaw may also produce one or more low confidence scores when performing facial recognition); and
wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face (Nechyba- col 1, lines 27-31, at least discloses generating, by the mobile computing device, a facial detection confidence score based at least in part on a likelihood that a representation of at least a portion of a face [portion of the representation of the face] is included in the image, and generating, by the mobile computing device, a facial landmark detection confidence score based at least in part on a likelihood that representations of facial landmarks are accurately identified in the image; col 7, lines 6-28, at least discloses mobile computing device 10 may calculate confidence scores such that the scores may indicate a likelihood that various features have been found or that various parameters are within an acceptable range. In some examples, mobile computing device 10 may calculate a facial detection confidence score, a facial landmark detection confidence score, and a geometric consistency score. The face detection confidence score may indicate a value that a face of a user has been detected. The facial landmark confidence score may indicate a confidence that various facial landmarks of a user have been detected. In some examples, the facial features may comprise a base of the nose and each eye of a user. The geometric consistency score may calculate a confidence value based on measurements of the distances between facial landmarks, such as the centers of a user's eyes and the base of the user's nose. The geometric consistency score may be based on a likelihood that the distances between the landmarks are consistent with those of a human face. Each of the face detection confidence score, facial landmark confidence score, and the geometric consistency scores may indicate whether or not captured image 28 is of high or low quality; Figs. 3A to 3I and col 12, line 46-61, at least discloses Facial images 3B and 3C illustrate pitch. In facial image 3B, a user's face is tilted up and away from camera 26. In image 3C, the user's face is tilted down and toward camera 26. Facial images 3D and 3E illustrate roll. In image 3D, the face of user 20 is tilted to the left, and in image 3E, the image of user 20 is tilted to the right of camera 26 […] In image 3H, the face of user 20 is rotated to the right, and in facial image 3I, the face of user 20 is rotated even further to the right than in image 106C → rotation of a user's head about a particular axis from pitch, roll, and/or yaw suggests “portions of the representation of the face of the user”; col 16, lines 14-20, at least discloses the facial landmark detection confidence score may indicate a lower amount of confidence as the absolute value of the yaw angle of the face of user 20 included in the image increases […] the facial landmarks may comprise at least the nose base and each eye of a user, e.g. user 20; col 11, lines 14-15, at least discloses mobile computing device 10 may display captured image 28 to user 20 via GUI 22; col 12, line 62 to col 13, line 2, at least discloses determining whether a captured image or template used for facial recognition purposes exhibits a large yaw or pitch, or roll (i.e. the captured image is off-axis). A captured image that exhibits a large pitch, yaw, and/or roll may be unsuitable for use with facial recognition. In some examples, a captured image that exhibits large yaw may also produce one or more low confidence scores when performing facial recognition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny to incorporate the teachings of Nechyba, and to apply the confidence scores based at least in part on a likelihood that a representation of at least a portion of a face is included in the image to Hefny’s teachings for generating a representation of the face of the user based on the first set of data and the second set of data, wherein confidence values are determined for at least a portion of a face of the representation of the face of the user; and displaying the representation of the face of the user, wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face. One of ordinary skill in the art could have substituted “at least a portion” for “each of multiple portions,” and the results of the substitution would have been predictable.
Doing so would generate an image quality score based in part on a combination of the confidence scores and the consistency score, and would classify the image quality based on the image quality score.
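For illustration only, the following minimal sketch (in Python, with hypothetical function names and weights that are assumptions, not taken from Nechyba) models the combination described above, in which the individual confidence scores are combined into an image quality score that is then classified against a quality threshold:

    # Illustrative sketch only (hypothetical names and weights). Models a
    # Nechyba-style combination of a face detection confidence score, a facial
    # landmark detection confidence score, and a geometric consistency score
    # into a single image quality score, which is then classified.

    def image_quality_score(face_conf, landmark_conf, geometric_conf,
                            weights=(0.4, 0.3, 0.3)):
        """Combine the three scores (each in [0, 1]) into one quality score."""
        w_face, w_lm, w_geom = weights
        return w_face * face_conf + w_lm * landmark_conf + w_geom * geometric_conf

    def classify_quality(score, threshold=0.6):
        """Classify the image as high or low quality against a threshold."""
        return "high" if score >= threshold else "low"

    # Example: a well-detected face with consistent landmark geometry.
    print(classify_quality(image_quality_score(0.9, 0.8, 0.85)))  # -> high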
The prior art does not explicitly disclose, but Benson discloses
confidence values are determined for each of multiple portions of the representation of the face of the user (Benson- col 7, lines 20-22, at least discloses head points include center of eyes, top of head, sides of head, and chin points [multiple portions]; col 17, line 59 to col 18, line 3, at least discloses confidence levels are evaluated for each head point and a specific confidence value is then assigned to that point that represents the confidence level […] the multiple confidence values are then evaluated to generate an overall confidence value for the head point; col 18, lines 14-26, at least discloses A confidence value is then assigned to the head point, where a larger color difference is given a higher confidence value, and a smaller color difference is given a lower confidence value. In some embodiments the confidence values are between 0 and 1, where 1 represents a low confidence level and 0 represents a high confidence level. For example, a confidence value of 0.1 represents a relatively high confidence value for a given head point, where a confidence value of 0.9 represents a relatively low confidence value for the given head point; Figs. 24-25 and col 21, line 61 to col 22, line 3, at least disclose a plurality of data cells 2504, 2506, 2508, and 2510 [multiple portions of the representation of the face]. Each data cell stores one confidence value associated with a portion of the image portion 2202. For example, the upper left data cell 2504 stores a confidence value associated with an upper left portion of the image portion 2202 after the evaluation shown in FIG. 24. As the chin map 2302 is shifted by one shift distance to the right, the next evaluation is performed and the resulting confidence value is stored in data cell 2506);
the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face satisfying a threshold (Benson- Fig. 19 and col 18, lines 14-23, at least disclose A confidence value is then assigned to the head point, where a larger color difference is given a higher confidence value, and a smaller color difference is given a lower confidence value […] the confidence values are between 0 and 1, where 1 represents a low confidence level and 0 represents a high confidence level. For example, a confidence value of 0.1 represents a relatively high confidence value for a given head point, where a confidence value of 0.9 represents a relatively low confidence value for the given head point; col 18, lines 42-43, at least discloses For each head point, a threshold distance 1902, 1904, 1906, and 1908 is defined for that head point; col 18, lines 57-63, at least discloses Examples of threshold distances are as follows […] the left and right side of head threshold distances 1902 and 1904 are typically in a range from about 0.5 to about 1.2 times the inter-eye distance 706. The top of the head and chin threshold distances 1904 and 1908 are typically in a range from about 1 to about 1.7 times the inter-eye distance 706).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba to incorporate the teachings of Benson, and to apply Benson's assignment of a confidence value representing a low or high confidence level to each head point to Hefny/Nechyba’s teachings, for generating a representation of the face of the user based on the first set of data and the second set of data, wherein confidence values are determined for each of multiple portions of the representation of the face of the user; and displaying the representation of the face of the user, wherein the portions of the representation of the face of the user are displayed based on the corresponding confidence values for each of the multiple portions of the representation of the face satisfying a threshold.
Doing so would improve the automation of digital image processing, such as to reduce the amount of manual effort required to process the digital images, and to produce final images having improved uniformity and consistency.
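For illustration only, the following minimal sketch (in Python, with hypothetical names; the cutoff value and the use of range midpoints are assumptions) models Benson's inverted 0-to-1 confidence scale, on which 0 represents a high confidence level and 1 a low confidence level, together with per-portion gating of which portions are displayed and threshold distances expressed as multiples of the inter-eye distance:

    # Illustrative sketch only (hypothetical names). Models Benson-style head
    # points with confidence values on an inverted 0-1 scale (0 = high
    # confidence, 1 = low confidence); a portion is displayed only when its
    # confidence value satisfies the cutoff.

    HEAD_POINTS = ("left_side", "right_side", "top_of_head", "chin")

    def threshold_distances(inter_eye_distance):
        """Per-point threshold distances as multiples of the inter-eye distance
        (sides: ~0.5 to ~1.2x; top of head and chin: ~1 to ~1.7x; midpoints
        of those quoted ranges are used here as an assumption)."""
        return {"left_side": 0.85 * inter_eye_distance,
                "right_side": 0.85 * inter_eye_distance,
                "top_of_head": 1.35 * inter_eye_distance,
                "chin": 1.35 * inter_eye_distance}

    def displayed_portions(confidence_values, cutoff=0.5):
        """Keep portions whose inverted-scale value is at or below the cutoff,
        i.e., whose confidence level is high enough to satisfy the threshold."""
        return [p for p in HEAD_POINTS if confidence_values.get(p, 1.0) <= cutoff]

    # Example: the chin point (0.9) reflects low confidence and is withheld.
    print(displayed_portions({"left_side": 0.1, "right_side": 0.2,
                              "top_of_head": 0.3, "chin": 0.9}))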
Regarding claim 2, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the first set of data comprises unobstructed image data of the face of the user (Hefny- ¶0005, at least discloses the obstructed portion of the face of the user with facial texture generated from an unobstructed captured image from a prior frame).
Regarding claim 3, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the second set of data comprises partial views of the face of the user (Hefny- ¶0036, at least discloses Referring to FIG. 2E, in some examples, facial information 140 and/or facial framework(s) 144 correspond to a partial capture (e.g, an obstructed image 214) of the face 20 of the user 10. For example, the user 10 moves within a field of view or moves the image capturing device 116. In these examples, the puppeteer 200 may be additionally configured to account for these issues. In some configurations, the texturer 210 identifies whether the current capture image 130 current and or second facial framework 144 b corresponds to an obstructed image.).
Regarding claim 8, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the representation is a three-dimensional (3D) avatar (Hefny- ¶0007, at least discloses The operations also include updating the facial texture based on the received second facial framework and displaying the updated facial texture as a three-dimensional avatar. The three-dimensional avatar corresponds to a virtual representation of the face of the user).
Regarding claim 10, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the portions of the representation are displayed differently based on a corresponding confidence value (Nechyba- Figs. 4, 5, 6 show the portions of the representation of the face of the user are displayed; col 1, line 27-31, at least discloses generating, by the mobile computing device, a facial detection confidence score based at least in part on a likelihood that a representation of at least a portion of a face [portion of the representation of the face] is included in the image, and generating, by the mobile computing device, a facial landmark detection confidence score based at least in part on a likelihood that representations of facial landmarks are accurately identified in the image; col 7, lines 6-28, at least discloses mobile computing device 10 may calculate confidence scores such that the scores may indicate a likelihood that various features have been found or that various parameters are within an acceptable range […] The face detection confidence score may indicate a value that a face of a user has been detected. The facial landmark confidence score may indicate a confidence that various facial landmarks of a user have been detected. In some examples, the facial features may comprise a base of the nose and each eye of a user. The geometric consistency score may calculate a confidence value based on measurements of the distances between facial landmarks, such as the centers of a user's eyes and the base of the user's nose. The geometric consistency score may be based on a likelihood that the distances between the landmarks are consistent with those of a human face. Each of the face detection confidence score, facial landmark confidence score, and the geometric consistency scores may indicate whether or not captured image 28 is of high or low quality; Figs. 3A to 3I and col 12, line 46-61, at least discloses Facial images 3B and 3C illustrate pitch. In facial image 3B, a user's face is tilted up and away from camera 26. In image 3C, the user's face is tilted down and toward camera 26. Facial images 3D and 3E illustrate roll. In image 3D, the face of user 20 is tilted to the left, and in image 3E, the image of user 20 is tilted to the right of camera 26 […] In image 3H, the face of user 20 is rotated to the right, and in facial image 3I, the face of user 20 is rotated even further to the right than in image 106C → rotation of a user's head about a particular axis from pitch, roll, and/or yaw suggests “portions of the representation of the face of the user”; col 16, lines 14-20, at least discloses the facial landmark detection confidence score may indicate a lower amount of confidence as the absolute value of the yaw angle of the face of user 20 included in the image increases […] the facial landmarks may comprise at least the nose base and each eye of a user, e.g. user 20; col 11, lines 14-15, at least discloses mobile computing device 10 may display captured image 28 to user 20 via GUI 22; col 12, line 62 to col 13, line 2, at least discloses determining whether a captured image or template used for facial recognition purposes exhibits a large yaw or pitch, or roll (i.e. the captured image is off-axis). A captured image that exhibits a large pitch, yaw, and/or roll may be unsuitable for use with facial recognition. 
In some examples, a captured image that exhibits large yaw may also produce one or more low confidence scores when performing facial recognition; Benson- Fig. 19 and col 18, lines 14-23, at least disclose A confidence value is then assigned to the head point, where a larger color difference is given a higher confidence value, and a smaller color difference is given a lower confidence value […] the confidence values are between 0 and 1, where 1 represents a low confidence level and 0 represents a high confidence level. For example, a confidence value of 0.1 represents a relatively high confidence value for a given head point, where a confidence value of 0.9 represents a relatively low confidence value for the given head point; col 18, lines 42-43, at least discloses For each head point, a threshold distance 1902, 1904, 1906, and 1908 is defined for that head point).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba to incorporate the teachings of Benson, and to apply Benson's assignment of a confidence value representing a low or high confidence level to each head point to Hefny/Nechyba’s teachings, such that the portions of the representation are displayed differently based on a confidence level of the corresponding confidence value.
The same motivation that was utilized in the rejection of claim 1 applies equally to this claim.
Regarding claim 13, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein displaying the portions of the representation based on the corresponding confidence values (see Claim 1 rejection for detailed analysis) comprises displaying the portions of the representations differently (see Claim 10 rejection for detailed analysis) based on a confidence level of corresponding confidence values (Benson- col 17, lines 59-61, at least discloses confidence levels are evaluated for each head point and a specific confidence value is then assigned to that point that represents the confidence level; col 18, lines 17-23, at least discloses A confidence value is then assigned to the head point, where a larger color difference is given a higher confidence value, and a smaller color difference is given a lower confidence value. In some embodiments the confidence values are between 0 and 1, where 1 represents a low confidence level and 0 represents a high confidence level. For example, a confidence value of 0.1 represents a relatively high confidence value for a given head point, where a confidence value of 0.9 represents a relatively low confidence value for the given head point).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba to incorporate the teachings of Benson, and to apply Benson's evaluation of confidence levels for each head point and assignment of a specific confidence value to Hefny/Nechyba’s teachings, such that displaying the portions of the representation based on the corresponding confidence values comprises displaying the portions of the representations differently based on a confidence level of corresponding confidence values.
The same motivation that was utilized in the rejection of claim 1 applies equally to this claim.
System claims 17-19 are similar in scope to the functions performed by the method of claims 1-3, and therefore claims 17-19 are rejected under the same rationale.
Regarding claim 17, Hefny in view of Nechyba and Benson, discloses a device (Hefny- Fig. 4 and ¶0040, at least disclose computing device 400 that may be used to implement the systems and methods of, for example, the user device 110) comprising:
a non-transitory computer-readable storage medium (Hefny- Fig. 4 and ¶0042, at least disclose The memory 420 stores information non-transitorily within the computing device 400. The memory 420 may be a computer-readable medium, a volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 420 may be physical devices used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 400); and
one or more processors coupled to the non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium comprises program instructions that, when executed on the one or more processors (Hefny- Fig. 4 and ¶0041-0043, at least disclose The computing device 400 includes a processor 410, memory 420, a storage device 430, a high-speed interface/controller 440 connecting to the memory 420 and high-speed expansion ports 450 […] The processor 410 can process instructions for execution within the computing device 400, including instructions stored in the memory 420 or on the storage device 430 to display graphical information for a graphical user interface (GUI) on an external input/output device […] The memory 420 stores information non-transitorily within the computing device 400. The memory 420 may be a computer-readable medium, a volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 420 may be physical devices used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 400 […] a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 420, the storage device 430, or memory on processor 410), cause the one or more processors to perform operations comprising the method of claim 1.
Regarding claim 26, Hefny in view of Nechyba and Benson, discloses a non-transitory computer-readable storage medium, storing program instructions executable on a device to perform operations (Hefny- Fig. 4 and ¶0042, at least disclose The memory 420 stores information non-transitorily within the computing device 400. The memory 420 may be a computer-readable medium, a volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 420 may be physical devices used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 400) comprising the method of claim 1.
10. Claims 4-7, 9 and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Hefny in view of Nechyba, further in view of Benson, still further in view of Frueh et al. (“Frueh”) [US-2018/0101227-A1].
Regarding claim 4, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the electronic device comprises a first sensor and a second sensor (Hefny- ¶0020, at least discloses image capturing devices 116 are cameras (e.g., depth cameras or RGB cameras) or image sensors (e.g., laser imaging sensors); ¶0020, at least discloses the captured image 130 generated by cameras or sensors without depth capabilities (e.g., RBG cameras) requires further analysis with techniques such as facial landmark detection and/or facial feature detection to generate facial information 140), where the second set of data is obtained from at least one partial image of the face of the user (see Claim 1 rejection for detailed analysis) from the first sensor from a first viewpoint and from at least one partial image of the face of the user (see Claim 1 rejection for detailed analysis).
The prior art does not explicitly disclose the second sensor from a second viewpoint that is different than the first viewpoint.
However, Frueh discloses
the second sensor from a second viewpoint that is different than the first viewpoint (Frueh- Fig. 1 and ¶0037, at least disclose The 3-D locations of the user's eyes and the 3-D locations of the target image that are determined for each image captured by the camera 130 are used to determine gaze vectors that indicate the eye gaze direction for the user 125 in each of the images. For example, a first eye gaze direction 145 for the first image is defined by the relative positions of the 3-D location of the user's eyes in a first image and the 3-D location of the target image while the first image was acquired. For another example, a second eye gaze direction 150 for the second image is defined by the relative positions of the 3-D location of the user's eyes in a second image and the 3-D location of the target image while the second image was acquired. The first eye gaze direction 145 is represented as a first angle 155 relative to a central axis 160 and the second eye gaze direction 150 is represented as a second angle 165 relative to the central axis 160. In the side view 100, the eye gaze directions 145, 150 and the angles 155, 165 [a second viewpoint that is different than the first viewpoint] are illustrated in a vertical plane).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Frueh, and to apply the different eye gaze directions and the different angles for the different images to Hefny/Nechyba/Benson’s teachings, such that the electronic device comprises a first sensor and a second sensor, where the second set of data is obtained from at least one partial image of the face of the user from the first sensor from a first viewpoint and from at least one partial image of the face of the user from the second sensor from a second viewpoint that is different than the first viewpoint.
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
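For illustration only, the following minimal sketch (in Python, with hypothetical names; not drawn from Frueh) works through the geometry quoted above, in which a gaze vector defined by the relative 3-D positions of the user's eyes and the target is expressed as an angle relative to a central axis:

    # Illustrative sketch only (hypothetical names). Computes a Frueh-style eye
    # gaze direction from a 3-D eye location and a 3-D target location, and
    # expresses it as an angle relative to a central axis.
    import math

    def gaze_angle(eye_xyz, target_xyz, axis_xyz=(0.0, 0.0, 1.0)):
        """Return the angle (degrees) between the gaze vector and the axis."""
        gaze = [t - e for e, t in zip(eye_xyz, target_xyz)]
        dot = sum(g * a for g, a in zip(gaze, axis_xyz))
        norm_gaze = math.sqrt(sum(g * g for g in gaze))
        norm_axis = math.sqrt(sum(a * a for a in axis_xyz))
        return math.degrees(math.acos(dot / (norm_gaze * norm_axis)))

    # Example: a target slightly above the axis gives a small upward gaze angle.
    print(round(gaze_angle((0, 0, 0), (0, 0.2, 1.0)), 1))  # -> 11.3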
Regarding claim 5, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and does not explicitly disclose, but Frueh discloses wherein the confidence values correspond to a texture confidence value, wherein displaying the portions of the representation based on the corresponding confidence values comprises determining that the texture confidence value exceeds a threshold (Frueh- ¶0106, at least discloses The rendered image is then compared to the captured image to generate a matching score. In a 3-D comparison, an ICP algorithm is used to compare the 3-D reference model with the captured image including depth information for each pixel and generate a matching score [texture confidence value]. A relatively high value of the matching score, such as a matching score above a threshold [texture confidence value exceeds a threshold], indicates a match).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Frueh, and to apply Frueh's matching score and threshold to Hefny/Nechyba/Benson’s teachings, such that the confidence values correspond to a texture confidence value, wherein displaying the portions of the representation based on the corresponding confidence values comprises determining that the texture confidence value exceeds a threshold.
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
Regarding claim 6, Hefny in view of Nechyba, Benson and Frueh, discloses the method of claim 5, and further discloses wherein generating the representation of the face of the user comprises:
tracking the features of the face of the user (Hefny- ¶0036, at least discloses the texturer 210 tracks and analyzes how much facial information 140 is received on average and compares this data to the current capture image 130 current and/or the second facial framework 144 b; Frueh- ¶0028, at least discloses the eye gaze of the user is determined by an eye tracker implemented in the HMD and the tracked eye gaze is used to select the appropriate 3-D model of the user's face (or texture used to render a portion of the user's face) from the database, which is indexed by eye gaze direction);
generating a model based on the tracked features (Frueh- Fig. 4 and ¶0043-0047, at least disclose The set of images captured by the camera 310 and (optionally) the eye tracker 315 are used to generate models of the face of the user 305 that correspond to the eye gaze directions associated with each of the images. The models are referred to herein as “samples” of the user's face […] Filtered data representative of the detected face 415 is triangulated to create a 3-D model 420 of the user's face. The 3-D model 420 includes a set of vertices 425 (only one indicated by a reference numeral in the interest of clarity) that are interconnected by corresponding edges 430 (only one indicated by a reference numeral in the interest of clarity).); and
updating the model by projecting live image data onto the model (Hefny- Fig. 3 and ¶0038, at least disclose At operation 304, the method 300 projects the first captured image 130 of the face 20 onto the first facial framework 144 a. At operation 306, the method 300 determines a facial texture 212 corresponding to the face 20 of the user 10 based on the projected captured image 130; Frueh- ¶0115, at least disclose the user 1610 is represented by a live 3-D representation that can be computed using a textured point cloud, a textured mesh, and the like).
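For illustration only, the following minimal sketch (in Python, with hypothetical names and structures; not drawn from the cited references) models the generate-then-update flow mapped above, in which tracked features form the vertices of a model and live image data is projected onto the model to refresh its texture:

    # Illustrative sketch only (hypothetical names). Tracked facial features
    # become vertices of a model (triangulation into edges is omitted for
    # brevity), and live image data is projected onto the model to update its
    # per-vertex texture.

    class FaceModel:
        def __init__(self, tracked_features):
            # Each tracked feature (x, y, z) becomes a vertex of the model.
            self.vertices = list(tracked_features)
            self.texture = {}  # vertex index -> RGB value

        def project_live_image(self, sample_rgb_at):
            # Project live image data onto the model by sampling the live
            # frame at each vertex's image-plane location.
            for i, (x, y, z) in enumerate(self.vertices):
                self.texture[i] = sample_rgb_at(x, y)

    model = FaceModel([(0.1, 0.2, 0.5), (0.3, 0.4, 0.5)])
    model.project_live_image(lambda x, y: (128, 128, 128))  # stand-in live frame
    print(model.texture)  # -> {0: (128, 128, 128), 1: (128, 128, 128)}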
Regarding claim 7, Hefny in view of Nechyba, Benson and Frueh, discloses the method of claim 6, and further discloses wherein generating the representation of the face of the user further comprises enhancing the model based on the first set of data (Frueh- ¶0026, at least discloses The social connection between users in a virtual 3-D scene, such as a mixed reality scene, can be significantly enhanced by replacing a portion of the HMD with a model of a portion of the user's face that is obscured by the HMD in the image of the user that is inserted into the virtual 3-D scene […] generating an eye gaze database for a 3-D model of a user's face that is indexed by the user's eye gaze direction).
Regarding claim 9, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the portions of the representation are displayed (see Claim 1 rejection for detailed analysis) based on assessing confidence (Hefny- Figs. 2B-2C and ¶0034, at least disclose a puppeteer 200 with a finite number of captured images 130 (e.g., four captured images 130 a-d) may increase accuracy [confidence values] while still minimizing bandwidth by updating the 3D avatar 160 based on facial information 140 (e.g., a second facial framework 144 b) at a current frame (e.g., the second frame F2) rather than updating the facial texture 212 from a current captured image 130, 130 current (as shown in FIG. 2D)).
The prior art does not explicitly disclose, but Frueh discloses
the respective portion accurately corresponds to a live appearance of the face of the user (Frueh- ¶0115, at least discloses the user 1610 is represented by a live 3-D representation [a live appearance] that can be computed using a textured point cloud, a textured mesh, and the like).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Frueh, and to apply the live 3-D representation to Hefny/Nechyba/Benson’s teachings, such that the portions of the representation are displayed based on assessing confidence that the respective portion accurately corresponds to a live appearance of the face of the user.
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
Regarding claim 11, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and further discloses wherein the second set of data comprises image data obtained during a scanning process (Hefny- ¶0006, at least discloses The captured image may include a red-green-and blue (RGB) image from a mobile phone; ¶0020, at least discloses Some examples of image capturing devices 116 are cameras (e.g., depth cameras or RGB cameras) or image sensors (e.g., laser imaging sensors)).
The prior art does not explicitly disclose, but Frueh discloses
depth data and light intensity image data obtained (Frueh- ¶0027, at least discloses the camera can be implemented as an RGBD camera that captures RGB values of pixels in the image and a depth value for each pixel that indicates a distance between the camera and the object represented by the pixel […] Face samples are calculated for each image by defining locations of vertices in the face sample using the depth values for the pixels in the image and texture values are defined for each vertex using the RGB values of the corresponding pixel).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Frueh, and to apply the RGB values of pixels in the image and a depth value for each pixel to Hefny/Nechyba/Benson’s teachings, such that the second set of data comprises depth data and light intensity image data obtained during a scanning process.
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
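For illustration only, the following minimal sketch (in Python, with hypothetical names; the pinhole back-projection model and camera intrinsics are assumptions, not taken from Frueh) shows a face sample in which vertex locations are defined from per-pixel depth values and vertex textures from the corresponding RGB values:

    # Illustrative sketch only (hypothetical names). Builds Frueh-style face
    # samples: each vertex location comes from a pixel's depth value, and each
    # vertex texture comes from that pixel's RGB value.

    def build_face_sample(rgb, depth, fx=500.0, fy=500.0, cx=32.0, cy=24.0):
        """Back-project each pixel (u, v) with depth d into a textured vertex."""
        vertices = []
        for v, row in enumerate(depth):
            for u, d in enumerate(row):
                if d is None:  # no depth value for this pixel: skip it
                    continue
                x = (u - cx) * d / fx  # assumed pinhole back-projection
                y = (v - cy) * d / fy
                vertices.append(((x, y, d), rgb[v][u]))  # (location, texture)
        return vertices

    sample = build_face_sample([[(200, 150, 130)]], [[0.45]], cx=0.0, cy=0.0)
    print(sample)  # -> [((0.0, 0.0, 0.45), (200, 150, 130))]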
Regarding claim 12, Hefny in view of Nechyba and Benson, discloses the method of claim 1, and does not explicitly disclose, but Frueh discloses wherein the electronic device is a head-mounted device (HMD) (Frueh- Figs. 11, 13 show user wearing a head-mounted device (HMD); Fig. 8 and ¶0066, at least discloses The user 815 is wearing an HMD 835 that allows the user to participate in VR, AR, or MR sessions supported by corresponding applications, which may be implemented in the processor 820 or in other processors such as remote cloud servers).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Frueh, and to apply the HMD to Hefny/Nechyba/Benson’s teachings, such that the electronic device is a head-mounted device (HMD).
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
11. Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Hefny in view of Nechyba, further in view of Benson, still further in view of Kim et al. (“Kim”) [US-2021/0192187-A1].
Regarding claim 14, Hefny in view of Nechyba and Benson, discloses the method of claim 1, including displaying the portions of the representation based on the corresponding confidence values (see Claim 1 rejection for detailed analysis), and does not explicitly disclose, but Kim discloses, wherein the displaying comprises, for a higher level of confidence, displaying a first portion of the representation and, for a lower level of confidence, blurring or distorting the first portion of the representation (Kim- Fig. 5 and ¶0088-0089, at least disclose The first confidence map 530 and the second confidence map 550 are examples of visually displaying confidence values […] when confidence values corresponding to pixels in the first confidence map 530 and the second confidence map 550 increase, the pixels may be displayed in colors close to white. When the confidence values corresponding to the pixels decrease, the pixels may be displayed in colors close to black. Also, due to an occlusion or a distortion in the first confidence map 530 and the second confidence map 550, pixels that do not have depth values are displayed in colors corresponding to null or don't care).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Kim, and to apply Kim's handling of occlusion or distortion to Hefny/Nechyba/Benson’s teachings, such that displaying the portions of the representation based on the corresponding confidence values comprises, for a higher level of confidence, displaying a first portion of the representation and, for a lower level of confidence, blurring or distorting the first portion of the representation.
Doing so would increase a recognition rate of a three-dimensional (3D) face.
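For illustration only, the following minimal sketch (in Python, with hypothetical names; not drawn from Kim) models a confidence map in which higher confidence values are displayed in colors closer to white, lower values in colors closer to black, and pixels without depth values are treated as null/don't-care:

    # Illustrative sketch only (hypothetical names). Maps Kim-style confidence
    # values to grayscale intensities: higher confidence -> closer to white,
    # lower confidence -> closer to black, missing depth -> don't-care (None).

    def confidence_to_gray(conf):
        """Map a confidence value in [0, 1] to an 8-bit grayscale intensity."""
        if conf is None:  # occlusion/distortion: no depth value for this pixel
            return None   # displayed as null / don't-care
        return round(255 * max(0.0, min(1.0, conf)))

    row = [0.95, 0.5, None, 0.1]
    print([confidence_to_gray(c) for c in row])  # -> [242, 128, None, 26]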
12. Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Hefny in view of Nechyba, further in view of Benson, still further in view of Chang et al. (“Chang”) [US-2017/0076142-A1].
Regarding claim 15, Hefny in view of Nechyba and Benson, discloses the method of claim 1, including displaying the portions of the representation based on the corresponding confidence values (see Claim 1 rejection for detailed analysis), and does not explicitly disclose, but Chang discloses, wherein the displaying comprises determining a level of distortion or blurring out of a first portion of the representation based on a confidence level for that first portion, wherein higher confidence corresponds to reduced blur or distortion (Chang- ¶0005-0006, at least disclose The method can further include blurring the spatial function, where determining the confidence mask is based on the blurred spatial function […] blurring one or more of the distributions of color values. For example, blurring the distributions of color values can include blurring values [level of blurring] of the distributions along an axis for which color values for different color channels are approximately equal; ¶0036, at least discloses implementations can blur the spatial function, such that the confidence mask is based on the blurred spatial function; ¶0040, at least discloses Blurring of color value distributions can allow a greater range of color values to be represented in each region, e.g., allowing more accurate probability or confidence determination for pixels that may have color values close to the color values of high-probability or high-confidence pixels).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson to incorporate the teachings of Chang, and to apply the blurring values to Hefny/Nechyba/Benson’s teachings, such that displaying the portions of the representation based on the corresponding confidence values comprises determining a level of distortion or blurring out of a first portion of the representation based on a confidence level for that first portion, wherein higher confidence corresponds to reduced blur or distortion.
Doing so would provide feature detection and masking in images based on color distributions.
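For illustration only, the following minimal sketch (in Python, with hypothetical names and an assumed linear scale; not drawn from Chang) models display logic in which the level of blur applied to a portion decreases as that portion's confidence level increases:

    # Illustrative sketch only (hypothetical names/scale). The blur radius
    # applied to a portion of the representation shrinks as the portion's
    # confidence grows, so higher confidence corresponds to reduced blur.

    def blur_radius(confidence, max_radius=8.0):
        """Linearly map confidence in [0, 1] to a blur radius; 1.0 -> no blur."""
        conf = max(0.0, min(1.0, confidence))
        return max_radius * (1.0 - conf)

    for c in (0.2, 0.6, 0.95):
        print(c, "->", blur_radius(c))  # approx. 6.4, 3.2, and 0.4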
13. Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Hefny in view of Nechyba, further in view of Benson, still further in view of Chang, still further in view of Frueh et al. (“Frueh”) [US-2018/0101227-A1].
Regarding claim 16, Hefny in view of Nechyba, Benson and Chang, discloses the method of claim 15, and does not explicitly disclose, but Frueh discloses wherein if a threshold level of confidence is reached, the first portion of the representation is displayed without any blur or distortion (Frueh- Fig. 15 and ¶0106-0107, at least disclose The rendered image is then compared to the captured image to generate a matching score. In a 3-D comparison, an ICP algorithm is used to compare the 3-D reference model with the captured image including depth information for each pixel and generate a matching score. A relatively high value of the matching score, such as a matching score above a threshold, indicates a match. If the processor detects a match, the method 1500 flows to block 1525 […] At block 1525, the processor determines the pose of the user's face based on the pose of the reference model that produced the high value of the matching score).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hefny/Nechyba/Benson/Chang to incorporate the teachings of Frueh, and to apply the matching score above a threshold to Hefny/Nechyba/Benson/Chang’s teachings, such that, if a threshold level of confidence is reached, the first portion of the representation is displayed without any blur or distortion.
Doing so would significantly enhance the social connection between users in a virtual 3-D scene.
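For illustration only, extending the previous sketch (in Python, with a hypothetical name and threshold; not drawn from Frueh), the claim 16 limitation can be modeled as a threshold gate: once a portion's confidence reaches the threshold, the portion is displayed with no blur or distortion at all:

    # Illustrative sketch only (hypothetical name/threshold). Once confidence
    # reaches the threshold, the portion is displayed without any blur or
    # distortion; otherwise blur is applied as in the previous sketch.

    def effective_blur(confidence, threshold=0.8, max_radius=8.0):
        if confidence >= threshold:  # threshold level of confidence reached
            return 0.0               # display the portion with no blur at all
        return max_radius * (1.0 - min(max(confidence, 0.0), 1.0))

    print(effective_blur(0.85))  # -> 0.0 (rendered sharp)
    print(effective_blur(0.40))  # -> approx. 4.8 (rendered with blur)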
Conclusion
14. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. These references are listed on the attached PTO-892 form.
15. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL LE whose telephone number is (571)272-5330. The examiner can normally be reached 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached at (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL LE/Primary Examiner, Art Unit 2614