Prosecution Insights
Last updated: April 19, 2026
Application No. 17/220,892

Systems and Methods for Generating an Interactive Avatar Model

Non-Final OA — §103, §112
Filed: Apr 01, 2021
Examiner: ALABI, OLUWATOSIN O
Art Unit: 2129
Tech Center: 2100 — Computer Architecture & Software
Assignee: VIDEX, INC.
OA Round: 6 (Non-Final)
Grant Probability: 58% (Moderate)
Expected OA Rounds: 6-7
Time to Grant: 3y 8m
Grant Probability With Interview: 85%

Examiner Intelligence

Career Allow Rate: 58% (grants 58% of resolved cases; 116 granted / 199 resolved; +3.3% vs TC avg)
Interview Lift: +26.3% (strong; difference in allow rate between resolved cases with and without an interview)
Typical Timeline: 3y 8m average prosecution; 45 applications currently pending
Career History: 244 total applications across all art units
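For readers who want to reproduce these headline figures from raw outcome data, here is a minimal Python sketch. The 116 granted / 199 resolved split is taken from the panel above; the interviewed vs. non-interviewed breakdown is invented purely so the script lands on the displayed +26.3-point lift, and the `Outcome` record type is hypothetical rather than any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    granted: bool        # resolved as a grant rather than an abandonment
    had_interview: bool  # at least one examiner interview of record

def allow_rate(cases):
    """Career allow rate: grants as a share of resolved cases."""
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases):
    """Percentage-point gap in allow rate: interviewed vs. not."""
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return (allow_rate(with_iv) - allow_rate(without_iv)) * 100

# 116 granted / 199 resolved comes from the panel above; the 60/139
# interview split below is fabricated so the lift works out to +26.3 pts.
cases = ([Outcome(True, True)] * 46 + [Outcome(False, True)] * 14 +
         [Outcome(True, False)] * 70 + [Outcome(False, False)] * 69)

print(f"career allow rate: {allow_rate(cases):.0%}")           # 58%
print(f"interview lift:    {interview_lift(cases):+.1f} pts")  # +26.3 pts
```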

Statute-Specific Performance

§101: 21.9% (-18.1% vs TC avg)
§103: 40.0% (+0.0% vs TC avg)
§102: 9.5% (-30.5% vs TC avg)
§112: 23.2% (-16.8% vs TC avg)
Tech Center average shown for comparison is an estimate. Based on career data from 199 resolved cases.
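The per-statute deltas are simple differences against the Tech Center baseline; notably, every delta shown above implies the same 40.0% baseline. A sketch of that arithmetic, with the caveat that what each percentage actually measures (plausibly the share of rejections under that statute that applicants ultimately overcome) is an assumption rather than something the panel states:

```python
# Rates shown in the panel above. What each rate measures is assumed,
# not documented; the 0.400 TC baseline is implied by the deltas.
EXAMINER = {"§101": 0.219, "§103": 0.400, "§102": 0.095, "§112": 0.232}
TC_AVERAGE = 0.400  # Tech Center average estimate (the chart's black line)

for statute, rate in EXAMINER.items():
    delta_pts = (rate - TC_AVERAGE) * 100
    print(f"{statute}: {rate:.1%} ({delta_pts:+.1f}% vs TC avg)")
```

Run as-is, this prints exactly the four lines shown above, e.g. "§101: 21.9% (-18.1% vs TC avg)".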

Office Action

Grounds of rejection: §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 08/04/2025 has been entered.

Priority

Applicant's claim for the benefit of priority as a continuation of U.S. Patent Application No. 13/852,126, filed on 03/28/2013, which claims priority to, and the benefit of, U.S. Provisional Application No. 61/618,593, filed on 03/30/2012, is acknowledged.

Specification

The substitute specification filed 04/01/2021 has been entered.

Drawings

The drawings were received on 03/31/2022. These drawings are acceptable.

Claim Interpretation

The examiner notes the following guidance regarding the interpretation of claim language for determining the broadest reasonable interpretation (BRI) in light of the specification. The MPEP notes:

MPEP 2111: CLAIMS MUST BE GIVEN THEIR BROADEST REASONABLE INTERPRETATION IN LIGHT OF THE SPECIFICATION. "During patent examination, the pending claims must be 'given their broadest reasonable interpretation consistent with the specification.' The Federal Circuit's en banc decision in Phillips v. AWH Corp., 415 F.3d 1303, 1316, 75 USPQ2d 1321, 1329 (Fed. Cir. 2005) expressly recognized that the USPTO employs the 'broadest reasonable interpretation' standard: The Patent and Trademark Office ('PTO') determines the scope of claims in patent applications not solely on the basis of the claim language, but upon giving claims their broadest reasonable construction 'in light of the specification as it would be interpreted by one of ordinary skill in the art.' In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1364, 70 USPQ2d 1827, 1830 (Fed. Cir. 2004)…" (emphasis added)

MPEP 2111.01(I): THE WORDS OF A CLAIM MUST BE GIVEN THEIR "PLAIN MEANING" UNLESS SUCH MEANING IS INCONSISTENT WITH THE SPECIFICATION. "Under a broadest reasonable interpretation (BRI), words of the claim must be given their plain meaning, unless such meaning is inconsistent with the specification. The plain meaning of a term means the ordinary and customary meaning given to the term by those of ordinary skill in the art at the relevant time. The ordinary and customary meaning of a term may be evidenced by a variety of sources, including the words of the claims themselves, the specification, drawings, and prior art. However, the best source for determining the meaning of a claim term is the specification; the greatest clarity is obtained when the specification serves as a glossary for the claim terms… The presumption that a term is given its ordinary and customary meaning may be rebutted by the applicant by clearly setting forth a different definition of the term in the specification. In re Morris, 127 F.3d 1048, 1054, 44 USPQ2d 1023, 1028 (Fed. Cir. 1997) (the USPTO looks to the ordinary use of the claim terms taking into account definitions or other 'enlightenment' contained in the written description); but cf. In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1369, 70 USPQ2d 1827, 1834 (Fed. Cir. 2004) ('We have cautioned against reading limitations into a claim from the preferred embodiment described in the specification, even if it is the only embodiment described, absent clear disclaimer in the specification.'). When the specification sets a clear path to the claim language, the scope of the claims is more easily determined and the public notice function of the claims is best served." (emphasis added)

MPEP 2111.01(II): IT IS IMPROPER TO IMPORT CLAIM LIMITATIONS FROM THE SPECIFICATION. "Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment. Superguide Corp. v. DirecTV Enterprises, Inc., 358 F.3d 870, 875, 69 USPQ2d 1865, 1868 (Fed. Cir. 2004). See also Liebel-Flarsheim Co. v. Medrad Inc., 358 F.3d 898, 906, 69 USPQ2d 1801, 1807 (Fed. Cir. 2004) (discussing recent cases wherein the court expressly rejected the contention that if a patent describes only a single embodiment, the claims of the patent must be construed as being limited to that embodiment)…" (emphasis added)

The following claim terms are considered under BRI in light of the specification, where the specification provides no definition that limits the plain meaning given and documented by the examiner:

Avatar: any representation of a person in a cartoon-like/digitized image, or another type of character having human characteristics of the human the avatar is based on. The avatars are animated autonomously to interact with users other than the human whose characteristics the avatar is based on.

Contemporaneous: existing or occurring in the same period of time. The examiner interprets any avatar or virtual agent as related to contemporaneous events/things used for training or developing the avatar or virtual agent to function autonomously in a computing environment (e.g., captured using initial training data, video records, or other recorded data used to render the avatar or virtual agent).

Response to Arguments

Applicant's arguments filed 08/04/2025 have been fully considered.

Regarding the applicant's remarks directed to the rejection under 35 USC 112, the amended limitations remove the problematic language, rendering those remarks moot. The rejection noted in the previous Office action is no longer applicable and has been withdrawn.

Regarding the applicant's remarks directed to the rejection under 35 USC 103, the examiner's responses are provided below.

First, the applicant argues that the cited prior art, Xu et al. (US Pub. No. 2012/0130717, hereinafter 'Xu'), fails to disclose the newly amended claim limitations. The examiner has not previously rejected the noted amended limitations and refers to the current Office action. See the current Office action below.

Second, the applicant argues that the cited prior art fails to disclose the limitations requiring the human subject to be different from the user, because Xu teaches an avatar that behaves in an identical manner regardless of which user it is interacting with. The examiner respectfully disagrees. MPEP 2111 requires that claim limitations be given their broadest reasonable interpretation (BRI) in light of the specification.
In addition, although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim.

The examiner notes that the claim limitations related to the applicant's remarks are in the claim 41 limitation "transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject," where the avatar is generated based on video data of the human subject in the limitations "receiving, at a server hosting a rendering engine, video data depicting a human subject, the video data captured by a camera of a first client device; processing, by the server, the video data to determine a set of human characteristics representative of the human subject, …, avatar data including a simulated human memory, the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data." Under the BRI, the human subject is the subject used to make the avatar, and any other user interacting with the avatar is a user different from the claimed human subject. The claim limitations do not specify how the avatar interacts differently with a first different user versus a second different user.

The examiner further notes that the teachings of the prior art references are not restricted by the claim limitations in the manner the applicant alleges. For example, Xu's teachings in paragraph 0050 are not analogous to the applicant's characterization regarding transmitting details of the monitored interaction (in the absence of the human subject) to another user who is not the human subject. As shown above, the claim limitations relate the human characteristics, as avatar data, used to render the claimed avatar as a rendered human subject, and the other user interacting with the avatar is engaged in an interaction that is monitored/transmitted for facilitating autonomous communication between an avatar and a user in a game application or social networking environment, as disclosed by the Xu and Kuhn references.

The applicant's claims include the use of the avatar interaction data as part of the data used in rendering the avatar and do not exclude the other user from the interaction, as claimed in the limitations "receiving, at a server hosting a rendering engine, video data depicting a human subject, the video data captured by a camera of a first client device; processing, by the server, the video data to determine a set of human characteristics representative of the human subject, the set of human characteristics including physical appearance characteristics, voice characteristics, and personality characteristics; maintaining, in memory of the server, avatar data including a simulated human memory, the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data." The claimed invention is rendered an obvious combination for generating an avatar based on avatar data from user interactions with the avatar and a human subject (e.g., a dealer in a game is a human subject for generating the virtual/avatar dealer).

The applicant's remarks appear to disavow the scope of their own claimed invention, which clearly includes the use of artificial intelligence techniques in generating and deploying an avatar, of a human subject, that learns (e.g., updates memory and transmits interaction details) from interactions with others (i.e., persons other than the avatar's human subject). The applicant's remarks appear to support that the teachings of the Xu and Kuhn references are within the scope of the claimed invention: Kuhn teaches avatar data captured from human subjects interacting with the dealer, as the claimed "details of a previous interaction with an avatar…" requires. The applicant appears to be arguing against their own claim limitations, as those limitations provide support for making the prior art combination. Also, see the current rejection under 35 USC 112(b). The examiner reminds the applicant that claim limitations must be given their BRI, and limitations not recited in the claims cannot be used to limit the scope of the claims to a preferred embodiment; see MPEP 2111. The applicant's allegations are considered mere allegations of patentability, as the remarks rely on features not recited in the claim limitations.

Xu does teach rendering an avatar that can adapt its behavior, in Xu [0019] and [0024]. The secondary reference Kuhn (US Pat. No. 9,202,171, hereinafter 'Kuhn') also teaches the use of an avatar/virtual dealer for interacting with players, as other users, during a game. In response to the applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).

The applicant argues that the examiner has not made a prima facie case of obviousness because one of ordinary skill would not be motivated to combine the Xu and Kuhn references. Specifically, the applicant alleges that Kuhn teaches the rendering of an avatar autonomously, while Xu discloses capturing elements of the human subject to render an avatar autonomously. The examiner disagrees.
The guidance for documenting a prima facie case of obviousness is specified in MPEP 2141. That MPEP section provides a clear framework for the objective analysis used for determining obviousness under 35 U.S.C. 103, stated in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), as follows: "(A) Determining the scope and content of the prior art; (B) Ascertaining the differences between the claimed invention and the prior art; and (C) Resolving the level of ordinary skill in the pertinent art." The examiner has identified all the required elements in both the previous and current Office actions. Therefore, a prima facie case has been made of record by the examiner. When a prima facie case of obviousness is established, the burden shifts to the applicant to come forward with arguments and/or evidence to rebut the prima facie case; see MPEP 2145. The applicant has provided no objective evidence to conclude that the proposed modifications and combinations recited in the secondary references would change the principle of operation of the primary reference.

In addition, both Xu and Kuhn are in the same field of endeavor: generating virtual agents (e.g., avatars) based on a human subject using information processing and machine learning techniques. Specifically, Xu states in 0019:

"This disclosure describes an architecture and techniques for providing an expressive avatar for various applications. For instance, the techniques described below may allow a user to represent himself or herself as an avatar in some applications, such as chat applications, game applications, social network applications, and the like. Furthermore, the techniques may enable the avatar to express a range of emotional states with realistic facial expressions, lip synchronization, and head movements to communicate in a more interactive manner with another user." (emphasis added)

Kuhn (US Pat. No. 9,202,171, hereinafter 'Kuhn') teaches techniques to generate a virtual dealer that can be animated for presentation as a hologram or video, considered an avatar, for a game application, in its abstract:

"Virtual game dealers based on artificial intelligence are described. In one implementation, an artificial intelligence (AI) engine tracks player attributes and game states of a given electronic game, such as a multiplayer electronic card game hosted by a virtual dealer. The virtual dealer may be embodied as a video, hologram, or robot. The AI engine selects speech and gestures for the virtual dealer based on the game states and player attributes… Supported by the AI engine, the virtual dealer may personalize dialogue, provide information, and perform card and chip tricks. The AI engine may also animate a virtual player and select interactions between the virtual dealer and the virtual player based on game states and attributes of the human players."

And in Kuhn 5:40-56: "The output of the AI engine 102 is a signal representing intelligent reactions 406 for the virtual dealer 106, to be projected by the virtual dealer projector (projection engine) 636. The virtual dealer projector 636 may break the task of generating the virtual dealer 106 into several components. Thus, the virtual dealer projector 636 may have an emotion engine 638, a speech engine 640, a gesture engine 642, a character engine 644, and an accessories engine 646. The virtual dealer projector 636 also includes a video generator 518 and an audio generator 520. The video generator 518 may include a video synthesizer 648 to animate a video dealer image or avatar and/or may include stored video segments 650 or 'clips' that can be executed to generate virtual dealer behavior…" (emphasis added)

The examiner highlights the cited sections as further support that the Xu and Kuhn references are in the same field of endeavor, as required by MPEP 2141.

Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. The rejections made in the previous Office action have been maintained.

Applicant argues that the prior art fails to teach elements of the amended claims. The examiner notes that the amended claims have not been previously examined. See the current Office action for the rejection of the amended claim limitations under 35 USC 103.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 41-60 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Regarding claim 41, the claim recites the limitation "transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device, the presenting occurring independent of contemporaneous input from the human subject relating to the presenting" (emphasis added), which renders the claim indefinite. Specifically, "contemporaneous" usually refers to events or things that occurred or existed during the same broad period of time, and the claimed "the presenting" depends on the limitations "receiving, at a server hosting a rendering engine, video data depicting a human subject, the video data captured by a camera of a first client device; processing, by the server, the video data to determine a set of human characteristics representative of the human subject, the set of human characteristics including physical appearance characteristics, voice characteristics, and personality characteristics; maintaining, in memory of the server, avatar data including a simulated human memory, the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data; … transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device, …" (emphasis added).
The examiner notes that these limitations appear to require the avatar data to depend on the claimed "set of human characteristics" captured by the claimed "video data depicting a human subject," wherein the claim requires "the avatar being a computer-generated autonomous representation of the human subject based on the avatar data." This appears to be data dependent on contemporaneous input from the human subject, thereby making any presentation of the avatar in the claimed invention "presenting occurring dependent on contemporaneous input from the human subject relating to the presenting" per the noted claim limitations. The newly amended requirement "…presenting the avatar to a user different from the human subject at a first-time in a display of the second client device, the presenting occurring independent of contemporaneous input from the human subject relating to the presenting" appears to conflict with the requirement that the presented avatar depend on input taken from the human subject at the same time the human subject is present (e.g., alive at the time of recording the claimed video data), as noted in the set of limitations preceding the problematic claim limitation, which require the claimed presenting to be dependent on contemporaneous input from the human subject relating to the presenting (as noted above). This makes the scope of the claim limitation indefinite, and one of ordinary skill in the art would be unable to ascertain the intended scope of the claimed invention. The examiner interprets the amended claim such that any interaction between an avatar and another user is within the scope of the claim limitation.

Regarding claims 49 and 57, the claims recite limitations similar to those noted in the claim 41 rejection and are rejected under the same rationale.

Regarding the dependent claims that depend on independent claims 41, 49, and 57, the dependent claims do not resolve the deficiencies noted in their respective independent claims. Thus, the dependent claims are appropriately rejected for the issues noted above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 41-43, 45-47, 49-51, 53-55, and 57-59 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US Pub. No. 2012/0130717, hereinafter 'Xu') in view of Kuhn (US Pat. No. 9,202,171, hereinafter 'Kuhn'), and further in view of Deng et al. (NPL: "Computer Facial Animation: A Survey", hereinafter 'Deng').

Regarding the independent claim 41 limitations, Xu teaches:

a method, performed by one or more computing systems of enhancing the anthropomorphism of an interactive avatar through recall simulation, the method comprising: (Xu teaches in 0075-0076: FIG. 9 is a block diagram showing an example server usable with the environment of FIG. 1.
The server 112 may be configured as any suitable system capable of services, which includes, but is not limited to, implementing the avatar-based service [computing systems of enhancing the anthropomorphism of an interactive avatar through recall simulation] 110 for online services, such as providing avatars in instant-messaging programs. In one example configuration, the server 114 comprises at least one processor 900, a memory 902, and a communication connection(s) 904… Turning to the contents of the memory 902 in more detail, the memory 902 may store an operating system 906, and the avatar application 116…; And 0082: The server 114 [a computing system] may also include additional removable storage 914 and/or non-removable storage 916. Any memory described herein may include volatile memory (such as RAM), nonvolatile memory, removable memory, and/or non-removable memory, implemented in any method or technology for storage of information, such as computer-readable storage media, computer-readable instructions, data structures, applications, program modules, emails, and/or other content. Also, any of the processors described herein may include onboard memory in addition to or instead of the memory shown in the figures….)

receiving, at a server hosting a rendering engine, video data depicting a human subject, the video data captured by a camera of a first client device; processing, by the server, the video data to determine a set of human characteristics representative of the human subject, the set of human characteristics including physical appearance characteristics, voice characteristics, and personality characteristics; (Xu in 0050-0054: …For example, the process may apply and track about 60 or more facial markers to capture facial features when expressing facial expressions. Multiple cameras [the video data captured by a camera of a first client device] may record the movement to a computer [claimed receiving, at a server hosting a rendering engine, video data depicting a human subject]. The performance capture [claimed processing, by the server, the video data to determine a set of human characteristics representative of the human subject, the set of human characteristics including physical appearance characteristics, voice characteristics, and personality characteristics] may use a higher resolution to detect and to track subtle facial expressions [claimed personality characteristics], such as small movements of the eyes and lips [claimed voice characteristics]… The markers may be placed on each side of the shoulder and in the back. Implementations of the data include using a live video feed or a recorded video stored in the database 118… The avatar application 116 processes the speech and observations to identify the relationships between the speech, facial expressions, and head and shoulder movements. The avatar application 116 uses the relationships to create one or more animated models for the different upper body parts… In an implementation, the one or more animated models learn and train from the observations of the speech and motion data to generate probabilistic motions of the upper body parts… Returning to FIG. 4, at 402, the avatar application 116 extracts features based on speech signals of the data [claimed voice characteristics]. The avatar application 116 extracts segmented speech phoneme and prosody features from the data. The speech phoneme is further segmented into some or all of the following: individual phones, diphones, half-phones, syllables, morphemes, words, phrases, and sentences to determine speech characteristics. Figs. 1 and 9 depict the server system for performing the claimed functions; in 0075-0076: FIG. 9 is a block diagram showing an example server usable with the environment of FIG. 1. The server 112 [at a server hosting a rendering engine] may be configured as any suitable system capable of services, which includes, but is not limited to, implementing the avatar-based service [at a server hosting a rendering engine] 110 for online services, such as providing avatars in instant-messaging programs. In one example configuration, the server 114 comprises at least one processor 900, a memory 902, and a communication connection(s) 904… Turning to the contents of the memory 902 in more detail, the memory 902 may store an operating system 906, and the avatar application 116…; And 0082: The server 114 [at a server hosting a rendering engine] may also include additional removable storage 914 and/or non-removable storage 916. Any memory described herein may include volatile memory (such as RAM), nonvolatile memory, removable memory, and/or non-removable memory, implemented in any method or technology for storage of information, such as computer-readable storage media, computer-readable instructions, data structures, applications, program modules, emails, and/or other content. Also, any of the processors described herein may include onboard memory in addition to or instead of the memory shown in the figures….)

maintaining, in memory of the server, avatar data including a simulated human memory, the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data; (Xu teaches simulated memory as trained models for creating animated models, in 0053: The avatar application 116 [claimed maintaining, in memory of the server, avatar data including a simulated human memory, the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data] processes the speech and observations to identify the relationships between the speech, facial expressions, and head and shoulder movements [claimed … the avatar data being based on the set of human characteristics, the simulated human memory comprising details of a previous interaction with an avatar, …]. The avatar application 116 uses the relationships to create one or more animated models for the different upper body parts. The animated model may perform similar to a probabilistic trainable model, such as Hidden Markov Models (HMM) or Artificial Neural Networks (ANN). For example, HMMs are often used for modeling as training is automatic and the HMMs are simple and computationally feasible to use. In an implementation, the one or more animated models [claimed maintaining, in memory of the server, avatar data including a simulated human memory, …] learn and train from the observations of the speech and motion data [claimed the simulated human memory comprising details of a previous interaction with an avatar] to generate probabilistic motions of the upper body parts.; Additionally, capturing observations for generating personalized avatar models, in 0043-0053: FIG. 3 is a flowchart showing an illustrative process of creating a personalized avatar comprising an animated representation of an individual 202 [claimed the simulated human memory comprising details of a previous interaction with an avatar, the avatar being a computer-generated autonomous representation of the human subject based on the avatar data] (discussed at a high level above)… At 300, the avatar application 116 receives a frontal view image of the user 102 as viewed on the computing device 106. Images for the frontal view may start from a top of a head down to a shoulder in some instances, while in other instances these images may include an entire view of a user from head to toe. The images may be photographs or taken from sequences of video, and in color or in black or white. In some instances, the applications for the avatar 104 focus primarily on movements of upper body parts, from the top of the head down to the shoulder. Some possible applications with the upper body parts are to use the personalized avatar 104 as a virtual news anchor, a virtual assistant, a virtual weather person, and as icons in services or programs. Other applications may focus on a larger or different size of avatar, such as a head-to-toe version of the created avatar… The personalized avatar represents dimensions of the user's features as close as possible without any enlargement of any feature. In an implementation, the avatar application 116 may… The avatar application 116 receives speech and motion data [claimed the simulated human memory comprising details of a previous interaction with an avatar, …] to create animated models 400. The speech and motion data may be collected using motion capture and/or performance capture, which records movement of the upper body parts and translates the movement onto the animated models [claimed the simulated human memory comprising details of a previous interaction with an avatar, …]. The upper body parts include but are not limited to one or more of an overall face, a chin, a mouth, a tongue, a lip, a nose, eyes, eyebrows, a forehead, cheeks, a head, and a shoulder. Each of the different upper body parts may be modeled using the same or different observation data….)

receiving, at the server, a request to present the avatar at a second client device; (Processes may be performed using different devices, including the claimed second device, in 0035: …For discussion purposes, the processes are described with reference to the computing environment 100 shown in FIG. 1.
However, the processes may be performed using different environments and devices. Moreover, the environments and devices [including the claimed second client device] described herein may be used to perform different processes. And including processes to render the avatar, as in the claimed request to present, in 0042: …This phase includes combining the personalized avatar generated 202 with the mapping of a number of points (e.g., about 92 points, etc.) to the face to generate a 2D cartoon avatar. The 2D cartoon avatar is a low resolution, which allows rendering of this avatar to occur on many computing devices [receiving, at the server, a request to present the avatar at a second client device].)

processing, by the rendering engine of the server and in response to the request, animation data corresponding to the avatar data to generate a first set of rendered frames; (Xu teaches in 0035: …For discussion purposes, the processes are described with reference to the computing environment 100 shown in FIG. 1. However, the processes may be performed using different environments and devices. Moreover, the environments and devices [including the claimed second client device] described herein may be used to perform different processes. And including processes to render the avatar, as in the claimed request to present, in 0042: …This phase includes combining the personalized avatar generated 202 with the mapping of a number of points (e.g., about 92 points, etc.) to the face to generate a 2D cartoon avatar [processing, by the rendering engine of the server and in response to the request, animation data corresponding to the avatar data to generate a first set of rendered frames]. The 2D cartoon avatar is a low resolution, which allows rendering [by the rendering engine of the server] of this avatar to occur on many computing devices [receiving, at the server, a request to present the avatar at a second client device].)

transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device, the presenting occurring independent of contemporaneous input from the human subject relating to the presenting; (Xu teaches in 0019: This disclosure describes an architecture and techniques for providing an expressive avatar for various applications [transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar … at a first-time in a display of the second client device]. For instance, the techniques described below may allow a user to represent himself or herself as an avatar in some applications, such as chat applications, game applications, social network applications, and the like… For example, the user, through the avatar, may express feelings of happiness while inputting text into an application; in response, the avatar's lips may turn up at the corners to show the mouth of the avatar smiling while speaking. By animating the avatar [the first set of rendered frames for presenting the avatar … at a first-time in a display of the second client device] in this manner, the other user [a user different from the human subject at a first-time in a display of the second client device] that views the avatar [transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device] is more likely to respond accordingly based on the avatar's visual appearance. Stated otherwise, the expressive avatar may be able to represent the user's mood to the other user [the presenting occurring independent of contemporaneous input from the human subject relating to the presenting], which may result in a more fruitful and interactive communication. And Xu teaches the animation as the claimed transmitting using video frames in 0024: A variety of applications may use the expressive avatar. The expressive avatar may be referred to as a digital avatar [transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device], a cartoon character, or a computer-generated character that exhibits human characteristics. The various applications using the avatar include but are not limited to instant-messaging programs, social networks, video or online games, cartoons, television programs, movies, videos, virtual worlds, and the like. For example, an instant-messaging program displays an avatar representative of a user in a small window [the presenting occurring independent of contemporaneous input from the human subject relating to the presenting]. Through text-to-speech technology, the avatar speaks the text as the user types the text being used at a chat window [the presenting occurring independent of contemporaneous input from the human subject relating to the presenting, as the independent presenting of the speaking avatar and user input]. In particular, the user is able to share their mood, temperament, or disposition with the other user [a user different from the human subject at a first-time in a display of the second client device], by having the avatar exhibit facial expressions synchronized with head/shoulder movements representative of the emotional state of the user…)

monitoring, by the second client device, an interaction between the avatar and the user different from the human subject, the interaction comprising the avatar responding autonomously, based on the avatar data, to input from the user different from the human subject; (Xu teaches in 0019: …For example, the user, through the avatar [the interaction comprising the avatar responding autonomously, based on the avatar data], may express feelings of happiness while inputting text into an application; in response, the avatar's lips may turn up at the corners to show the mouth of the avatar smiling while speaking. By animating the avatar in this manner, the other user [the user different from the human subject] that views the avatar is more likely to respond accordingly based on the avatar's visual appearance [monitoring, by the second client device, an interaction between the avatar and the user different from the human subject, …]. Stated otherwise, the expressive avatar may be able to represent the user's mood to the other user [a user different from the human subject], which may result in a more fruitful and interactive communication [monitoring, by the second client device, an interaction between the avatar and the user different from the human subject, the interaction comprising the avatar responding autonomously, based on the avatar data, to input from the user different from the human subject].)

transmitting, from the second client device to the server, details related to the interaction to establish an updated simulated human memory related to the user different from the human subject; (Xu teaches learning as the claimed updating of the simulated memory, in 0053: …In an implementation, the one or more animated models learn and train [claimed transmitting, from the second client device to the server, details related to the interaction to establish an updated simulated human memory related to the user] from the observations [claimed details related to the interaction to establish an updated simulated human memory related to the user] of the speech and motion data to generate probabilistic motions of the upper body parts; And observations of the user, including the inputted observations of the avatar interacting with a user other than the human subject, in 0019: …For example, the user, through the avatar [transmitting, from the second client device to the server, details related to the interaction], may express feelings of happiness while inputting text into an application; in response, the avatar's lips may turn up at the corners to show the mouth of the avatar smiling while speaking. By animating the avatar in this manner, the other user [to establish an updated simulated human memory related to the user different from the human subject] that views the avatar is more likely to respond accordingly based on the avatar's visual appearance. Stated otherwise, the expressive avatar may be able to represent the user's mood to the other user [details related to the interaction to establish an updated simulated human memory related to the user different from the human subject], which may result in a more fruitful and interactive communication [transmitting, from the second client device to the server, details related to the interaction to establish an updated simulated human memory related to the user different from the human subject].) And in 0022-0024: The avatar application receives real-time speech input and synthesizes an animated sequence of motion of the upper body parts by applying the animated model [transmitting, from the second client device to the server, details related to the interaction to establish an updated simulated human memory related to the user different from the human subject]… The various applications using the avatar include but are not limited to instant-messaging programs, social networks, video or online games, cartoons, television programs, movies, videos, virtual worlds, and the like. For example, an instant-messaging program displays an avatar representative of a user in a small window. Through text-to-speech technology, the avatar speaks the text as the user types the text being used at a chat window.
In particular, the user is able to share their mood, temperament, or disposition with the other user [transmitting, from the second client device to the server, details related to the interaction to establish an updated simulated human memory related to the user different from the human subject], by having the avatar exhibit facial expressions synchronized with head/shoulder movements representative of the emotional state of the user…. And the claimed simulated memory in 0056: At 406, the avatar application 116 trains the one or more animated models by using the extracted features from the speech 402, motion trajectories transformed from the motion data 404, and speech and motion data 400. The avatar application 116 trains the animated models using the extracted features, such as sentences, phrases, words, phonemes, and transformed motion trajectories on a new coordinate motion. In particular, the animated model may generate a set of motion trajectories, referred to as probabilistic motion sequences of the upper body parts, based on the extracted features of the speech...

and transmitting, by the server to the second client device, a second set of rendered frames for presenting the avatar at a second-time in the display of the second client device, the second set of rendered frames comprising a sequence of at least one phoneme and at least one viseme that substantially convey a reference to the monitored interaction between the avatar and the user different from the human subject based on the updated simulated human memory. (Xu teaches providing the avatar in real time, in 0077-0079: The avatar application 116 provides access to the avatar-based service 110. It receives real-time speech input. The avatar application 116 further provides a display of the application on the user interface [claimed transmitting, by the server to the second client device, a second set of rendered frames for presenting the avatar at a second-time in the display of the second client device], and interacts with the other modules to provide the real-time animation of the avatar in 2D… The training model module 908 receives the speech and motion data, builds, and trains the animated model. The training model module 908 computes relationships between speech and upper body parts motion by constructing the one or more animated models [the updated simulated human memory] for the different upper body parts…. The synthesis module 910 synthesizes an animated sequence of motion of upper body parts by applying the animated model [the updated simulated human memory] in response to the real-time speech input… The synthesis module 910 provides an output of speech corresponding to the real-time speech input [claimed the second set of rendered frames comprising a sequence of at least one phoneme and at least one viseme that substantially convey a reference to the monitored interaction between the avatar and the user different from the human subject based on the updated simulated human memory], and constructs a real-time animation based on the output of speech synchronized to the animation sequence of motions of the one or more upper body parts [output speech synchronized to animation, as in the claimed sequence of at least one phoneme… that substantially conveys a reference to the monitored interaction between the avatar and the user different from the human subject based on the updated simulated human memory].; And the animation model for generating a user avatar to interact with another user, in 0019: …For instance, the techniques described below may allow a user to represent himself or herself as an avatar in some applications [claimed transmitting, by the server to the second client device, a second set of rendered frames for presenting the avatar at a second-time in the display of the second client device, …], such as chat applications, game applications, social network applications, and the like. Furthermore, the techniques may enable the avatar to express a range of emotional states with realistic facial expressions, lip synchronization, and head movements to communicate in a more interactive manner with another user. In some instances, the expressed emotional states may correspond to emotional states being expressed by the user. For example, the user, through the avatar, may express feelings of happiness while inputting text into an application; in response, the avatar's lips may turn up at the corners to show the mouth of the avatar smiling while speaking. By animating the avatar in this manner, the other user that views the avatar is more likely to respond accordingly based on the avatar's visual appearance. Stated otherwise, the expressive avatar may be able to represent the user's mood to the other user [the user different from the human subject], which may result in a more fruitful and interactive communication [the monitored interaction between the avatar and the user different from the human subject based on the updated simulated human memory].)

Xu teaches the avatar that is displayed on a user interface device to process speech inputs in real time at a first and a second time, as noted above. One of ordinary skill in the art would understand that a user interacts with an avatar in real time using speech input in a sequence of closely spaced first and second interaction events (e.g., at a first and a second time).

Xu does not expressly disclose user interaction with the avatar in real time using speech input in a sequence including a first and second time period of an interaction sequence for presenting an avatar, as claimed in "transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device,… transmitting, by the server to the second client device, a second set of rendered frames for presenting the avatar at a second-time in the display of the second client device,…"

Kuhn expressly teaches user interaction with an avatar in real time using speech input in a sequence including a first and second time period of an interaction sequence for presenting an avatar, as claimed in "transmitting, by the server to the second client device, the first set of rendered frames for presenting the avatar to a user different from the human subject at a first-time in a display of the second client device,… transmitting, by the server to the second client device, a second set of rendered frames for presenting the avatar at a second-time in the display of the second client device,…" in 3:11-41: Rather than presenting the exemplary virtual game dealers 106 described here…

Prosecution Timeline

Apr 01, 2021
Application Filed
Feb 10, 2023
Non-Final Rejection — §103, §112
Jul 14, 2023
Response Filed
Aug 26, 2023
Final Rejection — §103, §112
Feb 23, 2024
Request for Continued Examination
Mar 02, 2024
Response after Non-Final Action
Mar 07, 2024
Interview Requested
Apr 02, 2024
Applicant Interview (Telephonic)
Apr 02, 2024
Examiner Interview Summary
Apr 20, 2024
Non-Final Rejection — §103, §112
Oct 23, 2024
Response Filed
Feb 01, 2025
Final Rejection — §103, §112
Aug 04, 2025
Request for Continued Examination
Aug 06, 2025
Response after Non-Final Action
Sep 09, 2025
Non-Final Rejection — §103, §112
Nov 06, 2025
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579409
IDENTIFYING SENSOR DRIFTS AND DIVERSE VARYING OPERATIONAL CONDITIONS USING VARIATIONAL AUTOENCODERS FOR CONTINUAL TRAINING
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12572814
ARTIFICIAL NEURAL NETWORK BASED SEARCH ENGINE CIRCUITRY
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12561570
METHODS AND ARRANGEMENTS TO IDENTIFY FEATURE CONTRIBUTIONS TO ERRONEOUS PREDICTIONS
Granted Feb 24, 2026 (2y 5m to grant)
Patent 12547890
AUTOREGRESSIVELY GENERATING SEQUENCES OF DATA ELEMENTS DEFINING ACTIONS TO BE PERFORMED BY AN AGENT
Granted Feb 10, 2026 (2y 5m to grant)
Patent 12536478
TRAINING DISTILLED MACHINE LEARNING MODELS
Granted Jan 27, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 6-7
Grant Probability: 58%
With Interview: 85% (+26.3%)
Median Time to Grant: 3y 8m
PTA Risk: High
Based on 199 resolved cases by this examiner. Grant probability is derived from the career allow rate.
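The with-interview projection looks like straight addition: the career allow rate plus the interview lift in percentage points, rounded (58.3 + 26.3 = 84.6, displayed as 85%). That formula is an inference from the displayed numbers, not a documented methodology; a minimal sketch under that assumption:

```python
def with_interview(base_pct: float, lift_pts: float) -> float:
    # Assumed model: add the interview lift (in points) to the base
    # grant probability and cap at 100. Inferred, not documented.
    return min(base_pct + lift_pts, 100.0)

print(round(with_interview(58.3, 26.3)))  # 85
```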
