Prosecution Insights
Last updated: April 19, 2026
Application No. 18/590,104

GENERATING ENGAGEMENT SCORES USING MACHINE LEARNING MODELS FOR USERS INTERACTING WITH VIRTUAL OBJECTS

Non-Final OA (§103)

Filed: Feb 28, 2024
Examiner: XIE, THEODORE L
Art Unit: 3623
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: International Business Machines Corporation
OA Round: 3 (Non-Final)

Grant Probability: 50% (Moderate)
OA Rounds: 3-4
To Grant: 1y 7m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 50% (grants 50% of resolved cases; 2 granted / 4 resolved; -2.0% vs TC avg)
Interview Lift: +100.0% (strong; among resolved cases with interview)
Avg Prosecution: 1y 7m (fast prosecutor; 38 currently pending)
Career History: 42 total applications across all art units

Statute-Specific Performance

§101: 36.6% (-3.4% vs TC avg)
§103: 43.9% (+3.9% vs TC avg)
§102: 9.4% (-30.6% vs TC avg)
§112: 10.1% (-29.9% vs TC avg)

Black line = Tech Center average estimate • Based on career data from 4 resolved cases
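The "vs TC avg" deltas above are consistent with a single Tech Center baseline: back-solving each displayed delta against its rate gives 40.0% in every row. A minimal sketch of that arithmetic, assuming the deltas are plain differences against one baseline (the 40.0% figure is inferred from the report, not stated in it):

```python
# Back-solved Tech Center baseline: each displayed delta equals the
# examiner's per-statute rate minus ~40.0%. The baseline is inferred
# from the deltas shown above, not stated directly in the report.
TC_AVG_ESTIMATE = 40.0  # percent

examiner_rates = {  # per-statute rates from the report, in percent
    "101": 36.6,
    "103": 43.9,
    "102": 9.4,
    "112": 10.1,
}

# Delta vs. Tech Center average, rounded to one decimal as displayed.
deltas = {s: round(r - TC_AVG_ESTIMATE, 1) for s, r in examiner_rates.items()}
print(deltas)  # {'101': -3.4, '103': 3.9, '102': -30.6, '112': -29.9}
```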

Office Action

§103
DETAILED ACTION

Status of Application

The following is a Non-Final Office Action. In response to Examiner's communication on 11/10/2025, Applicant on 01/13/2026 amended Claims 1, 3, 5-7, and 9-20. Claims 1-20 are now pending in this application and have been rejected below.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/13/2026 has been entered.

Response to Amendment

Applicant's amendments are sufficient to overcome the 35 USC 101 rejections set forth in the previous action. Accordingly, those rejections have been withdrawn below. Applicant's amendments are insufficient to overcome the 35 USC 103 rejections set forth in the previous action. Therefore, those rejections have been updated to address the amendments and are maintained below.

Response to Arguments – 35 USC § 101

Applicant's arguments with respect to the 35 USC 101 rejections have been fully considered and are found to be persuasive. The limitation in amended Claims 1, 11, and 16, “controlling an extended reality generator that renders in an extended reality display device the modified presentation of the virtual object to superimpose within a real-world environment…”, is sufficient to integrate the recited abstract ideas into a practical application by applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception. Accordingly, the rejections under 35 USC 101 have been withdrawn.
Response to Arguments – 35 USC § 103

Applicant's arguments with respect to the rejection of Claims 1-20 under 35 USC 103 have been considered but are moot in light of new grounds of rejection necessitated by Applicant's amendments.

Applicant first argues that Serhad does not teach the amended limitation of calculating an engagement score representing a likelihood that the user interacts with the object in real life. Examiner respectfully disagrees; see [0121] of Serhad: “In another embodiment, the virtual objects may relate to marketing and advertising, as depicted in block 540, and may be obtained from one or more related sources that store such marketing and advertising virtual objects. Such objects may be contextually related or unrelated to the surface on which they are to be overlayed. In one embodiment, a marketer or advertiser may overlay a virtual object to determine if the user may interact or engage with the virtual object. The marketer or advertiser may display such a virtual object either to specifically determine the user’s likes and dislikes or as part of a survey being conducted. The marketer or advertiser may use such data to promote their products and sell them to the user and others”. One possible application of this score is exactly the determination of a likelihood that the user will engage with or acquire the object in real life.

Applicant further argues that the prior art does not teach the limitations of “a second engagement score indicating a second likelihood the user further interacts with the real-world entity”, trained from input comprising user information requests with respect to the real-world entity. Examiner respectfully disagrees. Applicant asserts that there is no teaching or suggestion of outputting a second engagement score indicating user interest in a real-world entity, based on information access requests, that is separate from a first engagement score.
We can understand the conversion rate of Zhang to be the second engagement score, with the VR score of Serhad above serving as the first engagement score. In [0002] of Zhang, "A conversion rate is the percentage or proportion of visitors to a website or application that complete some predefined action (e.g., the download of a software instance within the message)". In [0033], "In certain embodiments, a message effectiveness prediction, such as a conversion rate prediction, can then be made for the message. For example, based on historical messages and their associated conversion rates, certain words may surpass a popularity threshold or otherwise be associated with certain conversion rates. Accordingly, an incoming message may use one or more various message elements that have historically been associated with particular conversion rates. Consequently, a predicted conversion rate can be generated based on patterns and associations of the historical messages and conversion rates”. As Zhang factors in the proportion of visitors that interact with the entity, i.e., navigating to a website or downloading a software application, we consider this to teach the second engagement score. The rejections have been supplanted by new grounds of rejection under 35 USC 103 and are maintained below.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 7-12, 15-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Serhad (WO 2023158797 A1) in view of Zhang (US 20210004437 A1) in further view of Kumar (US 20230368262 A1).

Claim 1

As to Claim 1, Serhad teaches: A computer program product for determining an engagement score of a user for a real-world entity represented by a virtual object in an extended-reality environment. In [0167], "The control circuitry may determine which of the multiple virtual objects displayed to enhance by increasing its size based on that user interest score calculated, referred to earlier in FIGS. 6 and 7. As such, when multiple virtual objects are displayed, a virtual object receiving the highest score may be increased to a larger size than other virtual objects receiving a lower score".

the computer program product comprising a computer readable storage medium having computer readable program code embodied therein that is executable to perform operations. In [0359], "It will be apparent to those of ordinary skill in the art that methods involved in the above-mentioned embodiments may be embodied in a computer program product that includes a computer-usable and/or -readable medium.
For example, such a computer-usable medium may consist of a read-only memory device, such as a CD- ROM disk or conventional ROM device, or a random-access memory, such as a hard drive device or a computer diskette, having a computer-readable program code stored thereon. It should also be understood that methods, techniques, and processes involved in the present disclosure may be executed using processing circuitry". the operations comprising: receiving, from a tracking device, movement parameters from a user in a real-world while the user is interacting with a virtual object in the extended-reality environment; processing, by a movement engagement machine learning model, the movement parameters to determine a first engagement score indicating a first likelihood the user further interacts with or acquires the real-world entity represented by the virtual object based on user interaction with the virtual object, wherein the movement engagement machine learning module is trained to output a predicted engagement score indicating a likelihood of a further interaction with a real-world entity represented by the virtual object in the extended-reality environment from input comprising movement parameters from a user in the real-world while interacting with the virtual object in the extended-reality environment In [0054] Serhad teaches, "Based on user interest, a plurality of virtual objects that can potentially be overlayed may be identified and scored. The score calculation, in one embodiment, may be performed by a scoring engine. The calculation may involve analyzing each virtual object in the library and applying a variety of formulas, weighted averages, means, and other calculations to determine a score. 
For example, in one embodiment, a score may be calculated based on a predetermined value times a component of user interest, e.g., a * seconds of gaze + b * verbal comments + c * heart rate delta + d * seconds of interaction with object + e * timing of the gaze + f * order of gaze with respect to other virtual objects + g * number of repeating gazes + h * magnitude of body movement change. Score may also be calculated based on relevance and context, urgency, and other factors".

While it is true that the score is used to determine the display and arrangement of virtual objects, Serhad further states in [0121]: “In another embodiment, the virtual objects may relate to marketing and advertising, as depicted in block 540, and may be obtained from one or more related sources that store such marketing and advertising virtual objects. Such objects may be contextually related or unrelated to the surface on which they are to be overlayed. In one embodiment, a marketer or advertiser may overlay a virtual object to determine if the user may interact or engage with the virtual object. The marketer or advertiser may display such a virtual object either to specifically determine the user’s likes and dislikes or as part of a survey being conducted. The marketer or advertiser may use such data to promote their products and sell them to the user and others”. One possible application of this score is exactly the determination of a likelihood that the user will engage with or acquire the object in real life.
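Serhad's [0054] scoring formula quoted above is a plain weighted sum of engagement signals. A minimal sketch of that calculation; the weight and feature values below are illustrative placeholders, not values from the reference:

```python
# Weighted user-interest score per Serhad [0054]:
# score = a*gaze_seconds + b*verbal_comments + ... + h*body_movement_change.
# All numeric values here are hypothetical.

def interest_score(features, weights):
    """Linear combination of engagement signals for one virtual object."""
    return sum(weights[name] * value for name, value in features.items())

weights = {  # stand-ins for coefficients a-h in Serhad's formula
    "seconds_of_gaze": 0.5,
    "verbal_comments": 1.0,
    "heart_rate_delta": 0.2,
    "seconds_of_interaction": 0.8,
    "gaze_timing": 0.1,
    "gaze_order": 0.3,
    "repeat_gazes": 0.4,
    "body_movement_change": 0.2,
}
features = {  # hypothetical observations for one session
    "seconds_of_gaze": 4.0,
    "verbal_comments": 1.0,
    "heart_rate_delta": 5.0,
    "seconds_of_interaction": 2.5,
    "gaze_timing": 1.0,
    "gaze_order": 2.0,
    "repeat_gazes": 3.0,
    "body_movement_change": 1.5,
}

score = interest_score(features, weights)
```

A higher score marks the object Serhad would enhance (e.g., enlarge) relative to lower-scoring objects.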
In [0262], "accessing components of electronic devices, such as cameras, gyroscopes, accelerometers, heart rate monitors, enhancing or removing tools, functions, and functionalities displayed on an interface of a participant of a conference call, invoking an AI or ML algorithm to perform an analysis on any of the above mentioned data, accessing user’s consumption history, gauging user’s interest in a virtual object, accessing virtual, mixed, or augmented reality headsets and their displays, animating virtual objects, and all the functionalities discussed associated with the figures mentioned in this application".

Serhad does not disclose the remaining limitations. However, Zhang teaches: determining information on user information access requests with respect to the real-world entity. In [0054], "By way of example and not limitation, data included in storage 225, as well as any user data, may generally be referred to throughout as data. The data within the storage 225 may be structured (e.g., tabular or database data), semi-structured, and/or unstructured (e.g., data within social media feeds, blogs, etc.). Any such data may be sensed or determined from a sensor (referred to herein as sensor data), such as location information of mobile device(s), smartphone data (such as phone state, charging data, date/time, or other information derived from a smartphone), user-activity information (for example: app usage; online activity; searches; voice data such as automatic speech recognition; activity logs; communications data including calls, texts, instant messages, and emails; website posts; other records associated with events; etc.) including user activity that occurs over more than one user device, user history, session logs, application data, contacts data".
inputting the information on the user information access requests to a real-world entity engagement machine learning model to output a second engagement score indicating a second likelihood the user further interacts with the real-world entity, wherein the real-world entity engagement machine learning model is trained to output a predicted engagement score indicating a likelihood of a further interaction with the real-world entity from input comprising user information requests with respect to the real-world entity; Applicant asserts that there is no teaching or suggestion of outputting a second engagement score indicating user interest in a real-world entity based on information access requests that is separate from a first engagement score. We can understand the conversion rate of Zhang to be the second engagement score, with the VR score of Serhad above serving as the first engagement score. In [0002] of Zhang, "A conversion rate is the percentage or proportion of visitors to a website or application that complete some predefined action (e.g., the download of a software instance within the message)". In [0033], "In certain embodiments, a message effectiveness prediction, such as a conversion rate prediction, can then be made for the message. For example, based on historical messages and their associated conversion rates, certain words may surpass a popularity threshold or otherwise be associated with certain conversion rates. Accordingly, an incoming message may use one or more various message elements that have historically been associated with particular conversion rates. Consequently, a predicted conversion rate can be generated based on patterns and associations of the historical messages and conversion rates". In [0063], "The model loading component 416 (which may correspond to the model loading component 216 of FIG. 2) loads the model from the model storage 425 (e.g., persistent storage). 
In some embodiments, the storage 425 represents the same storage 325 as indicated in the training phase system 300 of FIG. 3.". In [0050], "For example, in a deployed machine learning model environment, an incoming message can be vectorized and processed by the conversion prediction component 208 by identifying each message element in the message".

determining a correlation between the first engagement score and the second engagement score to confirm an extent to which the first engagement score and the second engagement score, In [0022], "“Message effectiveness predictions”, “conversion predictions”, or associated predictions described herein corresponds to predicting how effective a particular message will be at: conveying its intended message, reaching a particular audience, prompting individuals to perform a predetermined action, or any suitable prediction. For example, predicting message effectiveness can be or include predicting a conversion rate. As described in more detail herein, predicting conversion rates for an input message can be based on analyzing historical input messages and their associated conversion rates. This may include using one or more machine learning models to identify historical patterns and associations for making predictions".

indicate a similar level of engagement with the real-world entity, In [0037], “Some embodiments of the present disclosure use models, such as random forest regression models as the machine learning technique for model training. This technique benefits from the capability of incorporating Ensembled learning. Ensembled learning helps improve machine learning results by combining several models. That is, various meta-algorithms combine multiple machine learning techniques into one predictive model in order to decrease variance or bagging, bias or boosting, and/or improve predictions or stacking.
In this way, there can be better prediction performance using a relatively small training data set compared to existing technologies”. It would be apparent to one of ordinary skill in the art that the integration of various metrics and their comparison to arrive at a consensus would fall under the umbrella of ensemble learning. Zhang does not teach: and modifying a presentation of the virtual object in the extended-reality environment by controlling an extended reality generator that renders in an extended reality display device the modified presentation of the virtual object to superimpose within a real-world environment in which the user is located However, Serhad teaches: and modifying a presentation of the virtual object in the extended-reality environment by controlling an extended reality generator that renders in an extended reality display device the modified presentation of the virtual object to superimpose within a real-world environment in which the user is located In [0056], “In one embodiment, the overlayed virtual object may be enhanced based on interactive and dynamic content received by the control circuitry from the user, group of users, or collective viewers. As referred to herein, enhance, enhancing, graphically emphasizing, are used interchangeably and mean the same…Examples of such enhancement include animating the virtual object, moving or orienting the virtual object in accordance with the movement of the user associated with the viewing device, changing the size of the virtual object, changing the depth perception or displaying the virtual object either in a 2D or a 3D manner, changing the color of the virtual object, providing links to a website embedded in the virtual object, and other enhancements”. 
Zhang does teach: in response to determining lack of correlation between the first engagement score and the second engagement score. We consider this to be disclosed under the umbrella of the mechanics of random forest learning, in [0033], “Random forest regression models require less training data because these models use ensemble learning and because of the iterative voting nature of random forest models that can use the same training data for different decision tree tests, which can lead to different decision tree leaf node decisions”.

Serhad discloses a system for analyzing the effectiveness of advertisements in the context of virtual reality. Zhang discloses a system meant to predict the effectiveness of a message. Each reference discloses means for analyzing and adaptively optimizing messages so as to maximize effectiveness. Extending the analytical methods of Zhang to Serhad is applicable as both pertain to the task of optimizing the effectiveness of targeted messages, including advertising. It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the fact that the claim is plainly directed to the predictable result of combining known items in the prior art, with the expected benefit that adopting those analytical methods would enable users to expand the granularity of their analysis with respect to understanding the effectiveness of an advertisement.

Serhad teaches: enhancing appearance of the virtual object in the extended-reality environment to increase engagement with the virtual object in the extended-reality environment… modifying the virtual object. In [0177] of Serhad, “At block 1355, in one embodiment, the view of the virtual object may be enhanced.
The control circuitry may change the view of the virtual object from an isometric view to a side view, rotate it a certain angle such as 90 or 180 degrees, provide a different perspective view, or constantly or periodically keep orienting the object at different angles in an attempt to draw the user’s attention to the virtual object”. Motivation to integrate these enhancements would be clear to one of ordinary skill in the art as enabling a user to receive a more detailed analysis of user engagement in an extended-reality environment.

Kumar teaches: wherein in response to the correlation indicating the first engagement score is low relative to the second engagement score…and wherein in response to the correlation indicating the first engagement score is high relative to the second engagement score. Here, the second engagement score acts as a threshold against which the effectiveness score, i.e., the first engagement score, is compared. In [0024], “The system compares the effectiveness score to one or more threshold values to determine an action to perform associated with the customer experience content. If the effectiveness score is below a low threshold score, the system purges the content from a database of sets of customer experience content. The content is no longer available to be used as marketing content for potential customers. If the effectiveness score is between a low threshold and a high threshold, the system performs an intermediate action. For example, the system may flag the customer experience content for user review. In addition, or in the alternative, the system may temporarily prevent the use of the content for marketing purposes. For example, the system may indicate the content may not be used for marketing purposes for three months, at which time the effectiveness score should be re-calculated.
If the effectiveness score is above a high threshold, the system may keep the customer experience content in the database and available for marketing purposes”.

to target a group of users more likely to be interested in the real-world entity: In [0065], we can adjust our inputs such that analogous content is not recommended to a particular target customer, implicitly encoding the recommendation to provide content to other target users, “According to another example, purging customer experience content based on a low target customer effectiveness score prevents a marketing platform from using a particular set of customer experience content as marketing content for a particular target customer. According to one example, setting a non-use time window, in which customer experience content may not be used by a marketing platform, based on a particular effectiveness score prevents the marketing platform from using the customer experience content for any target customers. According to another example, setting a non-use time window, in which customer experience content may not be used by a marketing platform, based on a particular target customer effectiveness score prevents the marketing platform from using the customer experience content for the particular target customer. The customer experience content may still be available to the marketing platform to provide to different target customers”.

Serhad combined with Zhang discloses a system for analyzing and adaptively optimizing the effectiveness of targeted messages in a virtual reality environment. Kumar discloses a system meant to dynamically assess effectiveness of advertising and marketing campaigns. Each reference discloses means for performing analysis on users to optimize advertising campaigns. Extending the comparison logic as recorded in Kumar is applicable to Serhad combined with Zhang as they address analogous tasks in the context of advertising optimization.
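Kumar's [0024] passage quoted above reduces to a two-threshold comparison. A minimal sketch, with illustrative threshold values (Kumar does not give concrete numbers):

```python
# Kumar-style action selection: compare an effectiveness score against
# low and high thresholds. The threshold values are hypothetical.

def content_action(effectiveness, low=0.3, high=0.7):
    """Map an effectiveness score to the action described in Kumar [0024]."""
    if effectiveness < low:
        return "purge"            # remove content from the database
    if effectiveness < high:
        return "flag_for_review"  # intermediate action, e.g. a non-use window
    return "keep"                 # content remains available for marketing

print(content_action(0.1))  # purge
print(content_action(0.5))  # flag_for_review
print(content_action(0.9))  # keep
```

In the Examiner's mapping, the second engagement score supplies the threshold against which the first (effectiveness) score is compared.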
It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to extend the threshold comparisons of Kumar to the system as taught in Serhad combined with Zhang. Motivation to do so comes from the fact that the claim is plainly directed to the predictable result of combining known items in the prior art, enabling users to further leverage the precomputed engagement scores and perform dynamic optimization logic on their basis.

Claims 11 and 16 are rejected as presenting substantially similar limitations as Claim 1.

Claim 2

As to Claim 2, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above. Zhang teaches: The computer program product of claim 1, wherein the correlation is used to determine an efficacy of the virtual object in promoting interest in the real-world entity. In [0022], "“Message effectiveness predictions”, “conversion predictions”, or associated predictions described herein corresponds to predicting how effective a particular message will be at: conveying its intended message, reaching a particular audience, prompting individuals to perform a predetermined action, or any suitable prediction. For example, predicting message effectiveness can be or include predicting a conversion rate. As described in more detail herein, predicting conversion rates for an input message can be based on analyzing historical input messages and their associated conversion rates. This may include using one or more machine learning models to identify historical patterns and associations for making predictions". Here, we are enriching the data inputted in the prediction with movement parameters and interest; there is a wide bevy of data allowed for in [0054], so adapting movement data in the context of an extended-reality environment would be applicable.
It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the same rationale as outlined above with respect to Claim 1.

Claim 3

As to Claim 3, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above. Serhad teaches: The computer program product of claim 1, wherein the tracking device includes a camera to record the movement parameters of the user moving in the real-world and an extended-reality headset to capture user eye movements during interaction with the virtual object in the extended-reality environment. In [0231], "At block 2910, the control circuitry 3420 tracks eyeball movement of user A and user B of a conference call session. In this embodiment, the conference call session includes two users, user A and user B, and virtual objects 1-n. The control circuitry may access the camera associated with the electronic devices used by user A and user B to track such eyeball movement. For example, if the users are using a laptop or a mobile phone, then the control circuitry may access the inward-facing camera that is looking at the user to track the user’s eyeball movement.". In [0262], "accessing components of electronic devices, such as cameras, gyroscopes, accelerometers, heart rate monitors, enhancing or removing tools, functions, and functionalities displayed on an interface of a participant of a conference call, invoking an AI or ML algorithm to perform an analysis on any of the above mentioned data, accessing user’s consumption history, gauging user’s interest in a virtual object, accessing virtual, mixed, or augmented reality headsets and their displays, animating virtual objects, and all the functionalities discussed associated with the figures mentioned in this application".
In [0132], "In another embodiment, as depicted in block 710, a category evaluated for scoring user interest is the timing of the gaze. In this embodiment, the front-facing or inward-facing camera, or eye tracking cameras embedded in a smart glasses, may detect the user’s gaze directed at the virtual object, and the control circuitry may evaluate the gaze based on the occurrence, timing, or vergence of the gaze, such as during morning, afternoon, evening, certain days or hours of the week, etc. The timing of the gaze may be used to determine the user’s interest level in the virtual object at different times of the day and days of the week".

wherein the determining the first engagement score comprises: inputting the movement parameters and the user eye movements to the movement engagement machine learning model to output the first engagement score. In [0262], "accessing components of electronic devices, such as cameras, gyroscopes, accelerometers, heart rate monitors, enhancing or removing tools, functions, and functionalities displayed on an interface of a participant of a conference call, invoking an AI or ML algorithm to perform an analysis on any of the above mentioned data, accessing user’s consumption history, gauging user’s interest in a virtual object, accessing virtual, mixed, or augmented reality headsets and their displays, animating virtual objects, and all the functionalities discussed associated with the figures mentioned in this application". In [0130], "At block 615, the control circuitry may calculate a score for an identified virtual object. The calculations may be based on the user’s interest. Some of the categories evaluated to gauge users’ interest are depicted in FIG. 7, which is a block diagram of categories used in calculating a user interest score relating to a virtual object".
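Claim 3's limitation concatenates two feature groups, camera-tracked movement parameters and headset-tracked eye movements, into one model input. A minimal sketch, with a simple logistic output standing in for whatever trained model is actually used; all feature names and weights are hypothetical:

```python
import math

def first_engagement_score(movement_params, eye_movements, weights, bias=0.0):
    """Concatenate both feature groups and score them with a logistic unit."""
    features = list(movement_params) + list(eye_movements)
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # likelihood in (0, 1)

movement = [0.4, 1.2]           # e.g. body-movement magnitude, speed
eye = [3.0, 0.8]                # e.g. seconds of gaze, fixation ratio
weights = [0.5, 0.2, 0.3, 0.6]  # hypothetical trained weights

score = first_engagement_score(movement, eye, weights)
```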
It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the same rationale as outlined above with respect to Claim 1.

Claims 12 and 17 are rejected as presenting substantially similar limitations as Claim 3.

Claim 7

As to Claim 7, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above. Zhang teaches: The computer program product of claim 1, wherein the determining the information on the user information access requests with respect to the real-world entity comprises: tracking information on the real-world entity the user accessed at computer network locations external to the extended-reality environment. In [0054], "By way of example and not limitation, data included in storage 225, as well as any user data, may generally be referred to throughout as data. The data within the storage 225 may be structured (e.g., tabular or database data), semi-structured, and/or unstructured (e.g., data within social media feeds, blogs, etc.). Any such data may be sensed or determined from a sensor (referred to herein as sensor data), such as location information of mobile device(s), smartphone data (such as phone state, charging data, date/time, or other information derived from a smartphone), user-activity information (for example: app usage; online activity; searches; voice data such as automatic speech recognition; activity logs; communications data including calls, texts, instant messages, and emails; website posts; other records associated with events; etc.)
including user activity that occurs over more than one user device, user history, session logs, application data, contacts data, record data, notification data, social-network data, news (including popular or trending items on search engines or social networks), home-sensor data, appliance data, global positioning system (GPS) data, vehicle signal data, traffic data, weather data (including forecasts), wearable device data, other user device data (which may include device settings, profiles, network connections such as Wi-Fi network data, or configuration data, data regarding the model number, firmware, or equipment, device pairings, such as where a user has a mobile phone paired with a Bluetooth headset, for example), gyroscope data, accelerometer data, other sensor data that may be sensed or otherwise detected by a sensor (or other detector) component including data derived from a sensor component associated with the user (including location, motion, orientation, position, user-access, user-activity, network-access, user-device-charging, or other data that is capable of being provided by a sensor component), data derived based on other data (for example, location data that can be derived from Wi-Fi, Cellular network, or IP address data), and nearly any other source of data that may be sensed or determined as described herein. In some respects, data or information (e.g., the requested content) may be provided in user signals. A user signal can be a feed of various data from a corresponding data source. For example, a user signal could be from a smartphone, a home-sensor device, a GPS device (e.g., for location coordinates), a vehicle-sensor device, a wearable device, a user device, a gyroscope sensor, an accelerometer sensor, a calendar service, an email account, a credit card account, or other data sources". wherein information on the tracked information is entered into the real-world engagement machine learning model to determine the second engagement score. 
In [0033], "In certain embodiments, a message effectiveness prediction, such as a conversion rate prediction, can then be made for the message. For example, based on historical messages and their associated conversion rates, certain words may surpass a popularity threshold or otherwise be associated with certain conversion rates. Accordingly, an incoming message may use one or more various message elements that have historically been associated with particular conversion rates. Consequently, a predicted conversion rate can be generated based on patterns and associations of the historical messages and conversion rates". In [0063], "The model loading component 416 (which may correspond to the model loading component 216 of FIG. 2) loads the model from the model storage 425 (e.g., persistent storage). In some embodiments, the storage 425 represents the same storage 325 as indicated in the training phase system 300 of FIG. 3.". In [0050], "For example, in a deployed machine learning model environment, an incoming message can be vectorized and processed by the conversion prediction component 208 by identifying each message element in the message". It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the same rationale as outlined above with respect to Claim 1. Claim 8 As to Claim 8, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above. Zhang teaches: The computer program product of claim 1, wherein the virtual object comprises an advertisement of the real-world entity comprising an advertised product or service, and wherein the engagement score indicates a likelihood the user will purchase the advertised product or service in the real-world.
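The conversion rate underlying Zhang's predictions ([0033]) is defined in Zhang [0021] as users performing the predefined action divided by all users presented with the message. A minimal sketch of that ratio; the function name and the zero-visitor guard are ours, not Zhang's:

```python
def conversion_rate(conversions: int, total_presented: int) -> float:
    """Zhang [0021]: quantity of users performing the predefined action
    divided by all users presented with the message."""
    if total_presented == 0:
        return 0.0  # guard: no users were presented with the message
    return conversions / total_presented

# e.g., 30 predefined actions out of 1,200 users shown the message
rate = conversion_rate(30, 1200)  # 0.025, i.e., 2.5%
```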
In [0020], "This computer network environment can include one or more publisher computing devices, one or more network advertiser computing devices, and/or one or more user devices. For example, a message can be a sentence that describes a marketing message such as, “Brand A phone for sale at X dollars.” A message element can be the word “sale” within the message". In [0021], "A “conversion rate” is the proportion or quantity of all website or application users that perform some predefined action. Alternatively, it is the quantity of predefined actions that occur over all application or website visits. Mathematically, the formula can be stated as the quantity of website or application users that perform some predefined action (i.e., the “conversion”) divided by the total quantity of website or application users that have visited the website or application and have been presented with the message. The “predefined action” can correspond to any suitable user selection, user input, user download, user transaction, or any other action that a user performs that an entity (e.g., a network advertiser) defines to monitor. For example, the predefined action can be or include user downloads, user selections of advertisements, queries, user purchases, etc. per computer over multiple user sessions". Claim 9 As to Claim 9, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above. Zhang teaches: The computer program product of claim 1, wherein the operations further comprise: receiving real-world outcomes with respect to users, for which the first engagement score and the second engagement score are generated, interacting with the real-world entity. In [0060], "The model training component 306 (which may corresponding to the model training component 206 of FIG. 2) performs machine learning model training using the vectorized message data set (i.e., the output message vector 340). 
In this way, patterns and associations are determined within the historical message data set, such as number of conversions associated with message elements". generating a first training set In [0061], "For K-fold cross validation, first the K data set is partitioned to K chunks (e.g., groups of messages and other metadata, such as conversion statistics). That is, the K data set is shuffled randomly and then the data set is split into K groups. For each group (i.e., iteratively run through each group): identify the group as a test data set and take the remaining groups as a training data set. In this way, each group will be a test data set at some point. One or more models are fit on the training set and evaluate it on the test set. Then each performance for each K group can be aggregated (e.g., averaged) in some embodiments. This allows models to be chosen. The model that performs well or over a threshold performance on the training data is selected. In some embodiments, the model with the best performance is picked and passed to the next processing step. In some embodiments, Mean Absolute Error (MAE) is used as the metric measurement to model performance to determine “best” performance". Serhad teaches: including movement parameters for multiple users interacting with the virtual object, first engagement scores for the multiple users, In [0054], "Based on user interest, a plurality of virtual objects that can potentially be overlayed may be identified and scored. The score calculation, in one embodiment, may be performed by a scoring engine. The calculation may involve analyzing each virtual object in the library and applying a variety of formulas, weighted averages, means, and other calculations to determine a score. 
For example, in one embodiment, a score may be calculated based on a predetermined value times a component of user interest, e.g., a * seconds of gaze + b * verbal comments + c * heart rate delta + d * seconds of interaction with object + e * timing of the gaze + f * order of gaze with respect to other virtual objects + g * number of repeating gazes + h * magnitude of body movement change. Score may also be calculated based on relevance and context, urgency, and other factors". In [0116], "In yet another embodiment, the input for the type of virtual object to obtain may be based on the recommendation of an artificial intelligence algorithm, as depicted in block 520. Machine learning and artificial intelligence algorithms may be used in generating a model that can be trained to understand user preferences based on user consumption patterns and other user communications and online interactions". Zhang teaches: quantized real-world outcomes, and margins of error between the first engagement scores and the quantized real-world outcomes; and training the movement engagement machine learning model to output first engagement scores that minimize the margins of error. In [0061], "This allows models to be chosen. The model that performs well or over a threshold performance on the training data is selected. In some embodiments, the model with the best performance is picked and passed to the next processing step. In some embodiments, Mean Absolute Error (MAE) is used as the metric measurement to model performance to determine “best” performance". Note that we are applying the validation methodology, as part of the training phase of Zhang, to the first engagement scores of Serhad here. Validation is a well-known technique in the art of training machine learning models and would be applicable.
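The validation methodology borrowed from Zhang [0061] (shuffle, split into K groups, let each group serve once as the test set, aggregate per-fold MAE, pick the best-performing model) can be sketched as follows. This is an illustrative reading only; the identifiers and fold-splitting details are our assumptions, not Zhang's implementation:

```python
import random

def mean_absolute_error(y_true, y_pred):
    # Zhang [0061]: MAE as the metric measurement of model performance
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def k_fold_mae(data, fit, k=5, seed=0):
    """Shuffle (x, y) rows, split into K chunks, let each chunk serve once
    as the test set, fit on the rest, and average the per-fold MAE."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    fold_scores = []
    for i in range(k):
        test = folds[i]
        train = [row for j in range(k) if j != i for row in folds[j]]
        predict = fit(train)  # `fit` returns a predictor callable
        fold_scores.append(mean_absolute_error(
            [y for _, y in test], [predict(x) for x, _ in test]))
    return sum(fold_scores) / k

def select_best(candidate_fits, data):
    # "The model with the best performance is picked" (lowest aggregate MAE)
    return min(candidate_fits, key=lambda fit: k_fold_mae(data, fit))
```

Here each candidate in `candidate_fits` is a training routine; the one whose predictor achieves the lowest cross-validated MAE is passed to the next processing step.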
Given that conversion rates can be represented as integer percentages, we understand the limitation of quantized real-world outcomes to be disclosed in accordance with the broadest reasonable interpretation of the claim. It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the same rationale as outlined above with respect to Claim 1. Claims 15 and 20 are rejected as presenting substantially similar limitations as Claim 9. Claim 10 As to Claim 10, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 9 as discussed above. Zhang teaches: The computer program product of claim 9, wherein the operations further comprise: generating a second training set including the information on the user information access requests for multiple users interacting with the virtual object, second engagement scores for the multiple users, the quantized real-world outcomes, and margins of error between the second engagement scores and the quantized real-world outcomes; and training the real-world entity engagement machine learning model to output second engagement scores that minimize the margins of error. In [0061], "For K-fold cross validation, first the K data set is partitioned to K chunks (e.g., groups of messages and other metadata, such as conversion statistics). That is, the K data set is shuffled randomly and then the data set is split into K groups. For each group (i.e., iteratively run through each group): identify the group as a test data set and take the remaining groups as a training data set. In this way, each group will be a test data set at some point. One or more models are fit on the training set and evaluate it on the test set. Then each performance for each K group can be aggregated (e.g., averaged) in some embodiments. This allows models to be chosen.
The model that performs well or over a threshold performance on the training data is selected. In some embodiments, the model with the best performance is picked and passed to the next processing step. In some embodiments, Mean Absolute Error (MAE) is used as the metric measurement to model performance to determine “best” performance. The model storage component 350 receives the “best” performance model generated in the model training and stores it to the model storage 325 so that this model can be used in a deployed setting on actual data sets". Given that conversion rates can be represented as integer percentages, we understand the limitation of quantized real-world outcomes to be disclosed in accordance with the broadest reasonable interpretation of the claim. It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the analytical methods of Zhang to the system as taught in Serhad. Motivation to do so comes from the same rationale as outlined above with respect to Claim 1. Claims 4, 5, 13 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Serhad (WO2023158797A1) in view of Zhang (US 20210004437 A1) in further view of Kumar (US 20230368262 A1) in further view of Peters (US 20190007781 A1). Claim 4 As to Claim 4, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 3 as discussed above. Serhad teaches: The computer program product of claim 3, wherein the movement parameters include, over an engagement time of the user interaction with the virtual object, right-hand extended-reality controller handset coordinates, left-hand extended-reality controller handset coordinates, In [0137], "Such body movement may be analyzed by the control circuitry, such as by using an AI algorithm, to determine if the movements can be associated with user interest.
For example, an individual viewing a virtual object and reacting by jumping with joy or putting up their hands in excitement may be captured by the gyroscope or motion sensor and associated with the user’s interest in the virtual object". In [0140], "In one embodiment, the calculation may be performed as follows: User Interest = a * seconds of gaze + b * verbal comment + c * heart rate delta + d * seconds of interaction with object + e * timing of the gaze + f * order of gaze with respect to other virtual objects + g * number of repeating gazes + h * magnitude of body movement change. A variety of other formulas, weighted averages, means, and other calculations may also be performed to determine a score for each virtual object based on user interest". Serhad combined with Zhang and Kumar does not teach: headset angles, However, Peters teaches: headset angles, In [0028], “Examples of hardware that can provide head-mounted optics include VR headsets, MR headsets, AR headsets, and various others. Sensing data and/or test data may be used to determine the users' FoV. As one example of sensing data, one or more angles associated with the positioning of a VR headset, which form a “steering angle” of the headset, may indicate the user's FoV. As another example of sensing data, a gaze angle of the user (sensed, for example, via iris detection) may indicate the user's FoV”. Serhad teaches: and body speeds. In [0137], "In yet another embodiment, as depicted in block 740, a category evaluated for scoring based on user interest is the user’s body movements. In this embodiment, a gyroscope, motion sensor, or accelerometer associated with an electronic device is accessed. The control circuitry may access such gyroscope, motion sensor, or accelerometer to determine the user’s body movements before, during, and after engagement with the virtual object. 
Such body movement may be analyzed by the control circuitry, such as by using an AI algorithm, to determine if the movements can be associated with user interest. For example, an individual viewing a virtual object and reacting by jumping with joy or putting up their hands in excitement may be captured by the gyroscope or motion sensor and associated with the user’s interest in the virtual object". Serhad combined with Zhang and Kumar discloses a system for analyzing and adaptively optimizing the effectiveness of targeted messages in a virtual reality environment. Peters discloses a system meant to dynamically update representations based on user angle in a virtual reality environment. Each reference discloses means for performing analysis on user behavior to optimize a facet of user interaction. Extending the angle calculations as described in Peters is applicable to Serhad combined with Zhang and Kumar as they address analogous tasks in the context of augmented reality environments. It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to extend the angle calculations of Peters to the system as taught in Serhad combined with Zhang and Kumar. Motivation to do so comes from the fact that the claim is plainly directed to the predictable result of combining known items in the prior art, the relevance of such angles being supported in [0177] of Serhad, “At block 1355, in one embodiment, the view of the virtual object may be enhanced. The control circuitry may change the view of the virtual object from an isometric view to a side view, rotate it a certain angle such as 90 or 180 degrees, provide a different perspective view, or constantly or periodically keep orienting the object at different angles in an attempt to draw the user’s attention to the virtual object”.
Motivation to integrate these more granular headset calculations would be clear to one of ordinary skill in the art as enabling a user to receive a more detailed analysis of user engagement in an extended-reality environment. Claim 5 As to Claim 5, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 3 as discussed above. Serhad teaches: The computer program product of claim 3, wherein the determining the first engagement score comprises: In [0226], "The AR engine 2680 may act as a central unit that communicates with all the modules 2610-2660. The AR engine may analyze data from any of the modules described and calculate an overall score that may be used to determine which virtual object to overlay and which virtual object to enhance. The AR engine may direct the rendering unit to overlay a virtual object based on the calculated scores". Serhad combined with Zhang does not teach: calculating angular displacements, during engagement time, as a function of the movement parameters comprising handset controller movement coordinates and headset angles, wherein the movement parameters inputted to the movement engagement machine learning model comprise the angular displacements. However, Peters teaches: calculating angular displacements, during engagement time, as a function of the movement parameters comprising handset controller movement coordinates and headset angles, wherein the movement parameters inputted to the movement engagement machine learning model comprise the angular displacements. In [0070], “As such, the processor(s) of the VR device of FIG. 
7B may track a steering angle, using one or more angles associated with the head rotation information… In some examples, the processor(s) of the VR device may use one or more sensors and/or cameras (e.g., the sensors and/or cameras of the headset 200) to capture images that indicate a gaze angle of a user wearing the headset 200…For instance, the processor(s) of the VR device may output portions of the image sequence via the display hardware of the headset 200, at the particular viewing angle that suits the present steering angle of the headset 200.”. In [0102], “The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set)”. Claims 13 and 18 are rejected as presenting substantially similar limitations as Claim 5. It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to apply the angle calculations of Peters to the system of Serhad combined with Zhang and Kumar. Motivation to do so comes from the same rationale as outlined above with respect to Claim 4. Claims 6, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Serhad (WO2023158797A1) in view of Zhang (US 20210004437 A1) in further view of Kumar (US 20230368262 A1) in further view of Spiegel (US 20240249318 A1). Claim 6 As to Claim 6, Serhad combined with Zhang and Kumar teaches all the limitations of Claim 1 as discussed above.
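To illustrate the Claim 5 limitation discussed above, angular displacement over an engagement window can be accumulated from successive headset angle samples, with a companion path length over handset controller coordinates. This is a hedged sketch of one plausible calculation, not a construction drawn from Peters:

```python
import math

def angular_displacement(angles):
    """Accumulate the absolute headset angle change (degrees) across
    successive samples, wrapping each step to the shortest rotation."""
    total = 0.0
    for a, b in zip(angles, angles[1:]):
        step = (b - a + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
        total += abs(step)
    return total

def controller_path_length(coords):
    """Distance travelled by a handset controller over (x, y, z) samples."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))
```

The wrapping step matters for headset yaw: a turn from 350° to 10° is a 20° rotation, not 340°. Both totals could then be fed, alongside the other movement parameters, into an engagement model.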
Spiegel teaches: The computer program product of claim 1, wherein the determining the information on the user information access requests with respect to the real-world entity comprises: determining user questions entered into a virtual agent rendered in the extended-reality environment concerning the real-world entity; In [0033], " In some examples, the chatbot system can be used in conjunction with many organic experiences such as content, search, augmented reality and engagement with mapping applications. This allows the interactive platform as a whole to be more efficient in targeting the right audience and improve the performance of the ads and organic experiences". In [0035], "In some examples, a chatbot system receives a prompt from a user during a first interactive session. The chatbot system generates a response using the prompt and a large language model. The chatbot system communicates the response to the user during the first interactive session. The chatbot system determines a user intent based on the user prompt and response. The chatbot system determines advertising content based on the user intent. The chatbot system communicates the advertising content to the user during a second interactive session". and processing, by a large language model, the user questions to output summaries of key points of the user questions and user intent with respect to the real-world entity, wherein the summaries of the key points and the user intent In [0024], "In some examples, a chatbot system provides user intent detection that improves targeting and optimization capabilities over time by analyzing data on user intent and conversions. This enhances the user experience and improves the relevance and performance of ads. Additionally, the interactive platform uses the extracted user intent to enhance the user experience across other portions of the interactive platforming site, making them more personalized and relevant to the user community. 
In some examples, an interactive platform enhances display advertising by targeting users based on their genuine intent ascertained, in whole or in part, through interaction with a chatbot. By extracting high intent and timely relevant keywords and concepts of conversation with the chatbot, the interactive platform may improve a user intent profile". Zhang teaches: are entered into the real-world entity engagement machine learning model to determine the second engagement score. In [0022], "“Message effectiveness predictions”, “conversion predictions”, or associated predictions described herein corresponds to predicting how effective a particular message will be at: conveying its intended message, reaching a particular audience, prompting individuals to perform a predetermined action, or any suitable prediction. For example, predicting message effectiveness can be or include predicting a conversion rate. As described in more detail herein, predicting conversion rates for an input message can be based on analyzing historical input messages and their associated conversion rates. This may include using one or more machine learning models to identify historical patterns and associations for making predictions". Here, we are enriching the data inputted in the prediction with movement parameters and interest; a wide variety of data is allowed for in [0054], so adapting movement data in the context of an extended-reality environment would be applicable. Serhad combined with Zhang discloses a system for analyzing and adaptively optimizing the effectiveness of targeted messages. Spiegel discloses a system meant to provide enhanced advertising to users on the basis of chatbot queries. Each reference discloses means for analyzing the effectiveness of advertising. Extending the chatbot functionality as described in Spiegel to the system of Serhad combined with Zhang is applicable as both disclose means for analyzing the effectiveness of advertisements.
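Spiegel's extraction of "high intent and timely relevant keywords" ([0024]) can be illustrated with a simple keyword tally; the vocabulary, function name, and tokenization below are invented for illustration and do not appear in Spiegel:

```python
import re
from collections import Counter

# Illustrative high-intent vocabulary; Spiegel [0024] does not enumerate terms.
HIGH_INTENT_TERMS = {"buy", "price", "order", "shipping", "discount", "compare"}

def intent_profile(chat_turns):
    """Tally high-intent keywords across a user's chatbot prompts to build
    a simple user intent profile."""
    counts = Counter()
    for turn in chat_turns:
        for word in re.findall(r"[a-z]+", turn.lower()):
            if word in HIGH_INTENT_TERMS:
                counts[word] += 1
    return counts
```

A real system would presumably use a large language model rather than a fixed vocabulary, but the output of either approach is a per-user intent signal that downstream ad-targeting models can consume.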
It would have been obvious to one having ordinary skill in the art at the effective filing date of the invention to extend the chatbot analysis taught in Spiegel to the system taught in Serhad combined with Zhang and Kumar. Motivation to do so comes from the fact that the claim is plainly directed to the predictable result of combining known items in the prior art, with the expected benefit as outlined in [0024] of Spiegel, “In some examples, a chatbot system provides user intent detection that improves targeting and optimization capabilities over time by analyzing data on user intent and conversions. This enhances the user experience and improves the relevance and performance of ads. Additionally, the interactive platform uses the extracted user intent to enhance the user experience across other portions of the interactive platforming site, making them more personalized and relevant to the user community. In some examples, an interactive platform enhances display advertising by targeting users based on their genuine intent ascertained, in whole or in part, through interaction with a chatbot. By extracting high intent and timely relevant keywords and concepts of conversation with the chatbot, the interactive platform may improve a user intent profile”. Claims 14 and 19 are rejected as presenting substantially similar limitations as Claim 6. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to THEODORE L XIE whose telephone number is (571)272-7102. The examiner can normally be reached M-F 9-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rutao Wu can be reached at 571-272-6045. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /THEODORE XIE/Examiner, Art Unit 3623 /CHARLES GUILIANO/Primary Examiner, Art Unit 3623

Prosecution Timeline

Feb 28, 2024
Application Filed
Jul 09, 2025
Non-Final Rejection — §103
Sep 05, 2025
Applicant Interview (Telephonic)
Sep 05, 2025
Examiner Interview Summary
Sep 29, 2025
Response Filed
Nov 05, 2025
Final Rejection — §103
Jan 08, 2026
Applicant Interview (Telephonic)
Jan 08, 2026
Examiner Interview Summary
Jan 13, 2026
Request for Continued Examination
Feb 13, 2026
Response after Non-Final Action
Mar 20, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591576
DRILLING PERFORMANCE ASSISTED WITH AN ARTIFICIAL INTELLIGENCE ENGINE
2y 5m to grant Granted Mar 31, 2026


Prosecution Projections

3-4
Expected OA Rounds
50%
Grant Probability
99%
With Interview (+100.0%)
1y 7m
Median Time to Grant
High
PTA Risk
Based on 4 resolved cases by this examiner. Grant probability derived from career allow rate.
