Prosecution Insights
Last updated: April 19, 2026
Application No. 18/733,802

LOCATIONS OF MEDIA CONTROLS FOR MEDIA CONTENT AND CAPTIONS FOR MEDIA CONTENT IN THREE-DIMENSIONAL ENVIRONMENTS

Non-Final OA §103
Filed: Jun 04, 2024
Examiner: ZALALEE, SULTANA MARCIA
Art Unit: 2614
Tech Center: 2600 (Communications)
Assignee: Apple Inc.
OA Round: 1 (Non-Final)
Grant Probability: 71% (Favorable)
OA Rounds: 1-2
To Grant: 2y 7m
With Interview: 86%

Examiner Intelligence

Career Allow Rate: 71% (346 granted / 488 resolved), above average, +8.9% vs TC avg
Interview Lift: +15.1% (strong), based on resolved cases with interview
Avg Prosecution: 2y 7m (typical timeline), 30 currently pending
Total Applications: 518 across all art units (career history)

Statute-Specific Performance

§101: 7.8% (-32.2% vs TC avg)
§103: 56.3% (+16.3% vs TC avg)
§102: 11.4% (-28.6% vs TC avg)
§112: 13.8% (-26.2% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 488 resolved cases
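The Tech Center baseline behind each "vs TC avg" figure is not printed on the page, but it can be backed out from the numbers shown. The sketch below assumes the deltas are percentage-point differences (an assumption, since the page does not say so); the dictionary and function names are hypothetical.

```python
# Rates and "vs TC avg" deltas copied from the Statute-Specific Performance figures.
# Assumption: each delta is a percentage-point difference from the TC average.
STATUTE_FIGURES = {
    "§101": (7.8, -32.2),
    "§103": (56.3, 16.3),
    "§102": (11.4, -28.6),
    "§112": (13.8, -26.2),
}


def implied_tc_average(rate: float, delta: float) -> float:
    """Back out the baseline: examiner rate minus the stated delta."""
    return round(rate - delta, 1)


implied = {
    statute: implied_tc_average(rate, delta)
    for statute, (rate, delta) in STATUTE_FIGURES.items()
}

for statute, baseline in implied.items():
    print(f"{statute}: implied TC average {baseline}%")
```

Under that assumption every statute resolves to the same baseline, which would be consistent with the page's note that the Tech Center average is an estimate rather than a per-statute measurement.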

Office Action

§103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “component” in claims 19-35.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 19-35 are rejected under 35 U.S.C. 103 as being unpatentable over Faulkner et al (US 20220091722 A1), and further in view of Kurzhals et al (Kurzhals K, Göbel F, Angerbauer K, Sedlmair M, Raubal M. A view on the viewer: Gaze-adaptive captions for videos. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems 2020 Apr 21 (pp. 1-12)).
RE claim 19, Faulkner teaches A method comprising: at a computer system in communication with a display generation component and one or more input devices (Figs 1-3, [0043]): while a three-dimensional environment is visible via the display generation component, concurrently displaying, via the display generation component: media content, wherein the media content is displayed from a first viewpoint of a user relative to the three-dimensional environment, and a user interface element that includes one or more captions for the media content, wherein the user interface element has a first spatial relationship relative to a spatial reference in the media content in the three-dimensional environment (Figs 7-8, [0107]-[0108], [0117], [0122], [0209]); while concurrently displaying the media content from the first viewpoint of the user relative to the three-dimensional environment and the user interface element having the first spatial relationship relative to the spatial reference in the media content in the three-dimensional environment, detecting, via the one or more input devices, an event corresponding to a change of a viewpoint of the user relative to the three-dimensional environment (Figs 6-8, [0110], [0117], [0152]); and in response to detecting the event corresponding to the change of the viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to a second viewpoint of the user relative to the three-dimensional environment, wherein the second viewpoint of the user is different from the first viewpoint of the user: displaying the user interface element having a second spatial relationship relative to the spatial reference in the media content, different from the first spatial relationship relative to the spatial reference in the media content, based on the change of the viewpoint of the user (Figs 7, 8, [0110], [0117], [0152], [0200]).
Faulkner is silent regarding the limitation “that includes one or more captions for the media content.”
However, Kurzhals teaches (abstract, Figs 1-2, page 2 col 2) that gaze input is used to adapt the position of the captions, helping communicate information/audio in cases of loud environments, hearing impairments, unknown languages, etc.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to include in Faulkner a system and method that includes one or more captions for the media content, as suggested by Kurzhals, as this does not change the overall operation of the system and could be used to provide the captions with a comfortable view, thereby increasing system effectiveness and user experience.
RE claim 20, Faulkner as modified by Kurzhals teaches comprising: while the three-dimensional environment is visible from a respective viewpoint of the user and while displaying the user interface element including the one or more captions for the media content having a respective spatial relationship to the spatial reference in the media content, wherein the user interface element including the one or more captions for the media content is displayed at a first location in the three-dimensional environment, detecting, via the one or more input devices, an input to display a playback control user interface including one or more selectable options for performing one or more actions associated with the media content; and in response to detecting the input to display the playback control user interface including one or more selectable options for performing one or more actions associated with the media content: displaying, via the display generation component, the playback control user interface including one or more selectable options for performing one or more actions associated with the media content wherein the playback control user interface is spatially separated from the media content in the three-dimensional environment; and in accordance with a determination that one or more criteria is met, displaying, via the display generation component, the user interface element including the one or more captions for the media content at a second location, different from the first location, wherein when the user interface element including the one or more captions for the media content is at the second location, a spatial arrangement of the playback control user interface and the user interface element including the one or more captions for the media content is a first spatial arrangement (Faulkner Figs 7E-F, K-L, AD, 9, [0122], [0240], [0242] and [0037], [0138] etc., wherein user interface objects are decoupled from the initial reference point and coupled to the user’s viewpoint/gaze based on specific input criteria).
RE claim 21, Faulkner as modified by Kurzhals teaches wherein the one or more criteria includes a requirement that attention of the user is directed to the playback control user interface in order for the one or more criteria to be met (Faulkner Figs 6, 7E-F, K-L, AD, 9, [0122], [0240], [0242] and [0037], [0138] etc., wherein user interface objects are decoupled from the initial reference point and coupled to the user’s viewpoint/gaze based on specific input criteria. In addition, Kurzhals abstract, Figs 1-2, page 2 col 2 etc., wherein gaze input is used to adapt the position of the captions).
RE claim 22, Faulkner as modified by Kurzhals teaches comprising: while the three-dimensional environment is visible via the display generation component from the respective viewpoint of the user, while displaying, at the second location in the three-dimensional environment, the user interface element including the one or more captions for the media content, and while displaying the playback control user interface, wherein the user interface element and the playback control user interface have the first spatial arrangement, detecting, via the one or more input devices, a second event corresponding to a second change of the viewpoint of the user relative to the three-dimensional environment; and in response to detecting the second event corresponding to the second change of the viewpoint of the user relative to the three-dimensional environment, continuing display, via the display generation component, of the user interface element including the one or more captions for the media content at the second location in the three-dimensional environment and the playback control user interface, with the spatial arrangement of the user interface element including the one or more captions for the media content and the playback control user interface being the first spatial arrangement (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200]. In addition, Kurzhals abstract, Figs 1-2, page 2 col 2 etc., wherein gaze input is used to adapt the position of the captions).
RE claim 23, Faulkner as modified by Kurzhals teaches comprising: after continuing display, via the display generation component, of the user interface element including the one or more captions for the media content at the second location and the playback control user interface with the spatial arrangement of the user interface element including the one or more captions for the media content and the playback control user interface being the first spatial arrangement, and in accordance with a determination that one or more second criteria is met: reducing a visual prominence of the playback control user interface; initiating a process to display, via the display generation component, the user interface element including the one or more captions for the media content at a location in the three-dimensional environment that is different from the second location in the three-dimensional environment, having a third spatial relationship relative to the spatial reference in the media content, based on the second change of viewpoint of the user relative to the three-dimensional environment, including displaying, via the display generation component, the user interface element including the one or more captions for the media content at the location in the three-dimensional environment having the third spatial relationship relative to the spatial reference in the media content (Faulkner [0317], [0347], [0159]). 
RE claim 24, Faulkner as modified by Kurzhals teaches wherein: initiating the process to display, via the display generation component, the user interface element including the one or more captions for the media content at the location in the three-dimensional environment having the third spatial relationship relative to the spatial reference in the media content includes: reducing a visual prominence of the user interface element including the one or more captions for the media content that was displayed at the second location with the spatial arrangement of the user interface element including the one or more captions for the media content and the playback control user interface being the first spatial arrangement; and after reducing the visual prominence of the user interface element including the one or more captions for the media content that was displayed at the second location with the spatial arrangement of the user interface element including the one or more captions for the media content and the playback control user interface being the first spatial arrangement, displaying, via the display generation component, the user interface element including the one or more captions for the media content at the location in the three-dimensional environment having the third spatial relationship relative to the spatial reference in the media content (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200], [0317], [0347], [0159]). 
RE claim 25, Faulkner as modified by Kurzhals teaches wherein displaying, via the display generation component, the user interface element including the one or more captions for the media content at the location in the three-dimensional environment having the third spatial relationship relative to the spatial reference in the media content includes: increasing the visual prominence of the user interface element including the one or more captions for the media content while visually moving the user interface element including the one or more captions for the media content towards the location in the three-dimensional environment having the third spatial relationship relative to the spatial reference in the media content (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200], [0317], [0347], [0159]).
RE claim 26, Faulkner as modified by Kurzhals teaches wherein displaying the user interface element including the one or more captions for the media content having the second spatial relationship relative to the spatial reference in the media content is performed in accordance with a determination that one or more criteria is met, wherein the one or more criteria includes a requirement that a threshold period of time has passed since detecting the event corresponding to the change of the viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user relative to the three-dimensional environment in order for the one or more criteria to be met (Faulkner [0119], [0205], [0214] etc.).
RE claim 27, Faulkner as modified by Kurzhals teaches comprising: in response to detecting the event corresponding to the change of the viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user: displaying, via the display generation component, the user interface element including the one or more captions for the media content at a first respective location in the three-dimensional environment from the second viewpoint of the user, with a respective amount of visual prominence, wherein the one or more captions are displayed as moving towards a second respective location in the three-dimensional environment; and after displaying, via the display generation component, the user interface element including the one or more captions for the media content having the first respective location in the three-dimensional environment from the second viewpoint of the user and with the respective amount of visual prominence, displaying, via the display generation component, the user interface element including the one or more captions for the media content having the second respective location in the three-dimensional environment from the second viewpoint of the user, different from the first respective location in the three-dimensional environment from the second viewpoint of the user, with an amount of visual prominence that is greater than the respective amount of visual prominence, and wherein when the user interface element has the second respective location in the three-dimensional environment from the second viewpoint of the user, the user interface element including the one or more captions for the media content has the second spatial relationship relative to the spatial reference in the media content in the three-dimensional environment (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200], [0317], [0347], [0159]). 
RE claim 28, Faulkner as modified by Kurzhals teaches wherein while displaying, via the display generation component, the user interface element including the one or more captions for the media content at the first respective location: in accordance with a determination that the change in viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user is a first amount of change of viewpoint of the user, the respective amount of visual prominence is a first amount of visual prominence, and in accordance with a determination that the change in viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user is a second amount of change of viewpoint of the user, greater than the first amount of change of viewpoint of the user, the respective amount of visual prominence is a second amount of visual prominence that is less than the first amount of visual prominence (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200], [0317], [0347], [0159] wherein the amount of fading or shrinking corresponds to an amount of change in the current distance between the first control and the first object). 
RE claim 29, Faulkner as modified by Kurzhals teaches comprising: in response to detecting the event corresponding to the change of the viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user: in accordance with a determination that the change in viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user is greater than a threshold change in viewpoint of the user, forgoing display of the user interface element including the one or more captions for the media content via the display generation component at the first respective location (Faulkner Figs 7, 8, [0110], [0117], [0152], [0200], [0262]).
RE claim 30, Faulkner as modified by Kurzhals teaches wherein: while displaying the user interface element including the one or more captions for the media content having the second spatial relationship relative to the spatial reference in the media content in the three-dimensional environment, the user interface element including the one or more captions for the media content is displayed at a first location in three-dimensional environment that, in a view of the three-dimensional environment from the second viewpoint of the user, is offset from a center of the second viewpoint of the user (Faulkner Figs 7 K-L, AD, [0194], [0239], [0274] etc).
RE claim 31, Faulkner as modified by Kurzhals teaches comprising: in response to detecting the event corresponding to the change of the viewpoint of the user relative to the three-dimensional environment from the first viewpoint of the user to the second viewpoint of the user relative to the three-dimensional environment: displaying, via the display generation component, the media content, wherein the media content is displayed from the second viewpoint of the user relative to the three-dimensional environment, wherein a closest element of the media content to the second viewpoint of the user is at a first distance from the second viewpoint of the user, wherein the user interface element including the one or more captions for the media content is displayed at a second distance from the second viewpoint of the user that is less than the first distance from the second viewpoint of the user (Faulkner Figs 7O-T, [0192]-[0193], [0310]).
RE claim 32, Faulkner as modified by Kurzhals teaches comprising: while displaying the media content: in accordance with a determination that attention of the user is directed to a respective user interface element of a first type, different from the user interface element of a second type including the one or more captions for the media content, reducing a visual prominence of the user interface element including the one or more captions for the media content (Faulkner Figs 7S-T, [0276], [0205]-[0209], [0184] etc).
RE claim 33, Faulkner as modified by Kurzhals teaches wherein the media content is immersive media content, the method further comprising: displaying, via the display generation component, second media content that is non-immersive media content at a location in the three-dimensional environment; detecting, via the one or more input devices, an event corresponding to a trigger to display a user interface element that includes one or more captions for the second media content; and in response to detecting the event corresponding to the trigger to display the user interface element that includes the one or more captions for the media content, displaying, via the display generation component, the user interface element that includes the one or more captions for the media content at a location in the three-dimensional environment that is independent of a viewpoint of the user (Faulkner Figs 7AA-AD, [0186], [0345], [0353]).
Claim 34 recites limitations similar in scope to the limitations of claim 19 and is therefore rejected under the same rationale. In addition, Faulkner teaches A computer system that is in communication with a display generation component and one or more input devices, the computer system comprising: one or more processors; memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions (Figs 1-3, [0043]).
Claim 35 recites limitations similar in scope to the limitations of claim 19 and is therefore rejected under the same rationale. In addition, Faulkner teaches A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of a computer system that is in communication with a display generation component and one or more input devices, cause the computer system to perform the method (Figs 1-3, [0043], [0060]).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See attached 892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SULTANA MARCIA ZALALEE whose telephone number is (571)270-1411. The examiner can normally be reached Monday-Friday 8:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Sultana M Zalalee/
Primary Examiner, Art Unit 2614

Prosecution Timeline

Jun 04, 2024: Application Filed
Jan 20, 2026: Non-Final Rejection, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602876
ANNOTATION TOOLS FOR RECONSTRUCTING THREE-DIMENSIONAL ROOF GEOMETRY
2y 5m to grant Granted Apr 14, 2026
Patent 12592035
Fused Bounding Volume Hierarchy for Multiple Levels of Detail
2y 5m to grant Granted Mar 31, 2026
Patent 12586146
PROGRESSIVE MATERIAL CACHING
2y 5m to grant Granted Mar 24, 2026
Patent 12573150
POLYGON CORRECTION METHOD AND APPARATUS, POLYGON GENERATION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
2y 5m to grant Granted Mar 10, 2026
Patent 12561908
TOPOLOGICALLY CONSISTENT MULTI-VIEW FACE INFERENCE USING VOLUMETRIC SAMPLING
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 71%
With Interview: 86% (+15.1%)
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 488 resolved cases by this examiner. Grant probability derived from career allow rate.
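
The headline figures can be reproduced from the counts shown on this page. This is a minimal sketch, not the site's actual method: it assumes the interview lift is added to the career allow rate as percentage points, and the function names are hypothetical.

```python
# Counts and lift copied from the page; function names are hypothetical.
GRANTED = 346
RESOLVED = 488
PENDING = 30
TOTAL_APPLICATIONS = 518
INTERVIEW_LIFT_POINTS = 15.1  # assumption: percentage points, applied additively


def allow_rate_percent(granted: int, resolved: int) -> float:
    """Share of resolved cases that were granted, in percent."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview_percent(base_percent: float, lift_points: float) -> float:
    """Add the lift to the base rate, capped at 100 percent."""
    return min(100.0, base_percent + lift_points)


base = allow_rate_percent(GRANTED, RESOLVED)
boosted = with_interview_percent(base, INTERVIEW_LIFT_POINTS)

print(f"Grant probability: {base:.0f}%")
print(f"With interview: {boosted:.0f}%")
print(f"Counts consistent: {RESOLVED + PENDING == TOTAL_APPLICATIONS}")
```

With these inputs the rounded results match the 71% and 86% shown above, and the resolved and pending counts sum to the 518 total applications.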
