DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chaturvedi et al. (US 12380484) in view of Thomas et al. (US 20180288490).
Note: All documents that are directly or indirectly incorporated by reference in their entirety in Thomas (see paragraphs 0123, 0125, 0128, 0130, 0150, 0158) are treated as part of the specification of Thomas (see MPEP § 2163.07(b)).
Regarding claim 1, Chaturvedi discloses a method comprising:
receiving, by a computing system, a transcript for a video (receiving, by a computing device, metadata and/or objects for a video – see, including but not limited to, figures 1, 7-10, col. 2, lines 17-24 and 42-50, col. 32, lines 27-36);
receiving, by the computing system, an input indicative of a request to adjust a playback position of the video, wherein the request does not specify a timestamp of the video to which to adjust the playback position (receiving, by the computing device, an input indicative of a request to rewind or fast-forward to adjust a playback position of the video, e.g., to rewind or replay a particular clip, or a command such as “Alexa, find a dress like the one that Marcy was wearing in this scene”, etc.; the request to rewind or replay a particular clip or the command does not specify a timestamp to which to adjust the playback position – see, including but not limited to, col. 21, lines 26-33, col. 21, line 57-col. 22, line 5);
applying, by the computing system, and based on the request to adjust the playback position, a first machine learning model to the transcript and a current timestamp of the video to identify one or more items/clips (applying, by the computing device, and based on the request to rewind, replay, or search for a particular item/content, a first machine learning model to metadata/objects and the current timestamp of the video (e.g., 25:08 in figure 1) to identify one or more items/clips – see, including but not limited to, figures 4, 7, col. 2, lines 45-51, col. 7, lines 20-42);
applying, by the computing system, a second machine learning model to the transcript, the current timestamp, and the one or more items/products/clips to rank, based on user data, the one or more items/products/clips (applying, by the computing device, a second machine learning model to the metadata and the current timestamp of the video being displayed with the objects, and to the one or more similar products/clips with the object, to rank, based on user-specific preferences, the one or more similar products/clips or items with the object – see, including but not limited to, figures 3, 5-6, 8, col. 3, lines 25-41, col. 4, lines 15-54, col. 8, line 30-col. 9, line 30); and
adjusting, by the computing system and based on the ranking of the one or more items/products/clips, the playback position to a noncurrent timestamp from the one or more items/products/clips (adjusting, by the computing device and based on the ranking of the one or more items/products/clips, the playback position to a noncurrent timestamp associated with the replayed clip/product/item/segment of interest of the one or more clips/products/items – see, including but not limited to, col. 6, lines 56-63, col. 7, lines 35-42, col. 21, lines 26-33 and 59-62).
Chaturvedi does not explicitly disclose that the one or more items/products/clips comprise one or more noncurrent timestamps.
Additionally and/or alternatively, Thomas discloses receiving, by a computing system, a transcript for a video (receiving, by a computing system with a user device, information for a video, including entities/actors such as Romeo and Juliet, metadata, etc. – see, including but not limited to, Thomas: figures 6, 9, 14, paragraphs 0010-0011);
receiving, by the computing system, an input indicative of a request to adjust a playback position of the video, wherein the request does not specify a timestamp of the video to which to adjust the playback position (receiving, by the computing device, an input to rewind, fast-forward, or search to adjust the playback position of the video to a particular entity/actor, wherein the request to adjust the playback position to an entity/actor does not specify a specific timestamp of the video to which to adjust the playback position – see, including but not limited to, Thomas: figures 1, 6, 14-17, paragraphs 0002, 0069);
applying, by the computing system, and based on the request to adjust the playback position, a first model to the transcript and a current timestamp of the video (e.g., at 112A, 112B) to identify one or more noncurrent timestamps (applying, by the computing system, and based on the request to adjust the playback position to a particular portion based on an entity/actor, a process to the metadata/features and a current timestamp of the video being displayed/selected to identify one or more timestamps of previous or subsequent segments/scenes that contain the entity/feature – see, including but not limited to, Thomas: figures 1, 3-4, 6, 12, 16-17, paragraphs 0012-0014, 0017, 0069, 0094-0095, 0182);
applying, by the computing system, a second model to the transcript, the current timestamp, and the one or more noncurrent timestamps to rank, based on user data, the one or more noncurrent timestamps (applying, by the computing device, a second process/model to the metadata/description of the video, the current timestamp of the video portion being displayed, and the one or more timestamps for the other segments/scenes (previous/subsequent segments/scenes) to rank, based on user data/preferences/usage history, the one or more timestamps associated with other segments that contain Romeo and/or Juliet in a custom presentation – see, including but not limited to, Thomas: figures 1, 4-6, paragraphs 0014, 0018, 0094-0095, 0178, 0182, 0188); and
adjusting, by the computing system and based on the ranking of the one or more noncurrent timestamps, the playback position to a noncurrent timestamp from the one or more noncurrent timestamps (adjusting, by the computing system and based on the ranking of the timestamps for other segments/scenes that contain the entity/actor in the custom presentation, the playback position to the timestamp of the other segment/scene from the one or more timestamps associated with other segments/portions/scenes that contain the entity/actor – see, including but not limited to, Thomas: figures 1, 6, 16-17, paragraphs 0017, 0182, 0188).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chaturvedi with the teaching of identifying and ranking one or more noncurrent timestamps (i.e., timestamp(s) associated with previous or subsequent segments, not the timestamp of the current video portion/scene, that contain the entity/character/actor), as taught by Thomas, in order to yield the predictable result of allowing a user to easily identify different sub-regions and the associated portions of the media asset related to particular characters, actors, locations, or other customized parameters, or to quickly navigate to different desired portions of the video (see Thomas: paragraphs 0002, 0028, 0093 (last three lines)).
See also Gupta (US 11546670): figures 3, 5-6, 9-10, 13-18, col. 6, lines 25-55, for the teachings of using machine learning models to identify timestamps associated with other segments/scenes that are ranked/filtered for recommendation to the user based on the request to adjust the playback position (e.g., rewinding, fast-forwarding, etc.), user data, and the current timestamp of the segment/scene being rewound/fast-forwarded.
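For clarity of the record only, and not as a ground of rejection, the sequence of limitations mapped above for claim 1 can be summarized in the following minimal sketch. All class, function, and model names (e.g., Candidate, handle_adjust_request, first_model.identify, second_model.rank) are hypothetical illustrations and are not drawn from the claims, Chaturvedi, or Thomas.

```python
# Hypothetical sketch of the claim 1 pipeline; names are illustrative only.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    timestamp: float  # seconds into the video; not the current playback position
    score: float = 0.0


def handle_adjust_request(transcript: str,
                          current_ts: float,
                          user_data: Dict[str, int],
                          first_model,
                          second_model) -> float:
    """Return the noncurrent timestamp to which playback is adjusted."""
    # First model: identify one or more candidate noncurrent timestamps
    # (e.g., the start of the current or a future sentence, dialogue, or
    # scene) from the transcript and the current timestamp.
    candidates: List[Candidate] = first_model.identify(transcript, current_ts)

    # Second model: rank the candidates based on user data (e.g., counts of
    # rewind and fast-forward requests), the transcript, and the current
    # timestamp.
    ranked = second_model.rank(transcript, current_ts, candidates, user_data)

    # Adjust playback to the first-ranked noncurrent timestamp; a second
    # request would fall through to the second-ranked one (cf. claims 3-4).
    return ranked[0].timestamp
```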
Regarding claim 2, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the noncurrent timestamp is a first-ranked noncurrent timestamp (see, including but not limited to, Thomas: figures 1, 4, 6, paragraphs 0014, 0017, 0094, 0115, 0182, 0188).
Regarding claim 3, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the input indicative of the request is a first input, wherein the noncurrent timestamp is a first noncurrent timestamp (see, including but not limited to, Thomas: figures 1, 4, 6, paragraphs 0017, 0094, 0182, 0188), the method further comprising: responsive to receiving a second input indicative of a request to adjust a playback position of the video, adjusting, by the computing system and based on the ranking of the one or more noncurrent timestamps, the playback position to a second noncurrent timestamp from the one or more noncurrent timestamps (in response to selecting another segment containing the entity, or rewinding to start playback from the start of another segment or position – see, including but not limited to, Chaturvedi: col. 21, lines 26-33, col. 21, line 57-col. 22, line 5; Thomas: figures 1, 4, 6, paragraphs 0014, 0017, 0094, 0115, 0182, 0188).
Regarding claim 4, Chaturvedi in view of Thomas discloses the method of claim 3, wherein the second noncurrent timestamp is a second-ranked noncurrent timestamp (a second timestamp/start time of the subsequent/next segment or product of the custom presentation – see, including but not limited to, Thomas: figures 1, 4, 6, paragraphs 0017, 0094, 0115, 0182, 0188; Chaturvedi: figures 5-6, col. 6, lines 56-62, col. 21, lines 55-62).
Regarding claim 5, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the one or more noncurrent timestamps include at least one of a start timestamp for a current sentence, a start timestamp for a current dialogue, a start timestamp for a current scene, a start timestamp for a future sentence, a start timestamp for a future dialogue, and a start timestamp for a future scene (the start time of the current scene or clip that is requested to be replayed, the start time of a dialogue, the start time of a future scene/portion that contains the entity, etc. – see, including but not limited to, Chaturvedi: col. 21, lines 55-61; Thomas: figures 4, 6, paragraphs 0012, 0017, 0089-0090, 0094, 0097, 0182, 0242).
Regarding claim 6, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the user data includes data indicative of one or more of a number of requests for rewinding the video and a number of requests for fast-forwarding the video, and wherein the second machine learning model is trained on the user data (see, including but not limited to, Chaturvedi: col. 6, lines 55-63, col. 7, lines 35-52, col. 10, lines 1-24, col. 11, lines 7-20, col. 18, line 65-col. 19, line 52, col. 21, line 57-col. 22, line 5; Thomas: figures 1, 4, 6, paragraphs 0069, 0113-0114).
Regarding claim 7, Chaturvedi in view of Thomas discloses the method of claim 1, further comprising: applying, by the computing system, a third machine learning model to the transcript to generate an augmented transcript including information indicative of one or more scenes included in the video; and providing, by the computing system, the augmented transcript to the first machine learning model as input (generating a transcript/features for the objects detected in the scenes included in the video, and providing the information of the objects included in the scene to a machine learning model as input – see, including but not limited to, Chaturvedi: figures 1-2, 5, col. 2, lines 35-50, col. 7, lines 20-34, col. 8, lines 8-22).
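As a similar illustrative aid for the claim 7 limitation, and not as a ground of rejection, a minimal sketch of generating an augmented transcript with scene information and supplying it to the first model follows; the function and method names (build_augmented_transcript, third_model.detect_scenes) are hypothetical and are not taken from the claims or the cited references.

```python
# Hypothetical sketch of the claim 7 limitation; names are illustrative only.
def build_augmented_transcript(transcript: str, third_model) -> str:
    """Annotate the raw transcript with scene information for the first model."""
    # Third model: infer scene boundaries/descriptions from the transcript.
    # Assumed to return, e.g., [(start_seconds, "scene description"), ...].
    scenes = third_model.detect_scenes(transcript)

    # Interleave scene markers with the transcript so that downstream models
    # can resolve requests such as "go back to the start of this scene".
    markers = [f"[SCENE start={start:.1f}s] {label}" for start, label in scenes]
    return "\n".join(markers) + "\n" + transcript
```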
Regarding claim 8, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the first machine learning model and the second machine learning model are the same machine learning model (the first machine learning model and the second machine learning model are the same machine learning model of the computing device – see, including but not limited to, Chaturvedi: figures 7-8, 10, col. 7, lines 20-54, col. 8, lines 5-36, col. 17, line 41-col. 18, line 17, col. 18, line 67-col. 19, line 15, col. 21, lines 1-17).
Regarding claim 9, Chaturvedi in view of Thomas discloses the method of claim 1, wherein the first machine learning model is a transcript matching model (a transcript/information matching model that compares the images/objects for similarity or matching – see, including but not limited to, Chaturvedi: figures 1, 3, 5, 8-9, col. 3, lines 4-7 and 25-41, col. 8, lines 30-67).
Regarding claim 10, the limitations of the computing system that correspond to the limitations of the method in claim 1 are analyzed as discussed in the rejection of claim 1. Particularly, Chaturvedi in view of Thomas discloses a computing system comprising:
one or more processors (see, for example, Chaturvedi: figure 10, processor 1002, col. 22, lines 48-67, col. 30, lines 40-45; Thomas: figure 9, processing circuitry 906, paragraphs 0061, 0132); and
one or more storage devices that store instructions (storage devices 1004, 1006, 1016 in figure 10 of Chaturvedi or storage 908 in figure 9 of Thomas), wherein the instructions, when executed by the one or more processors, cause the one or more processors to:
receive a transcript for a video; receive an input indicative of a request to adjust a playback position of the video, wherein the request does not specify a timestamp of the video to which to adjust the playback position;
apply, based on the request to adjust the playback position, a first machine learning model to the transcript and a current timestamp of the video to identify one or more noncurrent timestamps;
apply a second machine learning model to the transcript, the current timestamp, and the one or more noncurrent timestamps to rank, based on user data, the one or more noncurrent timestamps; and
adjust, based on the ranking of the one or more noncurrent timestamps, the playback position to a noncurrent timestamp from the one or more noncurrent timestamps (see similar discussion in the rejection of claim 1 and include, but are not limited to, Chaturvedi: figure 10, col. 22, lines 48-67, col. 30, lines 40-45; Thomas: figure 9, paragraphs 0061, 0132, 0135).
Regarding claims 11-18, the additional limitations of the system that correspond to the additional limitations of the method in claims 2-9 are analyzed as discussed in the rejection of claims 2-9.
Regarding claim 19, the limitations of the non-transitory computer-readable medium that correspond to the limitations of claim 1 or 10 are analyzed as discussed in the rejection of claim 1 or 10. In particular, Chaturvedi in view of Thomas discloses a non-transitory computer-readable storage medium encoded with instructions that, when executed by one or more processors, cause the one or more processors to: receive a transcript for a video; receive an input indicative of a request to adjust a playback position of the video, wherein the request does not specify a timestamp of the video to which to adjust the playback position; apply, based on the request to adjust the playback position, a first machine learning model to the transcript and a current timestamp of the video to identify one or more noncurrent timestamps; apply a second machine learning model to the transcript, the current timestamp, and the one or more noncurrent timestamps to rank, based on user data, the one or more noncurrent timestamps; and adjust, based on the ranking of the one or more noncurrent timestamps, the playback position to a noncurrent timestamp from the one or more noncurrent timestamps (see the similar discussion in the rejection of claim 1 and/or 10 and, including but not limited to, Chaturvedi: figure 10, col. 22, lines 48-67, col. 30, lines 40-45; Thomas: figure 9, paragraphs 0061, 0132).
Regarding claim 20, the additional limitations that correspond to the additional limitations of the method in claim 3 are analyzed as discussed in the rejection of claim 3.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Gupta (US 11546670) discloses rewinding and fast-forwarding of content and using machine learning models to provide a list of recommended portions of media content based on the current timestamp associated with the current portion being displayed and on user data.
Ma (US 20230345059) discloses intelligent video playback using machine learning models (see paragraphs 0018, 0026, 0030, 0035).
Sekar et al. (US 20190289359) discloses an intelligent video interaction method.
Bromand et al. (US 20190342419) discloses predictive caching using different machine learning models (paragraph 0198).
Shah et al. (US 20220295131) discloses systems, methods, and apparatus for trick mode implementation using machine learning models (paragraphs 0038-0039).
Gupta et al. (US 20210266632) discloses systems and methods for processing overlapping content, including adjusting playback positions and using machine learning models (paragraphs 0030-0031, 0047, 0048, 0050).
Foerster et al. (US 20190268632) discloses auto-adjusting playback speed and contextual information using machine learning models (paragraphs 0030, 0059-0065, 0068).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AN SON PHI HUYNH whose telephone number is (571)272-7295. The examiner can normally be reached 9:00 am-6:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, NASSER M. GOODARZI, can be reached at 571-272-4195. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AN SON P HUYNH/Primary Examiner, Art Unit 2426
January 14, 2026