Prosecution Insights
Last updated: April 19, 2026
Application No. 18/383,217

IMAGE AND VIDEO INSTANCE ASSOCIATION FOR AN E-COMMERCE APPLICATION

Final Rejection: §103 and Nonstatutory Double Patenting

Filed: Oct 24, 2023
Examiner: REPSHER III, JOHN T
Art Unit: 2143
Tech Center: 2100 (Computer Architecture & Software)
Assignee: EBAY INC.
OA Round: 4 (Final)

Grant Probability: 58% (Moderate)
Estimated OA Rounds: 5-6
Estimated Time to Grant: 3y 5m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 58% (203 granted / 347 resolved; +3.5% vs TC avg)
Interview Lift: +48.0% in resolved cases with an interview
Avg Prosecution: 3y 5m (typical timeline)
Currently Pending: 18
Total Applications: 365 (across all art units)

Statute-Specific Performance (allow rate after rejection under each statute)

§101: 8.9% (-31.1% vs TC avg)
§103: 49.6% (+9.6% vs TC avg)
§102: 12.7% (-27.3% vs TC avg)
§112: 20.6% (-19.4% vs TC avg)

Tech Center averages are estimates; based on career data from 347 resolved cases.

Office Action

Grounds of rejection: §103 and Nonstatutory Double Patenting
DETAILED ACTION

Remarks

Claims 1-13, 15-16, and 19-23 have been examined and rejected. This Office action is responsive to the amendment filed on 07/31/2025, which has been entered in the above-identified application.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq.
for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1-13, 15-17, 19, and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 7, 12-14, 16, and 18 of US 11829446 B2 in view of Ross et al. (US 20220245820 A1, published 08/04/2022), hereinafter Ross. Although the claims at issue are not identical, they are not patentably distinct from each other.
In the chart below, each pair sets out claim language of the instant application (18/383,217) followed by the corresponding portions of the claims of US 11829446 B2.

[18/383,217, Claims 1, 10, 19] One or more non-transitory computer-readable storage devices comprising instructions stored thereon that, responsive to execution by one or more processors, perform operations comprising: receiving listing information used to create a listing describing an item for sale on an electronic marketplace, the listing comprising one or more images and a video from the listing information that are viewed by accessing the listing on the electronic marketplace
[US 11829446 B2, Claims 1, 16, 18] A method for linking one or more images and a video on an e-commerce platform, the method comprising: receiving listing information including the one or more images and the video, the one or more images comprising image data and the video comprising video data

[18/383,217] determining a blank portion of the video and a blank portion of an image of the one or more images; causing display of a user interface responsive to an accessing of the listing on the electronic marketplace, the user interface comprising:
[US 11829446 B2] causing display of a user interface, the user interface comprising:

[18/383,217] a video display displaying the video of the listing information; an image display displaying an image of the one or more images of the listing information, the image display concurrently with the video display;
[US 11829446 B2] a video display that is displayed concurrently with an image display;

[18/383,217] a video linking user interface element overlaid on the blank portion of the image in the image display, wherein interaction with the video linking user interface element causes playback of a relevant portion of the video in the video display
[US 11829446 B2] a video linking user interface element overlaid on a displayed image at the image display, the displayed image being one of the one or more images, the video linking user interface element operable to cause playback of a relevant portion of the video, the relevant portion being linked with the displayed image based upon the calculated similarity value; and

[18/383,217] an image linking user interface element overlaid on the blank portion of the video in the video display, wherein interaction with the image linking user interface element causes display of a relevant image of the one or more images in the image display
[US 11829446 B2] an image linking user interface element overlaid on the video at the video display, the image linking user interface element operable to cause display of a relevant image at the image display, the relevant image linked with a portion of the playback of the video based on the calculated similarity value, the relevant image being one of the one or more images; and

[18/383,217] causing display of the relevant image of the one or more images in the image display responsive to receiving a user selection of the image linking user interface element during the playback of the video.
[US 11829446 B2] and causing display of the relevant image at the image display responsive to reception of a user selection of the image linking user interface element during the playback of the video

[18/383,217, Claims 2, 11] further comprising causing display of the relevant image of the one or more images in the image display responsive to receiving a user selection of the image linking user interface element during the playback of the video
[US 11829446 B2, Claims 1, 16, 18] causing display of the relevant image at the image display responsive to reception of a user selection of the image linking user interface element during the playback of the video

[18/383,217, Claims 3, 12, 20] further comprising causing display of the relevant portion of the video in the video display responsive to receiving a user selection of the video linking user interface element overlaid on the blank portion of the image that is displayed in the image display
[US 11829446 B2, Claims 1, 16, 18] a video linking user interface element overlaid on a displayed image at the image display, the displayed image being one of the one or more images, the video linking user interface element operable to cause playback of a relevant portion of the video

[18/383,217, Claims 4, 13] wherein video data that includes the relevant portion of the video is linked to image data that includes the relevant image
[US 11829446 B2, Claims 1, 16, 18] linking the video data and the image data based upon the calculated similarity value; the relevant image linked with a portion of the playback of the video based on the calculated similarity value

[18/383,217, Claims 5, 17] wherein the relevant portion of the video is linked to the relevant image
[US 11829446 B2, Claims 1, 16, 18] linking the video data and the image data based upon the calculated similarity value; the relevant image linked with a portion of the playback of the video based on the calculated similarity value

[18/383,217, Claims 6, 15] wherein the video data and the image data are linked based on a similarity value
[US 11829446 B2, Claims 1, 16, 18] linking the video data and the image data based upon the calculated similarity value;

[18/383,217, Claims 7, 16] wherein the similarity value is calculated based upon a distance between one or more image descriptors of the image data in relation to one or more video descriptors of the video data
[US 11829446 B2, Claims 7, 12] wherein the similarity value is calculated based upon a distance between the one or more image descriptors and the one or more video descriptors

[18/383,217, Claim 8] wherein the linking of the video data and the image data is based upon the similarity value being greater than or equal to a threshold value
[US 11829446 B2, Claim 13] wherein the linking of the video data and the image data is based upon the similarity value being greater than or equal to a threshold value

[18/383,217, Claim 9] wherein the threshold value is based upon a method of processing the image data and the video data
[US 11829446 B2, Claim 14] wherein the threshold value is based upon a method of processing the image data and the video data

In the same field of endeavor, Ross teaches: determining a blank portion of the video and a blank portion of an image of the one or more images; overlaid on the
blank portion of the image; overlaid on the blank portion of the video (Ross Figs. 1-7; [0002], [0015], [0016], [0032], [0044]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated determining a blank portion of the video and a blank portion of an image of the one or more images; overlaid on the blank portion of the image; overlaid on the blank portion of the video as suggested in Ross. Doing so would be desirable because many images and video frames include areas where text and other content may be inserted without obscuring important part or parts of the image or the video frame (see Ross [0002]). Various systems are available today that enable a curator to mark copy space on an image. That process is usually time consuming and inefficient when hundreds of thousands of images or video frames must be marked. Therefore, it is desirable to automatically and accurately identify copy space on images and video frames (see Ross [0003]). The placement of text over an image is an important part of producing high-quality visual designs. Application areas for copy space detection include generation of email banners, homepages, call-to-actions, etc. These are mostly performed manually, making the task of graphic asset development time-consuming, requiring designers to curate and manipulate media and vector graphics, build up layer stacks, and finally place and format text, all while balancing style, brand consistency, and tone in the design. Automating this work by selecting appropriate position, orientation, and style for textual elements requires understanding the contents of the image over which the text must be placed.
Furthermore, establishing parameters of copy spaces where media content such as text may be rendered over images is distinct from foreground-background separation approaches in image processing because the resulting separation maps may not account for portions of the image that may be potentially overlaid with media content without disrupting significant visual elements in the image (see Ross [0014]). Additionally, the system of Ross enables any media content to be overlaid (see Ross [0016]) on an image or video frame (see Ross [0015]).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 10-13, 19, 20, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. (US 20080163283 A1, published 07/03/2008), hereinafter Tan, in view of Ross et al. (US 20220245820 A1, published 08/04/2022), hereinafter Ross.
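As context for the descriptor-distance limitations recited in instant claims 6-9, the claimed linking concept can be sketched in a few lines of code. This is an illustrative sketch only, not drawn from the prosecution record: the Euclidean distance, the 1/(1+d) similarity mapping, and all function and variable names are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    # Distance between an image descriptor and a video-frame descriptor
    # (numeric feature vectors of equal length).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def link_image_to_video(image_desc, frame_descs, threshold):
    """Link an image to the most similar video frame, per the claimed scheme.

    A similarity value is derived from descriptor distance (smaller distance
    gives higher similarity), and a link is created only when the similarity
    is greater than or equal to the threshold value.
    """
    best_idx, best_sim = None, 0.0
    for idx, frame_desc in enumerate(frame_descs):
        sim = 1.0 / (1.0 + euclidean(image_desc, frame_desc))
        if sim >= threshold and sim > best_sim:
            best_idx, best_sim = idx, sim
    return best_idx, best_sim  # (None, 0.0) when no frame meets the threshold
```

The threshold comparison mirrors the "greater than or equal to a threshold value" language of instant claim 8; per claim 9, the threshold itself could vary with how the image and video data were processed.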
Regarding claim 19, Tan teaches the claim comprising: One or more non-transitory computer-readable storage devices comprising instructions stored thereon that, responsive to execution by one or more processors, perform operations comprising (Tan Figs. 1-14; [0027], Any suitable client may connect to server 102; a suitable client is one capable of running a video player application to play the requested video content and produce video output on a display screen. For example, client devices 106, 108, 110 are equipped with internal processors capable of running a video player application for requested video content to produce video output for viewing on display screens 116, 118, and 120, respectively): receiving listing information used to create a listing describing an item for sale on an electronic marketplace, the listing comprising one or more images and a video from the listing information that are viewed by accessing the listing on the electronic marketplace (Tan Figs. 1-14; [0011], a player for the video content may include or be integrated with contextual information concerning subjects highlighted in the video. As the video is playing, the active subject in the index bar and the second window may update in synchronization with cues embedded in the video. 
Highlighted subjects may include commercial products placed in the principal video data during a production process, and contextual information may include advertising for the commercial products and hyperlinks to further information or to a site configured for selling the highlighted product; [0028], Video content may be provided in association with links to third-party or backend processes, for example, a third-party site providing further information about a product appearing in the video, or a backend process for processing an order for a product appearing in the video; [0029], video content may be produced using a video production process 204 and stored as known in the art for distribution from a video content server 206; [0030], Secondary video content or "highlights" 210 may be defined in an editing or administrative process based on defined targets in the video clip 208. Defined targets may include images of commercial products present in the video clip, or any other image appearing in the video clip for which it is desired to present advertising or other contextual information; [0031], Contextual information 224, such as advertising, factoids, or menus, may be imported or defined and linked to highlights 210 or other features or events in the video, such as by using cue points embedded in the video file. Contextual information may include, for example, graphics files or images, text, HTML or XML documents. Contextual information may relate to objects in the video that are highlighted using highlights 210; [0035], Selected third-party e-commerce sites 220 partnering with the video content server for order fulfillment may be connected seamlessly to the video content server website 206; [0036], The player application may thereby be configured to implement various functions, such as, for example ([0037-0043]): Allowing users to store products in a "shopping cart" without going to a webpage interface first; [0047], FIG. 
3 shows an exemplary screenshot 300 including a video window 302 for displaying the principal video and a product window for displaying advertising or other contextual information cued to products, persons, or objects appearing in the principal video; [0048], The product bar 318 on the right of the video window 302 shows a thumbnail image of all the featured products in the video; [0049], The product window 304 includes an image 328 of the basketball appearing in the video, and text 330 describing the product; The price of the product may be displayed along with links for purchasing the product 332; [0068], Method 1400 comprises preparing 1402 digital video content. Digital video content may comprise first and second video objects that are not encoded together. Preparation may include configuring separately-encoded files or data for display together in overlapping layers) causing display of a user interface responsive to an accessing of the listing on the electronic marketplace, the user interface comprising: a video display displaying the video of the listing information; an image display displaying the image of the one or more images of the listing information, the image display displayed concurrently with the video display (Tan Figs. 1-14; [0027], a suitable client is one capable of running a video player application to play the requested video content and produce video output on a display screen; [0028], System 100 may further include a backend process server 122 for handling requests from remote clients originating from video content links; [0047], FIG. 3 shows an exemplary screenshot 300 including a video window 302 for displaying the principal video and a product window for displaying advertising or other contextual information cued to products, persons, or objects appearing in the principal video; [0048], The product bar 318 on the right of the video window 302 shows a thumbnail image of all the featured products in the video. 
The current product, in this case a basketball, is shown in an emphasized thumbnail 320. The emphasized thumbnail may change at each cue point to show the current highlighted product. Each thumbnail image may also act as a control for jumping to a particular point in the video; [0049], the product window 304 includes an image 328 of the basketball appearing in the video, and text 330 describing the product; The price of the product may be displayed along with links for purchasing the product 332; [0070], Method 1400 further comprises serving 1406 the video content to a client device to cause a video output. For example, in response to a client request, video content may be delivered using embedded video within an SWF file formatted for play on a FLASH player, or other video format for play using a client media player; see also [0011], [0028-0031], [0068],) a video linking user interface element overlaid on the blank portion of the image in the image display, wherein interaction with the video linking user interface element causes playback of a relevant portion of the video in the video display (Tan Figs. 1-14; [0050], The product window may also include a link to enable a user to cause the video to jump either forwards or backwards to a cue point associated with the product shown in the product window; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510; the video in the video window 602 does not jump to the cue point associated with the sunglasses; The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses (jump buttons in Figs. 
5, 6 shown to partially overlay a blank portion of the image)); and an image linking user interface element overlaid on the blank portion of the video in the video display, wherein interaction with the image linking user interface element causes display of a relevant image of the one or more images in the image display (Tan Figs. 1-14; [0053], FIGS. 5 and 6 exemplify operation of a product bar (images in the product bar shown to partially overlay a blank portion of the video). FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054]. FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description) causing display of the relevant image of the one or more images in the image display responsive to receiving a user selection of the image linking user interface element during the playback of the video (Tan Figs. 1-14; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 
6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description) However, Tan fails to expressly disclose determining a blank portion of the video and a blank portion of an image of the one or more images; overlaid on the blank portion of the image; overlaid on the blank portion of the video. In the same field of endeavor, Ross teaches: determining a blank portion of the video and a blank portion of an image of the one or more images; overlaid on the blank portion of the image; overlaid on the blank portion of the video (Ross Figs. 1-7; [0002], copy space may be used to insert links and other information into an image or a video frame; [0015], Disclosed herein are system, method and computer readable storage medium for detecting space suitable for overlaying media content onto an image (i.e., copy space, insertion space). The system receives an image which may be an image or a video frame. Generally, an image or a video frame may have space for inserting media content without covering vital portions of the image. Embodiments of the system process the image to determine occupied and unoccupied spaces in the image, and subsequently select regions in the unoccupied spaces to overlay media content on the image; one or more media content items may be selected for insertion onto the selected bounding boxes in the image. 
The system may then cause a display of the image with the selected media content item overlaid onto the image within the selected bounding boxes; [0016], one portion 110 of the image 100 may have multiple visual elements, while another portion 120 of the image 100 may be devoid of visual elements or may be identified as having visually non-essential elements (for e.g., textural background, blank space, clouds, tree canopy, beach sand, etc.). Embodiments described herein are directed to automatically identify insertion spaces, also termed bounding boxes herein, with visually non-essential elements, and where media content items may be inserted without visually impacting any visually essential elements of the image. For example, an insertion space 130 may be identified by an embodiment such that any media content may be overlaid on top of image 100 within the identified insertion space 130; [0032], The image receiving module 510 receives an image for overlaying media content items. The image receiving module 510 may receive the image over a network or may receive instructions for retrieving the image from an image store. The image may be any appropriately formatted image. In some embodiments, the image may be a frame of a video content item that is also received by the image receiving module 510; [0044], The metadata may include the dimensions of the corresponding media content item as well as other information (e.g., name, type (image, video, etc.)),) It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated determining a blank portion of the video and a blank portion of an image of the one or more images; overlaid on the blank portion of the image; overlaid on the blank portion of the video as suggested in Ross into Tan. 
Doing so would be desirable because many images and video frames include areas where text and other content may be inserted without obscuring important part or parts of the image or the video frame (see Ross [0002]). Various systems are available today that enable a curator to mark copy space on an image. That process is usually time consuming and inefficient when hundreds of thousands of images or video frames must be marked. Therefore, it is desirable to automatically and accurately identify copy space on images and video frames (see Ross [0003]). The placement of text over an image is an important part of producing high-quality visual designs. Application areas for copy space detection include generation of email banners, homepages, call-to-actions, etc. These are mostly performed manually, making the task of graphic asset development time-consuming, requiring designers to curate and manipulate media and vector graphics, build up layer stacks, and finally place and format text, all while balancing style, brand consistency, and tone in the design. Automating this work by selecting appropriate position, orientation, and style for textual elements requires understanding the contents of the image over which the text must be placed. Furthermore, establishing parameters of copy spaces where media content such as text may be rendered over images is distinct from foreground-background separation approaches in image processing because the resulting separation maps may not account for portions of the image that may be potentially overlaid with media content without disrupting significant visual elements in the image (see Ross [0014]). Additionally, the system of Ross enables any media content to be overlaid (see Ross [0016]) on an image or video frame (see Ross [0015]).

Regarding claim 1, claim 1 contains substantially similar limitations to those found in claim 19. Consequently, claim 1 is rejected for the same reasons.
Regarding claim 10, claim 10 contains substantially similar limitations to those found in claim 19, the only difference being A computing device comprising: a display device; cause display of a user interface on the display device (Tan Figs. 1-14; [0027], Any suitable client may connect to server 102; a suitable client is one capable of running a video player application to play the requested video content and produce video output on a display screen. For example, client devices 106, 108, 110 are equipped with internal processors capable of running a video player application for requested video content to produce video output for viewing on display screens 116, 118, and 120, respectively). Consequently, claim 10 is rejected for the same reasons. Regarding claim 2, Tan in view of Ross teaches all the limitations of claim 1, further comprising: further comprising causing display of the relevant image of the one or more images in the image display responsive to receiving a user selection of the image linking user interface element during the playback of the video (Tan Figs. 1-14; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description) Regarding claim 11, claim 11 contains substantially similar limitations to those found in claim 2. 
Consequently, claim 11 is rejected for the same reasons. Regarding claim 3, Tan in view of Ross teaches all the limitations of claim 1, further comprising: further comprising causing display of the relevant portion of the video in the video display responsive to receiving a user selection of the video linking user interface element overlaid on the blank portion of the image that is displayed in the image display (Tan Figs. 1-14; [0050], The product window may also include a link to enable a user to cause the video to jump either forwards or backwards to a cue point associated with the product shown in the product window; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510; the video in the video window 602 does not jump to the cue point associated with the sunglasses; The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses) Regarding claims 12 and 20, claims 12 and 20 contain substantially similar limitations to those found in claim 3. Consequently, claims 12 and 20 are rejected for the same reasons. Regarding claim 5, Tan in view of Ross teaches all the limitations of claim 4, further comprising: wherein the relevant portion of the video is linked to the relevant image (Tan Figs. 1-14; [0047-0048], The product bar 318 on the right of the video window 302 shows a thumbnail image of all the featured products in the video. The current product, in this case a basketball, is shown in an emphasized thumbnail 320. The emphasized thumbnail may change at each cue point to show the current highlighted product. 
Each thumbnail image may also act as a control for jumping to a particular point in the video; [0049], the product window 304 includes an image 328 of the basketball appearing in the video, and text 330 describing the product; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; [0053], the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses).

Regarding claim 13, Tan in view of Ross teaches all the limitations of claim 10, further comprising: wherein video data that includes the relevant portion of the video is linked to image data that includes the relevant image (Tan Figs. 1-14; [0047-0048], The product bar 318 on the right of the video window 302 shows a thumbnail image of all the featured products in the video. The current product, in this case a basketball, is shown in an emphasized thumbnail 320. The emphasized thumbnail may change at each cue point to show the current highlighted product. Each thumbnail image may also act as a control for jumping to a particular point in the video; [0049], the product window 304 includes an image 328 of the basketball appearing in the video, and text 330 describing the product; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses).

Regarding claim 4, claim 4 contains substantially similar limitations to those found in claim 13. Consequently, claim 4 is rejected for the same reasons.

Regarding claim 23, Tan in view of Ross teaches all the limitations of claim 1, further comprising: further comprising placing the video linking user interface element in the blank portion of the image and placing the image linking user interface element in the blank portion of the video (Tan Figs. 1-14; [0050], The product window may also include a link to enable a user to cause the video to jump either forwards or backwards to a cue point associated with the product shown in the product window; [0053], FIGS. 5 and 6 exemplify operation of a product bar (images in the product bar placed to partially overlay a blank portion of the video). FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510; the video in the video window 602 does not jump to the cue point associated with the sunglasses; The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses (jump buttons in Figs. 5, 6 placed to partially overlay a blank portion of the image)).

Claims 6-9, 15, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Tan in view of Ross in further view of Zhao et al. (US 20150161147 A1, published 06/11/2015), hereinafter Zhao.

Regarding claim 6, Tan in view of Ross teaches all the limitations of claim 4, further comprising: wherein the video data and the image data are linked (Tan Figs. 1-14; [0047-0048], The product bar 318 on the right of the video window 302 shows a thumbnail image of all the featured products in the video. The current product, in this case a basketball, is shown in an emphasized thumbnail 320. The emphasized thumbnail may change at each cue point to show the current highlighted product. Each thumbnail image may also act as a control for jumping to a particular point in the video; [0049], the product window 304 includes an image 328 of the basketball appearing in the video, and text 330 describing the product; [0053], FIGS. 5 and 6 exemplify operation of a product bar. FIG. 5 shows an exemplary screenshot 500 in which the video window 502 includes a product bar 504 at a time in the video clip when the basketball 506 is cued in the product window 508; the user has moved a cursor over a thumbnail image 510 of a pair of sunglasses, causing an emphasized (e.g., no longer grayed-out) image of the sunglasses to appear on the product bar; [0054], FIG. 6 shows a screenshot 600 of what may happen when the user selects the emphasized image 510, such as by clicking or double-clicking on it. In this example, the video in the video window 602 does not jump to the cue point associated with the sunglasses. The product window 604, however, shows the sunglasses graphics and product description. At this point, a user may select the product jump button 606 to jump to the cue point associated with the sunglasses). However, Tan in view of Ross fails to expressly disclose wherein the video data and the image data are linked based on a similarity value. In the same field of endeavor, Zhao teaches: wherein the video data and the image data are linked based on a similarity value (Zhao Figs. 1-8 [0034], the association data also associates an image with a particular frame from the video that best matches, e.g., is the most visually similar to, the image among one or more frames extracted from the video; [0035], the system uses the association data to identify related pairs of images or related pairs of videos. For example, if the system determines that two images are associated with the same video, the system can store data associating the two images; [0065], The system determines whether the image and the video are related by comparing the strength of relationship to a predetermined threshold; [0066], Various algorithms for calculating the metric for a frame of the video and an image can be used.
In some implementations, the system generates a hash key corresponding to the features of the frame and a hash key corresponding to the features of the image; [0073], The system receives a search query (402). The system receives data identifying images that are responsive to the search query (404). The system presents a search results user interface including a search result referencing an image responsive to the search query, and a link to similar videos (406), i.e., videos similar to the responsive image; [0091], FIG. 5A illustrates an example search user interface 500 displaying image search results (502a, 502b, and 502c) responsive to a query 504; [0092], The image search result 502a also includes a link 506 to related videos. When a user selects the link 506, the user is presented with a second user interface that includes search results for videos that were identified as being related to the image corresponding to the search result 502a; [0093], FIG. 5B illustrates an example second user interface displaying video search results; A user can play any of the videos by selecting a link or a control in the appropriate video search result; [0094], A user can return to the image search results user interface 500 by selecting the return to image results control, e.g., link, 554).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the video data and the image data are linked based on a similarity value as suggested in Zhao into Tan in view of Ross. Doing so would be desirable because internet search engines identify and score responsive image and video search results according to text associated with the images and videos. However, some images and videos have little associated text, making it difficult for a search engine to identify responsive images and videos or to determine an accurate score for the images and videos (see Zhao [0004]).
Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. Images can be associated with similar videos. Images that are related to the same video can be identified as being related to each other. Videos that are related to the same image can be identified as being related to each other. Metadata for an image or video can be augmented with metadata for related images or videos. Search results that include both images and videos can be presented to users, where only videos or only images were initially identified by a search engine (see Zhao [0010]). Additionally, the system of Zhao would improve the system of Tan by automatically identifying the best and most similar images to be displayed to the user (see Zhao [0034]), thereby avoiding user confusion through the display of poorly correlated images and ensuring the user is quickly and easily provided with their desired information.

Regarding claim 15, claim 15 contains substantially similar limitations to those found in claim 6. Consequently, claim 15 is rejected for the same reasons.

Regarding claim 7, Tan in view of Ross in further view of Zhao teaches all the limitations of claim 6. Zhao further teaches: wherein the similarity value is calculated based upon a distance between one or more image descriptors of the image data in relation to one or more video descriptors of the video data (Zhao Figs. 1-8 [0034], the association data also associates an image with a particular frame from the video that best matches, e.g., is the most visually similar to, the image among one or more frames extracted from the video; [0035], the system uses the association data to identify related pairs of images or related pairs of videos.
For example, if the system determines that two images are associated with the same video, the system can store data associating the two images; [0065], The system determines whether the image and the video are related by comparing the strength of relationship to a predetermined threshold; [0066], Various algorithms for calculating the metric for a frame of the video and an image can be used. In some implementations, the system generates a hash key corresponding to the features of the frame and a hash key corresponding to the features of the image. The system then calculates the metric by determining the Hamming distance between the two hash keys; [0067-0068], The T-bit hash function can be trained based on an affinity matrix that identifies similarity between each pair of images in a set of training data. The training can be performed using machine learning techniques that modify the individual binary hash functions to increase the Hamming distance between dissimilar objects according to the affinity matrix and decrease the Hamming distance between similar objects according to the affinity matrix; [0069-0070], Other conventional methods for determining nearest neighbors, for example, KD-trees, or permutation grouping can also be used to determine whether an image and a frame are sufficiently similar to be related; [0037], metadata can be geographic location data specifying where an image or video was taken or keyword data specifying keywords describing or otherwise associated with the image or the video; [0077], search results are presented according to geographic metadata associated with the images and videos; [0103], results are displayed at a location on the map corresponding to geographic metadata for the image and video; see also [0073], [0091-0094]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the similarity value is calculated based upon a distance between one or more image descriptors of the image data in relation to one or more video descriptors of the video data as suggested in Zhao into Tan in view of Ross. Doing so would be desirable because internet search engines identify and score responsive image and video search results according to text associated with the images and videos. However, some images and videos have little associated text, making it difficult for a search engine to identify responsive images and videos or to determine an accurate score for the images and videos (see Zhao [0004]). Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. Images can be associated with similar videos. Images that are related to the same video can be identified as being related to each other. Videos that are related to the same image can be identified as being related to each other. Metadata for an image or video can be augmented with metadata for related images or videos. Search results that include both images and videos can be presented to users, where only videos or only images were initially identified by a search engine (see Zhao [0010]). Additionally, the system of Zhao would improve the system of Tan by automatically identifying the best and most similar images to be displayed to the user (see Zhao [0034]), thereby avoiding user confusion through the display of poorly correlated images and ensuring the user is quickly and easily provided with their desired information.

Regarding claim 16, claim 16 contains substantially similar limitations to those found in claim 7. Consequently, claim 16 is rejected for the same reasons.

Regarding claim 8, Tan in view of Ross in further view of Zhao teaches all the limitations of claim 6. Zhao further teaches: wherein the linking of the video data and the image data is based upon the similarity value being greater than or equal to a threshold value (Zhao Figs. 1-8 [0034], the association data also associates an image with a particular frame from the video that best matches, e.g., is the most visually similar to, the image among one or more frames extracted from the video; [0035], the system uses the association data to identify related pairs of images or related pairs of videos. For example, if the system determines that two images are associated with the same video, the system can store data associating the two images; [0065], The system determines whether the image and the video are related by comparing the strength of relationship to a predetermined threshold. If the strength of relationship satisfies the threshold, the image and video are determined to be related; [0066], Various algorithms for calculating the metric for a frame of the video and an image can be used. In some implementations, the system generates a hash key corresponding to the features of the frame and a hash key corresponding to the features of the image. The system then calculates the metric by determining the Hamming distance between the two hash keys; [0067-0068], The T-bit hash function can be trained based on an affinity matrix that identifies similarity between each pair of images in a set of training data.
The training can be performed using machine learning techniques that modify the individual binary hash functions to increase the Hamming distance between dissimilar objects according to the affinity matrix and decrease the Hamming distance between similar objects according to the affinity matrix; [0069-0070], Other conventional methods for determining nearest neighbors, for example, KD-trees, or permutation grouping can also be used to determine whether an image and a frame are sufficiently similar to be related; see also [0073], [0091-0094]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the linking of the video data and the image data is based upon the similarity value being greater than or equal to a threshold value as suggested in Zhao into Tan in view of Ross. Doing so would be desirable because internet search engines identify and score responsive image and video search results according to text associated with the images and videos. However, some images and videos have little associated text, making it difficult for a search engine to identify responsive images and videos or to determine an accurate score for the images and videos (see Zhao [0004]). Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. Images can be associated with similar videos. Images that are related to the same video can be identified as being related to each other. Videos that are related to the same image can be identified as being related to each other. Metadata for an image or video can be augmented with metadata for related images or videos. Search results that include both images and videos can be presented to users, where only videos or only images were initially identified by a search engine (see Zhao [0010]).
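For context on the mechanism the examiner relies on in Zhao, the cited linking criterion (a similarity metric computed as the Hamming distance between binary hash keys of frame and image features, compared against a predetermined threshold) can be sketched as follows. This is an illustrative sketch only: the 64-bit key size, the function names, and the 0.9 threshold are assumptions for the example, not details recited in Zhao.

```python
# Illustrative sketch of Zhao's linking criterion ([0065]-[0066]): a video
# frame and an image are linked when the similarity derived from the Hamming
# distance between their feature hash keys satisfies a predetermined
# threshold. Key size, names, and threshold are assumed, not from Zhao.

def hamming_distance(key_a: int, key_b: int) -> int:
    """Number of differing bits between two binary hash keys."""
    return bin(key_a ^ key_b).count("1")

def similarity_value(key_a: int, key_b: int, bits: int = 64) -> float:
    """Map the Hamming distance onto [0, 1]; 1.0 means identical keys."""
    return 1.0 - hamming_distance(key_a, key_b) / bits

def should_link(frame_key: int, image_key: int, threshold: float = 0.9) -> bool:
    """Link the video frame data and image data when the similarity value
    is greater than or equal to the threshold value (cf. claim 8)."""
    return similarity_value(frame_key, image_key) >= threshold

# Keys differing in only 2 of 64 bits yield a similarity of 0.96875,
# so the frame and image would be linked under a 0.9 threshold.
frame_key = 0b1011_0110
image_key = 0b1010_0010
print(hamming_distance(frame_key, image_key))   # 2
print(should_link(frame_key, image_key))        # True
```

In a real system the hash keys would come from a learned T-bit hash function over visual features, as Zhao's [0067]-[0068] describe; the sketch only shows the distance-and-threshold comparison step.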

Prosecution Timeline

Oct 24, 2023 - Application Filed
May 17, 2024 - Non-Final Rejection (§103, §DP)
Aug 20, 2024 - Examiner Interview Summary
Aug 20, 2024 - Applicant Interview (Telephonic)
Aug 21, 2024 - Response Filed
Sep 05, 2024 - Final Rejection (§103, §DP)
Oct 16, 2024 - Applicant Interview (Telephonic)
Oct 16, 2024 - Examiner Interview Summary
Oct 30, 2024 - Request for Continued Examination
Nov 04, 2024 - Response after Non-Final Action
May 30, 2025 - Non-Final Rejection (§103, §DP)
Jul 31, 2025 - Response Filed
Jul 31, 2025 - Applicant Interview (Telephonic)
Jul 31, 2025 - Examiner Interview Summary
Aug 22, 2025 - Final Rejection (§103, §DP) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12574602 - CONTROL DISPLAY METHOD, ELECTRONIC DEVICE, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM (granted Mar 10, 2026; 2y 5m to grant)
Patent 12568166 - TIME-AVERAGED PROXIMITY SENSOR (granted Mar 03, 2026; 2y 5m to grant)
Patent 12554991 - Device and Method for Performing Self-Learning Operations of an Artificial Neural Network (granted Feb 17, 2026; 2y 5m to grant)
Patent 12511029 - USER INTERFACE FOR AN AUTOMATED MASSAGE SYSTEM WITH BODY MODEL AND CONTROL OBJECT (granted Dec 30, 2025; 2y 5m to grant)
Patent 12483602 - COMPUTER IMPLEMENTED METHOD AND APPARATUS FOR MANAGEMENT OF NON-BINARY PRIVILEGES IN A STRUCTURED USER ENVIRONMENT (granted Nov 25, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 58%
Grant Probability with Interview: 99% (+48.0%)
Median Time to Grant: 3y 5m
PTA Risk: High
Based on 347 resolved cases by this examiner. Grant probability derived from career allow rate.
