DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The Amendment filed December 11, 2025, has been entered. Claims 1, 3-7, 9, 11-12, 14-18, 20-21, 23, 27, and 29-30 remain pending in the application. Applicant’s amendments to the Claims have overcome every objection and the 35 U.S.C. 101 rejection previously set forth in the Non-Final Office Action mailed September 11, 2025.
Claim Objections
Claim 11 is objected to because of the following informalities: in claim 11, lines 1-3, “wherein further comprising” should read “further comprising”.
Appropriate correction is required.
Applicant is advised that should claim 1 be found allowable, claim 6 will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).
Applicant is advised that should claim 12 be found allowable, claim 20 will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1, 5-7, 9, 11-12, 17, 20-21, 23, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (CN110213492A) in view of Waggoner et al. (US 2017/0177197 A1).
Regarding claim 1, Li discloses an image collection method implemented by a terminal device (paragraph 0020: “an electronic device”) that comprises a processor (paragraph 0020: “a processor”), a memory (paragraph 0020: “a memory”), and a camera (paragraph 0020: “a first camera of a first type”), the image collection method comprising: acquiring, by the terminal device, voice information from a user (paragraph 0050: “the electronic device can obtain one or more voice information of a user through a microphone”); determining, by the terminal device, whether the voice information satisfies a first preset condition (paragraph 0052: “Determine target voice information containing a preset keyword from at least one voice information”); determining, by the terminal device, a position of a target object (paragraph 0088: “The electronic device determines a relative position between the target object and the electronic device”) in response to determining that the voice information satisfies the first preset condition (paragraph 0052: the target object is determined based on the voice information containing the preset keyword); automatically adjusting, by the terminal device, the camera based on the position of the target object, wherein the adjusting the camera comprises adjusting at least one of an angle of view or a focal length of the camera (paragraph 0093: “The electronic device adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”); capturing an image of the target object by the adjusted camera (paragraph 0095: “the electronic device photographs the target object through the first camera”). 
However, Li fails to explicitly disclose the target object comprises an object whose image needs to be displayed on a display screen, and the determining the position of the target object comprises: acquiring a body image of the user, wherein the body image comprises a target limb of the user, wherein the target limb performs an action that identifies the target object, and determining the position of the target object based on a feature point of the target limb; and displaying the captured image of the target object on the display screen. In the related art of displaying objects of interest, Waggoner discloses the target object comprises an object whose image needs to be displayed on a display screen (Waggoner paragraphs 0028, 0040: “An object of interest may be any object or region within a video that a user desires to track...The portion of the subsequent video frames that include a representation of the object of interest can then be displayed 614”), and the determining the position of the target object comprises: acquiring a body image of the user (Waggoner paragraph 0058: “at least one image capture element operable to image a user, people, or other viewable objects in the vicinity of the device”), wherein the body image comprises a target limb of the user, wherein the target limb performs an action that identifies the target object (Waggoner paragraph 0027: “one or more image capture devices on the mobile device may detect a gesture of the user…a user may point at an object in the video content”), and determining the position of the target object based on a feature point of the target limb (Waggoner paragraph 0027: “The image(s) of the captured gesture may be processed to determine the position of the gesture with respect to the video content and determine a corresponding object of interest”); and displaying the captured image of the target object on the display screen (Waggoner FIG. 3(c), paragraph 0040: “The portion of the subsequent video frames that include a representation of the object of interest can then be displayed 614”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li to incorporate the teachings of Waggoner to enable a greater level of detail to be seen, particularly on devices with relatively small and/or low resolution display screens (Waggoner paragraph 0017).
Regarding claim 5, Li, modified by Waggoner, discloses the image collection method according to claim 1, wherein the target limb comprises at least one of a hand or an arm (Waggoner paragraph 0027: “a user may point at an object in the video content”).
Regarding claim 6, Li, modified by Waggoner, discloses the image collection method according to claim 1, further comprising: adjusting the angle of view during capturing (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”) or adjusting the focal length during capturing the image of the target object (this limitation is recited in the alternative and thus only the first alternative is addressed).
Regarding claim 7, Li, modified by Waggoner, discloses the image collection method according to claim 6, wherein the angle of view during capturing is adjusted, so that the target object is located in the middle of the captured image (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”); and/or, the focal length is adjusted to increase a magnification during capturing; and/or, the adjusting a focal length during capturing comprises: adjusting the focal length during capturing according to a size of a display screen, wherein the display screen is used for displaying the captured image (these limitations are recited in the alternative and thus only the first alternative is addressed).
Regarding claim 9, Li, modified by Waggoner, discloses the image collection method according to claim 1, further comprising: acquiring a voice instruction (Li paragraphs 0008, 0051: “obtain at least one voice message”; for example, “Please take a photo for me”); and adjusting, according to the voice instruction (Li paragraphs 0009, 0054: the target object is determined based on the voice message), the angle of view and/or focal length during capturing (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”); and/or, the image collection method further comprising: acquiring voice information again; determining whether the voice information acquired again satisfies a second preset condition; and adjusting the image collection apparatus to a first state if the voice information acquired again satisfies the second preset condition, wherein the first state is a state of the image collection apparatus before the target object is captured to obtain an image of the target object (these limitations are recited in the alternative and thus only the first alternative is addressed).
Regarding claim 11, Li, modified by Waggoner, discloses the image collection method according to claim 1, further comprising: determining whether the voice information comprises preset keywords (Li paragraph 0009: “Determining a target voice message containing a preset keyword from the at least one voice message”); in response to determining that the voice information comprises the keywords, the first preset condition is satisfied; in response to determining that the voice information does not comprise the keywords, the first preset condition is not satisfied (Li paragraph 0009: the target shooting subject is determined only from the voice message that contains the preset keyword).
Regarding claim 12, Li discloses an image collection method implemented by a terminal device (Li paragraph 0020: “an electronic device”) that comprises a processor (Li paragraph 0020: “a processor”), a memory (Li paragraph 0020: “a memory”), and a camera (Li paragraph 0020: “a first camera of a first type”), the image collection method comprising: acquiring an image of a preset object, wherein the image of the preset object comprises a body image of a user or an image of a preset article (Li paragraphs 0010, 0087: “photographing the target object using the first camera” where “the user (the user holding the electronic device) can select which user to determine as the target shooting subject”); determining a position of a target object (Li paragraph 0088: “determines a relative position between the target object and the electronic device”); automatically adjusting the camera based on the position of the target object, wherein the adjusting the camera comprises adjusting at least one of an angle of view or a focal length of the camera (Li paragraph 0093: “The electronic device adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”); capturing an image of the target object by the adjusted camera (Li paragraph 0095: “the electronic device photographs the target object through the first camera”). However, Li fails to explicitly disclose the preset object is associated with an action that identifies the target object; determining a position of a target object according to the image of the preset object, wherein the target object comprises an object whose image needs to be displayed on a display screen; and displaying the captured image of the target object on the display screen. 
In related art, Waggoner discloses the preset object is associated with an action that identifies the target object (Waggoner paragraph 0027: “one or more image capture devices on the mobile device may detect a gesture of the user…a user may point at an object in the video content”); determining a position of a target object according to the image of the preset object (Waggoner paragraph 0027: “The image(s) of the captured gesture may be processed to determine the position of the gesture with respect to the video content and determine a corresponding object of interest”), wherein the target object comprises an object whose image needs to be displayed on a display screen (Waggoner paragraphs 0028, 0040: “An object of interest may be any object or region within a video that a user desires to track...The portion of the subsequent video frames that include a representation of the object of interest can then be displayed 614”); and displaying the captured image of the target object on the display screen (Waggoner FIG. 3(c), paragraph 0040: “The portion of the subsequent video frames that include a representation of the object of interest can then be displayed 614”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li to incorporate the teachings of Waggoner to enable a greater level of detail to be seen, particularly on devices with relatively small and/or low resolution display screens (Waggoner paragraph 0017).
Regarding claim 17, Li, modified by Waggoner, discloses the image collection method according to claim 12, wherein the determining the position of the target object according to the image of the preset object comprises: acquiring a position of a feature point of the preset object in the image of the preset object (Waggoner paragraph 0027: “The image(s) of the captured gesture may be processed to determine the position of the gesture with respect to the video content”); and determining the position of the target object according to the position of the feature point of the preset object (Waggoner paragraph 0027: “and determine a corresponding object of interest”).
Regarding claim 20, Li, modified by Waggoner, discloses the image collection method according to claim 12, further comprising: adjusting the angle of view during capturing (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”) or adjusting the focal length during capturing the image of the target object (this limitation is recited in the alternative and thus only the first alternative is addressed).
Regarding claim 21, Li, modified by Waggoner, discloses the image collection method according to claim 20, wherein the angle of view during capturing is adjusted, so that the target object is located in the middle of the captured image (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”); and/or, the focal length is adjusted to increase a magnification during capturing; and/or, the adjusting the focal length during capturing comprises: adjusting the focal length during capturing according to a size of a display screen, wherein the display screen is used for displaying the captured image (these limitations are recited in the alternative and thus only the first alternative is addressed).
Regarding claim 23, Li, modified by Waggoner, discloses the image collection method according to claim 12, further comprising: acquiring a voice instruction (Li paragraphs 0008, 0051: “obtain at least one voice message”; for example, “Please take a photo for me”), and determining the position of the target object according to the image of the preset object (as claimed in claim 12) or adjusting the angle of view and/or focal length during capturing (Li paragraph 0093: “adjusts a shooting angle of the first camera according to the relative position so that the target object is located in a central area of a shooting area of the first camera”) according to the voice instruction when the voice instruction satisfies a preset condition (Li paragraphs 0009, 0054: the target object is determined based on the voice message containing the preset keyword); and/or, the image collection method further comprising: acquiring voice information; determining whether the acquired voice information satisfies a fourth preset condition; and adjusting the image collection apparatus to a second state if the acquired voice information satisfies the fourth preset condition, wherein the second state is a state of the image collection apparatus before the target object is captured to obtain an image of the target object (these limitations are recited in the alternative and thus only the first alternative is addressed).
Regarding claim 27, this claim is directed to the corresponding device configured to execute the method of claim 1. Therefore, Li, modified by Waggoner, teaches the limitations of claim 27 for the same reasons set forth above with respect to claim 1.
Claim(s) 3, 14-16, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Li and Waggoner in view of Rainisto (US 2017/0060828 A1).
Regarding claim 3, Li, modified by Waggoner, discloses the image collection method according to claim 1. However, Li and Waggoner fail to explicitly disclose determining whether the body image comprises the feature point of a target limb; and in response to determining that the body image does not comprise the feature point of the target limb, acquiring another body image of the user. In the related art of identifying the target of gestures, Rainisto discloses determining whether the body image comprises the feature point of a target limb (Rainisto paragraph 0042: “skeletal maps may be continuously monitored to be detect possible gestures by the participants. Detection may be focused on certain parts of participants' skeletal maps depending upon which body parts the predefined gestures involve”); and in response to determining that the body image does not comprise the feature point of the target limb, acquiring another body image of the user (Rainisto paragraph 0042: “If no match is found, monitoring of skeletal maps continues in step 406”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li and Waggoner to incorporate the teachings of Rainisto to generate annotations indicating tasks in real time, without affecting the natural progress of a meeting (Rainisto paragraph 0032).
Regarding claim 14, Li, modified by Waggoner, discloses the image collection method according to claim 12. However, Li and Waggoner fail to explicitly disclose after acquiring an image of a preset object and before determining the position of a target object according to the image of the preset object, the method further comprises: determining whether the image of the preset object satisfies a third preset condition; and determining the position of the target object according to the image of the preset object in response to determining that the image of the preset object satisfies the third preset condition. In related art, Rainisto discloses after acquiring an image of a preset object and before determining the position of a target object according to the image of the preset object, the method further comprises: determining whether the image of the preset object satisfies a third preset condition (Rainisto paragraph 0042: “determining whether a gesture detected in the skeletal map of a speaker is recognized as being one from a set of predefined gestures”); and determining the position of the target object according to the image of the preset object (Rainisto FIG. 3B; paragraphs 0019-0020: “the annotation associated…may comprise…an identity of the target of the gesture” where “a target of a gesture may be located and identified from the angle and direction of at least one limb from a skeletal map in conjunction with already processed location awareness”) in response to determining that the image of the preset object satisfies the third preset condition (Rainisto paragraph 0042: “If a gesture is recognized, its associated annotation is fetched”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li and Waggoner to incorporate the teachings of Rainisto to generate annotations indicating tasks in real time, without affecting the natural progress of a meeting (Rainisto paragraph 0032).
Regarding claim 15, Li, modified by Waggoner and Rainisto, discloses the image collection method according to claim 14, wherein the image of the preset object comprises the body image of the user (Rainisto FIG. 3A; paragraphs 0003, 0017: “receive a video of the participant” where “the analysis of video may be done on individual frames comprising the video”), and the determining whether the image of the preset object satisfies the third preset condition comprises: determining whether the body image comprises a target limb having a target action (Rainisto paragraph 0042: “skeletal maps may be continuously monitored to be detect possible gestures by the participants. Detection may be focused on certain parts of participants' skeletal maps depending upon which body parts the predefined gestures involve”); if so, the third preset condition is satisfied (Rainisto paragraph 0042: “If a gesture is recognized, its associated annotation is fetched”); or, otherwise, the third preset condition is not satisfied (Rainisto paragraph 0042: “If no match is found, monitoring of skeletal maps continues in step 406”); or, wherein the image of the preset object comprises an image of a preset article, and the determining whether the image of the preset object satisfies the third preset condition comprises: determining whether the preset article in the image of the preset article is held and points to the target object; if so, the third preset condition is satisfied, or, otherwise, the third preset condition is not satisfied (this limitation is recited in the alternative and thus only the first alternative is addressed).
Regarding claim 16, Li, modified by Waggoner and Rainisto, discloses the image collection method according to claim 15, wherein the target limb having a target action comprises: at least one of a finger pointing to the target object (Rainisto FIG. 3A, paragraph 0020: “a gesture may comprise pointing towards a target in a particular fashion”), a hand lifting up the target object, a hand holding the target object, or eyes looking at the target object (Waggoner paragraph 0027: “automatically magnify and track an object upon detecting that the user's gaze has been focused on the object for a determined period of time”); or the target limb comprises at least one of a hand and an arm (Rainisto FIG. 3A, paragraph 0029: skeletal map 601 of participant 60 comprises a pointing hand 602 with arm 605).
Regarding claim 29, this claim is directed to the corresponding device configured to execute the method of claim 3. Therefore, Li, modified by Waggoner and Rainisto, teaches the limitations of claim 29 for the same reasons set forth above with respect to claim 3.
Claim(s) 4 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Li and Waggoner in view of Davis et al. (US 2013/0106892 A1).
Regarding claim 4, Li, modified by Waggoner, discloses the image collection method according to claim 1. However, Li and Waggoner fail to explicitly disclose determining a target range with the feature point of the target limb as a center and a preset distance as a radius, and positioning the target object within the target range to determine the position of the target object; or searching and positioning the target object near the feature point of the target limb. In the related art of identifying the target of gestures, Davis discloses determining a target range with the feature point of the target limb as a center and a preset distance as a radius, and positioning the target object within the target range to determine the position of the target object (this limitation is recited in the alternative and thus only the latter alternative is addressed); or searching and positioning the target object (Davis paragraphs 0114-0115: “the user can effectively use one or more such limbs (fingers or hands, e.g.) to indicate a three-dimensional region 903 containing one or more elements 931,932 of interest to the user…second optical data 642 indicates an approximate position of one or more elements 931, 932 in a 3D region toward which the person gestures”) near the feature point of the target limb (Davis FIG. 10: three-dimensional regions 1003, 1004; paragraph 0117: “the user's vicinity (region 902, e.g.) defines "the environment," in which auditory data 656 and one or more visible elements 931, 932 are both captured”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li and Waggoner to incorporate the teachings of Davis to facilitate a search partly based on a position of a person’s limb (Davis paragraph 0058).
Regarding claim 18, Li, modified by Waggoner, discloses the image collection method according to claim 17. However, Li and Waggoner fail to explicitly disclose determining a target range with the feature point of the preset object as a center and a preset distance as a radius; and positioning the target object within the target range to determine the position of the target object; or searching and positioning the target object near the feature point of the preset object. In related art, Davis discloses determining a target range with the feature point of the preset object as a center and a preset distance as a radius; and positioning the target object within the target range to determine the position of the target object (this limitation is recited in the alternative and thus only the latter alternative is addressed); or searching and positioning the target object (Davis paragraphs 0114-0115: “the user can effectively use one or more such limbs (fingers or hands, e.g.) to indicate a three-dimensional region 903 containing one or more elements 931,932 of interest to the user…second optical data 642 indicates an approximate position of one or more elements 931, 932 in a 3D region toward which the person gestures”) near the feature point of the preset object (Davis FIG. 10: three-dimensional regions 1003, 1004; paragraph 0117: “the user's vicinity (region 902, e.g.) defines "the environment," in which auditory data 656 and one or more visible elements 931, 932 are both captured”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Li and Waggoner to incorporate the teachings of Davis to facilitate a search partly based on a position of a person’s limb (Davis paragraph 0058).
Allowable Subject Matter
Claim 30 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding claim 30, Li, modified by Waggoner, discloses the terminal device according to claim 27. However, the cited prior art, either alone or in combination, fails to disclose, teach, or suggest: determining a target range utilizing the feature point of the target limb as a center and a preset distance as a radius; and positioning the target object within the target range to determine the position of the target object.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTINE ZHAO whose telephone number is (703)756-5986. The examiner can normally be reached Monday - Friday 9:00am - 5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee, can be reached at (571)270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/C.Z./Examiner, Art Unit 2677
/ANDREW W BEE/Supervisory Patent Examiner, Art Unit 2677