Prosecution Insights
Last updated: May 29, 2026
Application No. 18/766,513

HEARABLE DEVICE TO HEARABLE DEVICE COMMUNICATION USING IMAGE RECOGNITION

Status: Non-Final OA (§102, §103)
Filed: Jul 08, 2024
Examiner: MCCORD, PAUL C
Art Unit: 2692
Tech Center: 2600 (Communications)
Assignee: Sony Group Corporation
OA Round: 1 (Non-Final)
Grant Probability: 69% (Favorable)
OA Rounds: 1-2
Est. Remaining: 1y 6m
With Interview: 95%

Examiner Intelligence

Career Allowance Rate: 69% (398 granted / 575 resolved), above average, +7.2% vs TC avg
Interview Lift: +26.1% (strong; resolved cases with interview, compared with those without)
Avg Prosecution: 3y 5m (typical timeline); 28 currently pending
Total Applications: 613 (career history, across all art units)

Statute-Specific Performance

§101: 0.6% (-39.4% vs TC avg)
§103: 92.4% (+52.4% vs TC avg)
§102: 3.6% (-36.4% vs TC avg)
§112: 1.1% (-38.9% vs TC avg)
The "vs TC avg" figures are measured against a Tech Center average estimate. Based on career data from 575 resolved cases.
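The statute rows above can be cross-checked with a little arithmetic. The sketch below assumes, without confirmation from the page, that each "vs TC avg" delta is a plain percentage-point difference (examiner figure minus Tech Center average estimate). Under that assumption, all four rows imply the same baseline. The names `statute_stats` and `implied_tc_average` are hypothetical and used only for illustration.

```python
# Figures copied from the Statute-Specific Performance rows:
# (examiner percentage, delta vs TC avg in percentage points).
# Assumption: delta = examiner percentage - Tech Center average estimate.
statute_stats = {
    "§101": (0.6, -39.4),
    "§103": (92.4, 52.4),
    "§102": (3.6, -36.4),
    "§112": (1.1, -38.9),
}


def implied_tc_average(examiner_pct: float, delta_pts: float) -> float:
    """Back out the baseline that a given delta would have been measured against."""
    return round(examiner_pct - delta_pts, 1)


implied = {
    statute: implied_tc_average(pct, delta)
    for statute, (pct, delta) in statute_stats.items()
}

for statute, baseline in implied.items():
    print(f"{statute}: implied TC average estimate = {baseline}%")
```

If the assumption holds, every row resolves to a 40.0% baseline, which suggests the page applies a single Tech Center average estimate across all four statutes rather than a per-statute average.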

Office Action

§102 §103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1 and 18 are rejected under 35 U.S.C. 102 as being anticipated by Wexler: 20200296521 hereinafter Wex.
Regarding claim 1 Wex teaches: A computer-implemented method for hearable devices to connect for audio communication (Wex: Abstract: system determines a look direction of a user based on captured images and conditions an audio signal based thereon for communication of audio among a plurality of wireless devices of a personal network), the method performed, comprising: receiving at least one image of an environment of a user, from an image capture device of the user (Wex: Abstract; ¶ 6, etc.; Fig 4D: a wearable camera unit captures images local to a user); identifying a first target person in the environment by, at least in part, analyzing the at least one image (Wex: ¶ 223, etc.; Fig 20B, 21: such as by operating a facial recognition component upon captured images); transmitting a communication to the first target person to connect a first hearable device of the first target person with a user hearable device of the user (Wex: ¶ 224, 250, 587; Figs 5C, 22: such as by using the identification to determine a contact of a recognized individual in the environment as the “facial recognition component 2040 may also access a contact list,” and to thereby initiate contact based thereon such as by conducting operations whereby “the voice audio signals may be processed by the remotely located device and/or transmitted further” and/or operating an affordance on the display such as to initiate a phone call as the “display may also include other functionality associated with the individual, such as contacting the individual…e.g. by phone”); and by utilizing information of the recognized individual establishing a first communication connection between the user hearable device and the first hearable device of the first target person (Wex: ¶ 224, 250, 587; Figs 5C, 22: such as by the recognized individual accepting and conducting the call, and/or forwarding the call to voicemail).
Regarding claim 18: the claim is considered to recite substantially similar subject matter to that of claim 1 supra and is similarly rejected.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wexler: 20200296521 hereinafter Wex further in view of von Liechtenstein: 20160154460 hereinafter von.
Regarding claim 1 Wex teaches: A computer-implemented method for hearable devices to connect for audio communication (Wex: Abstract: system determines a look direction of a user based on captured images and conditions an audio signal based thereon for communication of audio among a plurality of wireless devices of a personal network), the method performed, comprising: receiving at least one image of an environment of a user, from an image capture device of the user (Wex: Abstract; ¶ 6, etc.; Fig 4D: a wearable camera unit captures images local to a user); identifying a first target person in the environment by, at least in part, analyzing the at least one image (Wex: ¶ 223, etc.; Fig 20B, 21: such as by operating a facial recognition component upon captured images); transmitting a communication to the first target person to connect a first hearable device of the first target person with a user hearable device of the user (Wex: ¶ 224, 250, 587; Figs 5C, 22: such as by using the identification to determine a contact of a recognized individual in the environment as the “facial recognition component 2040 may also access a contact list,” and to thereby initiate contact based thereon such as by conducting operations whereby “the voice audio signals may be processed by the remotely located device and/or transmitted further” and/or operating an affordance on the display such as to initiate a phone call as the “display may also include other functionality associated with the individual, such as contacting the individual…e.g. by phone”); and by utilizing information of the recognized individual establishing a first communication connection between the user hearable device and the first hearable device of the first target person (Wex: ¶ 224, 250, 587; Figs 5C, 22: such as by the recognized individual accepting and conducting the call, and/or forwarding the call to voicemail).
Wex strongly suggests but does not explicitly discuss the connection dynamics of a hearable device to a hearable device beyond the suggestion that the user may access a contact list to determine a contact for a recognized individual in the environment and operate the system to initiate a call thereto and as such may not be considered to explicitly teach the recited “transmitting a communication to the first target person to connect a first hearable device of the first target person with a user hearable device of the user.” In a related field of endeavor von teaches a system and method for pairing of devices for two way voice communication (von: Abstract; ¶ 29, Fig 5): comprising receiving at least one image of an environment of a user (von: ¶ 25, 26, Fig 1B, 2: wearer of a head mountable device gazes into a crowd comprising a first user); identifying a first target person in the environment (von: ¶ 25, 26, Fig 1B, 2: such as by determination, exchange, etc. of unique identifiers); in response, at least in part, to identifying the first target person, transmitting a communication to the first target person to connect a first hearable device of the first target person with a user hearable device of the user (von: ¶ 27-29; Fig 4-6: such as by selecting a first user and selecting an option to initiate two way voice communication resulting in a virtual handshake whereby each user accepts the interaction); and establishing a first communication connection between the user hearable device and the first hearable device of the first target person (von: ¶ 30: an interaction such as a two way voice communication is initiated when accepted by both the wearer and first user). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to combine the methods of facial recognition as taught or suggested by Wex with the selectable device to device communication system as taught or suggested by von for at least the purpose of realizing a multi-modal communicator which determines and initiates contact with communication partners based on a plurality of modalities; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 2 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, further comprising prior to establishing the first communication connection: outputting a request to the user to confirm the first communication connection with the first target person; and receiving confirmation of the first communication connection from the user, wherein establishing the first communication connection is in response to receiving the confirmation (von: ¶ 27-30: system begins an interaction by querying each participant to agree to a virtual handshake; an interaction such as a two way voice communication is initiated when accepted by both the wearer and first user). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 3 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person by analyzing the at least one image includes: extracting one or more visual indicators from the at least one image; and matching the one or more visual indicators with stored distinguishing visual characteristics of the first target person in one or more visual identifying techniques including facial recognition, iris recognition, gait recognition, and/or combinations thereof (Wex: ¶ 222, 511; Fig 20: system analyzes acquired image using facial and/or gait recognition). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom. Regarding claim 4 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person by analyzing the at least one image includes: applying an artificial intelligence model trained on stored visual indicator data of known target persons to predict whether a potential target person in the at least one image is the first target person (Wex: 223-225, 374; Fig 20B: facial recognition model trained by mapping faces to identities generates iteratively improved identity prediction based on faces detected in image data; “the determination of whether the at least one individual is a recognized individual may be based on one or more facial features associated with the at least one individual that are detected based on analysis of the at least one of the plurality of images,”). 
The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom. Regarding claim 5 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person further includes: receiving a broadcast identifier associated with the first hearable device; and matching the broadcast identifier with a stored broadcast identifier (von: ¶ 27-30; such as by conducting the beacon intersection method of von wherein the matching occurs when a virtual handshake is initiated and/or accepted). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom. Regarding claim 6 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person further includes: receiving a broadcast identifier associated with the first hearable device; determining that the identified first target person is associated with a different broadcast identifier in a broadcast identifier library; and adding the received broadcast identifier to the broadcast identifier library associated with the first target person (Wex: ¶ 374, 489: system stores identified image data in a database for seeking a match to identify an individual); (von: ¶ 27-30; claim 2: system dynamically updates broadcast identifiers such as by adapting the system to accommodate new identifiers for users). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to adapt the Wex taught database with the emergent broadcast identifiers taught or suggested by von for at least the purpose of maintaining a historical database of regular contacts; one of ordinary skill in the art would have expected only predictable results therefrom. Regarding claim 7 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person further includes: receiving speech from the first target person; and matching one or more distinguishing audio characteristics of the speech with stored voice features associated with the first target person (Wex: ¶ 222, 227; Fig 20b: such as by determining an individual’s identity based on voice recognition). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom. Regarding claim 8 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, wherein identifying the first target person further includes: tracking an eye gaze of the user from the at least one eye image (Wex: ¶ 121, 200: system operates by conducting tracked eye movement; various features may be used to track gaze including user’s eyes); (von: system tracks gaze to mediate interactions among users); and determining the eye gaze is in a direction of the first target person (Wex: ¶ 441: system captures an image based on the gaze of a user to determine a communication partner in the captured image); (von: Abstract; ¶ 12, 23: system tracks the gaze of plural users to determine a gaze interlock event such as using cameras pointing in the same direction as the gaze of the users). 
While Wex in view of von does not explicitly teach the gaze tracking conducted by receiving at least one eye image of the user captured by one or more inward facing capture sensors of the image capture device, Examiner takes official notice that gaze detection by performance of eye tracking was well known in the art before the effective filing date of the instant application and would have comprised an obvious inclusion for conducting gaze tracking and user identification based thereon in the Wex in view of von system by the use of well-established and successful methods; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 9 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, further comprising: determining the first target person is a member of a group that includes a second target person (Wex: ¶ 417: system determines one or more members of a group of individuals by identifying members of the group); (von: ¶ 27; Fig 2, 3: such as operating by a wearer the disclosed method upon a crowd with two or more individuals enhanced by the von system); transmitting a request to the second target person to connect a second hearable device of the second target person with the user hearable device of the user (Wex: ¶ 224, 250, 487, 587; Figs 5C, 22: such as by choosing to communicate such as by contacting a second individual in the group); (von: ¶ 25-30; Fig 4-6: such as by activating an icon and initiating two way voice communication between the wearer and a second member of the group, either before, during and/or after communicating with the first member of the group); and establishing a second communication connection between the user hearable device and the second hearable device of the second target person (Wex: ¶ 224, 250, 587; Figs 5C, 22: such as by the second member accepting and conducting the call, and/or forwarding the call to voicemail); (von: ¶ 25-30; Fig 4-6: such as in the event of mutual acceptance of two way voice communication by the wearer and second member of the group). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 10 Wex in view of von teaches or suggests: The computer-implemented method of claim 9, wherein visual content in the at least one image is insufficient to identify the second target person in the environment of the user (Wex: ¶ 591: system attempts to identify a user by facial recognition based on visual characteristics or based on visual and audible characteristics). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to operate the system whereby failure to reach a determined degree of certainty for a visual recognition modality initiated a subsequent modality such as voice recognition for at least the purpose of creating a system robust to noise, uncertainty, etc.; one of ordinary skill in the art would have expected only predictable results therefrom. The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 11 Wex in view of von teaches or suggests: The computer-implemented method of claim 1, further comprising: detecting a stopping action indicating a stopping point of the audio conversation with the first target person via the first communication connection; requesting user confirmation of the stopping point; and in response to receiving the confirmation of the stopping point, disconnecting the first communication connection with the first hearable device (Wex: ¶ 231, 518: such as based on a user determination to terminate selective condition based on a determination to ignore a particular encounter or a desire to no longer condition or engage with the voice output of a user, such as effected by the user using functionality associated with contacting a user by phone); (von: ¶ 24: such as by a disengagement event initiated by a member of the interaction). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom. 
Regarding claim 12 Wex teaches: A hearable communication system (Wex: Abstract: system determines a look direction of a user based on captured images and conditions an audio signal based thereon for communication of audio among a plurality of wireless devices of a personal network), the system comprising: an image capture device of a user to capture at least one image of an environment of the user (Wex: Abstract; ¶ 6, etc.; Fig 4D: a wearable camera unit captures images local to a user), the image capture device comprising an interface to transmit the at least one image to a user hearable device of the user (Wex: ¶ 118, 128, 197-199, 211, etc.; Fig 5C, 17A: wearable camera comprises wireless transceiver operable to capture images and further functions to wirelessly communicate such as with a computing device, smartphone, etc.; said smartphone considered a hearable device and said system operable to communicate wirelessly among processor and sensors of the wearable camera and additionally between the wearable camera, smartphone and additional devices such as a hearing interface to thereby enable the system to operate as a camera based hearing aid); and the user hearable device comprising: one or more processors; and logic encoded in one or more non-transitory media for execution by the one or more processors (Wex: Fig 5C: such as processor and memory of computing device 120); operable and when executed operable to perform operations comprising: receiving the at least one image (Wex: ¶ 211, etc.; Fig 5C, 17A, 19: such as processor and memory of computing device 120 operable to receive images from the wearable device); identifying a first target person in an environment of the user by, at least in part, analyzing the at least one image (Wex: ¶ 223, etc.; Fig 20B, 21: such as by operating a facial recognition component upon received images); in response, at least in part, to identifying the first target person, establishing a first communication connection between the user hearable device and the first target person by adjusting audio parameters of audio corresponding to the first target person (Wex: ¶ 8, 472, etc.: such as by conditioning of the audio corresponding to the identified individual such as by altering parameters, amplifying, attenuating the signal).
Wex does not explicitly teach the system operable in response, at least in part, to identifying the first target person, transmitting a request to the first target person to connect a first hearable device of the first target person with the user hearable device; and establishing a first communication connection between the user hearable device and the first hearable device of the first target person.
In a related field of endeavor von teaches a system and method for pairing of devices for two way voice communication (von: Abstract; ¶ 29, Fig 5): comprising receiving at least one image of an environment of a user (von: ¶ 25, 26, Fig 1B, 2: wearer of a head mountable device gazes into a crowd comprising a first user); identifying a first target person in the environment (von: ¶ 25, 26, Fig 1B, 2: such as by determination, exchange, etc. of unique identifiers); in response, at least in part, to identifying the first target person, transmitting a communication to the first target person to connect a first hearable device of the first target person with a user hearable device of the user (von: ¶ 27-29; Fig 4-6: such as by selecting a first user and selecting an option to initiate two way voice communication resulting in a virtual handshake whereby each user accepts the interaction); and establishing a first communication connection between the user hearable device and the first hearable device of the first target person (von: ¶ 30: an interaction such as a two way voice communication is initiated when accepted by both the wearer and first user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to combine the methods of facial recognition as taught or suggested by Wex with the selectable device to device communication system as taught or suggested by von for at least the purpose of realizing a multi-modal communicator which determines and initiates contact with communication partners based on a plurality of modalities; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 13 Wex in view of von teaches or suggests: The hearable communication system of claim 12, wherein the image capture device includes a wearable device having one or more outward facing image capture sensors to capture the at least one image of the first target person in the environment (Wex: Abstract; ¶ 222, 511; Fig 1, 3A: system comprises a camera by which to determine visual parameters of adjacent humans by pointing in a direction of a gaze of the wearer); (von: ¶ 12: such as a camera pointing in the direction of a gaze of the user), wherein the operations further comprise: extracting one or more visual indicators from the at least one image; and matching the one or more visual indicators with stored distinguishing visual characteristics of the first target person in one or more visual identifying techniques including facial recognition, iris recognition, gait recognition, and/or combinations thereof (Wex: ¶ 222, 511; Fig 20: system analyzes acquired image using facial and/or gait recognition). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 14: the claim is considered to recite substantially similar subject matter to that of claim 8 supra and is similarly rejected.
Regarding claim 15 Wex in view of von teaches or suggests: The hearable communication system of claim 12, wherein the user hearable device includes at least one microphone (Wex: ¶ 10, 561: system utilizes at least one microphone for conducting voice recognition to identify a user based on parameters of a user speech), and wherein the operations further comprise: receiving speech from the first target person by the at least one microphone (Wex: ¶ 10, 222, 227, 561: such as to determine voice parameters by which to identify the user); and matching one or more distinguishing audio characteristics of the speech with stored voice features associated with the first target person (Wex: ¶ 222, 227; Fig 20b: such as by determining an individual’s identity based on voice recognition). The claim is considered obvious over Wex as modified by von as addressed in the base claim as it would have been obvious to apply the further teaching of Wex and/or von to the modified device of Wex and von; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 16: the claim is considered to recite substantially similar subject matter to that of claim 6 supra and is similarly rejected.
Regarding claims 17, 21: the claims are considered to recite substantially similar subject matter to that of claim 6 supra and are similarly rejected.
Regarding claim 18: the claim is considered to recite substantially similar subject matter to that of claims 1, 12 supra and is similarly rejected.
Regarding claim 19: the claim is considered to recite substantially similar subject matter to that of claims 3, 13 supra and is similarly rejected.
Regarding claim 20: the claim is considered to recite substantially similar subject matter to that of claims 5, 6, 16 supra and is similarly rejected.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 730-630 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CAROLYN EDWARDS can be reached at (571) 270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAUL C MCCORD/
Primary Examiner, Art Unit 2692
/CAROLYN R EDWARDS/
Supervisory Patent Examiner, Art Unit 2692

Prosecution Timeline

Jul 08, 2024: Application Filed
Apr 13, 2026: Non-Final Rejection mailed, §102 and §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12639525
PROMPTING LANGUAGE MODELS WITH WORKFLOW PLANS
3y 5m to grant Granted May 26, 2026
Patent 12632482
TRAINING A LEARNING-TO-RANK MODEL USING A LINEAR DIFFERENCE VECTOR
3y 2m to grant Granted May 19, 2026
Patent 12634652
MEDIA PLAYBACK BASED ON SENSOR DATA
1y 4m to grant Granted May 19, 2026
Patent 12626723
SYSTEM AND METHOD OF DETERMINING AUDITORY CONTEXT INFORMATION
5y 0m to grant Granted May 12, 2026
Patent 12625791
Adjusting a Playback Device
2y 4m to grant Granted May 12, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 69%
With Interview: 95% (+26.1%)
Median Time to Grant: 3y 5m (~1y 6m remaining)
PTA Risk: Low
Based on 575 resolved cases by this examiner. Grant probability derived from career allowance rate.
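The projections above can be reproduced approximately from the figures on this page. The sketch below is illustrative only and rests on assumptions the page does not confirm: that the interview lift is added to the allowance rate in percentage points, and that the remaining time is the 3y 5m median minus the time elapsed between the filing date and the "Last updated" date. All variable names are hypothetical.

```python
from datetime import date

# Figures taken from this page.
granted, resolved = 398, 575
interview_lift_pts = 26.1            # assumption: additive percentage points
median_months_to_grant = 3 * 12 + 5  # "3y 5m"
filed, as_of = date(2024, 7, 8), date(2026, 5, 29)

# Career allowance rate, used by the page as the grant probability.
allowance_rate = 100 * granted / resolved
with_interview = min(allowance_rate + interview_lift_pts, 100.0)

# Assumption: remaining time = median time to grant minus time already elapsed,
# using an average month length of 30.4375 days.
elapsed_months = (as_of - filed).days / 30.4375
remaining_months = round(median_months_to_grant - elapsed_months)

print(f"Allowance rate: {allowance_rate:.1f}%")
print(f"With interview: {with_interview:.1f}%")
print(f"Remaining: ~{remaining_months // 12}y {remaining_months % 12}m")
```

Under these assumptions the results round to 69%, 95%, and roughly 1y 6m, which is consistent with the displayed values. These are career-level averages for the examiner, not a prediction specific to this application.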
