Prosecution Insights
Last updated: April 19, 2026
Application No. 19/040,544

SYSTEMS AND METHODS FOR DETECTING AND ANALYZING AUDIO IN A MEDIA PRESENTATION ENVIRONMENT TO DETERMINE WHETHER TO REPLAY A PORTION OF THE MEDIA

Non-Final OA — §102, §103, Double Patenting

Filed: Jan 29, 2025
Examiner: MESA, JOSE M
Art Unit: 2484
Tech Center: 2400 — Computer Networks
Assignee: Adeia Guides Inc.
OA Round: 1 (Non-Final)

Grant Probability: 70% (Favorable)
OA Rounds: 1-2
To Grant: 2y 5m
With Interview: 86%

Examiner Intelligence

Grants 70% — above average
Career allow rate: 70% (401 granted / 575 resolved), +11.7% vs TC avg
Interview lift: strong, +16.4% (allowance rate with vs. without an interview, among resolved cases with an interview)
Typical timeline: 2y 5m avg prosecution; 18 currently pending
Career history: 593 total applications across all art units
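
The headline figures above are internally consistent; a quick arithmetic check (a minimal sketch — the input numbers come from this report, the variable names are ours):

```python
# Sanity-check of the examiner figures reported above.
# All inputs are taken from the dashboard; names are illustrative.

granted, resolved = 401, 575                      # career totals
allow_rate = granted / resolved
print(f"career allow rate: {allow_rate:.1%}")     # ~69.7%, displayed as 70%

tc_delta = 0.117                                  # "+11.7% vs TC avg"
implied_tc_avg = allow_rate - tc_delta
print(f"implied TC 2400 average: {implied_tc_avg:.1%}")  # ~58.0%
```

The implied Tech Center baseline of roughly 58% is derived, not reported; treat it as an estimate.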

Statute-Specific Performance

§101: 5.0% (-35.0% vs TC avg)
§103: 51.5% (+11.5% vs TC avg)
§102: 29.3% (-10.7% vs TC avg)
§112: 5.1% (-34.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 575 resolved cases
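
One detail worth noting: subtracting each reported delta from its statute figure recovers the same 40.0% for every statute, which suggests the chart's black reference line sits at a single Tech Center estimate rather than per-statute averages. A minimal check (figures from the chart above; names are ours):

```python
# Recover the implied Tech Center baseline behind each "vs TC avg" delta.
share = {"101": 5.0, "103": 51.5, "102": 29.3, "112": 5.1}       # examiner, %
delta = {"101": -35.0, "103": 11.5, "102": -10.7, "112": -34.9}  # vs TC avg, %

for s in share:
    print(f"§{s}: implied baseline = {share[s] - delta[s]:.1f}%")
# every statute yields 40.0%
```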

Office Action

Grounds: §102, §103, nonstatutory double patenting
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter

Claims 4, 5, 11, 12, 15 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement.
See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/forms/. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Instant Application

(Claim 2) 2. (New) A method comprising: detecting, during playing of a media asset comprising audio, a sound that is not part of the audio of the media asset; determining the sound is not related to the media asset; determining whether a sound duration of the sound is longer than a time threshold; and based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, causing replay of a portion of the media asset.

(Claim 3) 3. (New) The method of claim 2, further comprising determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replying, wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp.

(Claim 4) 4. (New) The method of claim 3, wherein the time threshold is a first time threshold, the method further comprising: determining whether the sound duration is longer than a second time threshold, wherein the second time threshold is longer than the first time threshold; and based at least in part on determining the sound duration is shorter than the second time threshold, determining the first timestamp to correspond to a time of the detection of the sound.

(Claim 5) 5. (New) The method of claim 3, wherein the time threshold is a first time threshold, the method further comprising: determining whether the sound duration is longer than a second time threshold, wherein the second time threshold is longer than the first time threshold; and based at least in part on determining the sound duration is longer than the second time threshold, determining the first timestamp to correspond to a beginning of a scene of the media asset that was played during the detection of the sound, wherein the beginning of the scene is at an earlier first time in the media asset than a second time corresponding to the detection of the sound.

(Claim 6) 6. (New) The method of claim 2, wherein the time threshold is a first time threshold, further comprising: determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected.

(Claim 7) 7. (New) The method of claim 2, wherein: the sound comprises words spoken by a user; and determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset.

(Claim 8) 8. (New) The method of claim 2, further comprising determining whether a sound volume of the sound exceeds a sound volume threshold, wherein causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold.

(Claim 9) 9. (New) The method of claim 2, further comprising: based at least in part on determining the sound duration is longer than the time threshold, causing display of an option to replay the portion of the media asset; and receiving an input indicting selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input.

(Claim 10) 10. (New) The method of claim 2, wherein: the sound is a first sound detected at a first time; and the method further comprises: detecting, during the playing of the media asset, a second sound at a second time that is not part of the audio of the media asset; and based at least in part on determining the second sound is not related to the media asset, modifying the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time.

(Claim 11) 11. (New) The method of claim 2, wherein the sound is a first sound detected at a first time during playing of the media asset at a first device, the method further comprising: causing playing of the media asset at a second device while the media asset is being played at the first device, wherein the first device and the second device are in communication over a network; receiving an indication that a user of the first device has requested to mute audio received from the second device over the network; detecting, during the playing of the media asset at the first device, a second sound that is not part of the media asset, wherein the second sound (i) is detected at the second device at a second time and (ii) has a second sound duration that is longer than the time threshold; and refraining from causing replay of a second portion of the media asset based at least in part on receiving the indication.

(Claim 12) 12. (New) The method of claim 11, wherein: the first sound is transmitted from the first device to the second device; and determining the first sound is not related to the media asset comprises analyzing the transmitted first sound to determine a relevance to the media asset.

(Claim 13) 13. (New) A system comprising: input/output circuitry configured to cause playing of a media asset; and control circuitry configured to: detect, during the playing of a media asset comprising audio, a sound that is not part of the audio of the media asset; determine the sound is not related to the media asset; determine whether a sound duration of the sound is longer than a time threshold; and based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, cause replay of a portion of the media asset.

(Claim 14) 14. (New) The system of claim 13, wherein the control circuitry is further configured to: determine, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replying; and cause replay of the portion of the media asset by causing replay of the media asset from the first timestamp to the second timestamp.

(Claim 15) 15. (New) The system of claim 14, wherein: the time threshold is a first time threshold; and the control circuitry is further configured to: determine whether the sound duration is longer than a second time threshold, wherein the second time threshold is longer than the first time threshold; and based at least in part on determining the sound duration is shorter than the second time threshold, determine the first timestamp to correspond to a time of the detection of the sound.

(Claim 16) 16. (New) The system of claim 14, wherein: the time threshold is a first time threshold; and the control circuitry is further configured to: determine whether the sound duration is longer than a second time threshold, wherein the second time threshold is longer than the first time threshold; and based at least in part on determining the sound duration is longer than the second time threshold, determine the first timestamp to correspond to a beginning of a scene of the media asset that was played during the detection of the sound, wherein the beginning of the scene is at an earlier first time in the media asset than a second time corresponding to the detection of the sound.

(Claim 17) 17. (New) The system of claim 13, wherein: the time threshold is a first time threshold; and the control circuitry is configured to determine the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected.

(Claim 18) 18. (New) The system of claim 13, wherein: the sound comprises words spoken by a user; and the control circuitry is configured to determine the sound is not related to the media asset by comparing the words spoken with metadata corresponding to the media asset.

(Claim 19) 19. (New) The system of claim 13, wherein: the control circuitry is further configured to determine whether a sound volume of the sound exceeds a sound volume threshold; and causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold.

(Claim 20) 20. (New) The system of claim 13, wherein: the control circuitry is further configured to, based at least in part on determining the sound duration is longer than the time threshold, cause display of an option to replay the portion of the media asset; and the input/output circuitry is further configured to receive an input indicting selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input.

(Claim 21) 21. (New) The system of claim 13, wherein: the sound is a first sound detected at a first time; and the control circuitry is further configured to: detect, during the playing of the media asset, a second sound at a second time that is not part of the audio of the media asset; and based at least in part on determining the second sound is not related to the media asset, modify the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time.

Patent No. 12,244,890

(Claim 11) 11. A method comprising: detecting, during output of a media asset, a sound that is not part of the media asset; determining that a number of portions of the sound that relate to the media asset is less than a threshold; based at least in part on the determining that a number of portions of the sound that relate to the media asset is less than a threshold, determining that the sound is not related to the media asset; and based at least in part on the determining that the sound is not related to the media asset, causing replay of a portion of the media asset.

(Claim 11 above and Claim 17 below include the claimed limitations of Claim 3 of the Instant Application)

(Claim 12) 12. The method of claim 11, further comprising: based at least in part on detecting the sound, storing the portion of the media asset that is to be replayed.

(Claim 4 of the Instant Application is indicated allowable)

(Claim 13) 13. The method of claim 11, wherein determining that the number of portions of the sound that relate to the media asset is less than the threshold comprises comparing the sound with metadata corresponding to the media asset.

(Claim 5 of the Instant Application is indicated allowable)

(Claim 14) 14. The method of claim 13, wherein the metadata comprises one or more of subtitles for the media asset, a title of the media asset, names of actors in the media asset, or genre data.

(Claim 15) 15. The method of claim 11, wherein in the sound comprises words spoken by a user.
(Claim 12 above includes the claimed limitations of Claim 6 of the Instant Application) (Claim 16) 16. The method of claim 15, wherein determining that the number of portions of that relate to the media asset is less than the threshold comprises determining that a number or percentage of the words spoken that match metadata corresponding to the media asset is less than a threshold value. (Claim 13 and Claim 16 above include the claimed limitations of Claim 7 of the Instant Application) (Claim 17) 17. The method of claim 11, further comprising: based at least in part on the detecting the sound, identifying a timestamp in the media asset corresponding to a start of the sound; wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the identified timestamp. (Claim 18) 18. The method of claim 11, further comprising: analyzing the media asset at a time when the sound is detected to identify a portion of dialogue occurring in the media asset at the time when the sound was detected; and determining a timestamp in the media asset corresponding to a start of the identified portion of dialogue; wherein causing replay of the portion of the media asset comprises causing replay of the media asset at the determined timestamp. (Claim 19) 19. The method of claim 11, further comprising: based at least in part on the determining that the number of portions of the sound that relate to the media asset is less than the threshold, causing display, on a client device, of an option to replay the portion of the media asset and a countdown timer for responding to the option; wherein causing replay of the portion of the media asset is performed further based at least in part on receiving input selecting, prior to completion of a countdown of the countdown timer, the option to replay the portion of the media asset. (Claim 11 of the Instant Application is indicated allowable) (Claim 20) 20. 
The method of claim 11, wherein the media asset is being output by a plurality of client devices in a media watch party, wherein the sound is detected as being audible in only a single environment corresponding to a client device of the plurality of client devices, and wherein the method further comprises: displaying, at the client device, an option to replay the portion of the media asset after conclusion of the media watch party; wherein causing replaying the portion of the media asset on the client device is performed further based at least in part on receiving input selecting the option to replay the portion of the media asset. (Claim 12 of the Instant Application is indicated allowable) (Claim 1) 1. A system comprising: input/output circuitry configured to cause output of a media asset; and control circuitry configured to: detect, during output of a media asset, a sound that is not part of the media asset; determine that a number of portions of the sound that relate to the media asset is less than a threshold; based at least in part on the determining that a number of portions of the sound that relate to the media asset is less than a threshold, determine that the sound is not related to the media asset; and based at least in part on determining that the sound is not related to the media asset, cause replay of a portion of the media asset. (Claim 2) 2. The system of claim 1, wherein the control circuitry configured to determine that the number of portions of the sound that relate to the media asset is less than the threshold is further configured to compare the sound with metadata corresponding to the media asset. (Claim 3) 3. The system of claim 2, wherein the metadata comprises one or more of subtitles for the media asset, a title of the media asset, names of actors in the media asset, or genre data. (Claim 15 of the Instant Application is indicated allowable) (Claim 4) 4. The system of claim 1, wherein in the sound comprises words spoken by a user. 
(Claim 5) 5. The system of claim 4, wherein the control circuitry configured to determine that the number of portions of the sound that relate to the media asset is less than the threshold is further configured to determine that a number or percentage of the words spoken that do not match metadata corresponding to the media asset is less than a threshold value. (Claim 16 of the Instant Application is indicated allowable) (Claim 6) 6. The system of claim 1, wherein the control circuitry is further configured to: based at least in part on detecting the sound, identify a timestamp in the media asset corresponding to a start of the sound; wherein the control circuitry configured to cause replay of the portion of the media asset is further configured to cause replay of the media asset from the identified timestamp. (Claim 7) 7. The system of claim 1, wherein the control circuitry is further configured to: based at least in part on detecting the sound, store the portion of the media asset that is to be replayed. (Claim 8) 8. The system of claim 1, wherein the control circuitry is further configured to: analyze the media asset at a time when the sound is detected to identify a portion of dialogue occurring in the media asset at the time when the sound was detected; and determine a timestamp in the media asset corresponding to a start of the identified portion of dialogue; wherein the control circuitry configured to cause replay of the portion of the media asset is further configured to cause replay of the media asset at the determined timestamp. (Claim 9) 9. 
The system of claim 1, wherein the control circuitry is further configured to: based at least in part on the determining that the number of portions of the sound that relate to the media asset is less than the threshold, cause display, on a client device, of an option to replay the portion of the media asset and a countdown timer for responding to the option; wherein the control circuitry configured to cause replay of the portion of the media asset is further configured to do so based at least in part on receiving input selecting, prior to a completion of a countdown of the countdown timer, the option to replay the portion of the media asset. (Claim 10) 10. The system of claim 1, wherein the media asset is being output by a plurality of client devices in a media watch party, wherein the sound is detected as being audible in only a single environment corresponding to a client device of the plurality of client devices, and wherein the control circuitry is further configured to: display, at the client device, an option to replay the portion of the media asset after conclusion of the media watch party; wherein the control circuitry configured to cause replaying the portion of the media asset on the client device is further configured to do so based at least in part on receiving input selecting the option to replay the portion of the media asset. Claims 2, 6, 9, 13, 17 and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 11 and 12 of U.S. Patent No. 12,244,890, and further in view of Pawlowski Pub. No. US 2007/0143820. Re claim 2, the conflicting claims are not patentably distinct from each other because every limitation of claim 2 of the Instant Application is found in claim 11 of the Patent No. 
12,244,890, except the following limitation: “determining whether a sound duration of the sound is longer than a time threshold.” However, the reference of Pawlowski explicitly teaches “determining whether a sound duration of the sound is longer than a time threshold” (see ¶ 44 for determining whether a sound duration of the sound is longer than a time threshold (i.e. when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Pawlowski as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (determining) into the system of Patent No. 12,244,890 as taught by Pawlowski. One will be motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Pawlowski for the benefit of monitoring ambient sounds by the tag manager, wherein the tag manager listens for a disturbing event notification from the ambient sound analyzer in step 301, wherein when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, wherein this may limit the number of stored tags, wherein when the time difference is large, then a disturbing event tag is generated in step 303, wherein the tag specifies a certain point in the stream, associated with the point at which the event occurred, wherein the association may be direct, i.e. 
the tag may specify the exact point in the stream at which the event occurred, or indirect, i.e. the tag may specify a point offset from the point of event occurrence, such as 30 seconds earlier for enabling the user to get acquainted with the fragment that was missed, or may define a beginning of the scene during which the event occurred in order to ease the processing time and have a user friendly interaction when enabling the user to get acquainted with the fragment that was missed (see fig. 3 ¶ 44) Re claim 6, the conflicting claims are not patentably distinct from each other because every limitation of claim 6 of the Instant Application is found in claim 12 of the Patent No. 12,244,890, except the following limitation: “determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected.” However, the reference of Pawlowski explicitly teaches “determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected” (see ¶ 44 for the time threshold is a first time threshold, further comprising: determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected (i.e. 
when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Pawlowski as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (determining) into the system of Patent No. 12,244,890 as taught by Pawlowski. One will be motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Pawlowski for the benefit of monitoring ambient sounds by the tag manager, wherein the tag manager listens for a disturbing event notification from the ambient sound analyzer in step 301, wherein when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, wherein this may limit the number of stored tags, wherein when the time difference is large, then a disturbing event tag is generated in step 303, wherein the tag specifies a certain point in the stream, associated with the point at which the event occurred, wherein the association may be direct, i.e. the tag may specify the exact point in the stream at which the event occurred, or indirect, i.e. 
the tag may specify a point offset from the point of event occurrence, such as 30 seconds earlier for enabling the user to get acquainted with the fragment that was missed, or may define a beginning of the scene during which the event occurred in order to ease the processing time and have a user friendly interaction when enabling the user to get acquainted with the fragment that was missed (see fig. 3 ¶ 44) Re claim 9, the conflicting claims are not patentably distinct from each other because every limitation of claim 9 of the Instant Application is found in claim 11 of the Patent No. 12,244,890, except the following limitation: “based at least in part on determining the sound duration is longer than the time threshold, causing display of an option to replay the portion of the media asset; and receiving an input indicting selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input.” However, the reference of Pawlowski explicitly teaches “based at least in part on determining the sound duration is longer than the time threshold, causing display of an option to replay the portion of the media asset” (see fig. 2 ¶ 43 for the time threshold is a first time threshold, further comprising: determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected (i.e. 
a list of stored tags for selection by the user can be displayed upon receiving a stream replay request as described in paragraph 24, furthermore, when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as described in fig. 3 paragraph 44)); “and receiving an input indicting selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input” (see ¶ 42 for receiving an input indicting selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input (i.e. when the stream play is initiated by a play request input by the user in step 201, the ambient sounds are monitored in step 202 by the ambient sound analyzer and disturbing event tags are generated and stored in the tag memory according to the procedure shown in FIG. 3, upon receiving a replay request, in step 203 a tag is selected from the tag memory and the stream processor is directed to replay the stream from the point associated with the selected tag, according to the procedure shown in FIG. 4 as described in fig. 
2 paragraph 43, furthermore, when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as described in fig. 3 paragraph 44)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Pawlowski as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (determining) into the system of Patent No. 12,244,890 as taught by Pawlowski. One will be motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Pawlowski for the benefit of monitoring ambient sounds by the tag manager, wherein the tag manager listens for a disturbing event notification from the ambient sound analyzer in step 301, wherein when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, wherein this may limit the number of stored tags, wherein when the time difference is large, then a disturbing event tag is generated in step 303, wherein the tag specifies a certain point in the stream, associated with the point at which the event occurred, wherein the association may be direct, i.e. the tag may specify the exact point in the stream at which the event occurred, or indirect, i.e. 
the tag may specify a point offset from the point of event occurrence, such as 30 seconds earlier for enabling the user to get acquainted with the fragment that was missed, or may define a beginning of the scene during which the event occurred in order to ease the processing time and have a user friendly interaction when enabling the user to get acquainted with the fragment that was missed (see fig. 3 ¶ 44) Re claim 13, the conflicting claims are not patentably distinct from each other because every limitation of claim 13 of the Instant Application is found in claim 11 of the Patent No. 12,244,890, except the following limitation: “determine whether a sound duration of the sound is longer than a time threshold.” However, the reference of Pawlowski explicitly teaches “determine whether a sound duration of the sound is longer than a time threshold” (see ¶ 44 for determine whether a sound duration of the sound is longer than a time threshold (i.e. when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Pawlowski as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (determine) into the system of Patent No. 12,244,890 as taught by Pawlowski. One of ordinary skill in the art would have been motivated to incorporate the above feature into the system of Patent No. 
12,244,890 as taught by Pawlowski for the benefit of monitoring ambient sounds by the tag manager, wherein the tag manager listens for a disturbing event notification from the ambient sound analyzer in step 301, wherein when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, wherein this may limit the number of stored tags, wherein when the time difference is large, then a disturbing event tag is generated in step 303, wherein the tag specifies a certain point in the stream, associated with the point at which the event occurred, wherein the association may be direct, i.e. the tag may specify the exact point in the stream at which the event occurred, or indirect, i.e. the tag may specify a point offset from the point of event occurrence, such as 30 seconds earlier for enabling the user to get acquainted with the fragment that was missed, or may define a beginning of the scene during which the event occurred in order to ease the processing time and have a user friendly interaction when enabling the user to get acquainted with the fragment that was missed (see fig. 3 ¶ 44) Re claim 17, the combination of Patent No. 12,244,890 and Pawlowski as discussed in claim 6 above discloses all the claimed limitations of claim 17. Re claim 20, the combination of Patent No. 12,244,890 and Pawlowski as discussed in claim 9 above discloses all the claimed limitations of claim 20. Claims 3, 7, 14 and 18 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 11, 13, 16 and 17 of U.S. Patent No. 12,244,890, and further in view of Logan US Pat. No. 10,149,008. Re claim 3, the conflicting claims are not patentably distinct from each other because every limitation of claim 3 of the Instant Application is found in claims 11 and 17 of the Patent No. 
12,244,890, except the following limitation: “determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying, wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp.” However, the reference of Logan explicitly teaches “determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying” (see col. 11 lines 53-61 for determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying (i.e. to annotate the respective portion of the media asset, the media guidance application may identify a start time and an end time of the respective portion of the media asset, e.g., by reading the relevant data section starting_time and ending_time in the metadata associated with the respective portion. The media guidance application may also determine, from the metadata, a plurality of topics corresponding to the media asset, e.g., by reading the relevant data entries topics in the metadata. In another example, the media guidance application may extract keywords from the text strings in the data entry caption as the topics of the program as described in col. 12 lines 50-61, furthermore, the media guidance application may also perform speech recognition to extract one or more keywords from the voice alert “you just missed the penalty goal” 117. The keywords may include “penalty,” “goal,” etc. The media guidance application may then determine a portion from the subset of portions of the soccer game, to which the keyword corresponds. 
For example, for each respective portion of the subset, the media guidance application may compare attributes corresponding to the respective portion to the keyword “penalty” or “goal.” If a respective portion of the media asset includes attributes that match “penalty” or “goal,” the respective portion may be identified for potential playing back as described in fig. 1 col. 15 lines 20-32). Also, see col. 12 lines 62-67, col. 13 line 1, col. 15 lines 1-19), wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp (see col. 15 lines 5-19, col. 32 lines 42-58 for causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp (i.e. to annotate the respective portion of the media asset, the media guidance application may identify a start time and an end time of the respective portion of the media asset, e.g., by reading the relevant data section starting_time and ending_time in the metadata associated with the respective portion. The media guidance application may also determine, from the metadata, a plurality of topics corresponding to the media asset, e.g., by reading the relevant data entries topics in the metadata. In another example, the media guidance application may extract keywords from the text strings in the data entry caption as the topics of the program as described in col. 12 lines 50-61, furthermore, replaying a portion of a media asset to a first user when a second user delivers a voice alert to the first user that is indicative of the portion of the media asset as described in col. 31 lines 53-55)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Logan as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (timestamp) into the system of Patent No. 12,244,890 as taught by Logan. 
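As a rough illustration, the Logan mechanism relied on here (reading a portion's starting_time and ending_time from metadata and matching voice-alert keywords against portion attributes) can be sketched in Python. All data structures, field names, and values below are hypothetical assumptions for illustration, not taken from the Logan reference:

```python
# Illustrative sketch: select the portion of a media asset whose metadata
# attributes match keywords extracted from a voice alert, then return the
# (start, end) window from which to replay it.

def find_replay_window(portions, alert_keywords):
    """Return (starting_time, ending_time) of the first portion whose
    metadata attributes match any extracted keyword, else None."""
    wanted = {k.lower() for k in alert_keywords}
    for portion in portions:
        attrs = {a.lower() for a in portion["attributes"]}
        if attrs & wanted:  # any keyword matches this portion's attributes
            return portion["starting_time"], portion["ending_time"]
    return None

# Hypothetical metadata for two portions of a soccer game (times in seconds).
portions = [
    {"attributes": ["kickoff"], "starting_time": 0, "ending_time": 120},
    {"attributes": ["penalty", "goal"], "starting_time": 2700, "ending_time": 2760},
]
# Keywords as might be extracted from the alert "you just missed the penalty goal".
window = find_replay_window(portions, ["penalty", "goal"])
```

The replay step would then seek to `window[0]` and stop at `window[1]`, mirroring the starting_time/ending_time reads the examiner cites.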
One of ordinary skill in the art would have been motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Logan for the benefit of identifying a start time and an end time of the respective portion of the media asset, e.g., by reading the relevant data section starting_time and ending_time in the metadata associated with the respective portion to annotate the respective portion of the media asset, wherein the media guidance application may also determine, from the metadata, a plurality of topics corresponding to the media asset, e.g., by reading the relevant data entries topics in the metadata in order to ease the processing time when reading the relevant data section starting_time and ending_time in the metadata associated with the respective portion to annotate the respective portion of the media asset (see col. 12 lines 50-58) Re claim 7, the conflicting claims are not patentably distinct from each other because every limitation of claim 7 of the Instant Application is found in claims 13 and 16 of the Patent No. 12,244,890, except the following limitation: “the sound comprises words spoken by a user; and determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset.” However, the reference of Logan explicitly teaches “the sound comprises words spoken by a user” (see col. 14 lines 61-67, col. 15 lines 1-2 for the sound comprises words spoken by a user (i.e. detect a voice alert (e.g., via user input interface 410) relating to the media asset from User B 112 towards User A 111 as described in fig. 1 col. 15 lines 2-4); “and determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset” (see col. 14 lines 61-67, col. 15 lines 1-2 for determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset (i.e. 
detect a voice alert (e.g., via user input interface 410) relating to the media asset from User B 112 towards User A 111 as described in fig. 1 col. 15 lines 2-4, furthermore, the media guidance application may also perform speech recognition to extract one or more keywords from the voice alert “you just missed the penalty goal” 117. The keywords may include “penalty,” “goal,” etc. The media guidance application may then determine a portion from the subset of portions of the soccer game, to which the keyword corresponds. For example, for each respective portion of the subset, the media guidance application may compare attributes corresponding to the respective portion to the keyword “penalty” or “goal.” If a respective portion of the media asset includes attributes that match “penalty” or “goal,” the respective portion may be identified for potential playing back as described in fig. 1 col. 15 lines 20-32). Also, see col. 15 lines 5-19) Therefore, taking the combined teachings of Patent No. 12,244,890 and Logan as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (comparing) into the system of Patent No. 12,244,890 as taught by Logan. One of ordinary skill in the art would have been motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Logan for the benefit of comparing attributes corresponding to the respective portion to the keyword “penalty” or “goal”, wherein if a respective portion of the media asset includes attributes that match “penalty” or “goal,” the respective portion may be identified for potential playing back in order to improve efficiency when identifying the respective portion for potential playing back (see fig. 1 col. 15 lines 20-32) Re claim 14, the combination of Patent No. 12,244,890 and Logan as discussed in claim 3 above discloses all the claimed limitations of claim 14. Re claim 18, the combination of Patent No. 
12,244,890 and Logan as discussed in claim 7 above discloses all the claimed limitations of claim 18. Claims 8, 10, 19 and 21 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 11 of U.S. Patent No. 12,244,890, and further in view of Reynolds US Pub. No. 2015/0012270. Re claim 8, the conflicting claims are not patentably distinct from each other because every limitation of claim 8 of the Instant Application is found in claim 11 of the Patent No. 12,244,890, except the following limitation: “determining whether a sound volume of the sound exceeds a sound volume threshold, wherein causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold.” However, the reference of Reynolds explicitly teaches “determining whether a sound volume of the sound exceeds a sound volume threshold, wherein causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold” (see ¶ 72 for determining whether a sound volume of the sound exceeds a sound volume threshold, wherein causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold (i.e. at step 104, it is determined whether the participant is speaking, i.e., whether the sound level emanating from the participant's audio stream (e.g., input by a microphone or other audio input device) is greater than a defined threshold value as described in fig. 12 paragraph 70, furthermore, at step 116, the previously stopped conference audio is accessed at the location indicated by the index points stored at step 108, and played back from that point as described in fig. 12 paragraph 73)) Therefore, taking the combined teachings of Patent No. 
12,244,890 and Reynolds as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (exceed) into the system of Patent No. 12,244,890 as taught by Reynolds. One of ordinary skill in the art would have been motivated to incorporate the above feature into the method of Patent No. 12,244,890 as taught by Reynolds for the benefit of implementing a process 100 for conference enhancement, wherein the steps of FIG. 12 may be performed by a process (which may be a software application, e.g., executed on a local device or a remote server), wherein at step 102, the sound level of a participant on a conference is monitored, wherein at step 104, it is determined whether the participant is speaking, i.e., whether the sound level emanating from the participant's audio stream (e.g., input by a microphone or other audio input device) is greater than a defined threshold value, wherein at step 116, the previously stopped conference audio is accessed at the location indicated by the index points stored at step 108, and played back from that point in order to ease the processing time when accessing and playing back audio from the location indicated by the index points (see fig. 12 ¶s 70, 73) Re claim 10, the conflicting claims are not patentably distinct from each other because every limitation of claim 10 of the Instant Application is found in claim 11 of the Patent No. 
12,244,890, except the following limitation: “based at least in part on determining the second sound is not related to the media asset, modifying the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time.” However, the reference of Reynolds explicitly teaches “based at least in part on determining the second sound is not related to the media asset, modifying the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time” (see ¶ 98 for based at least in part on determining the second sound is not related to the media asset, modifying the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time (i.e. the additional utterances may be added at the beginning, middle, or end of the original conference, and redirection tags may be inserted automatically so that later listeners are redirected to the additional utterances at the proper time)) Therefore, taking the combined teachings of Patent No. 12,244,890 and Reynolds as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (inserting) into the system of Patent No. 12,244,890 as taught by Reynolds. One of ordinary skill in the art would have been motivated to incorporate the above feature into the method of Patent No. 
12,244,890 as taught by Reynolds for the benefit of continuing the conference by recording additional utterances when a user is listening to a playback of a conference that has previously been recorded, wherein this may be implemented in a mode referred to as conference continuation mode, in which the user may record additional utterances to continue the thread of discussion and update the conference, wherein later, when the user or other users listen to the playback of the conference, the user's additional utterances are included in the playback, wherein in an example, the additional utterances may be added at the beginning, middle, or end of the original conference, and redirection tags may be inserted automatically so that later listeners are redirected to the additional utterances at the proper time in order to have a user friendly interaction (see ¶ 98) Re claim 19, the combination of Patent No. 12,244,890 and Reynolds as discussed in claim 8 above discloses all the claimed limitations of claim 19. Re claim 21, the combination of Patent No. 12,244,890 and Reynolds as discussed in claim 10 above discloses all the claimed limitations of claim 21. Claim Rejections - 35 USC § 102 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 
102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. Claims 2, 6, 9, 13, 17 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Pawlowski (US 2007/0143820 A1)(hereinafter Pawlowski). Re claim 2, Pawlowski discloses a method comprising: detecting, during playing of a media asset comprising audio, a sound that is not part of the audio of the media asset (see ¶s 37-38 for detecting, during playing of a media asset comprising audio, a sound that is not part of the audio of the media asset (i.e. the audio/video device 100 further comprises a microphone 103, which receives ambient sound, the sound received by the microphone is input to an ambient sound analyzer 104, which is used to detect disturbing events as described in figs. 1, 4 paragraphs 39-40). Also, see paragraphs 41-42); determining the sound is not related to the media asset (see ¶ 40 for determining the sound is not related to the media asset (i.e. the audio/video device 100 further comprises a microphone 103, which receives ambient sound as described in fig. 1 paragraph 39)); determining whether a sound duration of the sound is longer than a time threshold (see ¶ 44 for determining whether a sound duration of the sound is longer than a time threshold (i.e. 
when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)); and based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, causing replay of a portion of the media asset (see fig. 3 ¶ 44 for based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, causing replay of a portion of the media asset (i.e. a procedure for handling the replay of the stream by the replay manager, initiated by receiving a replay request in step 401, the replay request may be a forward or backward request, i.e. it may be related to replaying a previous fragment or a next fragment of the stream in relation to the currently watched, the forward request may be input by the user in case a previous backward replay request resulted in moving to a point in the stream, which was already watched, next, the replay manager selects a tag searching scheme in step 402, the tags may be searched in several ways, including: searching the oldest available tag, in order to replay the stream from the first detected disturbing event, searching the latest available tag, in order to replay the stream from the last detected disturbing event as described in fig. 4 paragraphs 45-47). 
Also, see paragraphs 48-49) Re claim 6, Pawlowski as discussed in claim 2 above discloses all the claim limitations with additional claimed feature wherein the time threshold is a first time threshold, further comprising: determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected (see ¶ 44 for the time threshold is a first time threshold, further comprising: determining the sound duration by: determining the sound has not been detected for more than a second time threshold, wherein the sound duration is from a first time of the detection of the sound to a second time when the sound was last detected (i.e. when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)) Re claim 9, Pawlowski as discussed in claim 2 above discloses all the claim limitations with additional claimed feature further comprising: based at least in part on determining the sound duration is longer than the time threshold, causing display of an option to replay the portion of the media asset (see fig. 2 ¶ 43 for based at least in part on determining the sound duration is longer than the time threshold, causing display of an option to replay the portion of the media asset (i.e. 
a list of stored tags for selection by the user can be displayed upon receiving a stream replay request as described in paragraph 24, furthermore, when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as described in fig. 3 paragraph 44)); and receiving an input indicating selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input (see ¶ 42 for receiving an input indicating selection of the option to replay the portion of the media asset, wherein causing replay of the portion of the media asset is further based on receiving the input (i.e. when the stream play is initiated by a play request input by the user in step 201, the ambient sounds are monitored in step 202 by the ambient sound analyzer and disturbing event tags are generated and stored in the tag memory according to the procedure shown in FIG. 3, upon receiving a replay request, in step 203 a tag is selected from the tag memory and the stream processor is directed to replay the stream from the point associated with the selected tag, according to the procedure shown in FIG. 4 as described in fig. 
2 paragraph 43, furthermore, when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as described in fig. 3 paragraph 44)) Re claim 13, Pawlowski discloses a system comprising: input/output circuitry configured to cause playing of a media asset (see ¶ 38 for input/output circuitry (i.e. audio/video device 100 as shown in fig. 1) configured to cause playing of a media asset (i.e. the stream processor 102 also provides a stream play status, such as information whether a stream is currently played and what is the current stream position as shown in fig. 1)); and control circuitry configured to (i.e. audio/video device 100 as shown in fig. 1 paragraph 38): detect, during the playing of a media asset comprising audio, a sound that is not part of the audio of the media asset (see ¶s 37-38 for detect, during the playing of a media asset comprising audio, a sound that is not part of the audio of the media asset (i.e. the audio/video device 100 further comprises a microphone 103, which receives ambient sound, the sound received by the microphone is input to an ambient sound analyzer 104, which is used to detect disturbing events as described in figs. 1, 4 paragraphs 39-40). Also, see paragraphs 41-42); determine the sound is not related to the media asset (see ¶ 40 for determine the sound is not related to the media asset (i.e. the audio/video device 100 further comprises a microphone 103, which receives ambient sound as described in fig. 
1 paragraph 39)); determine whether a sound duration of the sound is longer than a time threshold (see ¶ 44 for determine whether a sound duration of the sound is longer than a time threshold (i.e. when a disturbing event is detected in step 302, the tag manager may check if the time difference between the current event and the last stored event is less than a predefined time, for example 1 minute, and if so, listen for another event, this may limit the number of stored tags, when the time difference is large, then a disturbing event tag is generated in step 303, the tag specifies a certain point in the stream, associated with the point at which the event occurred as shown in fig. 3)); and based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, cause replay of a portion of the media asset (see fig. 3 ¶ 44 for based at least in part on determining the sound is not related to the media asset and that the sound duration is longer than the time threshold, cause replay of a portion of the media asset (i.e. a procedure for handling the replay of the stream by the replay manager, initiated by receiving a replay request in step 401, the replay request may be a forward or backward request, i.e. it may be related to replaying a previous fragment or a next fragment of the stream in relation to the currently watched, the forward request may be input by the user in case a previous backward replay request resulted in moving to a point in the stream, which was already watched, next, the replay manager selects a tag searching scheme in step 402, the tags may be searched in several ways, including: searching the oldest available tag, in order to replay the stream from the first detected disturbing event, searching the latest available tag, in order to replay the stream from the last detected disturbing event as described in fig. 4 paragraphs 45-47). 
Also, see paragraph 48-49) Re claim 17, Pawlowski as discussed in claims 6 and 13 above discloses all the claimed limitations of claim 17. Re claim 20, Pawlowski as discussed in claims 9 and 13 above discloses all the claimed limitations of claim 20. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 3, 7, 14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Pawlowski (US 2007/0143820 A1)(hereinafter Pawlowski) as applied to claims 2, 6, 9, 13, 17 and 20 above, and further in view of Logan (US 10,149,008 B1)(hereinafter Logan). 
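For orientation, the Pawlowski tag-generation and replay flow cited throughout the rejections above (the 1-minute debounce of steps 301-303, the point offset of ¶ 44, and the oldest/latest tag search schemes of ¶¶ 45-47) can be sketched as a minimal Python example. The class, names, and constants are illustrative assumptions, not the reference's actual implementation:

```python
# Illustrative sketch of the cited flow: store a "disturbing event" tag only
# when the previous stored event is older than a predefined window, tag a
# stream point slightly before the event, and on replay jump to the oldest
# or latest stored tag.

MIN_GAP = 60.0    # predefined time between stored events (e.g. 1 minute)
OFFSET = 30.0     # tag 30 s before the event, per the cited example

class TagManager:
    def __init__(self):
        self.tags = []          # stored stream positions (seconds)
        self.last_event = None  # time of the last stored event

    def on_disturbing_event(self, event_time, stream_position):
        # Debounce: if the last stored event is too recent, keep listening.
        if self.last_event is not None and event_time - self.last_event < MIN_GAP:
            return
        self.last_event = event_time
        self.tags.append(max(0.0, stream_position - OFFSET))

    def replay_point(self, scheme="oldest"):
        # Tag search schemes: replay from the first or the last detected event.
        if not self.tags:
            return None
        return min(self.tags) if scheme == "oldest" else max(self.tags)

tm = TagManager()
tm.on_disturbing_event(event_time=100.0, stream_position=500.0)  # stored
tm.on_disturbing_event(event_time=130.0, stream_position=530.0)  # ignored (< 60 s gap)
tm.on_disturbing_event(event_time=200.0, stream_position=600.0)  # stored
```

The second event is suppressed by the debounce, so only two tags exist; a backward replay request under the "oldest" scheme would seek to the first stored tag.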
Re claim 3, Pawlowski as discussed in claim 2 above discloses all the claimed limitations but fails to explicitly teach further comprising determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying, wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp. However, the reference of Logan explicitly teaches further comprising determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying (see col. 11 lines 53-61 for determining, based at least in part on the sound duration, a first timestamp in the media asset from which to start replaying the media asset and a second timestamp in the media asset at which to end the replaying (i.e. to annotate the respective portion of the media asset, the media guidance application may identify a start time and an end time of the respective portion of the media asset, e.g., by reading the relevant data section starting_time and ending_time in the metadata associated with the respective portion. The media guidance application may also determine, from the metadata, a plurality of topics corresponding to the media asset, e.g., by reading the relevant data entries topics in the metadata. In another example, the media guidance application may extract keywords from the text strings in the data entry caption as the topics of the program as described in col. 12 lines 50-61, furthermore, the media guidance application may also perform speech recognition to extract one or more keywords from the voice alert “you just missed the penalty goal” 117. The keywords may include “penalty,” “goal,” etc. 
The media guidance application may then determine a portion from the subset of portions of the soccer game to which the keyword corresponds. For example, for each respective portion of the subset, the media guidance application may compare attributes corresponding to the respective portion to the keyword "penalty" or "goal." If a respective portion of the media asset includes attributes that match "penalty" or "goal," the respective portion may be identified for potential playing back, as described in fig. 1 col. 15 lines 20-32). Also, see col. 12 lines 62-67, col. 13 line 1, col. 15 lines 1-19), wherein causing replay of the portion of the media asset comprises causing replay of the media asset from the first timestamp to the second timestamp (see col. 15 lines 5-19, col. 32 lines 42-58 (i.e. to annotate the respective portion of the media asset, the media guidance application may identify a start time and an end time of the respective portion, e.g., by reading the relevant data sections starting_time and ending_time in the metadata associated with the respective portion, as described in col. 12 lines 50-61; furthermore, replaying a portion of a media asset to a first user when a second user delivers a voice alert to the first user that is indicative of the portion of the media asset, as described in col. 31 lines 53-55))

Therefore, taking the combined teachings of Pawlowski and Logan as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (timestamp) into the system of Pawlowski as taught by Logan. One would be motivated to do so for the benefit of identifying a start time and an end time of the respective portion of the media asset by reading the starting_time and ending_time data sections in the metadata associated with that portion, easing processing time when annotating the respective portion of the media asset (see col. 12 lines 50-58).

Re claim 7, Pawlowski as discussed in claim 2 above discloses all the claimed limitations but fails to explicitly teach wherein: the sound comprises words spoken by a user; and determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset.

However, the reference of Logan explicitly teaches wherein: the sound comprises words spoken by a user (see col. 14 lines 61-67, col. 15 lines 1-2, i.e. detect a voice alert (e.g., via user input interface 410) relating to the media asset from User B 112 towards User A 111, as described in fig. 1 col. 15 lines 2-4); and determining the sound is not related to the media asset comprises comparing the words spoken with metadata corresponding to the media asset (see col. 14 lines 61-67, col.
15 lines 1-2; i.e. detect a voice alert (e.g., via user input interface 410) relating to the media asset from User B 112 towards User A 111, as described in fig. 1 col. 15 lines 2-4. Furthermore, the media guidance application may also perform speech recognition to extract one or more keywords from the voice alert "you just missed the penalty goal" 117. The keywords may include "penalty," "goal," etc. The media guidance application may then determine a portion from the subset of portions of the soccer game to which the keyword corresponds. For example, for each respective portion of the subset, the media guidance application may compare attributes corresponding to the respective portion to the keyword "penalty" or "goal." If a respective portion of the media asset includes attributes that match "penalty" or "goal," the respective portion may be identified for potential playing back, as described in fig. 1 col. 15 lines 20-32. Also, see col. 15 lines 5-19.)

Therefore, taking the combined teachings of Pawlowski and Logan as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (comparing) into the system of Pawlowski as taught by Logan. One would be motivated to do so for the benefit of comparing attributes corresponding to the respective portion to the keyword "penalty" or "goal," so that a matching portion may be identified for potential playing back, improving efficiency when identifying the respective portion for playback (see fig. 1 col. 15 lines 20-32).

Re claim 14, the combination of Pawlowski and Logan as discussed in claims 3 and 13 above discloses all the claimed limitations of claim 14.

Re claim 18, the combination of Pawlowski and Logan as discussed in claims 7 and 13 above discloses all the claimed limitations of claim 18.

Claims 8, 10, 19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Pawlowski (US 2007/0143820 A1) (hereinafter Pawlowski) as applied to claims 2, 6, 9, 13, 17 and 20 above, and further in view of Reynolds (US 2015/0012270 A1) (hereinafter Reynolds).

Re claim 8, Pawlowski as discussed in claim 2 above discloses all the claimed limitations but fails to explicitly teach further comprising determining whether a sound volume of the sound exceeds a sound volume threshold, wherein causing replay of the portion of the media asset is further based at least in part on determining the sound volume of the sound exceeds the sound volume threshold.

However, the reference of Reynolds explicitly teaches this limitation (see ¶ 72 (i.e. at step 104, it is determined whether the participant is speaking, i.e., whether the sound level emanating from the participant's audio stream (e.g., input by a microphone or other audio input device) is greater than a defined threshold value, as described in fig.
12 paragraph 70; furthermore, at step 116, the previously stopped conference audio is accessed at the location indicated by the index points stored at step 108 and played back from that point, as described in fig. 12 paragraph 73))

Therefore, taking the combined teachings of Pawlowski and Reynolds as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (exceed) into the system of Pawlowski as taught by Reynolds. One would be motivated to do so for the benefit of implementing a process 100 for conference enhancement, wherein the steps of FIG. 12 may be performed by a process (which may be a software application, e.g., executed on a local device or a remote server): at step 102, the sound level of a participant on a conference is monitored; at step 104, it is determined whether the participant is speaking, i.e., whether the sound level emanating from the participant's audio stream is greater than a defined threshold value; and at step 116, the previously stopped conference audio is accessed at the location indicated by the index points stored at step 108 and played back from that point, easing processing time when accessing and playing back audio from the indicated location (see fig. 12 ¶s 70, 73).

Re claim 10, Pawlowski as discussed in claim 2 above discloses all the claimed limitations, with the additional claimed feature wherein: the sound is a first sound detected at a first time (see ¶ 39, i.e. the sound received by the microphone is input to an ambient sound analyzer 104, which is used to detect disturbing events, as described in fig. 1 paragraph 40); and the method further comprises: detecting, during the playing of the media asset, a second sound at a second time that is not part of the audio of the media asset (see ¶s 37-38, i.e. the audio/video device 100 further comprises a microphone 103, which receives ambient sound; the sound received by the microphone is input to an ambient sound analyzer 104, which is used to detect disturbing events, as described in figs. 1, 4 paragraphs 39-40. Also, see paragraphs 41-42.)

Pawlowski fails to explicitly teach: and based at least in part on determining the second sound is not related to the media asset, modifying the media asset by inserting additional content into the media asset at a second timestamp corresponding to the second time.

However, the reference of Reynolds explicitly teaches this limitation (see ¶ 98, i.e. the additional utterances may be added at the beginning, middle, or end of the original conference, and redirection tags may be inserted automatically so that later listeners are redirected to the additional utterances at the proper time).

Therefore, taking the combined teachings of Pawlowski and Reynolds as a whole, it would have been obvious before the effective filing date of the claimed invention to incorporate this feature (inserting) into the system of Pawlowski as taught by Reynolds.
One would be motivated to do so for the benefit of continuing a previously recorded conference: when a user is listening to a playback of a conference that has previously been recorded, a conference continuation mode allows the user to record additional utterances to continue the thread of discussion and update the conference, so that later, when the user or other users listen to the playback, the additional utterances are included. In an example, the additional utterances may be added at the beginning, middle, or end of the original conference, and redirection tags may be inserted automatically so that later listeners are redirected to the additional utterances at the proper time, providing a user-friendly interaction (see ¶ 98).

Re claim 19, the combination of Pawlowski and Reynolds as discussed in claims 8 and 13 above discloses all the claimed limitations of claim 19.

Re claim 21, the combination of Pawlowski and Reynolds as discussed in claims 10 and 13 above discloses all the claimed limitations of claim 21.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSE M MESA, whose telephone number is (571) 270-1706. The examiner can normally be reached Monday-Friday, 8:30 AM-6:00 PM ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Thai Tran, can be reached at 571-272-7382.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

4/1/2026
/JOSE M. MESA/
Examiner, Art Unit 2484
/THAI Q TRAN/
Supervisory Patent Examiner, Art Unit 2484
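For orientation, the mechanism recited across these rejections (detect an ambient sound during playback, check whether it exceeds a volume threshold and whether it relates to the media asset, then replay the portion between two timestamps derived from the sound's start and duration) can be sketched roughly as follows. This is an illustrative sketch only; all names, the threshold value, and the keyword logic are assumptions for exposition, not drawn from the claims or the cited references.

```python
from dataclasses import dataclass, field

@dataclass
class DetectedSound:
    """An ambient sound picked up during playback (cf. Pawlowski's ambient sound analyzer)."""
    start: float        # seconds into the media asset when the sound began
    duration: float     # how long the sound lasted, in seconds
    volume_db: float    # measured loudness of the sound
    words: list[str] = field(default_factory=list)  # keywords from speech recognition, if any

# Assumed value; the references describe only "a defined threshold".
VOLUME_THRESHOLD_DB = 55.0

def should_replay(sound: DetectedSound, asset_keywords: set[str]) -> bool:
    """Replay only if the sound was loud enough AND unrelated to the media asset
    (cf. comparing spoken words against metadata keywords in Logan)."""
    loud_enough = sound.volume_db > VOLUME_THRESHOLD_DB
    related = any(w.lower() in asset_keywords for w in sound.words)
    return loud_enough and not related

def replay_window(sound: DetectedSound, lead_in: float = 1.0) -> tuple[float, float]:
    """First and second timestamps bounding the portion to replay, derived from the
    sound's start time and duration (cf. Logan's starting_time/ending_time metadata)."""
    start_ts = max(0.0, sound.start - lead_in)  # small lead-in so context is not lost
    end_ts = sound.start + sound.duration
    return start_ts, end_ts

# Example: a phone ringing 2 minutes into a soccer broadcast triggers a replay,
# while a shout of "goal" matching the asset's metadata does not.
ring = DetectedSound(start=120.0, duration=4.5, volume_db=60.0, words=["phone", "ringing"])
shout = DetectedSound(start=30.0, duration=1.0, volume_db=70.0, words=["Goal"])
print(should_replay(ring, {"penalty", "goal"}))   # unrelated and loud: replay
print(should_replay(shout, {"penalty", "goal"}))  # related to the asset: no replay
print(replay_window(ring))
```

The sketch separates the trigger decision from the window computation, mirroring how the office action maps the threshold limitation to Reynolds and the timestamp limitation to Logan.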

Prosecution Timeline

Jan 29, 2025
Application Filed
Apr 01, 2026
Non-Final Rejection — §102, §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598333
DATA PROCESSING METHOD AND APPARATUS, AND DEVICE, STORAGE MEDIUM AND PROGRAM PRODUCT
2y 5m to grant Granted Apr 07, 2026
Patent 12598389
IMAGING DEVICE, SENSOR CHIP, AND PROCESSING CIRCUIT
2y 5m to grant Granted Apr 07, 2026
Patent 12597444
SYSTEMS AND METHODS FOR AUTOMATED DIGITAL EDITING
2y 5m to grant Granted Apr 07, 2026
Patent 12580004
VIDEO EDITING SUPPORT DEVICE, VIDEO EDITING SUPPORT METHOD, AND RECORDING MEDIUM
2y 5m to grant Granted Mar 17, 2026
Patent 12581156
DISPLAY APPARATUS AND RECORDING METHOD
2y 5m to grant Granted Mar 17, 2026
Based on this examiner's 5 most recent grants in similar technology.


Prosecution Projections

1-2
Expected OA Rounds
70%
Grant Probability
86%
With Interview (+16.4%)
2y 5m
Median Time to Grant
Low
PTA Risk
Based on 575 resolved cases by this examiner. Grant probability derived from career allow rate.
