Last updated: May 29, 2026

Application No. 18/645,832

Integrated System for Multimodal Media Review and Selective Content Redaction

Non-Final OA §103

Filed

Apr 25, 2024

Priority

Jan 24, 2024 — IN 202441004839

Examiner

ANDERSON, BRODERICK C

Art Unit

2178

Tech Center

2100 — Computer Architecture & Software

Assignee

Open Text Technologies India Private Limited

OA Round

1 (Non-Final)

Interview Optional

This examiner has a 74% grant rate and a +19.5% interview lift. An interview has already been conducted in this application's prosecution history, so the recommended next step is a written response with narrowed claims based on precedent claim evolution patterns.

Based on 261 resolved cases, 2023–2026

Examiner Intelligence

ANDERSON, BRODERICK C

Grants 74% — above average

Career Allowance Rate

192 granted / 261 resolved

+18.6% vs TC avg

Interview Lift: +19.5% (strong; resolved cases with an interview vs. without)

Typical timeline

2y 11m

Avg Prosecution

8 currently pending

Career history

282

Total Applications

across all art units

Statute-Specific Performance

§101: 3.4% (-36.6% vs TC avg)

§103: 88.7% (+48.7% vs TC avg)

§102: 5.1% (-34.9% vs TC avg)

§112: 0.7% (-39.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 261 resolved cases.

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
The present application has claimed priority under 35 U.S.C. 119 from Indian Patent Application No. IN202441004839 filed 1/24/2024. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.

Drawings
The drawings filed 4/25/2024 were accepted.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 10-14 are rejected under 35 U.S.C. 103 as being unpatentable over Quaeler et al (US20070030528A1; filed 7/31/2006) in view of Cohen et al (US20090031381A1; filed 7/24/2007) and Somasundaram et al (US20150121225A1; filed 10/25/2013).


With regards to claim 1, Quaeler et al discloses A method for reviewing audio and video content in a document review and electronic discovery software platform, the method comprising: …
transcribing audio content of the files using a transcription service to make the content searchable (Quaeler et al, paragraph 170: “where audio content is involved that can be converted from speech to text [2730], the user interface [110] offers a transcript-style view [2735] of the text;” paragraph 179: “If the motion document [415] has speech content, in one embodiment the system will perform a speech to text transformation and index the results in a manner such that the linkage of the utterances to time is preserved [2730];” “to make the content searchable” is treated as intended use);
displaying a timeline view associated with the audio and video content, including an audio waveform indicating spoken parts and a… view for video content (Quaeler et al, paragraph 169: “if the motion document [415] contains video and audio data, the data representation shown is a video playback window, similar to what a user of a personal computer would expect to see when playing video files [2720]”);
providing playback controls for the audio and video content, including speed adjustment and autoplay features (Quaeler et al, Fig. 27: Playback controls 2725 include play, stop, fast forward and rewind; Note: “autoplay” isn’t defined in the specification or the claims, and is a bit vague as to what it is in this context. Examiner is interpreting it as a part of the functionality included with the play button);
synchronizing playback of the audio and video content with the displayed timeline view and transcription (Quaeler et al, paragraph 172: “In edit mode, the user is able to click on a pre-existing redacted [305] region's representation in the timeline view [2720]; once clicked on, that region is considered selected [2705].” Fig. 27: the timeline indicators 2705 and 2745 are synchronized on the different timelines); and
facilitating review and analysis of the audio and video content within the document review and electronic discovery software platform (Quaeler et al, paragraph 172: “In edit mode, the user is able to click on a pre-existing redacted [305] region's representation in the timeline view [2720]; once clicked on, that region is considered selected [2705].”).
	However, Quaeler et al does not disclose normalizing and compressing audio and video files for playback with reduced bandwidth and latency requirements… a thumbnail view for video content.
	Cohen et al teaches normalizing (Cohen et al, paragraph 24: “Examples of the normalizing services or functions implemented by the proxy video server for video surveillance systems include video device discovery protocols, session protocols, video and audio transcoding, surveillance data storage and retrieval, video analytics functions, and metadata formats.”) and compressing audio and video files for playback with reduced bandwidth and latency requirements (Cohen et al, paragraph 25: “the proxy video server makes a video source appear to provide a wider range of image compression options than are directly supported by that video source.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al and Cohen et al such that the video/audio data is normalized and compressed. This would have enabled the invention to use more efficient formats for transmission (Cohen et al, paragraph 57: “in the case of the novel video management system 200, transcoding capabilities available on-board the novel proxy video server 210 enable automatic translation of the video source streams into uniform and bandwidth efficient compression formats (e.g., MPEG-4; H.264). That is, the transcoding “translates” the video data to a normalized form”).
Somasundaram et al teaches a thumbnail view for video content (Somasundaram et al, Fig. 4-6: small frames are displayed on the timeline; paragraph 40: “After looking at the “thumbnail” views of frames M through U, the viewer may determine that the desired frame was recorded prior to frame M”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al, Cohen et al, and Somasundaram et al such that the user can navigate the video using a thumbnail view. This would have enabled a user to quickly navigate through a video (Somasundaram et al, abstract: “navigating through image frames of video content to display particular frames of the video content while the video content continues to play;” paragraph 52: “Having more frames displayed within timeline 320 may provide the viewer with a better “bird's eye view” of the video content, thereby allowing the viewer to navigate more quickly to chronologically distant frames of the video content relative to the frame that is currently playing”).
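
For readers mapping the claim 1 limitations to an implementation, the synchronization of playback with a time-linked transcript (the limitation the examiner reads onto Quaeler et al, paragraphs 172 and 179) can be illustrated roughly as follows. This is only a sketch under stated assumptions: the `Utterance` type, the `active_utterance` helper, and the sample transcript are hypothetical and do not come from the application or from any cited reference.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Utterance:
    """One transcribed utterance with its time span (hypothetical type)."""
    start: float  # seconds from the start of the media
    end: float    # seconds; the span is treated as [start, end)
    text: str


def active_utterance(utterances, position):
    """Return the utterance spoken at `position` seconds, or None.

    Assumes `utterances` is sorted by start time and does not overlap.
    """
    starts = [u.start for u in utterances]
    i = bisect_right(starts, position) - 1  # last utterance starting at or before position
    if i >= 0 and position < utterances[i].end:
        return utterances[i]
    return None  # playback position falls in a gap between utterances


# Illustrative data only
transcript = [
    Utterance(0.0, 2.5, "Good morning."),
    Utterance(3.0, 6.0, "Let us begin the review."),
]
```

A player would call `active_utterance(transcript, current_time)` on each timeline update and highlight the returned line in the transcript view; a `None` result means no speech at that position.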

	With regards to claim 2, which depends on claim 1, Quaeler et al discloses enabling a user to select portions of audio and video content for redaction based on timestamps or transcription text (Quaeler et al, paragraph 170: “the user interface [110] offers a transcript-style view [2735] of the text with selectable or “hot” vertically placed time markers [2710] indicating time span [2715].”).

	With regards to claim 3, which depends on claim 2, Quaeler et al discloses the redaction process (Quaeler et al, abstract: “enabling a user to define a redaction of a part of a document”) is synchronized with a timeline view and transcription of the audio and video content (Quaeler et al, paragraph 172: “In edit mode, the user is able to click on a pre-existing redacted [305] region's representation in the timeline view [2720]; once clicked on, that region is considered selected [2705].” Fig. 27: the timeline indicators 2705 and 2745 are synchronized on the different timelines). 

	With regards to claim 4, which depends on claim 3, Quaeler et al discloses replacing redacted portions with blank segments to prevent inference of the redacted content (Quaeler et al, paragraph 182: “Supported styles include FIG. 29, but are not limited to the following… Replacement of the data, with ‘empty data’ the data following the redacted data is not shifted in time. What data constitutes ‘empty data’ depends on the motion document [415] type. Examples would include: for an audio document, empty data would be zero-frequency, zero-amplitude signals (commonly called “silence”); for an audio-less video document, empty data would be data whose rendering result would be video frames featuring only black pixel data”).
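
The "empty data" replacement quoted from Quaeler et al, paragraph 182, where the redacted span becomes silence and the following data is not shifted in time, can be pictured with a short sketch. It assumes audio held as a plain list of PCM sample values; the function name `redact_with_silence` and its parameters are hypothetical and are not taken from the reference.

```python
def redact_with_silence(samples, sample_rate, start_s, end_s):
    """Return a copy of `samples` with the span [start_s, end_s) zeroed.

    The output has the same length as the input, so audio after the
    redacted span keeps its original position in time.
    """
    first = max(0, int(start_s * sample_rate))
    last = min(len(samples), int(end_s * sample_rate))
    redacted = list(samples)  # copy; the original recording is left untouched
    for i in range(first, last):
        redacted[i] = 0  # zero amplitude, i.e. silence
    return redacted


# Illustrative data only: 5 seconds of audio at 2 samples per second
original = [5] * 10
redacted = redact_with_silence(original, sample_rate=2, start_s=1.0, end_s=3.0)
```

Zeroing in place of cutting is what keeps the timeline and transcript offsets valid after redaction.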

	With regards to claim 10, which depends on claim 1, Quaeler et al discloses providing a preview of redacted content before finalizing the redaction process (Quaeler et al, Fig. 1: The user interface 110 and processing are performed prior to being sent to the production component 115 to generate the produced documents (finalized); The preview is interpreted as the redacted versions of the documents seen via the user interface, such as in Fig. 27).

Claims 11-14 recite substantially similar limitations to claims 1-4 respectively and are thus rejected along the same rationales.


Claims 5-6 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Quaeler et al in view of Cohen et al and Somasundaram et al, and further in view of Mandic et al (US20180054519A1; filed 4/18/2017).

With regards to claim 5, which depends on claim 4, Quaeler et al, Cohen et al, and Somasundaram et al do not disclose adding a buffer period to the beginning and end of each redacted segment.
However, Mandic et al teaches adding a buffer period to the beginning and end of each redacted segment (Mandic et al, abstract: “Audio file segments can be trimmed with precursive and successive time periods to move the start and end times of the audio segments;” paragraph 86: “a precursive predetermined time period is added to t-df-1. The predetermined time period tx needs to be added to the redaction segment thereby extending or moving back the redaction segment to a point earlier in time when the customer is audibly announcing the credit card number.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al, Cohen et al, Somasundaram et al, and Mandic et al such that a buffer period is added before and after the redacted segment. This would have enabled the invention to “ensure that the audio file containing PII, medical data or highly sensitive security information recorded during the comm session is handled with a reasonable degree of data security” (Mandic et al, paragraph 17).
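
The buffer-period limitation of claim 5 amounts to widening each redaction span on both sides. A minimal sketch follows, assuming spans are `(start, end)` pairs in seconds. The helper name `add_buffer`, the clamping to the media duration, and the merging of spans that overlap after widening are illustrative choices, not something stated in the claims or in Mandic et al.

```python
def add_buffer(segments, buffer_s, duration_s):
    """Widen each (start, end) span by `buffer_s` seconds on both sides.

    Spans are clamped to [0, duration_s]; spans that touch or overlap
    after widening are merged (an assumption made for this sketch).
    """
    widened = sorted(
        (max(0.0, start - buffer_s), min(duration_s, end + buffer_s))
        for start, end in segments
    )
    merged = []
    for start, end in widened:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Illustrative data only: three redactions in a 10-second recording
buffered = add_buffer([(1.0, 2.0), (2.5, 4.0), (9.5, 10.0)], buffer_s=0.5, duration_s=10.0)
```

Here the first two spans merge into one once the half-second buffers are applied, and the last span is clamped at the end of the recording.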

	With regards to claim 6, which depends on claim 5, Quaeler et al discloses generating an output of the redacted content wherein redacted sections are visually and audibly distinct from non-redacted sections (Quaeler et al, paragraph 182: “Supported styles include FIG. 29, but are not limited to the following… Replacement of the data, with ‘empty data’ the data following the redacted data is not shifted in time. What data constitutes ‘empty data’ depends on the motion document [415] type. Examples would include: for an audio document, empty data would be zero-frequency, zero-amplitude signals (commonly called “silence”); for an audio-less video document, empty data would be data whose rendering result would be video frames featuring only black pixel data”).

Claims 15-16 recite substantially similar limitations to claims 5-6 respectively and are thus rejected along the same rationales.



Claims 7-9 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Quaeler et al in view of Cohen et al and Somasundaram et al, and further in view of Skinner et al (US10347293B1; filed 7/31/2018).

With regards to claim 7, which depends on claim 1, Quaeler et al, Cohen et al, and Somasundaram et al do not disclose integrating video OCR and object detection features to enhance video analysis.
However, Skinner et al teaches integrating video OCR (Skinner et al, Summary, paragraph 2: “causing, in response to the selection, with one or more processors, optical character recognition (OCRing) of each frame in the subset of frames and obtaining corresponding frame-OCR records, each frame-OCR record including text determined by the OCRing to be depicted in a corresponding frame”) and object detection features to enhance video analysis (Skinner et al, col 8, lines 20-30: “Patterns may match to various types of content. In some embodiments, the patterns matched to non-text images, like faces, images of objects, images of rooms, images of maps, images of schematics, images of CAD files, and the like. In some embodiments, the patterns include object detection and localization models, such convolution neural networks trained on labeled training sets including examples of images with labels identifying objects to be detected”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al, Cohen et al, Somasundaram et al, and Skinner et al such that the video frames are analyzed to extract data which should be redacted. This would have enabled the invention to redact confidential text that appears on each frame of the video (Skinner et al, Summary, paragraph 2: “classifying, with one or more processors, text in each frame-OCR record as confidential or non-confidential; and forming, with one or more processors, a redacted version of the screen-cast video”).

With regards to claim 8, which depends on claim 7, Quaeler et al, Cohen et al, and Somasundaram et al do not disclose wherein video OCR includes extracting textual content from video frames.
However, Skinner et al teaches wherein video OCR includes extracting textual content from video frames (Skinner et al, Summary, paragraph 2: “causing, in response to the selection, with one or more processors, optical character recognition (OCRing) of each frame in the subset of frames and obtaining corresponding frame-OCR records, each frame-OCR record including text determined by the OCRing to be depicted in a corresponding frame”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al, Cohen et al, Somasundaram et al, and Skinner et al such that the video frames are analyzed to extract data which should be redacted. This would have enabled the invention to redact confidential text that appears on each frame of the video (Skinner et al, Summary, paragraph 2: “classifying, with one or more processors, text in each frame-OCR record as confidential or non-confidential; and forming, with one or more processors, a redacted version of the screen-cast video”).

With regards to claim 9, which depends on claim 7, Quaeler et al, Cohen et al, and Somasundaram et al do not disclose wherein object detection includes identifying and classifying objects within the video content.
However, Skinner et al teaches wherein object detection includes identifying and classifying objects within the video content (Skinner et al, col 8, lines 20-30: “Patterns may match to various types of content. In some embodiments, the patterns matched to non-text images, like faces, images of objects, images of rooms, images of maps, images of schematics, images of CAD files, and the like. In some embodiments, the patterns include object detection and localization models, such convolution neural networks trained on labeled training sets including examples of images with labels identifying objects to be detected”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al, Cohen et al, Somasundaram et al, and Skinner et al such that the video frames are analyzed to extract data which should be redacted. This would have enabled the invention to redact confidential text that appears on each frame of the video (Skinner et al, Summary, paragraph 2: “classifying, with one or more processors, text in each frame-OCR record as confidential or non-confidential; and forming, with one or more processors, a redacted version of the screen-cast video”).

Claims 17-19 recite substantially similar limitations to claims 7-9 respectively and are thus rejected along the same rationales.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Quaeler et al in view of Mandic et al.


With regards to claim 20, Quaeler et al discloses A method for redacting portions of audio and video content in a document review and electronic discovery software platform, the method comprising: allowing a user to select portions of audio and video content for redaction based on timestamps or transcription text (Quaeler et al, abstract: “enabling a user to define a redaction of a part of a document in a corpus of documents;” paragraph 170: “where audio content is involved that can be converted from speech to text [2730], the user interface [110] offers a transcript-style view [2735] of the text;” paragraph 78: “A motion document [415]… include two-dimensional video content, audio content;” Fig. 27: the UI shows redaction of a motion document);
synchronizing the redaction process with a timeline view and transcription of the audio and video content, enabling precise selection of redaction points (Quaeler et al, paragraph 172: “In edit mode, the user is able to click on a pre-existing redacted [305] region's representation in the timeline view [2720]; once clicked on, that region is considered selected [2705].” Fig. 27: the timeline indicators 2705 and 2745 are synchronized on the different timelines);
permanently removing the selected portions from the audio and video content and replacing them with blank segments (Quaeler et al, paragraph 182: “Supported styles include FIG. 29, but are not limited to the following… Replacement of the data, with ‘empty data’ the data following the redacted data is not shifted in time. What data constitutes ‘empty data’ depends on the motion document [415] type. Examples would include: for an audio document, empty data would be zero-frequency, zero-amplitude signals (commonly called “silence”); for an audio-less video document, empty data would be data whose rendering result would be video frames featuring only black pixel data”);…
generating a native output of the redacted content wherein redacted sections are visually and audibly distinct from non-redacted sections (Quaeler et al, paragraph 182: “Supported styles include FIG. 29, but are not limited to the following… Replacement of the data, with ‘empty data’ the data following the redacted data is not shifted in time. What data constitutes ‘empty data’ depends on the motion document [415] type. Examples would include: for an audio document, empty data would be zero-frequency, zero-amplitude signals (commonly called “silence”); for an audio-less video document, empty data would be data whose rendering result would be video frames featuring only black pixel data”); and
generating a preview or the redacted native output before production workflow is triggered within the eDiscovery platform (Quaeler et al, Fig. 1: The user interface 110 and processing are performed prior to being sent to the production component 115 to generate the produced documents (finalized); The preview is interpreted as the redacted versions of the documents seen via the user interface, such as in Fig. 27).
	However, Quaeler et al does not disclose adding a buffer period to the beginning and end of each redacted segment to prevent inference of the redacted content.
Mandic et al teaches adding a buffer period to the beginning and end of each redacted segment to prevent inference of the redacted content (Mandic et al, abstract: “Audio file segments can be trimmed with precursive and successive time periods to move the start and end times of the audio segments;” paragraph 86: “a precursive predetermined time period is added to t-df-1. The predetermined time period tx needs to be added to the redaction segment thereby extending or moving back the redaction segment to a point earlier in time when the customer is audibly announcing the credit card number.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date to have combined Quaeler et al and Mandic et al such that a buffer period is added before and after the redacted segment. This would have enabled the invention to “ensure that the audio file containing PII, medical data or highly sensitive security information recorded during the comm session is handled with a reasonable degree of data security” (Mandic et al, paragraph 17).



Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Roks (US 20180322106 A1): Teaches a user interface for redacting portions of a video that transcribes the audio.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRODERICK C ANDERSON whose telephone number is (313)446-6566. The examiner can normally be reached Monday-Tuesday, Thursday-Saturday 9-5 PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached at 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/B.C.A/ Examiner, Art Unit 2178
/STEPHEN S HONG/ Supervisory Patent Examiner, Art Unit 2178


Prosecution Timeline

Apr 25, 2024

Application Filed

Apr 07, 2026

Non-Final Rejection mailed — §103

May 01, 2026

Interview Requested

May 07, 2026

Examiner Interview Summary

May 07, 2026

Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

18/222,854

Patent 12619350

Shortcut Framework for Supporting Flexible User Interface Features

2y 9m to grant Granted May 05, 2026

18/339,263

Patent 12614324

IMAGE TABLE GENERATION

2y 10m to grant Granted Apr 28, 2026

18/374,734

Patent 12572199

METHOD AND APPARATUS FOR GENERATING GROUP EYE MOVEMENT TRAJECTORY, COMPUTING DEVICE, AND STORAGE MEDIUM

2y 5m to grant Granted Mar 10, 2026

17/188,312

Patent 12564337

RECURRENT NEURAL NETWORK FOR TUMOR MOVEMENT PREDICTION

5y 0m to grant Granted Mar 03, 2026

18/333,081

Patent 12566821

GENERATIVE SYSTEM FOR WRITING ENTITY RECOMMENDATIONS

2y 8m to grant Granted Mar 03, 2026

Based on the 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

74%

Grant Probability

93%

With Interview (+19.5%)

2y 11m (~10m remaining)

Median Time to Grant

Low

PTA Risk

Based on 261 resolved cases by this examiner. Grant probability derived from career allowance rate.
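
The projection figures above can be checked with simple arithmetic. The sketch below assumes the interview lift is added, in percentage points, to the unrounded career allowance rate; the page does not state its exact method, so this is one reading that happens to reproduce the displayed 74% and 93%.

```python
granted, resolved = 192, 261      # from "192 granted / 261 resolved"
interview_lift = 19.5             # percentage points, from "+19.5%"

# Career allowance rate as a percentage (about 73.56 before rounding)
allowance_rate = 100 * granted / resolved

# Assumed method: add the lift to the unrounded rate, then round for display
with_interview = allowance_rate + interview_lift

print(round(allowance_rate), round(with_interview))  # prints "74 93"
```

Adding the lift to the already rounded 74% would give 93.5%, which would not display as 93% under ordinary rounding. That is why the unrounded rate is assumed here.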