Prosecution Insights
Last updated: April 19, 2026
Application No. 17/818,278

AUTOMATED HAPTICS GENERATION AND DISTRIBUTION

Final Rejection (§103)
Filed: Aug 08, 2022
Examiner: YANG, JIANXUN
Art Unit: 2662
Tech Center: 2600 — Communications
Assignee: Disney Enterprises Inc.
OA Round: 4 (Final)

Grant Probability: 74% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 2y 9m
With Interview: 93%

Examiner Intelligence

Career Allow Rate: 74%, above average (472 granted / 635 resolved; +12.3% vs TC avg)
Interview Lift: +18.6% among resolved cases with interview (a strong lift)
Typical Timeline: 2y 9m average prosecution; 45 applications currently pending
Career History: 680 total applications across all art units
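
As a quick sanity check on the figures above (and on the 93% with-interview projection shown later in this report), here is a minimal Python sketch. It assumes, as a simplification, that the interview lift is additive in percentage points on top of the career allow rate; the report does not state the exact model.

```python
# Minimal sketch: reproduce the examiner-level headline figures shown above.
# Assumption (not stated in the source data): the interview lift is additive
# in percentage points on top of the career allow rate.

granted = 472          # granted applications (resolved)
resolved = 635         # total resolved applications
interview_lift = 18.6  # percentage-point lift for resolved cases with interview

allow_rate = 100 * granted / resolved          # ~74.3%, shown as 74%
with_interview = allow_rate + interview_lift   # ~92.9%, shown as 93%

print(f"Career allow rate:       {allow_rate:.1f}%")
print(f"Estimate with interview: {with_interview:.1f}%")
```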

Statute-Specific Performance

§101: 3.8% (-36.2% vs TC avg)
§103: 56.1% (+16.1% vs TC avg)
§102: 16.7% (-23.3% vs TC avg)
§112: 17.1% (-22.9% vs TC avg)

Deltas are measured against the Tech Center average estimate. Based on career data from 635 resolved cases.
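
A minimal sketch of how these deltas appear to relate to the baseline, assuming each "vs TC avg" figure is simply the displayed per-statute rate minus the Tech Center average estimate (the report does not define the calculation):

```python
# Minimal sketch: back out the Tech Center average estimate from the figures
# shown above, assuming delta = examiner_rate - tc_average_estimate.

figures = {  # per-statute figure (%) and displayed delta vs TC avg (points)
    "§101": (3.8, -36.2),
    "§103": (56.1, +16.1),
    "§102": (16.7, -23.3),
    "§112": (17.1, -22.9),
}

for statute, (rate, delta) in figures.items():
    tc_estimate = rate - delta  # implied Tech Center average estimate
    print(f"{statute}: examiner {rate:.1f}%, implied TC average {tc_estimate:.1f}%")
```

Under that reading, every row implies a Tech Center baseline of roughly 40%.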

Office Action (§103)

DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1, 3-19 and 21 are pending. Claims 2 and 20 are canceled.

Claim Rejections - 35 USC § 103

The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1, 3-19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Bharitkar et al. (US20220113801) in view of Cha et al. (US20180151036), and further in view of Chaudhuri et al. (US20180174600).

Regarding claims 1 and 13, Bharitkar teaches a computer-implemented method comprising: receiving media content from a computing device (Bharitkar, Fig. 9, input audio signals and video frames; Figs. 1-2, “smartphones, … imaging devices”, [0020]); determining one or more features of the media content, wherein: (Bharitkar, Fig. 9; feature extraction from both audio and video, “A feature extraction module 902 receives input audio signals (i.e., spatial audio) and performs feature extraction to extract features of the audio signals”, [0052]; “A feature extraction module 903 receives input video scenes. The video scenes are extracted on a frame-by-frame basis”, [0053]; “extracting features from the video”, [claim 8]) the one or more features comprises one or more segmented audio signals of the audio content; and (Bharitkar, Fig. 9; “A feature extraction module 602 receives input audio signals (i.e., spatial audio). The feature extraction can be performed using a convolutional neural network, an autoencoder, long short-term memory, hand-designed features, or a combination thereof. The input audio signals can be mono audio sources {x1,k(n), x2,k(n), . . . , xM,k(n)} where k is a frame index and n is the sample in the frame for audio source P in xP,k(n)”, [0046]; this shows that the audio is processed in frames, which are segmented pieces of audio. Cha further confirms this interpretation by explicitly defining an "audio frame" as "segmented pieces of audio data" (Fig. 5, S510, “the audio processor 410 may segment input audio data at specified intervals. Each of the segmented pieces of audio data (hereinafter referred to as an ‘audio frame’) …”, [0096]). Therefore, the combination of Bharitkar and Cha teaches the recited limitation because the audio is processed on a frame-by-frame basis, and an audio frame is a segmented audio signal; Cha further teaches an electronic device network system in which performing audio/video/haptics functions may be distributed from one device to another, Fig. 11). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Cha into the system or method of Bharitkar in order to enable an electronic device network system in which performing audio/video/haptics functions may be distributed from one device to another.

The combination of Bharitkar and Cha also teaches other enhanced capabilities. The combination of Bharitkar and Cha does not expressly disclose, but Chaudhuri further teaches: the one or more segmented audio signals are determined based at least in part on an analysis of content in one or more frames of the video content; and (Chaudhuri, Fig. 1; “The speech detector 132 detects speech in the audio portion of a video... the speech detector 132 splits or divides the audio portion into multiple segments and analyzes each segment using a trained model or classifier to determine the likelihood value indicating whether the segment of the audio exhibits speech sounds”, [0039]; i.e., using neural networks to determine which segments of the audio signal contain speech sounds). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Chaudhuri into the modified system or method of Bharitkar and Cha in order to use Neural Networks (NNs) for speech sound detection, for learning complex patterns in noisy, real-world audio. The combination of Bharitkar, Cha and Chaudhuri also teaches other enhanced capabilities. The combination of Bharitkar, Cha and Chaudhuri further teaches: generating a first set of haptic data for the media content, based on evaluating the one or more features of the media content with at least one first machine learning model (Bharitkar, Fig. 9, “A classification module 906 generates predicted labels (i.e., classifications) using a previously trained deep learning model”, [0052]; “The haptic signal for the audio and the haptic signal for the video is then combined at block 908 to generate haptic output”, [0054]; the audio haptic signal and the video haptic signal generated from classification modules 906 and 908, respectively, are based on the extracted audio and video features 902 and 903, respectively).

Regarding claim 3, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination teaches the computer-implemented method of claim 1, wherein the one or more features comprises one or more attributes of one or more frames of the video content (Cha, Fig. 12, RGB sensor 1240H captures color images or color video; the RGB value of the color video frames is an attribute of the color video frames).

Regarding claim 4, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 3, wherein the one or more attributes comprises: at least one of a grayscale, RGB value, contrast, and exposure; or a type of motion associated with an object within a video segment of the video content (Cha, see comment on claim 3).

Regarding claim 5, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 4, wherein: the one or more attributes further comprises one or more properties of the type of motion associated with the object; and (Cha, Fig. 4, motion event detection 430, “motion data”, [0079]) the one or more properties comprises at least one of a spatial direction, an acceleration, or a velocity (Cha, Figs. 1 and 4, “motion data from the camera device 110”, [0079]).

Regarding claims 6 and 14, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination teaches the computer-implemented method of claim 1, further comprising generating a second set of haptic data for the media content, based at least in part on the one or more features of the media content or metadata, wherein the second set of haptic data is different from the first set of haptic data (Cha, “generates the haptic effect (e.g., a tremor, vibration, or rotation effect)”, [0060]).

Regarding claims 7 and 21, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 6, wherein the second set of haptic data comprises a different number of haptic effects than the first set of haptic data (Cha, “generates the haptic effect (e.g., a tremor, vibration, or rotation effect)”, [0060]).

Regarding claim 8, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination teaches the computer-implemented method of claim 1, wherein the first set of haptic data comprises an indication of at least one haptic effect associated with one or more segments of the video content (Cha, “The haptic information may include information about the time information (occurrence time, occurrence timing, duration, or the like), type (e.g., vibration, shake, or compression/decompression), intensity, direction, or the like of the haptic effect”, [0091]).

Regarding claim 9, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 1, wherein the first set of haptic data is generated in real-time while the media content is being recorded by the computing device (Bharitkar, Fig. 7, input audio data and input video data are real-time data, “A feature extraction module 703 receives input video scenes. The video scenes are extracted on a frame-by-frame basis, for example at approximately 29.97 frames per second”, [0048]).

Regarding claim 10, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 1, wherein the first set of haptic data is generated as part of a production pipeline used for generating the media content (Bharitkar, Fig. 2; “the package rendering module 218 encodes the audio/haptics signal from the integration module 216. In some examples, the package rendering module 218 also encodes the video with the audio/haptics signal from the integration module 216 to generate a rendering package. The encoding can be lossy or lossless encoding. The rendering package can be sent to a user device (not shown) to playback the content, including presenting the audio, video, and haptics to the user”, [0036]; “As an example of haptics metadata, the user's device (i.e., a rendering device) parses and interprets the haptic metadata and uses that information to activate haptics devices associated with the user”, [0028]).

Regarding claim 11, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination teaches the computer-implemented method of claim 1, further comprising receiving metadata describing at least one event associated with the media content, wherein the first set of haptic data is generated further based on evaluating the metadata with the at least one first machine learning model (Cha, Fig. 4, “the motion processor 430 may detect a motion event (e.g., a collision event, a shake, or the like) based on the motion data. The motion processor 430 may determine whether the motion event occurs, by calculating how much the motion data coincides with pre-stored motion pattern information. The haptic information generating unit 440 may determine or change the attribute of the haptic event based on the determined motion event”, [0090]).

Regarding claim 12, the combination of Bharitkar, Cha and Chaudhuri teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 11, wherein the metadata comprises information from one or more hardware sensors (Cha, Fig. 4, “a motion sensor (e.g., GPS, acceleration sensor, a gyro sensor, a geomagnetic sensor, a magnetic sensor, or the like)”, [0119]).

Regarding claim 15, Bharitkar teaches a computer-implemented method comprising: obtaining a set of haptic data associated with media content, wherein: the media content comprises one or more audio files, one or more video files, and (Bharitkar, Fig. 3; package rendering module 218 receives audio signals generated from spatial audio authoring module 212, video signals generated from video module 214, and haptics metadata (=> “haptic data”) from video/audio-driven haptics information generation module 210, [0030, 0034, 0035, 0036]; “The video/audio-driven haptics information generation module 210 generates haptics metadata using audio-haptics classification based at least in part on spatial audio associated with a digital environment”, [0030]; note that both audio and video signals are digital multimedia signals (“spatial audio and/or video associated with a digital environment”, [0026]); they normally carry A/V metadata with them in order to render correctly and efficiently; also, obviously, since audio signals, video signals and the associated A/V metadata are in digital format, they are normally organized in the form of multimedia digital files so that they can be easily stored, retrieved and transmitted).

Bharitkar does not expressly disclose, but Cha teaches: ... metadata; (Cha, “The multimedia manager 1343 may recognize a format required for playing various media files and may encode or decode a media file using a codec matched to the format”, [0249]; the media files should include format data for encoding/decoding the audio/video signals, and such format data is the media metadata) transmitting the set of haptic data, the one or more audio files, the one or more video files, and the metadata to a client device (Cha, “The multimedia manager 1343 may recognize a format required for playing various media files and may encode or decode a media file using a codec matched to the format”, [0249]; the media files should include format data for encoding/decoding the audio/video signals, and such format data is the media metadata; Figs. 11 and 13, “A portion or all of operations performed in the electronic device 1101 may be performed in one or more other electronic devices (e.g., the first electronic device 1102, the second external electronic device 1104, and/or the server 1106). ... the electronic device 1101 may request at least a portion of functions related to the function or service from another device (e.g., the first electronic device 1102, the second external electronic device 1104, and/or the server 1106) .... The other electronic device (e.g., the first electronic device 1102, the second external electronic device 1104, or the server 1106) may perform the requested function or additional function, and may transfer a result of the performance to the electronic device 1101. The electronic device 1101 may use a received result itself or additionally process the received result to provide the requested function or service”, [0208]; obviously, when electronic device 1101 requests electronic device 1102 to perform the audio/video/haptics functions, electronic device 1101 needs to send the audio files, the video files and the haptics metadata (=> “haptic data”) to electronic device 1102; the A/V signals are the streaming signals as pointed out by Cha, Fig. 1, “The camera device 110… may stream the collected video data, the collected audio data, … to an external device”, [0046], e.g., to an HMD-type haptic device 150). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Cha into the system or method of Bharitkar in order to enable an electronic device network system in which performing audio/video/haptics functions may be distributed from one device to another. The combination of Bharitkar and Cha also teaches other enhanced capabilities.

The combination of Bharitkar and Cha further teaches: wherein the set of haptic data comprises a plurality of haptic files (Bharitkar, “a haptics signal can be a mono-track haptics signal or a multi-track haptics signal. In the case of a multi-track haptics signal, the signal can have N channels, where N is the number of channels”, [0026]; “The haptics information can include a mono-track haptics signal, a multi-track haptics signal, and/or haptics metadata”, [0018]; indicating multiple alternative haptics-signal forms (mono-track vs. multi-track/N-channel), which correspond to different discrete haptics data instances that can be provided as different haptic files within the overall haptic data set); each haptic file of the plurality of haptic files (i) indicating a respective set of haptic effects to be generated during streaming of the media content and (Bharitkar, Fig. 1; “Haptics metadata describe a desired haptic effect. For example, haptics metadata can describe the presence or absence of a vibration, a directional wind, etc”, [0026]) (ii) being associated with a respective amount of available resources of the client device (Bharitkar, “The encoding can be lossy or lossless encoding. The rendering package can be sent to a user device (not shown) to playback the content, including presenting the audio, video, and haptics to the user”, [0036]; this implies that the haptic files are associated with different amounts of available resources of the client device in two ways: first, by offering "lossy or lossless encoding," which associates the files with varying amounts of computing/bandwidth resources (i.e., lossy for lower resources, lossless for higher); and second, by associating each channel with a specific "haptics device" (e.g., glove, vest; “Haptics signals are analog or digital signals that cause a haptics device (e.g., a haptics-enabled glove, vest, head-mounted display, etc.) to provide haptic feedback to a user associated with the haptics device”, [0026]) available on the client, thereby tailoring the data to the specific hardware resources present on the device).

Regarding claim 16, the combination of Bharitkar and Cha teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 15, wherein transmitting the set of haptic data, the one or more audio files, the one or more video files, and the metadata comprises including the set of haptic data in a transport stream comprising the one or more audio files, the metadata, and the one or more video files (Bharitkar, Fig. 3; “The integration module 216 combines the audio signal and the haptics information, which can then be embedded by the package rendering module 218. In particular, the package rendering module 218 encodes the audio/haptics signal from the integration module 216. In some examples, the package rendering module 218 also encodes the video with the audio/haptics signal from the integration module 216 to generate a rendering package. The encoding can be lossy or lossless encoding. The rendering package can be sent to a user device (not shown) to playback the content, including presenting the audio, video, and haptics to the user”, [0036]).

Regarding claim 17, the combination of Bharitkar and Cha teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 15, further comprising generating a manifest file comprising the set of haptic data, the one or more audio files, the metadata, and the one or more video files, wherein transmitting the set of haptic data, the metadata, the one or more audio files, and the one or more video files comprises transmitting the manifest file (Bharitkar, see comments on claim 16).

Regarding claim 18, the combination of Bharitkar and Cha teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 15, wherein each of the plurality of haptic files is associated with a different type of client device (Bharitkar, “Haptics signals are analog or digital signals that cause a haptics device (e.g., a haptics-enabled glove, vest, head-mounted display, etc.) to provide haptic feedback to a user associated with the haptics device”, [0026]; different devices may have different capabilities supporting different haptic events; thus, it is necessary to correctly associate different haptic data with the different devices with different haptic capabilities).

Regarding claim 19, the combination of Bharitkar and Cha teaches its/their respective base claim(s). The combination further teaches the computer-implemented method of claim 15, wherein each of the plurality of haptic files is associated with a different set of capabilities of a client device (Bharitkar, see comments on claim 18).

Response to Arguments

Applicant's arguments filed on 11/18/2025 with respect to one or more of the pending claims have been fully considered but they are not persuasive. Regarding claims 1 and 13, Applicant, in the remarks, argues that the combination of the cited references fails to teach the newly amended limitations in the claims. The Examiner respectfully disagrees. The office action has been updated to address Applicant's argument with new ground(s) of rejection. See the updated review comments for details.

Regarding claim 15, Applicant, in the remarks, argues that the combination of the cited references fails to teach “each haptic file of the plurality of haptic files... (ii) being associated with a respective amount of available resources of the client device” as recited in claim 15. The Examiner respectfully disagrees. The office action has been updated to address Applicant's argument. See the updated review comments for details.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIANXUN YANG, whose telephone number is (571) 272-9874. The examiner can normally be reached MON-FRI, 8AM-5PM Pacific Time. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Amandeep Saini, can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JIANXUN YANG/
Primary Examiner, Art Unit 2662
1/1/2026
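
For context, the claims and references argued over above describe a frame-based pipeline: segment the audio, extract features, run a trained model, and package device-specific haptic tracks for delivery. The sketch below is a minimal, hypothetical Python illustration of that general shape only; the function names, thresholds, manifest layout, and the rule-based stand-in for a trained classifier are invented here and are not taken from the application or the cited references.

```python
# Hypothetical illustration only: frame-based audio segmentation, feature
# extraction, and a rule-based stand-in for a trained machine learning model.
# Names, thresholds, and the manifest layout are invented for this sketch.

import numpy as np

def segment_audio(samples: np.ndarray, frame_len: int = 1024) -> np.ndarray:
    """Split a mono audio signal into fixed-length frames (segmented audio signals)."""
    n_frames = len(samples) // frame_len
    return samples[: n_frames * frame_len].reshape(n_frames, frame_len)

def extract_features(frames: np.ndarray) -> np.ndarray:
    """Per-frame features; a real system might use a CNN, autoencoder, or LSTM instead."""
    rms = np.sqrt((frames ** 2).mean(axis=1))                       # loudness proxy
    zero_crossings = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)
    return np.stack([rms, zero_crossings], axis=1)

def predict_haptics(features: np.ndarray, rms_threshold: float = 0.3) -> list[dict]:
    """Placeholder for a trained classifier: emit a vibration event for loud frames."""
    events = []
    for i, (rms, _zc) in enumerate(features):
        if rms > rms_threshold:
            events.append({"frame": int(i), "effect": "vibration", "intensity": float(rms)})
    return events

def build_manifest(events: list[dict]) -> dict:
    """Toy manifest: one haptic track per device type, scaled to its capability."""
    return {
        "haptic_tracks": {
            "vest":  [dict(e, intensity=e["intensity"])       for e in events],
            "glove": [dict(e, intensity=e["intensity"] * 0.5) for e in events],
        }
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.normal(0.0, 0.4, 48_000)          # stand-in for one second of audio
    frames = segment_audio(audio)
    manifest = build_manifest(predict_haptics(extract_features(frames)))
    print(len(manifest["haptic_tracks"]["vest"]), "haptic events in the vest track")
```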

Prosecution Timeline

Aug 08, 2022: Application Filed
Sep 13, 2024: Non-Final Rejection — §103
Jan 16, 2025: Response Filed
Mar 22, 2025: Final Rejection — §103
Jun 27, 2025: Request for Continued Examination
Jun 30, 2025: Response after Non-Final Action
Aug 14, 2025: Non-Final Rejection — §103
Nov 18, 2025: Response Filed
Jan 01, 2026: Final Rejection — §103
Mar 11, 2026: Applicant Interview (Telephonic)
Mar 11, 2026: Examiner Interview Summary
Apr 02, 2026: Request for Continued Examination
Apr 06, 2026: Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602917
OBJECT DETECTION DEVICE AND METHOD
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12602853
METHODS AND APPARATUS FOR PET IMAGE RECONSTRUCTION USING MULTI-VIEW HISTO-IMAGES OF ATTENUATION CORRECTION FACTORS
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12590906
X-RAY INSPECTION APPARATUS, X-RAY INSPECTION SYSTEM, AND X-RAY INSPECTION METHOD
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12586223
METHOD FOR RECONSTRUCTING THREE-DIMENSIONAL OBJECT COMBINING STRUCTURED LIGHT AND PHOTOMETRY AND TERMINAL DEVICE
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12586152
METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TRAINING IMAGE PROCESSING MODEL
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 74%
With Interview: 93% (+18.6%)
Median Time to Grant: 2y 9m
PTA Risk: High

Based on 635 resolved cases by this examiner. Grant probability derived from the career allow rate.
