Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Interpretation
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Specification
The amendment filed 12/17/25 is objected to under 35 U.S.C. 132(a) because it introduces new matter into the disclosure. 35 U.S.C. 132(a) states that no amendment shall introduce new matter into the disclosure of the invention. The added material which is not supported by the original disclosure is as follows: the newly amended claims recite a device, method, etc. operative to determine sounds that match a narrative subsection of an ebook upon which a user is currently focused. The specification as filed does not discuss the manner in which sounds are matched to a particular narrative subsection, as distinct from a non-narrative subsection.
Applicant is required to cancel the new matter in the reply to this Office Action.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Applicant’s claim amendments filed 12/17/25 do not suffice to obviate the rejection under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claims 1, 10, and 19 are directed to a system, method, etc. for generating a song or music based on a determination of portions of content and analysis of said determined portions, such as by using a large language model. The claims rely on well-understood, routine, and conventional structures such as a processor, memory, data structure, etc. to instruct the system along methods by which a piece of content is analyzed and the data generated thereby is used to instruct a model to parameterize a composition based on the generated data. The claims are considered a manner by which data resolves more data, in this case a data-driven or data-informed analysis of specific data and the generation of musical data based thereon. The claims are also considered a stand-in for human behavior, as the claimed steps are substantially similar to the manner in which a human being might ask for music, sounds, effects, etc., such as by providing a request for whistling while a text is read, a request for a second human to sing a particular portion of dialog, etc. As such, the claims cannot be considered to integrate the judicial exceptions of an abstract idea, such as data per se or programs per se, nor the judicial exception of human activity and/or mental processes, such as operations performed in the human mind, human activity, human behavior, etc., as the claims do not include substantially more than the performance of such exceptions upon a computer claimed at a high level of generality and based on models intended to mimic or replicate human cognitive processes. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Dependent claims 4, 6-9, 13, and 16-18 do not remedy the deficiency and are similarly rejected, as the claims further address additional subject matter which may be seen as the generation of data from data, a stand-in for human behavior, and/or human application of agency in concert with assistive instructions, mathematical concepts, AI models, etc.
Examiner appreciates Applicant’s arguments; however, claims 1, 4, 6-10, 13, and 16-19 remain rejected as reciting an abstract idea, such as in the form of a mental process or stand-in for human behavior, e.g., imagining music which might coincide with the reading of a book aloud, or asking a second human “please generate music to go along with my reading of this book aloud.” While the claims nominally recite an ebook reader, the recitations are essentially without limit and can comprise operation upon virtually any device, essentially amounting to applying the claim to a generic computing product; the recited LXM and AI models are additionally claimed at a high level of generality, and as such the claims cannot be considered to integrate the exception into a practical application. Further, the claims cannot be considered to recite significantly more, as the generic components perform their basic functions, such as text processing using a computer in concert with a model operative to generate music; this cannot be considered significantly more, as these concepts are considered a routine use of coded instructions retrieved from memory and executed by a processor. Applicant should claim the underlying operations which direct the computer along the process of producing real-time auditory output alongside the tracked reading of an ebook, such as in claims 5, 14, and 15, which are considered to remedy the 35 U.S.C. 101 rejection.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1, 4-10, and 13-19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claims 1, 10, and 19 recite a device, method, etc. operative to determine sounds that match a narrative subsection of an ebook upon which a user is currently focused. The specification as filed does not discuss the manner in which sounds are matched to a narrative subsection, as distinct from a non-narrative subsection. Claims 4-9 and 13-18 do not remedy the deficiency and are similarly rejected.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 4-10, and 13-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention. Claims 1, 10, and 19 recite a device, method, etc. operative to determine sounds that match a narrative subsection of an ebook upon which a user is currently focused. The determination of a match to a narrative subsection renders the claims indefinite. Presuming that determining sounds that match a narrative subsection is synonymous with matching a portion of content generally, as an ebook generally comprises narration, the scope is unclear, as all of the book must be considered in that regard. In the alternative, if “narrative subsection” delineates a particular type of subsection, then the specification may be found to lack disclosure of the structural or conceptual underpinnings which operate to qualify a subsection as narrative, in which case the objection to the specification and the 112, first paragraph, rejection supersede. Claims 4-9 and 13-18 do not remedy the deficiency and are similarly rejected.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4-10, and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Levy (US 2023/0134451, hereinafter Lev), in view of Do (US 2015/0269133), and further in view of Xq: “How to Generate Real Music Using ChatGPT?” (provided by Applicant in the IDS filed 5/12/25; copyright 3/25/23; hereinafter Xq).
Regarding claim 1
Lev teaches:
A computing device, comprising:
a memory (Lev: ¶ 20; Fig 2: such as a medium storing instructions); a display configured to display an electronic book (eBook) (Lev: Abstract; ¶ 4, 22, etc.: text displayed on a display device); a sound-producing component (Lev: ¶ 4, 16, Fig 1: system generates music, sound etc. for output such as from an audio output device); and at least one processor coupled to the memory, the display and the sound-producing component (Lev: ¶ 17, Fig 1: such as a processor operative upon the display device), and configured to:
generate an audible response based on context information, user profile information, and narrative elements of a section of an eBook (Lev: ¶ 19, 23-28; Fig 3-5: system utilizes artificial intelligence to identify narrative elements of sections, subsections etc. of a text such as words, phrases, etc.; context information thereof such as descriptions of scenes, occurrences, etc.; and user information such as determined from a database and dependent upon detected, stored, etc. user information such as reading speed, etc., user information of this sort persisted in the disclosed manner is considered a user profile; the device employs same such as for generating sensory augmentation information such as an audio, musical, etc. said information generated based on predicting an emotional response of a user such as with respect to the recited elements);
determine context-appropriate sounds that match a subsection of the eBook on which a reader is currently focused based on the received responses (Lev: ¶ 23-25; Figs 3-5: upon the reader arriving at a particular portion of the text, a sensory augmentation routine is selected based on a predicted emotional response, determined based on stored sensory augmentation routines appropriate for the particular portion of the text); and
output the determined context-appropriate sounds on the sound-producing component.
Lev does not explicitly teach the device, method, etc. operative to generate a generative artificial intelligence model (LXM) prompt based on context information, user profile information, and narrative elements of a section of an eBook;
apply the generated LXM prompt to a local or remote LXM to receive an LXM response;
and determine, by using a sound-generating artificial intelligence (AI) model, context-appropriate sounds that match [[for]] a narrative subsection of the eBook on which a reader is currently focused based on the received LXM responses.
In a related field of endeavor, Do teaches an ebook device, comprising: a memory; a display configured to display text; a sound-producing component; and at least one processor coupled to the memory, the display and the sound-producing component (Do: Abstract; ¶ 2, 14; Fig 1: a processor-implemented method for generating audio to be played over a reading duration of the ebook, operative such as on a computing device comprising a processor, memory, etc.; said device comprising a display for at least text and/or a user interface, and an output interface for output of audio, such that the ebook text is displayed on the screen and augmented with output music), and configured to:
generate a prompt based on intelligent mood analysis of context information, user profile information, and narrative elements of a section of an eBook (Do: ¶ 15, 16, 18-20; Fig 3-5: system generates a prompt in the form of a tuple comprising context information based on intelligent mood analysis of subsections of an ebook corresponding to start and end pages and lines; said tuple comprising context in the form of genre and type, and narrative elements in the form of objects involved, and operative in concert with user profile information including user preferences operable to modify or add features to the output; the tuples operate as a prompt by which the system augments the reading with music for output based on a reading location);
generate model parameters based on context information and narrative elements of a section of an eBook (Do: ¶ 16; Figs 3-5: system develops a table of model parameters based on determined parameters);
apply the generated parameters to determine, by using a sound-generating model, context-appropriate sounds that match a narrative subsection of the eBook on which a reader is currently focused (Do: ¶ 15-20; Figs 2-5: determination of context information corresponding to the portion of the book includes determining genre, type, and object parameters associated therewith and using same to determine music to output coincident with the reading; said type parameters including narrative and non-narrative parameters, such as for subsections comprising narrative elements such as pursuit, action, etc. and subsections comprising non-narrative elements such as atmosphere, happy, sad, such that the parameter may resolve an appropriate mood for the audio), and output the determined context-appropriate sounds on the sound-producing component (id.: such as to augment the ebook with audio, music, etc. based on context, user profile, narrative, etc. parameters). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Do taught or suggested input parameters to a model such as that of Lev for determining output audio matching subsections of an ebook, such that a mood of the music changes based on narrative and/or non-narrative subsections, elements, etc. thereof, for at least the purpose of more appropriately outputting audio in concert with an appropriate mood; one of ordinary skill in the art would have expected only predictable results therefrom.
Lev in view of Do does not explicitly teach the system, method, etc. operative to generate a generative artificial intelligence model (LXM) prompt based on determined parameters relevant to a section of an eBook, such as based on the Lev in view of Do context, user profile, and narrative element information, to thereby apply the generated LXM prompt to a local or remote LXM to receive an LXM response and determine context-appropriate audio based on the LXM response.
In a related field of endeavor, Xq teaches an LLM system for generation of audio to match a text, such as by generation of a user prompt, the system comprising a two-stage pipeline for generating the audio by prompting a generative model such as an LLM (Xq: pp 4-12: such as prompts 1-5 on those pages) to generate a structured output (id.) and pass the structured output to a second model (Xq: p 4: such as a plugin or other framework addressed in the prompt, for which the response is formatted). The pipeline inherently comprises the claimed computing processing and memory components and at least strongly suggests the recited display and sound output components (Xq: generally: such as a system operative to receive, display, etc. an input prompt; communicate same to a first, second, etc. model; and receive and output data based thereon). Xq is thus operative to generate a generative artificial intelligence model (LXM) prompt based on a textual input (Xq: pp 4-12: the prompts); apply the generated LXM prompt to a local or remote LXM to receive an LXM response (Xq: pp 4-12: such as by output of the prompts to a ChatGPT model); determine, by using a sound-generating artificial intelligence (AI) model, appropriate sounds that match the received LXM responses (Xq: pp 4-12: such as output by the ChatGPT model of structured data formatted for the second model, said second model operative to utilize the structured data to generate sound); and output the determined context-appropriate sounds on the sound-producing component (id.: such as by execution of the play command).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to improve the Lev in view of Do system and method, such as by utilizing a two-stage pipeline comprising a generative artificial intelligence model such as ChatGPT and a music generation model as taught or suggested by Xq, for at least the purpose of optimizing sensory augmentation routines appropriate to portions of text and the context, user profile, and narrative data thereof, to thereby audibly confer a mood of a book subsection based on parameters thereof and upon a user reading said subsection; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 4
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 1, wherein the at least one processor is configured to determine context-appropriate sounds for the subsection of the eBook on which the reader is currently focused based on the received LXM response by determining the context-appropriate sounds based on character analysis information included in the received LXM response, wherein the character analysis information characterizes at least one of an age, gender, or personality of a character in the subsection of the eBook on which the reader is currently focused (Lev: ¶ 22: system determines music based on a context of a character, personality, etc. such as descriptive of the character in a perilous or thrilling context, within a determined section of the text; analysis of a character in context is considered to reify personality). The claim is considered obvious over Lev as modified by Do and Xq as addressed in the base claim as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 5
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 1, further comprising an eye-tracking sensor or a gaze-tracking sensor, wherein the at least one processor is configured to determine the subsection of the eBook on which the reader is focused by using at least one or more of the eye-tracking sensor or the gaze-tracking sensor (Lev: ¶ 24: system utilizes an image capture device as a gaze-tracking sensor for the purpose of determining a portion of a text currently being read, for output of augmentations, and to predictively determine augmentation information to output with respect to upcoming portions). The claim is considered obvious over Lev as modified by Do and Xq as addressed in the base claim, as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 6
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 4, wherein the at least one processor is further configured to adjust the context-appropriate sounds in response to determining that the reader is focused on a dialogue-focused passage identified in the received LXM response. While Lev in view of Xq determines text and context of passages in a text for the determination and generation of audio with respect thereto, neither Lev nor Xq explicitly discusses the determination of a dialog portion of a text. Examiner has taken official notice, which Applicant has failed to timely and explicitly traverse, and it is thus accepted as Admitted Prior Art (APA; please see MPEP 2144.03), that determining textual metadata of portions of a text, such as a dialog portion, would have comprised an obvious inclusion for at least the purpose of generating audio with respect to the presence of dialog and appropriate to the dialog, such as based on analysis of at least the text, context, narrative elements, etc. of the portion of the text; one of ordinary skill in the art would have expected only predictable results therefrom. The claim is thus considered obvious over Lev as modified by Do and Xq as addressed in the base claim, as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 7
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 1, wherein the at least one processor is configured to determine the subsection of the eBook on which the reader is currently focused by using historical data to determine a reading pace, time estimates for sound effects, and transition points in a soundscape based on the determined reading pace (Lev: ¶ 23-28; Figs 3-5: system determines reading speed, pace etc. to generate time estimates for insertions of timewise portions of music such as to transition among such in a manner to flow with the text such as based on the user reading speed, pace, etc.). The claim is considered obvious over Lev as modified by Do and Xq as addressed in the base claim as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 8
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 1, wherein the at least one processor is further configured to determine music for one or more subsections in the section of the eBook based on the received LXM response (Lev: ¶ 23-28; Figs 3-5); (Xq: pp 4-12). The claim is considered obvious over Lev as modified by Do and Xq as addressed in the base claim as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 9
Lev in view of Do in view of Xq teaches or suggests:
The computing device of claim 8, wherein the at least one processor is further configured to reduce or halt the music during an intense dialogue or transition sounds to match narrative shifts identified in the received LXM response (please see claims 6, 7 supra; the claim is considered to recite substantially similar subject matter and is similarly rejected). The claim is considered obvious over Lev as modified by Do and Xq as addressed in the base claim as it would have been obvious to apply the further teaching of Lev, Do, and/or Xq to the modified device of Lev, Do, and Xq; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claims 10 and 19—the claims are considered to recite substantially similar subject matter to that of claim 1 supra and are similarly rejected.
Regarding claim 11—the claim is considered to recite substantially similar subject matter to that of claim 2 supra and is similarly rejected.
Regarding claims 12 and 20—the claims are considered to recite substantially similar subject matter to that of claim 3 supra and are similarly rejected.
Regarding claim 13—the claim is considered to recite substantially similar subject matter to that of claim 4 supra and is similarly rejected.
Regarding claim 14—the claim is considered to recite substantially similar subject matter to that of claim 5 supra and is similarly rejected.
Regarding claim 15—the claim is considered to recite substantially similar subject matter to that of claim 6 supra and is similarly rejected.
Regarding claim 16—the claim is considered to recite substantially similar subject matter to that of claim 7 supra and is similarly rejected.
Regarding claim 17—the claim is considered to recite substantially similar subject matter to that of claim 8 supra and is similarly rejected.
Regarding claim 18—the claim is considered to recite substantially similar subject matter to that of claim 9 supra and is similarly rejected.
Response to Arguments
Applicant’s arguments in concert with claim amendments, see Remarks and Claims, filed 12/17/25, with respect to the rejection(s) of claim(s) 1-20 under 35 USC 103 over Levy and Xq have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Levy, Do and Xq. Applicant’s remaining arguments have been addressed under the relevant headings supra.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 7:30-6:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CAROLYN EDWARDS can be reached at (571) 270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAUL C MCCORD/Primary Examiner, Art Unit 2692