Prosecution Insights
Last updated: April 19, 2026
Application No. 18/315,530

PROCESSING DATA IN A DATA FORMAT WITH KEY-VALUE PAIRS

Non-Final OA: §101, §103, §112
Filed: May 11, 2023
Examiner: QIAN, SHELLY X
Art Unit: 2154
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 5 (Non-Final)
Grant Probability: 37% (At Risk)
OA Rounds: 5-6
To Grant: 3y 11m
With Interview: 57%

Examiner Intelligence

Career Allow Rate: 37% (47 granted / 126 resolved; -17.7% vs TC avg)
Interview Lift: +19.4% (strong; based on resolved cases with interview)
Avg Prosecution: 3y 11m (typical timeline)
Currently Pending: 28
Total Applications: 154 (career history, across all art units)

Statute-Specific Performance

§101: 16.7% (-23.3% vs TC avg)
§103: 64.0% (+24.0% vs TC avg)
§102: 10.6% (-29.4% vs TC avg)
§112: 6.3% (-33.7% vs TC avg)
Based on career data from 126 resolved cases. The Tech Center average is an estimate.
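The deltas above can be cross-checked against each other. A minimal Python sketch, assuming each "vs TC avg" figure is a plain difference in percentage points between the examiner's figure and the Tech Center average (the page does not state this); the name statute_stats and its layout are illustrative only:

```python
# Examiner figure and reported delta vs. the Tech Center (TC) average, per statute.
# Assumption: delta = examiner figure - TC average, in percentage points.
statute_stats = {
    "§101": (16.7, -23.3),
    "§103": (64.0, +24.0),
    "§102": (10.6, -29.4),
    "§112": (6.3, -33.7),
}

# Back out the TC average that each row implies.
implied_tc_avg = {
    statute: round(value - delta, 1)
    for statute, (value, delta) in statute_stats.items()
}

for statute, avg in implied_tc_avg.items():
    print(f"{statute}: implied TC average = {avg}%")
```

Under that assumption, all four rows resolve to the same baseline, which would be consistent with one Tech Center-wide estimate rather than a separate average per statute.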

Office Action

§101 §103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/18/2025 has been entered.

Response to Arguments

Applicant's arguments filed 11/18/2025 have been fully considered but they are not persuasive. Applicant states (pp. 9-11) that none of Xu, Beyer and Yueh teach all claim 1 limitations. However, as explained below, Xu teaches limitations “extracting…” and “reorganizing…”; Beyer teaches limitation “grouping…”; and Yueh teaches limitations “compressing…”, “determining a reused part…” and “arranging…”. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Beyer and Yueh to Xu. One having ordinary skill in the art would have found motivation to utilize the deduplication of Yueh to losslessly compress and minimize data redundancy globally across the nested JSON objects of Beyer representing unstructured documents of Xu.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-2, 4-11, 13-20 are rejected under 35 U.S.C. 112(b) because the term “a large volume of records” in claim 1 is a relative term which renders the claim indefinite.
The term is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The same problem exists in independent claims 10 and 19; and in dependent claims 2, 4-9, 11, 13-18 and 20 by inheritance. Claim Rejections - 35 USC § 101 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. Claims 1-2, 4-11 and 13-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea as a mental process without significantly more. Claims 1, 10 and 19 are rejected under 35 U.S.C. 101. Step 1 Claim 1 recites “A computer-implemented method”. Claim 10 recites “A computer system” comprising computer processors, and thus is not directed to software per se. Claim 19 recites “A computer program product” comprising computer-readable storage media. According to the instant specification (spec. [0019]), a computer readable storage medium is not to be construed as storage in the form of transitory signals per se. Therefore, all three claims are directed to a statutory category. Step 2A – Prong One Claims 1, 10 and 19 recite the following limitations directed to an abstract idea: “extracting all keys from…data objects…, wherein duplicated keys are removed” recites an abstract idea as a mental process. One can mentally observe or evaluate to extract a key simply by identifying a piece of data from a data object. “reorganizing the data…into a format with a key portion followed by a value portion, wherein the extracted keys are arranged…in a predetermined order, and the values…are arranged…in an order…of the keys” recites an abstract idea as a mental process. 
One can mentally evaluate or judge to arrange extracted keys in an order simply by sorting the keys.

“grouping the values corresponding to each of the extracted keys…into a respective value groups…arranged in respective value partitions of the value portion in the order…of the keys” recites an abstract idea as a mental process. One can mentally evaluate or judge to group and partition values sharing the same key, and to sort the resulting groups in the same order as the corresponding keys.

“determining a reused part…wherein the reused part…is replaced by a reference to the reused part” recites an abstract idea as a mental process. One can mentally evaluate or judge to replace copies of a value by references to a single copy of the value.

“compressing data into a reorganized data format” recites an abstract idea as a mental process. One can mentally evaluate or judge to format data in a way to reduce its size in storage.

Step 2A – Prong Two

This judicial exception is not integrated into a practical application. Claims 1, 10 and 19 include additional elements “processor”, “storage media” and "instructions", which are high-level recitations of generic computer components and functions that represent mere instructions to apply on a computer per MPEP §2106.05(f). Viewing the additional limitations together and the claim as a whole, nothing provides integration into a practical application.

Step 2B

The conclusions on the mere instructions to apply the abstract idea using generic computer components and functions carry over and do not add significantly more or provide any "inventive concept". In summary, claims 1, 10 and 19 are not eligible.

Claims 2, 4-9, 11, 13-18 and 20 depend on claims 1, 10 and 19 respectively and recite the same abstract idea.

Step 2A Prong One

The following claims recite additional elements that are mentally performable. Claims 2 and 11 recite sorting value groups in the same order as the corresponding keys.
One can mentally evaluate or judge to sort values in the same order as the keys. Claims 4-5 and 13-14 recite storing single copies of shared values in reused-part partitions. One can mentally evaluate or judge by replacing all duplicate values in reused-part partitions by their corresponding single common values. Claims 6, 15 and 20 recite using a search keyword to match keys in a range of the sorted list of keys. One can mentally evaluate or judge by matching a list of keys to a search keyword. Claims 7 and 16 recite sorting keys in the same order as traversing the leaf nodes of a tree structure. One can mentally evaluate or judge by traversing the leaf nodes of a tree and order the keys encountered accordingly. Step 2A Prong Two Claims 9 and 18 recite additional element “JSON”, which is a high-level recitation of generic computer components and functions. Step 2B Claim 8 recites transmitting data in the reorganized format, which qualifies as “i. Receiving or transmitting data over a network”, and is recognized by the courts as well-understood, routine, and conventional per MPEP §2106.05(d)(II). Claim 17 recites storing data in the reorganized format, which qualifies as “iv. Storing and retrieving information in memory”, and is recognized by the courts as well-understood, routine, and conventional per MPEP §2106.05(d)(II). In summary, these dependent claims do not add any additional elements sufficient to make the claims non-abstract. Therefore, they are not eligible accordingly. Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 
103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-11 and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. US patent application 2022/0309549 [herein “Xu”], in view of Beyer et al. US patent application 2010/0211572 [herein “Beyer”] and Data deduplication vs. data compression. 2022, pp. 1-10. https://www.lytics.com/blog/data-deduplication-vs-data-compression/ [herein “DedupCompress”], and further in view of Yueh, US patent 8,190,835 [herein “Yueh”].

Claim 1 recites “A computer-implemented method for processing data in a data format with key-value pairs, comprising: extracting all keys from a plurality of data objects of the data from a large volume of records, wherein duplicated keys are removed such that only one of a same key remains;” Xu automatically converts unstructured documents (i.e., data objects) into lists of structured key-value pairs (i.e., data format) [0030], leveraging machine-learned detection models to accurately identify (i.e., extract) key-value pairs [0032].

Claim 1 further recites “reorganizing the data in the data format into a format with a key portion followed by a value portion, wherein the extracted keys are arranged in the key portion in a predetermined order, and”.
In Xu, a key detection model identifies a set of keys (i.e., key portion), a value detection model identifies a set of values (i.e., value portion), and a key-value pair is detected by a match score exceeding a threshold [0065]. The detected keys are ordered (i.e., reorganized) by their visual appearance in the document (i.e., predetermined order) [0047], based on which their matching (i.e., corresponding) values can be ordered (i.e., arranged) accordingly. Xu does not disclose limitations “the values of each of the plurality of data objects are arranged in the value portion in an order corresponding to the order of the keys; grouping the values corresponding to each of the extracted keys for the plurality of data objects into a respective value group, wherein the respective value groups are arranged in respective value partitions of the value portion in the order corresponding to the order of the keys;” and “arranging all respective value groups in a value portion in an order corresponding to an order of the keys, such that a consistent order is used for the keys and values; and”. However, Beyer teaches a JSON document as a list of name-value pairs (i.e., key-value pairs), where a value is a JSON object that can be an atom, an array of values, or a nested JSON object (Beyer: [0005]). Each name in a name-value pair is unique (i.e., only one of a same key remains) within the document (Beyer: [0006]), because two name-value pairs (n,v1) and (n,v2) with the same name can be equivalently represented (i.e., grouped) as one name-value pair (n,[v1,v2]), whose value is an array (i.e., partition) of values in the same (i.e., consistent) order as that of the corresponding name-value pairs. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Beyer to Xu. 
One having ordinary skill in the art would have found motivation to adopt the nested JSON object model of Beyer to represent the key-value pairs extracted from unstructured documents by Xu for better indexing and searching.

Xu and Beyer do not disclose limitations “determining a reused part shared by the values of different data objects of the plurality of data objects, wherein the reused part of the values sharing the reused part in the value portion is replaced by a reference to the reused part, wherein the value portion comprises a reused-part partition in which the reused part is arranged in a position corresponding to a reference to the reused part;” and “compressing data into a reorganized data format with the key portion followed by a value portion.”

However, deduplication is a form of lossless compression, where duplicate bits of data are replaced with a pointer (DedupCompress: p. 4/10). Yueh deduplicates (i.e., compresses) redundant data globally across files in a shared storage architecture (Yueh: 2:64-66). Yueh breaks (i.e., partitions) each file (i.e., data object) or other digital sequence (i.e., portion) into blocks (i.e., values) of a fixed or variable size (Yueh: 7:59-62), and hashes each block. A new block is a duplicate of an existing block if they have the same hash, in which case the new block is replaced with a pointer (i.e., reference) to a single (i.e., shared) instance of the block (i.e., reused part) in storage (Yueh: 3:9-21). A hash recipe is used to reconstruct a file (i.e., arrange in position) from the list of hashes of blocks in the file (Yueh: 6:64-7:6).

Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Beyer, DedupCompress and Yueh to Xu.
One having ordinary skill in the art would have found motivation to utilize the deduplication of Yueh to losslessly compress and minimize data redundancy globally across the nested JSON objects of Beyer representing unstructured documents of Xu.

Claims 10 and 19 are analogous to claim 1, and are similarly rejected.

Claim 2 recites “The computer-implemented method of claim 1, wherein the respective value groups are arranged in the value portion in the order corresponding to the order of the keys.” Xu orders (i.e., arranges) the detected (i.e., extracted) keys by their visual appearance in the document (i.e., data object) [0047], based on which their matching (i.e., corresponding) values (i.e., value portion) can be ordered accordingly. Xu and Beyer teach claim 1, where a JSON document in Beyer is a list of name-value pairs, and a value is a JSON object which can be an atom, an array (i.e., group) of values, or a nested JSON object (Beyer: [0005]).

Claim 11 is analogous to claim 2, and is similarly rejected.

Claim 4 recites “The computer-implemented method of claim 1, wherein the reused part is arranged in the reused-part partition as one of a list of reused parts shared by the values of the different data objects, and the reference to the reused part comprises the position of the reused part within the list of the reused-part partition.” Xu, Beyer and Yueh teach claim 1, where Yueh breaks (i.e., partitions) a file (i.e., data object) or other digital sequence into blocks (Yueh: 7:59-62), and hashes each block (i.e., value). A new block is a duplicate of an existing block if they have the same hash, in which case the new block is replaced with a pointer (i.e., reference) to a single (i.e., shared) instance of the block (i.e., reused part) in storage (Yueh: 3:9-21). A hash recipe is used to reconstruct (i.e., arrange in position) a file from the list of hashes of blocks in the file (Yueh: 6:64-7:6).

Claim 13 is analogous to claim 4, and is similarly rejected.
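The hash-based deduplication that the action attributes to Yueh, combined with the position-based reference recited in claim 4, can be illustrated with a short Python sketch. This is not Yueh's or the applicant's implementation. The function names, the use of SHA-256, the treatment of each whole value as one block, and the ("ref", position) / ("lit", value) encoding are all assumptions made for illustration.

```python
import hashlib

def _digest(value):
    # Content hash used to detect identical values (SHA-256 is an assumption).
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def dedupe_values(values):
    """Return (reused_parts, encoded): shared values are stored once and
    replaced by ("ref", position in reused_parts); others stay literal."""
    counts = {}
    for value in values:
        counts[_digest(value)] = counts.get(_digest(value), 0) + 1

    reused_parts = []   # stand-in for the "reused-part partition"
    position_of = {}    # hash -> position within reused_parts
    encoded = []
    for value in values:
        h = _digest(value)
        if counts[h] > 1:
            if h not in position_of:
                position_of[h] = len(reused_parts)
                reused_parts.append(value)
            encoded.append(("ref", position_of[h]))
        else:
            encoded.append(("lit", value))
    return reused_parts, encoded

def restore_values(reused_parts, encoded):
    """Invert dedupe_values by following each reference back to its value."""
    return [reused_parts[item] if kind == "ref" else item for kind, item in encoded]

values = ["New York", "Boston", "New York", "Austin", "Boston"]
reused_parts, encoded = dedupe_values(values)
restored = restore_values(reused_parts, encoded)
```

One difference from the Yueh description is worth noting: Yueh, as summarized in the action, replaces a new duplicate block with a pointer to an instance already in storage, whereas this sketch moves every shared value into a separate list and references it by position, which is closer to the wording of claim 4.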
Claim 5 recites “The computer-implemented method of claim 1, wherein the value partitions of the value portion following the reused-part partition, wherein the reused part of the values sharing the reused part in a respective value partition is replaced by the reference to the reused part.” Xu, Beyer and Yueh teach claim 1, where Yueh breaks (i.e., partitions) a file or other digital sequence (i.e., value portion) into blocks (Yueh: 7:59-62), and hashes each block (i.e., value). A new block is a duplicate of an existing block if they have the same hash, in which case the new block is replaced with a pointer (i.e., reference) to a single (i.e., shared) instance of the block (i.e., reused part) in storage (Yueh: 3:9-21).

Claim 14 is analogous to claim 5, and is similarly rejected.

Claim 6 recites “The computer-implemented method of claim 1, further comprising: identifying one of the keys of the plurality of data objects of the data based on a search keyword; determining a search range within the value portion corresponding to the identified key based on the order of the keys; and”. Xu orders the detected keys by their visual appearance in the document (i.e., data object) [0047], based on which their matching values (i.e., value portion) are ordered accordingly. Xu and Beyer teach claim 1, where a JSON document of Beyer is a list of name-value pairs, and a value is a JSON object which can be an atom, an array of values, or a nested JSON object (Beyer: [0005]).

Xu does not disclose limitation “searching for the search keyword within the search range corresponding to the identified key.” However, Beyer indexes JSON documents in two separate files, one for document IDs and the other for payloads (Beyer: [0034]). Using the parse tree of a search query (i.e., keyword), Beyer searches the first file to identify candidate documents (i.e., keys) that may match the query.
Beyer then searches the second file (i.e., search range) to check if those candidates match the query exactly (Beyer: [0039]). Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Beyer to Xu. One having ordinary skill in the art would have found motivation to adopt the nested JSON object model of Beyer to represent the key-value pairs of Xu extracted from unstructured documents for better indexing and searching. Claims 15 and 20 are analogous to claim 6, and are similarly rejected. Claim 7 recites “The computer-implemented method of claim 1, wherein the predetermined order for the extracted keys is an order of traversing leaf nodes of a tree structure of the data in the data format with key-value pairs.” Xu orders the detected (i.e., extracted) keys by their visual appearance in the document (i.e., predetermined order) [0047], based on which their matching values are ordered accordingly. Xu and Beyer teach claim 1, where a JSON document (i.e., data format) of Beyer is a list of name-value pairs (i.e., key-value pairs), and a value is a JSON object which can be an atom, an array of values, or a nested JSON object (Beyer: [0005]). Xu does not disclose this claim; however, Beyer indexes JSON documents in two separate files, one for document IDs and the other for payloads (Beyer: [0034]). Using the parse tree of a search query, Beyer searches the first file to identify candidate documents that may match the query. Beyer then searches the second file to check if those candidates match the query exactly (Beyer: [0039]), by recursively traversing the tree to atom (i.e., leaf) nodes (Beyer: [0048]). Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Beyer to Xu. 
One having ordinary skill in the art would have found motivation to adopt the nested JSON object model of Beyer to represent the key-value pairs of Xu extracted from unstructured documents for better indexing and searching. Claim 16 is analogous to claim 7, and is similarly rejected. Claim 8 recites “The computer-implemented method of claim 1, further comprising: transmitting the data in the reorganized format.” After identifying key-value pairs (i.e., reorganized format) from documents, Xu outputs (i.e., transmits) the identified key-value pairs [0063] using input/output devices [0081]. Claim 9 recites “The computer-implemented method of claim 1, wherein the data format comprises JavaScript Object Notation (JSON)”. Xu and Beyer teach claim 1, where a JSON document (i.e., data format) of Beyer is a list of name-value pairs, where a value is a JSON object which can be an atom, an array of values, or a nested JSON object (Beyer: [0005]). Claim 18 is analogous to claim 9, and is similarly rejected. Claim 17 recites “The computer system of claim 10, further comprising: storing the data in the reorganized format.” Xu stores data defining the identified key-value pairs (i.e., reorganized format) in a database accessible to users [0063]. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. For example, Storer et al. Secure data deduplication. StorageSS’08, pp. 1-10. Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHELLY X. QIAN whose telephone number is (408)918-7599. The examiner can normally be reached Monday - Friday 8-5 PT. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Boris Gorney can be reached at (571)270-5626. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /BORIS GORNEY/Supervisory Patent Examiner, Art Unit 2154 /SHELLY X QIAN/Examiner, Art Unit 2154
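For readers unfamiliar with the claimed data layout, the reorganization quoted from claim 1 (a key portion followed by a value portion of per-key value groups) and the key-scoped search quoted from claim 6 can be sketched in Python. This is an illustrative reading of the claim language as quoted in the action, not the applicant's implementation. It assumes flat records without nesting, uses sorted order as a stand-in for the "predetermined order", and stores None where a record lacks a key. All function names are hypothetical.

```python
import bisect

def reorganize(records):
    """Split flat key-value records into a key portion and a value portion."""
    # Key portion: every key exactly once, in sorted order (assumed ordering).
    keys = sorted({key for record in records for key in record})
    # Value portion: one value group per key, in the same order as the keys.
    # A record that lacks a key contributes None (an assumption of this sketch).
    value_groups = [[record.get(key) for record in records] for key in keys]
    return keys, value_groups

def restore(keys, value_groups, count):
    """Rebuild the original records from the two portions."""
    return [
        {key: group[i] for key, group in zip(keys, value_groups) if group[i] is not None}
        for i in range(count)
    ]

def search_key(keys, value_groups, key, keyword):
    """Return record indexes whose value for `key` equals `keyword`,
    scanning only the value group that belongs to that key."""
    idx = bisect.bisect_left(keys, key)  # keys are sorted, so binary search works
    if idx == len(keys) or keys[idx] != key:
        return []
    return [i for i, value in enumerate(value_groups[idx]) if value == keyword]

records = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "Bob", "city": "Oslo"},
    {"name": "Cy", "age": 41},
]
keys, value_groups = reorganize(records)
matches = search_key(keys, value_groups, "city", "Oslo")
```

Because the value groups follow the key order, locating a key also locates the only part of the value portion that needs scanning. Using None as the missing-key marker means a record whose real value is None cannot be told apart from one that lacks the key, so a production format would need an explicit marker.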

Prosecution Timeline

May 11, 2023: Application Filed
May 05, 2024: Non-Final Rejection (§101, §103, §112)
Aug 06, 2024: Response Filed
Nov 12, 2024: Final Rejection (§101, §103, §112)
Jan 16, 2025: Response after Non-Final Action
Feb 20, 2025: Request for Continued Examination
Feb 27, 2025: Response after Non-Final Action
May 20, 2025: Non-Final Rejection (§101, §103, §112)
Aug 14, 2025: Interview Requested
Aug 20, 2025: Examiner Interview Summary
Aug 20, 2025: Applicant Interview (Telephonic)
Aug 21, 2025: Response Filed
Sep 16, 2025: Final Rejection (§101, §103, §112)
Oct 26, 2025: Interview Requested
Nov 18, 2025: Response after Non-Final Action
Dec 18, 2025: Request for Continued Examination
Jan 06, 2026: Response after Non-Final Action
Feb 20, 2026: Non-Final Rejection (§101, §103, §112) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12578892: FINGERPRINT TRACKING STRUCTURE FOR STORAGE SYSTEM (2y 5m to grant; granted Mar 17, 2026)
Patent 12475044: Method And System For Estimating Garbage Collection Suspension Contributions Of Individual Allocation Sites (2y 5m to grant; granted Nov 18, 2025)
Patent 12450197: BACKGROUND DATASET MAINTENANCE (2y 5m to grant; granted Oct 21, 2025)
Patent 12386904: SYSTEMS AND METHODS FOR MEASURING COLLECTED CONTENT SIGNIFICANCE (2y 5m to grant; granted Aug 12, 2025)
Patent 12314225: CONTINUOUS INGESTION OF CUSTOM FILE FORMATS (2y 5m to grant; granted May 27, 2025)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 37%
With Interview: 57% (+19.4%)
Median Time to Grant: 3y 11m
PTA Risk: High
Based on 126 resolved cases by this examiner. Grant probability derived from career allow rate.
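The headline figures can be reproduced from the counts reported on this page. A minimal Python sketch, assuming the interview lift is added to the allow rate in percentage points (the page does not spell out the formula); the variable names are illustrative:

```python
granted = 47
resolved = 126
total_applications = 154
interview_lift_points = 19.4  # assumed to be additive percentage points

# Career allow rate: granted cases as a share of resolved cases.
allow_rate = 100 * granted / resolved

# Projected grant probability with an interview, under the additive assumption.
with_interview = allow_rate + interview_lift_points

# Applications not yet resolved.
pending = total_applications - resolved

print(f"Allow rate: {allow_rate:.1f}% (displayed as {round(allow_rate)}%)")
print(f"With interview: {with_interview:.1f}% (displayed as {round(with_interview)}%)")
print(f"Pending: {pending}")
```

The rounded results match the 37% and 57% shown above, and the pending count matches the 28 applications listed as currently pending.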
