
Title: Detecting Database File Tampering through Page Carving
Database Management Systems (DBMSes) secure data against regular users through defensive mechanisms such as access control, and against privileged users with detection mechanisms such as audit logging. Interestingly, these security mechanisms are built into the DBMS and are thus only useful for monitoring or stopping operations that are executed through the DBMS API. Any access that involves directly modifying database files (at file system level) would, by definition, bypass any and all security layers built into the DBMS itself. In this paper, we propose and evaluate an approach that detects direct modifications to database files that have already bypassed the DBMS and its internal security mechanisms. Our approach applies forensic analysis to first validate database indexes and then compares index state with data in the DBMS tables. We show that indexes are much more difficult to modify and can be further fortified with hashing. Our approach supports most relational DBMSes by leveraging index structures that are already built into the system to detect database storage tampering that would currently remain undetectable.
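The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the forensic carving has already produced two in-memory structures: the table records keyed by row pointer, and the index entries as (value, row pointer) pairs. The function names `index_digest` and `find_mismatches` and the data layout are hypothetical and are not taken from the paper; the sketch only shows the general idea of checking index state against table state and fingerprinting the index with a hash.

```python
import hashlib
from collections import Counter


def index_digest(index_entries):
    """Return a SHA-256 digest over the sorted (value, row_pointer) pairs.

    Storing this digest somewhere safe lets a later check notice that the
    index itself was altered (hypothetical fortification scheme).
    """
    h = hashlib.sha256()
    for value, pointer in sorted(index_entries):
        h.update(repr((value, pointer)).encode("utf-8"))
    return h.hexdigest()


def find_mismatches(table_rows, index_entries):
    """Compare carved table records with carved index entries.

    table_rows:    dict mapping row_pointer -> indexed column value
    index_entries: list of (value, row_pointer) pairs read from the index
    """
    table_pairs = Counter((value, ptr) for ptr, value in table_rows.items())
    index_pairs = Counter(index_entries)
    return {
        # Present in the table but unknown to the index: possible direct
        # insert or overwrite in the table file.
        "in_table_not_index": sorted((table_pairs - index_pairs).elements()),
        # Present in the index but missing from the table: possible direct
        # delete or overwrite in the table file.
        "in_index_not_table": sorted((index_pairs - table_pairs).elements()),
    }


if __name__ == "__main__":
    # Made-up example: row 3 was changed in the table file, so the index
    # still holds the old value.
    table = {1: "alice", 2: "bob", 3: "mallory"}
    index = [("alice", 1), ("bob", 2), ("carol", 3)]
    print(find_mismatches(table, index))
    print(index_digest(index))
```

A value that was overwritten directly in the table file shows up on both sides of the report: the new value has no index entry, and the old index entry has no matching record.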
Award ID(s):
1656268
NSF-PAR ID:
10057816
Journal Name:
21st International Conference on Extending Database Technology
Page Range or eLocation-ID:
121-132
Sponsoring Org:
National Science Foundation
More Like this
  2. Stefanidis, K. ; Golab, L. (Ed.)
    Secondary indexes in relational database systems are traditionally built under the assumption that one data record maps to one indexed value. Nowadays, particularly in NoSQL systems, single data records can hold collections of values that users want to access efficiently in an ad-hoc manner. Multi-valued indexes aim to give users the best of both worlds: (i) to keep a more natural data model of records with collections of values, and (ii) to reap the benefits of a secondary index. In this paper, we detail the steps taken to realize multi-valued indexes in AsterixDB, a Big Data management system with a structured query language operating over a collection of documents. This includes (a) creating the specification language for such indexes, (b) illustrating data flows for bulk-loading and maintaining an index, and (c) discussing query plans to take advantage of multi-valued indexes for use in predicates with existential and universal quantification. We conclude with experiments that compare AsterixDB multi-valued indexes against similar indexes in MongoDB and Couchbase Query.
  3. The contents of RAM in an operating system (OS) are a critical source of evidence for malware detection or system performance profiling. Digital forensics has focused on reconstructing OS RAM structures to detect malware patterns at runtime. In an ongoing arms race, these RAM reconstruction approaches must be designed for the attack they are trying to detect. Even though database management systems (DBMSes) are collectively responsible for storing and processing most data in organizations, the equivalent problem of memory reconstruction has not been considered for DBMS-managed RAM. In this paper, we propose and evaluate a systematic approach to reverse engineer data structures and access patterns in DBMS RAM. Rather than develop a solution for specific scenarios, we describe an approach to detect and track any RAM area in a DBMS. We evaluate our approach with the four most common RAM areas in well-known DBMSes; this paper describes the design of each area-specific query workload and the process to capture and quantify that area at runtime. We further evaluate our approach by observing the RAM data flow in the presence of built-in DBMS encryption. We present an overview of available DBMS encryption mechanisms, their relative advantages and disadvantages, and then illustrate the practical implications for the four memory areas.
  4. Security investigations often rely on forensic tools to deliver the necessary supporting evidence. It is therefore imperative that forensic tools are scientifically tested for both accuracy and capabilities. The primary means to develop and validate forensic tools is by evaluating them against a set of known answers (i.e., a data corpus). While researchers have long recognized the need for standardized forensic corpora, there are few such tools or datasets available, particularly for database management systems (DBMSes). In fact, there are currently no publicly available tools that can generate a DBMS dataset for forensic testing. In this paper, we share SysGen, a customizable data generator and a pre-built corpus that offers a reference for most major relational DBMSes. The pre-built corpus includes individual DBMS files, the full disk snapshot, the RAM snapshot, and network packets taken from a set of clean virtual machines. SysGen can be easily adapted to execute a custom workload scenario, capturing a new data corpus; it can also create other variations of full system snapshots, even beyond DBMS testing.
  5. Abstract
    Binder is a publicly accessible online service for executing interactive notebooks based on Git repositories. Binder dynamically builds and deploys containers following a recipe stored in the repository, then gives the user a browser-based notebook interface. The Binder group periodically releases a log of container launches from the public Binder service. Archives of launch records are available here. These records do not include identifiable information like IP addresses, but do give the source repo being launched along with some other metadata. The main content of this dataset is in the binder.sqlite file. This SQLite database includes launch records from 2018-11-03 to 2021-06-06 in the events table, which has the following schema.
    CREATE TABLE events( version INTEGER, timestamp TEXT, provider TEXT, spec TEXT, origin TEXT, ref TEXT, guessed_ref TEXT );
    CREATE INDEX idx_timestamp ON events(timestamp);
    - version indicates the version of the record as assigned by Binder. The origin field became available with version 3, and the ref field with version 4. Older records where this information was not recorded will have the corresponding fields set to null.
    - timestamp is the ISO timestamp of the launch
    - provider gives the type of source repo being launched ("GitHub" is by far the most common). The rest of the…
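
The events schema quoted in the last entry above can be exercised with Python's built-in sqlite3 module. The sketch below is illustrative only: it builds an in-memory database with that schema and a few made-up rows (the repository names, origins, and refs are invented, not taken from the dataset), then counts launches per provider within a timestamp range. The helper names `demo_connection` and `launches_per_provider` are hypothetical. Against the real dataset, one would presumably open binder.sqlite with `sqlite3.connect("binder.sqlite")` instead of calling `demo_connection()`.

```python
import sqlite3

# Schema as quoted in the dataset description.
SCHEMA = """
CREATE TABLE events(
    version INTEGER, timestamp TEXT, provider TEXT, spec TEXT,
    origin TEXT, ref TEXT, guessed_ref TEXT
);
CREATE INDEX idx_timestamp ON events(timestamp);
"""

# Invented sample rows for illustration; older record versions leave
# origin and ref as NULL, as the description states.
SAMPLE_ROWS = [
    (1, "2019-05-01T10:00:00+00:00", "GitHub", "example/old/master", None, None, None),
    (3, "2020-02-01T09:30:00+00:00", "GitHub", "example/a/master", "origin.example", None, None),
    (4, "2020-06-15T12:00:00+00:00", "GitHub", "example/b/main", "origin.example", "abc123", None),
    (4, "2020-07-01T08:00:00+00:00", "GitLab", "example/c/main", "origin.example", "def456", None),
]


def demo_connection():
    """Create an in-memory database with the events schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    conn.commit()
    return conn


def launches_per_provider(conn, start, end):
    """Count launches per provider with start <= timestamp < end.

    ISO timestamps sort lexicographically, so a plain text range
    comparison works and can use idx_timestamp.
    """
    query = (
        "SELECT provider, COUNT(*) FROM events "
        "WHERE timestamp >= ? AND timestamp < ? "
        "GROUP BY provider ORDER BY COUNT(*) DESC, provider"
    )
    return conn.execute(query, (start, end)).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    for provider, count in launches_per_provider(conn, "2020-01-01", "2021-01-01"):
        print(provider, count)
```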