WO2015187187A1 - Journal events in a file system and a database - Google Patents
- Publication number
- WO2015187187A1 (PCT/US2014/048005)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- metadata
- database
- file
- value
- update
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
Definitions
- FIG. 1 is a block diagram of an example computing device using journal events in a file system and a database
- FIG. 2 is a block diagram of an example computing environment using journal events in a file system and a database
- FIG. 3 is a flowchart of an example method for using journal events in a file system and a database
- FIG. 4 is a flowchart of an example method for using journal events in a file system and a database
- FIG. 5 is a block diagram of an example system using journal events in a file system and a database.
- Custom metadata allows a user to label data with customized information that doesn't fit into any of the existing metadata fields.
- custom metadata may include genre, artist, year performed, etc.
- custom metadata may include project name, author, reviewed by, signatures, etc.
- Custom metadata may be associated with a file present in a file system.
- out-of-band custom metadata may be stored in a database distinct from a file system that includes in-file metadata.
- a traditional application or service for example, a backup and restore application, a file replication service, etc.
- a traditional application may only understand commonly used file semantics (such as POSIX).
- the present disclosure describes a mechanism to provide query performance and object semantics of a file through a database along with a simultaneous mapping to a traditional metadata namespace (for example, extended file attributes) in such a way that an ingest mechanism inserts custom metadata in a file metadata and a database via a single call.
- such mechanism may ensure that both traditional applications as well as new applications may use common semantics to push custom metadata to a file and a database.
- the present disclosure describes using journal events to maintain a transactionally consistent mapping between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database.
- a request may be received to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file.
- a first journal event may be generated to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database.
- the first journal event may be processed to insert, update, or delete the metadata into the database.
- the request to insert, update, or delete the metadata to the file in the file system is processed.
- journal events may be generated to define a second value in the consistent field in the database.
- the second journal event may then be processed to update the second value in the consistent field in the database.
- journal events maintain a transactionally consistent mapping between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database.
- FIG. 1 is a block diagram of an example computing device 100 using journal events in a file system and a database.
- Computing device 100 may be a server, a desktop computer, a notebook computer, a tablet computer, and the like.
- computing device 100 may be a file server 100.
- File server 100 may include a processor 102 and a machine-readable storage medium 104.
- Processor 102 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 104.
- Machine-readable storage medium 104 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 102.
- machine-readable storage medium 104 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like.
- machine-readable storage medium 104 may be a non-transitory machine-readable medium.
- machine-readable storage medium 104 may store a file system 106, a database 108, and a custom metadata module 110.
- module may refer to a software component (machine readable instructions), a hardware component or a combination thereof.
- a module may include, by way of example, components, such as software components, processes, tasks, co-routines, functions, attributes, procedures, drivers, firmware, data, databases, data structures, Application Specific Integrated Circuits (ASIC) and other computing devices.
- a module may reside on a volatile or non-volatile storage medium (e.g. 104) and may be configured to interact with a processor (e.g. 102) of a computing device (e.g. 100).
- File system 106 may be a local file system or a scale-out file system such as a shared file system or a network file system.
- Examples of a shared file system may include a Network Attached Storage (NAS) file system or a cluster file system.
- Examples of a network file system may include a distributed file system or a distributed parallel file system.
- file system 106 may be used for storage and retrieval of data from a storage device. Typically, each piece of data is called a "file" (or file object).
- file system 106 may include at least one file.
- a file in the file system may be associated with file metadata and/or custom metadata. Custom metadata may be defined by a user.
- Metadata (or custom metadata) associated with a file in a file system may be termed as "in-file" metadata.
- in-file metadata may include extended file attributes. Extended file attributes enable users to associate files with metadata not interpreted by the file system, whereas regular attributes have a purpose strictly defined by the file system. For example, users may use these attributes to store the name of an author of a document, a checksum, a digital signature, etc.
- File system 106 may include a journaling system 112 and a virtual file system interface 114.
- journaling system 112 may maintain a special file called a journal that may be used to repair any inconsistencies that may occur as the result of an improper shutdown of a computer.
- Journaling system 112 may write metadata into the journal. In the event of a system crash, if a given set of updates has not been implemented, the system may read the journal in order to roll up to the most recent consistent data point.
- file system may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value in a consistent field of a content metadata table in the database (example, 108).
- the request to insert, update, or delete metadata to a file may be transferred to a Virtual File System (VFS) layer 114 or, more specifically, to a specific VFS handler of the file system (example, 106).
- Virtual File System (VFS) layer 114 may act as an interface for an underlying operating system to support a variety of file systems so that the system could handle various types of I/O system calls.
- the VFS layer 114 is the Linux VFS layer.
- file system (example, 106) may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value in a consistent field in the database.
- the VFS handler for a SetXattr call may determine if it's an insert or update metadata call, and based on said determination may generate a first journal event and define a first value in a consistent field (for example, the field may be set to 0) of a content metadata table in the database (example, 108).
- the VFS handler for RemoveXattr may carry out the same process for a delete metadata call.
- the first journal event may be processed by journaling system 112 to insert, update, or delete the metadata into the database (example, 108).
- File system (example, 106) may then process the request to insert, update, or delete the metadata to the file in the file system (example, 106).
- journaling system 112 may generate a second journal event to define a second value in the consistent field in the database (example, 108).
- the second journal event may then be processed by file system (example, 106) to update the second value in the consistent field in the database (example, 108).
- Database 108 may be a repository that stores an organized collection of data.
- database 108 may store an out-of-band metadata of a file.
- "Out-of-band metadata" of a file may be defined as metadata (or custom metadata) that may be stored in a location (example, a database) other than the file system.
- Database (example, 108) may include one or more tables for storing data.
- At least one table in the database may be used to store out-of-band content metadata of a file.
- Such table may be called "content metadata" table.
- a table in the database (such as content metadata table) may include a consistent field.
- a consistent field is based on consistency property that ensures that any transaction will bring the database from one valid state to another. In other words, a consistent field may change a value defined for it to reflect a successful transaction.
- a first value (for instance, 0) may be defined in a consistent field of a database (example, 108) upon generation of a first journal event to insert, update, or delete metadata into the database (example, 108).
- a second value (for example, 1) may be defined in the consistent field of the database (example, 108) upon generation of a second journal event which may occur upon processing of a request to insert, update, or delete metadata to a file in a file system (example, 106).
- database 108 may be a distributed database that provides high query rates and high-throughput updates using a batching process.
- Database 108 may use a pipelined architecture that provides access to update batches at various points through processing.
- database 108 may be based on a batched update model, which decouples update processing from read-only queries (i.e. query processing task). In this model, the updates may be batched and processed in the background, and do not interfere with the foreground query workload.
- Database 108 may allow different stages of the updates in the pipeline to be queried independently. Queries that could use slightly out-of-date data may use only the final output of the pipeline, which may correspond to the completely ingested and indexed data. Queries that require even fresher results may access data at any stage in the pipeline.
- database 108 may be integrated into file system 106.
- Custom metadata module 110 may include instructions to receive a request to insert, update, or delete metadata both to a file in a file system (example, 106) and into a database (example, 108) that may store out-of-band metadata associated with the file of said file system.
- aforesaid request may be in the form of a representational state transfer (REST) call.
- the request may use another data access protocol such as, but not limited to, Network File System (NFS), Server Message Block (SMB), and the like.
- Custom metadata module 110 may include a suitable interface to handle the aforementioned request that may be generated using any of these protocols.
- said REST call may be handled by a Hypertext Transfer Protocol (HTTP) service that may pass the request on to custom metadata module 110.
- custom metadata module 110 may issue a system call to the file for which the request to insert, update, or delete metadata is intended. The system call is then forwarded to the VFS layer of the file system.
- custom metadata module 110 may issue a Linux call Setfattr to the file for which metadata needs to be set (insert, update, or delete). The call may then be handled by the Linux VFS layer and passed on to the file system specific VFS handler.
- FIG. 2 is a block diagram of an example computing environment 200 using journal events in a file system and a database.
- Computing environment 200 may include client systems 202, 204, and 206, a file server 208, and a storage device 210.
- the number of client systems 202, 204, and 206, file server 208, and storage device 210 shown in FIG. 2 is for the purpose of illustration only and their number may vary in other implementations.
- computing environment 200 may represent a scale-out file system.
- Client systems 202, 204, and 206 may each be a computing device such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, personal digital assistant (PDA), a server, and the like. Client systems 202, 204, and 206, may communicate with file server 208 via a computer network 212.
- Computer network 212 may be a wireless or wired network. Computer network 212 may include, for example, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Storage Area Network (SAN), a Campus Area Network (CAN), or the like. Further, computer network 212 may be a public network (for example, the Internet) or a private network (for example, an intranet).
- client systems 202, 204, and 206 may be directly coupled to file server 208.
- client systems 202, 204, and 206 may host one or more applications 224, 226, and 228 that may, in an example, send a request to file server (example, 208) to insert, update, or delete metadata both to a file in file system and into a database that may store out-of-band metadata of said file.
- an application may use a data access protocol such as, but not limited to, Hypertext Transfer Protocol (HTTP), Network File System (NFS), Server Message Block (SMB), and the like, to read and/or write data such as files, metadata, custom metadata, and the like, from file server 208.
- File server 208 may include a non-transitory machine-readable storage medium 214 that may store machine executable instructions.
- file server 208 may be similar to file server 100 described earlier. Accordingly, components of file server 208 that are similarly named and illustrated in file server 100 may be considered similar.
- components or reference numerals of FIG. 2 having a same or similarly described function in FIG. 1 are not being described in connection with FIG. 2. Said components or reference numerals may be considered alike.
- machine-readable storage medium 214 may store a file system 106, a database 108, a custom metadata module 110, and an archive journal scanner module 220.
- Archive journal scanner module 220 may include instructions to process a journal generated by journaling subsystem.
- archive journal scanner module 220 may include instructions to identify, from a database, a consistent field that includes a first value which may be defined for the consistent field upon generation of a first journal event to insert, update, or delete metadata into the database.
- Archive journal scanner module 220 may further include instructions to identify a metadata entry related to the consistent field with the first value in the database.
- Archive journal scanner module 220 may then determine that a metadata entry corresponding to the metadata entry related to the consistent field with the first value is present in metadata of the file in a file system and, in response to the determination, may update the first value to a second value in the consistent field of the database.
- Storage device 210 may be used to store and retrieve data stored by file system 106.
- Some non-limiting examples of storage device 210 may include a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, or a combination of these devices.
- Storage device 210 may be directly coupled to file server 208 or may communicate with file server 208 via a computer network 222.
- Such a computer network 222 may be similar to the computer network 212 described above.
- computer network 222 may be a Storage Area Network (SAN).
- FIG. 3 is a flowchart of an example method 300 for using journal events in a file system and a database.
- the method 300 may at least partially be executed on a computing device 100 of FIG. 1 or file server 208 of FIG. 2. However, other computing devices may be used as well.
- a request may be received to insert, update, or delete metadata both to a file in a file system (example, 106) and into a database (example, 108) that may store out-of-band metadata associated with the file.
- such request may be received by the custom metadata module.
- the file system may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value (for example, 0) in a consistent field in the database (example, 108).
- the consistent field may be present in a content metadata table of the database (example, 108).
- a journaling system may process the first journal event to insert, update, or delete the metadata into the database (example, 108). In other words, the journaling system may insert, update, or delete the metadata into the database (example, 108).
- the file system may process the request to insert, update, or delete the metadata to the file in the file system (example, 106).
- the file system (example, 106) may insert, update, or delete the metadata into the database (example, 108).
- the file system may generate a second journal event to define a second value (for example, 1) in the consistent field in the database (example, 108).
- the journaling system may process the second journal event to update the second value in the consistent field in the database (example, 108). In other words, the journaling system may replace the first value (for example, 0) with a second value (for example, 1) in the consistent field.
- FIG. 4 is a flowchart of an example method 400 for using journal events in a file system and a database.
- the method 400 may at least partially be executed on a computing device 100 of FIG. 1 or file server 208 of FIG. 2. However, other computing devices may be used as well.
- example method 400 may be used to ensure that there are no metadata inconsistencies between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database (example, 108) if a system crash takes place during processing of a request to insert, update, or delete metadata both to a file in a file system and into a database (example, 108) that stores out-of-band metadata associated with the file.
- an archive journal scanner module may identify, from a database (example, 108) that may store out-of-band metadata associated with a file, a consistent field (or fields) that includes a given value (or first value). For instance, a given value may be zero (0).
- an archive journal scanner module may identify a metadata entry related to the identified consistent field with the first value in the database (example, 108). In other words, a journal module may identify a metadata entry that may have been inserted, updated, or deleted against a consistent field with a given value.
- an archive journal scanner module may determine that a metadata entry corresponding to the metadata entry related to the consistent field with the given value (i.e. first value) is present in metadata of a file in a file system (example, 106). In other words, a determination is made whether there's a metadata entry in the file system that matches with a metadata entry against a given value (for example, 0) in the database (example, 108).
- the first value (for example, 0) may be updated to a second value (for example, 1) in said consistent field of the database. Presence of a matching metadata entry both in a file system and a database indicates that a request to insert, update, or delete metadata both to a file in the file system (example, 106) and into the database (example, 108) is successful. In such case, a first value in a consistent field may be updated to a second value to indicate a successful transaction.
- FIG. 5 is a block diagram of an example system 500 using journal events in a file system and a database.
- System 500 includes a processor 502 and a machine-readable storage medium 504 communicatively coupled through a system bus.
- system 500 may be analogous to computing device 100 of FIG. 1 or file server 208 of FIG. 2.
- Processor 502 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 504.
- Machine-readable storage medium 504 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 502.
- machine-readable storage medium 504 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or a storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like.
- machine-readable storage medium 504 may be a non-transitory machine-readable medium.
- Machine-readable storage medium 504 may store instructions 506, 508, 510, 512, 514, and 516.
- instructions 506 may be executed by processor 502 to receive a request to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file.
- Instructions 508 may be executed by processor 502 to generate a first journal event to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database.
- Instructions 510 may be executed by processor 502 to process the first journal event to insert, update, or delete the metadata into the database.
- Instructions 512 may be executed by processor 502 to process the request to insert, update, or delete the metadata to the file in the file system.
- Instructions 514 may be executed by processor 502 to generate a second journal event to define a second value in the consistent field in the database.
- Instructions 516 may be executed by processor 502 to process the second journal event to update the second value in the consistent field in the database.
- Embodiments within the scope of the present solution may also include program products comprising non-transitory computer-readable media for carrying or having computer-executable instructions or data structures stored thereon.
- Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer.
- such computer-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM, magnetic disk storage or other storage devices, or any other medium which can be used to carry or store desired program code in the form of computer-executable instructions and which can be accessed by a general purpose or special purpose computer.
- the computer readable instructions can also be accessed from memory and executed by a processor.
- It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
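The crash-recovery pass described above for method 400 (find rows whose consistent field still holds the first value, check whether the matching entry is present in the file's metadata, then set the second value) can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the `content_metadata` table layout, the `reconcile` function, and the use of a Python dictionary as a stand-in for in-file metadata are all assumptions. The text does not say what happens when no matching entry is found, so this sketch simply leaves such rows untouched.

```python
import sqlite3


def reconcile(db, in_file_metadata):
    """Set consistent 0 -> 1 for rows whose entry is also present in the file.

    `in_file_metadata` is a hypothetical stand-in for extended attributes,
    keyed by (path, key).
    """
    pending = db.execute(
        "SELECT path, key, value FROM content_metadata WHERE consistent = 0"
    ).fetchall()
    repaired = 0
    for path, key, value in pending:
        entry = (path, key)
        if entry in in_file_metadata and in_file_metadata[entry] == value:
            db.execute(
                "UPDATE content_metadata SET consistent = 1 "
                "WHERE path = ? AND key = ?",
                (path, key),
            )
            repaired += 1
    db.commit()
    return repaired


def demo():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE content_metadata "
        "(path TEXT, key TEXT, value TEXT, consistent INTEGER)"
    )
    db.executemany(
        "INSERT INTO content_metadata VALUES (?, ?, ?, ?)",
        [
            ("/a", "author", "Jane", 0),  # file update landed, flag did not
            ("/b", "author", "Bob", 0),   # file update never landed
            ("/c", "genre", "jazz", 1),   # already consistent
        ],
    )
    in_file = {("/a", "author"): "Jane", ("/c", "genre"): "jazz"}
    repaired = reconcile(db, in_file)
    flags = dict(db.execute("SELECT path, consistent FROM content_metadata"))
    return repaired, flags
```

Calling `demo()` repairs only the `/a` row; the `/b` row keeps the first value because the file never received the metadata.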
Abstract
In an example technique, a request may be received to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file. A first journal event may be generated to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database. The first journal event may be processed to insert, update, or delete the metadata into the database. The technique may then process the request to insert, update, or delete the metadata to the file in the file system. A second journal event may be generated to define a second value in the consistent field in the database. The second journal event may be processed to update the second value in the consistent field in the database.
Description
JOURNAL EVENTS IN A FILE SYSTEM AND A DATABASE
Background
[001] Storage systems are integral to modern-day computing. Whether it is a general purpose computing device or a large data center of an enterprise, storage systems have become a key part of any computing experience. Exploding growth in structured and unstructured data over the years has also led enterprises to pursue storage solutions that could store terabytes or petabytes of data with reduced costs, complexity, and time. Organizations are looking to extract meaningful and customized business value from such large pools of data.
Brief Description of the Drawings
[002] For a better understanding of the solution, embodiments will now be described, purely by way of example, with reference to the accompanying drawings, in which:
[003] FIG. 1 is a block diagram of an example computing device using journal events in a file system and a database;
[004] FIG. 2 is a block diagram of an example computing environment using journal events in a file system and a database;
[005] FIG. 3 is a flowchart of an example method for using journal events in a file system and a database;
[006] FIG. 4 is a flowchart of an example method for using journal events in a file system and a database; and
[007] FIG. 5 is a block diagram of an example system using journal events in a file system and a database.
Detailed Description
[008] Growth in structured and unstructured data has led enterprises to invest in storage solutions that could help them extract information which is of business value to them. One such mechanism is to allow a user to define custom metadata. Custom metadata allows a user to label data with customized information that doesn't fit into any of the existing metadata fields. To provide an example, in case the data is a music file, custom metadata may include genre, artist, year performed, etc. In another example, if the data is a business document, custom metadata may include project name, author, reviewed by, signatures, etc.
[009] Custom metadata may be associated with a file present in a file system. However, in an instance, out-of-band custom metadata may be stored in a database distinct from a file system that includes in-file metadata. In such case, one of the challenges of maintaining out-of-band metadata in a database is that a traditional application or service (for example, a backup and restore application, a file replication service, etc.) may not have any mechanism to implicitly understand such data. A traditional application may only understand commonly used file semantics (such as POSIX). Thus, while out-of-band metadata (or the custom metadata) may be critical for building the object semantic on top of a file, it may also pose a challenge to the traditional applications and services that do not understand out-of-band metadata in a database and are only able to read the usual file metadata stored on a file system.
[0010] The present disclosure describes a mechanism to provide query performance and object semantics of a file through a database along with a simultaneous mapping to a traditional metadata namespace (for example, extended file attributes) in such a way that an ingest mechanism inserts custom metadata in a file metadata and a database via a single call. In an example, such mechanism may ensure that both traditional applications as well as new applications may use common semantics to push custom metadata to a file and a database.
[0011] The present disclosure describes using journal events to maintain a transactionally consistent mapping between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database. In an example, a request may be received to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file. In response to the request, a first journal event may be generated to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database. The first journal event may be processed to insert, update, or delete the metadata into the database. Further, the request to insert, update, or delete the metadata to the file in the file system is processed. Next, a second journal event may be generated to define a second value in the consistent field in the database. The second journal event may then be processed to update the second value in the consistent field in the database. Thus, by processing a single request to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file, journal events maintain a transactionally consistent mapping between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database.
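The sequence in the preceding paragraph can be sketched as below. This is a simplified illustration under stated assumptions, not the disclosed implementation: the `MetadataStore` class, the `content_metadata` schema, the in-memory `journal` list, and the dictionary standing in for in-file metadata are all hypothetical, and only insert and update are modelled (delete is omitted for brevity). The first value is taken to be 0 and the second value 1, as in the examples given elsewhere in this document.

```python
import sqlite3

# Hypothetical schema: one row per (path, key) plus a "consistent" flag.
SCHEMA = """
CREATE TABLE content_metadata (
    path TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT,
    consistent INTEGER NOT NULL,
    PRIMARY KEY (path, key)
)
"""


class MetadataStore:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(SCHEMA)
        self.xattrs = {}   # stand-in for in-file metadata: {(path, key): value}
        self.journal = []  # stand-in for the file system journal

    def _process(self, event):
        kind = event[0]
        if kind == "db_upsert":
            # First journal event: write the metadata, consistent = 0.
            _, path, key, value = event
            self.db.execute(
                "INSERT OR REPLACE INTO content_metadata VALUES (?, ?, ?, 0)",
                (path, key, value),
            )
        elif kind == "mark_consistent":
            # Second journal event: replace the first value with the second.
            _, path, key = event
            self.db.execute(
                "UPDATE content_metadata SET consistent = 1 "
                "WHERE path = ? AND key = ?",
                (path, key),
            )
        self.db.commit()

    def set_metadata(self, path, key, value, crash_before_file_update=False):
        first = ("db_upsert", path, key, value)
        self.journal.append(first)
        self._process(first)
        if crash_before_file_update:
            return  # simulate a crash between the two journal events
        self.xattrs[(path, key)] = value  # process the request on the file
        second = ("mark_consistent", path, key)
        self.journal.append(second)
        self._process(second)

    def consistent_flag(self, path, key):
        row = self.db.execute(
            "SELECT consistent FROM content_metadata WHERE path = ? AND key = ?",
            (path, key),
        ).fetchone()
        return None if row is None else row[0]
```

A row left with `consistent = 0` marks a request that was interrupted before the second journal event was processed, which is the condition a later recovery pass would look for.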
[0012] FIG. 1 is a block diagram of an example computing device 100 using journal events in a file system and a database. Computing device 100 may be a server, a desktop computer, a notebook computer, a tablet computer, and the like. In an example, computing device 100 may be a file server 100. File server 100 may include a processor 102 and a machine-readable storage medium 104.
[0013] Processor 102 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 104.
[0014] Machine-readable storage medium 104 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 102. For example, machine-readable storage medium 104 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like. In an example, machine-readable storage medium 104 may be a non-transitory machine-readable medium.
[0015] In an example, machine-readable storage medium 104 may store a file system 106, a database 108, and a custom metadata module 110. The term "module" may refer to a software component (machine readable instructions), a hardware component or a combination thereof. A module may include, by way of example, components, such as software components, processes, tasks, co-routines, functions, attributes, procedures, drivers, firmware, data, databases, data structures, Application Specific Integrated Circuits (ASIC) and other computing devices. A module may reside on a volatile or non-volatile storage medium (e.g. 104) and may be configured to interact with a processor (e.g. 102) of a computing device (e.g. 100).
[0016] File system 106 may be a local file system or a scale-out file system such as a shared file system or a network file system. Examples of a shared file system may include a Network Attached Storage (NAS) file system or a cluster file system. Examples of a network file system may include a distributed file system or a distributed parallel file system. In general, file system 106 may be used for storage and retrieval of data from a storage device. Typically, each piece of data is called a "file" (or file object). In an example, file system 106 may include at least one file. A file in the file system may be associated with file metadata and/or custom metadata. Custom metadata may be defined by a user. Metadata (or custom metadata) associated with a file in a file system may be termed as "in-file" metadata. In an example, such "in-file" metadata may include extended file attributes. Extended file attributes enable users to associate files with metadata not interpreted by the file system, whereas regular attributes have a purpose strictly defined by the file system. For example, users may use these attributes to store the name of an author of a document, a checksum, a digital signature, etc.
[0017] File system 106 may include a journaling system 112 and a virtual file system interface 114. In an example, journaling system 112 may maintain a special file called a journal that may be used to repair any inconsistencies that may occur as the result of an improper shutdown of a computer. Journaling system 112 may write metadata into the journal. In the event of a system crash, if a given set of updates has not been implemented, the system may read the journal in order to roll up to the most recent consistent data point.
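As an illustration of how a journal may be read after a crash, the following sketch replays journal records that were written but not yet applied. It shows a generic write-ahead replay under stated assumptions, not the journaling system of any particular file system: the `replay` function, the `(sequence, key, value)` record layout, and the `applied_upto` marker are hypothetical.

```python
def replay(journal, state, applied_upto):
    """Re-apply journal records whose sequence number is greater than applied_upto.

    journal      -- list of (sequence, key, value) records in write order
    state        -- dict holding the metadata that survived the crash
    applied_upto -- sequence number of the last record known to be applied
    Returns the sequence number of the last record applied.
    """
    for seq, key, value in journal:
        if seq > applied_upto:
            state[key] = value   # applying a record twice gives the same result
            applied_upto = seq
    return applied_upto


# Usage: record 1 was applied before the crash, records 2 and 3 were not.
journal = [(1, "a", "x"), (2, "b", "y"), (3, "a", "z")]
state = {"a": "x"}
last_applied = replay(journal, state, applied_upto=1)
```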
[0018] In an example, upon receipt of a request to insert, update, or delete metadata both to a file in a file system (example, 106) and into a database (example, 108), file system (example, 106) may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value in a consistent field of a content metadata table in the database (example, 108). In an example, the request to insert, update, or delete metadata to a file may be transferred to a Virtual File System (VFS) layer 114 or, more specifically, to a specific VFS handler of the file system (example, 106). Virtual File System (VFS) layer 114 may act as an interface for an underlying operating system to support a variety of file systems so that the system can handle various types of I/O system calls. In an example, the VFS layer 114 is a Linux VFS layer. In response to the aforesaid request, file system (example, 106) may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value in a consistent field in the database. In an example, in case of a Linux VFS layer, the VFS handler for a SetXattr call may determine whether it is an insert or update metadata call and, based on said determination, may generate a first journal event and define a first value in a consistent field (for example, the field may be set to 0) of a content metadata table in the database (example, 108). The VFS handler for RemoveXattr may carry out the same process for a delete metadata call. The first journal event may be processed by journaling system 112 to insert, update, or delete the metadata into the database (example, 108). File system (example, 106) may then process the request to insert, update, or delete the metadata to the file in the file system (example, 106). Next, journaling system 112 may generate a second journal event to define a second value in the consistent field in the database (example, 108). The second journal event may then be processed by file system (example, 106) to update the second value in the consistent field in the database (example, 108).
[0019] Database 108 may be a repository that stores an organized collection of data. In an example, database 108 may store out-of-band metadata of a file. "Out-of-band metadata" of a file may be defined as metadata (or custom metadata) that may be stored in a location (example, a database) other than the file system. Database (example, 108) may include one or more tables for storing data. In an example, at least one table in the database (example, 108) may be used to store out-of-band content metadata of a file. Such a table may be called a "content metadata" table.
In an example, a table in the database (such as the content metadata table) may include a consistent field. A consistent field is based on the consistency property, which ensures that any transaction will bring the database from one valid state to another. In other words, a consistent field may change a value defined for it to reflect a successful transaction. In an example, a first value (for instance, 0) may be defined in a consistent field of a database (example, 108) upon generation of a first journal event to insert, update, or delete metadata into the database (example, 108). A second value (for example, 1) may be defined in the consistent field of the database (example, 108) upon generation of a second journal event which may occur upon processing of a request to insert, update, or delete metadata to a file in a file system (example, 106).
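A content metadata table with a consistent field may, for instance, be laid out as in the sketch below, which uses SQLite purely for illustration. The table name, the column names, and the helper functions `first_event`, `second_event`, and `pending` are assumptions introduced here; the disclosure does not specify a schema or a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE content_metadata (
        file_path  TEXT NOT NULL,
        attr_name  TEXT NOT NULL,
        attr_value TEXT,
        consistent INTEGER NOT NULL DEFAULT 0,  -- 0 = first value, 1 = second value
        PRIMARY KEY (file_path, attr_name)
    )
    """
)


def first_event(conn, path, name, value):
    """Insert or update the entry and define the first value (0) in the consistent field."""
    conn.execute(
        "INSERT OR REPLACE INTO content_metadata "
        "(file_path, attr_name, attr_value, consistent) VALUES (?, ?, ?, 0)",
        (path, name, value),
    )


def second_event(conn, path, name):
    """Define the second value (1) once the file itself has been updated."""
    conn.execute(
        "UPDATE content_metadata SET consistent = 1 "
        "WHERE file_path = ? AND attr_name = ?",
        (path, name),
    )


def pending(conn):
    """Return the entries whose consistent field still holds the first value."""
    return conn.execute(
        "SELECT file_path, attr_name FROM content_metadata "
        "WHERE consistent = 0 ORDER BY file_path, attr_name"
    ).fetchall()
```

Keeping the flag in the same row as the metadata entry means that a single query over the consistent field is enough to list every entry whose request may not have completed.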
[0020] In an example, database 108 may be a distributed database that provides high query rates and high-throughput updates using a batching process. Database 108 may use a pipelined architecture that provides access to update batches at various points through processing. In an instance, database 108 may be based on a batched update model, which decouples update processing from read-only queries (i.e. query processing task). In this model, the updates may be batched and processed in the background, and do not interfere with the foreground query workload. Database 108 may allow different stages of the updates in the pipeline to be queried independently. Queries that could use slightly out-of-date data may use only the final output of the pipeline, which may correspond to the completely ingested and indexed data. Queries that require even fresher results may access data at any stage in the pipeline. In an example, database 108 may be integrated into file system 106.
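The batched update model may be pictured with the toy sketch below, in which updates are accepted as batches, folded into the final output by a background step, and optionally visible to queries before that step has run. The class `PipelinedStore`, its two stages, and the `fresh` parameter are hypothetical and far simpler than a distributed database; they only illustrate the decoupling of update processing from queries.

```python
from collections import deque


class PipelinedStore:
    """Toy two-stage update pipeline; the stage names are hypothetical."""

    def __init__(self):
        self.uploaded = deque()  # batches received but not yet fully processed
        self.indexed = {}        # final pipeline output, visible to every query

    def submit(self, batch):
        """Accept a batch of updates without touching the indexed data."""
        self.uploaded.append(dict(batch))

    def advance(self):
        """Background step: fold the oldest pending batch into the indexed data."""
        if self.uploaded:
            self.indexed.update(self.uploaded.popleft())

    def query(self, key, fresh=False):
        """Read the final output only, or also the pending batches when fresh is set."""
        if fresh:
            for batch in reversed(self.uploaded):  # newest pending batch wins
                if key in batch:
                    return batch[key]
        return self.indexed.get(key)
```

A query that tolerates slightly out-of-date data leaves `fresh` unset and reads only the indexed stage, while a query that needs fresher results sets `fresh=True` and also reads the batches still in the pipeline.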
[0021] Custom metadata module 110 may include instructions to receive a request to insert, update, or delete metadata both to a file in a file system (example, 106) and into a database (example, 108) that may store out-of-band metadata associated with the file of said file system. In an example, the aforesaid request may be in the form of a representational state transfer (REST) call. In other examples, the request may use another data access protocol such as, but not limited to, Network File System (NFS), Server Message Block (SMB), and the like. Custom metadata module 110 may include a suitable interface to handle the aforementioned request that may be generated using any of these protocols. In an example, said REST call may be handled by a Hypertext Transfer Protocol (HTTP) service that may pass on the request to custom metadata module 110. In turn, custom metadata module 110 may issue a system call to the file for which the request to insert, update, or delete metadata is intended. The system call is then forwarded to the VFS layer of the file system. In an example, if the underlying operating system is Linux, custom metadata module 110 may issue a Linux call Setfattr to the file for which metadata needs to be set (insert, update, or delete). The call may then be handled by the Linux VFS layer and passed on to the file system specific VFS handler.
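On Linux, a request of this kind may be translated into extended-attribute calls roughly as sketched below, using the `os.setxattr` and `os.removexattr` functions of the Python standard library, which are available on Linux only. The function `apply_metadata_request`, the operation names, and the `xattr` parameter (which allows a stand-in object to be substituted for `os`) are assumptions made for illustration; the disclosure does not prescribe this interface.

```python
import os


def apply_metadata_request(op, path, name, value=b"", xattr=os):
    """Translate an insert, update, or delete request into an extended-attribute call.

    xattr is any object that provides setxattr and removexattr; it defaults to
    the os module, whose extended-attribute functions exist on Linux only.
    """
    if op == "insert":
        # XATTR_CREATE makes the call fail if the attribute already exists.
        xattr.setxattr(path, name, value, os.XATTR_CREATE)
    elif op == "update":
        # XATTR_REPLACE makes the call fail if the attribute does not exist.
        xattr.setxattr(path, name, value, os.XATTR_REPLACE)
    elif op == "delete":
        xattr.removexattr(path, name)
    else:
        raise ValueError(f"unsupported operation: {op!r}")
```

For example, `apply_metadata_request("insert", "/share/report.doc", "user.author", b"alice")` would attempt to create the `user.author` attribute on the file, provided the underlying file system supports user extended attributes.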
[0022] FIG. 2 is a block diagram of an example computing environment 200 using journal events in a file system and a database. Computing environment 200 may include client systems 202, 204, and 206, a file server 208, and a storage device 210. The number of client systems 202, 204, and 206, file server 208, and storage device 210 shown in FIG. 2 is for the purpose of illustration only and may vary in other implementations. In an example, computing environment 200 may represent a scale-out file system.
[0023] Client systems 202, 204, and 206 may each be a computing device such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, a personal digital assistant (PDA), a server, and the like. Client systems 202, 204, and 206 may communicate with file server 208 via a computer network 212. Computer network 212 may be a wireless or wired network. Computer network 212 may include, for example, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Storage Area Network (SAN), a Campus Area Network (CAN), or the like. Further, computer network 212 may be a public network (for example, the Internet) or a private network (for example, an intranet). In an example, client systems 202, 204, and 206 may be directly coupled to file server 208.
[0024] In an example, client systems 202, 204, and 206 may host one or more applications 224, 226, and 228 that may, in an example, send a request to file server (example, 208) to insert, update, or delete metadata both to a file in a file system and into a database that may store out-of-band metadata of said file. In an example, an application (for example, 224, 226, and 228) may use a data access protocol such as, but not limited to, Hypertext Transfer Protocol (HTTP), Network File System (NFS), Server Message Block (SMB), and the like, to read and/or write data such as files, metadata, custom metadata, and the like, from file server 208.
[0025] File server 208 may include a non-transitory machine-readable storage medium 214 that may store machine executable instructions. In an example, file server 208 may be similar to file server 100 described earlier. Accordingly, components of file server 208 that are similarly named and illustrated in file server 100 may be considered similar. For the sake of brevity, components or reference numerals of FIG. 2 having a same or similarly described function in FIG. 1 are not being described in connection with FIG. 2. Said components or reference numerals may be considered alike.
[0026] In an example, machine-readable storage medium 214 may store a file system 106, a database 108, a custom metadata module 110, and an archive journal scanner module 220.
[0027] Archive journal scanner module 220 may include instructions to process a journal generated by the journaling system. In an example, archive journal scanner module 220 may include instructions to identify, from a database, a consistent field that includes a first value which may be defined for the consistent field upon generation of a first journal event to insert, update, or delete metadata into the database. Archive journal scanner module 220 may further include instructions to identify a metadata entry related to the consistent field with the first value in the database. Archive journal scanner module 220 may then determine that a metadata entry corresponding to the metadata entry related to the consistent field with the first value is present in metadata of the file in a file system and, in response to the determination, may update the first value to a second value in the consistent field of the database.
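The scan described above may be sketched as follows. The function `scan_pending`, the dictionary layouts used for the database rows and for the in-file metadata, and the returned lists are hypothetical; the sketch covers only the case described in this paragraph, in which a matching entry is found in the file and the consistent field is moved from the first value (0) to the second value (1).

```python
def scan_pending(db_rows, file_xattrs):
    """Confirm database entries whose consistent field still holds the first value.

    db_rows     -- {(path, name): {"value": ..., "consistent": 0 or 1}}
    file_xattrs -- {path: {name: value}} as read from the file system
    Returns (confirmed, unmatched) lists of (path, name) keys.
    """
    confirmed, unmatched = [], []
    for key, row in db_rows.items():
        if row["consistent"] != 0:
            continue  # already marked with the second value
        path, name = key
        if file_xattrs.get(path, {}).get(name) == row["value"]:
            row["consistent"] = 1   # matching entry found in the file
            confirmed.append(key)
        else:
            unmatched.append(key)   # left for separate handling
    return confirmed, unmatched
```

Entries returned in the second list are left unchanged in the database, so that a later pass or a separate repair step may handle them.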
[0028] Storage device 210 may be used to store and retrieve data stored by file system 106. Some non-limiting examples of storage device 210 may include a Direct Attached Storage (DAS) device, a Network Attached Storage (NAS) device, a tape drive, a magnetic tape drive, or a combination of these devices. Storage device 210 may be directly coupled to file server 208 or may communicate with file server 208 via a computer network 222. Such a computer network 222 may be similar to the computer network 212 described above. In an example, computer network 222 may be a Storage Area Network (SAN).
[0029] FIG. 3 is a flowchart of an example method 300 for using journal events in a file system and a database. The method 300, which is described below, may at least partially be executed on a computing device 100 of FIG. 1 or file server 208 of FIG. 2. However, other computing devices may be used as well. At block 302, a request may be received to insert, update, or delete metadata both to a file in a file system (example, 106) and into a database (example, 108) that may store out-of-band metadata associated with the file. In an example, such a request may be received by a custom metadata module. At block 304, the file system (example, 106) may generate a first journal event to insert, update, or delete the metadata into the database (example, 108) and define a first value (for example, 0) in a consistent field in the database (example, 108). In an example, the consistent field may be present in a content metadata table of the database (example, 108). At block 306, a journaling system may process the first journal event to insert, update, or delete the metadata into the database (example, 108). In other words, the journaling system may insert, update, or delete the metadata into the database (example, 108). At block 308, the file system may process the request to insert, update, or delete the metadata to the file in the file system (example, 106). In other words, the file system (example, 106) may insert, update, or delete the metadata to the file. At block 310, the file system may generate a second journal event to define a second value (for example, 1) in the consistent field in the database (example, 108). At block 312, the journaling system may process the second journal event to update the second value in the consistent field in the database (example, 108). In other words, the journaling system may replace the first value (for example, 0) with a second value (for example, 1) in the consistent field.
[0030] FIG. 4 is a flowchart of an example method 400 for using journal events in a file system and a database. The method 400, which is described below, may at least partially be executed on a computing device 100 of FIG. 1 or file server 208 of FIG. 2. However, other computing devices may be used as well. In an example, method 400 may be used to ensure that there are no metadata inconsistencies between metadata associated with a file in a file system and out-of-band metadata associated with the file in a database (example, 108) if a system crash takes place during processing of a request to insert, update, or delete metadata both to a file in a file system and into a database (example, 108) that stores out-of-band metadata associated with the file. At block 402, an archive journal scanner module (example, 220) may identify, from a database (example, 108) that may store out-of-band metadata associated with a file, a consistent field (or fields) that includes a given value (or first value). For instance, a given value may be zero (0).
Upon said identification, at block 404, the archive journal scanner module (example, 220) may identify a metadata entry related to the identified consistent field with the first value in the database (example, 108). In other words, the journal scanner module may identify a metadata entry that may have been inserted, updated, or deleted against a consistent field with a given value. At block 406, the archive journal scanner module (example, 220) may determine that a metadata entry corresponding to the metadata entry related to the consistent field with the given value (i.e. first value) is present in metadata of a file in a file system (example, 106). In other words, a determination is made whether there is a metadata entry in the file system that matches a metadata entry against a given value (for example, 0) in the database (example, 108). At block 408, in response to the determination, if there is a metadata entry in the file system that matches a metadata entry against a given value (for example, 0) in the database (example, 108), the first value (for example, 0) may be updated to a second value (for example, 1) in said consistent field of the database. Presence of a matching metadata entry both in a file system and a database indicates that a request to insert, update, or delete metadata both to a file in the file system (example, 106) and into the database (example, 108) is successful. In such case, a first value in a consistent field may be updated to a second value to indicate a successful transaction. In the event there is no metadata entry in the file system that matches a metadata entry against a first value (for example, 0) in the database (example, 108), metadata of the file in the file system (for example, extended file attributes) may be updated with the first value to attain one-to-one mapping between metadata of the file in the file system and metadata of the file in the database.
[0031] FIG. 5 is a block diagram of an example system 500 using journal events in a file system and a database. System 500 includes a processor 502 and a machine-readable storage medium 504 communicatively coupled through a system bus. In an example, system 500 may be analogous to computing device 100 of FIG. 1 or file server 208 of FIG. 2. Processor 502 may be any type of Central Processing Unit (CPU), microprocessor, or processing logic that interprets and executes machine-readable instructions stored in machine-readable storage medium 504.
Machine-readable storage medium 504 may be a random access memory (RAM) or another type of dynamic storage device that may store information and machine-readable instructions that may be executed by processor 502. For example, machine-readable storage medium 504 may be Synchronous DRAM (SDRAM), Double Data Rate (DDR), Rambus DRAM (RDRAM), Rambus RAM, etc. or a storage memory media such as a floppy disk, a hard disk, a CD-ROM, a DVD, a pen drive, and the like. In an example, machine-readable storage medium 504 may be a non-transitory machine-readable medium. Machine-readable storage medium 504 may store instructions 506, 508, 510, 512, 514, and 516. In an example, instructions 506 may be executed by processor 502 to receive a request to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file. Instructions 508 may be executed by processor 502 to generate a first journal event to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database. Instructions 510 may be executed by processor 502 to process the first journal event to insert, update, or delete the metadata into the database. Instructions 512 may be executed by processor 502 to process the request to insert, update, or delete the metadata to the file in the file system. Instructions 514 may be executed by processor 502 to generate a second journal event to define a second value in the consistent field in the database. Instructions 516 may be executed by processor 502 to process the second journal event to update the second value in the consistent field in the database.
[0032] For the purpose of simplicity of explanation, the example methods of FIGS. 3 and 4 are shown as executing serially; however, it is to be understood and appreciated that the present and other examples are not limited by the illustrated order. The example systems of FIGS. 1, 2 and 5, and methods of FIGS. 3 and 4 may be implemented in the form of a computer program product including computer-executable instructions, such as program code, which may be run on any suitable computing device in conjunction with a suitable operating system (for example, Microsoft Windows, Linux, UNIX, and the like).
Embodiments within the scope of the present solution may also include program products comprising non-transitory computer-readable media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, such computer-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM, magnetic disk storage or other storage devices, or any other medium which can be used to carry or store desired program code in the form of computer-executable instructions and which can be accessed by a general purpose or special purpose computer. The computer readable instructions can also be accessed from memory and executed by a processor.
[0033] It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
Claims
1. A method, comprising:
receiving a request to insert, update, or delete metadata both to a file in a file system and into a database that stores out-of-band metadata associated with the file;
generating a first journal event to insert, update, or delete the metadata into the database and define a first value in a consistent field in the database;
processing the first journal event to insert, update, or delete the metadata into the database;
processing the request to insert, update, or delete the metadata to the file in the file system;
generating a second journal event to define a second value in the consistent field in the database; and
processing the second journal event to update the second value in the consistent field in the database.
2. The method of claim 1, wherein the metadata is custom metadata.
3. The method of claim 1, wherein the request is a representational state transfer (REST) call.
4. The method of claim 1, wherein the metadata associated with the file in the file system includes extended file attributes of the file.
5. The method of claim 1, further comprising:
identifying, from the database, the consistent field with the first value;
identifying a metadata entry related to the consistent field with the first value in the database;
determining that a metadata entry corresponding to the metadata entry related to the consistent field with the first value is present in metadata of the file in the file system; and
in response to the determination, updating the first value to the second value in the consistent field of the database.
6. The method of claim 1, wherein the processing of the request to insert, update, or delete the metadata to the file in the file system comprises using a Virtual File System (VFS) to insert, update, or delete the metadata to the file in the file system.
7. A system, comprising:
a file system comprising a journaling system;
a database; and
a custom metadata module to receive a request to insert, update, or delete a custom metadata entry both to a file in the file system and into the database, wherein:
the file system is to generate a first journal event to insert, update, or delete the custom metadata entry into a content metadata table in the database and define a first value in a consistent field of the content metadata table;
the journaling system is to process the first journal event to insert, update, or delete the custom metadata entry into the content metadata table in the database;
the file system is to process the request to insert, update, or delete the custom metadata entry to the file in the file system;
the file system is to generate a second journal event to define a second value in the consistent field of the content metadata table in the database; and
the journaling system is to process the second journal event to update the second value in the consistent field of the content metadata table.
8. The system of claim 7, wherein the request is received from an application present on a communicatively coupled client system.
9. The system of claim 7, further comprising a journal scanner module to:
identify, from the content metadata table in the database, the consistent field with the first value;
identify the custom metadata entry related to the consistent field with the first value in the content metadata table;
determine that a content metadata entry corresponding to the custom metadata entry related to the consistent field with the first value is present in metadata of the file in the file system; and
in response to the determination, update the first value to the second value in the consistent field of the content metadata table.
10. The system of claim 7, wherein the database is integrated into the file system.
11. The system of claim 7, wherein the database is to allow pipelining of updates and independent querying of the pipelined updates.
12. The system of claim 7, wherein the database stores out-of-band metadata associated with the file.
13. A non-transitory machine-readable storage medium comprising instructions executable by a processor to:
receive a request to insert, update, or delete custom metadata both to a file in a file system and into a database;
generate a first journal event to insert, update, or delete the custom metadata into the database and define a first value in a consistent field in the database;
process the first journal event to insert, update, or delete the custom metadata into the database;
process the request to insert, update, or delete the custom metadata to the file in the file system;
generate a second journal event to define a second value in the consistent field in the database; and
process the second journal event to update the second value in the consistent field in the database.
14. The storage medium of claim 13, further including instructions to:
identify, from the database, the consistent field with the first value;
identify the custom metadata related to the consistent field with the first value in the content metadata table;
determine that a custom metadata corresponding to the custom metadata related to the consistent field with the first value is present in metadata of the file in the file system; and
in response to the determination, replace the first value with the second value in the consistent field in the database.
15. The storage medium of claim 13, wherein the file system is a Network Attached Storage (NAS) file system.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN2704/CHE/2014 | 2014-06-02 | ||
IN2704CH2014 | 2014-06-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015187187A1 true WO2015187187A1 (en) | 2015-12-10 |
Family
ID=54767125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/048005 WO2015187187A1 (en) | 2014-06-02 | 2014-07-24 | Journal events in a file system and a database |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2015187187A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6877109B2 (en) * | 2001-11-19 | 2005-04-05 | Lsi Logic Corporation | Method for the acceleration and simplification of file system logging techniques using storage device snapshots |
US20100031274A1 (en) * | 2004-05-10 | 2010-02-04 | Siew Yong Sim-Tang | Method and system for real-time event journaling to provide enterprise data services |
US7809778B2 (en) * | 2006-03-08 | 2010-10-05 | Omneon Video Networks | Idempotent journal mechanism for file system |
US8145686B2 (en) * | 2005-05-06 | 2012-03-27 | Microsoft Corporation | Maintenance of link level consistency between database and file system |
US8412685B2 (en) * | 2004-07-26 | 2013-04-02 | Riverbed Technology, Inc. | Method and system for managing data |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10942910B1 (en) * | 2018-11-26 | 2021-03-09 | Amazon Technologies, Inc. | Journal queries of a ledger-based database |
US11036708B2 (en) | 2018-11-26 | 2021-06-15 | Amazon Technologies, Inc. | Indexes on non-materialized views |
US11119998B1 (en) | 2018-11-26 | 2021-09-14 | Amazon Technologies, Inc. | Index and view updates in a ledger-based database |
US11196567B2 (en) | 2018-11-26 | 2021-12-07 | Amazon Technologies, Inc. | Cryptographic verification of database transactions |
US11675770B1 (en) | 2018-11-26 | 2023-06-13 | Amazon Technologies, Inc. | Journal queries of a ledger-based database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14893943 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14893943 Country of ref document: EP Kind code of ref document: A1 |