CA2489324A1 - Storage system having partitioned migratable metadata - Google Patents
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Abstract
According to one embodiment, a metadata management system (MDS) may include partitioned migratable metadata. Metadata may be stored in multiple metadata partitions (102-0 to 102-11). Each metadata partition may be assigned to a particular system resource (104-0 to 104-5). According to predetermined policies, such as metadata aging, metadata stored in one metadata partition may be migrated to a different metadata partition. A forwarding object can be placed in the old metadata partition to indicate the new location of the migrated metadata. Metadata partitions (102-0 to 102-11) may be reassigned to different resources, split, and/or merged, allowing a high degree of scalability as well as flexibility in meeting storage system needs.
Description
STORAGE SYSTEM HAVING PARTITIONED MIGRATABLE METADATA
TECHNICAL FIELD
The present invention relates generally to data storage systems, and more particularly to high-capacity storage systems that include metadata corresponding to stored files.
BACKGROUND OF THE INVENTION
The demand for high-capacity data storage systems continues to rise. As the interconnection of data networks continues, there is an increasing demand to store very large numbers of files in an efficient fashion while at the same time enabling such a storage system to grow as the number of files increases.
While various conventional data storage systems are known, such approaches have not always been efficient, easy to scale, or cost effective. Conventionally, data storage systems have resided on a monolithic server. A monolithic server can be conceptualized as including a single, very powerful computing resource dedicated to accessing files that may be stored on a variety of media. Such a monolithic server can maintain a collection of metadata for the stored files.
Metadata can include assorted file information including a filename, directory in which the file is located, physical location (offset), size of file, and type of file. Conventionally, metadata can reside on a single partition accessed by a process to enable rapid lookups in, and/or access to, the metadata.
A drawback to the monolithic server approach can be the difficulty involved in adapting such systems to changing needs. For example, the number of stored files, and consequently the amount of metadata and metadata accesses, may increase over time. To meet such needs, the monolithic server may be upgraded. While processing speed can be improved by increasing computing resources (such as the number of central processing units (CPUs) and associated random access memory (RAM)), such increased resources can be difficult to implement, as hardware upgrades may require the system to be non-operational for a certain period of time.
Monolithic server approaches may be undesirable as usage requirements may be outgrown. As just two examples, the amount of data stored or the number of requests serviced may grow to the point where an existing monolithic server responds too slowly or is not capable of meeting usage requirements.
One conventional approach to meet increasing requirements can be to add servers. A drawback to such an approach can be added complexity for a user. A user may have to keep track of the multiple servers, as such servers are typically visible as separate entities to user applications. Further, with multiple servers, load imbalance may occur as one server is accessed more, or stores more, than another. Consequently, a system administrator may have to manually shift files and/or set request routing as usage changes. This can be an extreme burden on a system administrator.
It is also noted that the management of multiple servers can be especially difficult for mission critical or Internet applications that may run twenty-four hours a day and 365 days a year, as such systems do not typically have a window of time available to reconfigure or upgrade the system.
Increases in metadata size can be difficult to accommodate as well. As the demands for larger capacity systems increase (e.g., petabyte or larger size systems), the amount of metadata can increase as well. However, if the metadata exceeds the monolithic server's storage capacity, changes to the system may have to be undertaken to enable larger storage capabilities. Further, the manipulation of metadata (as files are deleted, renamed, moved, etc.) may become more complex as the server must be capable of accessing more and more metadata in the management process.
One approach to addressing the storing of a large number of files has been to "migrate" stored files. Migration of stored files may include transferring files from one storage medium to another. Typically, "old" files (those that are not accessed after a certain period of time) can be migrated from a first storage medium that may provide relatively fast access (and hence may be more expensive), to a second storage medium that may provide slower access (and hence may be less expensive).
While migration of files may provide a solution for larger numbers of data files, there remains a need to address the increasing size of metadata. For data storage systems that store a large number of files, there is a need for a metadata storage approach that allows for a high degree of scaling, and/or ease in scaling, and/or flexibility in the arrangement of metadata, and/or more cost effective storage of metadata.
SUMMARY OF THE INVENTION
According to one embodiment, a data storage system may include a metadata management system that stores metadata on a number of different metadata partitions. Each partition may be assigned to a particular system resource. A system resource can access its corresponding metadata partition(s). System resources may be arranged in different classes, where one class may provide slower access and/or be less expensive than another class. Such an arrangement can allow for scaling, as a new partition and/or new resource may be added to the metadata management system as needed.
According to one aspect of the embodiments, metadata residing on a first partition assigned to one resource can be moved to a second partition assigned to a second resource.
According to another aspect of the embodiments, metadata may be moved according to established policies. As but one example of a policy, infrequently used metadata may be migrated from a partition assigned to a more expensive resource, to another partition assigned to a less expensive resource.
According to another aspect of the embodiments, metadata may be moved when its corresponding file is renamed. The data storage system may include an organization system, such as a file system for organizing the metadata. When a file is renamed, its metadata may be moved to a new metadata partition.
According to another aspect of the embodiments, moving metadata from a first partition to a second partition may include moving the metadata to the second partition and placing a forwarding object in the first partition that indicates the new location of the moved metadata.
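As but one illustration of the forwarding object aspect described above, the following is a minimal sketch in Python. The names `Partition`, `ForwardingObject`, `migrate` and `resolve`, and the use of a dictionary keyed by file name, are assumptions made for illustration only and are not taken from the described embodiments.

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingObject:
    """Hypothetical marker left in the old partition after a move."""
    target_partition: str


@dataclass
class Partition:
    """Hypothetical metadata partition: key -> metadata or ForwardingObject."""
    name: str
    entries: dict = field(default_factory=dict)


def migrate(key, source, destination):
    """Move one metadata record and leave a forwarding object behind."""
    record = source.entries[key]
    if isinstance(record, ForwardingObject):
        raise ValueError(f"{key!r} has already been migrated")
    destination.entries[key] = record
    source.entries[key] = ForwardingObject(destination.name)


def resolve(key, start, partitions):
    """Follow forwarding objects until the actual metadata is found."""
    current = start
    seen = set()
    while True:
        if current.name in seen:
            raise RuntimeError("forwarding loop detected")
        seen.add(current.name)
        record = current.entries[key]
        if not isinstance(record, ForwardingObject):
            return current.name, record
        current = partitions[record.target_partition]
```

In such a sketch, a request that still refers to the first partition can reach the moved metadata by calling `resolve`, without the requester knowing that a migration took place.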
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a metadata management system (MDS) according to one embodiment of the present invention.
FIG. 2 is a diagram illustrating a file system according to one embodiment of the present invention.
FIGS. 3A to 3C are block diagrams illustrating various operations that may be performed in a MDS according to the embodiment of FIG. 1.
FIGS. 4A and 4B are diagrams showing a metadata migration operation.
FIG. 5 is a diagram of a filehandle evolution for a metadata migration operation.
FIGS. 6A and 6B are diagrams showing a file renaming operation.
FIG. 7 is a diagram of a filehandle evolution for a file renaming operation.
FIG. 8 is a block diagram of a forwarding object table according to one embodiment.
FIGS. 9A and 9B are diagrams representing the movement of a group of metadata between partitions.
FIG. 10 is a block diagram of a MDS according to another embodiment.
FIGS. 11A and 11B show examples of certain functions that may be included in a MDS interface.
FIG. 12 shows examples of other functions that may be included in a MDS interface.
FIG. 13 is an example of a system that may include a MDS according to one embodiment.
FIG. 14 is an example of an alternate system that may include a MDS according to one embodiment.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Various embodiments of the present invention will now be described in conjunction with a number of diagrams. The various embodiments include a system for managing metadata stored on different partitions. Such a system can allow for easy and cost-effective scaling and/or allow for the migration of metadata according to aging or other criteria.
Further, the present invention may allow for easier or more effective management of large amounts of metadata as partitions can be added, split or merged as needed.
As noted above, conventional file storage systems have typically stored file metadata on a single partition. According to the present invention, a metadata management system may include metadata that may be distributed over multiple partitions.
MDS Block Diagram Representation.
To better understand the various advantages of a metadata management system according to the present invention, reference will now be made to FIG. 1. FIG. 1 is a block diagram representation of a metadata management system (MDS) 100. A MDS 100 may include various metadata partitions, shown as 102-0 to 102-11, assigned to particular system resources 104-0 to 104-5. FIG. 1 illustrates how multiple partitions may be assigned to resources. In particular, one partition 102-2 is assigned to system resource 104-1, two partitions 102-0 and 102-1 are assigned to system resource 104-0, and three partitions 102-5 to 102-7 are assigned to system resource 104-3. Of course, the particular number of system resources and partitions per system resource are provided by way of example, and should not be construed as limiting.
System resources (104-0 to 104-5) may fall into one or more classes. A system resource class can indicate a particular storage medium, a different class of machine, and/or a different running process. Consequently, one class of system resource may provide faster access to its corresponding metadata partition(s) than another class. In addition, or alternatively, one class may provide a lower cost solution than another class (i.e., component costs and/or maintenance costs for the system resource are less expensive than those of other system resources).
An arrangement such as that set forth in FIG. 1 can allow resources to be optimized to the particular metadata stored. For example, partitions could be assigned to system resources based on the probability that metadata will be accessed, preventing one system resource from being over-taxed. Such an optimization of available system resources may provide for increased performance. This is in contrast to a conventional monolithic server approach, which may be conceptualized as a single, high-computing power resource applied to one metadata partition. In such a conventional approach, increases in performance can require expensive hardware upgrades to a monolithic server system.
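The assignment of partitions to system resources based on access probability could be sketched as follows. This is only a hedged illustration: the greedy least-loaded strategy, the function name `assign_partitions`, and the access-rate figures are assumptions for the example and are not specified by the described embodiments.

```python
import heapq


def assign_partitions(access_rates, resources):
    """Greedily assign partitions to the currently least-loaded resource.

    access_rates: hypothetical mapping of partition id -> expected access rate.
    resources: list of resource ids.
    Returns a mapping of resource id -> list of partition ids.
    """
    # Min-heap of (accumulated load, resource id); ties break on resource id.
    heap = [(0.0, resource) for resource in resources]
    heapq.heapify(heap)
    assignment = {resource: [] for resource in resources}
    # Place the most frequently accessed partitions first.
    ordered = sorted(access_rates.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, rate in ordered:
        load, resource = heapq.heappop(heap)
        assignment[resource].append(partition)
        heapq.heappush(heap, (load + rate, resource))
    return assignment
```

For instance, with assumed rates of 0.5, 0.3 and 0.2 for partitions 102-0, 102-1 and 102-2 and two resources 104-0 and 104-1, the busiest partition lands alone on one resource while the other two share the second, so neither resource is over-taxed.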
MDS File System Representation.
While FIG. 1 illustrates one example of a partition-resource relationship of a MDS, a MDS may also be conceptualized on a file system basis. One such example is set forth in FIG. 2.
FIG. 2 shows a MDS 200 file system that is distributed across multiple partitions (202-0 to 202-3). A partition 202-0 may contain higher level portions of a file system (e.g., the top of a file system tree), while partitions 202-1 to 202-3 may contain lower level portions of the file system.
It is understood that one or more of the partitions (202-0 to 202-3) could be assigned to a particular resource. A conventional file system will typically include various nodes in some relation to one another. In contrast, a MDS partition (202-0 to 202-3) according to the present invention may include nodes and "forwarding objects".
A node in FIG. 2 is indicated by a circle while a forwarding object is indicated by a rectangle. Nodes may provide some of the same functions as a conventional file system, namely organizing and providing file information. Forwarding objects may allow metadata to span multiple partitions.
An example of a forwarding object that is accessed in a file system lookup will now be described. Referring now to FIG. 2, it will be assumed that a lookup, such as a directory lookup, is conducted to retrieve file information present at node 204-4. A lookup may be undertaken on behalf of a client. According to one arrangement, client metadata related requests can be translated into a MDS system request. Such a MDS system request may be issued by an MDS client that controls communication with a MDS 200.
In the example shown, a lookup may begin at node 204-0 in the highest level partition, partition 202-0. The lookup may proceed to node 204-1. However, because the desired file information is stored on a partition 202-1, the lookup can proceed to forwarding object 206-0.
Forwarding object 206-0 can point to node 204-2 in partition 202-1. In this way, a system according to the present invention may include a file system that spans multiple partitions.
The lookup may then resume within partition 202-1 at node 204-2, proceed to node 204-3, and finally arrive at the desired node 204-4.
It is noted that in arrangements where a MDS client, or the like, interfaces with an MDS 200, various accesses to different partitions, and indeed the existence of multiple MDS partitions, can be entirely hidden from a client.
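The lookup just described could be sketched as follows. The classes `Node` and `Forward`, the function `lookup`, and the path component names "a" to "d" are hypothetical and chosen only for illustration; the reference numerals mirror those of FIG. 2.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical file system node residing on one metadata partition."""
    node_id: str
    partition: str
    children: dict = field(default_factory=dict)  # name -> Node or Forward


@dataclass
class Forward:
    """Hypothetical forwarding object pointing at a node on another partition."""
    object_id: str
    target: Node


def lookup(root, path):
    """Walk the path from root, transparently following forwarding objects."""
    current = root
    visited = [(current.partition, current.node_id)]
    for name in path:
        child = current.children[name]
        if isinstance(child, Forward):
            child = child.target  # continue the lookup in the other partition
        current = child
        visited.append((current.partition, current.node_id))
    return current, visited


def build_fig2_example():
    """Build the chain 204-0 -> 204-1 -> (206-0) -> 204-2 -> 204-3 -> 204-4."""
    n4 = Node("204-4", "202-1")
    n3 = Node("204-3", "202-1", {"d": n4})
    n2 = Node("204-2", "202-1", {"c": n3})
    n1 = Node("204-1", "202-0", {"b": Forward("206-0", n2)})
    return Node("204-0", "202-0", {"a": n1})
```

A caller of `lookup` receives only the final node, so the crossing from partition 202-0 to partition 202-1 stays hidden, in keeping with the note above that multiple partitions can be hidden from a client.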
Of course, it is understood that the particular directory structure shown should not be construed as limiting to the present invention, and is provided only by way of example.
Further, a node in such file system may take a variety of forms. As but two possible examples, each node may include a filehandle and corresponding file attributes and/or each node may include a filehandle with a pointer to its corresponding attributes.
A file system distributed across multiple metadata partitions may take a variety of forms. As a first of the many possible examples, a client may forward a file name to retrieve metadata for such a file. Metadata may be arranged in various ways. This may be as simple as storing files alphabetically by file name. As a second of the many possible examples, metadata may be stored according to a function based on file attributes.
This may include "hashing" one or more fields (e.g., the filehandle) with a hash function, and using the resulting value to determine in which metadata partition the metadata is to be stored. Of course, these two examples should not be construed as limiting to the invention. Numerous other file system approaches would be obvious to one skilled in the art.
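The hashing approach could be sketched as follows. The choice of SHA-256, the use of the first eight digest bytes, and the function name `partition_for` are assumptions made for the example; the described embodiments do not name a particular hash function.

```python
import hashlib


def partition_for(filehandle: bytes, num_partitions: int) -> int:
    """Map a filehandle to a partition index with a stable hash.

    SHA-256 is an assumed choice; any well-distributed, deterministic
    hash function could serve the same purpose.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    digest = hashlib.sha256(filehandle).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

One consequence of a simple modulo scheme such as this is that changing the number of partitions remaps most filehandles, so a system that adds or splits partitions would likely need forwarding objects or a different mapping function to keep existing metadata reachable.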
MDS Operations.
FIGS. 3A-3C are general representations of various operations that may be performed in a MDS 100. FIGS. 3A-3C include some of the same general items as FIG. 1. To that extent, like items will be referred to by the same reference characters.
Conventionally, file system metadata would be stored on a single partition.
Consequently, such a file system could have limited scalability. If the amount of metadata outgrew the available partition space, an expensive and/or time consuming upgrade operation could be necessary to replace the current partition with a larger partition.
Conventionally, file system metadata would be assigned to a single resource.
Consequently, such a system could be susceptible to failure or require expensive redundancy approaches. More particularly, if the single resource assigned to the metadata failed, the file system would be inoperable until the failure was addressed. Further, to address such susceptibility to failure, one or more parallel back-up resources would have to be provided that would reproduce all of the current metadata. Such a conventional arrangement may be more difficult to manage and implement in the event of a failure.
According to one approach of the present invention, MDS servers may run in redundant process pairs. Thus, a failed MDS server process can be replaced by another. Still further, for situations that include multiple MDS servers, the ratio of redundant MDS servers to primary MDS servers does not have to be 1:1. As but one example, a first MDS server may access metadata alphabetically from letters A-C. A second MDS server may access metadata from letters D-F. A redundant MDS server may access metadata from letters A-F.
Consequently, a failure of the first and/or second MDS servers can be met by the redundant MDS server.
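The A-C, D-F and A-F example could be sketched as follows. The class `MdsServer`, its `alive` flag, and the function `route` are hypothetical names introduced for illustration; how failures are actually detected is not addressed here.

```python
from dataclasses import dataclass


@dataclass
class MdsServer:
    """Hypothetical MDS server covering an inclusive range of first letters."""
    name: str
    first: str
    last: str
    alive: bool = True

    def covers(self, filename: str) -> bool:
        return self.first <= filename[0].upper() <= self.last


def route(filename, servers):
    """Return the name of the first live server whose range covers the file.

    Servers are tried in list order, so primaries should precede
    the redundant server.
    """
    for server in servers:
        if server.alive and server.covers(filename):
            return server.name
    raise LookupError(f"no live MDS server covers {filename!r}")
```

With servers listed as first (A-C), second (D-F) and redundant (A-F), a request for a file beginning with "B" goes to the first server while it is alive and to the redundant server once the first is marked as failed.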
Adding a Metadata Resource and/or Metadata Partition.
According to the present invention, partitions and/or resources may be added. By providing such a capability, a MDS may be scaled to accommodate larger amounts of metadata and/or different metadata arrangements/configurations. It is noted that in monolithic server approaches, the entire server may have to be shut down to accommodate additional storage for metadata. Still further, for different metadata arrangements/configurations, the software of the monolithic server may have to be upgraded and/or customized, also requiring server "down" time.
FIG. 3A provides one example illustrating how partitions and/or system resources may be added. FIG. 3A shows the MDS of FIG. 1 following the addition of two new partitions 102-X and 102-Y, and a new resource 104-Y.
A new resource 104-Y may be assigned to new partition 102-X. In the particular example of FIG. 3A, new resource 104-Y can be a CLASS 1 resource. Further, a new partition 102-Y may be assigned to an existing resource 104-5. This may be particularly advantageous if a system resource 104-5 is being underutilized. As shown in FIG. 3A, a system resource 104-5 can be a CLASS 2 resource.
Merging and Splitting Metadata Partitions.
A MDS according to the present invention may provide additional flexibility by enabling the merging and/or splitting of metadata partitions. FIG. 3B shows the system of FIG. 3A following the splitting of partition 102-2 into two partitions 102-20 and 102-21 and the merging of partitions 102-7 and 102-9 to form a partition 102-7/9. Such a capability is in contrast to a monolithic server approach, which may suffer from decreased performance, as existing resources might not be capable of addressing unexpected file activity and/or may require downtime to add resources.
In the example of FIG. 3B, partition 102-2 was previously assigned to resource 104-1. The partition could be split into two partitions 102-20 and 102-21 assigned to resources 104-1 and 104-Y, respectively. Such a partition splitting could occur for various reasons. As but one example, the amount and/or type of metadata in partition 102-2 may be growing in size and/or being accessed more often. To accommodate such a larger size or increased resource needs, partitions 102-20 and 102-21 could be formed by splitting partition 102-2.
Of course, a partition splitting does not always include assigning a split-off partition to a different resource. Other operations may include splitting a partition for one resource, and assigning one of the new partitions to the same resource.
In the example of FIG. 3B, partitions 102-7 and 102-9 were previously assigned to resources 104-3 and 104-4, respectively. The partitions could be merged into a single partition 102-7/9 for various reasons. As but one example, the amount and/or type of metadata in partitions 102-7 and 102-9 may not justify different partitions and/or the assignment of multiple resources to the partitions. The merging of partitions (102-7 and 102-9) may allow resource 104-3 to be applied to its remaining partitions, which may optimize system performance.
In this way metadata partition splitting and merging can provide for more flexibility, scalability, and/or optimization in a MDS.
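The split and merge operations of FIG. 3B can be sketched with partitions modeled as dictionaries. This is a minimal sketch under assumptions: the `Partition` type, both function names, and the idea of splitting by a caller-supplied predicate are hypothetical, since the specification does not state how the metadata is divided between the new partitions.

```python
from typing import Callable, Dict, Tuple

# Hypothetical model: a partition maps a filehandle to its file attributes.
Partition = Dict[str, dict]


def split_partition(partition: Partition,
                    goes_first: Callable[[str], bool]) -> Tuple[Partition, Partition]:
    """Divide one partition into two according to a predicate on the filehandle."""
    first = {fh: attrs for fh, attrs in partition.items() if goes_first(fh)}
    second = {fh: attrs for fh, attrs in partition.items() if not goes_first(fh)}
    return first, second


def merge_partitions(a: Partition, b: Partition) -> Partition:
    """Combine two partitions; a filehandle may not appear in both."""
    clash = a.keys() & b.keys()
    if clash:
        raise ValueError(f"filehandles present in both partitions: {sorted(clash)}")
    return {**a, **b}


if __name__ == "__main__":
    p_102_2 = {"fh-a": {"size": 1}, "fh-m": {"size": 2}, "fh-z": {"size": 3}}
    p_102_20, p_102_21 = split_partition(p_102_2, lambda fh: fh < "fh-n")
    # Resource assignment is tracked separately, mirroring FIG. 3B.
    assignment = {"102-20": "104-1", "102-21": "104-Y"}
    print(sorted(p_102_20), sorted(p_102_21), assignment)
```

Keeping the partition-to-resource assignment in a separate mapping reflects the point made above that a split does not necessarily move a partition to a different resource.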
Metadata Migration.
A MDS 100 according to the present invention can optimize resources over conventional approaches by migrating metadata to different system resource classes based on predetermined policies. Such predetermined policies may include, without limitation, access time for a file, client quality of service, number of metadata nodes in a partition, amount of remaining available space in a partition, etc. Of course, such particular policies are but examples and should in no way be considered limiting to the invention.
As but one very particular example, metadata that has not been accessed in a certain period of time can be migrated to a system resource that can be slower and/or less expensive (i.e., the metadata may be "aged"). This is in contrast to conventional monolithic server approaches, which can maintain a single, growing metadata collection, assigned to the same resource. FIGS. 3B and 3C illustrate a metadata migration operation.
It is first noted that system resources 104-0 to 104-3 and 104-Y are of a first class (CLASS 1), while system resources 104-4 and 104-5 are of a second class (CLASS 2). The first class resources are assumed, in this example, to be faster and/or more expensive than the second class resources. As but one very limited example, a second class resource may include a slower computing machine, and/or run a slower process, and/or use a smaller amount of memory in operation, and/or store data on a slower or less expensive medium.
FIG. 3B shows particular metadata 106 located in partition 102-3. Partition 102-3 has been assigned to first class resource 104-2. Based on predetermined policies, particular metadata 106 will be migrated to a lower class resource.
In FIG. 3C, particular metadata 106 has been migrated to partition 102-11, which is assigned to second class resource 104-5. Further, a forwarding object 108 can be placed in partition 102-3 that can provide information to access the new (migrated) location of particular metadata 106 in partition 102-11.
Of course, the metadata migration example of FIGS. 3B and 3C shows but two classes of resources. More than two classes of resources could be included allowing policy based metadata migration through various classes of resources. Still further, in a typical operation multiple nodes can be migrated for more efficient use of system resources.
It is noted that migration of metadata according to the present invention could be independent of actual file migration. More particularly, while files may be migrated in some sort of storage system according to one set of criteria and/or policies, different criteria/policies could be used to migrate metadata in a MDS 100.
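The "aging" policy of FIGS. 3B and 3C can be sketched as follows: metadata idle for longer than a threshold moves to a partition on a lower class resource, and a forwarding object is left behind. All names here (`Partition`, `Forward`, `migrate_aged`, the `last_access` field, and the numeric times) are assumptions made for illustration, not details taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Forward:
    """Hypothetical forwarding object: records where the metadata now lives."""
    partition_id: str


@dataclass
class Partition:
    resource_class: int                      # 1 = faster, 2 = slower/cheaper
    nodes: Dict[str, object] = field(default_factory=dict)


def migrate_aged(partitions: Dict[str, Partition], src_id: str, dst_id: str,
                 now: float, max_idle: float) -> List[str]:
    """Move metadata not accessed within max_idle from src to dst.

    Each moved entry is replaced in the source by a Forward to dst_id.
    """
    src, dst = partitions[src_id], partitions[dst_id]
    moved = []
    for key, record in list(src.nodes.items()):
        if isinstance(record, Forward):
            continue  # already migrated earlier
        if now - record["last_access"] > max_idle:
            dst.nodes[key] = record
            src.nodes[key] = Forward(dst_id)
            moved.append(key)
    return moved


if __name__ == "__main__":
    parts = {
        "102-3": Partition(1, {"106": {"last_access": 0.0}}),
        "102-11": Partition(2),
    }
    print(migrate_aged(parts, "102-3", "102-11", now=100.0, max_idle=30.0))
```

Other policies named in the text (quality of service, node count, remaining space) would replace the idle-time test with a different condition.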
Filehandle Evolution.
Having described a metadata service and various operations with respect to partitions and system resources, the present invention will now be described with reference to a file system that contains metadata.
As noted previously, metadata may include information for a file. Such information may include assorted information particular to the file. In addition, the metadata for each file can have a corresponding unique identifier: a filehandle. In one particular embodiment, a filehandle may include immutable portions and changeable portions. Immutable portions can include unique identifiers that do not change when the metadata is moved for any number of reasons, including migration and/or renaming. However, a filehandle may also include changeable portions. Such changeable portions may change when metadata is moved from one metadata partition to another.
The term filehandle evolution is used herein to describe the process by which a metadata filehandle may be changed when the metadata is moved. Various examples of filehandle evolution will now be described.
Metadata Movement - Migration.
Referring now to FIGS. 4A, 4B and 5, an example of metadata migration will be described. In the case of metadata migration, it can be desirable to ensure that the metadata of a file remains accessible during and after its move to a different partition (e.g., a partition assigned to a lower class resource).
The example of FIGS. 4A and 4B represents the movement of metadata for file system node 402 from one partition 400-1 of one class (CLASS 1) to a different partition 400-4 of another class (CLASS 2). The information corresponding to file system node 402 can be changed to allow the corresponding file metadata to be accessed even after it has been moved.
One example of metadata that may correspond to node 402 is shown as item 404.
Metadata 404 is shown to generally include a filehandle and one or more associated file attributes. The particular metadata 404 has a filehandle of "filehandle 0" and attributes of "attributes 0".
Referring now to FIG. 4B, in a filehandle evolution operation, the metadata corresponding to node 402 has been moved to partition 400-4 and corresponds to a new node 406. Further, a forwarding object 410 now corresponds to the previous metadata location within partition 400-1. FIG. 4B shows one example of a forwarding object 410 at the "old" node, as well as new metadata 408 corresponding to "new" node 406.
In the example of FIG. 4B, newly moved metadata 408 can include a new filehandle "filehandle 1" and yet retain its file attribute information. The new filehandle (filehandle 1) can reflect the new location of metadata 408.
A forwarding object 410 may provide a number of functions. In the particular example of FIG. 4B, a forwarding object can return the new filehandle (filehandle 1), thereby allowing a service to access the new metadata location. Further, a forwarding object may return a particular message (e.g., an error message) indicating that the metadata has been moved. In one very particular example, a MDS client can handle forwarding object errors, making accesses to multiple partitions and providing desired metadata without the multiple accesses being apparent to a client.
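The behaviour described above, in which a forwarding object reports that metadata has moved and a MDS client retries transparently, can be sketched as follows. This is an illustrative sketch only: the `Forwarding` and `MetadataMoved` classes, the two-part filehandle tuple, and the hop limit are hypothetical and are not prescribed by the specification.

```python
from typing import Dict, Tuple

# Simplified stand-in for a filehandle: (partition id, file id).
Filehandle = Tuple[str, str]


class Forwarding:
    """Hypothetical forwarding object left at the old metadata location."""
    def __init__(self, new_filehandle: Filehandle):
        self.new_filehandle = new_filehandle


class MetadataMoved(Exception):
    """Error returned when the requested metadata has been moved."""
    def __init__(self, new_filehandle: Filehandle):
        super().__init__(f"metadata moved to {new_filehandle}")
        self.new_filehandle = new_filehandle


def server_read(partitions: Dict[str, dict], filehandle: Filehandle) -> dict:
    """Read metadata from one partition, or raise if only a forward is there."""
    part_id, file_id = filehandle
    entry = partitions[part_id][file_id]
    if isinstance(entry, Forwarding):
        raise MetadataMoved(entry.new_filehandle)
    return entry


def client_read(partitions: Dict[str, dict], filehandle: Filehandle,
                max_hops: int = 4) -> Tuple[Filehandle, dict]:
    """Follow forwarding errors so the caller never sees the extra accesses."""
    for _ in range(max_hops + 1):
        try:
            return filehandle, server_read(partitions, filehandle)
        except MetadataMoved as moved:
            filehandle = moved.new_filehandle
    raise RuntimeError("too many forwarding hops")
```

Returning the final filehandle alongside the metadata lets the caller cache the new location and skip the forwarding object on later accesses.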
FIG. 5 provides one very particular example of a filehandle evolution. FIG. 5 includes a first filehandle 500 corresponding to metadata prior to movement between partitions (e.g., due to migration). A second filehandle 502 corresponds to the filehandle after movement of the corresponding metadata between partitions.
The filehandles (500 and 502) have very particular fields, which should not be construed as limiting the invention. The filehandles show a filehandle type field (ftype), a filesystem id field (filesys id), a system assigned identifier field (system id), a file identifier field (file id), a partition identifier (part id), and a directory identifier (dir id).
An ftype field may indicate the particular type of file corresponding to the metadata (e.g., standard, directory, etc.). A system id field may indicate a unique value assigned to a file by a system. A file id may indicate a unique value for identifying a particular file. A filesystem id field may identify a particular file system type (e.g., Unix or NFS). The filesys id and file id fields may be immutable portions of a filehandle.
Filehandles 500 and 502 may also include changeable portions. In the particular example of FIG. 5, changeable portions may include the part id, dir id and ftype fields. The particular filehandle evolution of FIG. 5 may correspond to the operation shown in FIGS. 4A and 4B. Thus, in the metadata movement, a part id may be changed from "0001" (which may correspond to partition 400-1) to "0004" (which may correspond to partition 400-4).
Because the metadata corresponding to filehandles 500 and 502 may maintain the same logical relationship within a file system, a dir id value may remain unchanged.
Metadata Movement - Renaming.
The example of FIGS. 6A and 6B shows the renaming of a particular file, resulting in a file system node 602 being moved from one partition 600-1 to another partition 600-4. The data associated with file system node 602 can be changed to allow the corresponding file metadata to be accessed even after the file has been renamed.
As in the migration example shown in FIGS. 4A and 4B, a filehandle may have one value (filehandle 0) prior to renaming, and another value (filehandle n) after the renaming.
Further, a forwarding object 610 may be created that corresponds to the "old" node.
FIG. 7 provides one particular detailed example of an old filehandle 700 prior to a renaming and a new filehandle 702 after a renaming. Filehandles 700 and 702 are similar to those shown in FIG. 5. However, unlike FIG. 5, a new filehandle 702 may include a dir id field that changes from one directory value (000C 231A) to another (000C 8DF9), reflecting the new metadata's logical position in a file system.
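The two kinds of filehandle evolution described above (migration changes only the partition identifier; renaming also changes the directory identifier) can be sketched with an immutable record. The field names follow the fields listed for FIG. 5, written with underscores as Python identifiers; the class, the function names, and every field value other than the quoted part id and dir id values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Filehandle:
    ftype: str        # changeable
    filesys_id: str   # immutable portion
    system_id: str
    file_id: str      # immutable portion
    part_id: str      # changeable
    dir_id: str       # changeable


def evolve_for_migration(fh: Filehandle, new_part_id: str) -> Filehandle:
    """Migration: only the partition changes; the logical position is kept."""
    return replace(fh, part_id=new_part_id)


def evolve_for_rename(fh: Filehandle, new_part_id: str,
                      new_dir_id: str) -> Filehandle:
    """Renaming: the partition and the directory may both change."""
    return replace(fh, part_id=new_part_id, dir_id=new_dir_id)


def same_file(a: Filehandle, b: Filehandle) -> bool:
    """Two filehandles name the same file if their immutable portions match."""
    return (a.filesys_id, a.file_id) == (b.filesys_id, b.file_id)


if __name__ == "__main__":
    old = Filehandle("standard", "01", "0A", "00FF", "0001", "000C 231A")
    print(evolve_for_migration(old, "0004"))
    print(evolve_for_rename(old, "0004", "000C 8DF9"))
```

Comparing only the immutable portions is what allows a service to recognize an evolved filehandle as referring to the same file.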
In one embodiment, a forwarding object (such as 610 of FIG. 6B) may be temporary.
That is, according to particular predetermined policies, a forwarding object can be destroyed.
One of the many possible ways to accomplish the destruction of a forwarding object is illustrated in FIG. 8. FIG. 8 shows one example of a forwarding object table.
Such a table may monitor all current forwarding objects, and according to predetermined policies, destroy a forwarding object. The one example of FIG. 8 shows a forwarding object id column and a monitor column. A monitor column may include data that is monitored to determine when/if a forwarding object may be destroyed.
One of the many possible policies used to determine if a forwarding object should be destroyed may be the "age" of a forwarding object. If a forwarding object has been in existence for longer than a certain amount of time, the forwarding object will be destroyed.
Of course, various other policies may be used in addition to, or instead of, age. As but a few examples, forwarding objects may be destroyed based on infrequency of access, all forwarding objects can be destroyed in a periodic fashion, or forwarding objects may be destroyed simultaneously on a partition-by-partition basis, etc. Along these same lines, a function may be called that can compare information in the forwarding object to predetermined criteria and then destroy the object depending upon the comparison result.
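A forwarding object table with a pluggable destruction policy can be sketched as follows. The table layout (forwarding object id mapped to monitored data), the `created` field, and both function names are assumptions chosen for illustration; FIG. 8 itself is not reproduced here.

```python
from typing import Callable, Dict, List

# Hypothetical table: forwarding object id -> monitored data for that object.
ForwardingTable = Dict[str, dict]


def destroy_forwarding_objects(table: ForwardingTable,
                               should_destroy: Callable[[dict], bool]) -> List[str]:
    """Remove every entry whose monitored data satisfies the policy."""
    doomed = [fid for fid, monitor in table.items() if should_destroy(monitor)]
    for fid in doomed:
        del table[fid]
    return doomed


def older_than(max_age: float, now: float) -> Callable[[dict], bool]:
    """Age policy: destroy objects that have existed longer than max_age."""
    return lambda monitor: now - monitor["created"] > max_age


if __name__ == "__main__":
    table = {"fwd-1": {"created": 0.0}, "fwd-2": {"created": 90.0}}
    print(destroy_forwarding_objects(table, older_than(60.0, now=100.0)))
    print(sorted(table))
```

Passing the policy as a function matches the last sentence above: the comparison against predetermined criteria is separate from the act of destroying the object, so an access-frequency or per-partition policy can be substituted without changing the table code.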
While the above examples have described evolution of a single filehandle due to metadata movement between partitions, it would be obvious to one skilled in the art that metadata corresponding to multiple files may be moved together. FIGS. 9A and 9B show one representation of the movement of a group of metadata.
FIGS. 9A and 9B are a representation of a file system having various nodes, each of which may correspond to particular metadata. Nodes may be conceptualized as being distributed across various partitions (900-1 to 900-3). In FIG. 9A, metadata for a group of nodes 902 is within partition 900-2. In FIG. 9B, metadata for group 902 is moved from partition 900-2 to partition 900-3, to form new group 902'. A forwarding object 904 has also been associated with the highest directory location corresponding to the old group 902.
As noted in the previous examples, a forwarding object 904 may include information that can enable the metadata of moved group 902' to be accessed. This is represented by logical path 906.
FIG. 9B also includes new logical path 908 that may represent a renaming case, in which the new group 902' may have a new logical relationship within a file system. In the case of a migration operation, such a new logical path 908 may not exist. The filehandles corresponding to the nodes of new group 902' may be changed to represent the new partition location (900-3). In the case of a renaming operation, filehandles may be changed to reflect new directory information.
MDS Interface and Functions.
FIG. 10 shows a block diagram of one embodiment of a MDS 1000. A MDS 1000 may include various resources and partitions 1002. Resources and partitions 1002 may include a file system 1004 and metadata 1006 for a storage system. Metadata 1006 may include file attributes and filehandles.
A MDS 1000 may also include an interface 1008 that may call one or more functions in response to requests/accesses to metadata. One of the ways in which an interface may differ from conventional approaches is that the various functions may receive a filehandle as an input value that includes a particular partition value. In addition, the execution of a function may include accessing a desired partition, and then performing a particular metadata operation. This is in contrast to conventional monolithic server approaches, which may access a single partition of metadata, and so not include functions that operate by accessing one metadata partition from multiple metadata partitions.
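The two-step pattern described above (access the partition named by the filehandle, then perform the operation) might be sketched as follows. The filehandle layout and the `MDSInterface` class are assumptions made for illustration; the description states only that a filehandle includes a partition value.

```python
from typing import Callable, Dict, NamedTuple


class Filehandle(NamedTuple):
    # Hypothetical layout: a partition value plus an identifier within the partition.
    partition_id: int
    object_id: int


class MDSInterface:
    def __init__(self, partitions: Dict[int, Dict[int, dict]]):
        # partition_id -> {object_id -> metadata record}
        self.partitions = partitions

    def call(self, operation: Callable[[dict], object], fh: Filehandle) -> object:
        # Step 1: access the partition indicated by the filehandle.
        partition = self.partitions[fh.partition_id]
        # Step 2: perform the metadata operation inside that partition.
        return operation(partition[fh.object_id])
```

Because the partition is selected per call, the same function can serve metadata held on any of the partitions.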
Particular examples of functions that may be performed by an interface 1008 are set forth in FIGS. 11A, 11B and 12. FIGS. 11A and 11B show functions that may utilize input values that include partition id data. In particular, filehandle values for particular files, or directory values can be inputs to a function. The function may then use such values to access a particular partition and perform a particular operation. The various functions of FIGS. 11A, 11B and 12 are set forth in pseudocode.
As shown in FIG. 11A, a GetAttributes function may input a filehandle value and output the attributes corresponding to the filehandle. Such a function may include accessing a partition indicated by the input filehandle and then accessing the metadata corresponding to the filehandle located in the partition. The attributes may then be output as a returned value.
A SetAttributes function may input a filehandle value and a set of attributes (new attributes). A partition containing the metadata for the filehandle can be accessed, and the metadata may be changed to include the new set of attributes. The new attributes may then be output along with the corresponding filehandle and filename.
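A minimal Python sketch of the GetAttributes and SetAttributes behavior described above follows. It is not the pseudocode of FIG. 11A. The record layout is hypothetical, and merging the new attributes into the existing ones is one assumed reading of "changed to include".

```python
from typing import Dict, NamedTuple, Tuple


class Filehandle(NamedTuple):
    partition_id: int  # hypothetical layout, as the source only requires a partition value
    object_id: int


# partition_id -> object_id -> {"filename": str, "attributes": dict}
Partitions = Dict[int, Dict[int, dict]]


def get_attributes(partitions: Partitions, fh: Filehandle) -> dict:
    """Access the partition named by fh and return a copy of the attributes."""
    record = partitions[fh.partition_id][fh.object_id]
    return dict(record["attributes"])


def set_attributes(partitions: Partitions, fh: Filehandle,
                   new_attributes: dict) -> Tuple[dict, Filehandle, str]:
    """Merge new_attributes into the metadata; return attributes, filehandle, filename."""
    record = partitions[fh.partition_id][fh.object_id]
    record["attributes"].update(new_attributes)
    return dict(record["attributes"]), fh, record["filename"]
```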
A CreateMetadata function may input a parent directory value, filename, and attributes for a new file. New metadata can be created according to the new filename's position in the parent directory. A filehandle can be created for the metadata. Such a filehandle may include metadata partition information that indicates the location for the metadata. The new filehandle and attributes can then be output along with the new filename.
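As a rough sketch of a CreateMetadata-style function, the fragment below places new metadata in the parent directory's partition and embeds that partition in the new filehandle. The placement rule, the identifier scheme, and the record layout are all assumptions, since the description does not fix them.

```python
from typing import Dict, NamedTuple, Tuple


class Filehandle(NamedTuple):
    partition_id: int  # hypothetical layout
    object_id: int


def create_metadata(partitions: Dict[int, Dict[int, dict]], parent: Filehandle,
                    filename: str, attributes: dict) -> Tuple[Filehandle, dict, str]:
    """Create metadata for filename under parent; return filehandle, attributes, filename."""
    # Assumption: a new file lands in the same partition as its parent directory.
    partition = partitions[parent.partition_id]
    directory = partition[parent.object_id]
    if filename in directory["entries"]:
        raise FileExistsError(filename)
    # Assumption: the next free integer serves as the identifier within the partition.
    fh = Filehandle(parent.partition_id, max(partition) + 1)
    partition[fh.object_id] = {"filename": filename, "attributes": dict(attributes)}
    directory["entries"][filename] = fh
    return fh, dict(attributes), filename
```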
A RemoveMetadata function can be used to remove metadata from a system. Metadata may be located on its partition according to an input filehandle. Corresponding attributes may then be read. Metadata may then be deleted and a file system revised to reflect such a deletion. A flag may then be set to indicate that the metadata has been removed. The flag and read attributes may then be output. It is noted that a RemoveMetadata function may not actually delete metadata initially. As but one example, a RemoveMetadata function may maintain a list of metadata to be deleted. When a message (e.g., by way of another function or the like) indicates that the file corresponding to the metadata has been deleted, the corresponding metadata may then be deleted from its metadata partition.
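The deferred-deletion variant noted above might look like the following sketch. The `MetadataStore` class, its `pending` set, and the `file_deleted` callback are hypothetical names for the list of metadata to be deleted and the message that triggers actual deletion.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Filehandle(NamedTuple):
    partition_id: int  # hypothetical layout
    object_id: int


class MetadataStore:
    def __init__(self, partitions: Dict[int, Dict[int, dict]]):
        self.partitions = partitions
        self.pending: Set[Filehandle] = set()  # metadata queued for deletion

    def remove_metadata(self, fh: Filehandle) -> Tuple[bool, dict]:
        """Read the attributes, queue the metadata for deletion, return (flag, attributes)."""
        attributes = dict(self.partitions[fh.partition_id][fh.object_id])
        self.pending.add(fh)
        return True, attributes

    def file_deleted(self, fh: Filehandle) -> None:
        """Handle a message that the corresponding file itself has been deleted."""
        if fh in self.pending:
            self.pending.discard(fh)
            del self.partitions[fh.partition_id][fh.object_id]
```

Deferring the delete in this way keeps metadata available until the storage side confirms the file is gone.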
A RemoveName function can be used to remove metadata for a file that may include multiple links. A RemoveName function can receive a directory value and filename as inputs.
Metadata for the file that is to be removed can be located with directory and filename information. If the metadata is removed, a flag can be returned indicating the operation is complete. Otherwise, the attributes, filehandle and filename can be returned.
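One way to picture the RemoveName behavior is with a link count, as in the sketch below. A single partition is modeled for brevity, and the `links` and `entries` fields are assumed names not drawn from the figures.

```python
from typing import Dict, Optional, Tuple


def remove_name(partition: Dict[int, dict], directory_id: int,
                filename: str) -> Tuple[bool, Optional[tuple]]:
    """Unlink filename from a directory.

    Returns (True, None) when the last link is gone and the metadata is removed,
    otherwise (False, (attributes, object_id, filename)) for the surviving metadata.
    """
    entries = partition[directory_id]["entries"]
    object_id = entries.pop(filename)  # locate metadata via directory and filename
    record = partition[object_id]
    record["links"] -= 1
    if record["links"] == 0:
        del partition[object_id]
        return True, None
    return False, (dict(record["attributes"]), object_id, filename)
```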
Referring now to FIG. 11B, a RenameFile function can be used to change metadata when a file is renamed. A file filehandle prior to the name change (old_filehandle) and old filename (old_filename) can be used to access the metadata of the file that is to be renamed. A new filename (new_filename) and new parent directory (new_parent_directory) may be input to determine a new location for the metadata under the new name. A new filehandle may then be created based on the new name. As described above, a file name change may result in a filehandle having a change in a partition id value. The new filehandle can then be output along with corresponding attributes.
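The following sketch illustrates how a rename may change the partition id carried in a filehandle. The `placement` table mapping directories to partitions is a hypothetical stand-in for whatever rule the file system uses, and object identifiers are assumed to be unique across partitions.

```python
from typing import Dict, NamedTuple, Tuple


class Filehandle(NamedTuple):
    partition_id: int  # hypothetical layout
    object_id: int


def rename_file(partitions: Dict[int, Dict[int, dict]], placement: Dict[str, int],
                old_filehandle: Filehandle, old_filename: str,
                new_filename: str, new_parent_directory: str) -> Tuple[Filehandle, dict]:
    """Rename a file, moving its metadata if the new name falls in another partition."""
    record = partitions[old_filehandle.partition_id][old_filehandle.object_id]
    if record["filename"] != old_filename:
        raise ValueError("filehandle and old filename do not match")
    # Assumption: the new parent directory determines the target partition.
    new_partition_id = placement[new_parent_directory]
    del partitions[old_filehandle.partition_id][old_filehandle.object_id]
    record.update(filename=new_filename, parent=new_parent_directory)
    new_filehandle = Filehandle(new_partition_id, old_filehandle.object_id)
    partitions[new_partition_id][new_filehandle.object_id] = record
    return new_filehandle, dict(record["attributes"])
```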
A CreateLink function can be used to establish a hard link between a file and a directory (new_parent_directory). A link may be created under a file name (link_filename) in the directory. Metadata for the linked file may then be output along with the corresponding filename and filehandle.
In this way, various functions may access particular partitions according to input values and perform operations on the metadata of the partitions.
Referring now to FIG. 12, a ReadDirectory function can be used to read the metadata contained in an identified directory. A directory value may be input to identify the directory. In addition, a count value and last entry value (last_entry_read) may also be input. A count value can indicate the number of metadata entries that will be retrieved. A count value can initially be one, but may be a different value in the event the directory cannot be read in a single function call, and the function is called more than once.
A directory may be accessed according to the input directory value. In the event the function is being called for a first time, a metadata attribute list having a length equal to the count value can then be formed. If the attributes for all entries in the directory can fit in the list, a flag (call_function_again) can be set to a value indicating that the function does not need to be called a second time. If there are more attribute entries than the count value, the flag can be set to another value indicating that the function must be called again. Further, a last entry value (last_entry_read) can be returned so that a subsequent function call can begin where the previous function call left off.
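The resumable read described above can be sketched as a simple paging function. Returning entries in name order is an assumption made so that a last entry value is enough to resume; the directory is modeled as a dictionary from entry name to attributes.

```python
from typing import Dict, List, Optional, Tuple


def read_directory(directory: Dict[str, dict], count: int,
                   last_entry_read: Optional[str] = None
                   ) -> Tuple[List[dict], bool, Optional[str]]:
    """Return up to count attribute entries after last_entry_read.

    The result is (attribute list, call_function_again flag, new last entry value).
    """
    names = sorted(directory)  # assumption: a stable name ordering
    if last_entry_read is not None:
        names = [n for n in names if n > last_entry_read]
    page = names[:count]
    call_function_again = len(names) > count
    last = page[-1] if page else last_entry_read
    return [directory[n] for n in page], call_function_again, last
```

A caller would loop, passing the returned last entry value back in, until the flag indicates that no further call is needed.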
Lookup and MultiLookup functions can be used to retrieve attributes corresponding to a particular file name or a multipart file name. Parent directory values may be input along with a file name or multipart file name. Attributes corresponding to the file name(s) can then be output.
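A sketch of Lookup and MultiLookup follows, in which each directory entry stores a filehandle so that successive components of a multipart name may reside on different partitions. The record layout, and the choice to return one attribute set per component, are assumptions.

```python
from typing import Dict, List, NamedTuple, Tuple


class Filehandle(NamedTuple):
    partition_id: int  # hypothetical layout
    object_id: int


Partitions = Dict[int, Dict[int, dict]]


def lookup(partitions: Partitions, parent: Filehandle, name: str) -> Tuple[Filehandle, dict]:
    """Resolve one name in a parent directory; return its filehandle and attributes."""
    directory = partitions[parent.partition_id][parent.object_id]
    child = directory["entries"][name]
    record = partitions[child.partition_id][child.object_id]
    return child, dict(record["attributes"])


def multi_lookup(partitions: Partitions, parent: Filehandle, multipart_name: str) -> List[dict]:
    """Resolve each component of a multipart name, possibly crossing partitions."""
    results = []
    current = parent
    for part in multipart_name.strip("/").split("/"):
        current, attributes = lookup(partitions, current, part)
        results.append(attributes)
    return results
```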
In this way, a file system may be accessed (e.g., via a directory structure) to retrieve metadata for particular files. It is noted that, unlike conventional approaches, metadata can be retrieved from different metadata partitions.
Of course, the above-described functions represent but particular examples and one particular set of functions that may be provided by a MDS interface.
Examples of Systems That May Include A MDS.
Referring now to FIG. 13, a block diagram is shown illustrating one example of a data storage system that may include a metadata management system (MDS) according to one embodiment. The data storage system is designated by the general reference character 1300 and is shown to be sub-divided into a number of sub-systems, including a gateway service 1302, a MDS 1304, and a storage service 1306. A storage service 1306 may further include a bitfile management service (BMS) 1306-0, and a bitfile storage service (BSS) 1306-1.
Files may be stored by the BSS 1306-1. A BMS 1306-0 can manage accesses to the files. A MDS 1304 may store metadata corresponding to the files stored in the BSS 1306-1.
Such metadata may include, without limitation, unique file identifying information, such as a filehandle. Further, such metadata may also include other information that can be used by other systems to identify the location of a data file. As noted above, such information can change in the event a file is moved, renamed or otherwise manipulated.
As previously noted, MDS 1304 may include multiple partitions, or the ability to accommodate multiple partitions of metadata. Further, in particular embodiments, a MDS 1304 may include an interface for executing various file related functions, including those that create, remove, and rename data files, as well as those that access various attributes of data files stored in the overall data storage system 1300. Unlike a conventional monolithic server approach, the MDS 1304 may include a collection of loosely coupled servers that service metadata requests and functions, where servers in the MDS are separate from those servers situated in the storage service 1306 that provide access to the files corresponding to the metadata.
As previously noted, a MDS 1304 may include metadata distributed across multiple partitions. Multiple partitions are diagrammatically represented in FIG. 13 by items 1308-0 to 1308-4. It is understood that partitions may exist as data blocks, files, and even databases on various different types of media. Advantageously, system resources may be assigned to various partitions, and may also vary. System resources may include physical machines, storage media space, as well as particular processes. Such processes may include various functions for accessing and/or manipulating the metadata.
According to one embodiment, a gateway service 1302 may receive various requests from a client. Metadata related requests can be serviced by the MDS 1304, which can include its own set of servers and multiple partitions. Actual file related service (e.g., reads, writes, etc.) can be serviced by the storage service 1306, which may include servers and partitions separate from those of the MDS 1304. In the example of FIG. 13, gateway service 1302 may receive client requests by way of a network 1310, such as the Internet, as but one example.
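The division of labor in FIG. 13 can be illustrated with a small routing sketch. The operation names and the request shape are hypothetical. The only point shown is that metadata requests and file requests are handed to separate services.

```python
from typing import Callable

# Hypothetical operation names, grouped by the service that would handle them.
METADATA_OPS = {"get_attributes", "set_attributes", "lookup", "rename", "read_directory"}
FILE_OPS = {"read", "write"}


def route(request: dict, mds: Callable[[dict], object],
          storage_service: Callable[[dict], object]) -> object:
    """Gateway-style dispatch: metadata requests go to the MDS, file I/O to storage."""
    op = request["op"]
    if op in METADATA_OPS:
        return mds(request)
    if op in FILE_OPS:
        return storage_service(request)
    raise ValueError(f"unsupported operation: {op}")
```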
While the example of FIG. 13 has described a metadata management system (MDS) that is essentially "de-coupled" from an independent storage service, alternate embodiments could include more particular correspondence between metadata and the files corresponding to the metadata. One such example is shown in FIG. 14.
FIG. 14 shows an alternate embodiment in which metadata and corresponding files are managed with the same "granularity." FIG. 14 includes a system 1400 with a metadata management system (MDS) 1402 and a corresponding storage service 1404. Other system components (such as a gateway) have been excluded to avoid unduly cluttering the view. A MDS 1402 and storage service 1404 can be independent servers that provide various system resources to corresponding partitions. Metadata partitions are represented in FIG. 14 by items 1406-0 to 1406-4. File storage partitions are represented by items 1408-0 to 1408-4.
Identical granularity exists because for each metadata partition there is a corresponding file storage partition. Correspondence between metadata partitions and file storage partitions is shown by dashed lines 1410-0 to 1410-4.
In an arrangement such as FIG. 14, movement of metadata from one metadata partition to another may require the corresponding file to be moved between the corresponding file storage partitions.
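Under the one-to-one correspondence of FIG. 14, a migration would move both halves together. The sketch below uses hypothetical dictionaries for the two sets of partitions, indexed so that metadata partition i pairs with file storage partition i.

```python
from typing import Dict


def migrate_together(metadata_partitions: Dict[int, dict], file_partitions: Dict[int, dict],
                     key: str, src: int, dst: int) -> None:
    """Move metadata and its file from partition pair src to partition pair dst."""
    # Check both halves first so a missing entry cannot leave the pair split.
    if key not in metadata_partitions[src] or key not in file_partitions[src]:
        raise KeyError(key)
    metadata_partitions[dst][key] = metadata_partitions[src].pop(key)
    file_partitions[dst][key] = file_partitions[src].pop(key)
```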
Of course, it is understood that FIG. 14 represents but one of the many possible variations according to the present invention.
It is thus understood that while the preferred embodiments set forth herein have been described in detail, the present invention could be subject to various changes, substitutions, and alterations without departing from the spirit and scope of the invention.
Accordingly, the present invention is intended to be limited only as defined by the appended claims.
Claims (21)
What is claimed is:
1. A method of storing data files, comprising the steps of:
storing files in a storage service system; and storing metadata corresponding to the files on a plurality of metadata partitions that are separate from the storage service system.
2. The method of claim 1, further including:
assigning each metadata partition to a resource, each resource providing access to the metadata in the metadata partition to which the resource is assigned.
3. The method of claim 2, wherein:
the resources include at least one first class resource and at least one second class resource, the second class resource providing access to a metadata partition in a different manner than the first class resource provides access to a metadata partition.
4. The method of claim 3, wherein:
the second class resource has fewer computing resources than the first class resource.
5. The method of claim 1, further including:
moving metadata from one metadata partition to another metadata partition according to predetermined policies.
6. The method of claim 5, wherein:
the predetermined policies include an amount of time since the metadata was last accessed.
7. The method of claim 5, wherein:
the predetermined policies include available metadata partition space.
8. The method of claim 5, wherein:
the predetermined policies include a quality-of-service value.
9. The method of claim 1, wherein:
each file has a corresponding file name; and further including:
moving metadata from a first metadata partition to a second metadata partition when a file name is changed, if the new file name falls within the second metadata partition according to a file system.
10. The method of claim 5, wherein:
moving the metadata from the one metadata partition to another metadata partition includes moving the metadata to another metadata partition, and placing a forwarding object in the one metadata partition that indicates the new location of the moved metadata in the other metadata partition.
11. The method of claim 10, further including:
maintaining a list of forwarding objects and destroying selected forwarding objects based on predetermined policies.
12. The method of claim 11, wherein:
the predetermined policies include last time of access to the forwarding object.
13. The method of claim 1, further including:
splitting a metadata partition into at least two different metadata partitions.
14. A system, comprising:
a storage service that stores a plurality of files; and a metadata management system that includes a plurality of metadata computing resources, and a plurality of metadata partitions that store metadata corresponding to the files, at least one metadata partition being assigned to each metadata computing resource.
15. The system of claim 14, wherein:
each metadata computing resource is selected from the group consisting of a computing machine, a storage medium, and a computing process.
16. The system of claim 14, wherein:
the metadata management system further includes a file system that indexes to metadata stored on at least two metadata partitions.
17. The system of claim 14, wherein:
the storage service includes a plurality of file storage partitions, each file storage partition being assigned to a storage computing resource that is different than a metadata computing resource.
18. The system of claim 17, wherein:
each metadata partition corresponds to a file storage partition, and stores metadata for files stored in the corresponding file storage partition.
19. A method for migrating metadata in a storage system, comprising the steps of:
storing metadata on a plurality of metadata partitions;
assigning a computing resource to each metadata partition, where such computing resources include a first class resource and a second class resource;
moving metadata from a partition assigned to a first class resource to a partition assigned to a second class resource based upon predetermined policies.
20. The method of claim 19, wherein:
the predetermined policies include time of last access to the metadata.
21. The method of claim 19, wherein:
the metadata includes a filehandle associated with a corresponding file;
and the step of moving metadata includes moving selected metadata from an old metadata partition to a new metadata partition and placing a forwarding object in the old metadata partition that indicates the location of the selected metadata in the new metadata partition.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US65910700A | 2000-09-11 | 2000-09-11 | |
PCT/US2002/018940 WO2003107219A1 (en) | 2000-09-11 | 2002-06-12 | Storage system having partitioned migratable metadata |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2489324A1 (en) | 2003-12-24 |
Family
ID=32232854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002489324A Abandoned CA2489324A1 (en) | 2000-09-11 | 2002-06-12 | Storage system having partitioned migratable metadata |
Country Status (6)
Country | Link |
---|---|
US (1) | US7146377B2 (en) |
EP (1) | EP1532543A4 (en) |
JP (1) | JP2005530242A (en) |
AU (1) | AU2002312508B2 (en) |
CA (1) | CA2489324A1 (en) |
WO (1) | WO2003107219A1 (en) |
Families Citing this family (115)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8935307B1 (en) * | 2000-09-12 | 2015-01-13 | Hewlett-Packard Development Company, L.P. | Independent data access in a segmented file system |
US7836017B1 (en) | 2000-09-12 | 2010-11-16 | Hewlett-Packard Development Company, L.P. | File replication in a distributed segmented file system |
US20060288080A1 (en) * | 2000-09-12 | 2006-12-21 | Ibrix, Inc. | Balanced computer architecture |
US20040236798A1 (en) * | 2001-09-11 | 2004-11-25 | Sudhir Srinivasan | Migration of control in a distributed segmented file system |
US6782389B1 (en) * | 2000-09-12 | 2004-08-24 | Ibrix, Inc. | Distributing files across multiple, permissibly heterogeneous, storage devices |
US7778981B2 (en) * | 2000-12-01 | 2010-08-17 | Netapp, Inc. | Policy engine to control the servicing of requests received by a storage server |
US7346928B1 (en) * | 2000-12-01 | 2008-03-18 | Network Appliance, Inc. | Decentralized appliance virus scanning |
US7047420B2 (en) | 2001-01-17 | 2006-05-16 | Microsoft Corporation | Exclusive encryption |
US7043637B2 (en) | 2001-03-21 | 2006-05-09 | Microsoft Corporation | On-disk file format for a serverless distributed file system |
US6981138B2 (en) * | 2001-03-26 | 2005-12-27 | Microsoft Corporation | Encrypted key cache |
US7062490B2 (en) | 2001-03-26 | 2006-06-13 | Microsoft Corporation | Serverless distributed file system |
US6988124B2 (en) * | 2001-06-06 | 2006-01-17 | Microsoft Corporation | Locating potentially identical objects across multiple computers based on stochastic partitioning of workload |
US8346733B2 (en) | 2006-12-22 | 2013-01-01 | Commvault Systems, Inc. | Systems and methods of media management, such as management of media to and from a media storage library |
US7603518B2 (en) | 2005-12-19 | 2009-10-13 | Commvault Systems, Inc. | System and method for improved media identification in a storage device |
JP4168626B2 (en) * | 2001-12-06 | 2008-10-22 | 株式会社日立製作所 | File migration method between storage devices |
US8489742B2 (en) * | 2002-10-10 | 2013-07-16 | Convergys Information Management Group, Inc. | System and method for work management |
WO2004055675A1 (en) * | 2002-12-18 | 2004-07-01 | Fujitsu Limited | File management apparatus, file management program, file management method, and file system |
US7246207B2 (en) | 2003-04-03 | 2007-07-17 | Commvault Systems, Inc. | System and method for dynamically performing storage operations in a computer network |
WO2004090789A2 (en) | 2003-04-03 | 2004-10-21 | Commvault Systems, Inc. | System and method for extended media retention |
JP4330941B2 (en) | 2003-06-30 | 2009-09-16 | 株式会社日立製作所 | Database divided storage management apparatus, method and program |
US20050039001A1 (en) * | 2003-07-30 | 2005-02-17 | Microsoft Corporation | Zoned based security administration for data items |
US8266406B2 (en) | 2004-04-30 | 2012-09-11 | Commvault Systems, Inc. | System and method for allocation of organizational resources |
US7343356B2 (en) | 2004-04-30 | 2008-03-11 | Commvault Systems, Inc. | Systems and methods for storage modeling and costing |
US7657581B2 (en) * | 2004-07-29 | 2010-02-02 | Archivas, Inc. | Metadata management for fixed content distributed data storage |
WO2006052872A2 (en) | 2004-11-05 | 2006-05-18 | Commvault Systems, Inc. | System and method to support single instance storage operations |
US7805469B1 (en) * | 2004-12-28 | 2010-09-28 | Symantec Operating Corporation | Method and apparatus for splitting and merging file systems |
US8108579B2 (en) * | 2005-03-31 | 2012-01-31 | Qualcomm Incorporated | Mechanism and method for managing data storage |
US8112605B2 (en) | 2005-05-02 | 2012-02-07 | Commvault Systems, Inc. | System and method for allocation of organizational resources |
JP4704161B2 (en) * | 2005-09-13 | 2011-06-15 | 株式会社日立製作所 | How to build a file system |
US8091089B2 (en) * | 2005-09-22 | 2012-01-03 | International Business Machines Corporation | Apparatus, system, and method for dynamically allocating and adjusting meta-data repository resources for handling concurrent I/O requests to a meta-data repository |
US8655850B2 (en) * | 2005-12-19 | 2014-02-18 | Commvault Systems, Inc. | Systems and methods for resynchronizing information |
US20110010518A1 (en) | 2005-12-19 | 2011-01-13 | Srinivas Kavuri | Systems and Methods for Migrating Components in a Hierarchical Storage Network |
JP4905989B2 (en) * | 2005-12-22 | 2012-03-28 | 独立行政法人海洋研究開発機構 | Metadata search device |
US20070156775A1 (en) * | 2005-12-29 | 2007-07-05 | Fischer Iija | Metadata transformation in copy and paste scenarios between heterogeneous applications |
US8108549B2 (en) * | 2006-04-04 | 2012-01-31 | International Business Machines Corporation | Method for using the loopback interface in a computer system having multiple workload partitions |
JP4921054B2 (en) * | 2006-07-07 | 2012-04-18 | 株式会社日立製作所 | Load balancing control system and load balancing control method |
US7870102B2 (en) * | 2006-07-12 | 2011-01-11 | International Business Machines Corporation | Apparatus and method to store and manage information and meta data |
US7539783B2 (en) | 2006-09-22 | 2009-05-26 | Commvault Systems, Inc. | Systems and methods of media management, such as management of media to and from a media storage library, including removable media |
US7831566B2 (en) | 2006-12-22 | 2010-11-09 | Commvault Systems, Inc. | Systems and methods of hierarchical storage management, such as global management of storage operations |
KR100899147B1 (en) * | 2007-05-04 | 2009-05-27 | 한양대학교 산학협력단 | Method of storing meta-data and system for storing meta-data |
US8135688B2 (en) * | 2007-06-15 | 2012-03-13 | Oracle International Corporation | Partition/table allocation on demand |
US8209294B2 (en) * | 2007-06-15 | 2012-06-26 | Oracle International Corporation | Dynamic creation of database partitions |
US8140493B2 (en) * | 2007-06-15 | 2012-03-20 | Oracle International Corporation | Changing metadata without invalidating cursors |
US8356014B2 (en) * | 2007-06-15 | 2013-01-15 | Oracle International Corporation | Referring to partitions with for (values) clause |
US8706976B2 (en) | 2007-08-30 | 2014-04-22 | Commvault Systems, Inc. | Parallel access virtual tape library and drives |
US7783666B1 (en) | 2007-09-26 | 2010-08-24 | Netapp, Inc. | Controlling access to storage resources by using access pattern based quotas |
US9413825B2 (en) * | 2007-10-31 | 2016-08-09 | Emc Corporation | Managing file objects in a data storage system |
US8078957B2 (en) | 2008-05-02 | 2011-12-13 | Microsoft Corporation | Document synchronization over stateless protocols |
US20100070466A1 (en) | 2008-09-15 | 2010-03-18 | Anand Prahlad | Data transfer techniques within data storage devices, such as network attached storage performing data migration |
US9996572B2 (en) * | 2008-10-24 | 2018-06-12 | Microsoft Technology Licensing, Llc | Partition management in a partitioned, scalable, and available structured storage |
US8448004B2 (en) * | 2008-10-27 | 2013-05-21 | Netapp, Inc. | Power savings using dynamic storage cluster membership |
JP4650556B2 (en) * | 2008-10-31 | 2011-03-16 | ブラザー工業株式会社 | Network equipment |
US20100153463A1 (en) * | 2008-12-15 | 2010-06-17 | Honeywell International Inc. | run-time database redirection system |
US8219526B2 (en) * | 2009-06-05 | 2012-07-10 | Microsoft Corporation | Synchronizing file partitions utilizing a server storage model |
CA2673554C (en) * | 2009-07-21 | 2017-01-03 | Ibm Canada Limited - Ibm Canada Limitee | Web distributed storage system |
US8671265B2 (en) | 2010-03-05 | 2014-03-11 | Solidfire, Inc. | Distributed data storage system providing de-duplication of data using block identifiers |
US8285762B2 (en) | 2010-05-11 | 2012-10-09 | International Business Machines Corporation | Migration of metadata and storage management of data in a first storage environment to a second storage environment |
US8595184B2 (en) | 2010-05-19 | 2013-11-26 | Microsoft Corporation | Scaleable fault-tolerant metadata service |
US8812445B2 (en) * | 2010-09-24 | 2014-08-19 | Hitachi Data Systems Corporation | System and method for managing scalability in a distributed database |
US9244779B2 (en) | 2010-09-30 | 2016-01-26 | Commvault Systems, Inc. | Data recovery operations, such as recovery from modified network data management protocol data |
US9619474B2 (en) * | 2011-03-31 | 2017-04-11 | EMC IP Holding Company LLC | Time-based data partitioning |
US9838269B2 (en) | 2011-12-27 | 2017-12-05 | Netapp, Inc. | Proportional quality of service based on client usage and system metrics |
US20130227145A1 (en) * | 2011-12-27 | 2013-08-29 | Solidfire, Inc. | Slice server rebalancing |
US9054992B2 (en) | 2011-12-27 | 2015-06-09 | Solidfire, Inc. | Quality of service policy sets |
WO2013148096A1 (en) | 2012-03-30 | 2013-10-03 | Commvault Systems, Inc. | Informaton management of mobile device data |
US20140181085A1 (en) | 2012-12-21 | 2014-06-26 | Commvault Systems, Inc. | Data storage system for analysis of data across heterogeneous information management systems |
US10379988B2 (en) | 2012-12-21 | 2019-08-13 | Commvault Systems, Inc. | Systems and methods for performance monitoring |
US9069799B2 (en) | 2012-12-27 | 2015-06-30 | Commvault Systems, Inc. | Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system |
US9021452B2 (en) | 2012-12-27 | 2015-04-28 | Commvault Systems, Inc. | Automatic identification of storage requirements, such as for use in selling data storage management solutions |
US9552288B2 (en) | 2013-02-08 | 2017-01-24 | Seagate Technology Llc | Multi-tiered memory with different metadata levels |
US9531722B1 (en) | 2013-10-31 | 2016-12-27 | Google Inc. | Methods for generating an activity stream |
US9542457B1 (en) | 2013-11-07 | 2017-01-10 | Google Inc. | Methods for displaying object history information |
US9614880B1 (en) | 2013-11-12 | 2017-04-04 | Google Inc. | Methods for real-time notifications in an activity stream |
US10949382B2 (en) | 2014-01-15 | 2021-03-16 | Commvault Systems, Inc. | User-centric interfaces for information management systems |
US9509772B1 (en) | 2014-02-13 | 2016-11-29 | Google Inc. | Visualization and control of ongoing ingress actions |
US20150244795A1 (en) | 2014-02-21 | 2015-08-27 | Solidfire, Inc. | Data syncing in a distributed system |
US9798596B2 (en) | 2014-02-27 | 2017-10-24 | Commvault Systems, Inc. | Automatic alert escalation for an information management system |
US9495293B1 (en) * | 2014-05-05 | 2016-11-15 | EMC IP Holding Company, LLC | Zone consistency |
US9536199B1 (en) | 2014-06-09 | 2017-01-03 | Google Inc. | Recommendations based on device usage |
US9760446B2 (en) | 2014-06-11 | 2017-09-12 | Micron Technology, Inc. | Conveying value of implementing an integrated data management and protection system |
US9507791B2 (en) | 2014-06-12 | 2016-11-29 | Google Inc. | Storage system user interface with floating file collection |
US10078781B2 (en) | 2014-06-13 | 2018-09-18 | Google Llc | Automatically organizing images |
US9798728B2 (en) | 2014-07-24 | 2017-10-24 | Netapp, Inc. | System performing data deduplication using a dense tree data structure |
US9671960B2 (en) | 2014-09-12 | 2017-06-06 | Netapp, Inc. | Rate matching technique for balancing segment cleaning and I/O workload |
US10133511B2 (en) | 2014-09-12 | 2018-11-20 | Netapp, Inc. | Optimized segment cleaning technique |
US9836229B2 (en) | 2014-11-18 | 2017-12-05 | Netapp, Inc. | N-way merge technique for updating volume metadata in a storage I/O stack |
JP6037469B2 (en) * | 2014-11-19 | 2016-12-07 | International Business Machines Corporation | Information management system, information management method and program |
US9870420B2 (en) | 2015-01-19 | 2018-01-16 | Google Llc | Classification and storage of documents |
US9720601B2 (en) | 2015-02-11 | 2017-08-01 | Netapp, Inc. | Load balancing technique for a storage array |
US10956299B2 (en) | 2015-02-27 | 2021-03-23 | Commvault Systems, Inc. | Diagnosing errors in data storage and archiving in a cloud or networking environment |
US9762460B2 (en) | 2015-03-24 | 2017-09-12 | Netapp, Inc. | Providing continuous context for operational information of a storage system |
US9928144B2 (en) | 2015-03-30 | 2018-03-27 | Commvault Systems, Inc. | Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage |
US9710317B2 (en) | 2015-03-30 | 2017-07-18 | Netapp, Inc. | Methods to identify, handle and recover from suspect SSDS in a clustered flash array |
US10261943B2 (en) | 2015-05-01 | 2019-04-16 | Microsoft Technology Licensing, Llc | Securely moving data across boundaries |
US10324914B2 (en) | 2015-05-20 | 2019-06-18 | Commvault Systems, Inc. | Handling user queries against production and archive storage systems, such as for enterprise customers having large and/or numerous files |
US10275320B2 (en) | 2015-06-26 | 2019-04-30 | Commvault Systems, Inc. | Incrementally accumulating in-process performance data and hierarchical reporting thereof for a data stream in a secondary copy operation |
US9740566B2 (en) | 2015-07-31 | 2017-08-22 | Netapp, Inc. | Snapshot creation workflow |
US10101913B2 (en) | 2015-09-02 | 2018-10-16 | Commvault Systems, Inc. | Migrating data to disk without interrupting running backup operations |
US10176036B2 (en) | 2015-10-29 | 2019-01-08 | Commvault Systems, Inc. | Monitoring, diagnosing, and repairing a management database in a data storage management system |
US10929022B2 (en) | 2016-04-25 | 2021-02-23 | Netapp, Inc. | Space savings reporting for storage system supporting snapshot and clones |
US10642763B2 (en) | 2016-09-20 | 2020-05-05 | Netapp, Inc. | Quality of service policy sets |
US11032350B2 (en) | 2017-03-15 | 2021-06-08 | Commvault Systems, Inc. | Remote commands framework to control clients |
US10949308B2 (en) | 2017-03-15 | 2021-03-16 | Commvault Systems, Inc. | Application aware backup of virtual machines |
US11010261B2 (en) | 2017-03-31 | 2021-05-18 | Commvault Systems, Inc. | Dynamically allocating streams during restoration of data |
JP6939220B2 (en) * | 2017-08-03 | 2021-09-22 | 富士通株式会社 | Data analysis program, data analysis method, and data analysis device |
US10742735B2 (en) | 2017-12-12 | 2020-08-11 | Commvault Systems, Inc. | Enhanced network attached storage (NAS) services interfacing to cloud storage |
US10831591B2 (en) | 2018-01-11 | 2020-11-10 | Commvault Systems, Inc. | Remedial action based on maintaining process awareness in data storage management |
US10824751B1 (en) * | 2018-04-25 | 2020-11-03 | Bank Of America Corporation | Zoned data storage and control security system |
US10929556B1 (en) | 2018-04-25 | 2021-02-23 | Bank Of America Corporation | Discrete data masking security system |
US11301421B2 (en) | 2018-05-25 | 2022-04-12 | Microsoft Technology Licensing, Llc | Scalable multi-tier storage structures and techniques for accessing entries therein |
US20200192572A1 (en) | 2018-12-14 | 2020-06-18 | Commvault Systems, Inc. | Disk usage growth prediction system |
US11204892B2 (en) | 2019-03-21 | 2021-12-21 | Microsoft Technology Licensing, Llc | Techniques for snapshotting scalable multitier storage structures |
CN113297432B (en) * | 2021-06-01 | 2023-11-07 | 阿里巴巴新加坡控股有限公司 | Method, processor-readable medium, and system for partition splitting and merging |
US11593223B1 (en) | 2021-09-02 | 2023-02-28 | Commvault Systems, Inc. | Using resource pool administrative entities in a data storage management system to provide shared infrastructure to tenants |
CN117155759B (en) * | 2023-10-27 | 2024-01-05 | 腾讯科技(深圳)有限公司 | Data processing method, device, computer equipment and storage medium |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5276867A (en) * | 1989-12-19 | 1994-01-04 | Epoch Systems, Inc. | Digital data storage system with improved data migration |
US5737549A (en) * | 1994-01-31 | 1998-04-07 | Ecole Polytechnique Federale De Lausanne | Method and apparatus for a parallel data storage and processing server |
US6047356A (en) * | 1994-04-18 | 2000-04-04 | Sonic Solutions | Method of dynamically allocating network node memory's partitions for caching distributed files |
US5564037A (en) * | 1995-03-29 | 1996-10-08 | Cheyenne Software International Sales Corp. | Real time data migration system and method employing sparse files |
US6275867B1 (en) * | 1995-09-12 | 2001-08-14 | International Business Machines Corporation | Operation-partitioned off-loading of operations in a distributed environment |
US5727197A (en) * | 1995-11-01 | 1998-03-10 | Filetek, Inc. | Method and apparatus for segmenting a database |
US5742818A (en) * | 1995-12-15 | 1998-04-21 | Microsoft Corporation | Method and system of converting data from a source file system to a target file system |
US5897661A (en) * | 1997-02-25 | 1999-04-27 | International Business Machines Corporation | Logical volume manager and method having enhanced update capability with dynamic allocation of storage and minimal storage of metadata information |
CA2201278C (en) * | 1997-03-27 | 2001-02-20 | Ibm Canada Limited-Ibm Canada Limitee | Hierarchical metadata store for an integrated development environment |
US6021508A (en) * | 1997-07-11 | 2000-02-01 | International Business Machines Corporation | Parallel file system and method for independent metadata loggin |
US6032216A (en) * | 1997-07-11 | 2000-02-29 | International Business Machines Corporation | Parallel file system with method using tokens for locking modes |
US6128621A (en) * | 1997-10-31 | 2000-10-03 | Oracle Corporation | Apparatus and method for pickling data |
US6240428B1 (en) * | 1997-10-31 | 2001-05-29 | Oracle Corporation | Import/export and repartitioning of partitioned objects |
US6681227B1 (en) * | 1997-11-19 | 2004-01-20 | Ns Solutions Corporation | Database system and a method of data retrieval from the system |
US6260040B1 (en) * | 1998-01-05 | 2001-07-10 | International Business Machines Corporation | Shared file system for digital content |
US6023579A (en) * | 1998-04-16 | 2000-02-08 | Unisys Corp. | Computer-implemented method for generating distributed object interfaces from metadata |
US6279011B1 (en) * | 1998-06-19 | 2001-08-21 | Network Appliance, Inc. | Backup and restore for heterogeneous file server environment |
US6973455B1 (en) * | 1999-03-03 | 2005-12-06 | Emc Corporation | File server system providing direct data sharing between clients with a server acting as an arbiter and coordinator |
US6405198B1 (en) * | 1998-09-04 | 2002-06-11 | International Business Machines Corporation | Complex data query support in a partitioned database system |
US6240416B1 (en) * | 1998-09-11 | 2001-05-29 | Ambeo, Inc. | Distributed metadata system and method |
US6212515B1 (en) * | 1998-11-03 | 2001-04-03 | Platinum Technology, Inc. | Method and apparatus for populating sparse matrix entries from corresponding data |
US6295538B1 (en) * | 1998-12-03 | 2001-09-25 | International Business Machines Corporation | Method and apparatus for creating metadata streams with embedded device information |
US6339793B1 (en) * | 1999-04-06 | 2002-01-15 | International Business Machines Corporation | Read/write data sharing of DASD data, including byte file system data, in a cluster of multiple data processing systems |
US6714952B2 (en) * | 1999-11-10 | 2004-03-30 | Emc Corporation | Method for backup and restore of a multi-lingual network file server |
IL150079A0 (en) * | 1999-12-07 | 2002-12-01 | Data Foundation Inc | Scalable storage architecture |
US7506034B2 (en) * | 2000-03-03 | 2009-03-17 | Intel Corporation | Methods and apparatus for off loading content servers through direct file transfer from a storage center to an end-user |
JP4640723B2 (en) * | 2000-04-08 | 2011-03-02 | オラクル・アメリカ・インコーポレイテッド | Stream a single media track to multiple clients |
US7987217B2 (en) * | 2000-05-12 | 2011-07-26 | Oracle International Corporation | Transaction-aware caching for document metadata |
US6665675B1 (en) * | 2000-09-07 | 2003-12-16 | Omneon Video Networks | Shared file system having a token-ring style protocol for managing meta-data |
2002
- 2002-06-12: JP application JP2004513966A, published as JP2005530242A (status: Pending)
- 2002-06-12: CA application CA002489324A, published as CA2489324A1 (status: Abandoned)
- 2002-06-12: EP application EP02739887A, published as EP1532543A4 (status: Withdrawn)
- 2002-06-12: AU application AU2002312508A, published as AU2002312508B2 (status: Ceased)
- 2002-06-12: WO application PCT/US2002/018940, published as WO2003107219A1 (status: Application Filing)
2003
- 2003-05-06: US application US10/431,168, published as US7146377B2 (status: Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
EP1532543A1 (en) | 2005-05-25 |
EP1532543A4 (en) | 2008-04-16 |
AU2002312508A1 (en) | 2003-12-31 |
US7146377B2 (en) | 2006-12-05 |
WO2003107219A1 (en) | 2003-12-24 |
AU2002312508B2 (en) | 2008-01-17 |
JP2005530242A (en) | 2005-10-06 |
US20030195895A1 (en) | 2003-10-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
AU2002312508B2 (en) | Storage system having partitioned migratable metadata | |
EP0976065B1 (en) | Dynamic directory service | |
JP5607059B2 (en) | Partition management in partitioned, scalable and highly available structured storage | |
US7562087B2 (en) | Method and system for processing directory operations | |
US7558802B2 (en) | Information retrieving system | |
US8103639B1 (en) | File system consistency checking in a distributed segmented file system | |
US8214404B2 (en) | Media aware distributed data layout | |
US20080256090A1 (en) | Dynamic directory service | |
EP0474395A2 (en) | Data storage hierarchy with shared storage level | |
CN101354726A (en) | Method for managing memory metadata of cluster file system | |
EP2724268A2 (en) | System and method for implementing a scalable data storage service | |
PH12014501762B1 (en) | Method and apparatus for file storage | |
US11755557B2 (en) | Flat object storage namespace in an object storage system | |
CN112148680B (en) | File system metadata management method based on distributed graph database | |
JP2001142752A (en) | Database managing method | |
CN111708894A (en) | Knowledge graph creating method | |
WO2017156855A1 (en) | Database systems with re-ordered replicas and methods of accessing and backing up databases | |
CN114415971B (en) | Data processing method and device | |
US7822736B2 (en) | Method and system for managing an index arrangement for a directory | |
US20230376451A1 (en) | Client support of multiple fingerprint formats for data file segments | |
US20230376461A1 (en) | Supporting multiple fingerprint formats for data file segment | |
Pollack et al. | Index Storage Fundamentals | |
CN117687970A (en) | Metadata retrieval method and device, electronic equipment and storage medium | |
CN116860867A (en) | HBase data processing method and device | |
WO2023138788A1 (en) | Method of backing up file-system onto object storage system and data management module | |
Legal Events
Code | Title | Description |
---|---|---|
EEER | Examination request | |
FZDE | Discontinued | |