File system level compression using holes

Share
Save
Discuss
Claim

Share On Facebook Share On Twitter Share By Email
Save Item
Save Item
Add to
my lists

Filing Information

  • Patent Number: US5774715
  • Application Number: US8623907
  • Filing date: 03/27/1996
  • Issue date: 06/30/1998
  • Predicted expiration date: 03/27/2016
Explore Your Innovation Network™ for an introduction to:

Innovation Network Your First Name:
Last Name:
 
Already a member? Sign In
  • U.S. Classifications: 395/612  · 395/611  ·
  • International Classifications: G06F 1730 ·
  • International Classifications: 395611;612;440 ·
  • View document at: (opens new window):
    USPTO  ·  PAIR  ·  esp@cenet  ·  Patent Family
    * Related patent documents may or may not exist on these sites
28 Claims, No Drawings


Abstract

A method, apparatus, and computer-usable medium for compressing data in a file system utilizing the concept of "holes". A mapping table in a file system maps the logical blocks of a file to actual physical blocks on disk where the data is stored. Blocks may be arranged in units of a cluster, and the file may be compressed cluster-by-cluster. Holes are used within a cluster to indicate not only that a cluster has been compressed, but also the compression algorithm used. Different clusters within a file may be compressed with different compression algorithms. A unit of data is compressed, with the result that the file occupies fewer physical blocks than it has logical blocks. The mapping table is updated to indicate that for a given unit of data compressed, fewer physical blocks are needed. Certain logical blocks belonging to this unit of data are not mapped to physical blocks but are mapped to a hole. A hole indicates that the unit of data was compressed, and may also indicate the particular compression algorithm used to compress the unit of data. If a unit of data begins or ends within the middle of a cluster, to avoid overwriting the data not to be changed the whole cluster must first be read from disk. If a hole indicates the cluster had been compressed, the data must be expanded first. The cluster is read into a buffer and the portion to be changed is overwritten. The cluster is compressed and written back to disk. Those clusters within which the unit of data neither begins nor ends may be written to directly.

References Cited

U.S. Patent Documents

Document NumberAssigneesInventorsIssue/Pub Date
US5155484 Salient Software, Inc. Chambers, IV Oct 1992
US5237675 Maxtor Corporation Hannon, Jr. Aug 1993
US5481701 Salient Software, Inc. Chambers, IV Jan 1996
US5551020 Flextech Systems, Inc. Flax et al. Aug 1996
US5652857 Fujitsu Limited Shimoi et al. Jul 1997

Other Publications

Michael Burrows, et al., On-Line Data Compression in a Log-structured File System, 1992, DEC Systems Research Center, pp. 2-9.

Referenced By

Document NumberAssigneeInventorsIssue/Pub Date
US5960446International Business Machines CorporationFrank B. Schmuck et al.Sep 1999
US6965897AT&T Corp.Zewei ChenNov 2005
US7490103NetApp, Inc.Roger Keith Stager et al.Feb 2009
US6883063EMC CorporationSteven M. Blumenau et al.Apr 2005
US7127556EMC CorporationSteven M. Blumenau et al.Oct 2006
US6697795Hewlett-Packard Development Company, L.P.David Marshall HolcombFeb 2004
US7072917NeoPath Networks, Inc.Thomas K. Wong et al.Jul 2006
US7454529Netapp, Inc.Roger Keith Stager et al.Nov 2008
US7467282Network Appliance, Inc.Sriram Rao et al.Dec 2008
US7346664Neopath Networks, Inc.Thomas K. Wong et al.Mar 2008
US7559088NETAPP, Inc.Gavin David Cohen et al.Jul 2009
US6654772EMC CorporationPreston F. Crow et al.Nov 2003
US7155445Cingular Wireless II, LLCBrian Kling et al.Dec 2006
US7426617Network Appliance, Inc.Roger Keith Stager et al.Sep 2008
US6329985EMC CorporationPhilip E. Tamer et al.Dec 2001
US7512862Network Appliance, Inc.James A. TaylorMar 2009
US7533108Netapp, Inc.Mehul S. Shah et al.May 2009
US7487009Netapp, Inc.Don Alvin Trimmer et al.Feb 2009
US6938059EMC CorporationPhilip E. Tamer et al.Aug 2005
US7424482Storwize Inc.Nadav Kedem et al.Sep 2008
US6393540EMC CorporationSteven M. Blumenau et al.May 2002
US7437387Netapp, Inc.Gavin David Cohen et al.Oct 2008
US7383294EMC CorporationPhilip E. Tamer et al.Jun 2008
US6282602EMC CorporationSteven M. Blumenau et al.Aug 2001
US7200603Network Appliance, Inc.David Hitz et al.Apr 2007
US7325159Network Appliance, Inc.Roger Keith Stager et al.Jan 2008
US7437492Netapp, IncRoger Stager et al.Oct 2008
US7315965Network Appliance, Inc.Roger Keith Stager et al.Jan 2008
US6353834Mitsubishi Electric Research Laboratories, Inc.David W. H. Wong et al.Mar 2002
US6542909EMC CorporationPhilip E. Tamer et al.Apr 2003
US7406488NetAppRoger Keith Stager et al.Jul 2008
US7526620Netapp, Inc.William P. McGovernApr 2009
US7558839Netapp, Inc.William P. McGovernJul 2009
US5890169Sun Microsystems, Inc.Thomas K. Wong et al.Mar 1999
US7401198NetAppCraig Anthony Johnston et al.Jul 2008
US7567993Netapp, Inc.Don Alvin Trimmer et al.Jul 2009
US7581118Netapp, Inc.William P. McGovernAug 2009
US7587422Neopath Networks, Inc.Thomas K. Wong et al.Sep 2009
US7599972QNX Software Systems GmbH & Co. KGDan Dodge et al.Oct 2009

Patent Family


Read Patent

Read patent

View Drawings


Independent Claims | See all claims (28)

  1. 1. A computer-implemented method of storing data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the method comprising the steps of:requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks;compressing the segment of data into compressed data such that the compressed data occupies fewer blocks of memory than the segment of data;writing the compressed data to physical memory, the compressed data being written to at least one physical memory block; andupdating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block.
  2. 6. A computer-implemented method of storing data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the file system being arranged to process clusters, each cluster corresponding to a plurality of logical memory blocks, the method comprising the steps of:requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks, wherein the selected logical memory blocks may span a plurality of clusters; anddetermining whether the segment of data begins at an intermediate location within a first one of the clusters, wherein when it is determined that the segment of data begins at an intermediate location, the method further comprises the step of performing a partial cluster write operation utilizing the first cluster as a current cluster, the partial cluster write operation including the sub-steps of,a) reading data associated with the current cluster from physical memory into a read buffer, such current cluster data being stored in the read buffer in an expanded format,b) copying a portion of the segment of data associated with the current cluster into the read buffer after the reading step has been completed,c) compressing the data stored in the read buffer after the copying step,d) writing the compressed data to physical memory, ande) updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks that is associated with the current cluster and each of the selected logical memory blocks associated with the current cluster that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block.
  3. 13. A computer-implemented method of retrieving data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the method comprising the steps of:requesting that a segment of data be read from physical memory, the segment of data being associated with selected logical memory blocks; anddetermining by reference to the mapping table whether any of the selected logical memory blocks are mapped to a hole identifier that indicates that the segment of data is stored in compressed form in the physical memory, wherein when it is determined that the segment of data is stored in compressed form, performing the sub-steps of,a) reading from physical memory into a compression buffer all physical memory blocks that are associated with the selected logical memory blocks of the segment of data, andb) expanding the physical memory blocks stored in the compression buffer so that the segment of data is then stored in expanded form in the compression buffer.
  4. 19. A computer apparatus for use in compressing and expanding a data segment in a file system, the data segment being associated with selected logical memory blocks, the computer apparatus comprising:a central processing unit;random access memory in communication with the central processing unit;a mass storage device in communication with the central processing unit;a mapping table arranged to map the selected logical memory blocks of the data segment to associated physical memory blocks of the mass storage device, such that when the data segment is stored in a compressed form on the mass storage device the compressed data segment occupies fewer physical memory blocks than the expanded data segment, and such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block.
  5. 23. A computer program product comprising a computer-usable medium having computer-readable code embodied thereon for compressing and expanding data in a file system of a computer, the file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the computer program product comprising the following computer-readable program code for effecting actions in the computer:program code for requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks;program code for compressing the segment of data into compressed data such that the compressed data occupies fewer blocks of memory than the segment of data;program code for writing the compressed data to physical memory, the compressed data being written to at least one physical memory block; andprogram code for updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block.