New Project: hashfs

I've started a new Python project: hashfs

HashFS is

A content-addressable file management system for Python

The major features of HashFS are

  • Files are stored once and never duplicated.
  • Uses an efficient folder structure optimized for a large number of files. File paths are based on the content hash and are nested based on the first n number of characters.
  • Can save files from local file paths or readable objects (open file handlers, IO buffers, etc).
  • Able to repair the root folder by reindexing all files. Useful if the hashing algorithm or folder structure options change or to initialize existing files.
  • Supports any hashing algorithm available via
  • Python 2.7+/3.3+ compatible.

Current Release

As of this writing, the latest version of hashfs is v0.4.0.

Sample Usage

HashFS supports basic file storage, retrieval, and removal as well as some more advanced features like file repair.

Storing Content

Add content to the folder using either readable objects (e.g. StringIO) or file paths (e.g. 'a/path/to/some/file').

from io import StringIO

some_content = StringIO('some content')

address = fs.put(some_content)

# Or if you'd like to save the file with an extension...
address = fs.put(some_content, '.txt')

# The id of the file (i.e. the hexdigest of its contents).

# The absolute path where the file was saved.

# The path relative to fs.root.

Retrieving Content

Get a BufferedReader handler for an existing file by address ID or path.

fileio =

# Or using the full path...
fileio =

# Or using a path relative to fs.root
fileio =

Removing Content

Delete a file by address ID or path.


For more details, please see the full documentation at and follow development at


Comments powered by Disqus