Team course project for the course Operating Systems & Networks at IIIT Hyderabad. A production-grade distributed file system in C featuring hierarchical storage, multi-threaded concurrency, ACL-based security, version control with checkpoints, and fault-tolerant replication across multiple storage servers.


Distributed Network File System

A production-grade distributed file system in C featuring hierarchical storage, multi-threaded concurrency, ACL-based security, version control with checkpoints, and fault-tolerant replication across multiple storage servers.

🎯 Overview

This distributed network file system implements a 3-tier client-server architecture with metadata-data separation, providing enterprise-grade features including hierarchical folder structures, ACL-based permissions, version control through checkpoints, and fault-tolerant replication.

System Components

  1. Name Server (Metadata Server) - Central coordinator managing file metadata, access control, and storage server registry
  2. Storage Servers - Distributed nodes handling physical file storage with fine-grained locking
  3. Clients - User interface supporting 20+ file operations with real-time streaming

Technical Highlights

  • 13,156 lines of production-grade C code
  • Multi-threaded with readers-writers synchronization
  • Average-case O(1) file lookups via hash tables (10,007 buckets)
  • Sentence-level locking for concurrent writes
  • TCP socket-based network communication
  • HTTP-inspired status codes (e.g., 200, 403, 404, 409, 500)

✨ Key Features

Core Functionality

| Feature | Description |
|---|---|
| Distributed Storage | Multiple storage servers with intelligent routing |
| Hierarchical Folders | Tree-based directory structure with O(1) lookup |
| Access Control Lists | Per-user permissions (READ/WRITE) with request workflow |
| Version Control | Checkpoint-based snapshots (50 per file) |
| Concurrent Access | Readers-writers locks + sentence-level locking |
| Real-time Streaming | Word-by-word file streaming with TCP_NODELAY |
| Fault Tolerance | Health monitoring + automatic replication |
| Undo Operations | Per-operation rollback capability |
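
The streaming feature relies on `TCP_NODELAY`, which disables Nagle's algorithm so that small writes go out immediately instead of being coalesced. The sketch below shows one way this could look. The function names (`enable_nodelay`, `stream_words`, `write_all`) and the one-word-per-line framing are assumptions for illustration, not taken from the repository.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Disable Nagle's algorithm on a TCP socket. Returns 0 on success, -1 on error. */
int enable_nodelay(int sockfd) {
    int on = 1;
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}

/* Write the whole buffer, retrying on partial writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send text one whitespace-delimited word at a time, each followed by '\n'
 * (hypothetical framing). Returns the number of words sent, or -1 on error. */
int stream_words(int fd, const char *text) {
    int count = 0;
    const char *p = text;
    while (*p) {
        while (*p == ' ' || *p == '\t' || *p == '\n') p++;   /* skip separators */
        const char *start = p;
        while (*p && *p != ' ' && *p != '\t' && *p != '\n') p++;
        if (p == start) break;                               /* only trailing whitespace left */
        if (write_all(fd, start, (size_t)(p - start)) < 0) return -1;
        if (write_all(fd, "\n", 1) < 0) return -1;
        count++;
    }
    return count;
}
```

A server would typically call `enable_nodelay` once after `accept()` and then `stream_words` on the same descriptor; without the option, the kernel may batch several words into one segment and the client would see them arrive in bursts.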

Supported Operations

# File Operations
VIEW [-a] [-l]              # List files (all/detailed)
CREATE <filename>           # Create new file
READ <filename>             # Read file contents
WRITE <filename> <sent>     # Write to sentence
DELETE <filename>           # Delete file
INFO <filename>             # File metadata
STREAM <filename>           # Real-time streaming
UNDO <filename>             # Rollback last operation

# Folder Operations (BONUS 1)
CREATEFOLDER <foldername>   # Create directory
MOVE <file> <folder>        # Move file to folder
VIEWFOLDER <foldername>     # List folder contents

# Version Control (BONUS 2)
CHECKPOINT <file> <tag>     # Create snapshot
VIEWCHECKPOINT <file> <tag> # Preview checkpoint
REVERT <file> <tag>         # Restore checkpoint
LISTCHECKPOINTS <file>      # Show all tags

# Access Control (BONUS 3)
ADDACCESS <file> <user> R|W  # Grant permissions
REMACCESS <file> <user>      # Revoke permissions
REQACCESS <file> <owner> R|W # Request access
VIEWREQUESTS                 # List pending access requests
APPROVE <request_id>         # Approve request
DENY <request_id>            # Deny request
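
Every command above is a command word followed by whitespace-separated arguments, so a client or server can validate input with a small lookup table before doing any work. The following is a minimal sketch under that assumption. `CommandSpec`, `tokenize`, and `validate_command` are hypothetical names, and the minimum argument counts are read off the usage strings in this README (including `VIEWREQUESTS` from the example session), not from the project's parser.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    int min_args;            /* required arguments after the command word */
} CommandSpec;

static const CommandSpec COMMANDS[] = {
    {"VIEW", 0}, {"CREATE", 1}, {"READ", 1}, {"WRITE", 2}, {"DELETE", 1},
    {"INFO", 1}, {"STREAM", 1}, {"UNDO", 1},
    {"CREATEFOLDER", 1}, {"MOVE", 2}, {"VIEWFOLDER", 1},
    {"CHECKPOINT", 2}, {"VIEWCHECKPOINT", 2}, {"REVERT", 2}, {"LISTCHECKPOINTS", 1},
    {"ADDACCESS", 3}, {"REMACCESS", 2}, {"REQACCESS", 3},
    {"VIEWREQUESTS", 0}, {"APPROVE", 1}, {"DENY", 1},
};

/* Split line in place on whitespace. Returns the number of tokens stored. */
int tokenize(char *line, char *argv[], int max_args) {
    int argc = 0;
    char *p = line;
    while (argc < max_args) {
        p += strspn(p, " \t\r\n");       /* skip leading separators */
        if (*p == '\0') break;
        argv[argc++] = p;
        p += strcspn(p, " \t\r\n");      /* advance to end of token */
        if (*p == '\0') break;
        *p++ = '\0';                     /* terminate token */
    }
    return argc;
}

/* Returns the index into COMMANDS, -1 for an unknown command,
 * or -2 when the command is known but has too few arguments. */
int validate_command(int argc, char *argv[]) {
    if (argc == 0) return -1;
    for (size_t i = 0; i < sizeof COMMANDS / sizeof COMMANDS[0]; i++) {
        if (strcmp(argv[0], COMMANDS[i].name) == 0)
            return (argc - 1 >= COMMANDS[i].min_args) ? (int)i : -2;
    }
    return -1;
}
```

A caller could map `-1` and `-2` to a `400 BAD REQUEST` response before the request ever reaches a handler.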

πŸ—οΈ Architecture

System Design

┌─────────────┐
│   Clients   │  (Multiple concurrent connections)
└──────┬──────┘
       │ TCP Sockets
       ▼
┌─────────────────────────────────────────┐
│         Name Server                     │
│  ┌───────────────────────────────────┐  │
│  │ File Hash Table (10,007 buckets)  │  │
│  │ Folder Hash Table (1,009 buckets) │  │
│  │ Storage Server Registry           │  │
│  │ Client Registry                   │  │
│  │ Access Control Manager            │  │
│  │ Checkpoint Manager                │  │
│  └───────────────────────────────────┘  │
└──────────┬──────────────────────────────┘
           │ Request Routing
           ▼
┌──────────────────────────────────────────┐
│     Storage Servers (Distributed)        │
│  ┌────────────────────────────────────┐  │
│  │ ./ss_storage/      (active files)  │  │
│  │ ./ss_metadata/     (file metadata) │  │
│  │ ./ss_undo/         (undo history)  │  │
│  │ ./ss_checkpoints/  (versions)      │  │
│  └────────────────────────────────────┘  │
└──────────────────────────────────────────┘

Concurrency Model

  • Name Server: Thread-per-connection + worker thread pool
  • Storage Servers: Thread-per-client with sentence-level locks
  • Synchronization: Mutexes, semaphores, readers-writers locks
  • Lock Hierarchy: Hash table → File entry → Sentence → ACL

💻 System Requirements

  • OS: Linux/Unix (POSIX-compliant)
  • Compiler: GCC 7.0+ with pthread support
  • Memory: 512MB+ RAM
  • Network: TCP/IP stack
  • Dependencies:
    • pthread library
    • POSIX sockets
    • Standard C library

🚀 Installation

1. Clone Repository

git clone https://github.com/Astr0Lynx/Distributed-Network-File-System.git
cd Distributed-Network-File-System

2. Build Components

# Build Name Server (from the repository root)
make

# Build Storage Server (subshell, so the working directory is unchanged)
(cd storage_server && make)

# Build Client (subshell, so the working directory is unchanged)
(cd client && make)

📖 Usage

Starting the System

Step 1: Start Name Server

./ns_server
# Listens on port 8080

Step 2: Start Storage Server(s)

cd storage_server
./ss_server
# Auto-registers with Name Server
# Can start multiple instances for distribution

Step 3: Start Client

cd client
./client <username>
# Connects to Name Server
# Enter commands interactively

Example Session

# Client 1 (Alice)
$ ./client alice
> CREATE myfile.txt
SUCCESS: File created

> WRITE myfile.txt 1
Enter text: Hello, distributed world!
SUCCESS: Written to sentence 1

> CHECKPOINT myfile.txt v1.0
SUCCESS: Checkpoint created with tag 'v1.0'

> ADDACCESS myfile.txt bob R
SUCCESS: Read permission granted to bob

# Client 2 (Bob)
$ ./client bob
> READ myfile.txt
Hello, distributed world!

> WRITE myfile.txt 2
ERROR 403: FORBIDDEN - No write permission

> REQACCESS myfile.txt alice W
SUCCESS: Access request sent to alice

# Client 1 (Alice)
> VIEWREQUESTS
Request 1: bob requests WRITE access to myfile.txt

> APPROVE 1
SUCCESS: Request approved - bob now has WRITE access

📊 Technical Specifications

Data Structures

| Component | Structure | Size | Purpose |
|---|---|---|---|
| File Hash Table | Chained hash table | 10,007 buckets | O(1) file lookup |
| Folder Hash Table | Chained hash table | 1,009 buckets | O(1) folder lookup |
| SS Registry | Array | 100 max | Storage server pool |
| Client Registry | Array | 1,000 max | Active connections |
| Checkpoint List | Linked list | 50 per file | Version history |
| ACL | Linked list | Unlimited | Per-file permissions |
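
As an illustration of the chained hash table used for file lookup, the sketch below hashes a file name into one of 10,007 buckets and walks the chain on collision. Only the bucket count comes from this README. The djb2 hash, the `FileEntry` fields, and the function names are assumptions, and the repository may use a different hash function and a richer entry struct.

```c
#include <stdlib.h>
#include <string.h>

#define HASH_TABLE_SIZE 10007   /* prime bucket count stated in this README */

/* Hypothetical file entry; real metadata would carry more fields. */
typedef struct FileEntry {
    char *name;
    int ss_id;                  /* illustrative: owning storage server */
    struct FileEntry *next;     /* chain of entries sharing a bucket */
} FileEntry;

typedef struct {
    FileEntry *buckets[HASH_TABLE_SIZE];
} FileTable;

/* djb2 string hash reduced to a bucket index (assumed, not the project's). */
static unsigned long hash_name(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % HASH_TABLE_SIZE;
}

/* Returns the entry for name, or NULL. O(1) on average, O(chain) worst case. */
FileEntry *file_lookup(FileTable *t, const char *name) {
    for (FileEntry *e = t->buckets[hash_name(name)]; e; e = e->next)
        if (strcmp(e->name, name) == 0) return e;
    return NULL;
}

/* Inserts a new entry at the head of its chain.
 * Returns 0 on success, -1 if the name exists or memory runs out. */
int file_insert(FileTable *t, const char *name, int ss_id) {
    if (file_lookup(t, name)) return -1;
    FileEntry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->name = malloc(strlen(name) + 1);
    if (!e->name) { free(e); return -1; }
    strcpy(e->name, name);
    e->ss_id = ss_id;
    unsigned long b = hash_name(name);
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return 0;
}
```

A prime bucket count helps spread keys whose hashes share common factors, which keeps chains short and is what makes the average-case O(1) figure realistic.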

Network Protocol

Message Format: <COMMAND> <args...>
Response Format: <STATUS_CODE> <message>

Status Codes:
200 OK              - Success
201 CREATED         - Resource created
400 BAD REQUEST     - Invalid command
403 FORBIDDEN       - Permission denied
404 NOT FOUND       - Resource not found
409 CONFLICT        - Resource exists/limit reached
500 INTERNAL ERROR  - Server failure
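
A receiver of the `<STATUS_CODE> <message>` response format can split the line with `strtol` and a bounded copy. The sketch below is one possible parser. The name `parse_response`, the accepted code range 100 to 599, and the tolerance for a trailing `\r\n` are assumptions rather than documented behavior of the project.

```c
#include <stdlib.h>
#include <string.h>

/* Parse "<STATUS_CODE> <message>" into *code and msg (always NUL-terminated,
 * truncated to msg_cap - 1 characters). Returns 0 on success, -1 on a
 * malformed line. */
int parse_response(const char *line, int *code, char *msg, size_t msg_cap) {
    if (msg_cap == 0) return -1;

    char *end;
    long v = strtol(line, &end, 10);
    if (end == line || v < 100 || v > 599) return -1;   /* no digits or out of range */
    if (*end != ' ' && *end != '\0' && *end != '\r' && *end != '\n')
        return -1;                                      /* e.g. "200OK" */

    while (*end == ' ') end++;                          /* skip the separator */
    size_t n = strcspn(end, "\r\n");                    /* message ends at line break */
    if (n >= msg_cap) n = msg_cap - 1;
    memcpy(msg, end, n);
    msg[n] = '\0';
    *code = (int)v;
    return 0;
}
```

For example, the line `404 NOT FOUND` yields code 404 with the message `NOT FOUND`, while a line with no leading number is rejected.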

Performance

  • File Lookup: O(1) average case
  • Folder Lookup: O(1) average case
  • ACL Check: O(n) where n = users with access
  • Checkpoint: O(m) where m ≤ 50
  • Concurrent Operations: Multiple readers OR single writer per sentence

🎁 Bonus Features

BONUS 1: Hierarchical Folder Structure ✅

  • Implementation: Tree-based with parent-child pointers
  • Hash Table: 1,009 buckets for O(1) lookup
  • Commands: CREATEFOLDER, MOVE, VIEWFOLDER
  • Security: Path traversal prevention (blocks ..)
  • Details: See BONUS.md
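
The path traversal check can be pictured as a scan over the `/`-separated components of a path that rejects any component equal to `..`. This is a minimal sketch: the name `path_is_safe` is hypothetical, and also rejecting empty and absolute paths is an extra assumption beyond what this README states.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the path may be used, 0 if it is empty, absolute,
 * or contains a ".." component. Names that merely contain dots,
 * such as "my..file.txt", are allowed. */
int path_is_safe(const char *path) {
    if (path == NULL || *path == '\0' || *path == '/') return 0;
    const char *p = path;
    while (*p) {
        size_t len = strcspn(p, "/");                 /* length of this component */
        if (len == 2 && p[0] == '.' && p[1] == '.') return 0;
        p += len;
        if (*p == '/') p++;                           /* step over the separator */
    }
    return 1;
}
```

Checking whole components rather than searching for the substring `..` avoids rejecting legitimate file names while still blocking inputs such as `docs/../../etc/passwd`.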

BONUS 2: Checkpoint System (Version Control) ✅

  • Implementation: Per-file linked list with 50 max checkpoints
  • Storage: Physical snapshots in ./ss_checkpoints/<file>/<tag>/
  • Commands: CHECKPOINT, REVERT, VIEWCHECKPOINT, LISTCHECKPOINTS
  • Metadata: Creator, timestamp, file size tracked
  • Details: See BONUS.md
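
A per-file linked list with a 50-entry cap might look like the sketch below. It tracks the metadata listed above (creator, timestamp, file size) and returns the README's status codes: `201` on creation and `409` when the tag already exists or the limit is reached. The struct fields, the `CP_NAME_LEN` size, and the function names are assumptions, and copying the file into `./ss_checkpoints/` is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_CHECKPOINTS_PER_FILE 50   /* limit stated in this README */
#define CP_NAME_LEN 64                /* illustrative field size */

typedef struct Checkpoint {
    char tag[CP_NAME_LEN];
    char creator[CP_NAME_LEN];
    time_t created_at;
    long size_bytes;
    struct Checkpoint *next;
} Checkpoint;

typedef struct {
    Checkpoint *head;
    int count;
} CheckpointList;

/* Linear scan for a tag: O(m) with m <= 50. Returns NULL if absent. */
Checkpoint *checkpoint_find(const CheckpointList *list, const char *tag) {
    for (Checkpoint *c = list->head; c; c = c->next)
        if (strcmp(c->tag, tag) == 0) return c;
    return NULL;
}

/* Record a new checkpoint. Returns 201 on success, 400 for an empty or
 * over-long tag, 409 if the tag exists or the limit is reached, 500 on
 * allocation failure. */
int checkpoint_add(CheckpointList *list, const char *tag,
                   const char *creator, long size_bytes) {
    if (tag[0] == '\0' || strlen(tag) >= CP_NAME_LEN) return 400;
    if (list->count >= MAX_CHECKPOINTS_PER_FILE) return 409;
    if (checkpoint_find(list, tag)) return 409;

    Checkpoint *c = calloc(1, sizeof *c);
    if (!c) return 500;
    snprintf(c->tag, sizeof c->tag, "%s", tag);
    snprintf(c->creator, sizeof c->creator, "%s", creator);
    c->created_at = time(NULL);
    c->size_bytes = size_bytes;
    c->next = list->head;             /* newest checkpoint first */
    list->head = c;
    list->count++;
    return 201;
}
```

Keeping a `count` field beside the list head makes the limit check O(1), so only the duplicate-tag scan walks the list.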

BONUS 3: Access Request System ✅

  • Implementation: Per-user request queues with unique IDs
  • Workflow: Request → Owner approval → Automatic ACL update
  • Permissions: READ-only or WRITE (implies READ)
  • Commands: REQACCESS, VIEWREQUESTS, APPROVE, DENY
  • Details: See BONUS.md
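
The request, approval, and ACL update workflow can be sketched with two linked lists: one for granted permissions and one for pending requests. All type and function names here are hypothetical. For brevity the sketch keeps one pending queue per file rather than per user, and it assumes the owner always has full access; neither detail is confirmed by this README.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 64

typedef enum { PERM_READ = 1, PERM_WRITE = 2 } Perm;

typedef struct AclEntry {
    char user[NAME_LEN];
    Perm perm;
    struct AclEntry *next;
} AclEntry;

typedef struct Request {
    int id;
    char user[NAME_LEN];
    Perm wanted;
    struct Request *next;
} Request;

typedef struct {
    char owner[NAME_LEN];
    AclEntry *acl;
    Request *pending;
    int next_request_id;
} FileAccess;

/* O(n) scan over the ACL. WRITE implies READ. Returns 1 if allowed, else 0. */
int acl_allows(const FileAccess *f, const char *user, Perm needed) {
    if (strcmp(f->owner, user) == 0) return 1;       /* assumption: owner has full access */
    for (const AclEntry *e = f->acl; e; e = e->next)
        if (strcmp(e->user, user) == 0)
            return e->perm == PERM_WRITE || needed == PERM_READ;
    return 0;
}

/* Grant or update a permission. Returns 0 on success, -1 on allocation failure. */
int acl_grant(FileAccess *f, const char *user, Perm perm) {
    for (AclEntry *e = f->acl; e; e = e->next)
        if (strcmp(e->user, user) == 0) { e->perm = perm; return 0; }
    AclEntry *e = calloc(1, sizeof *e);
    if (!e) return -1;
    snprintf(e->user, sizeof e->user, "%s", user);
    e->perm = perm;
    e->next = f->acl;
    f->acl = e;
    return 0;
}

/* Queue a request and return its unique ID, or -1 on allocation failure. */
int request_access(FileAccess *f, const char *user, Perm wanted) {
    Request *r = calloc(1, sizeof *r);
    if (!r) return -1;
    r->id = ++f->next_request_id;
    snprintf(r->user, sizeof r->user, "%s", user);
    r->wanted = wanted;
    r->next = f->pending;
    f->pending = r;
    return r->id;
}

/* Approve (approve != 0) or deny a request; approval updates the ACL.
 * Returns 0 if the request was found and handled, -1 otherwise. */
int resolve_request(FileAccess *f, int id, int approve) {
    for (Request **pp = &f->pending; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            Request *r = *pp;
            int rc = approve ? acl_grant(f, r->user, r->wanted) : 0;
            *pp = r->next;                           /* unlink the handled request */
            free(r);
            return rc;
        }
    }
    return -1;
}
```

In terms of the example session, `REQACCESS myfile.txt alice W` corresponds to `request_access`, and `APPROVE 1` corresponds to `resolve_request(f, 1, 1)`, after which bob's write is permitted.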

BONUS 4: Fault Tolerance ✅

  • Implementation: Backup server assignment on registration
  • Monitoring: Heartbeat tracking per storage server
  • Replication: Automatic sync on primary failure
  • Failover: Transparent client redirection
  • Details: See BONUS.md
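
Heartbeat tracking and failover can be sketched as a registry that stores the last heartbeat time per server, marks servers dead after a timeout, and routes requests to the backup when the primary is down. The 10-second timeout, the rule for choosing a backup (first other live server at registration), and all names are assumptions for illustration. Data synchronisation between primary and backup is not shown.

```c
#include <string.h>
#include <time.h>

#define MAX_STORAGE_SERVERS 100        /* limit stated in this README */
#define HEARTBEAT_TIMEOUT_SEC 10.0     /* assumed value, not from the project */

typedef struct {
    int registered;
    int alive;
    int backup_id;                     /* -1 when no backup is assigned */
    time_t last_heartbeat;
} ServerSlot;

typedef struct {
    ServerSlot slots[MAX_STORAGE_SERVERS];
} Registry;

void registry_init(Registry *r) { memset(r, 0, sizeof *r); }

/* Register a server and pick a backup among the other live servers.
 * Returns 0 on success, -1 for an invalid id. */
int registry_register(Registry *r, int id, time_t now) {
    if (id < 0 || id >= MAX_STORAGE_SERVERS) return -1;
    int backup = -1;
    for (int i = 0; i < MAX_STORAGE_SERVERS; i++) {
        if (i != id && r->slots[i].registered && r->slots[i].alive) {
            backup = i;
            break;
        }
    }
    r->slots[id].registered = 1;
    r->slots[id].alive = 1;
    r->slots[id].backup_id = backup;
    r->slots[id].last_heartbeat = now;
    return 0;
}

/* Record a heartbeat; a server that had been marked dead comes back alive. */
void registry_heartbeat(Registry *r, int id, time_t now) {
    if (id < 0 || id >= MAX_STORAGE_SERVERS || !r->slots[id].registered) return;
    r->slots[id].last_heartbeat = now;
    r->slots[id].alive = 1;
}

/* Mark servers whose heartbeat is too old as dead. Returns how many just failed. */
int registry_sweep(Registry *r, time_t now) {
    int failed = 0;
    for (int i = 0; i < MAX_STORAGE_SERVERS; i++) {
        ServerSlot *s = &r->slots[i];
        if (s->registered && s->alive &&
            difftime(now, s->last_heartbeat) > HEARTBEAT_TIMEOUT_SEC) {
            s->alive = 0;
            failed++;
        }
    }
    return failed;
}

/* Choose the server to contact: the primary if alive, else its live backup,
 * else -1. */
int registry_route(const Registry *r, int id) {
    if (id < 0 || id >= MAX_STORAGE_SERVERS || !r->slots[id].registered) return -1;
    if (r->slots[id].alive) return id;
    int b = r->slots[id].backup_id;
    return (b >= 0 && r->slots[b].alive) ? b : -1;
}
```

Passing `now` in as a parameter instead of calling `time(NULL)` inside each function keeps the logic easy to test with fixed timestamps.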

BONUS 5: Comprehensive Help System ✅

  • Implementation: Interactive documentation via HELP command
  • Coverage: All 20+ commands with syntax + examples
  • Format: ASCII art interface with color support
  • Details: See BONUS.md

πŸ“ Project Structure

Distributed-Network-File-System/
├── name_server/              # Name Server (metadata management)
│   ├── name_server.c         # Main server logic + connection handling
│   ├── nm_data_structures.c  # Hash tables, registries, ACLs
│   ├── nm_commands.c         # Command dispatcher
│   ├── nm_folder.c           # Folder operations (BONUS 1)
│   ├── nm_checkpoint.c       # Versioning (BONUS 2)
│   ├── nm_access_request.c   # Permission workflow (BONUS 3)
│   ├── nm_replication.c      # Fault tolerance (BONUS 4)
│   └── nm_*.h                # Headers
├── storage_server/           # Storage Server (data plane)
│   ├── storage_server.c      # Main server + registration
│   ├── client_handler.c      # Client request processing
│   ├── file_manager.c        # File I/O operations
│   ├── command_handlers.c    # Command implementations
│   ├── nm_communication.c    # Name Server protocol
│   └── *.h                   # Headers
├── client/                   # Client (user interface)
│   ├── main.c                # Entry point + main loop
│   ├── command_parser.c      # Command string parsing
│   ├── command_executor.c    # Command execution
│   ├── network_comm.c        # Socket communication
│   ├── display.c             # Output formatting
│   └── *.h                   # Headers
├── common_errors.h           # Shared error codes
├── common_logging.h          # Logging macros
├── Makefile                  # Build configuration
├── BONUS.md                  # Detailed feature documentation
├── LICENSE                   # MIT License
└── README.md                 # This file

📈 Performance Characteristics

Scalability Limits

#define MAX_STORAGE_SERVERS 100
#define MAX_CLIENTS 1000
#define MAX_FILES 10000
#define MAX_CHECKPOINTS_PER_FILE 50
#define HASH_TABLE_SIZE 10007  // Prime for better distribution
#define FOLDER_HASH_SIZE 1009  // Prime for folder lookups

Bottleneck Analysis

| Component | Bottleneck | Mitigation |
|---|---|---|
| Name Server | Single point of failure | Planned: Leader election |
| Hash Tables | Collision chains | Prime-sized buckets, good hash function |
| Storage Servers | Disk I/O | Async I/O planned |
| Network | TCP latency | Connection pooling |
| Locks | Writer starvation | Fair scheduling planned |

Observed Performance (Local Testing)

  • File Lookup: ~0.1ms (O(1) hash table)
  • Checkpoint Creation: ~50ms (depends on file size)
  • ACL Check: ~0.05ms (small ACL lists)
  • Concurrent Reads: Linear scaling up to CPU cores
  • Concurrent Writes: Sequential per sentence

📚 Documentation

  • BONUS.md - Detailed implementation of all 5 bonus features (2,221 lines)
  • Code Comments - Inline documentation following industry standards
  • Header Files - Function prototypes with parameter descriptions

👥 Authors

Guntesh Singh - @Astr0Lynx

Khushi Dhingra - @KhushiiiD

This project was developed collaboratively with integrated contributions across all components.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Course: Operating Systems & Networks (Monsoon 2025)
  • Institution: IIIT Hyderabad
  • Instructor: CS3-OSN course staff

Built with ❤️ using C, POSIX threads, and TCP sockets

⭐ Star this repo if you find it useful!
