Operating Systems

Distributed File Systems

In the tutorial, all solutions will be presented by students. Please be prepared for all questions, as the exercise will focus on discussion. Some questions require knowledge that was not presented in the lecture. Reading material is listed below; in particular, you should read the paper describing the Google File System (GFS).

Scalability in Distributed File Systems

The exercise aims at understanding how scalability can be achieved in distributed file systems and what challenges distributed storage poses for fault tolerance. The first part covers the Sun Network File System (NFS), a representative of "classic" distributed file systems. Its basic design was covered in the basic OS lecture and is also described in the books referenced at the bottom of this page. The second part discusses the much more scalable Google File System (GFS).

  1. NFS

    Imagine you use a shell script (download link: test.sh) that creates and deletes files and directories in an NFS-mounted file system. After each operation, the script checks the status of the file or directory on another computer that has NFS-mounted the same file system. You get the following screen output:

    cw183155@ganymed:~$ sh test.sh serv9
    localhost: creating /usr/users/sya/cw183155/tmp-filename and checking on serv9 for existence
    serv9: file exists
    localhost: writing data to /usr/users/sya/cw183155/tmp-filename and checking file size on serv9
    serv9: file still contains zero bytes
    serv9: file still contains zero bytes
    serv9: file still contains zero bytes
    serv9: file contains some data
    localhost: removing /usr/users/sya/cw183155/tmp-filename again and checking on serv9 for existence
    serv9: file still exists
    serv9: file still exists
    serv9: file still exists
    serv9: file is gone
    localhost: creating directory /usr/users/sya/cw183155/tmp-filename and checking on serv9 for existence
    serv9: directory exists
    localhost: removing directory /usr/users/sya/cw183155/tmp-filename and checking on serv9 for existence
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory still exists
    serv9: directory is gone
    cw183155@ganymed:~$

    Explain these results using your knowledge of the NFS architecture.
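
    To make the setup concrete, the following is a minimal sketch of how a script of this kind could be structured. It is not the actual test.sh, which may work differently. The sketch assumes password-less ssh access to the second machine, the same mount path on both hosts, and a path without spaces. The names remote, poll_until, main and POLL_INTERVAL are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an NFS visibility check; NOT the original test.sh.
# Usage: sh nfs-check.sh <remote-host>

# Run a command on the remote host.
# Assumption: password-less ssh, same NFS mount path on both machines.
remote() {
    host=$1; shift
    ssh "$host" "$@"
}

# Repeat a remote check until it succeeds, printing one status line per try.
# Arguments: host, message while waiting, message when done, remote command.
poll_until() {
    host=$1; waiting=$2; finished=$3; shift 3
    until remote "$host" "$@"; do
        echo "$host: $waiting"
        sleep "${POLL_INTERVAL:-1}"
    done
    echo "$host: $finished"
}

main() {
    host=$1
    path=$HOME/tmp-filename   # assumed to live on the shared NFS mount

    echo "localhost: creating $path and checking on $host for existence"
    : > "$path"
    poll_until "$host" "file does not exist yet" "file exists" test -e "$path"

    echo "localhost: writing data to $path and checking file size on $host"
    echo "some data" >> "$path"
    poll_until "$host" "file still contains zero bytes" \
        "file contains some data" test -s "$path"

    echo "localhost: removing $path again and checking on $host for existence"
    rm "$path"
    poll_until "$host" "file still exists" "file is gone" test ! -e "$path"

    echo "localhost: creating directory $path and checking on $host for existence"
    mkdir "$path"
    poll_until "$host" "directory does not exist yet" "directory exists" \
        test -d "$path"

    echo "localhost: removing directory $path and checking on $host for existence"
    rmdir "$path"
    poll_until "$host" "directory still exists" "directory is gone" \
        test ! -d "$path"
}

# Run only when a remote host is given, so the helpers can also be sourced.
if [ $# -ge 1 ]; then
    main "$1"
fi
```

    In this sketch, each repeated "still ..." line corresponds to one polling attempt on the remote host, so the number of repetitions is a rough measure of how long the second client kept seeing the old state.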

The Google File System (GFS) has been specifically designed with scalability in mind. The design and motivations behind it are explained comprehensively in the 2003 research paper "The Google File System". It discusses many interesting aspects that could not be covered in the lecture. Please read it, so that you can better answer the questions below and be prepared for the discussion during the exercise.

  1. GFS design choices

    What are the major design choices made by the Google engineers? How are they motivated and what are the consequences? Compare your findings to what you know about NFS.

  2. Scalability techniques in GFS
    1. Describe how GFS can scale to many hundreds of machines (or more) despite having a single master.

    2. Describe the caching mechanisms used in GFS.

    3. What is the benefit of "record append"?

  3. Fault tolerance in GFS
    1. How does GFS protect itself against a failing master server?

    2. What happens if one or more chunkservers fail? Describe the recovery procedure.

    3. Can clients read stale data?

    4. What happens if a client fails?

    5. How does GFS achieve fault tolerance against disk failures? Consider chunkservers and the master.

    6. Explain the consistency model of GFS. What does it mean for applications?

  4. Workloads, applications, fun stuff ...

    GFS may not be suitable for certain applications. What kinds of workloads would you consider problematic and why? What could a better file system for your favorite application look like? Be creative!

Material

  • Books:
    • A. Tanenbaum, "Distributed Operating Systems"
    • G. Coulouris et al., "Distributed Systems: Concepts and Design"
  • Research paper: "The Google File System"
  • Software: Shell Script for exercise 1.
Last modified: 6th Jan 2020, 2.15 PM
Author: Dr.-Ing. Carsten Weinhold

Regulations
  • Modules: INF-BI-1, INF-BAS4, INF-VERT4, DSE-E3
  • Credits: 6 Credit Points
  • 2/1/0 = 3 SWS
Time and Place
  • Lecture, weekly
    Time: Mon, 11.10 AM; Place: APB E008
  • Exercise, biweekly
    Time: Mon, 9.20 AM; Place: APB 3105