Abstract
This paper discusses the file-systems in the operating system. Earlier, the development of the filesystem in kernel space is very complicated. The researcher is trying hard to come up with a solution to this problem, and it leads to the rise of the development of file-systems in user-space. Also, it attracts a broad set of programmers or developers to develop their virtual file-system for their organizations or universities. In this paper, we explain the concept of FUSE and its internal architecture. Also, we discuss the integration of FUSE with the Two-Way Tree Algorithm. Finally, we have evaluated the performance of FUSE integrated with the Two-Way Tree Algorithm.
Index Terms—User space, FUSE, Caching, Mount File, Two-Way Tree Algorithm
I. INTRODUCTION
In the past years, file-systems [1] have been developed in kernel space. It is clear that the development of file-systems in kernel space is a rigorous process and a complicated process due to a variety of reasons. Furthermore, this requires a programmer to have an excellent knowledge of kernel programming, which is a low-level language and takes plenty of time to learn. Also, it involves a programmer to have a good understanding of kernel libraries and modules. For a programmer, there is a steep learning curve in this case. Currently, there is a lack of availability of developer tools for development of file-systems in kernel space. It would make life harder for a programmer to debug a kernel code. If there is a bug in kernel code, rebooting the system is required. When it comes to testing the kernel code, high efforts of testing is needed. In the case of writing kernel code, it should write in C without any helper libraries. For the development of file-systems in kernel space, the line of code would be in the range of 9000 - 32000. It takes plenty of years to develop bug-free kernel code. When the user wants to mount kernel filesystems, the user should have a special privilege, called as superuser root. It is arduous for a user who wants to port their developed file-system to another flavor of *NIX operating system because it requires many changes in the design and implementation of their developed file-systems. Examples of kernel file systems are ext2, ext3, ReiserFS, Reiser4, Btrfs, XFS.
Get Help With Your Assignment
If you need assistance with writing your assignment, our professional assignment writing service is here to help!
Due to the above disadvantages in kernel filesystems [2], it leads to the rise of the development of file-systems in user-space. Various libraries, along with third-party tools and different programming languages, are available to use. With the availability of such software tools, the development of a new file-systems would be more natural for a programmer. Unlike kernel file-systems, it allows non-privilege user to develop and mount new filesystems in the user-space. Examples of user-space file-systems are GMailFS [3], YouTubeFS [4] and WikipediaFS [5]. For the development of the file system in user-space, the line of code would be 2000 - 5000. With being said, it offers speedup, more flexibility, control, and portability to the developer.
II. BACKGROUND AND RELATED WORK
FUSE: FUSE [6] [7] is short form of File System in USErspace. It is an open-source software interface that allows non-privilege users to create virtual file-system in user-space. It is widely used to create a file-system. For the development of file-system in user-space, no knowledge of internal of file-system or kernel module programming is required. Also, the programmer can customize to create the way they create the file-system. FUSE has a set of APIs operation that allows users to implement them based on their requirements. FUSE is available for *NIX operating systems such as Linux, FreeBSD, OpenBSD, NetBSD, OpenSolaris, Minix 3, and macOS. FUSE is available in many languages such as C, C++, Python, Java, etc. Due to excellent availability, it attracts a broad set of programmers to involve the development of file-systems in user-space.
A. FUSE Internal
In this section, we are going to study the internal architecture of FUSE [2]. When you install FUSE [8] on your machine, the diagram is shown as in Figure 1.
Fig. 1. Line of code for development of file-system
The Virtual File System (VFS) [9] is the top-level of file-system and provides an abstraction within the kernel. As a result, it allows the implementation of different file-system to coexist with each other. In other words, it also interacts with the kernel as well as a user-space file-system. Similarly, Virtual File System (VFS) is a standard protocol for accessing the file by the client application for viewing the content in a file. In this Figure, it is shown that FUSE has two-part in the kernel as well as the userspace file-system. FUSE in kernel space called as FUSE driver acts as a proxy. FUSE in the user-space file-system consists of glibc and libfuse.
Whenever the user wants to register a new filesystem, FUSE in kernel space(called as FUSE driver) will register a block on /dev/fuse block device, which acts as an interface between FUSE in user-space and FUSE in kernel space.
After the file-system is mounted, if a user wants to perform an action on the mounted file-system, it makes a call to Virtual Filesystem (VFS) via glibc for the file operation. Since there are many file-system and assuming the file-system is FUSE, the Virtual Filesystem (VFS) will route operation to the FUSE in kernel space. Then FUSE in kernel space will route to libfuse via glibc, and action is performed on the corresponding file in user-space. When the operation is read, the Virtual Filesystem (VFS) will check if it is present on page cache. If it is present, it will fetch a file. For multiple operation on mounted file, the operation are processed in queue inside FUSE driver.
Output after mounted file is shown as below:
B. FUSE API Operation
There at least 30 API provided by FUSE for various operations on the file in the mounted directory. We are going to study few of FUSE API's to get a big picture of FUSE in this project:
init(self, *args, **kw): When the user mounts the file-system, the kernel will call the init function. In other word, it initializes the file-system. destroy(self): When the user unmounts the filesystem, the kernel will call the destroy function. During this operation, the kernel will perform cleanup necessarily. Any request in file operation after this function will return 0.
flush(self, path): When the user closes an opened file under the mounted directory, then this function flush will be called.
release(self, path): After the opened file is closed and is no longer referenced in the future under the mounted directory, then this function release will be called.
unlink(self, path): Under the mount directory, a user wants to remove a file using rm -rf filename.txt, this function unlink will be called.
open(self, path, flags): Under the mount directory, a user wants to open a file using vi filename.txt, and this function will be called to check if write permission has been set. If write permission is not assigned, the user is not allowed to modify a file but can view the content of a file.
read(self, path, size, offset): Under the mount directory, a user wants to read a file using cat filename.txt, and this function will be called to check whether read permission has been provided. If read permission is not provided, the user will get a warning error. Otherwise, the user can view the content of the file.
write(self, path, buff, offset): Under the mount directory, a user wants to create a new file and this function is called and will create a new file in the same directory.
For better understanding, Figure 2 are shown:
Fig. 2. API called when operation is performed on mounted directory
In the above figure, in the first example, if a user wants to execute cat /ritCS/text1.txt, where ritCS is a mounted directory in FUSE and text1.txt is a file. It will call API calls getattr(), open(), read() and release() in order and in sequential since it is processed in queue. Unlike kernel code, a few lines of code are written.
III. DESIGN AND IMPLEMENTATION
The figure 3. shown the design and implementation in this project. This paper [10] discuss about the Two-Way Tree Algorithm, which is distributed algorithm for efficient replicate search and placement of file. Basically, Two-Way Tree Algorithm can be used for file look up and replication of file. In other words, the Two-Way Tree algorithm can be used for file lookup and replication of the file. Also, the Two-Way Tree Algorithm prevents hot spot, which is the non-uniform distribution of files.
Fig. 3. Design and Implementation
In this project, Two-Way Tree Algorithm from a previous student have implemented in Java. Following the documentation given by a previous student, it is implemented successfully. Since FUSE is available in many language, in first five week, I started installing FUSE in C, but there was an socket issue between C and Java. The issue was it did not return a call from C to Java. To further analyze issue, I tried to debug, but no avail. I further investigated if an issue observed in another language. I have installed FUSE in Python [8] and found no issue. I have implemented seven API for file operation in FUSE.
Find Out How UKEssays.com Can Help You!
Our academic experts are ready and waiting to assist with any writing project you may have. From simple essay plans, through to full dissertations, you can guarantee we have a service perfectly matched to your needs.
View our academic writing services
A socket connection is established between FUSE and Two-Way Tree Algorithm for file in the mounted directory. A socket connection is established in some of API for file operation. All files fetched from the Two-Way Tree Algorithm are in read permission. The user can see the content of the file from the remote directory after it is mounted on a local machine. Snapshot for file operation in FUSE given as below:
In above figure 4, you notice the date is 31 Dec 1969. The date can be defined using API setattr(), where it is not implemented in my project. Test2, Hello-World and FUSE are files which are present in Two-Way Tree Algorithm.
In this figure 5, the umount is performed and you see that all files under mounted directory is not longer seen and it is detached.
IV. PERFORMANCE
For the evaluation of the performance of the filesystem, the access time is recorded for different file sizes under a mounted file system. In this case, three different sizes of the file, 50KB, 100KB, and 300KB, are used. Figure 6 describes the result of the access time of files, which is mounted on FUSE. According to the bar chart, it is observed that the access time of files is directly proportional to the size of each file.
Fig. 4. Mount File and File Operation on Mounted Directory
Fig. 5. Unmount Directory, which is already mounted on local machine
Fig. 6. Performance: Access time of files mounted under FUSE
V. FUTURE WORK
My project is far from completion because additional features are still required. Some of the things I would like to see in the future work are given below:
1. To transfer files between two ends: FUSEand Two Way Algorithm, and it consumes high bandwidth on the network due to large data in a file. The compression and decompression of each file should be carried out to reduce the bandwidth.
2. The real-time file system should be supported.Any change at one end should reflect the same at the other end. In order to achieve this, it should be carried out in a multi-threading process.
3. In this project, only files are showing in amounted file. Directory and sub-files should be included.
VI. CONCLUSION
In this paper, we have studied the file-system and different type of file-system (kernel and user-space file-system). We have implemented FUSE successfully and integrated it with the Two-Way Algorithm successfully. I am excited in learning the concept of FUSE and I can now develop my file-system, and I can confidently implement the cache mechanism in the future for organizations. Since the development of file-system in the kernel is complicated, the development of file-system in user-space helped me to understand how the file-system works with a set of various API operation, which are available in FUSE. While the performance lack behind in the user-space file-system, the development of the file-system remains popular among developers. It is a potential solution when the developer wants to develop its file-system for their organization. Overall, this project is a very challenging one, but it helps me enhance my knowledge about the filesystem and the development of file-system in userspace.
REFERENCES
[1] A. Rajgarhia and A. Gehani, "Performance and extension of user space file systems," 2010.
[2] B. K. R. Vangoor, V. Tarasov, and E. Zadok, "To fuse or not to fuse: Performance of user-space file systems," USENIX Conference, vol. 15, pp. 59–72, 2017.
[3] "Gmailfs." [Online]. Available: https://linux.slashdot.org/story/04/08/29/0237213/gmailfs— the-google-file-system
[4] "Youtubefs." [Online]. Available: https://pypi.org/project/fs.youtube/
[5] "Wikipediafs." [Online]. Available: https://en.wikipedia.org/wiki/WikipediaFS
[6] P. R. Eggert and D. S. Parker, "File systems in user space," USENIX, pp. 229–240, 1993.
[7] V. Kanaujia and C. Giridhar, "Fuse'ing python for development of storage efficient filesystem," The Python Paper, 2012.
[8] "Fuse python." [Online]. Available: https://github.com/libfuse/python-fuse
[9] "Virtual file systems." [Online]. Available: https://opensource.com/article/19/3/virtual-filesystems-linux
[10] "Two-way trees: A distributed algorithm for efficient replica search and placement."
Cite This Work
To export a reference to this article please select a referencing style below: