Assignment/VM

From WoxWiki

Jump to: navigation, search
Assignment Out: Nov. 9 (Monday)
Assignment Due: Dec. 10 (Thursday 11:59PM) HARD DEADLINE!!!

This is the final assignment for CS169 students. You're almost there! Be sure to start early. Compared to this project everything you did before will seem insignificant.

Please remember to read the course Programming Guide for good style guidelines.

Contents

Introduction

At this point, Weenix is a thread library and some thin wrappers around device drivers with file system support. By the end of this assignment, Weenix will be an operating system. With the addition of VM, your kernel will start managing user address spaces, running user-level code, and servicing system calls.

This assignment is substantial, and also very prone to difficult bugs. First of all, make sure that the rest of your kernel is functioning perfectly. You will undoubtedly uncover more bugs throughout the course of this assignment but you want to minimize the number you find before you start. Make sure to start early and talk with your TAs frequently, it is very easy to get lost in this assignment.

Because VM bugs can spring up in code you wrote weeks ago, this is where you will probably find out whether or not your implementation of Kern, VFS, and S5FS are up to par. We would like to point out that Weenix has a very nice Debug System which you have hopefully been using all along, but it will become extremely useful here.

VM Areas and Memory Management

VM Areas

See the Hackers Guide on VM areas

In the first part of the VM assignment, you must create Weenix's low-level virtual memory facilities: vm_area_t's. You need to make sure that you can create and add vm_area_t's to a process's virtual address space. These structures will do nothing by themselves but will be used later to look up mappings between the process's address space and "real" addresses.

Functions you will need to write in vm/vm_area.c:

    void       add_vm_area     (proc_t *proc, vm_area_t *newvma);
    int        find_page_range (proc_t *proc, int npages);
    vm_area_t *find_vm_area    (proc_t *proc, unsigned long vfn);
    int        is_range_empty  (struct proc *proc, unsigned long startvfn, unsigned long endvfn);

A debug function you should take note of in vm/vm_area.c:

    void dbg_dump_mappings (proc_t *p);

Anonymous Objects

See the Hackers Guide on Shadow Objects

You should notice while implement VM areas that each area has an associated VM object backing it (the VM object manages which pages are loaded into that region of virtual memory). In S5FS you used the VM objects of your block device to fill VM pages as you needed data from disk, but it does not make sense to back VM areas with the data you have on disk (unless of course you have an mmap-ed file). What you actually want is objects which initialize pages by filling them with 0's and keep pages in memory as long as necessary. We will call these objects anonymous objects since they are not backed by any persistent data. The objects described would be trivial to implement, but we wish to implement them in such a way that they can be used as shadow objects. Therefore an anonymous object can either work as described above, or it can "shadow" another VM object, see code comments and the Hackers Guide for details. Before trying to implement the code for this section it may help you to look over the lectures on virtual memory pertaining to shadow objects.

Files you will need to write code in in vm/vm_object_an.c:

    void         an_init    ();
    vm_object_t *an_create  ();
    void         anref      (vm_object_t *o)
    void         anput      (vm_object_t *o)
    int          anreadpage (struct vm_page *vp)

The Memory Management Unit

See the Hackers Guide on the MMU

To understand how memory management works in Weenix make sure you read the lectures on virtual memory very closely, and take a good look at the Hackers Guide on virtual memory. A good portion of memory management is done for you, but you will have to cope with filling in page table entries when page faults occur, flush the TLB when necessary, and manage copy-on-write pages (for fork()). You will also need to make sure that pages which are not backed by files (i.e. anonymously mapped) remain pinned, so they do not get paged out by accident.

There is one function you will need to write in memcore/mmu.c:

    void mmu_handle_pagefault (unsigned long vaddr, int cause);

There are several functions prefixed with dbg_print_* in kernel/pagetable.c which you may find useful for debugging.

Process Memory Management and mmap

mmap and munmap

These functions are higher-level methods for creating and destroying VM areas. Read the manual pages for mmap(2) and munmap(2) (preferably the Solaris ones)[1] to understand all of the interactions between various parameters. You are also strongly advised to to take a look at our version of mman.h in include/weenix/mman.h so that you know what flags to support.

Functions you will need to write in vm/mmap.c:

    int do_mmap(void* addr, int len, int prot, int flags, int fd, int off, void **ret);
    int do_munmap(proc_t* proc, void *addr, size_t len);

Process Memory Management

There are a few other routines you will need to have to manage each process's memory mappings. Some of these you will probably recognize as unused stubs from previous projects.

Functions you will need to write in kernel/proc.c:

    void proc_clear_vm     (proc_t *p);
    int  proc_add_anon_map (proc_t* proc, int lopage, int len, int prot, void **ret);

Function you will need to write in vm/brk.c:

    int do_brk (void *addr, void **ret),;

Function you will need to edit in kernel/kthread.c:

    int kthread_init(proc_t* proc, kthread_func_t func, long arg1, void* arg2, int kernel);

You will need to change the thread creation mechanism so that it will differentiate between a user and kernel stack. Up until now everything was running in kernel mode so you had those two stacks pointing to the same memory without consequence.

Traps and User Memory

Accessing User Memory

The code to handle traps and access user memory from the kernel has been done for you. Most of these functions need to check to see if a region of user memory is valid. This happens through the magic of the following two functions:

Functions you need to write in vm/useracc.c:

    int addr_perm  (struct proc* p, const void *vaddr, unsigned perm);
    int range_perm (struct proc* p, const void *avaddr, int len, unsigned perm);

System Calls

The majority of the system calls have been written for you. In order to give you some understanding of the process involved, however, you will need to write a few yourself.

Functions you will need to write in vm/syscall.c:

    int sys_read     (read_args_t *arg);
    int sys_write    (write_args_t *arg);
    int sys_getdents (getdents_args_t *arg);

Fun with fork

You are finally ready to write the fork(2) system call. A good implementation of the previous sections is essential; fork(2) is complicated enough without having to debug the rest of your VM code at the same time. To avoid 1000+ lines of code for a single function, we are providing the documentation for fork(2) here instead of in the source file comments.

Functions you will need to write:

    int        sys_fork      (void);
    kthread_t *kthread_copy (proc_t *old_proc, proc_t *new_proc);

Fork is a moderately complicated system call. We present it here as one long algorithm, but it will make your life much easier if you break it down into separate subroutines. Close attention to detail will help you; an under-debugged fork(2) can cause subtle instabilities and bugs later on.

Bugs in the virtual memory portion of fork(2) tend to cause bizarre behavior: user process memory may not be what it ought to be, so almost anything can happen. The user process may end up executing what should be data, jumping into the middle of a random subroutine, etc. These sorts of bugs are very, very difficult to track down. For this reason you should code more defensively than you may be used to. Assert everything you can, panic at the first sign of trouble, and include apparently unnecessary sanity checks.

Above all, be sure you really understand the algorithm before you start coding. If you try to implement it before you understand what you are trying to do, you will write buggy code. You will then forget that you have written buggy code, and waste time debugging code that you should have thrown away.

Here are the steps you have to take. Note that these steps are not in the correct order, consider carefully the order in which you do these steps, particularly keep in mind what kind of cleanup you will need to do if one of these steps fails. Look out for steps which cannot be undone.

  • Allocate a proc_t out of the procs structure using proc_alloc().
  • Copy the vm_area_ts from the parent process into the child. Remember to increase the reference counts on the underlying vm_object_ts for non-anonymous mappings.
  • For each private mapping, point the vm_area_t at the new "shadow object", which in turn should point to the original vm_object_t for the vm_area_t. This is how you know that the pages corresponding to this mapping are copy-on-write. Be careful with reference counts. Also note that for shared mappings, there is no need to copy the vm_object_t.
  • Free the user land page table entries and flush the TLB (using proc_free_userland_pts() and mmu_flush()). This is necessary because the parent process might still have some entries marked as "writable", but since we are implementing copy-on-write we would like access to these pages to cause a trap.
  • Set up the new process thread context (kt_ctx). You will need to set the following:
    • c_pdptr - the segtable pointer
    • c_eip - function pointer for the ret_from_fork function
    • c_esp - the new thread's kernel stack
    • c_fork_regs - the regs passed in to the call to sys_fork (note that the value passed in is a pointer, but the context needs the value pointed to by that pointer, so dereference the pointer)
  • Copy the file table of the parent into the child. Remember to use fref() here.
  • Set the child's working directory to point to the parent's working directory (once again, remember reference counts).
  • Use kthread_copy to copy the thread from the parent process into the child process.
  • Set any other fields in the new process which need to be set.
  • Make the new thread runnable.

You will have to revisit your implementation of the exit(2) system call which you wrote in kernel 1. Be sure that your implementation is releasing all resources it should; your OS should be able to run the following program fragment for a long time:

    for (;;) {
        if (fork() == 0)
            exit(0);
        else
            (void)wait(0);
    }

In a real OS (or with some extra-fun optimizations to Weenix) you would be able to run this indefinitely but because our implementation does not clean up shadow objects, we will run out of memory once we have a long chain of shadow objects.

When you exit, use proc_clear_vm() to free all vm_area_t structures associated with this process.

Other

There are a number of other functions which you might remember seeing in earlier assignments, spread throughout the kernel, which you need to find, and either write or update. These functions are all fairly small, but if you miss one things will break. An example might be special_file_mmap.

Also you will need to fill in some functions in vm_object_an.c in order to implement shadow objects.

Testing

Testing your code at this point becomes rather difficult, since you must be able to create data and text in user land and execute it. This is an order of magnitude more difficult than creating kernel-mode threads as you have in past assignments. Thankfully most of the gory details have been taken care of for you (take a look at fi/binfmt and fi/ld.so if you are a masochist).

As you have no doubt discovered if you have made it this far in the course, incremental testing is key. In VM, this is more true than ever before. By integrating your code with the execution and linking/loading stages, provided by the TA code you will find out how strong your VM code is. You might want to write the memory management parts of VM first and test them before you start fork (you can test them by calling kernel_execve() this will start a user land process running a specified binary, and does not require fork(2) to be implemented (assuming the binary you load does not use fork(2). Uncomment the following line:

    SUFFIX = .static      # Build as static ELF

in the make file in fi/cmd. (Remember to make clean any time you change this line.) Uncommenting this line will cause your binaries to be built in the static ELF format which causes significantly less stress on your system. You should get your Weenix working with this line uncommented before commenting it back out and debugging all the new bugs which show up.

Final Integration

Once you have functioning dynamic ELF execution and a working fork(2) function, you are ready to complete your Weenix system by fully integrating with the user-land binaries we provide for you. All you need to do is call kernel_execve("/bin/init") in your init process. This should start 3 shells (one in each terminal window). These shells will allow you to execute any of the TA-provided binaries:

  • hello - A simple "Hello world!" test. Getting this to execute properly should be a big step in VM.
  • forkbomb - A forkbomb test as described in the fork section.
  • stress - A test to stress various parts of your system and then run a forkbomb.
  • check - Contains checks for various test cases (run without arguments to see options).
  • vfstest - Runs the VFS tests.
  • s5fstest - Runs the S5FS tests.
  • args - Echos command arguments.
  • cat - Displays the contents of a file.
  • halt - Shuts down Weenix, this is implicitly done if you exit all shells.
  • ls - List the contents of a directory.
  • overflow - Generates a stack overflow on purpose, should die with an address error, but no other processes (including the shell it was run in) should be affected.
  • uname - Print system information.
  • help - Print helpful information.

All of these commands are in the /bin directory on your disk. In addition to just having commands which work individually, you should be able to stress the hell out of your system. This means running stress, then repeat 100 vfstest, while in the other two shells running repeat 100 s5fstest. If your implementation can survive several rounds of this torture test (and halt cleanly in the middle of it all) you are probably ready to do your final demo.

Handing In

Please run the normal handin script to hand in your code. Also, make sure to set up a final demo with your mentor TA once you believe you have finished the project. Your demo must be done by the project due date, however you should try to schedule it a couple days before the absolute deadline so that you have time to address any problems that surface during your demo. (If something is broken we expect you to fix it and try your demo again.)

Notes

  1. We are not kidding. The Sun manpages are loads better than the Linux ones. Use the sunman command to read them.
Personal tools