This is the web page for Operating Systems at the University of Oklahoma.
CS 3113, Fall 2018, Project 2, Due 10/18/2018
For this project, you will build on what you learned in Project 0 and on your Project 1 code to extend your shell. Your code for Project 2 will do the following:
Replace system() with the fork-exec pattern.
Remember to read this specification in full. Please post questions in the Project 2 Talk discussion board. For private questions, email cs3113@googlegroups.com.
Task | Percent |
---|---|
Code compiles with make clean and make all | 10% |
Documentation: Proper functional-level and inline documentation. README is thorough and complete. | 40% |
Correctness: This will be assessed by giving your code a range of inputs and matching the expected output. | 50% |
Total | 100% |
Below is an implementation checklist for your convenience.
morph
mimic
mkdirz
rmdirz
fork() replaces system()
>
<
>>
Your code, executable, makefile and README must all be on your instance in the /projects/2/ directory.
Please note that this location is NOT under your home directory.
You must also submit your code as project2.tar.gz on Canvas.
In this project, you will extend the abilities of the mimic and morph commands from project 1.
mimic and morph should both still copy files or directories from one location to another; the morph command should also remove the old files and directories once they are copied.
Your code should open the appropriate files and move the bytes; do not use the mv or rm shell calls.
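As a minimal sketch of "open the appropriate files and move the bytes", the helper below copies one regular file with open, read and write. The name copy_file, the 4096-byte buffer and the 0644 permission bits are assumptions made for illustration, not requirements of the project.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the bytes of src into dst.
 * Returns 0 on success, -1 on any error. */
int copy_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;                        /* missing or unreadable source */

    /* Create dst if it is missing, truncate it if it exists. */
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[4096];
    ssize_t n;
    int rc = 0;
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write() may write fewer bytes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                rc = -1;
                break;
            }
            off += w;
        }
        if (rc < 0)
            break;
    }
    if (n < 0)
        rc = -1;                          /* read error */

    close(in);
    if (close(out) < 0)
        rc = -1;
    return rc;
}
```

A morph built on this sketch would remove the source only after copy_file reports success, for example with unlink.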
Directories supplied may or may not have the full path.
Also, directories may or may not have a trailing slash.
This breaks down to the following:
[src] | [dst] | Description | Comment |
---|---|---|---|
existing file | existing file | Success | The file is copied and has the same name as [dst] |
existing directory | existing file | Error | You cannot write a directory to a file. |
missing file | existing file | Error | Nothing to be copied. |
missing directory | existing file | Error | Nothing to be copied. |
existing file | existing directory | Success | [src] should be written into the [dst] directory. The file name of [src] is maintained. |
existing directory | existing directory | Success | If -r is supplied, the [src] directory and all of its contents are copied into the directory [dst]. If -r is not supplied and the directory is empty, the empty folder can be copied into [dst]; if the [src] folder is non-empty and -r is not supplied, the command should fail. |
missing file | existing directory | Error | No source to be copied. |
missing directory | existing directory | Error | No source to be copied. |
existing file | missing file | Success | Will create a new file with the name [dst] assuming the location is valid. |
existing directory | missing file | Error | You cannot write a directory to a file. |
missing file | missing file | Error | Nothing to be copied |
missing directory | missing file | Error | Nothing to be copied. |
existing file | missing directory | Error | You cannot write to missing location. |
existing directory | missing directory | Success | If the parent exists and -r is supplied, [src] will be copied with the name of [dst]. If -r is not supplied and the directory is empty, the empty directory should be copied under the parent of [dst] with the name given in [dst]. If -r is not supplied and the directory is non-empty, the command should report an error. |
missing file | missing directory | Error | You cannot write a missing file |
missing directory | missing directory | Error | Both parameters are missing |
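One way to decide which row of the table applies is to classify each path with stat before copying. The sketch below is only an illustration: the enum and the name classify_path are hypothetical, and telling a "missing file" from a "missing directory" needs an extra convention (such as a trailing slash) that this sketch does not attempt.

```c
#include <sys/stat.h>

/* Hypothetical classification of a path for the [src]/[dst] table. */
enum path_kind { PATH_MISSING, PATH_FILE, PATH_DIR, PATH_OTHER };

enum path_kind classify_path(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return PATH_MISSING;      /* does not exist (or cannot be examined) */
    if (S_ISDIR(st.st_mode))
        return PATH_DIR;          /* existing directory */
    if (S_ISREG(st.st_mode))
        return PATH_FILE;         /* existing regular file */
    return PATH_OTHER;            /* device, FIFO, socket, ... */
}
```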
mimic foo/bar.txt /foobar/baz will result in the file /foobar/baz/bar.txt, assuming that /foobar/baz and foo/bar.txt already exist.
mimic -r foo/ /foobar/baz/ will result in all of the contents of foo/ being copied to /foobar/baz, assuming that both directories already exist. Note that this is a recursive copy, so any subdirectories of foo/ will also be copied.
In addition to the commands above, you should create mkdirz and rmdirz, which create and remove directories, respectively.
The rmdirz [path] command is only expected to work on empty directories (otherwise, give an error).
The mkdirz [path] command must be given a [path] that does not already exist. The parent of [path] must exist.
mkdirz foo/bar/baz will create directory baz in foo/bar/ relative to the current working directory (the latter must already exist).
rmdirz foo/bar will remove directory bar from foo only if bar is empty.
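The POSIX mkdir and rmdir calls already behave the way these rules describe: mkdir fails if the path exists or its parent is missing, and rmdir fails on a non-empty directory. A minimal sketch follows; the function names do_mkdirz and do_rmdirz and the 0755 mode are assumptions for illustration.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: create one directory, non-recursively.
 * mkdir() itself fails if path exists or its parent does not. */
int do_mkdirz(const char *path)
{
    if (mkdir(path, 0755) != 0) {
        perror("mkdirz");
        return -1;
    }
    return 0;
}

/* Hypothetical wrapper: remove one empty directory.
 * rmdir() itself fails if the directory is not empty. */
int do_rmdirz(const char *path)
{
    if (rmdir(path) != 0) {
        perror("rmdirz");
        return -1;
    }
    return 0;
}
```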
Previously, you wrote a simple shell that looped reading a line from standard input and checked the first word of the input line.
While you are at it, you might as well put the name of the current working directory in the shell prompt!
If the current working directory is /projects/2/, the shell prompt must look like the following: /projects/2==>
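A prompt of this shape can be built from getcwd, which returns the absolute path without a trailing slash. The sketch below is one possible approach; format_prompt and the 4096-byte path buffer are assumptions, not part of the specification.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write "<cwd>==>" into out.
 * Returns 0 on success, -1 if getcwd fails or out is too small. */
int format_prompt(char *out, size_t outsz)
{
    char cwd[4096];                       /* assumed upper bound on path length */
    if (getcwd(cwd, sizeof cwd) == NULL)
        return -1;
    int n = snprintf(out, outsz, "%s==>", cwd);
    if (n < 0 || (size_t)n >= outsz)
        return -1;                        /* output would have been truncated */
    return 0;
}
```

The shell loop would call this before each read and print the result with fputs followed by fflush(stdout), since the prompt has no trailing newline.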
fork and exec to the shell
So far, our shell has used the system() function to pass command lines on to the default system shell for execution.
Since we need to control what open files and file descriptors are passed to these processes (i/o redirection), we need more control over their execution.
To do this we need to use the fork and exec system calls.
fork creates a new process that is a clone of the existing one by just copying the existing one.
The only thing that is different is that the new process has a new process ID and the return from the fork call is different in the two processes.
The exec system call reinitializes that process from a designated program; the program changes while the process remains! Make sure you read the notes on fork and exec below and the text.
1. In your program, replace the use of system() with fork and exec. This includes the commands that are sent to the shell.
2. You will now need to more fully parse the incoming command line so that you can set up the argument array (char *argv[] in the examples).
N.B. remember to malloc/strdup and to free memory you no longer need!
3. You will find that while a system function call only returns after the program has finished, the use of fork means that two processes are now running in foreground.
In most cases you will not want your shell to ask for the next command until the child process has finished.
This can be accomplished using the wait or waitpid functions (see below for more detail). e.g.
switch (pid = fork()) {
case -1: // fork failed
    syserr("fork"); // syserr is assumed to report the error and exit
case 0: // child
    execvp(args[0], args);
    syserr("exec"); // reached only if execvp fails
default: // parent
    if (!dont_wait)
        waitpid(pid, &status, WUNTRACED);
}
In the above example, if you wanted to run the child process ‘in background’ (i.e., the parent does not wait for the child to finish before continuing its execution), the flag dont_wait would be set and the shell would not wait for the child process to terminate.
4. The commenting in the above examples is minimal. In the projects you will be expected to provide more descriptive commentary!
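For the parsing step in item 2, a small whitespace tokenizer is usually enough to build the NULL-terminated argument array that execvp expects. The sketch below is one possible approach under stated assumptions: split_args and MAX_ARGS are hypothetical names, and the function tokenizes the line in place instead of copying each word.

```c
#include <string.h>

#define MAX_ARGS 64   /* assumed limit on words per command line */

/* Hypothetical helper: split line in place on spaces, tabs and newlines.
 * Fills argv with pointers into line, NULL-terminates it, returns argc. */
int split_args(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;                    /* execvp requires a NULL sentinel */
    return argc;
}
```

Because the tokens point into line, nothing has to be freed per word, but line must stay alive and unmodified for as long as argv is in use. If the words need to outlive the input buffer, copy each one with strdup and free the copies once the command has run.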
Your project shell must support i/o-redirection on both stdin and stdout. i.e. the command line:
programname arg1 arg2 < inputfile > outputfile
will execute the program programname with arguments arg1 and arg2, the stdin FILE stream replaced by inputfile and the stdout FILE stream replaced by outputfile. For more cases on how I/O redirection should be handled, please visit the textbook website: http://www.tldp.org/LDP/intro-linux/html/sect_05_01.html.
With output redirection, if the redirection character is > then the outputfile is created if it does not exist and truncated if it does. If the redirection token is >> then outputfile is created if it does not exist and appended to if it does.
Note: you can assume that the redirection symbols, <, > and >> will be delimited from other command line arguments by white space - one or more spaces and/or tabs. This condition and the meanings for the redirection symbols outlined above and in the project may differ slightly from that of the standard shell.
filez > filelist.txt will execute your command filez in the current working directory and send its output to the file filelist.txt (if the file existed before the call, then the file contents will first be truncated).
wc < project2.c > word_count.txt will pass the unrecognized command (wc) to the bash shell. The input to this command will be project2.c; the output will be placed in word_count.txt.
ditto This is my sentence > my_sentence.txt will place the specified string into the file my_sentence.txt.
I/O redirection is accomplished in the child process immediately after the fork and before the exec command. At this point, the child has inherited all the file handles of its parent and still has access to a copy of the parent memory. Thus, it will know if redirection is to be performed, and, if it does change the stdin and/or stdout file streams, this will only affect the child and not the parent.
You can use open to create file descriptors for inputfile and/or outputfile and then use dup or dup2 to replace either the stdin descriptor (STDIN_FILENO from unistd.h) or the stdout descriptor (STDOUT_FILENO from unistd.h).
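The open plus dup2 approach can be sketched as follows. This is an illustrative outline rather than a required design: redirect_stdout and redirect_stdin are hypothetical names, and the 0644 permission bits are an assumption. In the shell these would be called in the child, between fork and exec.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make STDOUT_FILENO refer to path.
 * append == 0 gives ">" behaviour (truncate), non-zero gives ">>". */
int redirect_stdout(const char *path, int append)
{
    int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;
    if (fd != STDOUT_FILENO) {            /* usual case: copy, then drop fd */
        if (dup2(fd, STDOUT_FILENO) < 0) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}

/* Hypothetical helper: make STDIN_FILENO refer to path ("<"). */
int redirect_stdin(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fd != STDIN_FILENO) {
        if (dup2(fd, STDIN_FILENO) < 0) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}
```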
However, the easiest way to do this is to use freopen. This function is one of the three functions you can use to open a standard I/O stream.
#include <stdio.h>
FILE *fopen(const char *pathname, const char * type);
FILE *freopen(const char * pathname, const char * type, FILE *fp);
FILE *fdopen(int filedes, const char * type);
// All three return: file pointer if OK, NULL on error
The differences in these three functions are as follows:
fopen opens a specified file.
freopen opens a specified file on a specified stream, closing the stream first if it is already open. This function is typically used to open a specified file as one of the predefined streams, stdin, stdout, or stderr.
fdopen takes an existing file descriptor (obtained from open, etc.) and associates a standard I/O stream with that descriptor - useful for associating pipes etc. with an I/O stream.
The type string is the standard open argument:
type | Description |
---|---|
r or rb | open for reading |
w or wb | truncate to 0 length or create for writing |
a or ab | append; open for writing at the end of file, or create for writing |
r+ or r+b or rb+ | open for reading and writing |
w+ or w+b or wb+ | truncate to 0 length or create for reading and writing |
a+ or a+b or ab+ | open or create for reading and writing at end of file |
where b as part of type allows the standard I/O system to differentiate between a text file and a binary file.
Thus:
freopen("inputfile", "r", stdin);
would open the file inputfile and use it to replace the standard input stream, stdin.
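Putting the type table together with the redirection tokens, each token maps onto one freopen mode: < to "r" on stdin, > to "w" on stdout, and >> to "a" on stdout. The sketch below shows that mapping; apply_redirect is a hypothetical name chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: apply one redirection token to the standard streams.
 * Returns 0 on success, -1 on an unknown token or if freopen fails. */
int apply_redirect(const char *token, const char *path)
{
    if (strcmp(token, "<") == 0)
        return freopen(path, "r", stdin) != NULL ? 0 : -1;
    if (strcmp(token, ">") == 0)          /* create or truncate */
        return freopen(path, "w", stdout) != NULL ? 0 : -1;
    if (strcmp(token, ">>") == 0)         /* create or append */
        return freopen(path, "a", stdout) != NULL ? 0 : -1;
    return -1;
}
```

As with the descriptor-based approach, this belongs in the child process after fork and before exec, so that the parent shell keeps its own stdin and stdout.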
You should also use the access function to check whether the files exist:
#include <unistd.h>
int access(const char *pathname, int mode);
// Returns: 0 if OK, -1 on error
The mode is the bitwise OR of any of the constants below:
mode | Description |
---|---|
R_OK | test for read permission |
W_OK | test for write permission |
X_OK | test for execute permission |
F_OK | test for existence of file |
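A short sketch of how access might be used before setting up redirection is shown below. The helper names path_exists and can_read are hypothetical. Note that the check and the later open are separate steps, so the return value of open or freopen still has to be checked.

```c
#include <unistd.h>

/* Hypothetical helper: 1 if path exists, 0 otherwise. */
int path_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Hypothetical helper: 1 if path exists and this process may read it. */
int can_read(const char *path)
{
    return access(path, R_OK) == 0;
}
```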
Looking at the project specification, stdout redirection should also be possible for the internal commands: dir, environ, help.
Each process has an environment associated with it. The environment strings are usually of the form: name=value (standard NULL terminated strings) and are referenced by an array of pointers to these strings. This array is made available to a process through the C Run Time library as:
extern char **environ; // NULL terminated array of char *
While an application can access the environment directly through this array, some functions are available to access and manipulate the environment:
#include <stdlib.h>
char *getenv(const char *name);
// Returns pointer to value associated with name, NULL if not found.
getenv returns a pointer to the value of a name=value string. You should use getenv to fetch a specific value from the environment rather than accessing environ directly. getenv is supported by both the ANSI C and POSIX standards. In addition to fetching the value of an environment variable, sometimes it is necessary to set a variable. You may want to change the value of an existing variable, or add a new variable to the environment.
#include <stdlib.h>
int putenv(char *str);
int setenv(const char *name, const char *value, int rewrite);
// Both return: 0 if OK, non-zero on error
int unsetenv(const char *name);
// Returns: 0 if OK, -1 on error
putenv takes a string of the form name=value and places it in the environment list. If the name already exists, its old definition is first removed.
setenv sets name to value. If name already exists, its old definition is first removed if rewrite is non-zero. Otherwise the value is not overwritten.
unsetenv removes any definition of name.
You may need to use putenv with the environment value left blank to unset an environment object. i.e. putenv("myname=")
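The effect of the rewrite flag is easiest to see in a short sequence of calls. The sketch below is illustrative only: env_demo and the variable name GR_DEMO_VAR are made up, and the feature-test macro is an assumption about how the file is compiled.

```c
#define _POSIX_C_SOURCE 200809L   /* expose setenv/unsetenv under strict modes */
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: returns 0 if setenv, getenv and unsetenv behave
 * as described above, -1 otherwise. */
int env_demo(void)
{
    const char *v;

    if (setenv("GR_DEMO_VAR", "first", 1) != 0)
        return -1;

    /* rewrite == 0: the existing value must be left alone */
    if (setenv("GR_DEMO_VAR", "second", 0) != 0)
        return -1;
    v = getenv("GR_DEMO_VAR");
    if (v == NULL || strcmp(v, "first") != 0)
        return -1;

    /* rewrite != 0: the old definition is replaced */
    if (setenv("GR_DEMO_VAR", "second", 1) != 0)
        return -1;
    v = getenv("GR_DEMO_VAR");
    if (v == NULL || strcmp(v, "second") != 0)
        return -1;

    /* after unsetenv the name is no longer defined */
    if (unsetenv("GR_DEMO_VAR") != 0)
        return -1;
    return getenv("GR_DEMO_VAR") == NULL ? 0 : -1;
}
```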
#include <unistd.h>
char *getcwd(char *buf, size_t size);
//Returns: buf if OK, NULL on error
Every process has a current working directory which can be set using chdir. While chdir can use a relative pathname argument, there is a need for a function to derive the absolute pathname of the directory. getcwd performs this function.
The function is passed the address of a buffer, buf, and its size. The buffer must be large enough to accommodate the full absolute pathname plus a terminating null byte, or an error is returned.
Some implementations of getcwd allow the first argument buf to be NULL, in which case the function calls malloc to allocate size number of bytes dynamically. This is not part of the POSIX standard and should be avoided.
Enter the man getcwd command in your bash shell to get a more detailed description of this function.
Process creation in UNIX is achieved by means of the kernel system call, fork(). When a process issues a fork request, the operating system performs the following functions (in kernel mode):
#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);
//Returns: 0 in child, process ID of child in parent, -1 on error
The fork system call creates a new process that is essentially a clone of the existing one. The child is a complete copy of the parent. For example, the child gets a copy of the parent’s data space, heap and stack. Note that this is a copy. The parent and child do not share these portions of memory. The child also inherits all the open file handles (and streams) of the parent with the same current file offsets.
The parent and child processes are essentially identical except that the new process has a new process ID and the return value from the fork call is different in the two processes:
The “parent” process gets the new process ID of the “child” returned from the fork call. If, for some reason, the process cannot be cloned, then -1 is returned.
The “child” process is returned 0 (zero) from the fork call.
To actually load and execute a different program, the fork request is used first to generate the new process. The kernel system call exec(char* programfilename) is then used to load a new program image over the forked process:
exec identifies the required memory allocation for the new program and alters the memory allocation of the process to accommodate it
main() routine.
The exec system call reinitializes a process from a designated program; the program changes while the process remains! The exec call does not change the process ID and process control block (apart from memory allocation and current execution point); the process inherits all the file handles etc. that were currently open before the call.
Without fork, exec is of limited use; without exec, fork is of limited use. (A favorite exam question is to ask in what circumstances you would/could use these functions on their own. Think about it and be prepared to discuss these scenarios.)
exec variants:
System Call | Argument Format | Environment Passing | PATH search |
---|---|---|---|
execl | list | auto | no |
execv | array | auto | no |
execle | list | manual | no |
execve | array | manual | no |
execlp | list | auto | yes |
execvp | array | auto | yes |
#include <unistd.h>
int execl(path,arg0,arg1,...,argn,null)
char *path; // path of program file
char *arg0; // first arg (file name)
char *arg1; // second arg (1st command line parameter)
...
char *argn; // last arg
char *null; // NULL delimiter
int execv(path,argv)
char *path;
char *argv[]; // array of ptrs to args,last ptr = NULL
int execle(path,arg0,arg1,...,argn,null,envp)
char *path; // path of program file
char *arg0; // first arg (file name)
char *arg1; // second arg (1st command line parameter)
...
char *argn; // last arg
char *null; // NULL delimiter
char *envp[]; // array of ptrs to environment strings
// last ptr = NULL
int execve(path,argv,envp)
char *path;
char *argv[]; // array of ptrs to args,last ptr = NULL
char *envp[]; // array of ptrs to environment strings
// last ptr = NULL
int execlp(file,arg0,arg1,...,argn,null)
int execvp(file,argv)
// All six return -1 on error, no return on success
In the first four exec functions, the executable file has to be referenced either relatively or absolutely by the pathname. The last two search the directories in the PATH environment variable for the filename specified.
Example of use of fork and exec
switch (fork()) {
case -1: // fork error
    syserr("fork");
case 0: // continue execution in child process
    execlp("pgm", "pgm", NULL);
    syserr("execlp"); // reached only on exec error
} // continue execution in parent process
When a process terminates, either normally or abnormally, the parent is notified by the kernel sending the parent the SIGCHLD signal. The parent can choose to ignore the signal (the default) or it can provide a function that is called when the signal occurs. The system provides functions wait or waitpid that can
block (if all of its children are still running), or
return immediately with the termination status of a child (if a child has terminated and is waiting for its termination status to be fetched), or
return immediately with an error (if it doesn’t have any child processes).
#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *statloc);
pid_t waitpid(pid_t pid, int *statloc, int options);
// Both return: process ID if OK, -1 on error (waitpid returns 0 if WNOHANG is set and no child has changed state)
The differences between the two functions are:
wait can block the caller until a child process terminates, while waitpid has an option that prevents it from blocking.
waitpid doesn’t wait for the first child to terminate - it has a number of options that control which process it waits for.
If a child has already terminated and is a zombie, wait returns immediately with that child’s status. Otherwise it blocks the caller until a child terminates. If the caller blocks and has multiple children, wait returns when one terminates - the function returns the process ID of the particular child.
Both functions store an integer status in *statloc. The format of this status is implementation dependent. Macros to interpret it are defined in <sys/wait.h>.
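The status macros can be put to work in a small helper that forks, execs and then decodes the result with WIFEXITED and WEXITSTATUS. The sketch below is an illustration, not the required design: run_and_wait is a hypothetical name, and exiting the child with 127 when exec fails is a common shell convention adopted here as an assumption.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv and wait for it.
 * Returns the child's exit code, or -1 if fork/waitpid fails or the
 * child did not exit normally (for example, it was killed by a signal). */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                        /* fork failed */

    if (pid == 0) {                       /* child */
        execvp(argv[0], argv);
        _exit(127);                       /* only reached if exec failed */
    }

    int status;                           /* parent */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);       /* normal termination */
    return -1;
}
```

A background command would simply skip the waitpid call, as the dont_wait flag does in the earlier example.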
The pid parameter of waitpid specifies the set of child processes for which to wait.
pid == -1: waits for any child process. In this respect, waitpid is equivalent to wait.
pid == 0: waits for any child process in the process group of the caller.
pid > 0: waits for the process with process ID pid.
pid < -1: waits for any process whose process group id equals the absolute value of pid.
The options for waitpid are a bitwise OR of any of the following options:
WNOHANG: the call should not block if there are no processes that wish to report status.
WUNTRACED: children of the current process that are stopped due to a SIGTTIN, SIGTTOU, SIGTSTP, or SIGSTOP signal also have their status reported.
An existing file descriptor (filedes) is duplicated by either of the following functions:
#include <unistd.h>
int dup(int filedes);
int dup2(int filedes, int filedes2);
// Both return: new file descriptor if OK, -1 on error
The new file descriptor returned by dup is guaranteed to be the lowest numbered available file descriptor. Thus by closing one of the standard file descriptors (STDIN_FILENO, STDOUT_FILENO, or STDERR_FILENO, normally 0, 1 and 2 respectively) immediately before calling dup, we can guarantee (in a single threaded environment!) that the duplicate of filedes will be allotted that freed number.
With dup2 we specify the value of the new descriptor with the filedes2 argument and it is an atomic call. If filedes2 is already open, it is first closed. If filedes equals filedes2, then dup2 returns filedes2 without closing it.
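dup and dup2 can also be combined to redirect stdout temporarily inside a single process and then put it back, which is one way a command that runs inside the shell itself could honour >. The sketch below assumes that use case; stdout_redirect_begin and stdout_redirect_end are hypothetical names.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: send stdout to path (truncating it) and return a
 * saved copy of the previous stdout descriptor, or -1 on error. */
int stdout_redirect_begin(const char *path)
{
    fflush(stdout);                       /* do not mix buffered output */
    int saved = dup(STDOUT_FILENO);       /* lowest free descriptor */
    if (saved < 0)
        return -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {    /* closes the old stdout first */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);
    return saved;
}

/* Hypothetical helper: restore stdout from the descriptor returned above. */
int stdout_redirect_end(int saved)
{
    fflush(stdout);
    int rc = dup2(saved, STDOUT_FILENO) < 0 ? -1 : 0;
    close(saved);
    return rc;
}
```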
mkdirz does not execute recursively. If the parent does not exist, then this is an error.
rmdirz does not execute recursively. If the directory contains other files/directories, then this is an error.
cat < myfile will pass cat to the shell and pipe myfile to its STDIN.
filez and other internal commands will not be paired with redirects.
-r in the mimic and morph commands must be the mimic or morph command (e.g. mimic -r out.txt dir/).
src location in morph and mimic may not be the current directory, parent directory, or a glob. In other terms, the following [src] locations are not acceptable: ., .., /, *txt.
erase, from project 1 command will not be performed recursively; it should only work on individual files.
filez command should not recognize the -r flag.
morph/mimic is called with the -r command, it should still work as normal.
execl("/bin/sh", "sh", "-c", ...) command, execvp should be enough.
esc to the end each testcase.
[src] and [dst] exist, the [src] folder is non-empty and -r is not supplied, the command should fail. That is, morph and mimic cannot work on non-empty src directories without the recursive flag.
Hints
Added a couple more sample outputs: newtestcases.tar